Data Intelligence
The analogy is imperfect, but it captures both the importance and the hazards of data. Data, akin to water, is vital and foundational, as essential to a business as water is to life. It can be murky and hazardous, yet when managed effectively it becomes indispensable. Like water, data can be contaminated and lead people astray, yet it permeates all facets of existence, fostering growth and prosperity. It underpins operations in every company, enabling it to establish markets, innovate, and succeed. Within a company, data serves as the lifeblood, coursing through its systems, powering decision-making, driving efficiencies, and fueling progress.
11/9/2023 · 2 min read
Steps in the Data Intelligence Journey
Define Business Glossary
The business glossary should always be defined first. A business glossary acts as a semantic translator for a company, facilitating clear communication and understanding across departments. Instead of forcing everyone to adopt a new language, it helps individuals communicate effectively using a common set of terms. This leads to greater clarity, efficiency, and comprehension within the organization. Users can easily find and understand relevant information without needing to navigate complex technical details, which promotes transparency and offers a holistic view of business terms and their associated data, metadata, and lineage.
Establish Data Domain Models
Companies, regardless of their type or purpose, need to identify the essential elements related to their mission. These elements, often represented as domains, are nouns such as Customer, Employee, Product, and Location. Consider the Customer domain. In large organizations offering multiple products or services, various systems or applications collect customer information. However, these systems may not store data consistently or use identical field names. For example, Salesforce and NetSuite CRM might both capture Date of Birth, but they organize and reference it differently in their databases. Salesforce might use ‘DOB’ alongside the customer’s name, while NetSuite CRM might label it as ‘Birth_Date’ in a separate table. A domain model defines one agreed representation of each element and maps every system's fields to it.
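As a rough illustration of the Customer example, the sketch below maps source-specific field names onto one canonical Customer model. The canonical names (`date_of_birth`, `full_name`), the `Name` and `Customer_Name` source fields, and the `to_canonical` helper are all assumptions made for this example; only ‘DOB’ and ‘Birth_Date’ come from the text above, and neither reflects the actual Salesforce or NetSuite schema.

```python
# Hypothetical mapping from each system's field names to a canonical
# Customer domain model. All names here are illustrative assumptions.
CANONICAL_FIELD_MAP = {
    "salesforce": {"DOB": "date_of_birth", "Name": "full_name"},
    "netsuite": {"Birth_Date": "date_of_birth", "Customer_Name": "full_name"},
}


def to_canonical(system: str, record: dict) -> dict:
    """Rename a source record's fields to the canonical Customer field names."""
    mapping = CANONICAL_FIELD_MAP[system]
    # Fields without a mapping are dropped in this sketch; a real model
    # might keep them or flag them for review instead.
    return {mapping[key]: value for key, value in record.items() if key in mapping}


sf_customer = to_canonical("salesforce", {"Name": "Ada Lovelace", "DOB": "1815-12-10"})
ns_customer = to_canonical(
    "netsuite", {"Customer_Name": "Ada Lovelace", "Birth_Date": "1815-12-10"}
)

# Both systems now describe the customer with the same field names.
print(sf_customer == ns_customer)
```

Keeping the mapping as data rather than code means a new source system can be added by extending the dictionary, without touching the translation logic.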
Defining Policy Management & Reference Management
Reference Data Management is a crucial concept in the Data Intelligence journey. It involves building and maintaining a dependable way to manage reference data. Reference data consists of the permissible values used by other data fields. For instance, when you input an address online, you’re typically restricted to selecting from a predefined list of countries rather than entering free-form text. This list of countries is an example of reference data. Just as with Domain Modeling, where different systems and applications may use varying names and structures for data fields, they may also employ different codes or values to define their reference data. By mapping these diverse codes and values to a common set, you enable effective translation and interpretation of data across your ecosystem.
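The mapping of system-specific codes to a common set can be sketched as a simple lookup. The system names (`system_a`, `system_b`), their local country values, and the `normalize_country` helper are hypothetical; the common set here happens to use ISO 3166-1 alpha-2 style codes, which is one possible choice rather than a requirement.

```python
# Common (canonical) reference set: the permissible country codes.
VALID_COUNTRIES = {"US", "DE"}

# Hypothetical per-system mappings from local values to the common set.
COUNTRY_CODE_MAP = {
    "system_a": {"USA": "US", "DEU": "DE"},
    "system_b": {"United States": "US", "Germany": "DE"},
}


def normalize_country(system: str, value: str) -> str:
    """Translate a system-specific country value to the common reference code."""
    code = COUNTRY_CODE_MAP[system].get(value)
    if code not in VALID_COUNTRIES:
        # Unmapped values are rejected rather than passed through silently.
        raise ValueError(f"No reference mapping for {value!r} in {system!r}")
    return code


print(normalize_country("system_a", "USA"))
print(normalize_country("system_b", "United States"))
```

Rejecting unmapped values is a design choice in this sketch: it surfaces gaps in the reference mapping early instead of letting inconsistent codes spread downstream.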
Data Lineage
Data lineage refers to the ability to track the origin, movement, and transformation of data throughout its lifecycle. It provides a clear understanding of where data comes from, how it's processed or modified, and where it's stored or used. Data lineage helps ensure data quality, compliance, and transparency by enabling organizations to trace data back to its source and understand how it's manipulated or utilized along the way.
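One minimal way to picture lineage is as a directed graph in which each edge records that one dataset was derived from another through some transformation. The `LineageGraph` class and the dataset names below are assumptions made for illustration, not the API of any particular lineage tool.

```python
from collections import defaultdict


class LineageGraph:
    """Tiny in-memory lineage store: which datasets feed which, and how."""

    def __init__(self):
        self._parents = defaultdict(set)  # target -> set of direct sources
        self._steps = {}                  # (source, target) -> transformation

    def record(self, source: str, target: str, transformation: str) -> None:
        """Register that `target` is derived from `source` via `transformation`."""
        self._parents[target].add(source)
        self._steps[(source, target)] = transformation

    def transformation(self, source: str, target: str) -> str:
        """Return the transformation recorded between two datasets."""
        return self._steps[(source, target)]

    def upstream(self, dataset: str) -> set:
        """Return every dataset that feeds `dataset`, directly or indirectly."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


lineage = LineageGraph()
lineage.record("crm.customers", "staging.customers", "deduplicate")
lineage.record("erp.orders", "staging.orders", "convert currency")
lineage.record("staging.customers", "mart.revenue_by_customer", "join")
lineage.record("staging.orders", "mart.revenue_by_customer", "join and aggregate")

# Trace a report back to its original sources.
print(sorted(lineage.upstream("mart.revenue_by_customer")))
```

Walking the graph upstream answers "where did this come from?"; the same structure walked in the other direction would answer "what is affected if this source changes?".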
Record Linkage Principle
Also known as data matching or data linkage, record linkage refers to the task of:
Identifying records that refer to the same entity
Merging those matching records into a single, consolidated record
Comparing the data against the source data or a trusted third-party source
The process involves complex algorithms and techniques such as probabilistic matching, fuzzy matching, and machine learning to handle variations and inconsistencies in data. The goal is to reduce duplication, improve accuracy, and ensure consistency in the data.
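To make the matching step concrete, here is a small sketch of fuzzy matching using only Python's standard library. It is a deliberately simple, deterministic rule (exact match on date of birth plus a string-similarity threshold on the name), not probabilistic matching or machine learning. The record layout, the `find_matches` helper, and the 0.85 threshold are assumptions for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def find_matches(records: list, threshold: float = 0.85) -> list:
    """Return pairs of record ids that likely refer to the same person."""
    matches = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            a, b = records[i], records[j]
            # Require an exact date-of-birth match, then compare names fuzzily.
            if a["dob"] == b["dob"] and similarity(a["name"], b["name"]) >= threshold:
                matches.append((a["id"], b["id"]))
    return matches


records = [
    {"id": 1, "name": "Jonathan Smith", "dob": "1980-04-02"},
    {"id": 2, "name": "Jonathon Smith", "dob": "1980-04-02"},
    {"id": 3, "name": "Maria Garcia", "dob": "1975-11-23"},
]

print(find_matches(records))  # [(1, 2)]
```

Comparing every pair of records grows quadratically with the number of records, so larger datasets typically need a way to limit comparisons to plausible candidates first.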
Idea Management
This is a structured process for generating, capturing, and evaluating ideas and turning them into tangible outcomes. It’s a critical component of innovation management within an organization.
The process involves various stages such as
idea generation,
idea capture,
idea evaluation, and
idea implementation.
Tools and platforms are often used to facilitate this process, enabling collaboration, transparency, and efficiency. That said, the efficacy of this process hinges significantly on the endorsement and commitment of organizational leadership.
Data Quality Management
In the context of data intelligence, data quality is paramount. It refers to the degree to which data is accurate, complete, timely, consistent, and reliable. Poor data quality can lead to inaccurate insights, flawed decision-making, and loss of trust in data systems. Various data quality management strategies and tools are used to monitor and improve data quality, including data profiling, data cleaning, data validation, and data governance practices.
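Profiling and validation can be illustrated with a couple of small checks. The sketch below measures completeness per field and validates individual records; the field names (`email`, `date_of_birth`), the specific rules, and the helper names are assumptions chosen for the example, and real rules would come from the organization's own data governance policies.

```python
from datetime import date


def profile_completeness(records: list, fields: list) -> dict:
    """Data profiling: share of records with a non-empty value for each field."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if r.get(field) not in (None, "")) / total
        for field in fields
    }


def validate(record: dict) -> list:
    """Data validation: return a list of rule violations for one record."""
    issues = []
    email = record.get("email") or ""
    if "@" not in email:
        issues.append("invalid email")
    try:
        # Expects an ISO date such as 1990-01-15; empty or malformed values fail.
        date.fromisoformat(record.get("date_of_birth") or "")
    except ValueError:
        issues.append("invalid date_of_birth")
    return issues


customers = [
    {"email": "a@example.com", "date_of_birth": "1990-01-15"},
    {"email": "", "date_of_birth": "1990-13-40"},
    {"email": "b@example.com", "date_of_birth": None},
    {"email": "c@example.com", "date_of_birth": "1985-06-30"},
]

print(profile_completeness(customers, ["email", "date_of_birth"]))
print([validate(c) for c in customers])
```

Completeness is only one of the dimensions listed above; checks for timeliness or cross-system consistency would follow the same pattern of small, explicit rules.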