Data is one of the most valuable assets a business can have and has a tremendous impact on its long-term success.

We have all heard the phrase that data is the new oil, speaking to the value it possesses. But to leverage this great resource an organization needs to ensure that there is a full-fledged plan in place to take advantage of its benefits, and that’s where data management comes in. 

Data management is the practice of collecting, keeping, and using data securely, efficiently, and cost-effectively. The goal of data management is to help people, organizations, and connected devices optimize the use of data within the bounds of ethics, policy, and regulations in order to produce actions that maximize the benefit and value to the organization. 

Data management plays an essential, foundational role in leveraging data at an enterprise level. It supports business analytics, artificial intelligence and machine learning, transactional and operational systems, data monetization, and the sharing of data with stakeholders under proper governance and security.

For any organization that wants to implement a proper data management practice, the 11 knowledge areas shown in Figure 1 can help create a framework.

In this article, I will delve deeper into each knowledge area.

1.    Data Governance

Data governance is a collection of processes, roles, policies, standards, and metrics that ensures the effective and efficient use of information, helping an organization to achieve its goals. It establishes the processes and responsibilities that ensure the quality and security of the data used across a business or organization. Data governance defines who can take what action, upon what data, in what situations, using what methods.

While the driver of overall data management is to ensure that an organization leverages value out of its data, data governance focuses on how people and processes interact with and use data. 
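
The "who can take what action, upon what data, in what situations, using what methods" idea can be sketched as a small lookup of explicit rules. This is only an illustration: the PolicyRule class, the role names, and the data set names below are hypothetical and do not come from any governance product or standard.

```python
from dataclasses import dataclass


# Hypothetical policy model; every name and value here is an illustrative assumption.
@dataclass(frozen=True)
class PolicyRule:
    role: str       # who
    action: str     # what action
    dataset: str    # upon what data
    situation: str  # in what situation
    method: str     # using what method


# A tiny, made-up rule set. Anything not listed is denied.
RULES = {
    PolicyRule("analyst", "read", "customer_orders", "reporting", "bi_dashboard"),
    PolicyRule("data_steward", "update", "customer_master", "data_correction", "stewardship_tool"),
}


def is_permitted(role, action, dataset, situation, method):
    """Return True only if an explicit rule allows this exact combination."""
    return PolicyRule(role, action, dataset, situation, method) in RULES
```

With these sample rules, an analyst reading customer_orders for reporting through the dashboard is allowed, while the same analyst trying to update that data is not, because no rule grants it.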

2.    Data Architecture

Data architecture translates business needs into data and system requirements and seeks to manage both data and the way it flows through the enterprise. The goal of data architecture is to show the company how data is acquired, transported, stored, queried, and secured. More explicitly, data architecture is a discipline that documents an organization's data assets, maps how data flows through its systems, and provides a blueprint for managing data. It must include two main components: Enterprise Data Models (e.g., data structures and data specifications) and Data Flow Design.

·      Enterprise Data Model (EDM): The EDM is a holistic, enterprise-level, implementation-independent conceptual or logical data model providing a common consistent view of data across the enterprise. An EDM includes key enterprise data entities (i.e., business concepts), their relationships, critical guiding business rules, and some critical attributes.

·      Data Flow Design: Defines the requirements and blueprint for storage and processing across databases, applications, platforms, and networks. These data flows map the movement of data to business processes, locations, business roles, and to technical components.

3.    Data Modeling and Design

Data modeling is the process of diagramming data flows. Data modeling creates a visual representation of either a whole information system or parts of it to communicate connections between data points and structures. The goal is to illustrate the types of data used and stored within the system, the relationships among these data types, the ways the data can be grouped and organized, and its formats and attributes.

Data modeling is a critical component of data management. The modeling process requires that organizations discover and document how their data fits together. Data models help an organization to understand its data assets. 
Commonly used schemes are: Relational, Dimensional, Object-Oriented, Fact-Based, Time-Based, and NoSQL. Models of these schemes exist at three levels of detail: conceptual, logical, and physical. Each model contains a set of components. Examples of components are entities, relationships, facts, keys, and attributes. Once a model is built, it needs to be reviewed and, once approved, maintained.
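
To make the components concrete, here is a minimal sketch of a physical relational model using Python's built-in sqlite3 module. The two entities (customer and customer_order), their attributes, and the build_physical_model helper are assumptions chosen for illustration; a real model would come out of the conceptual and logical levels first.

```python
import sqlite3


def build_physical_model():
    """Create a small in-memory relational model: two entities and one relationship."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,   -- key
            name        TEXT NOT NULL          -- attribute
        );
        CREATE TABLE customer_order (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL
                        REFERENCES customer(customer_id),  -- relationship
            order_total REAL NOT NULL
        );
        """
    )
    return conn
```

Because the relationship is declared as a foreign key, the database itself rejects an order that points to a customer that does not exist, which is one way a physical model enforces the rules documented at the logical level.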

4.    Data Storage and Operations

Data Storage and Operations includes the design, implementation, and support of stored data, to maximize its value throughout its lifecycle, from the moment it is created or acquired to the time it is disposed of. This part of data management includes two sub-activities:

·      Database support focuses on activities related to data lifecycle, from initial implementation of a database environment, through obtaining, backing up, and purging data. It also includes ensuring the database performs well. Monitoring and tuning are critical to database support.

·      Database technology support includes defining technical requirements that will meet organizational needs, defining technical architecture, installing and administering technology, and resolving issues related to technology.

5.    Data Security

Data Security includes the planning, development, and execution of security policies and procedures to provide proper authentication, authorization, access, and auditing of data and information assets. Data security is the practice of protecting digital information from unauthorized access, corruption, or theft throughout its entire lifecycle. It’s a concept that encompasses every aspect of information security from the physical security of hardware and storage devices to administrative and access controls, as well as the logical security of software applications. 

This includes protecting your data from attacks that can encrypt or destroy data, such as ransomware, as well as attacks that can modify or corrupt your data. Data security also ensures data is available to anyone in the organization who has access to it.

Effective data security policies and procedures ensure that the right people can use and update data in the right way and that any undesired access is restricted. Understanding and complying with the privacy and confidentiality interests and needs of all stakeholders is in the best interest of every organization. The following diagram shows examples of data security requirements for various entities.

6.    Data Integration and Interoperability

Integration refers to connecting applications so that data from one system can be accessed by one or more others. However, integration involves a third party (a middleman or, in software terms, middleware) that translates the data and makes it “work” for the receiving system. In this scenario, information does not take a direct path from point A to point B.

Interoperability is real-time data exchange between systems without middleware. When systems are interoperable, they have the ability to not only share information, but to interpret incoming data and present it as it was received, preserving its original context.

In layman’s terms: interoperable systems speak the same language. On the other hand, integration is more like having a conversation through an interpreter.
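
The difference can be sketched in a few lines of Python. Everything here is hypothetical: the two record layouts, the field names, and the functions are invented for illustration and do not represent any real system or standard.

```python
# System A's native record layout (hypothetical field names and date format).
record_a = {"cust_name": "Jane Doe", "signup": "03/15/2020"}


def middleware_translate(record):
    """Integration: a middleman maps A's fields and formats to what B expects."""
    month, day, year = record["signup"].split("/")
    return {"customer_name": record["cust_name"], "signup_date": f"{year}-{month}-{day}"}


def system_b_ingest(record):
    """System B only understands its own schema."""
    return f"{record['customer_name']} signed up on {record['signup_date']}"


# Integration: the data reaches B only after passing through the translator.
integrated = system_b_ingest(middleware_translate(record_a))

# Interoperability: the sender already uses the shared schema, so B reads it directly.
shared_record = {"customer_name": "Jane Doe", "signup_date": "2020-03-15"}
interoperable = system_b_ingest(shared_record)
```

Both paths produce the same result in this toy case. The point is where the translation burden sits: in the integration path, any change to System A's format means updating the middleware as well.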

Data Integration and Interoperability (DII) solutions enable basic data management functions on which most organizations depend:

·      Data migration and conversion

·      Data consolidation into hubs or marts

·      Integration of vendor packages into an organization’s application portfolio

·      Data sharing between applications and across organizations

·      Distributing data across data stores and data centers

·      Archiving data

·      Managing data interfaces

·      Obtaining and ingesting external data

·      Integrating structured and unstructured data

·      Providing operational intelligence and management decision support

7.    Document and Content Management

Document and Content Management controls the capture, storage, access, and use of data and information stored outside relational databases. Its focus is on maintaining the integrity of and enabling access to documents and other unstructured or semi-structured information, which makes it roughly equivalent to data operations management for relational databases. However, it also has strategic drivers.

The primary business drivers for document and content management include regulatory compliance, the ability to respond to litigation and e-discovery requests, and business continuity requirements. Good records management can also help organizations become more efficient. Well-organized, searchable websites help improve customer and employee satisfaction.

8.    Master and Reference Data

All organizations have enterprise data on their products, assets, financials, employees, and customers, but where this data is housed can range from siloed databases to spreadsheets to old file cabinets. Bringing the data sources together in an actionable format accessible across the enterprise with consistent definitions and structured organization creates master data.

There are two kinds of reference data: external and internal. External reference data includes rarely changing norms like countries, currencies, languages, and units of measure. Internal reference data is where it can get complicated. It defines and structures master data, mapping it to your business processes. Internal reference data exists to turn other data into business information.
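
As a rough sketch of how reference data constrains master data, the snippet below checks a customer master record against one external and one internal reference list. The country codes are a small sample of ISO 3166-1 alpha-2 codes; the customer segments, the record layout, and the validate_customer function are hypothetical.

```python
# External reference data: a small sample of ISO 3166-1 alpha-2 country codes.
COUNTRY_CODES = {"US", "NL", "JP"}

# Internal reference data: hypothetical customer segments defined by the business.
CUSTOMER_SEGMENTS = {"retail", "wholesale"}


def validate_customer(record):
    """Return a list of problems found when checking a master record against reference data."""
    errors = []
    if record.get("country") not in COUNTRY_CODES:
        errors.append(f"unknown country code: {record.get('country')!r}")
    if record.get("segment") not in CUSTOMER_SEGMENTS:
        errors.append(f"unknown customer segment: {record.get('segment')!r}")
    return errors
```

A record such as {"name": "Acme", "country": "NL", "segment": "retail"} passes with no errors, while one that uses a free-text country like "Holland" is flagged.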

9.    Data Warehousing and Business Intelligence

The concept of the Data Warehouse emerged in the 1980s, as technology enabled organizations to integrate data from a range of sources into a common data model. The integration of data helps to provide insights into new opportunities and opens new possibilities for leveraging data to create organizational value.

The primary driver for data warehousing is to support operational functions, compliance requirements, and Business Intelligence (BI) activities (though not all BI activities depend on warehouse data). Business intelligence is the process by which enterprises use strategies and technologies for analyzing data, with the objective of improving strategic decision-making and providing a competitive advantage. BI combines business analytics, data mining, data visualization, data tools and infrastructure, and best practices to help organizations to make more data-driven decisions.

A Data Warehouse (DW) is a combination of two primary components: an integrated decision support database, and the related software programs used to collect, cleanse, transform, and store data from a variety of operational and external sources.
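
The collect, cleanse, transform, and store steps can be sketched with Python's built-in sqlite3 module standing in for the warehouse database. The two source lists, the sales_fact table, and all function names are assumptions made for this illustration, not part of any particular warehouse product.

```python
import sqlite3

# Two hypothetical operational sources with inconsistent formatting.
crm_rows = [
    {"customer": " Acme ", "amount": "100.50"},
    {"customer": "Globex", "amount": None},  # missing amount
]
web_rows = [{"customer": "acme", "amount": "20"}]


def cleanse(rows):
    """Drop rows with a missing amount and trim stray whitespace from names."""
    return [
        {"customer": r["customer"].strip(), "amount": r["amount"]}
        for r in rows
        if r["amount"] is not None
    ]


def transform(rows):
    """Standardize names to upper case and convert amounts to numbers."""
    return [(r["customer"].upper(), float(r["amount"])) for r in rows]


def store(conn, rows):
    """Write the prepared rows into a single integrated table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales_fact (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?)", rows)


def run_pipeline():
    """Collect from both sources, then cleanse, transform, and store."""
    conn = sqlite3.connect(":memory:")
    collected = crm_rows + web_rows
    store(conn, transform(cleanse(collected)))
    query = "SELECT customer, SUM(amount) FROM sales_fact GROUP BY customer ORDER BY customer"
    return conn.execute(query).fetchall()
```

In this toy run, the two differently spelled Acme records end up as one customer in the integrated table, and the Globex row is dropped because its amount is missing. A production pipeline would usually quarantine such rows for review rather than discard them silently.
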
10.    Metadata

By definition, Metadata is “data about data,” which doesn’t do justice to the term. The kind of information that can be classified as Metadata is wide-ranging. Metadata includes information about technical and business processes, data rules and constraints, and logical and physical data structures.

It describes the data itself (e.g., databases, data elements, data models), the concepts the data represents (e.g., business processes, application systems, software code, technology infrastructure), and the connections between the data and concepts. Metadata helps an organization understand its data, its systems, and its workflows.
Metadata is often categorized into three types: business, technical, and operational. These categories enable people to understand the range of information that falls under the overall umbrella of Metadata, as well as the functions through which Metadata is produced.
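
A minimal sketch of how the three categories might be recorded for a single data set is shown below. The DatasetMetadata class, its fields, and all sample values are hypothetical and only meant to show what kind of information typically lands in each category.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical container grouping metadata into the three categories."""
    name: str
    business: dict = field(default_factory=dict)     # meaning, ownership, rules
    technical: dict = field(default_factory=dict)    # physical structures and types
    operational: dict = field(default_factory=dict)  # how the data is processed


# Made-up sample values for illustration only.
orders_metadata = DatasetMetadata(
    name="customer_orders",
    business={"definition": "Orders placed by customers", "owner": "Sales Operations"},
    technical={"table": "customer_order", "columns": {"order_id": "INTEGER", "order_total": "REAL"}},
    operational={"last_loaded": "2024-01-01T02:00:00Z", "row_count": 1250},
)


def metadata_categories(meta):
    """List which of the three categories have been documented for a data set."""
    return [c for c in ("business", "technical", "operational") if getattr(meta, c)]
```

A helper like metadata_categories makes it easy to spot data sets that have, for example, technical metadata but no business definition or owner.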

11.    Data Quality

Data quality is a measurement of how effectively a data set can serve the specific needs of an organization. High-quality data is required for trusted decisions. It refers to the overall utility of a data set and its ability to be easily processed and analyzed for other uses. Popular data quality characteristics and dimensions include timeliness, completeness, uniqueness, consistency, validity, and accuracy.
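
Three of these dimensions are easy to turn into simple ratios. The sketch below is one possible way to do it; the sample records, the formulas, and the deliberately loose email pattern are assumptions, and real data quality tools define these measures in more detail.

```python
import re

# Made-up sample records with a missing value, a duplicate id, and a malformed email.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "not-an-email"},
    {"id": 3, "email": "c@example.com"},
]

# Deliberately simple pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def completeness(rows, field_name):
    """Share of rows where the field is present (not None)."""
    if not rows:
        return 1.0
    return sum(r[field_name] is not None for r in rows) / len(rows)


def uniqueness(rows, field_name):
    """Share of non-missing values that are distinct."""
    values = [r[field_name] for r in rows if r[field_name] is not None]
    if not values:
        return 1.0
    return len(set(values)) / len(values)


def validity(rows, field_name, pattern):
    """Share of non-missing values that fully match the expected pattern."""
    values = [r[field_name] for r in rows if r[field_name] is not None]
    if not values:
        return 1.0
    return sum(bool(pattern.fullmatch(v)) for v in values) / len(values)
```

On the sample records, email completeness is 0.75 (one of four is missing), id uniqueness is 0.75 (three distinct values among four), and email validity is two out of three non-missing values.
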
Conclusion

Now that you have a high-level overview of data management’s 11 components, you can see why data is one of the most valuable assets the modern business can possess, if used properly. There is plenty of reading available on the subject online, including the following references I used in this article.

Good luck with all your data management ventures! 

References

Diamantini, C., Giudice, P. L., Musarella, L., & Ursino, D. (2020, September 24). Fig. 1. the three kinds of metadata proposed by our model. ResearchGate. https://www.researchgate.net/figure/The-three-kinds-of-metadata-proposed-by-our-model_fig1_327314874  

Henderson, D., & Earley, S. (2017). DAMA-DMBOK: Data Management Body of Knowledge (2nd ed.). Technics Publications.

Lean Data. Data Quality. (n.d.). https://www.lean-data.nl/data-quality/  

Lithmee. (2019, June 15). What is the difference between master data and reference data. Pediaa.Com. https://pediaa.com/what-is-the-difference-between-master-data-and-reference-data/  

Roberts, B. (2021, April 29). Integration vs interoperability: What's the difference? Surgical Information Systems Blog. https://blog.sisfirst.com/integration-v-interoperability-what-is-the-difference  

Stedman, C. (2021, August 4). What is Data Architecture? A Data Management Blueprint. SearchDataManagement. https://www.techtarget.com/searchdatamanagement/definition/What-is-data-architecture-A-data-management-blueprint  

What is Data Modeling?: Definition, importance, & types: SAP insights. SAP. (n.d.). https://www.sap.com/insights/what-is-data-modeling.html  

https://www.informationweek.com/big-data-analytics/data-management-heads-into-major-transition
