Master Data Management
In computing, master data management (MDM) comprises a set of processes and tools that consistently defines and manages the non-transactional data entities of an organization (which may include reference data). MDM has the objective of providing processes for collecting, aggregating, matching, consolidating, quality-assuring, persisting and distributing such data throughout an organization to ensure consistency and control in the ongoing maintenance and application use of this information.
The term recalls the concept of a master file from an earlier computing era. MDM is similar to, and some would say the same as, virtual or federated database management.
Issues
At a basic level, MDM seeks to ensure that an organization does not use multiple (potentially inconsistent) versions of the same master data in different parts of its operations, which can occur in large organizations. A common example of poor MDM is a bank that begins sending mortgage solicitations to a customer who has already taken out a mortgage with it, ignoring the fact that the person already has a mortgage account relationship with the bank. This happens because the customer information used by the marketing section within the bank lacks integration with the customer information used by the customer services section of the bank. Thus the two groups remain unaware that an existing customer is also considered a sales lead.
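The bank scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any real banking system: the record layouts, the shared `master_id` field, and the function name `leads_without_mortgage` are all hypothetical. The point is that once both departments refer to the same customer identifier, marketing can exclude customers who already hold the product.

```python
# Hypothetical records; the field names and the shared "master_id"
# are assumptions made for this illustration.
customer_service_accounts = [
    {"master_id": "C001", "name": "Ana Diaz", "product": "mortgage"},
    {"master_id": "C002", "name": "Ben Okoro", "product": "savings"},
]

marketing_leads = [
    {"master_id": "C001", "name": "Ana Diaz"},
    {"master_id": "C003", "name": "Chen Wei"},
    {"master_id": "C002", "name": "Ben Okoro"},
]


def leads_without_mortgage(leads, accounts):
    """Return the leads whose master ID has no mortgage account."""
    mortgage_holders = {
        account["master_id"]
        for account in accounts
        if account["product"] == "mortgage"
    }
    return [lead for lead in leads if lead["master_id"] not in mortgage_holders]


targets = leads_without_mortgage(marketing_leads, customer_service_accounts)
print([lead["name"] for lead in targets])
```

Without a shared identifier, the two lists could only be compared by fuzzy matching on names and addresses, which is where the matching and consolidation processes of MDM come in.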
Other problems include poor data quality, inconsistent classification and identification of data, and data-reconciliation issues.
One of the most common reasons some large corporations experience massive issues with MDM is growth through mergers or acquisitions. Two organizations which merge will typically create an entity with duplicate master data (since each likely had at least one master database of its own prior to the merger). Ideally, database administrators resolve this problem through deduplication of the master data as part of the merger. In practice, however, reconciling several master data systems can present difficulties because of the dependencies that existing applications have on the master databases. As a result, more often than not the two systems do not fully merge, but remain separate, with a special reconciliation process defined that ensures consistency between the data stored in the two systems. Over time, however, as further mergers and acquisitions occur, the problem multiplies, more and more master databases appear, and data-reconciliation processes become extremely complex, and consequently unmanageable and unreliable. Because of this trend, one can find organizations with 10, 15, or even as many as 100 separate, poorly integrated master databases, which can cause serious operational problems in the areas of customer satisfaction, operational efficiency, decision support, and regulatory compliance.
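As a rough sketch of the deduplication step described above, the following Python merges the master lists of two merged companies. The matching rule (name plus postal code, both crudely normalized), the field names, and the helpers `match_key` and `merge_masters` are assumptions made for illustration; real matching rules are usually far more elaborate.

```python
import re


def match_key(record):
    """Build a crude matching key from name and postal code (an assumed rule)."""
    name = re.sub(r"[^a-z0-9]", "", record["name"].lower())
    postcode = record["postcode"].replace(" ", "").upper()
    return (name, postcode)


def merge_masters(*sources):
    """Consolidate records from several master lists, one record per key.

    The first non-empty value seen for a field wins; later sources only
    fill in fields that are still empty.
    """
    consolidated = {}
    for source in sources:
        for record in source:
            merged = consolidated.setdefault(match_key(record), {})
            for field_name, value in record.items():
                if value and not merged.get(field_name):
                    merged[field_name] = value
    return list(consolidated.values())


# Hypothetical master data from the two merged companies.
company_a = [{"name": "Acme Ltd.", "postcode": "ab1 2cd", "phone": ""}]
company_b = [
    {"name": "ACME LTD", "postcode": "AB12CD", "phone": "555-0100"},
    {"name": "Birch & Co", "postcode": "EF3 4GH", "phone": "555-0199"},
]

merged = merge_masters(company_a, company_b)
print(len(merged), merged[0]["phone"])
```

The sketch ignores the harder part noted in the text: existing applications still depend on the original identifiers in each source system, which is why full consolidation is often postponed in favor of a reconciliation process.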
Solutions
Processes commonly seen in MDM solutions include source identification, data collection, data transformation, normalization, rule administration, error detection and correction, data consolidation, data storage, data distribution, data classification, taxonomy services, item master creation, schema mapping, product codification, data enrichment and data governance.
The tools include data networks, file systems, a data warehouse, data marts, an operational data store, data mining, data analysis, data virtualization, data federation and data visualization. One of the newest tools, virtual master data management (also called virtual MDM), utilizes data virtualization and a persistent metadata server to implement a multi-level automated MDM hierarchy.
The selection of entities considered for MDM depends somewhat on the nature of an organization. In the common case of commercial enterprises, MDM may apply to such entities as customer (Customer Data Integration), product (Product Information Management), employee, and vendor. MDM processes identify the sources from which to collect descriptions of these entities. In the course of transformation and normalization, administrators adapt descriptions to conform to standard formats and data domains, making it possible to remove duplicate instances of any entity. Such processes generally result in an organizational MDM repository, from which all requests for a certain entity instance produce the same description, irrespective of the originating sources and the requesting destinations.
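The idea of a repository that returns the same description regardless of the originating source can be sketched as follows. This is a minimal illustration, not a reference design: the `MasterRepository` class, the `normalize` rules, and the source names `crm` and `billing` are hypothetical.

```python
from dataclasses import dataclass, field


def normalize(description):
    """Adapt a source description to assumed standard formats."""
    return {
        "name": " ".join(description["name"].split()).title(),
        "country": description["country"].strip().upper(),
    }


@dataclass
class MasterRepository:
    # Master key -> the single agreed description of the entity.
    records: dict = field(default_factory=dict)
    # (source system, local identifier) -> master key.
    xref: dict = field(default_factory=dict)

    def register(self, source, local_id, description):
        """Normalize a source description and link it to a master record."""
        clean = normalize(description)
        key = (clean["name"], clean["country"])
        self.records.setdefault(key, clean)
        self.xref[(source, local_id)] = key

    def lookup(self, source, local_id):
        """Return the master description for a source-specific identifier."""
        return self.records[self.xref[(source, local_id)]]


repo = MasterRepository()
repo.register("crm", 17, {"name": "  globex   corporation", "country": "us "})
repo.register("billing", "G-204", {"name": "Globex Corporation", "country": "US"})

print(repo.lookup("crm", 17) == repo.lookup("billing", "G-204"))
```

Keeping a cross-reference from each source identifier to the master key is one possible design choice; it lets existing applications keep their own identifiers while still resolving to a single description.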
Criticism of MDM solutions
The value of MDM and current approaches to it have come under criticism, with some parties claiming large costs and low return on investment from major MDM solution providers.
See also
- Reference data
- Master data
- Data steward
- Data visualization
- Customer data integration
- Data Integration
- Information as a service
- Product information management
- Identity resolution
- Enterprise Information Integration
- Linked data
- Semantic Web
- Data governance
- Operational data store
- Form, fit and function