Established data modeling methods were engineered in the late 1970s to design a single database, and they work well when a data model is instantiated as an individual database. Multiple disparate databases instantiated from the same data model, however, are best characterized as incompatible information silos. The reason for these information silos is simple: the interfaces between these databases were never adequately designed, because the established data modeling methods were never engineered for that purpose.
While databases often contain common data (addresses, organizations, persons, and products, for example), established data modeling methods do not adequately address designing compatibility between the common data of those databases. The resulting database incompatibility is an unintended artifact of established data modeling methods and is the essence of the data integration problem. This lack of compatibility between databases is the main reason the established data modeling methods needed to be enhanced.
Data Integration by Design methods are a simple enhancement of the established data modeling methods, which we have named Integrated Data Modeling. Integrated Data Modeling focuses on designing interoperability between "integrated" databases, providing the proper data compatibility and data management functionality. Integrated databases are a set of databases designed so that data from any database in the set may be dynamically combined in real time.
Take two data models that represent two different data systems and place them next to each other. You may label them "Information Silo 1" and "Information Silo 2". Within each information silo's data model, there exist data entities connected by data entity relationships which, when each data model is instantiated as a database and populated with data values, form a consistent database of data values. However, because they are information silos, there are no designed data entity relationships between the silos, and as such the data from one silo may never be reliably joined to data from the other silo.
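The unreliable join between silos can be sketched concretely. Below is a minimal, assumed example using two in-memory SQLite databases; the table names (`person`, `customer`) and sample values are illustrative, not from the post. Each silo is internally consistent, but with no designed relationship between them, the only possible "join" is an ad hoc match on common values, which quietly fails:

```python
import sqlite3

# Hypothetical Information Silo 1: a person table keyed by person_id.
silo1 = sqlite3.connect(":memory:")
silo1.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, full_name TEXT)")
silo1.execute("INSERT INTO person VALUES (1, 'J. Smith')")

# Hypothetical Information Silo 2: a customer table keyed by cust_no.
silo2 = sqlite3.connect(":memory:")
silo2.execute("CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT)")
silo2.execute("INSERT INTO customer VALUES (7, 'John Smith')")

# No designed relationship links person.person_id to customer.cust_no,
# so the only available join is a match on names. Here that match finds
# nothing, even though both rows describe the same real-world person.
names1 = {row[1] for row in silo1.execute("SELECT * FROM person")}
names2 = {row[1] for row in silo2.execute("SELECT * FROM customer")}
print(names1 & names2)  # prints set() -- the value-based "join" silently fails
```

The failure is the point: nothing in either schema records that the two rows refer to the same person, which is exactly the missing cross-silo relationship the post describes.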
This post responds to a question about whether data integration by design amounts to database centralization, and about its effect on business agility.
There have been some questions as to whether data integration by design actually represents a data integration solution. It does; in fact, it represents multiple data integration solutions and goes beyond what prior-art data integration methods provide.
When a database is recast, the added master data database tables are not dependent upon any of the database tables that existed before the recast. However, several of the previously existing tables will be modified, as they inherit foreign key columns from the added master data tables. If these inherited foreign key columns were mandatory, they would cause problems, because the application software would have no way of addressing them. The best option is to make all inherited foreign key columns optional. That way, the software application does not need to address these optional database columns and is unaffected. The optional columns are eventually populated when the master data is later reconciled.
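The recasting steps above can be sketched in code. This is a minimal illustration using SQLite; the table names (`orders`, `customer_master`) and columns are assumed for the example, not taken from the post. The key point it demonstrates is that the inherited foreign key column is nullable, so legacy application code that never mentions it keeps working:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# A pre-existing application table, created before the recast.
cur.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, ship_to TEXT)")
cur.execute("INSERT INTO orders (ship_to) VALUES ('123 Main St')")

# Recast step 1: add a master data table. It depends on nothing pre-existing.
cur.execute("CREATE TABLE customer_master (customer_id INTEGER PRIMARY KEY, name TEXT)")

# Recast step 2: the existing table inherits an OPTIONAL (nullable) foreign
# key column referencing the master data table.
cur.execute(
    "ALTER TABLE orders ADD COLUMN customer_id INTEGER "
    "REFERENCES customer_master(customer_id)"
)

# Legacy application inserts still succeed -- they never mention the column.
cur.execute("INSERT INTO orders (ship_to) VALUES ('456 Oak Ave')")

# Later, master data reconciliation populates the optional column row by row.
cur.execute("INSERT INTO customer_master (name) VALUES ('Acme Corp')")
cur.execute("UPDATE orders SET customer_id = 1 WHERE ship_to = '123 Main St'")

rows = cur.execute(
    "SELECT order_id, ship_to, customer_id FROM orders ORDER BY order_id"
).fetchall()
print(rows)  # one order reconciled to master data, the other still NULL
```

Because the new column defaults to NULL, the recast is non-breaking: reconciliation can proceed incrementally while the unmodified application continues to read and write the table.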
Entity-Relationship data model development methods are somewhat antiquated. They were developed in the late 1970s and have not evolved to keep pace with business needs in the area of data integration.
It is the natural state of all data to be integrated! It is the development of disparate data models that unintentionally isolates data sets into disparate databases.
Using data federation, and given two disparate databases populated with disparate data sets, how does one ensure that a result set combining data from both databases is valid and accurate?
When data modeling for an operational data system, what consideration is given, while developing the logical data model, to supporting data integration with existing data systems? When I learned data modeling many years ago, integrating the developing data model with other existing data models was not an issue to consider. Has this changed, or is the integration of data models still not a consideration in the methodology?
Is data naturally integrated, or does it naturally exist as islands of disparate data? Our investigation into the natural state of data began with our universe of data, which is composed of all metadata and all data from all data sources. Data models are perhaps our best depiction of this universe of data. Within each well-developed data model, every data entity is connected by entity relationships to other data entities. No data entity or cluster of data entities is isolated or partitioned from any other. There are no islands of disparate data represented within a single data model, and there is no sign of data isolation.
Our universe of data is composed of all metadata and all data from all data sources. Have you ever envisioned the major properties of this universe of data? Well, don't feel too bad if you haven't; you certainly are not alone! However, having a universe of integrated data is a very big deal, and the boundary of master data plays an extremely important part.