What is data integration?
Data integration is a set of practices, tools, and architectural procedures that allow companies to consume, combine, and leverage all types of data. Along with consolidating data from disparate systems, the process ensures data is clean and free of errors to optimize its usefulness to the business.
Integrated data is especially helpful for organizations with a diverse and distributed landscape, with a range of data sources and assets generating information. In these instances, data is often siloed and disconnected from other business data, leaving the organization without a unified view of its business.
Data integration allows the business to achieve its true potential. Important decisions are based on accurate information, and new technology that relies on clean data can be implemented and optimized, helping the company to innovate and prosper.
Data integration history
Combining different data sources has been a problem since business systems started collecting data. It wasn’t until the early 1980s that computer scientists began designing systems that supported interoperability between heterogeneous databases.
One of the first data integration systems was launched by the University of Minnesota in 1991 – its objective was to make thousands of population databases interoperable. The system used a data warehousing approach that extracted, transformed, and loaded data from disparate sources into a view schema to make the data compatible.
Integrated data became a business imperative in the early 2010s with the advent of the Internet of Things (IoT). Suddenly a wide range of devices, applications, and platforms were generating enormous amounts of data – companies were drowning in it. Big Data became a mainstream concern, and businesses needed to find a way to harness the power of all that information.
Today companies of all sizes and industries use data integration to extract value from data that is stored across applications and platforms within the enterprise.
Data integration use cases
If a company generates data, that data can be integrated and used to build real-time insights that benefit the business. An organization that spans diverse geographies can consolidate views across its entire operation to understand what’s working and what’s not. A single view of the business makes it easier to understand cause and effect, allowing organizations to course-correct in real time and minimize risk.
Data integration allows companies to:
- Optimize analytics: Access, queue, or extract data from operational systems – commonly known as data warehousing – then transform and deliver it to the business in the form of trusted analytics.
- Drive consistency between operational applications: Ensure database-level consistency across applications (intra- and interenterprise), on a bi- and unidirectional basis.
- Share data outside your organization: Provide trusted data to external parties such as customers, suppliers, and partners.
- Orchestrate data services: Deploy all runtime data integration functionality as data services to ensure speed and accuracy.
- Support data migration and consolidation: Address data movement and transformation needs relative to data migration and consolidation, for example, when replacing legacy applications or migrating to new environments.
Benefits of integrated data
Data integration is a critical element of any organization’s overall data management strategy. It helps deliver the right information and brings the organization together – coordinating all activities and decisions in support of the enterprise’s purpose, which is to effectively and efficiently deliver quality products and services to customers.
After data is gathered from across the enterprise, it is cleansed and validated to ensure it is free of errors and inconsistencies before it is integrated into a single data set or orchestrated across numerous data sets – which is often referred to as a data fabric methodology.
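The cleanse, validate, and combine sequence described above can be sketched in a few lines of Python. This is a minimal illustration, not a production approach: the two source lists, the field names, and the helper functions `cleanse`, `is_valid`, and `integrate` are all hypothetical, and the email check is deliberately simple.

```python
import re

# Hypothetical records from two siloed systems; names and fields are illustrative.
CRM_RECORDS = [
    {"email": "Ana@Example.com ", "name": "Ana Silva"},
    {"email": "not-an-email", "name": "Broken Row"},
]
BILLING_RECORDS = [
    {"email": "ana@example.com", "plan": "pro"},
    {"email": "li@example.com", "plan": "basic"},
]

# Deliberately loose pattern: something@something.something, no whitespace.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def cleanse(record: dict) -> dict:
    """Normalize the shared key so the same customer matches across systems."""
    cleaned = dict(record)
    cleaned["email"] = record["email"].strip().lower()
    return cleaned


def is_valid(record: dict) -> bool:
    """Reject records whose key fails a basic format check."""
    return bool(EMAIL_PATTERN.match(record["email"]))


def integrate(*sources: list) -> dict:
    """Merge validated records from every source into one data set keyed by email."""
    unified: dict = {}
    for source in sources:
        for record in map(cleanse, source):
            if is_valid(record):
                unified.setdefault(record["email"], {}).update(record)
    return unified


if __name__ == "__main__":
    for email, profile in integrate(CRM_RECORDS, BILLING_RECORDS).items():
        print(email, profile)
```

In this sketch the malformed record is dropped, and the two entries for the same customer collapse into one profile that carries fields from both systems.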
A comprehensive, accurate source of integrated data helps a business support the innovative processes and technologies it needs to succeed. For example, artificial intelligence, machine learning, and Industry 4.0 initiatives would not be sustainable without access to large stores of integrated data.
Without data integration, data remains siloed within disparate applications and platforms. This hinders the operational and strategic capabilities of the organization. For example, important business decisions would be based on inaccurate analytics due to limited data sets.
See how these organizations are reaping the benefits of data integration:
- Evonik Industries: Active in more than 100 countries, Evonik Industries AG provides specialty chemicals that help improve the performance of everything from tires to mattresses. The company cut system administration tasks by 50% and streamlined the handling of complex materials data.
- The Costain Group: A partner to government agencies in the UK, the Costain Group consolidates and accesses siloed data to make transportation projects more efficient while lowering emissions and saving public funds. The group relies on data integration to access more of its data, providing faster data-driven decisions to maximize outcomes.
How does data integration work?
The most commonly used data integration models rely on an extract, transform, load (ETL) process.
- Extract: Data is moved from a source system to a temporary staging data repository where it is cleaned and the quality is assured.
- Transform: Data is structured and converted to match the format of the target system.
- Load: The structured data is loaded into a data warehouse or some other storage entity.
After the information is integrated, data analysis is carried out, providing business users with information they need to make informed decisions.
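The extract, transform, load steps above can be sketched with Python's standard library. This is an illustration under stated assumptions rather than a reference implementation: the CSV export, the `orders` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical export from a source system (illustrative data only).
SOURCE_CSV = """order_id,customer,amount,currency
1001, Acme Corp ,250.00,usd
1002,Globex,,usd
1003,Initech,99.50,USD
"""


def extract(raw_csv: str) -> list:
    """Extract: copy source rows into a staging list and drop rows with no amount."""
    staged = list(csv.DictReader(io.StringIO(raw_csv)))
    return [row for row in staged if row["amount"].strip()]


def transform(rows: list) -> list:
    """Transform: convert types and normalize values to match the target schema."""
    return [
        (
            int(row["order_id"]),
            row["customer"].strip(),
            float(row["amount"]),
            row["currency"].strip().upper(),
        )
        for row in rows
    ]


def load(rows: list, conn: sqlite3.Connection) -> None:
    """Load: write the structured rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_pipeline(raw_csv: str, conn: sqlite3.Connection) -> None:
    """Run the three stages in order."""
    load(transform(extract(raw_csv)), conn)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    run_pipeline(SOURCE_CSV, connection)
    for row in connection.execute("SELECT * FROM orders ORDER BY order_id"):
        print(row)
```

Keeping each stage as a separate function mirrors the staging idea: quality checks happen before any data is reshaped, and nothing reaches the target until it matches the target schema.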
Types of data integration
There are different types of data integration, often depending on the source and kind of data.
- Bulk/batch data movement: This is the most common style, involving data extraction, data transformation, and data load.
- Data replication: Data is copied from one database to a secondary database, with only changed data being replicated.
- Data virtualization: This is a single view of all data in a database using a virtual abstraction layer, providing real-time access to data regardless of location, source system, or type.
- Stream data integration: This is used for data created in a constant flow or stream where transformation must occur on the fly.
- Message-oriented data movement: Chunks of data are grouped into messages that are read by applications, with data exchange happening in real time.
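As one illustration of these styles, the data replication approach (copying only changed data to a secondary database) might look like the following sketch. The `Change` record format and the `apply_changes` function are hypothetical, and plain dictionaries stand in for the secondary database. Real replication tools capture changes from database logs and handle ordering, conflicts, and failures, none of which is shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    """One captured change from the primary database (hypothetical format)."""
    op: str                     # "upsert" or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row contents for an upsert


def apply_changes(replica: dict, changes: list) -> dict:
    """Replay only the changed rows against the secondary copy, in order."""
    for change in changes:
        if change.op == "upsert":
            replica[change.key] = change.row
        elif change.op == "delete":
            replica.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op}")
    return replica


if __name__ == "__main__":
    secondary = {1: {"name": "Ana"}, 2: {"name": "Li"}}
    change_log = [
        Change("upsert", 2, {"name": "Li Wei"}),
        Change("delete", 1),
        Change("upsert", 3, {"name": "Sam"}),
    ]
    print(apply_changes(secondary, change_log))
```

Rows that did not change are never touched, which is what keeps this style cheaper than recopying the full data set.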
The challenge is choosing the right data integration style for your unique landscape and business needs. Most organizations need more than one. Understanding how to bring these data integration tools together into a coherent whole is critical.
Data integration FAQs
Data intelligence is the value an organization gets from data integration. During the integration process, data is consumed, combined, and provisioned into data sets to satisfy the requirements of all business processes and applications that rely on access to data. Innovative and new technologies such as artificial intelligence and machine learning tools can analyze and transform these massive data sets into intelligent data insights, which are used to inform strategic business decisions.
Data orchestration extends beyond data integration, combining data discovery, preparation, integration, processing, and the connection of data across multiple and complex landscapes. Data integration is used for data in one place, while data orchestration processes and combines data in a flexible manner to enable new and/or improved business processes.
Big Data, as its name suggests, is composed of massive sets of unstructured data spread across disparate sources within and outside of the enterprise. Traditional databases and integration mechanisms are not equipped to handle these volumes. Instead, in-memory databases, software, and storage solutions built for Big Data are necessary to acquire, store, and analyze the data. These powerful components support the velocity needed to ensure Big Data insights are actionable and valuable.