What is data mesh?
Data mesh is an approach to data management that uses a distributed architectural framework.
Data mesh overview
Data mesh represents a new way of looking at information. It is born from the growing concept that data is actually itself a product, a tool, a means to an end—not simply something businesses gather and analyze later in a backward-looking attempt to understand things that have already happened.
Data mesh definition
Data mesh is an approach to data management that uses a distributed architectural framework. In other words, it spreads ownership and responsibility for specific datasets across the business to users with the specialist expertise to understand what that data means and how to make the best use of it.
Data mesh architecture connects and draws data from various sources like data lakes and warehouses. It then distributes the relevant datasets to the appropriate human experts and domain teams across the business. Essentially, a voluminous jumble of data in a central data lake is sorted and distributed into manageable chunks to those best suited to understand and leverage it.
Data mesh origins
Data mesh was introduced by Zhamak Dehghani in 2019 in response to the challenges of scaling data architectures in large, complex organizations. The core idea behind data mesh is to decentralize data ownership and architecture, treating data as a product and assigning responsibility to domain-oriented teams. Data mesh combines principles from domain-driven design, product thinking, and self-serve infrastructure, enabling organizations to scale data systems without creating monolithic bottlenecks.
Centralized data management models often fail in large organizations due to:
- Bottlenecks in delivery: A single central team becomes overloaded, slowing down data access and analytics.
- Ownership gaps: No clear accountability for data quality across domains causes inconsistent standards and trust issues.
- Scalability issues: As data volume and complexity grow, centralized systems struggle to scale without massive overhead.
- Poor domain knowledge: Central teams lack deep understanding of business domains, leading to low-quality or misaligned data products.
- Limited agility: Changes requiring coordination through one team slow down responsiveness to evolving business needs.
Benefits of data mesh
Legacy databases and limited data management infrastructures have contributed to the sense that data is something to be held in a single vault and meted out at the discretion of a few data managers. Now, data is the fuel that drives your business; it should be given freely to those subject specialists who best know how to make it work and drive profit in competitive times.
The main advantages of data mesh architecture can be summarized in three categories:
Scalability and agility
Increased data accessibility: Data mesh ensures that all the right people across your organization can access the data they need—to be the absolute best at their jobs.
Customizable data pipelines and processes: Many of the best and potentially most profitable projects get shelved due to the enormous hassle of curating the unique and customized datasets needed to achieve success. With a data mesh, teams can quickly access and test new project models without the traditional loss of time or resources.
Reduced bottlenecks: This is an obvious win-win for both IT teams and data owners. Furthermore, by reducing a source of frustration and irritation, businesses can help break down silos that stand in the way of healthy business development.
Quality and trust
Improved analytics capabilities: When organizations see data as a product to be used every day, teams start to take a data-first approach to planning and strategy. This leads to a reduction in errors and a more objective, less opinion-driven approach to business development.
Cross-domain collaboration and reuse
Reduced strain on central data management teams: This means not only reducing backlogs and frustration but also freeing up countless hours for your talented IT teams to devote to more specialized, interesting, and profitable pursuits.
By decentralizing ownership and treating data as a product, data mesh empowers organizations to move faster, build trust in insights, and scale seamlessly across domains.
Core principles of data mesh
When we talk about data lakes and data mesh, we’re essentially talking about big data. What makes data “big” is not simply its huge volume. Among other criteria, big data is also defined by being complex, variable, rapidly generated, and unstructured.
A linear database is like a spreadsheet: it has columns and rows and immutable categories into which all the data components must fit. Some of the data generated from machinery, sensors, and industrial sources is structured and fits neatly into a linear database. No matter how much data volume you have to deal with, if it’s 100% structured it doesn’t meet big data criteria and can be housed in a linear database, making it relatively straightforward to filter and extract.
But increasingly, modern big data is unstructured and consists of visual components, open-ended text, and even video and rich media. This crucial data can comprise thousands of terabytes of information for many companies, and it simply can’t be stored in a standard linear database.
Enter the data lake. As big data volumes began to increase, data lakes were developed as a place in which complex data could be stored and accessed from a central repository in its raw format. While data lakes represent an excellent solution to the big data problem, they nonetheless have weaknesses. Data lakes lack certain analytic features, making them dependent on other services for retrieval, indexing, transformation, querying, and analytics functionality.
Four data mesh principles address the challenges presented by data lakes:
1. Domain ownership
Ownership in data lakes is complex to define when too many players generate and access data. In the absence of clearly defined roles and responsibilities, the same set of data can be managed differently by different parties, creating inconsistencies that make it difficult to use. Likewise, other data ends up being neglected when it is not actively managed by those who will ultimately be using it.
Data mesh architecture solves this by decentralizing ownership. It ensures data governance is clearly distributed by domain so that each team or domain expert governs the data they produce and use. To back this up, data meshes also use a federated governance structure to allow for central control of data modeling, security policies, and compliance. Data mesh ownership creates accountability and improves data usability.
2. Data as a product
Data lakes can fail to ensure data quality when the volume of data becomes too large or when central data managers themselves do not understand it. Data mesh architecture fundamentally treats data as a valuable product, which puts the quality and completeness of data at the forefront of data management. Each team knows the most important criteria and insights it wishes to extract from the data it is collecting. By integrating these criteria and priorities into the architecture, data mesh can help to ensure the continuous and prioritized delivery of clean, fresh, and complete data, even when larger datasets are involved. And of course, when machine learning algorithms are applied, these criteria and resultant datasets become increasingly accurate and useful over time.
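The data-as-a-product idea can be made concrete with a minimal descriptor that a domain team might publish alongside its dataset. This is an illustrative sketch, not a standard; all field and team names below are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class DataProduct:
    """Illustrative metadata a domain team publishes with its dataset."""
    name: str                  # product name, discoverable in a catalog
    domain: str                # owning domain, e.g. "customer"
    owner: str                 # accountable team for quality and SLAs
    freshness_sla_hours: int   # maximum agreed age of the data
    schema: dict               # column name -> type: the product's contract

# A hypothetical data product published by the customer domain
product = DataProduct(
    name="customer_profiles",
    domain="customer",
    owner="customer-data-team",
    freshness_sla_hours=24,
    schema={"customer_id": "string", "segment": "string", "ltv": "float"},
)

print(product.owner)  # the team accountable for this product's quality
```

Publishing this kind of descriptor is what turns a raw dataset into a product: consumers can see who owns it, what it contains, and what freshness they can rely on.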
3. Self-serve data platform
Data lakes can create bottlenecks because of their centralized architecture and traditionally difficult data retrieval processes and protocols. This typically means that the control of a large amount of consolidated data comes down to a single IT or data management team. And, as volumes of data (and demand for its retrieval) increase, these IT teams get over-taxed.
Furthermore, the data must be reviewed and structured properly to ensure compliance and adherence to data governance principles. When facing undue pressure, there can be a tendency to rush through these compliance stages, which generates potential risk and loss to the company. Data mesh principles address this by enabling a self-serve data platform. It gives access and control to authorized specialized users who have a greater vested interest in the data—all while employing stringent, baked-in security protocols. This reduces bottlenecks and accelerates data delivery.
4. Federated governance
While decentralization is key, organizations can’t abandon governance. Data mesh uses a federated governance model to balance autonomy with consistency. This means domains manage their own data products, but must adhere to shared standards for security, compliance, and interoperability across the organization. This hybrid approach of data mesh governance ensures agility without sacrificing trust or regulatory adherence.
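The federated model can be sketched in a few lines: global policies are defined once, and each domain validates its own data products against them while retaining autonomy over everything the policies do not cover. The policy names below are assumptions made for illustration.

```python
# Hypothetical global standards set by the federated governance body
GLOBAL_POLICIES = {
    "required_metadata": {"owner", "domain", "classification"},
    "allowed_classifications": {"public", "internal", "restricted"},
}

def violations(product_metadata: dict) -> list[str]:
    """Return global-policy violations for a data product (empty = compliant)."""
    problems = []
    missing = GLOBAL_POLICIES["required_metadata"] - product_metadata.keys()
    if missing:
        problems.append(f"missing metadata: {sorted(missing)}")
    cls = product_metadata.get("classification")
    if cls not in GLOBAL_POLICIES["allowed_classifications"]:
        problems.append(f"unknown classification: {cls!r}")
    return problems

compliant = {"owner": "hr-team", "domain": "hr", "classification": "restricted"}
noncompliant = {"owner": "hr-team"}

print(violations(compliant))      # no violations
print(violations(noncompliant))   # missing fields and classification
```

The domain decides what goes into its product; the shared check only enforces the minimum standards every domain agreed to.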
While data mesh challenges exist, decentralized and democratized data management architecture has made businesses smarter, more agile, and more accurate. How? By ensuring the right data is immediately available to the right people, wherever and whenever they need it. Data mesh makes data-as-a-product an actual reality, reducing barriers and prioritizing the value of information so that teams can get faster, unobstructed access to essential data.
Data mesh architecture and frameworks
We've discussed how data mesh is a decentralized form of data architecture that treats data as an essential business management tool. And importantly, how independent teams are responsible for handling the data within their domains of work and expertise, while still ensuring compliance with centrally determined data management practices. This change in mindset is at the core of data mesh.
A bird’s-eye view of a data mesh architecture
In a data mesh, domains are the core producers and consumers of data, each owning its data as a product to ensure quality and relevance. The self-serve platform provides the infrastructure for publishing, discovering, and consuming these data products, along with automated security and compliance features. Governance operates in a federated model, balancing global standards for interoperability and security with local autonomy, so domains can innovate while maintaining trust and consistency across the organization.
To better understand how the data mesh architecture fits together, let’s dive into its three main components.
Data sources
These represent the repository—like a data lake—into which the primary raw data is being fed. Whether it’s collected from cloud IIoT networks, customer feedback forms, or scraped web data, this is the raw input data that users will reference and process as needed across the network. While a data lake approach would funnel all this data into one central place, the data mesh methodology instead distributes the responsibility for intake, storage, processing, and extraction of this raw data within a series of responsible domains.
Data mesh infrastructure
Information is not solely isolated within individual departmental domains but can also be shared at will across the organization’s operational network while remaining compliant with established data governance guidelines. This is a direct result of two of the key pillars of data mesh: A self-serve data platform and federated governance. The self-serve data platform provides the tooling and infrastructure needed by each domain to universally ingest, transform, process, and serve their data. Meanwhile, the federated governance principles ensure standardization across an organization, allowing for effortless interoperability of data between all domain teams.
Data owners
As the final component of a data mesh, data owners are responsible for applying the compliance, governance, and categorization protocols for their departments’ data. For example, HR files must be stored using certain security protocols, must not be used for unapproved purposes, and must only be released to authorized individuals. Of course, each department will have categories and types of data unique to its own purposes. In a data lake system, IT teams must grapple with all these different protocols and categories for all the different data owners who have deposited data into the lake. Data mesh architecture, by contrast, gives domain owners full authority and control over these matters: after all, who better than subject area experts to manage their own data and ensure it meets quality standards?
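A domain-owned release rule can be as simple as an allow-list the data owner maintains, instead of a central IT team arbitrating every request. This is a minimal sketch; the role and dataset names are invented for the example.

```python
# Hypothetical access rules maintained by the HR data owner
ACCESS_RULES = {
    "hr.salaries": {"hr_manager", "payroll_analyst"},
    "hr.headcount": {"hr_manager", "payroll_analyst", "finance_analyst"},
}

def can_read(role: str, dataset: str) -> bool:
    """True if the role is on the owner's allow-list for the dataset."""
    return role in ACCESS_RULES.get(dataset, set())

print(can_read("finance_analyst", "hr.headcount"))  # allowed
print(can_read("finance_analyst", "hr.salaries"))   # denied
```

Because the HR domain owns this rule, it can tighten or relax access as regulations change, without routing the decision through a central queue.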
The data mesh operating model
The data mesh operating model brings together people, processes, and technology to enable decentralized data management at scale. This collaboration ensures that data flows seamlessly across the organization, fostering trust, agility, and reuse without relying on a single centralized team. Data mesh enables interoperability and discoverability by enforcing shared standards and providing a common platform, consistent formats and search terms, and governance rules for publishing and consuming data products. Data mesh tools like data catalogs and registries allow teams to quickly find, securely access, and use data products across the organization.
Think of a data mesh as a modern city: Each neighborhood (domain) manages its own utilities and services—like water, electricity, and waste—because they know their local needs best. The city provides shared infrastructure, such as roads and public transportation (self-serve platform) and safety standards (governance), so neighborhoods can connect, access city resources, and collaborate without chaos. This way, resources flow freely across the city, everyone follows common rules, and innovation thrives locally while the whole city functions smoothly.
Data mesh in practice: Examples and use cases
For data management solutions to evolve and become more successful, they have to be usable and relevant for a wide range of applications and operations. As data mesh architecture and user friendliness improve, we see a growing range of business functions that organizations can enhance with a secure and distributed approach to data as a product and a tool.
Let’s explore some common data mesh business use cases.
Sales
For sales teams, it all comes down to acquiring, nurturing, and closing leads. The more time your sales team members spend at their desks doing administrative tasks, the less time they have to build relationships with new customers. With data mesh architecture, sales team users don’t need to be data management and retrieval experts to have the most powerful and relevant datasets and combinations at their fingertips. When sales departments have all the right data to analyze, it translates into more actionable insights and strategies.
Sales data mesh example: Regional or product-specific sales teams can own their CRM and pipeline data domains, enabling accurate forecasting and real-time dashboards without waiting on a central IT team.
Supply chain and logistics
Modern supply chains are vulnerable to an enormous range of disruptions. A competitive edge comes when companies can pivot quickly and respond to both threats and opportunities with equal agility. Today’s global supply chain data is coming in thick and fast—from customer feedback, to IIoT networks, and digital twins. When experienced and savvy supply chain managers are themselves able to curate and drill into any of those datasets in real time, businesses get a powerful source of insight and acumen.
Supply chain data mesh example: Supply chain optimization requires real-time visibility into inventory levels, supplier performance, and logistics data. Data mesh gives each domain—procurement, warehousing, transportation—ownership of its data products, enabling faster decisions and cost-efficient operations.
Manufacturing
As part of the supply chain, a company’s manufacturing operations are equally vulnerable to rapid market shifts and volatile customer demands. In the past, design and R&D teams would have to rely on historical customer data, fed to them from other departments. Today, the data mesh brings live data access to users behind the drafting table, on the R&D and testing teams, and all the way to the manufacturing floor. Real-time customer feedback can inform product development in an instant, and up-to-the-minute intel from IIoT networks and digital simulations can help factories run safer, faster, and more efficiently.
Manufacturing data mesh example: Plant-level teams can own sensor and machine performance data, enabling predictive maintenance and reducing downtime through decentralized analytics.
Marketing
Today, customer demands and expectations are evolving and growing at an unprecedented pace, shaping the future of entire markets. A single brand typically has myriad consumer touchpoints across social media, targeted digital ads, and online and omnichannel shopping portals. The current market sees the growing desire for rapid customization, shorter product lifecycles, and enormous levels of choice and competition. To understand and get ahead of these trends, modern marketers need real-time and simultaneous access to a wide variety of datasets. In the past, this has meant requesting (and waiting for) this data from other departments. With a data mesh setup, however, marketers can curate and access this data in the moment, on their own terms.
Marketing data mesh example: Building a customer 360 view requires integrating data from multiple channels like email, social, and paid ads. Data mesh enables each channel to own its data product, ensuring accurate, real-time insights for personalized campaigns and better customer experiences.
Human resources
HR teams must manage large amounts of extremely complex and sensitive data. And with the growing trend toward remote and hybrid workplaces, that data is getting more complicated and geographically diverse every day. Not to mention the ever-changing set of compliance and legal issues that HR teams must so urgently stay on top of. From hire to retire, HR leaders must be able to validate, assess, and analyze some of the most broadly disparate datasets in any organization. Data mesh architecture allows for the appropriate security protocols and tightly restricted access. At the same time, it enables authorized HR users to access data and information quickly and without dependence upon complex internal protocols and multi-departmental bureaucracy.
HR data mesh example: Recruiting, payroll, and performance management teams can govern their own data domains, improving compliance and enabling real-time workforce analytics for strategic decision-making.
Finance
As with HR, finance and accounting teams are also responsible for enormously crucial and sensitive data. Modern ERP systems are revolutionizing finance, using in-memory database technology to customize up-to-the-moment reports, analyses, and projections. Yet even when finance teams are using the best databases and ERPs, they often still face obstacles due to longstanding and rigid cultures, heavy silos, and bureaucratic, old-school processes. Data mesh architecture brings a fundamental shift in how finance data is viewed and managed. By empowering teams to own and revise their own aging data processes, it can even shake up stagnant thinking.
Finance data mesh example: Financial planning teams can own revenue, expense, and investment data domains, ensuring accurate forecasting and agile scenario modeling without relying on a single central team.
It's clear that data mesh is not just another buzzword; it is a data strategy trend that needs to be taken seriously. Companies of all sizes and industries are adopting data mesh as they look for ways to use data to create insights and value.
Data mesh alternatives
While data mesh offers a decentralized approach to data management, it is not the only option. Traditional architectures such as data lakes and data warehouses remain widely used for centralizing and storing large volumes of data, increasingly complemented by data lakehouses, which combine structured and unstructured data capabilities. Other models, like data fabric, focus on creating a unified layer for data integration and orchestration across diverse systems. Each alternative addresses scalability, governance, and accessibility differently, making the choice dependent on organizational needs and maturity.
Let’s look at the data mesh alternatives and how they compare.
- Data mesh vs. data lake/lakehouse: A data lake (or lakehouse) centralizes raw and structured data in a single repository; data mesh distributes ownership of that data to domain teams, often using lakes or lakehouses as the underlying storage.
- Data mesh vs. data warehouse: A data warehouse centralizes structured, curated data for analytics under a single team; data mesh decentralizes responsibility for curation and quality to the domains that know the data best.
- Data mesh vs. data fabric: Data fabric is a technology-centric approach that builds a unified integration and orchestration layer across systems; data mesh is an organizational approach centered on domain ownership and data as a product.
Implementing data mesh
Implementing a data mesh requires a strategic approach that balances decentralization with shared standards. Here are the key data mesh steps:
- Identify pilot domains: Start small by selecting two or three domains with clear business value and strong data maturity. These teams will serve as early adopters, proving the data mesh model before scaling across the organization.
- Establish the platform: Build a self-serve data platform that provides common tooling for publishing, discovering, and consuming data products. This includes data catalogs, APIs, and automated security features to reduce friction for domain teams.
- Define federated governance: Create governance policies that enforce global standards for security, compliance, and interoperability while allowing domains autonomy. Governance should include clear roles, data product definitions, and quality expectations.
Anti-patterns to avoid
When data mesh is implemented without regard for natural organizational patterns, it can lead to confusion and discord. An anti-pattern in data mesh is a recurring approach or practice that seems helpful but ultimately undermines the core principles of the architecture. Anti-patterns to avoid include:
- Treating data mesh as just another centralized data lake.
- Ignoring cultural change—technology alone won’t solve ownership issues.
- Over-engineering the platform before proving business value.
- Lack of clear accountability for data quality.
- Scaling too quickly without validating the data mesh model in pilot domains.
Five best practices for data mesh
- Start small and iterate: Use pilot domains to refine processes before scaling.
- Treat data as a product: Define ownership, SLAs, and usability standards for every dataset.
- Invest in shared tooling: Make publishing and discovery easy for domain teams.
- Embed governance early: Balance autonomy with compliance from the start.
- Focus on business outcomes: Align data products with measurable value, not just technical goals.
By combining domain ownership, a robust platform, and federated governance, organizations can improve agility, trust, and cross-domain collaboration—without the bottlenecks of traditional centralized models.
Measurement and metrics
Evaluating success requires data mesh metrics that balance technical performance with business outcomes. These metrics can include:
- Data product quality SLOs/SLAs: Essential, but must be tailored to each domain’s context rather than applied uniformly. Example data product KPIs include:
  - Data freshness: Percentage of data products updated within the agreed time window—for example, hourly or daily
  - Completeness: Percentage of required fields populated across datasets
  - Availability: Uptime of data products—for example, 99.9%
- Consumer adoption and reuse: A strong indicator of value, but measuring it accurately often involves tracking usage patterns and feedback across teams. Example adoption and reuse KPIs include:
  - Number of unique consumers per data product
  - Cross-domain reuse rate: Percentage of data products consumed by multiple domains
  - Consumer satisfaction score from surveys or feedback
- Time-to-insight and cost-to-serve: Highlight efficiency gains compared to centralized models, though these improvements depend on organizational maturity and baseline processes. Example efficiency KPIs include:
  - Average time from data request to actionable insight
  - Reduction in operational cost compared to a centralized model
  - Percentage decrease in the backlog of data requests
- Competitive advantage: Focus on areas where competitors struggle and use data mesh principles to outperform them. Example competitive KPIs include:
  - Number of identified competitor weaknesses addressed through data product capabilities—for example, improved discoverability or faster data access
  - Time-to-market advantage for new data products versus competitors
  - Increase in self-service adoption rate compared to competitor estimates
Together, these metrics provide directional insight into whether data mesh is delivering agility, trust, and scalability without assuming one-size-fits-all benchmarks.
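Two of the KPIs above, freshness and completeness, can be computed with a few lines of code. The sketch below uses made-up data; the SLA windows and required fields are assumptions for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed "now" so the example is deterministic
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)

# Hypothetical data products with their agreed freshness windows
products = [
    {"last_updated": now - timedelta(hours=2),  "window_hours": 24},  # fresh
    {"last_updated": now - timedelta(hours=30), "window_hours": 24},  # stale
]

def freshness_rate(products, now):
    """Share of data products updated within their agreed time window."""
    fresh = sum(
        1 for p in products
        if now - p["last_updated"] <= timedelta(hours=p["window_hours"])
    )
    return fresh / len(products)

def completeness(record, required_fields):
    """Share of required fields that are actually populated in a record."""
    populated = sum(1 for f in required_fields if record.get(f) not in (None, ""))
    return populated / len(required_fields)

print(freshness_rate(products, now))                                  # 0.5
print(completeness({"id": "c1", "segment": ""}, ["id", "segment"]))   # 0.5
```

In practice these numbers would be computed per data product and tracked over time, so each domain can see whether it is meeting the SLAs it published.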
Data Mesh FAQs
What is interoperability in a data mesh?
Interoperability is defined as the ability of a system or a product to work with other systems or products without special effort on the part of the user. As TechTarget notes, it helps organizations achieve higher efficiency and a more holistic view of information and data.
In the context of data, interoperability goes beyond simple connectivity to include discoverability (making data products easily found across domains through catalogs or registries); contracts (clear, machine-readable agreements on data schemas, APIs, and SLAs to help ensure consistent consumption); and shared standards (common governance, metadata, and security practices for frictionless data exchange among domains).
An example of interoperability is when the Customer domain publishes a data product with customer profiles, then the Sales domain consumes this data to enrich pipeline analytics. Interoperability ensures the Sales team can discover the customer data product in a catalog, rely on its contract for schema and quality guarantees, and integrate it using shared standards without manual work.
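The Customer-to-Sales flow above can be sketched with an invented in-memory "catalog": the consuming domain discovers the data product, reads its published contract, and validates records against it before use. Everything here is illustrative; real catalogs expose this through APIs and registries.

```python
# Hypothetical shared catalog entry published by the Customer domain
CATALOG = {
    "customer.profiles": {
        "owner": "customer-domain",
        "contract": {"customer_id": str, "segment": str},  # schema contract
    }
}

def discover(product_name: str) -> dict:
    """Look up a data product entry in the shared catalog."""
    return CATALOG[product_name]

def conforms(record: dict, contract: dict) -> bool:
    """Check a record against the product's published schema contract."""
    return all(
        name in record and isinstance(record[name], expected)
        for name, expected in contract.items()
    )

# The Sales domain discovers the product and validates a record before use
entry = discover("customer.profiles")
record = {"customer_id": "c-001", "segment": "enterprise"}
print(conforms(record, entry["contract"]))  # True
```

Because the contract travels with the product, the Sales team never has to ask the Customer team what the schema is, and a breaking change is caught at the boundary rather than deep inside a downstream pipeline.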
What is the difference between data mesh and data fabric?
Data mesh and data fabric are different architectural approaches within a company’s data management strategy.
Data fabric is a technocentric approach that seeks increasingly seamless ways to manage complex metadata and unstructured information by merging AI, machine learning, and advanced analytics. Data mesh, on the other hand, while dependent upon the technological developments within the data fabric, is more focused on integrating data management processes with the human users who depend upon them, and on streamlining and simplifying data access and usefulness from a people perspective.
There is something of a chicken-and-egg relationship between data mesh and data fabric: ever-advancing data fabric technologies are needed if data management is to evolve at the speed it needs to. Yet, without an accompanying evolution in human processes and organizational strategies, people will not be able to properly leverage the advancing data fabric technologies. Just as DOS and complex interfaces gave way to the more seamless computer operating systems we enjoy today, data mesh and data fabric architectures are destined to grow increasingly seamless as these processes and technologies advance.
SAP PRODUCT
Connect data, drive innovation
Learn how SAP Business Data Cloud accelerates data-driven insights across your enterprise.