With microservices-based e-commerce architecture, data and API orchestration are important for connecting and managing disparate systems.
Data orchestration provides a standardized, automated layer for collecting, preparing, and delivering data.
API orchestration goes one step further and involves an abstraction layer that unifies APIs from multiple services and prepares them for users.
We live in an analytics-oriented business world. Organizations spend billions to ensure that every single data point facilitates informed decisions. In fact, Gartner projects that by 2024, 75% of organizations will have deployed multiple data hubs to drive mission-critical data and analytics sharing and governance.
However, the massive increase in overall data and data diversity requires new compute and storage technologies. Nowhere is this more apparent than in e-commerce, where companies are moving away from monolithic software architectures and towards modular, microservices-based architectures.
Data orchestration establishes a standardized and unified data layer by tying together data from various services and platforms within an enterprise. It’s especially important in e-commerce where headless and modular commerce have become the backbones of successful e-commerce businesses. With microservices in particular, data often resides in decoupled locations and a unified picture is frequently lacking.
With a strong data orchestration layer, it becomes possible for organizations to unify data inside a technology ecosystem. Not only can companies extract meaningful patterns, they can also gain much more visibility and control over disparate systems within a technology stack.
In this article, you’ll learn what data orchestration is, why it’s essential, and how it works. We’ll also explore the related concept of API orchestration and explain why it’s crucial for building modern e-commerce technology stacks.
Data orchestration provides a standardized, automated layer for collecting, preparing, and delivering data to eliminate duplications and corruption.
For instance, many e-commerce organizations run a large number of independent platforms that each keep their data in a separate database. Examples of these platforms include order management systems (OMS), product information management (PIM) platforms, pricing and promotions engines, cart and checkout services, and account platforms. A data orchestration layer unifies the data from all these platforms and makes it available for analysis.
All orchestration systems need a few components to function, mainly the control plane and action plane.
The control plane is responsible for overall operations. These include starting and stopping data pipeline tasks, tracking progress, restarting failed jobs, taking alternative courses of action, monitoring resource usage, scheduling, load balancing, error reporting, and managing metadata.
For example, if a customer places an order online, the control plane can start a data pipeline task that retrieves customer data from the customer relationship management (CRM) system and product data from the product information management (PIM) system. The control plane can then send this data to the order management system (OMS), which processes the order and updates inventory levels. In short, the control plane allows businesses to achieve a centralized and consistent view of data, optimize data processing for performance and resource utilization, and ensure data quality and governance.
The action plane is where data tasks run. These jobs use metadata and business rules contained in the control plane. There can be multiple task types for data collection, cleansing, or enrichment, and data engineers are responsible for creating and testing action plane jobs.
The action plane generally has an execution environment where the tasks are run. This execution environment often integrates with an on-premises data orchestration framework, an enterprise-licensed offering, or a fully managed cloud-based offering like Amazon Web Services (AWS) Glue or Google Cloud Dataflow.
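The split between the two planes can be sketched in a few lines of Python. This is a minimal illustration, not how any particular orchestration product works: the `Task` and `ControlPlane` classes, the three task functions, and the sample values are all hypothetical names chosen for this example. The control plane only decides what runs, in which order, and how often to retry, while the task functions stand in for action plane jobs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    """One action plane job: reads the shared context, returns new fields."""
    name: str
    run: Callable[[dict], dict]
    max_retries: int = 2


@dataclass
class ControlPlane:
    """Starts tasks in order, tracks their status, and restarts failed jobs."""
    tasks: List[Task]
    status: Dict[str, str] = field(default_factory=dict)

    def run_pipeline(self, context: dict) -> dict:
        for task in self.tasks:
            for _attempt in range(task.max_retries + 1):
                try:
                    context.update(task.run(context))
                    self.status[task.name] = "succeeded"
                    break
                except Exception as exc:  # record the error, then retry
                    self.status[task.name] = f"failed: {exc}"
            else:
                # Every attempt failed: stop the pipeline and report it.
                raise RuntimeError(f"task {task.name!r} failed after retries")
        return context


# Hypothetical action plane jobs standing in for CRM, PIM, and OMS calls.
def fetch_customer(ctx: dict) -> dict:
    return {"customer": {"id": ctx["customer_id"], "tier": "gold"}}


def fetch_product(ctx: dict) -> dict:
    return {"product": {"sku": ctx["sku"], "stock": 5}}


def submit_order(ctx: dict) -> dict:
    return {"order": {
        "customer_id": ctx["customer"]["id"],
        "sku": ctx["product"]["sku"],
        "remaining_stock": ctx["product"]["stock"] - 1,
    }}


pipeline = ControlPlane(tasks=[
    Task("fetch_customer", fetch_customer),
    Task("fetch_product", fetch_product),
    Task("submit_order", submit_order),
])
result = pipeline.run_pipeline({"customer_id": "c-42", "sku": "sku-1001"})
```

In a real deployment the task bodies would call the actual systems, and scheduling, load balancing, and metadata management would sit in the control plane as well.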
Before we dive into data orchestration’s role in e-commerce, it helps to gain a basic understanding of how it works. At a basic level, data orchestration automates the manual processes of collecting, validating, converting, and transferring data. It’s divided into four broad phases:
This phase is for data collection and validation from various sources. It also involves labeling the data for easy recognition and accessibility.
Implementing scheduled jobs allows systems to automatically capture metadata at periodic intervals and move it to temporary locations. Data collection can be done in real time or in batches, depending on the latency requirements.
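A batch collection step might look like the sketch below. It assumes a hypothetical interface contract (`REQUIRED_FIELDS`) and a hypothetical `collect` function. Each accepted record is labeled with its source and collection time before it moves to a staging list, and records that fail validation are set aside with a reason.

```python
from datetime import datetime, timezone

# Hypothetical contract: fields every incoming order record must carry.
REQUIRED_FIELDS = {"order_id", "sku", "quantity"}


def collect(records, source):
    """Validate a batch and label each accepted record for later lookup."""
    staged, rejected = [], []
    collected_at = datetime.now(timezone.utc).isoformat()
    for record in records:
        missing = REQUIRED_FIELDS - set(record)
        if missing:
            rejected.append({
                "record": record,
                "reason": f"missing {sorted(missing)}",
            })
            continue
        # Labels make the record easy to trace back to where it came from.
        staged.append({**record, "_source": source,
                       "_collected_at": collected_at})
    return staged, rejected


staged, rejected = collect(
    [{"order_id": 1, "sku": "A1", "quantity": 2}, {"order_id": 2}],
    source="oms",
)
```

A scheduler would call a function like this at each interval. A real-time variant would apply the same checks to one record at a time as it arrives.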
Data cleansing involves:
Pre-defined business rules and interface contracts that are saved in the orchestrator control plane dictate the cleansing criteria.
Clean and correctly sourced data is the foundation of accurate analytics. It helps to provide a single source of truth that can be used by all other systems. In addition, cleansed data is integral if there are downstream machine learning models that use this data for decisioning.
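As a rough sketch of rule-driven cleansing, the function below takes its criteria from a rules dictionary, in the way an orchestrator might read them from the control plane. The specific rules shown (trimming whitespace, dropping duplicates by key, rejecting non-positive prices) are illustrative assumptions, and `cleanse` and `RULES` are hypothetical names.

```python
def cleanse(records, rules):
    """Normalize strings, drop duplicates, and discard rule violations."""
    seen = set()
    clean = []
    for record in records:
        # Strip stray whitespace from every string field.
        normalized = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }
        dedupe_value = normalized.get(rules["dedupe_key"])
        if dedupe_value in seen:
            continue  # duplicate of a record that was already accepted
        if not all(check(normalized) for check in rules["checks"]):
            continue  # violates at least one business rule
        seen.add(dedupe_value)
        clean.append(normalized)
    return clean


# Hypothetical business rules as they might be stored centrally.
RULES = {
    "dedupe_key": "sku",
    "checks": [lambda r: r["price"] > 0],
}

products = cleanse(
    [
        {"sku": " A1 ", "price": 10.0},
        {"sku": "A1", "price": 10.0},   # duplicate once whitespace is trimmed
        {"sku": "B2", "price": -1.0},   # fails the price rule
    ],
    RULES,
)
```

Keeping the rules as data rather than hard-coding them means the same cleansing task can serve several sources with different contracts.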
This phase transforms data according to set business rules. It can include grouping, aggregating, correlating, or converting to different formats. For example, if you have an online store that shows the pricing of goods in the user’s local currency, the actual pricing may be in one currency, but the orchestrator will need to convert it and apply local taxes and other rates before making it available to users.
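The currency example could be expressed as a small transformation like the one below. The exchange and tax rates are made-up values for illustration, and `localize_price` is a hypothetical helper. A real orchestrator would read rates from a pricing or tax service.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical rates, for illustration only.
EXCHANGE_RATES = {"USD": Decimal("1"), "EUR": Decimal("0.90")}
TAX_RATES = {"USD": Decimal("0.00"), "EUR": Decimal("0.20")}


def localize_price(base_price_usd: Decimal, currency: str) -> Decimal:
    """Convert a USD base price and apply the local tax rate."""
    converted = base_price_usd * EXCHANGE_RATES[currency]
    with_tax = converted * (Decimal("1") + TAX_RATES[currency])
    # Round to two decimal places, as shown on a storefront.
    return with_tax.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


eur_price = localize_price(Decimal("100.00"), "EUR")  # Decimal("108.00")
```

`Decimal` is used instead of floats so that monetary rounding stays exact.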
Data transformation also aggregates data to provide input for downstream systems. For instance, a downstream system that deals with demand forecasts may not require individual sales records and may instead only need an aggregated daily or hourly sales report. The data conversion phase executes such transformations.
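An aggregation task of this kind can be sketched as follows. The record fields (`sold_at`, `sku`, `amount`) and the `daily_sales` function are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime


def daily_sales(sales):
    """Roll individual sales records up into per-day, per-SKU totals."""
    totals = defaultdict(float)
    for sale in sales:
        day = datetime.fromisoformat(sale["sold_at"]).date().isoformat()
        totals[(day, sale["sku"])] += sale["amount"]
    return dict(totals)


report = daily_sales([
    {"sold_at": "2024-01-01T09:30:00", "sku": "A1", "amount": 10.0},
    {"sold_at": "2024-01-01T17:45:00", "sku": "A1", "amount": 5.5},
    {"sold_at": "2024-01-02T08:00:00", "sku": "A1", "amount": 7.0},
])
```

An hourly report would follow the same pattern with the timestamp truncated to the hour instead of the date.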
The tasks in this phase add more contextual information to the transformed data in order to enrich it. For example, an e-commerce site may depend on third-party sources for product reviews so that the display page has everything needed for a customer to make an informed decision.
When enriching data you can also add browsing behavior information from Google Analytics or third-party providers to the user accounts. This is a key aspect of e-commerce recommendation systems and an orchestrator can do most of this heavy lifting behind the scenes.
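The enrichment step can be pictured as a join between first-party records and an external feed. In the sketch below, `enrich_products` and the review structure are hypothetical. A real task would fetch the reviews from the third-party provider.

```python
def enrich_products(products, reviews):
    """Attach review counts and average ratings to each product record."""
    ratings_by_sku = {}
    for review in reviews:
        ratings_by_sku.setdefault(review["sku"], []).append(review["rating"])

    enriched = []
    for product in products:
        ratings = ratings_by_sku.get(product["sku"], [])
        enriched.append({
            **product,
            "review_count": len(ratings),
            # Products without reviews get None rather than a fake score.
            "avg_rating": round(sum(ratings) / len(ratings), 2)
            if ratings else None,
        })
    return enriched


catalog = enrich_products(
    [{"sku": "A1", "name": "Trail Shoe"}, {"sku": "B2", "name": "Rain Shell"}],
    [{"sku": "A1", "rating": 4}, {"sku": "A1", "rating": 5}],
)
```

Browsing behavior could be attached to user accounts in the same way, keyed on an account identifier instead of a SKU.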
Data orchestration brings data elements from disparate systems together. In today’s enterprises, employees are increasingly empowered to make their own decisions about their work, and for that, they need data.
At a practical level, such decision support systems can be data marts, data warehouses, or data lakes. These central data repositories help employees run all kinds of workloads, ranging from simple spreadsheet reporting to complex natural language processing (NLP) code to advanced notebooks. A data orchestration layer is necessary to ensure these data repositories are regularly refreshed with data.
For example, a typical modular e-commerce system has a product information management (PIM) system, which is a central storage repository that stores product details and item codes. Data can be imported from different sources such as databases or flat files, and item descriptions can be enhanced, validated, and refined with business rules.
With a data orchestration platform, a user could create, import, enrich, validate, and manage complex item information. It would also allow product data to be distributed externally to various channels.
Let’s say a business requires a report connecting the product attributes of low sales volume products. Such a report is helpful in deciding the inventory strategy, product recommendation strategy, etc. If the organization needs to make such decisions using data that spans across multiple modules and platforms, data orchestration is a necessity.
Without a data orchestration system, developers would need to run from pillar to post to combine data from all these platforms and implement the report. A data orchestration system automates parts of this process and provides a unified view to support business decisions.
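To make the low-sales report concrete, here is a minimal sketch that joins product attributes from a PIM with order lines from an OMS. The field names, the threshold, and `low_volume_report` are assumptions for illustration. In practice the orchestration layer would supply both datasets already cleansed and unified.

```python
def low_volume_report(pim_products, oms_order_lines, threshold):
    """List product attributes for SKUs that sold fewer than `threshold` units."""
    units_sold = {}
    for line in oms_order_lines:
        units_sold[line["sku"]] = units_sold.get(line["sku"], 0) + line["quantity"]

    report = []
    for product in pim_products:
        sold = units_sold.get(product["sku"], 0)  # never ordered counts as 0
        if sold < threshold:
            report.append({
                "sku": product["sku"],
                "category": product["category"],
                "color": product["color"],
                "units_sold": sold,
            })
    return report


slow_movers = low_volume_report(
    pim_products=[
        {"sku": "A1", "category": "shoes", "color": "red"},
        {"sku": "B2", "category": "jackets", "color": "blue"},
        {"sku": "C3", "category": "shoes", "color": "green"},
    ],
    oms_order_lines=[
        {"sku": "A1", "quantity": 40},
        {"sku": "B2", "quantity": 3},
    ],
    threshold=10,
)
```

The function itself is trivial. The effort without orchestration lies in getting the two inputs out of separate platforms in a consistent shape.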
Some of the core functions of data governance include:
These are possible if data source and destination details, ownership, sensitivity, and other critical aspects of data are also captured. A data orchestration tool can record such metadata as it collects and transforms data. When interfaced with a data governance platform, this metadata can form the basis of governance workflows.
The e-commerce domain deals with a variety of data elements in the form of product data, user account data, and purchase behavior data. Many of these elements belong to the Personally Identifiable Information (PII) category and require strict governance policies.
Typically, strict governance policies around PII limit the individuals who can access the data. For example, someone working in business intelligence doesn’t need to know what a specific customer has bought. They only need aggregated sales information for specific customer groups or products. But a person in customer support may need to know what a specific customer bought to address that customer’s concerns. A data orchestration platform helps in enforcing such governance policies.
Beyond the general data protection guidelines, e-commerce organizations have to comply with region-specific tax and payment regulations. Such regulations stipulate that companies should keep their data secure, track its usage, and allow consumers or third parties to access, modify, or delete their data.
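One way to picture such a policy is a view function that shapes the same order data differently per role. The role names, field names, and `view_for_role` below are hypothetical. A real platform would take roles from an identity provider and policies from a governance tool.

```python
def view_for_role(orders, role):
    """Return only the level of detail a given role is allowed to see."""
    if role == "customer_support":
        # Support needs order-level detail to resolve a customer's issue.
        return [dict(order) for order in orders]
    if role == "business_intelligence":
        # BI sees totals per customer segment, with no customer identifiers.
        totals = {}
        for order in orders:
            totals[order["segment"]] = totals.get(order["segment"], 0) + order["amount"]
        return [
            {"segment": segment, "total_amount": total}
            for segment, total in sorted(totals.items())
        ]
    raise PermissionError(f"role {role!r} may not read order data")


orders = [
    {"customer_id": "c-1", "segment": "loyalty", "item": "A1", "amount": 50},
    {"customer_id": "c-2", "segment": "loyalty", "item": "B2", "amount": 30},
    {"customer_id": "c-3", "segment": "new", "item": "A1", "amount": 20},
]
bi_view = view_for_role(orders, "business_intelligence")
```

Unknown roles are rejected by default, so access has to be granted explicitly.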
A data orchestration platform can help maintain regulatory compliance because it bakes these business regulations into its tasks. These, along with data governance, can help businesses answer questions on what, where, how, and when.
With enterprise-wide data orchestration, businesses can create an abstraction layer that unifies and controls data feeds across a modern headless and modular commerce architecture. It can be hugely beneficial if your organization:
Application programming interfaces (APIs) allow modern systems to talk to each other, exchange data, and share services. API orchestration is the process that combines multiple APIs into a single system, ensures their efficient organization and management, and automates and coordinates their interactions with each other.
For example, in a microservices-based e-commerce architecture, output from different platforms like an OMS, PIM, pricing and promotions, and cart and checkout needs to be aggregated to provide the required response for the frontend website.
API orchestration unifies all these APIs under a single orchestration API. This orchestration API acts as the middleman, interacting with and binding together all the other APIs for different service calls. This allows developers to build the e-commerce frontend by interacting with this one orchestrator API, which simplifies the development effort.
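The idea can be sketched as a single function that fans out to several backend services and merges their answers. The three `get_*` functions below are stand-in stubs with made-up return values, not real service clients. In a real system each would be an HTTP call to the PIM, pricing, or inventory service.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stubs standing in for calls to separate microservices.
def get_product(sku):
    return {"sku": sku, "name": "Trail Shoe"}


def get_price(sku):
    return {"sku": sku, "price": "89.00", "currency": "USD"}


def get_availability(sku):
    return {"sku": sku, "in_stock": True}


def product_page(sku):
    """Orchestration endpoint: one request in, several service calls out."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        # The three calls are independent, so they can run concurrently.
        product = pool.submit(get_product, sku)
        price = pool.submit(get_price, sku)
        availability = pool.submit(get_availability, sku)
        return {
            "sku": sku,
            "name": product.result()["name"],
            "price": price.result()["price"],
            "currency": price.result()["currency"],
            "in_stock": availability.result()["in_stock"],
        }


page = product_page("sku-1001")
```

The frontend only ever calls `product_page`, so a backend service can be swapped or split without the storefront code changing.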
Usually, the API orchestration tool generates an authenticated token to establish trust between the participating APIs. It secures the application’s resources and prevents any loss due to cyberattacks.
For example, an authentication process can validate the identity of an app or its end user. Once the identity is confirmed, an app access token is generated, which apps can use as a bearer token to call APIs. Meanwhile, authorization restricts access to certain APIs after successful authentication, and access control verifies that the user has permission to access the requested resource.
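The flow above can be illustrated with an in-memory sketch. Everything here is an assumption made for the example: the `issue_token` and `authorize` functions, the scope names, and the token store. Production systems typically delegate this to an identity provider and a standard such as OAuth 2.0 rather than hand-rolled code.

```python
import secrets
import time

_TOKENS = {}  # token -> session details (in-memory, for illustration only)


def issue_token(app_id, app_secret, registered_apps, ttl_seconds=3600):
    """Authentication: verify the app's identity, then mint a bearer token."""
    app = registered_apps.get(app_id)
    if app is None or not secrets.compare_digest(app["secret"], app_secret):
        raise PermissionError("authentication failed")
    token = secrets.token_urlsafe(32)
    _TOKENS[token] = {
        "app_id": app_id,
        "scopes": set(app["scopes"]),
        "expires_at": time.time() + ttl_seconds,
    }
    return token


def authorize(authorization_header, required_scope):
    """Authorization: check the bearer token and the scope it carries."""
    scheme, _, token = authorization_header.partition(" ")
    session = _TOKENS.get(token) if scheme == "Bearer" else None
    if session is None or session["expires_at"] < time.time():
        raise PermissionError("invalid or expired token")
    if required_scope not in session["scopes"]:
        raise PermissionError("insufficient scope")
    return session["app_id"]


# Hypothetical app registration with a single read-only scope.
APPS = {"storefront": {"secret": "s3cret", "scopes": ["catalog:read"]}}
token = issue_token("storefront", "s3cret", APPS)
caller = authorize(f"Bearer {token}", "catalog:read")  # "storefront"
```

A request for a scope the app was never granted, such as `orders:write` here, raises `PermissionError` even though the token itself is valid.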
Data orchestration and API orchestration are terms that are often used interchangeably. Both are necessary for headless and modular commerce systems, but they do differ and serve different purposes.
Data orchestration generally combines and organizes data from multiple data storage locations and makes it available for data analysis, which may or may not involve APIs. It deals with creating unified data products and allows for data discovery by finding the data schema and lineage by scanning through the files or by accessing a metadata store.
API orchestration takes the data orchestration approach further by coordinating multiple API services. Instead of the client making separate API calls to separate systems, the abstraction layer makes the calls to the different services itself and combines the results into the response to a single API request. Service discovery focuses on finding the exposed APIs using a service registry.
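Service discovery through a registry can be sketched as a lookup table from service names to instance addresses. The `ServiceRegistry` class and the internal URLs below are hypothetical. Real deployments usually rely on a dedicated registry or on the platform's built-in discovery.

```python
class ServiceRegistry:
    """Maps service names to instance URLs and rotates between instances."""

    def __init__(self):
        self._instances = {}
        self._next_index = {}

    def register(self, name, url):
        self._instances.setdefault(name, []).append(url)

    def resolve(self, name):
        urls = self._instances.get(name)
        if not urls:
            raise LookupError(f"no instances registered for {name!r}")
        # Simple round-robin so repeated lookups spread across instances.
        index = self._next_index.get(name, 0) % len(urls)
        self._next_index[name] = index + 1
        return urls[index]


registry = ServiceRegistry()
registry.register("pim", "http://pim-1.internal")
registry.register("pim", "http://pim-2.internal")
registry.register("pricing", "http://pricing-1.internal")

first = registry.resolve("pim")   # "http://pim-1.internal"
second = registry.resolve("pim")  # "http://pim-2.internal"
```

The orchestration API would resolve each service by name before calling it, so instances can be added or removed without changing the calling code.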
Data and API orchestration are critical for modern businesses. They are especially important in domains like e-commerce, where data and APIs need to be organized and connected due to the nature of headless and modular commerce architectures. Orchestration brings order to chaos and helps companies achieve business outcomes, comply with regulatory requirements, ensure data privacy, and measure operational performance.
Tech advocate and writer @ fabric.