E-commerce databases track unpredictable human actors across a complicated monetary system. At the end of the day, you want to deliver a product to a person you never see after getting a payment authorized by a third party.
Processing payments, adding products, tracking orders, and handling rebates are all complicated features your e-commerce database has to handle in order to reach that goal. This article will build on previous articles, talking about a scalable SQL data model and a NoSQL data model, to show how you can build a hybrid data model.
It’s important to utilize the strengths of various database engines specifically to handle more complicated e-commerce interactions.
Imagine a user returning one item out of a dozen that were discounted after being purchased in bulk. The user has forgotten their password, and the returned package has arrived at the wrong supplier warehouse.
When this return arrives, your database model has to handle a complicated web of working relationships to put everything back in the right place. You might also need to trace the chain of events that led to this outcome so you can send the customer a satisfaction survey.
Let’s see how you might create a hybrid database that can shine in this e-commerce situation.
The foundation of any e-commerce site are its users. They perform every action, and each one is an unpredictable person. Users move, change payment methods, forget to inform the businesses they frequent, then request a refund.
Fortunately, you only need to know a very fixed set of facts about users. There are rarely new types of things you need to know about people, but they are queried in a variety of ways. Some common access patterns for user data are adding new addresses or updating email addresses or passwords.
Because they feature a fixed set of data with a wide variety of access patterns, user data is a great candidate for a structured SQL data store. Also, user information is one of the best sources of analytical data for your business. For example, you’re likely to want to perform ad hoc queries about the users buying your products, further cementing the suitability of SQL for your user data.
The following user model implements some complicated patterns that an e-commerce site has to support, like resetting a password, adding multiple shipping and billing addresses, as well as tracking how long users spend on various pages and what they clicked to get there.
When you’re building out your model, be aware that you don’t need to collect all the information upfront. You don’t need to collect a user’s shipping or billing address when they first sign up for your application.
For example, fabric advocates keeping the onboarding process simple, only asking for an email, name, and password when absolutely necessary. However, as we’ve seen in our e-commerce site breakdowns, many businesses complexify this step, making it hard for people to checkout online.
Next, let’s look at what you’ll show users.
A product catalog represents every item that’s available for your users to purchase: objects, subscriptions, or donation tiers. All of these options have different structures and need to keep track of slightly different things. Subscriptions have features or access levels, while objects have sizes, colors, and images. In short, there’s a lot of flexibility and variety as to what can live in a product catalog.
Unlike SQL data, which is structured, normalized, and optimized to save on storage, NoSQL data is denormalized and optimized to save on CPU. This means it’s a bad choice for data analysis applications (like tracking users on a page) but a great candidate for high throughput applications when you fully understand the access patterns.
NoSQL can be perfect for hosting product information. Most of the time, product information is simply displayed to a user, and the stock available might be updated if an item is purchased. NoSQL databases are great at creating instantiated views populated with data from a single round trip query.
With NoSQL, your goal is to store all the data that your application uses for a page inside a single document. Here’s an example:
With this data model, when a customer requests a product detail page (PDP), your application requests and receives everything it needs to render the page in a single round trip. Images, prices, options, and dimensions for every variant that a customer can inspect are all stored in a single document. This means the data layer doesn’t spend CPU time finding and mashing together various tables, making NoSQL requests very horizontally scalable.
One of the other benefits of NoSQL is that data is returned in a format the web understands (JSON), which can save you the trouble of writing a custom Object Relationship Mapper (ORM) for your products.
One of the data storage options available to developers is in-memory storage, like Redis. With any caching strategy, the goal is to preserve CPU and disk I/O by moving transactions formerly performed in your data layer into faster access hardware like RAM.
The features that make the product catalog a good candidate for a NoSQL data store also make it a good candidate for caching in an in-memory data store. The data is read-heavy and rarely changes, so caches will save disk time while maintaining accuracy.
Caching is also a good solution for session management, and in the e-commerce context, storing a user’s current cart. Rather than looking up a cookie in your database or returning all the items in their cart every time a user makes a request, you can push that workload to your caching layer and achieve a request-response time that would require significantly more resources if handled by your database.
Payment systems allow users to purchase and receive your products by linking your website to various services that enable payment. This is a vital moving part of any e-commerce website, and it has to function well to ensure customer satisfaction. A broken checkout flow is often a dealbreaker.
This step also has the most stringent security requirements for what kinds of data are stored and how. In general, don’t store much credit card information, and be sure you encrypt any identifiable payment information you do save, like the card identifier.
The less data your payment model stores, the better. It reduces your legal liability and helps you stay in compliance with Payment Card Industry-Data Security Standards (PCI-DSS).
The ideal payments table does not contain much information. When a user finally clicks Buy after entering their payment information, your application can mostly bypass your data layer. Instead, it will pass the request to a payment processor, which will handle it and return the result.
If you think you need to store payment information for future use, your payment processor very likely has an option for storing that data and will provide you a unique, not personally identifiable token to use next time.
Order models are the part of your database that tracks the process of getting a product to a customer, from the moment they first express interest all the way through getting the item into their hands. An order can involve ongoing access to a website or handling a discrete event, like sending a product to a client.
A smooth order flow is vital for creating a user experience that people are willing to repeat. The order is where all our other database schemas get put to use.
Here’s an order for a product:
You’ll notice that discount and price information have been duplicated from our NoSQL data model into the cart_item
table. The NoSQL model represents the discounts that are shown to a customer when they enter a page, and the saved price and discount are the discounts that were actually applied to the item.
If a price or discount changes, you’ll want to be sure to keep a record of what prices were actually paid. Be aware of fault lines like this between your NoSQL and SQL components.
Revisiting the initial scenario presented in this article, let’s see how the model fares all stitched together.
The user has forgotten their password, but our user’s model handles updating it. The returned package has arrived at the wrong supplier warehouse, but we tracked the original source in the order, so we can forward it to the correct destination. Calculating the refund is tricky, but because we tracked discount information on an order level, we can run an ad hoc query to figure out the appropriate rebate.
When planning your data model, remember that SQL is optimized for efficiently storing structured data and performing arbitrary queries on data, while NoSQL is optimized for efficiently delivering unstructured data.
There are dozens of challenges that an e-commerce website can face that aren’t covered by this model, like handling subscriptions, applying a discount to an item based on user behavior, and tracking inventory.
Fortunately, fabric has a team of experts who’ve built e-commerce data models and websites for some of the world’s largest e-commerce companies. They also partner with agencies and system integrators with similar experience.
Tech advocate and writer @ fabric. Developer @ Tough Pixels.