Overview

Transactional databases are optimized for running production systems--everything from websites to banks to retail stores. They excel at reading and writing individual rows of data very quickly while maintaining data integrity.

Transactional databases aren’t specifically built for analytics, but often become de facto analytic environments because they’re already in place as production databases. Because they’ve been around for decades, they’re familiar, and accessible, and ubiquitous.

If your organization does not have a pre-existing separate analytics stack, one of the quickest way to start doing analytics is to create a replica of your transactional database. This ensures that analytic queries don’t accidentally impede business-critical production queries while requiring minimal additional setup. The downside is that these databases are designed for processing transactions, not analysis. Using them for analytics is a great place to start, but you may run into limitations and need workarounds sooner than you would on a analytics-specific setup.

Transactional databases are row-stores, which means that data is stored on disk as rows, rather than columns. That’s great, for example, when you need to know everything about one customer in the user table since you can grab only the data you need. But it’s not so great when you’re trying to count the customers in a particular ZIP code, since you have to load not just the ZIP column, but the name, address and user_id columns as well.

What are Transactional Databases really great for?

Ensuring data integrity

Transactional databases are architected to be ACID compliant, which ensures that writes to the database either succeed or fail together, maintaining a high level of data integrity when writing data to the database. Transactional databases are therefore critical for business transactions where a high-level of data integrity is necessary (the canonical example is banking where you want a whole transaction--debit from one account and credit to another--to either succeed or fail).

Low-latency

Because transactional databases are designed to run production systems, they’re very good at operations that must complete in milliseconds. If you’re doing analytics on a transactional replica of your production database, the replica will likely be nearly in sync with the master database (i.e. sub-second latency).

Monitoring operational systems

Working with data from transactional databases to provide a real-time operational snapshot is a perfect analytical use case for transactional databases because there’s so little latency introduced by the replica. If you’re trying to monitor support workloads or inventory or another operational system and need to make decisions based on data that’s as fresh as possible, replicating a the production database may be the best option.