This post summarizes the results of benchmarking several databases. For our test, we use a dataset similar to the Star Schema Benchmark and evaluate performance on the following queries:
- what is the time to insert a line order item?
- what is the time to fetch all the line orders for a customer?
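
To make the two queries concrete, here is a minimal sketch of how they could be timed. It is not the harness used for the numbers in this post: it uses an in-memory SQLite database as a stand-in engine so that it runs anywhere, and the table layout and column names (`lo_orderkey`, `lo_linenumber`, `lo_custkey`, `lo_revenue`) are assumptions modeled on Star Schema Benchmark naming.

```python
import sqlite3
import time

# Stand-in engine: in-memory SQLite, NOT one of the benchmarked databases.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE lineorder (
        lo_orderkey   INTEGER,
        lo_linenumber INTEGER,
        lo_custkey    INTEGER,
        lo_revenue    INTEGER,
        PRIMARY KEY (lo_orderkey, lo_linenumber)
    )
    """
)
# Secondary index so the per-customer lookup does not scan the whole table.
conn.execute("CREATE INDEX idx_lineorder_custkey ON lineorder (lo_custkey)")


def insert_line_order(orderkey, linenumber, custkey, revenue):
    """Insert one line order item; return the elapsed time in milliseconds."""
    start = time.perf_counter()
    with conn:  # commits the transaction on success
        conn.execute(
            "INSERT INTO lineorder VALUES (?, ?, ?, ?)",
            (orderkey, linenumber, custkey, revenue),
        )
    return (time.perf_counter() - start) * 1000


def fetch_line_orders(custkey):
    """Fetch all line orders for a customer; return (rows, elapsed ms)."""
    start = time.perf_counter()
    rows = conn.execute(
        "SELECT * FROM lineorder WHERE lo_custkey = ? "
        "ORDER BY lo_orderkey, lo_linenumber",
        (custkey,),
    ).fetchall()
    return rows, (time.perf_counter() - start) * 1000


# Tiny synthetic load (1,000 rows) standing in for the 44 M row dataset.
for orderkey in range(1, 1001):
    insert_line_order(orderkey, 1, orderkey % 100, 100)

insert_ms = insert_line_order(1001, 1, 7, 500)
rows, fetch_ms = fetch_line_orders(7)
print(f"insert: {insert_ms:.3f} ms, fetch: {fetch_ms:.3f} ms, rows: {len(rows)}")
```

Timings from a sketch like this only show the shape of the measurement; they are not comparable to the figures reported in this post, which were taken against full-size tables on dedicated servers.
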
We created a dataset with 44 M line orders. All databases (except Datastore, which is serverless) use 8 vCPU and 32 GB RAM. For MySQL, loading the dataset resulted in a table with data_length = 7,269,679,104 bytes (about 7.3 GB). The results are below:
| Task | MySQL | AlloyDB | MongoDB | RocksDB | Google Datastore | BigTable |
| --- | --- | --- | --- | --- | --- | --- |
| Load all 44 M rows | 30 min 35.524 s (24,491 rows/s) | 17 min 53.975 s (41,857 rows/s) | 10 min 5 s | 8 min 12 s | 140 min | 17 min 39 s |
| Time to insert new row | 5 ms | 5 ms | 1 ms | < 1 ms | 115 ms | 1000 ms |
| Time to fetch all line orders corresponding to a customer | 1-2 ms | 2 ms | 9 ms | 8 ms | 134 ms | 700 ms |
| How data was loaded | LOAD DATA LOCAL INFILE | \copy FROM ... | Node.js | | | |
| How tests were done | MySQL CLI | AlloyDB CLI (Postgres) | MongoDB CLI | RocksJava | Node.js | Java |
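
As a sanity check on the load row, the arithmetic below multiplies the reported load time by the reported rate for MySQL and AlloyDB. Both products come out at roughly 44.95 M rows, which suggests that "44 M" is a truncated figure. The per-database estimates at the end assume that implied row count applies to every load; that is an assumption of this sketch, not a measured result.

```python
def to_seconds(minutes, seconds=0.0):
    """Convert a 'min + s' duration from the table into seconds."""
    return minutes * 60 + seconds


# (load time in seconds, reported rows/s), copied from the table.
reported = {
    "MySQL": (to_seconds(30, 35.524), 24_491),
    "AlloyDB": (to_seconds(17, 53.975), 41_857),
}

# Rows implied by time x rate.
implied_rows = {name: secs * rate for name, (secs, rate) in reported.items()}
for name, rows in implied_rows.items():
    print(f"{name}: {rows / 1e6:.2f} M rows implied")
# MySQL: 44.95 M rows implied
# AlloyDB: 44.95 M rows implied

# Assumption: the same row count was loaded into every database.
assumed_rows = implied_rows["MySQL"]
other_load_times = {
    "MongoDB": to_seconds(10, 5),
    "RocksDB": to_seconds(8, 12),
    "Google Datastore": to_seconds(140),
    "BigTable": to_seconds(17, 39),
}
estimated_rate = {
    name: assumed_rows / secs for name, secs in other_load_times.items()
}
for name, rate in estimated_rate.items():
    print(f"{name}: about {rate:,.0f} rows/s (estimate)")
```
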
Overall, MySQL, AlloyDB, RocksDB and MongoDB perform close to one another, and the differences are too small to recommend one over another. The results are consistent with the common view that B-tree based databases are optimized to minimize read seek time, whereas Log-Structured Merge-tree (SSTable) databases are optimized for write throughput: the LSM-based RocksDB has the fastest load and insert times, while MySQL and AlloyDB have the fastest fetch times. The times for Datastore and BigTable are significantly higher, presumably because they include the overhead of a network call. However, MySQL also makes a network call when using the CLI (the MySQL client calls a remote MySQL server), and when we benchmarked MySQL from a Node.js application using the mysql2 driver, response times were practically the same as with the CLI, so network overhead alone may not explain the gap.
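
The write-versus-read trade-off of a Log-Structured Merge tree can be illustrated with a toy model. The class below is a simplified sketch, not how RocksDB is implemented; the name `ToyLSM` and its `memtable_limit` parameter are made up for illustration. Writes go to an in-memory table and are flushed in bulk as sorted, immutable runs, while a read may have to check the memtable and then every run, newest first.

```python
import bisect


class ToyLSM:
    """Toy LSM tree: an in-memory memtable plus sorted immutable runs."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.sstables = []  # oldest first; each is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write only touches memory until the memtable fills up.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # One sequential write of a sorted run; existing runs are never edited.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # A read checks the memtable, then each run from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            i = bisect.bisect_left(table, (key,))
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None


db = ToyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)  # memtable is full: flushed as the first sorted run
db.put("a", 3)  # a newer value for "a" shadows the one in the older run
db.put("c", 4)  # second flush
print(db.get("a"), len(db.sstables))  # prints: 3 2
```

Real engines add a write-ahead log, Bloom filters and background compaction to keep the number of runs a read must check small; the toy omits all of these.
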
For MySQL, load time improves almost 6x if the records are inserted in PK order, and the data size (data_length) is smaller as well. See this for example.
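
One way to get PK-ordered inserts is to sort the input file by the primary key columns before loading it. The sketch below assumes a headerless CSV whose first two columns form an integer primary key (for example order key and line number); the function name `sort_rows_by_pk` and the column positions are illustrative assumptions, not the procedure used for this benchmark. The likely reason ordering helps, assuming the table uses InnoDB, is that rows are stored in a clustered index on the primary key, so sequential keys fill pages in order instead of causing page splits.

```python
import csv
import io


def sort_rows_by_pk(src, dst, pk_columns=(0, 1)):
    """Copy CSV rows from src to dst, sorted by the integer PK columns.

    pk_columns holds the (assumed) positions of the primary key columns.
    Returns the number of rows written.
    """
    rows = sorted(
        csv.reader(src),
        key=lambda row: tuple(int(row[i]) for i in pk_columns),
    )
    csv.writer(dst, lineterminator="\n").writerows(rows)
    return len(rows)


# Usage with in-memory files; real input would be the file passed to LOAD DATA.
unsorted_csv = io.StringIO("3,1,7\n1,2,9\n1,1,9\n2,1,4\n")
sorted_csv = io.StringIO()
count = sort_rows_by_pk(unsorted_csv, sorted_csv)
print(count)
print(sorted_csv.getvalue(), end="")
```

This version sorts in memory, which is fine for a demonstration; for a file with tens of millions of rows an external sort is likely more practical.
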
The results are not intended to suggest any database over another. There are many other factors to consider when making a decision.