~/

Published

- 5 min read

Part 2: Benchmarking UUID Formats, Data Types and Engines

Link Copied!

Share away.

img of Part 2: Benchmarking UUID Formats, Data Types and Engines

Recap

In Part 1, I explained why picking the right UUID format upfront matters more than it might seem. It affects your choice of SQL engine, the shape of your schema, and how well your system scales.

Now it’s time to break down the options, test them out, and land on a decision that fits.


Relevant Data

Before looking at specific formats, it’s worth reviewing the data types that impact how these IDs behave in relational databases.

Storage Data Types

TypeProsCons
CHAR(36)Human-readable, works out of the boxLarge (36 bytes), inefficient indexing, hyphens add extra weight
BINARY(16)Compact (16 bytes), fast indexingNot human-readable, encoding/decoding overhead
BIGINTSuper compact (8 bytes), fast ops, great for Snowflake or incrementNot globally unique, limited range (2⁶⁴), can’t store text-based IDs
UUIDNative in Postgres, compact (16 bytes), optimized opsPostgres-only, not portable
TEXT/VARCHARFlexible, supports alphanumeric formats like NanoIDSlower queries, inefficient indexes
INTEGERSmallest size (4 bytes), fastest indexing and sortingNot globally unique, predictable/guessable, capped at 2³²

Formats Considered

Time-Based ID Formats

These are naturally sortable and often better suited for distributed systems:

  • UUIDv7 — Time-based UUIDs designed for better indexing and ordering
  • Snowflake ID — Widely adopted format offering high throughput and ordering

Non-Time-Based ID Formats

These focus on uniqueness, simplicity, or flexibility:

  • UUIDv4 — Classic random UUIDs. Simple, but fragmentation-prone
  • NanoID — Short, URL-safe string format with good entropy
  • Auto-increment Integer — Fast and easy, but doesn’t scale across distributed systems

Benchmarks

This section covers results from two dimensions of testing:

  1. Client-side ID generation performance (no database interaction)
  2. Database-integrated performance across real operations like inserts, reads, updates, deletes, pagination, and concurrency

All benchmarks were recorded and visualized using matplotlib.

ID Format Generation Times

To start, I wanted to measure how long it takes to generate a single IDs for each format. Since ID generation happens client-side (with zero database interaction), this gives us a raw view of how efficient each format is — independent of the SQL engine we choose.

It’s worth noting that Auto-increment IDs are typically generated by the database itself, so they don’t appear in this benchmark. More importantly, auto-increment isn’t really suitable for distributed systems — which makes it a poor fit for our use case anyway.

As we can see, Snowflake IDs were the fastest to generate by a wide margin. UUIDv4 and UUIDv7 were both reasonably fast, with UUIDv7 offering the added benefit of sortability. NanoID, while secure and compact, had noticeably slower generation times.


Functional Performance Testing

So next up, we’re adding database interactions into the mix.

After benchmarking raw ID generation, I wanted to see how things hold up when you start doing “real application” work — insertions, reads, updates, deletes, and concurrent operations.

To test this, I built a small Python harness that defined table schemas for each format and ran the same set of operations across two local SQL engines — Postgres and MySQL — each spun up in its own container. The goal was to keep things consistent, repeatable, and as close to real-world conditions as possible.

👉 TODO: [Link to the test repo or notebook when ready]

Storage Notes:

  • UUIDv4 & UUIDv7 are stored using Postgres native UUID type
  • In MySQL, both UUID versions are stored as CHAR(36)
  • Snowflake ID is stored in both engines as a BIGINT
  • NanoID is stored in both engines as a VARCHAR(21)
  • Auto-increment is stored in both engines as an INTEGER

CRUD Times (per engine)

Read

Overall reads are overall the same with some faster results on postgres!

Insertion

As we can see, all formats performed similarly for insertions in both Postgres and MySQL — with one exception: Auto-increment was significantly slower in MySQL. This is likely because the database itself is responsible for generating the ID, introducing additional overhead compared to formats like UUIDv7 or Snowflake, which are generated client-side before insert. It’s a good reminder that engine-managed IDs don’t always guarantee faster writes.

Update

Update performance was fairly consistent across all formats in Postgres, with only slight variation. In MySQL, NanoID was notably slower to update — likely due to its longer string format and variable indexing cost. UUIDv7, UUIDv4, and Snowflake all performed similarly and reliably across both engines.

Delete

For deletes, we can see all formats showed similar deletion performance in Postgres, with only minor variation. In MySQL, UUIDv7 and UUIDv4 performed slightly better than the others, while Auto-increment was consistently the slowest. Overall, Postgres offered faster and more stable deletion times across all ID types.


Pagination Times

Pagination evaluates the performance of retrieving smaller subsets of data across multiple queries using LIMIT and OFFSET.

Snowflake and Auto-increment delivered the fastest pagination times overall, especially in Postgres. UUIDv7 performed solidly but slightly slower than expected, while NanoID and UUIDv4 lagged behind in both engines. Time-ordered formats like Snowflake likely benefit from more cache-friendly and index-aligned data access during pagination.


Batch Retrieval

Batch retrieval measures the speed of fetching a large, fixed number of rows in a single query.

UUIDv7, Snowflake, and Auto-increment all delivered fast batch read times, particularly in Postgres. NanoID showed a sharp slowdown in MySQL, likely due to its string-based indexing overhead. Overall, UUIDv7 offered strong, balanced performance across both engines.


Concurrent Updates

Postgres is a clear winner here due to MVCC

Here, UUIDv7 and Snowflake performed the best under concurrent write pressure in both MySQL and Postgres. Thanks to MVCC, Postgres handled concurrency consistently well. UUIDv7 maintained both index health and write speed — making it ideal for high-concurrency systems.


Potential Options

Based on this testing, the shortlist came down to:

  1. Postgres with native UUID type + UUIDv7
  2. Postgres with BIGINT + Snowflake IDs
  3. MySQL with BIGSERIAL + UUIDv7 or UUIDv4

End Decision

In the end we chose Postgres + UUIDv7, because it gave us:

  • Time-ordered UUIDs with great performance
  • Native storage support (UUID) and compact indexing
  • No coordination overhead
  • Better query consistency under load
  • zero cost differnce compairing it to MySQL hosted instances on ASW

Closing Thoughts

This research helped us make a small but foundational decision early, and avoid expensive rewrites later. UUIDv7 might not be for everyone, but if you care about pagination, performance, and clean indexing in Postgres, it’s a strong default.