Mehdi Valette - 12 février 2026

Integer or UUID primary keys in relational databases?

Integer or UUID?

Database primary keys are integers by default, but they can leak information and make merging databases more challenging. To address these issues, some people use random UUIDv4 for primary keys. UUIDv4 come with their own drawbacks, but there's a way to get the best of both worlds!

TL;DR

Use integers for primary keys as they are more performant. Add a UUIDv4 in an additional column if necessary. In any case, avoid exposing primary keys outside the database.

Integers

Most relational databases default to integers for primary keys, whether it's integer, unsigned integer, big integer, etc.

erDiagram

Student {
  int student_id
  string full_name
}

Course {
  int course_id
  string name
}

Student_Course {
  int student_id
  int course_id
}

Student ||--o{ Student_Course : ""
Student_Course }o--|| Course : ""

// seen by the client
{
  "student": 
  {
    "student_id": 1,
    "full_name": "Alice",
    "courses": [
      {
        "course_id": 1,
        "name": "Mathematics"
      },
      {
        "course_id": 2,
        "name": "Physics"
      }
    ]
}

Advantages

They come in ascending order. That helps the database to organize its index, which improves insertion and search performance. When we query one record in the main table, we possibly query tens of records in linked tables. Therefore, optimized primary keys are an important factor in the database's performance.
They have a small size. That's especially useful for many-to-many relationships, where a bridge table contains mainly primary keys. If each key is half the size, the overall table is half the size, which not only saves space, but also enables caching more records for faster lookup.

For example, to get a student with their courses, we need to get just one record from Student, but many records from Student_Course, and many records from Course.

Drawbacks

They leak information. Ascending numbers can expose information that we prefer to keep hidden. For example, if a student has id = 1500 when other students in their class have id > 1680, and we know that each year about 100 students join the school, we deduce that the student with id = 1500 is probably repeating the grade. Disclosing more information than intended is a risk for data protection and security (collecting a lot of benign information can facilitate an exploit).
Merging databases is more challenging. When merging two databases together, we must change all the primary keys and all the foreign keys from at least one of the databases. Discovering and rectifying a mistake can prove difficult because nothing identifies the records outside of their initial table.
Uniquely identifying records. To keep track of an entity's events, for example for auditing, we need to log the table name alongside the data for the primary key to be meaningful. It becomes especially challenging when the table names change over time, or if two databases are merged, resetting all the primary keys.

UUIDv4

To avoid the drawbacks of integers, some developers use UUIDv4 as primary keys. These randomly generated, universally unique identifiers resolve most of the drawbacks of integers, but they also lack their advantages.

erDiagram

Student {
  uuid student_id
  string full_name
}

Course {
  uuid course_id
  string name
}

Student_Course {
  uuid student_id
  uuid course_id
}

Student ||--o{ Student_Course : ""
Student_Course }o--|| Course : ""

// seen by the client
{
  "student": 
  {
    "student_id": "0943745f-3618-4b79-8dab-9ee516cae5d3",
    "full_name": "Alice",
    "courses": [
      {
        "course_id": "25a8da31-27e8-439a-8f29-b2eb0d7981d2",
        "name": "Mathematics"
      },
      {
        "course_id": "f0bb78eb-426b-4d18-ac64-ce6ee443de2e" ,
        "name": "Physics"
      }
    ]
}

Advantages

No information leakage. The version 4 of UUID is random, so reading the primary key doesn't disclose any information. Note that this is only true for UUIDv4; other versions of UUID contain information such as the timestamp of its creation.
Merging databases is easier. Merging two databases becomes much easier because the primary keys are very likely unique across all tables of all databases.
Identify each record uniquely. Each primary key identifies a record not only in its own table, but in the world. That comes in handy for logging, audit, or global search.

Drawbacks

No natural order. The lack of order in UUIDv4 means updating the index is harder, thereby slowing down the database. It isn't a problem in small databases, but the cost increases at scale.
Larger than integers. UUID are four times as large as 32-bit integers (INT), and twice as large as 64-bit integers (BIGINT). The difference becomes noticeable at scale in bridge tables.

Each student and course get their own UUID, giving them a unique identifier in the entire world, and no further information can be inferred from the primary key. However, there's a performance penalty, especially for traversing the Student_Course table.

Best of both worlds

Primary keys and unique identifiers serve different purposes. The primary key (I name the column id) is used by the database to organize its records. The unique identifier (I name the column code) is used by external applications to identify the records.

Important: the primary key is used only by the database, it shouldn't be given to any other party. It cannot leak information if it doesn't leave the database. As far as other applications are concerned, the code is the record's primary key.

erDiagram

Student {
  int student_id
  uuid student_code
  string full_name
}

Course {
  int course_id
  uuid course_code
  string name
}

Student_Course {
  int student_id
  int course_id
}

Student ||--o{ Student_Course : ""
Student_Course }o--|| Course : ""

// seen by the client
{
  "student": 
  {
    "student_code": "0943745f-3618-4b79-8dab-9ee516cae5d3",
    "full_name": "Alice",
    "courses": [
      {
        "course_code": "25a8da31-27e8-439a-8f29-b2eb0d7981d2",
        "name": "Mathematics"
      },
      {
        "course_code": "f0bb78eb-426b-4d18-ac64-ce6ee443de2e" ,
        "name": "Physics"
      }
    ]
}

This method solves all the problems exposed so far:

leaking information
- The result from the client's perspective is similar to having UUID primary keys. They can uniquely identify each record, but no further information can be inferred.
database performance
- Despite increasing the size of the tables Student and Course, the impact on performance is minimal. Every primary and foreign keys are integers, so the most frequently used indexes are optimized.
merging databases
- To merge databases, we use the code columns to distinguish each record. Additionally, we can more easily verify the correctness of the merged database.
uniquely identifying each record
- Each record is identified uniquely in the world, facilitating logging, audit, and global search.

Conclusion

I advise the use of integer primary keys. If you need to obfuscate the primary key or to uniquely identify each record, you can easily add a column of type UUID. Whenever possible, avoid exposing integer primary keys to users.

Criteria	Integer Primary Key	UUIDv4 Primary Key
Size	4 bytes (INT) or 8 bytes (BIGINT)	16 bytes
Index performance	Excellent (sequential inserts, minimal fragmentation)	Slower (random inserts, more page splits in B-tree indexes)
Insertion speed	Very fast (append-friendly)	Slower at scale due to random distribution
Cache efficiency	Better (more rows per page)	Lower (fewer rows per page)
Information leakage	Can enable unintended inferences	Hide meaningful information
Global uniqueness	Unique only within a table and a database	Practically globally unique
Database merging	Complex (ID collisions)	Easy (collisions extremely unlikely)

Notes

You may be wondering:

Why UUID instead of random strings?

UUID is a well supported standard. Many databases and applications handle the generation, indexation, and storage of UUID much more efficiently than random text.

Why UUID instead of random bytes?

Even if you put enough effort to match the performance of UUID, the size, serialization and representation of UUID is standardized. If you use random bytes, you reinvent the wheel and lose the interoperability with other systems.

What about other types of UUID or ULID?

Other types of UUID and ULID facilitate indexation, but reveal some information. For example, several UUID versions and ULID start with a millisecond timestamp. The key is easier to index because the timestamp always increases, but it reveals the date and time of the record's creation. It wouldn't solve the problem of the student id = 1500 described in this article.