The document is fairly technical, but the key points are as follows:
A UUID is a 128-bit (16-byte) identifier designed to be unique across "Space" and "Time".
By Space, we mean that any two machines generating a UUID at exactly the same time would generate different UUIDs.
By Time, we mean that the same machine would generate a different UUID at each point in time.
RFC 4122 specifies that each UUID should encode its "Variant" (the rules by which it was generated), and the "Version" (more accurately the sub-type), of which there are 5.
A source of confusion is that Version numbers are not in fact incremental revisions of the RFC 4122 specification. Versions merely indicate the methodology used to construct the UUID.
UUID Versions
Version 1 uses the MAC address of the host machine to guarantee Space. However, MAC addresses are not actually globally unique as they can be user-modified. Their inclusion in UUIDs also poses a threat to privacy and security. The creator of the Melissa virus that impacted Windows in the late 90s was caught because their MAC address was included in a UUID found in the virus’ code.
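The privacy concern is easy to see in practice: the node value is stored verbatim in the last 6 bytes of a Version 1 UUID. The following is a minimal sketch using Python's standard uuid module. The node value 0x001122334455 is a made-up placeholder, not a real MAC address.

```python
import uuid

# Hypothetical 48-bit node value standing in for a real MAC address.
FAKE_MAC = 0x001122334455

# Passing node= explicitly avoids reading this machine's real hardware address.
u = uuid.uuid1(node=FAKE_MAC)

# The node can be read straight back out of the UUID...
recovered = u.node

# ...and it is also visible as the final block of the formatted string.
last_block = str(u).split("-")[-1]

print(f"node field: {recovered:012x}")
print(f"last block: {last_block}")
```

Anyone holding the UUID can recover the node value the same way, which is why Version 1 is a poor fit wherever the generating machine should stay anonymous.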
For V1, Time is computed in units of 100 nanoseconds, starting from 00:00:00.00, 15 October 1582. V1 Time is also considered insecure because it is possible for a user to manipulate the system time counter. For other Versions, Time is an input parameter to the generator algorithm.
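To illustrate the V1 timestamp arithmetic, here is a small Python sketch that converts the 60-bit timestamp of a Version 1 UUID back into a calendar date. The helper name v1_datetime and the sample UUID are assumptions for illustration; the sample was constructed by hand so that its timestamp lands on the Unix epoch.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Start of the V1 clock: 00:00:00.00, 15 October 1582 (UTC).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def v1_datetime(u: uuid.UUID) -> datetime:
    """Return the creation time encoded in a Version 1 UUID (hypothetical helper)."""
    if u.version != 1:
        raise ValueError("only Version 1 UUIDs carry this timestamp")
    # u.time counts 100-nanosecond intervals; 10 of them make one microsecond.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


# Hand-built sample whose timestamp field is 0x01b21dd213814000,
# the number of 100 ns intervals between 15 October 1582 and 1 January 1970.
sample = uuid.UUID("13814000-1dd2-11b2-8000-000000000000")
decoded = v1_datetime(sample)
print(decoded.isoformat())
```

Sub-microsecond precision is dropped by the integer division, which is usually acceptable when the goal is only to inspect when a UUID was created.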
Version 2 combines the MAC address with a constant ID (usually a POSIX user ID) to create a unique Space. However, this is still relatively insecure since the last 6 bytes of the UUID will always be the same, meaning that it will be trivial to identify UUIDs that have been generated in the same Space.
Versions 3 and 5 are used to generate UUIDs from "names". They use a constant UUID of any Version (the namespace) as an input to provide constant Space and Time values. Both Versions combine this Space and Time with a user-supplied Input and a hash algorithm, such that the same namespace and name Input will always result in the same UUID. Version 3 hashes the data using the MD5 algorithm and Version 5 uses the more secure SHA-1 algorithm.
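The deterministic behaviour of name-based UUIDs can be sketched with Python's standard uuid module. NAMESPACE_DNS and NAMESPACE_URL are namespace constants that ship with that module; the name "example.com" is only an illustrative input.

```python
import uuid

name = "example.com"  # illustrative input, any string works

# Same namespace + same name -> same UUID, every time.
v3_a = uuid.uuid3(uuid.NAMESPACE_DNS, name)  # MD5-based
v3_b = uuid.uuid3(uuid.NAMESPACE_DNS, name)
v5_a = uuid.uuid5(uuid.NAMESPACE_DNS, name)  # SHA-1-based

# Changing the namespace changes the result, even for the same name.
v5_other = uuid.uuid5(uuid.NAMESPACE_URL, name)

print(v3_a == v3_b)      # repeated generation agrees
print(v5_a == v5_other)  # different namespace, different UUID
```

Because the output depends only on the inputs, these Versions are useful for deriving stable identifiers, but they should not be treated as unpredictable.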
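Version 4 UUIDs are an exception in that both Space and Time are provided by the generating machine's entropy rather than by a MAC address or clock. They are essentially 16 random bytes, apart from the few bits that are overwritten to encode the Version and Variant.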
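As a rough sketch of what a Version 4 generator does, the Python below draws 16 random bytes and then stamps the Version and Variant bits onto them. The function name random_uuid4 is hypothetical and the code is for illustration only; in real code the standard uuid.uuid4() is the simpler choice.

```python
import os
import uuid


def random_uuid4() -> uuid.UUID:
    """Build a Version 4 UUID by hand from 16 random bytes (illustrative helper)."""
    raw = bytearray(os.urandom(16))
    # High nibble of byte 6 holds the Version: force it to 0100 (4).
    raw[6] = (raw[6] & 0x0F) | 0x40
    # Top two bits of byte 8 hold the RFC 4122 Variant: force them to 1 0.
    raw[8] = (raw[8] & 0x3F) | 0x80
    return uuid.UUID(bytes=bytes(raw))


u4 = random_uuid4()
print(u4)
```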
Identifying the Version and Variant
7 bits of each UUID are used to identify the Version and Variant.
                   Variant
                   |
6ce343fa-544b-42c8-b516-acab0b52d61b
              |
              Version
To specify the Version, bits 49-52 of the 128-bit UUID are set to represent the version number. Valid values are:
0 0 0 1 (Version 1)
0 0 1 0 (Version 2)
0 0 1 1 (Version 3)
0 1 0 0 (Version 4)
0 1 0 1 (Version 5)
These values conveniently render the Version in a human readable form as the first number in the 3rd block when the UUID is formatted as specified by the RFC.
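The Version bits can be read back with a shift and a mask. This Python sketch assumes bits are numbered from 1 starting at the most significant bit, as in the text above; the helper name version_bits is an illustration, and Python's own UUID.version attribute gives the same answer for RFC 4122 UUIDs.

```python
import uuid


def version_bits(u: uuid.UUID) -> int:
    """Return bits 49-52 of the UUID as an integer (hypothetical helper)."""
    # Bit 52 (counting from the most significant bit) sits 128 - 52 = 76
    # positions above the least significant bit, so shift by 76 and keep 4 bits.
    return (u.int >> 76) & 0xF


example = uuid.UUID("6ce343fa-544b-42c8-b516-acab0b52d61b")
version = version_bits(example)

# The same value appears as the first character of the 3rd block.
third_block_first_digit = str(example).split("-")[2][0]
print(version, third_block_first_digit)
```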
To specify the Variant, bits 65-67 are set to reflect the specification used. Valid values are:
0 * * (Reserved for NCS backward compatibility)
1 0 * (RFC 4122)
1 1 0 (Reserved by Microsoft for internal backward compatibility)
1 1 1 (Reserved for future use)
NCS is a methodology that preceded the UUID specification and is similar to V1 in that it reveals a hardware MAC address. It is not considered secure and should therefore not be used.
Setting bits 65 and 66 to 1 0 (RFC 4122) results in an octet between 10000000 and 10111111. In hexadecimal, this is 0x80 - 0xbf. Therefore, valid values for the first digit of the 4th block of an RFC 4122 UUID are 8, 9, a and b.
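The Variant rules above can be sketched as a small classifier in Python. The function name classify_variant and the returned labels are assumptions made for this example; the hand-built test UUIDs exist only to exercise each branch.

```python
import uuid


def classify_variant(u: uuid.UUID) -> str:
    """Classify a UUID by bits 65-67, the top three bits of byte 8 (hypothetical helper)."""
    top3 = u.bytes[8] >> 5
    if top3 & 0b100 == 0:
        return "NCS"        # 0 * *
    if top3 & 0b010 == 0:
        return "RFC 4122"   # 1 0 *
    if top3 & 0b001 == 0:
        return "Microsoft"  # 1 1 0
    return "Future"         # 1 1 1


example = uuid.UUID("6ce343fa-544b-42c8-b516-acab0b52d61b")
label = classify_variant(example)

# For an RFC 4122 UUID the 4th block must start with 8, 9, a or b.
fourth_block_first_digit = str(example).split("-")[3][0]
print(label, fourth_block_first_digit)
```

Checking the first digit of the 4th block is equivalent to the bit test for the RFC 4122 case, so a validator can use whichever form is more convenient.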
Further Development
Currently, there is a proposal to introduce 3 more UUID sub-types, V6, V7, and V8. These new Versions are designed to address challenges when using UUIDs as database keys. Should the proposal be approved and RFC 4122 updated, we will change our validation to match.