How to generate 8 bytes unique id from GUID?

You cannot distill a 16-bit value down to an 8-bit value while still retaining the same degree of uniqueness. If uniqueness is critical, don't "roll your own" anything. Stick with GUIDs unless you really know what you're doing.

If a relatively naive implementation of uniqueness is sufficient then it's still better to generate your own IDs rather than derive them from GUIDs. The following code snippet is extracted from a "Locally Unique Identifier" class I find myself using fairly often. It makes it easy to define both the length and the range of characters output.

using System.Security.Cryptography;
using System.Text;

public class LUID
{
    private static readonly RNGCryptoServiceProvider RandomGenerator = new RNGCryptoServiceProvider();
    private static readonly char[] ValidCharacters = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789".ToCharArray();
    public const int DefaultLength = 6;
    private static int counter = 0;

    public static string Generate(int length = DefaultLength)
    {
        var randomData = new byte[length];
        RandomGenerator.GetNonZeroBytes(randomData);

        var result = new StringBuilder(DefaultLength);
        foreach (var value in randomData)
        {
            counter = (counter + value) % (ValidCharacters.Length - 1);
            result.Append(ValidCharacters[counter]);
        }
        return result.ToString();
    }
}

In this instance it excludes 1 (one), I (i), 0 (zero) and O (o) for the sake of unambiguous human-readable output.

To determine just how effectively 'unique' your particular combination of valid characters and ID length are, the math is simple enough but it's still nice to have a 'code proof' of sorts (Xunit):

    [Fact]
    public void Does_not_generate_collisions_within_reasonable_number_of_iterations()
    {
        var ids = new HashSet<string>();
        var minimumAcceptibleIterations = 10000;
        for (int i = 0; i < minimumAcceptibleIterations; i++)
        {
            var result = LUID.Generate();
            Assert.True(!ids.Contains(result), $"Collision on run {i} with ID '{result}'");
            ids.Add(result);
        }            
    }

No, it won't. As highlighted many times on Raymond Chen's blog, the GUID is designed to be unique as a whole, if you cut out just a piece of it (e.g. taking only 64 bytes out of its 128) it will lose its (pseudo-)uniqueness guarantees.


Here it is:

A customer needed to generate an 8-byte unique value, and their initial idea was to generate a GUID and throw away the second half, keeping the first eight bytes. They wanted to know if this was a good idea.

No, it's not a good idea. (...) Once you see how it all works, it's clear that you can't just throw away part of the GUID since all the parts (well, except for the fixed parts) work together to establish the uniqueness. If you take any of the three parts away, the algorithm falls apart. In particular, keeping just the first eight bytes (64 bits) gives you the timestamp and four constant bits; in other words, all you have is a timestamp, not a GUID.

Since it's just a timestamp, you can have collisions. If two computers generate one of these "truncated GUIDs" at the same time, they will generate the same result. Or if the system clock goes backward in time due to a clock reset, you'll start regenerating GUIDs that you had generated the first time it was that time.


I try to use long as unique id within our C# application (not global, and only for one session.) for our events. do you know the following will generate an unique long id?

Why don't you just use a counter?

Tags:

C#

.Net