Good way to replace invalid characters in firebase keys?

We've dealt with this issue many times. On the surface, using an email as a key seems like a simple solution, but it leads to other issues: the email has to be cleaned or parsed before it can be used as a key, and what happens if the email changes?

We have found that changing how the data is stored is a better path. Suppose you just need to store one thing, the user's name. Instead of

[email protected]: "John Smith"

change it to

randomly_generated_node_name
   email:  "[email protected]"
   first:  "John"
   last:   "Smith"

The randomly_generated_node_name is a string that Firebase can generate via childByAutoId, or really any type of reference that is not tied directly to the data.

This offers a lot of flexibility: you can now change the person's last name, say if they get married, or change their email. You could add an 'index' child (0, 1, 2, etc.) that could be used for sorting. The data can be queried on any child. All of this works because the randomly_generated_node_name is a static reference to the variable child data within the node.

It also allows you to expand the data in the future without altering the existing data. Add address, favorite food, an index for sorting etc.
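
The restructuring above can be sketched in plain JavaScript, with an in-memory object standing in for the database. Everything in this sketch is an assumption made for illustration: the `users` object, the `addUser` and `findByEmail` helpers, and the use of Node's `crypto.randomUUID` in place of a Firebase-generated key. None of it is Firebase API.

```javascript
const { randomUUID } = require('crypto');

// In-memory stand-in for the users node; keys are generated, not derived from data.
const users = {};

// Store a user under a generated key and return that key.
function addUser(email, first, last) {
  const key = randomUUID(); // hypothetical stand-in for childByAutoId
  users[key] = { email, first, last };
  return key;
}

// Find the generated key of the user with a given email by scanning child data.
function findByEmail(email) {
  const match = Object.entries(users).find(([, user]) => user.email === email);
  return match ? match[0] : null;
}

const johnKey = addUser('john@example.com', 'John', 'Smith');
users[johnKey].email = 'john.smith@example.com'; // the email changes, the key does not
console.log(findByEmail('john.smith@example.com') === johnKey); // prints true
```

Because the key never depends on the email, changing the email is a plain update of one child value rather than a move of the whole node.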

Edit: a Firebase query for email in ObjC:

//references all of the users ordered by email
FQuery *allUsers = [myUsersRef queryOrderedByChild:@"email"];

//ref the user with this email
FQuery *thisSpecificUser = [allUsers queryEqualToValue:@"[email protected]"];

//load the user with this email
[thisSpecificUser observeEventType:FEventTypeChildAdded withBlock:^(FDataSnapshot *snapshot) {
  //do something with this user
}];

In the email address, replace each dot (.) with a comma (,). This is a common pattern.

The comma is not allowed in an ordinary (unquoted) email address, but it is allowed in a Firebase key. Conversely, the dot is allowed in email addresses but not in a Firebase key. So direct substitution will solve your problem, and because it is reversible you can index email addresses without looping. Note that Firebase keys also disallow $, #, [, ], and /, some of which are technically legal in the local part of an address, so handle those separately if your addresses may contain them.

You also have another issue.

const cleanEmail = email.replace('.',','); // only replaces first dot

will only replace the first dot, but email addresses can have multiple dots. To replace all the dots, use a regular expression.

const cleanEmail = email.replace(/\./g, ','); // replaces all dots

Alternatively, you could use the split() - join() pattern to replace all dots.

const cleanEmail = email.split('.').join(','); // also replaces all dots
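
Since the substitution is reversible, it can be wrapped in a pair of helpers. This is only a sketch: the names `emailToKey` and `keyToEmail` are made up here, and it assumes the address contains no comma and none of the other characters Firebase rejects in keys.

```javascript
// Replace every dot with a comma so the address can be used as a key.
function emailToKey(email) {
  return email.split('.').join(',');
}

// Reverse the substitution to recover the original address.
function keyToEmail(key) {
  return key.split(',').join('.');
}

const commaKey = emailToKey('john.q.smith@mail.example.com');
console.log(commaKey);             // john,q,smith@mail,example,com
console.log(keyToEmail(commaKey)); // john.q.smith@mail.example.com
```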

I am using the following code to convert the email to a hash, and then using the hash as the key in Firebase:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashingUtils {

    // Generate a 256-bit hash using SHA-256, returned as 64 hex characters.
    public String generateHashkeySHA_256(String email) {
        return hash("SHA-256", email);
    }

    // Generate a 160-bit hash using SHA-1, returned as 40 hex characters.
    public String generateHashkeySHA_1(String email) {
        return hash("SHA-1", email);
    }

    // Shared implementation; returns null if the algorithm is unavailable.
    private String hash(String algorithm, String email) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            byte[] hash = digest.digest(email.getBytes(StandardCharsets.UTF_8));
            return byteToHex(hash); // make it printable
        } catch (NoSuchAlgorithmException ex) {
            ex.printStackTrace();
            return null;
        }
    }

    // Convert each byte to two lowercase hex digits.
    public String byteToHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}

Code for adding the user to Firebase:

public void addUser(User user) {
    Log.d(TAG, "addUser: ");
    DatabaseReference userRef= database.getReference("User");

    if(!TextUtils.isEmpty(user.getEmailId())){
        String hashEmailId = hashingUtils.generateHashkeySHA_256(user.getEmailId());
        Log.d(TAG, "addUser: hashEmailId=" + hashEmailId);
        userRef.child(hashEmailId).setValue(user);
    }
    else {
        Log.d(TAG,"addUser: empty emailId");
    }
}

I can think of two major ways to solve this issue:

  1. Encode/Decode function

Because of the limited set of characters allowed in a Firebase key, one solution is to transform the key into a valid format (encode), and then have an inverse function (decode) to transform the encoded key back into the original key.

A general encode/decode function might transform the original key into bytes and then convert them to a hexadecimal representation. The size of the key might be an issue, though, since hex encoding doubles its length.

Let's say you want to store users using the e-mail as key:

# path: /users/{email} is User;
/users/alice@email.com: {
    name: "Alice",
    email: "alice@email.com"
}

The example above doesn't work because of the dot in the path. So we use the encode function to transform the key into a valid format. alice@email.com in hexadecimal is 616c69636540656d61696c2e636f6d, so:

# path: /users/{hex(email)} is User;
/users/616c69636540656d61696c2e636f6d: {
    name: "Alice",
    email: "alice@email.com"
}

Any client can access that resource as long as they share the same hex function.
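
A shared hex function of this kind could look like the following sketch. The names `hexEncode` and `hexDecode` are hypothetical, and it assumes Node.js, where `Buffer` is available globally. The hex string quoted above decodes to alice@email.com, which is the input used here.

```javascript
// Encode a string as the hex representation of its UTF-8 bytes.
function hexEncode(text) {
  return Buffer.from(text, 'utf8').toString('hex');
}

// Decode a hex string produced by hexEncode back to the original text.
function hexDecode(hex) {
  return Buffer.from(hex, 'hex').toString('utf8');
}

const hexKey = hexEncode('alice@email.com');
console.log(hexKey);            // 616c69636540656d61696c2e636f6d
console.log(hexDecode(hexKey)); // alice@email.com
```

The output uses only the characters 0-9 and a-f, so it never contains a character that Firebase rejects in a key.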

Edit: Base64 can also be used to encode/decode the key. It can be more compact than hexadecimal, but there are several variants. If clients don't share exactly the same variant, lookups will not work properly.

Specialized functions (e.g. ones that handle only e-mails) can also be used. But be sure to handle all the edge cases.
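
One concrete reason the Base64 variant matters: the standard alphabet includes /, which is not allowed in a Firebase key. Below is a minimal sketch of the URL-safe variant, assuming Node.js; the helper names `base64UrlEncode` and `base64UrlDecode` are made up for this example.

```javascript
// Encode text as URL-safe Base64: '+' becomes '-', '/' becomes '_', padding is removed.
function base64UrlEncode(text) {
  return Buffer.from(text, 'utf8')
    .toString('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}

// Reverse the character mapping and decode back to the original text.
function base64UrlDecode(key) {
  const standard = key.replace(/-/g, '+').replace(/_/g, '/');
  return Buffer.from(standard, 'base64').toString('utf8');
}

const b64Key = base64UrlEncode('alice@email.com');
console.log(b64Key.length);           // 20, versus 30 characters for hex
console.log(base64UrlDecode(b64Key)); // alice@email.com
```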

  2. Encode function with original key stored

A one-way transformation of the key is a lot easier. So, instead of using a decode function, just store the original key in the database.

A good encode function for this case is the SHA-256 algorithm. It's a common algorithm with implementations on many platforms, and the chances of collisions are very slim.

The previous example with SHA-256 becomes like this:

# path: /users/{sha256(email)} is User;
/users/55bf4952e2308638427d0c28891b31b8cd3a88d1610b81f0a605da25fd9c351a: {
    name: "Alice",
    email: "alice@email.com"
}

Any client with the original key (the e-mail) can find this entry, because the encode function is known. And however long the original key is, the SHA-256 output always has the same size (64 hexadecimal characters), so it is guaranteed to be a valid Firebase key.
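
The one-way approach can be sketched as follows, assuming Node.js and its built-in `crypto` module. The helper names `sha256Key` and `buildUserEntry` and the shape of the returned object are illustrative assumptions, not part of any Firebase API.

```javascript
const { createHash } = require('crypto');

// One-way encode: SHA-256 of the UTF-8 bytes, as 64 lowercase hex characters.
function sha256Key(email) {
  return createHash('sha256').update(email, 'utf8').digest('hex');
}

// Build the path and record for a user. The original email is kept inside
// the record because the hash cannot be reversed.
function buildUserEntry(name, email) {
  return { path: `/users/${sha256Key(email)}`, value: { name, email } };
}

const entry = buildUserEntry('Alice', 'alice@email.com');
console.log(entry.path.length); // 71: "/users/" (7) plus 64 hex characters
console.log(entry.value.email); // alice@email.com
```

Clients should agree on any normalization (for example trimming or lowercasing the address) before hashing, since even a one-character difference produces an unrelated key.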
