DynamoDB - Put item if hash (or hash and range combination) doesn't exist
You can't. All items in DynamoDB are indexed by either their `hash` or `hash`+`range` (depending on your table).
A summary of what is going on so far:
- A single hash key can have multiple range keys.
- Every item has both a `hash` and a `range` key.
- You are making a `PutItem` request and must provide both the `hash` and `range`.
- You are providing a `ConditionExpression` with `attribute_not_exists` on either the `hash` or `range` attribute name.
- The `attribute_not_exists` condition merely checks whether an attribute with that name exists; it doesn't care about the value.
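For reference, the request described in the list above might look roughly like this with boto3's low-level client. This is a sketch rather than anything from the original question: the table name `my-table`, the attribute names `hash` and `range`, and the helper `build_put_if_absent` are all assumptions for illustration. The key name goes through an `ExpressionAttributeNames` placeholder because attribute names that collide with DynamoDB reserved words cannot be written directly in an expression (the reserved list appears to include both HASH and RANGE).

```python
def build_put_if_absent(table_name, item, key_attr):
    """Build PutItem keyword arguments that refuse to overwrite an existing item.

    `item` uses DynamoDB's typed attribute format, e.g. {"hash": {"S": "A"}}.
    `key_attr` names the hash (or range) key attribute; either one works.
    """
    if key_attr not in item:
        raise ValueError(f"item must include the key attribute {key_attr!r}")
    return {
        "TableName": table_name,
        "Item": item,
        # "#k" is a placeholder, so names such as "hash" or "range" stay safe
        # even when they are reserved words in DynamoDB expressions.
        "ConditionExpression": "attribute_not_exists(#k)",
        "ExpressionAttributeNames": {"#k": key_attr},
    }


request = build_put_if_absent(
    "my-table",  # hypothetical table name
    {"hash": {"S": "A"}, "range": {"N": "3"}},
    "hash",
)

# With boto3 installed and AWS credentials configured, the call would be roughly:
#   import boto3
#   from botocore.exceptions import ClientError
#   try:
#       boto3.client("dynamodb").put_item(**request)
#   except ClientError as err:
#       if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
#           print("an item with this key already exists")
#       else:
#           raise
```

Passing `"range"` instead of `"hash"` as `key_attr` should behave the same way, for the reasons walked through in this answer.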
Let's walk through an example. Let's start with a `hash`+`range` key table with this data:
hash=A,range=1
hash=A,range=2
There are four possible cases:
1. If you try to put an item with `hash=A,range=3` and `attribute_not_exists(hash)`, the `PutItem` will succeed because `attribute_not_exists(hash)` evaluates to `true`. No item exists with key `hash=A,range=3`, so there is nothing on which a `hash` attribute could be found.
2. If you try to put an item with `hash=A,range=3` and `attribute_not_exists(range)`, the `PutItem` will succeed because `attribute_not_exists(range)` evaluates to `true`. No item exists with key `hash=A,range=3`, so there is nothing on which a `range` attribute could be found.
3. If you try to put an item with `hash=A,range=1` and `attribute_not_exists(hash)`, the `PutItem` will fail because `attribute_not_exists(hash)` evaluates to `false`. An item exists with key `hash=A,range=1`, and that item has a `hash` attribute.
4. If you try to put an item with `hash=A,range=1` and `attribute_not_exists(range)`, the `PutItem` will fail because `attribute_not_exists(range)` evaluates to `false`. An item exists with key `hash=A,range=1`, and that item has a `range` attribute.
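The four cases above can be reproduced with a toy in-memory model. This is only a sketch of the semantics described in this answer, not the real service: `FakeTable`, `ConditionalCheckFailed`, and `try_put` are hypothetical names, and the model handles nothing beyond a single `attribute_not_exists` condition.

```python
class ConditionalCheckFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""


class FakeTable:
    """Toy in-memory model of a hash+range table (not the real service)."""

    def __init__(self, items):
        # Items are stored under their full primary key (hash, range).
        self.items = {(i["hash"], i["range"]): i for i in items}

    def put_if_attribute_not_exists(self, item, attr):
        key = (item["hash"], item["range"])
        # Step 1: look up the one item with the same primary key, if any.
        existing = self.items.get(key)
        # Step 2: the condition fails only when such an item exists
        # and that item carries the named attribute.
        if existing is not None and attr in existing:
            raise ConditionalCheckFailed(f"{attr} exists on {existing}")
        self.items[key] = item


def try_put(table, item, attr):
    try:
        table.put_if_attribute_not_exists(item, attr)
        return "succeeded"
    except ConditionalCheckFailed:
        return "failed"


def fresh_table():
    # Same starting data as the example: hash=A,range=1 and hash=A,range=2.
    return FakeTable([{"hash": "A", "range": 1}, {"hash": "A", "range": 2}])


# The four cases, in the order they are listed above.
results = [
    try_put(fresh_table(), {"hash": "A", "range": 3}, "hash"),
    try_put(fresh_table(), {"hash": "A", "range": 3}, "range"),
    try_put(fresh_table(), {"hash": "A", "range": 1}, "hash"),
    try_put(fresh_table(), {"hash": "A", "range": 1}, "range"),
]
print(results)  # ['succeeded', 'succeeded', 'failed', 'failed']
```

Each case runs against a fresh copy of the table so that an earlier successful put cannot influence a later one.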
This means that one of two things will happen:
- The `hash`+`range` pair exists in the database.
  - `attribute_not_exists(hash)` must be `false`
  - `attribute_not_exists(range)` must be `false`
- The `hash`+`range` pair does not exist in the database.
  - `attribute_not_exists(hash)` must be `true`
  - `attribute_not_exists(range)` must be `true`
In both cases, you get the same result regardless of whether you put the condition on the hash or the range key. The `hash`+`range` key identifies a single item in the entire table, and your condition is being evaluated on that item.
You are effectively performing a "put this item if an item with this `hash`+`range` key does not already exist".
For Googlers:
- (a) `attribute_not_exists` first checks whether an item with the same primary key as the to-be-inserted item exists. If no such item exists, the condition is true.
- (b) If such an item exists, it then checks whether the named attribute exists on that item; the value does not matter.
- If you only want to prevent overwriting, use `attribute_not_exists` with a key attribute (the partition key or the range key). A key attribute is always present on an existing item, so the outcome depends only on check (a).
Reasoning:
- The name `attribute_not_exists` suggests that it checks whether an attribute exists on an item.
- But there are multiple items in the table, so which item does it check against?
- The answer is that it checks against the item with the same primary key as the one you are putting in.
- This happens for all condition expressions.
- But, as always, it is not properly and fully documented.
- See the note below from the official documentation about this feature, and judge its ambiguity for yourself:
Note: To prevent a new item from replacing an existing item, use a conditional expression that contains the attribute_not_exists function with the name of the attribute being used as the partition key for the table. Since every record must contain that attribute, the attribute_not_exists function will only succeed if no matching item exists.
This explanation, taken from the Amazon AWS forum, says that the lookup first finds the item matching the provided hash key and only then checks whether the attribute exists on that item. It should work the same way if you have both a hash and a range key, I suppose.
The request will try to find an existing item with hash key "b825501b-60d3-4e53-b737-02645d27c2ae". If this is the first time this id is being used, there will be no existing item, "attribute_not_exists(email)" will evaluate to true, and the Put request will go through.
If this id is already used, there will be an existing item. The condition expression will then look for an email attribute in the existing item: if there is an email attribute, the Put request will fail; if there is no email attribute, the Put request will go through.
Either way, it's not comparing the value of the "email" attribute, and it's not checking whether other items in the table use the same "email" value.
If email was the hash key, then request will try to find an existing item with hash key "[email protected]".
If there is another item with the same email value, an existing item will be found. Since email is the hash key, it has to be present in the existing item, so "attribute_not_exists(email)" will evaluate to false and the Put request will fail.
If the "email" value has not been used before, no existing item will be found, "attribute_not_exists(email)" will evaluate to true, and the Put request will go through.
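The scenarios in the forum explanation can be modelled in a few lines. This is a sketch of the described behaviour only: the function below is a hypothetical stand-in for how the condition is evaluated, and the ids (`u1`, `u2`, `u3`) and `example.com` addresses are made up for illustration.

```python
from typing import Optional


def attribute_not_exists(existing: Optional[dict], attr: str) -> bool:
    """Model of the condition as evaluated against the item found by key.

    `existing` is the stored item that has the same primary key as the
    item being put, or None when no such item exists.
    """
    return existing is None or attr not in existing


# Table keyed by "id"; "email" is an ordinary (non-key) attribute.
by_id = {
    "u1": {"id": "u1", "email": "a@example.com"},
    "u2": {"id": "u2"},  # stored without an email attribute
}

new_id_ok = attribute_not_exists(by_id.get("u3"), "email")     # id not used yet
has_email_ok = attribute_not_exists(by_id.get("u1"), "email")  # item has email
no_email_ok = attribute_not_exists(by_id.get("u2"), "email")   # item lacks email

# Table keyed by "email": the key attribute is always present on a stored item.
by_email = {"a@example.com": {"email": "a@example.com"}}
used_email_ok = attribute_not_exists(by_email.get("a@example.com"), "email")
new_email_ok = attribute_not_exists(by_email.get("b@example.com"), "email")

print(new_id_ok, has_email_ok, no_email_ok)  # True False True
print(used_email_ok, new_email_ok)           # False True
```

Note the `no_email_ok` case: the condition holds even though an item with that id exists, so the put would overwrite it. That is why a key attribute is the safer choice when the goal is only to prevent overwriting.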