Let's imagine a scenario where you have an endpoint used to create a user. This is within a RESTful application, so let's imagine that a rich client calls this API endpoint.
exports.createUser = function (req, res) {
    if (req.body) {
        // Check if email has already been used
        db.User.find({ where: { email: req.body.email } }).success(function (user) {
            if (user === null || user === undefined) {
                // Create user
                res.send(201);
            } else {
                res.json(409, { error: 'User already exists' });
            }
        });
    } else {
        res.send(400);
    }
};
If I were to call this endpoint multiple times in quick succession, it would be possible to create multiple records with the same email in the database, even though the handler queries the user table first to make sure there would be no duplicate.
As a matter of fact, I've managed to reproduce this with a little stress tester I made and a Node app that uses PostgreSQL.
I'm sure this is a common problem, but how would one go about preventing this issue? I thought of limiting the number of requests to certain endpoints, but that doesn't seem like a very good solution.
Any ideas? Thank you very much!
The simplest option is to LOCK TABLE "users" IN EXCLUSIVE MODE at the beginning of the transaction that does the find then the insert. This ensures that only one transaction can be writing to the table at a time.
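As a rough illustration of the locking approach, here is a minimal sketch. It assumes `client` is a connected node-postgres (`pg`) client, or anything else exposing a promise-returning `query(text, params)` that resolves to `{ rows }`. The function name `createUserWithLock` and the single-column insert are hypothetical and only stand in for whatever your real schema and data layer need.

```javascript
// Sketch only: serialize "find then insert" behind a table lock.
// Assumption: `client` exposes query(text, params) -> Promise<{ rows }>,
// as a node-postgres Client does. The "users" table is assumed to have
// an "email" column; everything else about the schema is left out.
async function createUserWithLock(client, email) {
  await client.query('BEGIN');
  try {
    // Other transactions that write to "users" (or request the same lock)
    // wait here until this transaction commits or rolls back.
    await client.query('LOCK TABLE "users" IN EXCLUSIVE MODE');

    const found = await client.query(
      'SELECT 1 FROM "users" WHERE email = $1',
      [email]
    );
    if (found.rows.length > 0) {
      await client.query('ROLLBACK');
      return { status: 409, error: 'User already exists' };
    }

    await client.query('INSERT INTO "users" (email) VALUES ($1)', [email]);
    await client.query('COMMIT');
    return { status: 201 };
  } catch (err) {
    // Release the lock on any failure, then let the caller see the error.
    await client.query('ROLLBACK');
    throw err;
  }
}
```

The handler would then map the returned `status` onto the HTTP response. Because every writer queues behind the lock, this trades throughput for simplicity.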
For better concurrency, you can:
Define a UNIQUE constraint on email, then skip the find step. Attempt the insert and if it fails, trap the error and report a duplicate; or
Use one of the insert-if-not-exists techniques known to be concurrency-safe
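Both options can be sketched as follows. This is an illustration under assumptions, not a drop-in implementation: `client` is again assumed to be a node-postgres style client with a promise-returning `query(text, params)`, the function names and the constraint name `users_email_key` are hypothetical, and the second function uses `INSERT ... ON CONFLICT DO NOTHING`, which is just one insert-if-not-exists technique and needs PostgreSQL 9.5 or later.

```javascript
// Assumed one-time schema change (the constraint name is hypothetical):
//   ALTER TABLE "users" ADD CONSTRAINT users_email_key UNIQUE (email);

// PostgreSQL SQLSTATE for unique_violation; node-postgres exposes it as err.code.
const UNIQUE_VIOLATION = '23505';

// Option 1: skip the find, attempt the insert, trap the duplicate error.
async function createUser(client, email) {
  try {
    await client.query('INSERT INTO "users" (email) VALUES ($1)', [email]);
    return { status: 201 };
  } catch (err) {
    if (err.code === UNIQUE_VIOLATION) {
      return { status: 409, error: 'User already exists' };
    }
    throw err; // anything else is a genuine failure
  }
}

// Option 2: let the database skip duplicates. RETURNING yields a row only
// when the insert actually happened, so zero rows means the email existed.
async function createUserIfAbsent(client, email) {
  const result = await client.query(
    'INSERT INTO "users" (email) VALUES ($1) ' +
      'ON CONFLICT (email) DO NOTHING RETURNING email',
    [email]
  );
  return result.rows.length === 1
    ? { status: 201 }
    : { status: 409, error: 'User already exists' };
}
```

In both cases the uniqueness check is done by the database itself, so two concurrent requests cannot both succeed, and no separate find query is needed.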
If using a unique constraint, one thing to consider is that your app might mark users as disabled without deleting them, and probably doesn't want to force email addresses to be unique for disabled users. If so, you might want a partial unique index instead (see the docs).
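A partial unique index for that case might look like the sketch below. The boolean `disabled` column and the index name `users_active_email_key` are assumptions made up for illustration, so adapt them to however your app actually flags disabled users.

```javascript
// Hypothetical schema: "users" has a boolean "disabled" column.
// Uniqueness is enforced only across rows where disabled is false,
// so a disabled account does not block reuse of its email address.
const CREATE_ACTIVE_EMAIL_INDEX =
  'CREATE UNIQUE INDEX users_active_email_key ' +
  'ON "users" (email) WHERE NOT disabled';

// Assumption: `client` exposes query(text) returning a promise,
// as a node-postgres Client does. Run this once, e.g. in a migration.
async function createActiveEmailIndex(client) {
  await client.query(CREATE_ACTIVE_EMAIL_INDEX);
}
```

With this index in place, inserting a second active user with the same email still fails with a unique violation, while disabled rows are ignored by the check.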