Unique database fields
Is it possible to create a unique constraint / index in Convex? Similar to what you'd find in a standard RDBMS.
33 Replies
it's not currently possible to create a unique constraint that is enforced by Convex. however, you can enforce it yourself:
Am I at risk of race conditions here? I've read In https://docs.convex.dev/database/advanced/occ but it's not clear to me how Convex handles that kind of scenario.
OCC and Atomicity | Convex Developer Hub
In Queries, we mentioned that determinism
Actually, no, I think I get it
It's basically optimistic locking. If that fails then retry the function.
Isn't a limitation of this approach that I can't call another function from a mutation? Since I might pass it a value from the DB that turns out to be stale - and I assume I can't roll it back if I fail.
you're correct that it's optimistic locking. the entire mutation is a transaction, so it will be committed or fail atomically
what kind of function do you want to call? you can call any function that is deterministic and has no side effects from a mutation. if you want nondeterminism or side effects you can use an Action https://docs.convex.dev/functions/actions -- either
runMutation
from an action, or schedule the action to run in the background with scheduler.runAfter(...)
from the mutation.Actions | Convex Developer Hub
Actions can call third party services to do things such as processing a payment
the determinism & no-side-effect conditions aren't something you need to verify yourself. the entire mutation runs in a javascript environment that isn't allowed to have side effects or be nondeterministic. so as long as the code runs, it will work and have no race conditions
@yarrichar it's helpful to hear this question, this is a special thing about Convex the docs could make this clearer. You can run arbitrary JavaScript (including arbitrary function calls) in a mutation and it will all be rolled back if necessary. Scheduled non-deterministic functions will not be scheduled if the transaction doesn't succeed. JavaScript functions are tied so tightly to the DB that you can use JavaScript to enforce constraints you'd express in SQL in other systems.
Ahhh, that makes sense. So each call to another mutation / function is basically treated as an event / message that is only processed / sent once the current mutation is committed? I assume these mutations are covered by different transactions, so I can have the first succeed and the second fail?
If that's right it might be worth clarifying that bit, since it's not hard to refactor that scenario to include the logic for both mutations in a single mutations / transaction - if I know I need to.
Function calls are normal function calls, but yes if you use
scheduler.runAfter(0, api.another.mutation)
to schedule another mutation that scheduling will only occur if/when the current mutation transaction is committed.In terms of function calls being normal function calls I understand - if I already have mutationA though, and I want to include that logic in mutationB then I might be tempted to just do runMutation(api.example.mutationA) instead of extracting the logic of mutationA out into a separate function and calling that.
I also want to say - I love what you guys have done. As I've been telling people Convex is going to be my first choice for new projects 🙂
runMutation is specific to Actions, it doesn't exist in mutations. It's difficult to accidentally do something nonatomic in a mutation.
scheduler.runAfter(0, othermutation)
is one of the only waysAhh, cool (sorry still learning Convex) - and I guess the point is that if people are going to that much effort to run otherMutation then hopefully they know what they're doing.
no worries. Thanks for the great feedback! Convex does things a little differently, and we're still learning how to describe it 😅
You can call the other mutation directly like await mutationB(ctx, args) and it’ll be included in the same transaction, like other helper fn calls. In case that isn’t clear. You don’t have to factor the shared logic out to call the same logic
Can you also call queries from within mutations or other queries? If so, does caching apply to those inner query calls?
You can, but query-level caching does not apply. Also note that argument validation will not run for the query/mutation called like this.
Consider this a (happy?) design accident: Convex queries and mutations are just functions, which you aren't prevented from calling directly — this is especially useful in testing. But the special properties of Convex queries and mutations come from invoking them in an official capacity, via one of the supported methods like from the client.
Talking it over with colleagues, calling one mutation from another directly is unsupported behavior: it happens to work today but could be deprecated.
We've very allowing here to give users flexibility when writing tests but since there are important parts of a normal query/mutation call that happen before the first line of code in that function runs (e.g. argument validation), behavior when doing this could be surprising.
Actually, just thinking about this again, I don't think it'll work when we have two queries that are both trying to insert the same thing, since there is nothing to optimistically lock on yet.
Only one if the mutations will be committed / will succeed if both transactions check for something already existing. Optimistic locking doesn't require that something already exists.
(rereading I'm not sure I understand your message, curious to hear more)
Normally the way optimistic locking works is to maintain some sort of version field on the entity, and then make sure that that hasn't changed since the entity was read from the database when doing the update. But in the case of an insert, where the document / entity doesn't exist yet we can't check the version field.
And unless I'm misunderstanding something that means the suggested approach for enforcing unique constraints doesnt work in the case where we have two threads trying to insert the same record
(and that record doesn't existing yet)
Ah ok I follow. Convex OCC doesn't work quite like this.
The way OCC works in Convex is that whichever mutation finishes first will commit, which will invalidate reads done by the second mutation when it finishes and checks this, so will need to be retried. The reason the write and read sets overlap is based on the way the db.query is written.
In Lee's example upthread, the "on read" query uses the by_my_unique_id index to query for a myUniqueId value of myId. This same index is written to when writing the record. The read set of the first transaction includes this entry in the index, and the write set of the conflicting transaction includes it too.
The entry in the index doesn't need to exist to be able to reason about it. More generally a read set might be a range in a table or an index, 100 < ts < 200 or "a" < name < "b", and for a equality check it's a very thin range of just the sliver where my myUniqueId = myId. Instead of optimistic locking checking the version number on an entity, think checking the logical timestamp of anything overlapping a range.
It's simpler to reason about this for ranges on tables directly but the same applies to indexes. The take-home message is that you get real transactions, that run atomically and as though they took place at a point in time — because after a transaction runs but before it commits Convex checks for any writes to things it read and reruns if so. These checks are complete, there's no way to smuggle in DB data into a mutation without adding it to its read set unless you e.g. read the db yesterday and passing in stale data as an argument.
@yarrichar our chief scientist @sujayakar is typing up a "how convex works" article we'll publish soon on stack. but the short answer is, we do the right thing here. as tom has said, we actually express dependencies on read ranges not just read ids. so ranges where a new entry was interted will cause an invalidation or conflict detect correctly, etc
so it's safe to just write your transactions the simple way and convex will take care of it
until that article is published, there is a short info admonition in the tutorial here that describes in brief how this works: https://docs.convex.dev/tutorial/reactor#realtime-is-all-the-time
1: The reactor | Convex Developer Hub
Learn about Convex's reactive database accessed through TypeScript cloud
(the same mechanism used to invalidate/update subscriptions is used to detect mutation conflicts)
Is the index required for this to work? E.g. if I did the query as:
Instead of:
Would that tracking still work?
It would, it’s just less efficient.
Yup
Looking forward to the article. None of this is me not trusting that you guys / convex will do the right thing btw. But I like to understand what's going on / the constraints of the environment I'm working in.
Definitely! And people are owed that. 100% why we’re writing the “how convex works” article.
Since that query reads the entire table "tbl" any write to that table while that transaction was running would be a conflict and the mutation would be run again, at which point it would see this new record.
Yep without an index, the conflict range is the whole table.
Ah, cool, that's even simpler than I was thinking about it
With an index, it’s narrowed to the index predicate.
Pagination affects it as well.
Understood.
Thanks for all your help guys!
Thanks for the great questions. ☺️