punn
Convex Community · 3y ago
4 replies

Taking latest timestamp for duplicates

For a specific conversation, our app fetches messages from an external service and stores them in our Convex `messages` table. Before storing, we query `messages` to see if there are any existing instances with the same `userId` and `messageId`. If a match exists, we `patch()`; otherwise we `insert()`.
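A minimal sketch of that upsert mutation, assuming a `by_user_message` index on (`userId`, `messageId`) and a hypothetical `body` field (the table, index, and field names are illustrative):

```typescript
// Sketch only: assumes a "messages" table with an index
// "by_user_message" on ["userId", "messageId"] defined in schema.ts.
import { mutation } from "./_generated/server";
import { v } from "convex/values";

export const storeMessage = mutation({
  args: { userId: v.string(), messageId: v.string(), body: v.string() },
  handler: async (ctx, { userId, messageId, body }) => {
    // Selective indexed read: only rows matching this (userId, messageId)
    // pair end up in the mutation's read set, which keeps OCC conflicts
    // scoped to mutations that actually touch the same message.
    const existing = await ctx.db
      .query("messages")
      .withIndex("by_user_message", (q) =>
        q.eq("userId", userId).eq("messageId", messageId)
      )
      .unique();
    if (existing) {
      await ctx.db.patch(existing._id, { body });
    } else {
      await ctx.db.insert("messages", { userId, messageId, body });
    }
  },
});
```

Using `withIndex` with a fully-specified range is also what the OCC error message below recommends, since a full table scan would put every row in the read set.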

As we run through this workflow for thousands of messages, we split them up and call our storage mutation once per batch of messages. We also use the scheduler to run this message fetching for multiple conversations concurrently.
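The batching step above can be sketched with a generic helper (the helper name and batch size are illustrative):

```typescript
// Hypothetical helper: split a large list of fetched messages into
// fixed-size batches, then make one storage-mutation call per batch.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// e.g. 1000 fetched messages in batches of 100 → 10 mutation calls
const batches = chunk(Array.from({ length: 1000 }, (_, i) => i), 100);
// batches.length === 10
```

Smaller batches mean each mutation reads and writes fewer rows, which shrinks the window for the OCC conflicts described below.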

Since we're reading and mutating the same fields across these concurrent mutations, this raised the following error:
OptimisticConcurrencyControlFailure: Data read or written in this mutation changed while it was being run. Consider reducing the amount of data read by using indexed queries with selective index range expressions (https://docs.convex.dev/database/indexes/).


As @presley mentioned, we should either delay each scheduled fetching/storage flow, or ensure we aren't mutating/reading the same fields across scheduled actions.
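One way to apply the delay suggestion is to stagger the scheduled run for each conversation so their storage mutations are less likely to overlap. A sketch (the function reference, argument shape, and 1s spacing are illustrative):

```typescript
// Inside an action or mutation: schedule one fetch-and-store run per
// conversation, spaced apart to reduce concurrent writes to the same rows.
for (const [i, convoId] of conversationIds.entries()) {
  await ctx.scheduler.runAfter(
    i * 1000, // 1s apart; tune to the workload
    internal.messages.fetchAndStore,
    { convoId }
  );
}
```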

Not querying the table before storage and just inserting the `userId` and `messageId` fields is one possible solution. This would slightly increase complexity (and latency?) when fetching messages for a specific conversation (i.e. some `userId`, `messageId`, `convoId`), since we'd have to deduplicate on read. And since we're fetching new messages frequently, this might bloat our tables even with a daily cleanup cron job.
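With that insert-only approach, the read side would take the latest timestamp among duplicates (as in the thread title), along the lines of the following sketch (the row type and field names are assumptions):

```typescript
// Hypothetical row shape for the insert-only variant.
type StoredMessage = {
  userId: string;
  messageId: string;
  timestamp: number;
  body: string;
};

// Read-side dedup: among duplicate (userId, messageId) rows,
// keep only the row with the latest timestamp.
function latestPerMessage(rows: StoredMessage[]): StoredMessage[] {
  const latest = new Map<string, StoredMessage>();
  for (const row of rows) {
    const key = `${row.userId}:${row.messageId}`;
    const seen = latest.get(key);
    if (!seen || row.timestamp > seen.timestamp) {
      latest.set(key, row);
    }
  }
  return [...latest.values()];
}
```

This trades write-time coordination (and its OCC conflicts) for extra work and storage on the read path.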

Which solution approaches are recommended for this kind of workflow?