Taking latest timestamp for duplicates
We store incoming messages in a messages table. Before storing a message, we query the table for an existing document with the same userId and messageId; if one exists we patch() it, otherwise we insert() a new document. Since this workflow runs over thousands of messages, we split them into batches and call our storage mutation once per batch. We also use the scheduler to run this message fetching for multiple conversations.
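For reference, here's a minimal sketch of the upsert mutation we're running (the `by_user_message` index, the field layout, and the `upsertBatch` name are illustrative, not our exact code):

```ts
import { mutation } from "./_generated/server";
import { v } from "convex/values";

// Hypothetical batch-upsert mutation: for each message, patch the
// existing document (matched on userId + messageId) or insert a new one.
export const upsertBatch = mutation({
  args: {
    messages: v.array(
      v.object({
        userId: v.string(),
        messageId: v.string(),
        convoId: v.string(),
        body: v.string(),
        timestamp: v.number(),
      })
    ),
  },
  handler: async (ctx, { messages }) => {
    for (const msg of messages) {
      // Assumes an index on ["userId", "messageId"] named by_user_message.
      const existing = await ctx.db
        .query("messages")
        .withIndex("by_user_message", (q) =>
          q.eq("userId", msg.userId).eq("messageId", msg.messageId)
        )
        .unique();
      if (existing) {
        await ctx.db.patch(existing._id, msg);
      } else {
        await ctx.db.insert("messages", msg);
      }
    }
  },
});
```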
Since we're reading and mutating the same fields across these concurrent mutations, this raised the following error:
OptimisticConcurrencyControlFailure: Data read or written in this mutation changed while it was being run. Consider reducing the amount of data read by using indexed queries with selective index range expressions (https://docs.convex.dev/database/indexes/).

As @presley mentioned, we should either delay each scheduled fetch/store run, or make sure we aren't reading and mutating the same documents across concurrently scheduled actions.
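One way to apply the delay suggestion is to stagger the scheduled runs so the per-conversation mutations are less likely to overlap in time. A sketch, assuming a hypothetical `internal.messages.fetchConversation` action and a fixed per-conversation offset:

```ts
import { internalMutation } from "./_generated/server";
import { internal } from "./_generated/api";
import { v } from "convex/values";

// Hypothetical kickoff mutation: instead of scheduling every
// conversation fetch immediately (runAfter(0, ...)), spread them out
// so their read/write sets don't collide at the same instant.
export const scheduleFetches = internalMutation({
  args: { convoIds: v.array(v.string()) },
  handler: async (ctx, { convoIds }) => {
    const STAGGER_MS = 500; // assumed spacing; tune to your workload
    for (let i = 0; i < convoIds.length; i++) {
      await ctx.scheduler.runAfter(
        i * STAGGER_MS,
        internal.messages.fetchConversation, // assumed action name
        { convoId: convoIds[i] }
      );
    }
  },
});
```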
One possible solution is to skip the pre-storage query and just insert() every message with its userId and messageId, then deduplicate at read time by taking the document with the latest timestamp. This would slightly increase complexity (and latency?) when fetching messages for a specific conversation (i.e. some userId, messageId, convoId combination). And since we're fetching new messages frequently, this might bloat our tables even if we have some daily cleanup cron job. Which solution approaches are recommended for this kind of workflow?
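For concreteness, here's roughly what read-time dedupe could look like under that insert-only approach (the `by_user_message_time` index name and its field order are assumptions):

```ts
import { query } from "./_generated/server";
import { v } from "convex/values";

// Hypothetical read-time dedupe: with an index on
// ["userId", "messageId", "timestamp"], the latest copy of a message
// is simply the first result when reading the index in reverse.
export const latestMessage = query({
  args: { userId: v.string(), messageId: v.string() },
  handler: async (ctx, { userId, messageId }) => {
    return await ctx.db
      .query("messages")
      .withIndex("by_user_message_time", (q) =>
        q.eq("userId", userId).eq("messageId", messageId)
      )
      .order("desc") // descending timestamp within the matched range
      .first();
  },
});
```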
