Most efficient way to count (more than 16k entries)?
I want to count items within a table with more than 16384 entries that fit the query. However, the maximum amount of entries that can be read within a single query is 16384. What is the solution to getting the count for such a query?
21 Replies
Additional note: I need to do this server-side, and the result of the count is important (i need to compare this count as a condition for an action)
I would "denormalize" (in other words "duplicate") this state: Store the count in some other table and increment it / decrement it as needed.
In the future Convex might make it easier to maintain derived/computed aggregates like this, but no concrete plans yet.
Example in this template:
https://github.com/get-convex/convex-nextjs-app-router-demo/blob/main/convex/posts.ts#L104-L107
GitHub
convex-nextjs-app-router-demo/convex/posts.ts at main · get-convex/...
Demo showing a Next.js App Router app powered by Convex backend - get-convex/convex-nextjs-app-router-demo
Ah, I see, I was hoping for some shortcut, but it seems like this is the only way then. So I will have to edit every function that touches this table to update the count whenever it makes changes to this table.
all database operations within a mutations are guaranteed to be atomic right? so for example, if i Promise.all 3 inserts somewhere in my mutation, then await increment 3 below, if any of the 3 inserts fail the increment will not happen and vice versa right?
@Michal Srb I implemented this but I ran into some problems. For example, when I promise.all multiple inserts, and use
It does not accurately update the count due to the nature of promise.all
Yeah you'll need to do it subsequently, eg. via for loop. It's quite fast, though.
Personally, I've found it really helpful to have dedicated functions for each table that handle insert, patch, and delete. Nice to have one place to update for this kind of functionality.
That's what I do as well, I also try to abstract out the common helpers that can be used across all tables. I write these dedicated functions that deal with all the underlying logic, but it's just that I almost always use promise all instead of sequentially so there is a bit of logic to update all around
I abstract as much as I can but the logic for updating counts is slightly different for each one, and also the IDs that I need to use to identify items on each table is different too, so those parts I will define for each table.
Found a good intermediate solution, which is replacing all the offending Promise.alls with
Nice đź‘Ť
I still have lots of problems. For context, I am dealing with large data streams, and in order to stay within the 16k reads/writes per query, I have to batch them out via multiple actions. I found that these scheduled batch actions still end up running at the same time and breaking my artificial unique constraint yet again.
If you’re logically ensuring against concurrency within each transaction, convex provides ACID guarantees between the transactions themselves. Is the reducer you shared above running in an action or a mutation?
Okay wait - I just saw your mention that “all database operations within a mutation are guaranteed to be atomic”, and it was thumbed up.
@Michal Srb I have not found this to be the case - within a mutation, if I concurrently query and update the same record, say incrementing a number by one, the result is always a single increment - OP here is seeing the same. I take this to mean that ACID applies to an entire query or mutation, but not to individual database reads and writes.
What are we missing here?
Yes, you should not
Promise.all
"read+writes" operations, as we do not provider subtransactions inside a mutation. Either run them serially or each as an individual mutation.So to clarify, when @milk enjoyer said:
all database operations within a mutations are guaranteed to be atomic right?This is not the case, correct?
There's an important thing you're right about here but to step back, there are a few different relevant properties here. "Atomic" generally means "either applies fully or does not apply at all."
Oops got distracted before writing more, I'll write something better up here.
You're right that within a mutation it's not safe to do multiple concurrent multi-step reads/writes.
The example above of Promise.all of denormalized inserts is good, here's another example:
This isn't safe for async things in JavaScript in the browser or in Node.js or in Convex.
Since the unit of transaction / the unit of retry is a mutation, concurrent DB reads and writes within a transaction are still an issue.
In the example above, OP is synchronously executing promises via reducer, so it's effectively a loop. Any thoughts on why that wouldn't work? I've written this same kind of functionality using a for loop and, as far as I can tell, it works without issue. But I don't have constraints like uniqueness in place so it's possible my implementation isn't perfect.
Indeed actions can run at the same time — but if the actions are invoking mutations, any given mutation will not interleave with reads and writes of other mutations.
I was assuming the promise reducer above is running inside a mutation
Is this the same code? The reduce pattern should be ok here.
I get that this works, but what I am saying instead is that in a realistic scenario, we will have mutations that spawn actions that spawn more mutations etc. When we promise.all multiple actions and each action spawns the mutations, it seems that these mutations are clashing with each other and breaking my unique clause for the counts.
I think therein lies the problem: they might not interleave, but it will not know or care that the other mutations might have created another count object that is meant to be unique.
If the uniqueness check you shared above is in your mutation, that check will be run within an acid transaction, so they will 100% know and care. I haven't witnessed transactionality failing yet personally, I suspect there's a logical issue here.
I see you have it throwing if
countChange
is at or below 0, but that happens after the patch. Is it possible that existingCount
is 0, allowing the record to exist but fail the existence test? Checking for typeof existingCount === 'number'
instead of just existingCount
in the existence check would ensure this isn't the case.Actions spawning mutations is fine, every mutation will be applied transactionally, promise.all-ing multiple actions is fine. The issue arises when a mutation promise.alls multiple async functions which read and write.
If you can share this or similar code we can track this down, this kind of denormalization to enforce constraints should be safe as long as these operation run serially within a given mutation.
Thoughts on adding a flag in the docs about running mutations serially / avoiding Promise.all for this case? In my time using Convex, I've generally never thought much about concurrency because of the OCC and Atomicity page saying "there's no need to worry about conflicts, locking, or atomicity"
This is interesting, yes we've thought about this but we're taking a subtransactions approach.