milk enjoyer
milk enjoyer•11mo ago

Most efficient way to count (more than 16k entries)?

I want to count the items in a table that match a query, where the table has more than 16,384 entries. However, the maximum number of entries that can be read within a single query is 16,384. What is the solution for getting the count for such a query?
21 Replies
milk enjoyer
milk enjoyerOP•11mo ago
Additional note: I need to do this server-side, and the result of the count matters (I need to compare this count as a condition for an action).
Michal Srb
Michal Srb•11mo ago
I would "denormalize" (in other words "duplicate") this state: Store the count in some other table and increment it / decrement it as needed. In the future Convex might make it easier to maintain derived/computed aggregates like this, but no concrete plans yet.
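The counter approach Michal describes can be sketched in plain TypeScript, using an in-memory array and Map in place of the Convex tables (all names here are illustrative stand-ins, not Convex APIs):

```typescript
// In-memory stand-ins for the items table and the denormalized counts table.
const items: Array<{ creatorId: string; data: string }> = [];
const countsByCreator = new Map<string, number>();

// Every write to the items table also updates the stored count, so reading
// the count never requires scanning the (potentially huge) items table.
export function insertItem(creatorId: string, data: string): void {
  items.push({ creatorId, data });
  countsByCreator.set(creatorId, (countsByCreator.get(creatorId) ?? 0) + 1);
}

// Reading the count is a single lookup, independent of table size.
export function countForCreator(creatorId: string): number {
  return countsByCreator.get(creatorId) ?? 0;
}
```

In Convex, `insertItem` would be a mutation that performs both writes, so the item insert and the count update commit (or fail) together.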
Michal Srb
Michal Srb•11mo ago
GitHub
convex-nextjs-app-router-demo/convex/posts.ts at main · get-convex/...
Demo showing a Next.js App Router app powered by Convex backend - get-convex/convex-nextjs-app-router-demo
milk enjoyer
milk enjoyerOP•11mo ago
Ah, I see. I was hoping for some shortcut, but it seems like this is the only way then. So I will have to edit every function that touches this table to update the count whenever it makes changes to this table. All database operations within a mutation are guaranteed to be atomic, right? So for example, if I Promise.all 3 inserts somewhere in my mutation, then await an increment below, if any of the 3 inserts fail the increment will not happen, and vice versa? @Michal Srb I implemented this but I ran into some problems. For example, when I Promise.all multiple inserts and use:
const existingCount = await db
  .query(COUNTS_TABLE_NAME)
  .withIndex("by_of_creator_id", (q) => q.eq("ofCreatorId", ofCreatorId))
  .unique();

if (existingCount) {
  await db.patch(existingCount._id, {
    count: existingCount.count + countChange,
  });

  return existingCount._id;
}

if (!(countChange > 0)) {
  throw new Error("Count does not exist");
}

const newCountId = await db.insert(COUNTS_TABLE_NAME, {
  ofCreatorId,
  count: countChange,
});

return newCountId;
It does not accurately update the count, due to the nature of Promise.all.
erquhart
erquhart•11mo ago
Yeah, you'll need to do it sequentially, e.g. via a for loop. It's quite fast, though. Personally, I've found it really helpful to have dedicated functions for each table that handle insert, patch, and delete. It's nice to have one place to update for this kind of functionality.
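The lost-update behavior being discussed here can be reproduced in plain TypeScript, with a module-level variable standing in for the database record (a minimal sketch, not Convex code):

```typescript
// In-memory stand-in for a database record's counter field.
let counter = 0;

// Read-modify-write with an await in between, like a db.get followed
// by a db.patch.
async function increment(): Promise<void> {
  const current = counter;                      // read
  await new Promise((r) => setTimeout(r, 0));   // yield, like an async db call
  counter = current + 1;                        // write back a possibly stale value
}

export async function demo(): Promise<{ concurrent: number; serial: number }> {
  counter = 0;
  await Promise.all([increment(), increment(), increment()]);
  const concurrent = counter; // 1: all three read 0 before any of them wrote

  counter = 0;
  for (let i = 0; i < 3; i++) {
    await increment(); // each read sees the previous write
  }
  const serial = counter; // 3

  return { concurrent, serial };
}
```

Running the three increments under `Promise.all` loses two updates, while the sequential for loop applies all three.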
milk enjoyer
milk enjoyerOP•11mo ago
That's what I do as well; I also try to abstract out the common helpers that can be used across all tables. I write these dedicated functions that deal with all the underlying logic, but I almost always use Promise.all instead of running things sequentially, so there is a bit of logic to update all around. I abstract as much as I can, but the logic for updating counts is slightly different for each table, and the IDs I need to identify items in each table differ too, so those parts I define per table. I found a good intermediate solution, which is replacing all the offending Promise.alls with:
export async function sequentialPromiseAll<T>(promises: Promise<T>[]): Promise<T[]> {
  return promises.reduce<Promise<T[]>>(
    async (previousPromise, currentPromise) => {
      const results = await previousPromise;
      const currentResult = await currentPromise;
      return [...results, currentResult];
    },
    Promise.resolve([])
  );
}
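One caveat with a helper like this: by the time a `Promise<T>[]` has been built, every promise in it has already started running, so awaiting them one by one only serializes when the results are observed, not the underlying work. A variant that accepts thunks (functions that return promises) defers each operation until the previous one finishes (a sketch; `runSequentially` is an illustrative name, not a Convex API):

```typescript
// Takes functions that create promises, so each operation is not even
// started until the previous one has fully completed.
export async function runSequentially<T>(
  tasks: Array<() => Promise<T>>
): Promise<T[]> {
  const results: T[] = [];
  for (const task of tasks) {
    results.push(await task()); // next task is created only after this resolves
  }
  return results;
}
```

The call site changes from mapping items to promises to mapping them to thunks, e.g. `runSequentially(items.map((item) => () => insertItem(item)))`.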
erquhart
erquhart•11mo ago
Nice 👍
milk enjoyer
milk enjoyerOP•11mo ago
I still have lots of problems. For context, I am dealing with large data streams, and in order to stay within the 16k reads/writes per query, I have to batch them out via multiple actions. I found that these scheduled batch actions still end up running at the same time and breaking my artificial unique constraint yet again.
erquhart
erquhart•11mo ago
If you're logically ensuring against concurrency within each transaction, Convex provides ACID guarantees between the transactions themselves. Is the reducer you shared above running in an action or a mutation? Okay wait - I just saw your mention that "all database operations within a mutation are guaranteed to be atomic", and it was thumbed up. @Michal Srb I have not found this to be the case - within a mutation, if I concurrently query and update the same record, say incrementing a number by one, the result is always a single increment - OP here is seeing the same. I take this to mean that ACID applies to an entire query or mutation, but not to individual database reads and writes. What are we missing here?
Michal Srb
Michal Srb•11mo ago
Yes, you should not Promise.all "read+write" operations, as we do not provide subtransactions inside a mutation. Either run them serially, or run each as an individual mutation.
erquhart
erquhart•11mo ago
So to clarify, when @milk enjoyer said:
all database operations within a mutations are guaranteed to be atomic right?
This is not the case, correct?
ballingt
ballingt•11mo ago
There's an important thing you're right about here, but to step back, there are a few different relevant properties at play. "Atomic" generally means "either applies fully or does not apply at all." Oops, got distracted before writing more, I'll write something better up here. You're right that within a mutation it's not safe to do multiple concurrent multi-step reads/writes. The example above of Promise.all-ing denormalized inserts is a good one; here's another example:
async function increment(ctx, args) {
  const existing = await ctx.db.get(args.id);
  await ctx.db.patch(args.id, { counter: existing.counter + 1 });
}

const incrementTwice = mutation(async (ctx, args) => {
  const promise1 = increment(ctx, args);
  const promise2 = increment(ctx, args);
  await Promise.all([promise1, promise2]);
});
This isn't safe for async things in JavaScript in the browser, in Node.js, or in Convex. Since the unit of transaction / the unit of retry is a mutation, concurrent DB reads and writes within a transaction are still an issue.
erquhart
erquhart•11mo ago
In the example above, OP is executing promises sequentially via a reducer, so it's effectively a loop. Any thoughts on why that wouldn't work? I've written this same kind of functionality using a for loop and, as far as I can tell, it works without issue. But I don't have constraints like uniqueness in place, so it's possible my implementation isn't perfect.
ballingt
ballingt•11mo ago
Indeed actions can run at the same time — but if the actions are invoking mutations, any given mutation will not interleave with reads and writes of other mutations.
erquhart
erquhart•11mo ago
I was assuming the promise reducer above is running inside a mutation
ballingt
ballingt•11mo ago
Is this the same code? The reduce pattern should be ok here.
milk enjoyer
milk enjoyerOP•11mo ago
I get that this works, but what I am saying is that in a realistic scenario, we will have mutations that spawn actions that spawn more mutations, etc. When we Promise.all multiple actions and each action spawns its mutations, it seems that these mutations clash with each other and break my unique clause for the counts. I think therein lies the problem: they might not interleave, but one mutation will not know or care that another mutation might have created a count object that is meant to be unique.
erquhart
erquhart•11mo ago
If the uniqueness check you shared above is in your mutation, that check runs within an ACID transaction, so they will 100% know and care. I haven't personally witnessed transactionality failing yet, so I suspect there's a logical issue here. I see you have it throwing if countChange is at or below 0, but that happens after the patch. Is it possible that existingCount is 0, allowing the record to exist but fail the existence test? Checking for typeof existingCount === 'number' instead of just existingCount in the existence check would ensure this isn't the case.
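The falsy-zero pitfall mentioned here is easy to demonstrate in isolation (a minimal sketch with illustrative helper names; in the code above, `existingCount` is a document rather than a number, but the same reasoning applies to any count field checked for truthiness):

```typescript
// A count of 0 is falsy, so a bare truthiness check treats an existing
// zero count as "missing".
export function existsTruthy(count: number | undefined): boolean {
  return Boolean(count);
}

// An explicit type check distinguishes "stored as 0" from "not stored".
export function existsTyped(count: number | undefined): boolean {
  return typeof count === "number";
}
```

`existsTruthy(0)` reports the count as absent, while `existsTyped(0)` correctly reports it as present.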
ballingt
ballingt•11mo ago
Actions spawning mutations is fine, every mutation will be applied transactionally, and Promise.all-ing multiple actions is fine. The issue arises when a mutation Promise.alls multiple async functions that read and write. If you can share this or similar code, we can track this down; this kind of denormalization to enforce constraints should be safe as long as these operations run serially within a given mutation.
eric
eric•4mo ago
Thoughts on adding a flag in the docs about running mutations serially / avoiding Promise.all for this case? In my time using Convex, I've generally never thought much about concurrency because the OCC and Atomicity page says "there's no need to worry about conflicts, locking, or atomicity".
ballingt
ballingt•4mo ago
This is interesting. Yes, we've thought about this, but we're taking a subtransactions approach.
