Fetching a random subset of documents without having to collect() over the entire set?
Hi team, I'm implementing an app where a core user flow is to be able to answer a subset of questions from a collection of documents I have stored in Convex. Right now, I have it implemented in a way where I fetch ALL of the documents (this can bloat to thousands eventually), and then apply filters related to which questions I'd like to exclude, and then return only a random subset of these documents.
This makes my db bandwidth usage skyrocket over time and I'd really like to optimize this. Is there any simpler way to get a random subset of docs from Convex? If not, is there a more efficient way of implementing this in general? I've attached a code snippet below.
hi! you can use the aggregates component to fetch random documents in O(log n) time and without reading the whole table. Here's the component: https://www.convex.dev/components/aggregate
here's an example of it in use, to shuffle songs: https://github.com/get-convex/aggregate/blob/main/example/convex/shuffle.ts
@jamwt oooh cool, will try that!
@jamwt this looks great, but I'd also like to insert a step before (or after) fetching the random indices to ensure that I'm not fetching any ids in the list of exclusionIds I provide (essentially, I don't want to show users questions they've already answered). Is there a way to support that out of the box?
thanks for the speedy replies btw! convex is great
if the list of questions is stable, you can just keep using the same seed and move the offset along
and you'll know you won't show the same question again
if the list of questions is dynamic, things get trickier...
hmm, the list of questions can change over time (we add more on a weekly basis).
they aren't expected to change too often though
"and you'll know you won't show the same question again": how would i know this?
if you're paging through the questions in a stable order, and each question only exists once in the shuffled list, then (until you hit the end of the list and wrap around?) you wouldn't show the same question again
this is actually the behavior the "shuffle songs" example does
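To illustrate why a fixed seed plus an advancing offset never repeats a question, here is a minimal in-memory sketch. It is not the aggregate component's API: `mulberry32`, `shuffledOrder`, and `pageOfShuffle` are hypothetical helpers, and unlike the component this version builds the whole order in memory. It only shows the idea: the same seed always yields the same permutation, so disjoint offset ranges yield disjoint questions.

```typescript
// Small seeded PRNG (mulberry32), so the same seed gives the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Deterministic Fisher-Yates shuffle of the indices 0..n-1.
function shuffledOrder(n: number, seed: number): number[] {
  const rng = mulberry32(seed);
  const order = Array.from({ length: n }, (_, i) => i);
  for (let i = n - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [order[i], order[j]] = [order[j], order[i]];
  }
  return order;
}

// One page of the shuffled order. Each index appears exactly once in the
// permutation, so pages at non-overlapping offsets never share an index.
function pageOfShuffle(
  n: number,
  seed: number,
  offset: number,
  limit: number,
): number[] {
  return shuffledOrder(n, seed).slice(offset, offset + limit);
}
```

Note that this only holds while `n` stays the same: once questions are added or removed, the permutation for a given seed changes, which is why the dynamic case is trickier.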
ah wait i think i may have miscommunicated. the idea here is that users will have answered questions on previous days (this is a daily question challenge app), and each day we give them a random list of 30 questions they can mouse through (but answer only 3 of them).
I just wanna ensure that if a user has answered a question already (I have these ids), I don't show those as an option to the user again. I'm using a list of exclusion ids for this.
gotcha. yeah, I think you'll need to keep maintaining your exclusion list then
cool, would this random aggregate approach help me limit db bandwidth usage though?
just wanna make sure I'm not sending back and forth the entire list each time :/
yes, it would still help. you can fetch a random 30 + N records, where you have N excluded, and then just take the first 30 that aren't excluded
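A rough sketch of that "fetch 30 + N, keep the first 30 not excluded" step, written as plain TypeScript so it does not depend on any particular Convex call. `randomDistinctIndices` and `takeUnseen` are hypothetical names, and the `_id: string` shape is an assumption; in the real app the indices would be resolved to documents through the aggregate component rather than an in-memory array.

```typescript
// Pick up to `k` distinct random integers in [0, count).
function randomDistinctIndices(
  count: number,
  k: number,
  rng: () => number = Math.random,
): number[] {
  const target = Math.min(k, count);
  const picked = new Set<number>();
  while (picked.size < target) {
    picked.add(Math.floor(rng() * count));
  }
  return Array.from(picked);
}

// Keep the first `limit` candidates whose id is not in the exclusion set.
function takeUnseen<T extends { _id: string }>(
  candidates: T[],
  excluded: ReadonlySet<string>,
  limit: number,
): T[] {
  const unseen: T[] = [];
  for (const doc of candidates) {
    if (unseen.length >= limit) break;
    if (!excluded.has(doc._id)) unseen.push(doc);
  }
  return unseen;
}
```

With N excluded ids, asking for `30 + N` distinct indices means at most N of the fetched documents can be excluded, so at least 30 remain (as long as the table holds that many documents). Only those `30 + N` documents are read, not the whole table.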
to make the exclusion list not grow without bound, if you "EOL" some questions (after a month? year?) you could also run migrations to remove those ids from the exclusion lists since it's no longer needed. but you probably don't have to worry about that for a long time, if this is a daily question 🙂
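The pruning part of such a migration could look roughly like this. `pruneExclusions` is a hypothetical helper, and how retired question ids are tracked is an assumption; the migration itself would apply it to each user's stored exclusion list.

```typescript
// Drop ids of retired ("EOL") questions from one user's exclusion list.
function pruneExclusions(
  exclusionIds: string[],
  retiredIds: ReadonlySet<string>,
): string[] {
  return exclusionIds.filter((id) => !retiredIds.has(id));
}
```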
solid idea! lemme try that sir
btw, just applied for the YC deal and have been loving my experience so far
which company is this?
thank you for building this
no problem, glad to have you on. keep the feedback coming!
we were building www.shopencore.ai during the batch
ah yep... welcome!
but recently pivoted to https://trycandle.app
candle | the app designed to be kept.
Candle is an app for modern couples looking to stay connected and grow closer.
haha
thank you! feel free to download the app too, we just launched
would love feedback!
nice. as someone in a 29 year relationship, very cool to see building happening in this space.
i am in a 2 year one :p but would love feedback from you (seriously)
haven't talked to too many ppl in very long term relationships
for sure, I usually try convex apps if I'm allowed in!
would be a valuable perspective
so I'll do it
you're for sure allowed in!!!
hey Jamie, noticing that randomize.count(ctx) is returning 0 for some reason.
I did the following:
at the top of my functions file.
and added to convex.config.ts