pagination
trying to find a way to randomly access a different page
getPage doesn't seem to work
how can I go straight to the 100th page
how can I filter some out so pages only include the desired results? there are too many bytes for me to collect the whole table
Thanks for posting in <#1088161997662724167>.
Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets.
- Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.)
- Use search.convex.dev to search Docs, Stack, and Discord all at once.
- Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI.
- Avoid tagging staff unless specifically instructed.
Thank you!
Could you describe your use case in more detail? How many documents are in your table?
(I'll be offline the rest of the night, but I'll check back tomorrow)
getPage is useful if you know which index keys you want to look at, so if you have an incrementingIndex: v.number() on your documents (or something similar) it can be used for that.
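As a sketch of that idea (assuming a hypothetical by_incrementing_index index on that field, and that the counter is assigned 0, 1, 2, ... without gaps), jumping to an arbitrary page is just a seek:

// Inside a query handler. Sketch only: assumes .index("by_incrementing_index", ["incrementingIndex"])
// and gap-free values, so page N starts at key N * pageSize.
const pageSize = 50;
const pageNumber = 100;
const page = await ctx.db
  .query("timeline")
  .withIndex("by_incrementing_index", (q) => q.gte("incrementingIndex", pageNumber * pageSize))
  .take(pageSize);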
If you want to jump to a certain place in your table with just an offset (number of documents to skip), you'll need to add some structure. For example, https://www.convex.dev/components/aggregate#offset-based-pagination
I have this schema
timeline: defineTable({
  podcast_id: v.id("podcast"),
  episode_id: v.id("episode"),
  episode_number: v.number(),
  start: v.string(),
  end: v.string(),
  geoname: v.string(),
  chart: v.string(),
  rank: v.number(),
}).index("start", ["start", "chart", "rank", "episode_number"]),
with 30-40,000 records
I need to be able to filter on a range for start and rank, and also filter on 1-3 values for chart
the tools are starting to make me doubt the scalability of convex for large data sets
for instance
await ctx.db
  .query("timeline")
  .withIndex("start", (q) => q.gte([cursor.start, cursor.chart, cursor.rank, cursor.episode_number]))
doesn't work. I can't actually filter for everything after a particular record on an index, which makes paginating difficult
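A rough sketch of the closest thing the current index API does support: "everything after this cursor" over a compound index has to be split into one range per index prefix, read in index order (this is essentially the bookkeeping the helpers discussed below automate):

// Sketch only, inside a query handler, using the "start" index from the schema above.
// A "greater than (start, chart, rank, episode_number)" condition becomes four ranges,
// from most to least specific prefix; concatenating them preserves index order.
// This over-reads (up to 4 * n documents) but uses only standard index ranges.
const n = 100;
const batches = [
  await ctx.db.query("timeline").withIndex("start", (q) =>
    q.eq("start", cursor.start).eq("chart", cursor.chart)
      .eq("rank", cursor.rank).gt("episode_number", cursor.episode_number)).take(n),
  await ctx.db.query("timeline").withIndex("start", (q) =>
    q.eq("start", cursor.start).eq("chart", cursor.chart).gt("rank", cursor.rank)).take(n),
  await ctx.db.query("timeline").withIndex("start", (q) =>
    q.eq("start", cursor.start).gt("chart", cursor.chart)).take(n),
  await ctx.db.query("timeline").withIndex("start", (q) =>
    q.gt("start", cursor.start)).take(n),
];
const nextPage = batches.flat().slice(0, n);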
this is a great point. it should be possible, but we don't expose a nice interface for it. the closest we have is streamQuery here https://github.com/get-convex/convex-helpers/blob/b69982431f592cb3405326e9122054ad66d10adb/packages/convex-helpers/server/pagination.ts#L127 , but you would have to set up the pagination cursor yourself.
Everything I've tried with streamQuery seems to have the same problem: anything that just counts the records pulls everything into memory and gets an error
Oh right, there's no way to get >16k records in a single query or mutation. You can call it repeatedly from the client with loadMore(), or you can call it repeatedly from an action.
I don't want all the records, I need a way to skip ahead, and to get the indexes of different pages
oh then you probably want https://www.convex.dev/components/aggregate
This is a little light on details. Seems like it should be supported directly in the schema
None of these answers is very satisfying
aggregate is promising but not well documented
the fact I can't do withIndex(?).gt([x,y,z]), so greater than the cursor, is frustrating
also skip seems necessary as well as take
you want the section labeled "offset-based pagination"
this actually can't be built in b/c it requires a special index
normal databases can't do it either
I think I get why it can't be default, but it would be nice to be able to configure it from the schema
@Jamie
I haven't implemented this yet, but it sounds like it should work correctly
const { key } = await photos.at(ctx, offset, { namespace: album });
but then this part doesn't work how I would like; my table has a multi-column index, and I would like to just get everything between two offsets from the table
return await ctx.db.query("photos")
  .withIndex("by_album_creation_time", (q) => q.eq("album", album).gte("_creationTime", key))
  .take(numItems);
something like
q.gt(key)
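A hedged sketch of the between-two-offsets version of the photos example above, assuming the aggregate is keyed by _creationTime and namespaced by album, and that creation times are distinct:

// Sketch only, inside a query handler. startOffset and endOffset are hypothetical arguments.
const { key: startKey } = await photos.at(ctx, startOffset, { namespace: album });
const { key: endKey } = await photos.at(ctx, endOffset, { namespace: album });
return await ctx.db
  .query("photos")
  .withIndex("by_album_creation_time", (q) =>
    q.eq("album", album).gte("_creationTime", startKey).lt("_creationTime", endKey))
  .take(endOffset - startOffset);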
@Jamie
What is the index actually implemented as?
I am confused why it would be difficult to do gt(key)
also confused why it would be difficult to do skip(?) startingAtOffset(?)
@Jamie
on most platforms I'd probably just dive into the aggregate, but it seems like a lot of mess and maintenance headache that convex has so far helped me avoid
it sounds like you want to call streamQuery to do the gt(key) query. can you share what you have tried for streamQuery? i'm considering making a more intuitive helper, but would like to know what you would find intuitive
the idea would be you first do const { key } = await photos.at(ctx, offset, { namespace: album });, then call streamQuery on that key to get as many photos as you want, starting at that offset
@burnstony#1975 for some background on why this is hard, there are just two main things to know: (1) in databases, indexes are represented as B-trees; and (2) databases attempt to be efficient, so by default indexes only store the minimum possible information, that is, how to find the position of the indexed value (not its rank within the index) within the tree in O(log n)
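A rough sketch of the photos.at + streamQuery flow described above. The streamQuery option names (startIndexKey in particular) are assumptions based on the linked pagination helper rather than verified, and photos / ./aggregates are hypothetical names:

import { v } from "convex/values";
import { streamQuery } from "convex-helpers/server/pagination";
import { query } from "./_generated/server";
import schema from "./schema";
import { photos } from "./aggregates"; // hypothetical module exporting the TableAggregate

export const photosFromOffset = query({
  args: { album: v.string(), offset: v.number(), numItems: v.number() },
  handler: async (ctx, { album, offset, numItems }) => {
    // 1. Ask the aggregate which sort key sits at the requested offset.
    const { key } = await photos.at(ctx, offset, { namespace: album });
    // 2. Stream documents starting at that key and stop after numItems.
    const page = [];
    for await (const [doc] of streamQuery(ctx, {
      table: "photos",
      index: "by_album_creation_time", // ["album", "_creationTime"]
      schema,
      order: "asc",
      startIndexKey: [album, key], // assumed option: begin streaming at this index key
    })) {
      page.push(doc);
      if (page.length >= numItems) break;
    }
    return page;
  },
});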
claude is faster at typing than me, so here is a bit more expansion on the btree background above: https://claude.ai/share/a53e3de5-a627-4e63-843b-e82bfa646329
basically, you need a tree structure that keeps track of how many children each subnode has, and then you can find the nth quickly. that's what the aggregate component is helping with
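A toy sketch of that structure (an order-statistic tree), just to illustrate the principle; it is not how the aggregate component is actually implemented:

// Each node tracks how many keys live in its subtree, so the key at a given
// offset can be found in O(log n) hops instead of scanning.
type Node = {
  key: number;
  count: number; // keys in this subtree, including this node
  left?: Node;
  right?: Node;
};

const size = (n?: Node) => n?.count ?? 0;

// Return the key at position `offset` (0-based) in sorted order.
function at(node: Node | undefined, offset: number): number | undefined {
  if (!node || offset < 0 || offset >= node.count) return undefined;
  const leftSize = size(node.left);
  if (offset < leftSize) return at(node.left, offset); // target is in the left subtree
  if (offset === leftSize) return node.key; // target is this node
  return at(node.right, offset - leftSize - 1); // skip the left subtree and this node
}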
re OFFSET with traditional databases: it's kind of sneaky, but they have the same limitation, so OFFSET-based pagination is really really expensive ( https://claude.ai/chat/858340c6-38b9-470f-932b-ac45f99fdd7b )
@Jamie will @lee's idea work?
@Jamie the second Claude link doesn't work
Properly setting up the aggregate still seems a little tricky and being able to tell that it is up to date seems unclear
this does make me understand the issue
fixed second link: https://claude.ai/share/18c34ace-af40-41c8-a5e9-260bfcaa9f73
I'm sure it will, Lee understands this stuff really well. he built the aggregate component
@lee let me find it
@lee I ran this
// streamQuery comes from the convex-helpers pagination module linked above;
// schema is the default export of convex/schema.ts.
import { query } from "./_generated/server";
import { streamQuery } from "convex-helpers/server/pagination";
import schema from "./schema";

export const processLargeDataset = query({
  handler: async (ctx) => {
    console.log("processLargeDataset");
    let count = 0;
    const stream = streamQuery(ctx, {
      table: "timeline",
      index: "start_index",
      order: "asc",
      schema, // the app's schema definition
    });
    // Iterate the stream lazily and count entries without keeping them around.
    for await (const _entry of stream) {
      count++;
    }
    return count;
  },
});
cool. but i thought you were trying to read documents between two offsets
i would expect it to look like
this runs and seems correct
but I dont have the aggregate populated yet
Thank you lee, this looks like it should work, basically allowing a take starting at a given index
the equivalent of .gt(key).take(n)
the AI basically thinks you can do that directly using withIndex; it also thinks skip is supported
great that it's working!
ai hallucinations are a problem for sure
btw there's also getPage (described here https://stack.convex.dev/pagination ) which is like streamQuery but returns an array instead of an async iterable
both seem to work the same, and be reactive, which is pretty cool
wondering how ctx.db.query, getPage, and streamQuery are related
nice!
getPage just calls streamQuery under the hood. streamQuery calls ctx.db.query multiple times to piece together the interval that you want to query.
@lee I've accidentally inserted everything in the aggregate twice. how do I clear everything from the aggregate or otherwise fix it?
Docs page has "Repair incorrect aggregates" section
@lee thank you (there are no anchor tag links to the sections on this page - which would be nice when sharing them)
https://www.convex.dev/components/aggregate
Repair incorrect aggregates
If some mutation or direct write in the Dashboard updated the source of truth data without writing to the aggregate, they can get out of sync and the returned aggregates may be incorrect.
The simplest way to fix is to start over. Either call await aggregate.clear(ctx) or rename the component like app.use(aggregate, { name: "newName" }) which will reset it to be empty. Then follow the instructions from above.
There is an alternative which doesn't clear the aggregates: compare the source of truth to the aggregate table. You can use db.query("mytable").paginate() on your Convex table and aggregate.paginate() on the aggregate. Update the aggregates based on the diff of these two paginated data streams.
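For the "start over" path, a minimal sketch, assuming a hypothetical TableAggregate instance called timelineAggregate and that clear and insert behave as the component docs describe:

import { internalMutation } from "./_generated/server";
import { timelineAggregate } from "./aggregates"; // hypothetical TableAggregate instance

export const rebuildAggregate = internalMutation({
  handler: async (ctx) => {
    // Wipe the aggregate, then re-insert each source document exactly once.
    await timelineAggregate.clear(ctx);
    // With 30-40k documents this loop would hit the read limits discussed above;
    // in practice, paginate from an action and call a mutation per page instead.
    for await (const doc of ctx.db.query("timeline")) {
      await timelineAggregate.insert(ctx, doc);
    }
  },
});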
They do work, but i agree they're not discoverable https://www.convex.dev/components/aggregate#repair-incorrect-aggregates
thanks
@lee I'm confused where the aggregate is, where is the data stored? If I rename it, how do I know the old one is gone?
@lee if I move it from one file to a different file, is it a different aggregate?
In your project dashboard in the data view, you'll see a dropdown above the tables list. "App" is selected by default, but you can select a component from that list. If you select your Aggregate component you'll see its tables.
I'm not certain of this, but I believe moving aggregate-related code around doesn't change the underlying data, as the component is configured centrally in convex.config.ts.
If you rename it I suspect you'll start a new aggregate. Again, haven't tried this, but I believe if you do this, the old one will show in the dropdown as a greyed out "unmounted" component. You can't do anything with data from an unmounted component, but you can delete the unmounted component from project settings in the Components section.
All that ^ is correct.
Also note that if you call new TableAggregate(components.myaggregate, ...) multiple times from different places, it's referring to the same aggregate. You can create multiple aggregates in convex.config.ts with different names, and then they would be instantiated separately with different new TableAggregate(components.<name>) calls.
Thank you
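A sketch of that setup, with hypothetical component names; the TableAggregate constructor options are elided since they depend on the table being aggregated:

// convex/convex.config.ts: two separately named aggregate components.
import { defineApp } from "convex/server";
import aggregate from "@convex-dev/aggregate/convex.config";

const app = defineApp();
app.use(aggregate, { name: "timelineByStart" }); // hypothetical name
app.use(aggregate, { name: "photosByAlbum" });   // hypothetical name
export default app;

// Elsewhere, each name is instantiated separately; constructing the same name
// twice (even in different files) still refers to the same underlying aggregate.
// const timelineByStart = new TableAggregate(components.timelineByStart, { /* options */ });
// const photosByAlbum = new TableAggregate(components.photosByAlbum, { /* options */ });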