Using custom query with typescript filtering and pagination. Performance and limitation issues.
Hello together,
I have a question regarding quering with streams and
.filterWith
. We have this table structure
We want to perform a search by
title and
content` query, which does a include filtering based on JavaScript. We use this mutation:
I set the index "title" but there are problems which I need to figure out and would like to ask you.
First, Convex blocks if the size of report table is bigger then 16MB
Second, the performance.
I was thinking that there most be a way that filterings needs to be performed in batches so that the search and looping is stopped until the pagination limiting is finished.
What would be the correct convex way to solve this problem?
Thank you7 Replies
Thanks for posting in <#1088161997662724167>.
Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets.
- Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.)
- Use search.convex.dev to search Docs, Stack, and Discord all at once.
- Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI.
- Avoid tagging staff unless specifically instructed.
Thank you!
To make this more efficient, I would suggest having a field
title_content_lowercase
where you store ${title.toLocaleLowerCase()} ${content.toLocaleLowerCase()}
and then define a text search over that field and use the lowercase of the query as the search term.
In fact you likely don't need to do case changes, as I believe it's already case-insensitive
Doing the filtering the way you're doing it is scanning every row of the database unfortunately.Hi @ian thanks. Ok cool. So basically my approch is not wrong, right? With your proposal it could be done better. I saw, that when I use a search index, then I only can use one "withSearchIndex" statement per query, so when I want to have a search in "title" and "content" (because of multisearch and I will have up to 30 filters on my website) then I need to merge these query streams together, right? Would that be the convax-way?
Your approach is not technically wrong, provided you are ok paginating over the whole table from the client.
FYI
.index("title", ["title"])
and .index("content", ["content"])
are not affecting your query at all right now - they aren't automatically used in any way by filterWith
For searches you might find other approaches useful:
- Move all the search terms into one field so there's one search index
- Do separate queries for each and combine it on the client
- Kick off multiple one-off queries (convex.query
) for different results from the client
- Have the search be an action that calls multiple search queries (also enables vector search) and combines the results in a clever way (e.g. https://github.com/get-convex/rag/blob/main/src/client/hybridRank.ts#L1-L6)GitHub
rag/src/client/hybridRank.ts at main · get-convex/rag
Document search component to aid RAG. Contribute to get-convex/rag development by creating an account on GitHub.
Nice, I will have a look at searchIndex and the multiple search. But I would like to dig a bit deeper regarding the streaming concept, if that's ok. Considering my app is about some "contact" / "adress book" management.
Having
report
table a bit extended now, where also zip
and city
as fields are included.
My query would look in SQL something like
My focus is here that I do a AND
condition. Because the user of my app want to specify a range to reports by these conditions. I want to intentionally not use a searchIndex
because for zip code and city I want to match all reports where the match is not fuzzy but exactly. (title
and content
can be fuzzy).
My question is now, if I set up a stream query for every search term. I can merge them together
which represents an OR
conjunction. I unfortunately did not found the representative Stream operation in the helper tools regarding a intersection
, like the result are these reports which are in all of the streams.
Did I was silly to found it or can you give me some hint how to solve that problem :)?
Thank you
(The code is more or less pseudo code as you will see, but I hope I descriped my problem well)Gotcha - the equivalent to that SQL query is a table scan (likely what you'd see with
describe
on that query but lmk if not).
You have two options:
- Make a field with `${title} ${content} ${content_summarized} ${tags.join(" ")} ` and do a text search with a filter field on zip & city
- Do what you did initially with a paginated query over the whole table doing a custom js function to filter. If you do that, look into the options for paginator / stream to limit how many bytes you read per pagination round. Worst case you'd try to query the whole table in the first page if there's no matches.ok nice, Thank you, Ian