v 💻
v 💻•3mo ago

is there no good way to do paginated queries with advanced filtering like multiple field searches, s

is there no good way to do paginated queries with advanced filtering like multiple field searches, string array filtering. and also aggregating count based on filters. easily with convex? chose convex for db in my company but without these its becoming hard to use
28 Replies
Convex Bot
Convex Bot•3mo ago
Thanks for posting in <#1088161997662724167>. Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets. - Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.) - Use search.convex.dev to search Docs, Stack, and Discord all at once. - Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI. - Avoid tagging staff unless specifically instructed. Thank you!
Jamal
Jamal•3mo ago
Paginated queries have all the same options as normal query filtering https://docs.convex.dev/database/reading-data/filters. Convex recommends using indexes over filters in the database query object
Jamal
Jamal•3mo ago
Do you have a code example of what your trying to do?
Jamal
Jamal•3mo ago
Convex
Aggregate
Keep track of sums and counts in a denormalized and scalable way.
v 💻
v 💻OP•3mo ago
yes but in every place i edit the db row i have to use that
leads: defineTable({
name: v.string(),
email: v.string(),
phone: v.string(),
tags: v.array(v.string()),
userId: v.id("users"),
})
.index("by_user", ["userId"])
.index("by_user_and_email", ["userId", "email"])
.index("by_user_and_phone", ["userId", "phone"])
.searchIndex("search_email", {
searchField: "email",
filterFields: ["userId"],
})
.searchIndex("search_phone", {
searchField: "phone",
filterFields: ["userId"],
})
.searchIndex("search_name", {
searchField: "name",
filterFields: ["userId"],
}),
leads: defineTable({
name: v.string(),
email: v.string(),
phone: v.string(),
tags: v.array(v.string()),
userId: v.id("users"),
})
.index("by_user", ["userId"])
.index("by_user_and_email", ["userId", "email"])
.index("by_user_and_phone", ["userId", "phone"])
.searchIndex("search_email", {
searchField: "email",
filterFields: ["userId"],
})
.searchIndex("search_phone", {
searchField: "phone",
filterFields: ["userId"],
})
.searchIndex("search_name", {
searchField: "name",
filterFields: ["userId"],
}),
see i got this i need to have paginated query. with search filter - which should search on email, phone and name. also with tags filter. and also userId filter.
Jamal
Jamal•3mo ago
When most convex devs have really complex querys that convex does not directly support we tend to use normal typescript for filtering. What i would do is query the data, use indexes anywhere you can, and collect. Then in the typescipt code filter out based on your search filters. This way you have the flexibility of typescript iterators and array methods. This convex helper may or maynot be useful as well https://github.com/get-convex/convex-helpers/blob/main/packages/convex-helpers/README.md#manual-pagination
GitHub
convex-helpers/packages/convex-helpers/README.md at main · get-con...
A collection of useful code to complement the official packages. - get-convex/convex-helpers
v 💻
v 💻OP•3mo ago
yeah but that makes it not super optimized for queries with millions of documents and super advanced filtering can't these things be added directly to db
Jamal
Jamal•3mo ago
Well having big amounts of data always has some overhead. You have index on your table so unless you expect a single user to have millions of leads themselfs then when you filter by userId + pagination it should be very performant. The idea of the index is to cut out a lot of the data before it runs in the js code javascript is very fast, even if you have to filter 5k documents it wont take over a second and with convex caching
v 💻
v 💻OP•3mo ago
yes but having the queries filtered in the db itself makes it even performant also whhat about doing things like count on filtered table
Jamal
Jamal•3mo ago
Filtering in Convex’s .filter() and filtering in TypeScript after .collect() have essentially the same performance characteristics—they both scan all documents in the table. The main difference is syntactic and where the filtering code runs (Rust vs. JavaScript), but both are equivalent to an unindexed SQL WHERE clause Performance of TypeScript filters. https://stack.convex.dev/complex-filters-in-convex#performance-of-typescript-filters-is-the-same-as-sql-unindexed-where-clauses
Using TypeScript to Write Complex Query Filters
There’s a new Convex helper to perform generic TypeScript filters, with the same performance as built-in Convex filters, and unlimited potential.
v 💻
v 💻OP•3mo ago
ah there's no way to directly filter in the underlying db?
Jamal
Jamal•3mo ago
Indexes are the only way I know to filter efficiently without doing full table scans @ballingt Could you explain more I might be missing something @v 💻 Why are index not enough for your querys? Can you show me a code example of what you want to do, even if its not preformant? Whats your current query logic?
v 💻
v 💻OP•3mo ago
export const getLeadsPaginated = userQuery({
args: {
paginationOpts: paginationOptsValidator,
search: v.optional(v.string()),
tags: v.optional(v.array(v.string())),
},
handler: async (ctx, args) => {
const { search = "", tags, paginationOpts } = args;

const query = paginator(ctx.db, schema)
.query("leads")
.withIndex("by_user", (q) => q.eq("userId", ctx.user._id))
.filterWith(async (doc) => {
if (search) {
return (
doc.email.toLowerCase().includes(search.toLowerCase()) ||
doc.phone.includes(search) ||
doc.name.toLowerCase().includes(search.toLowerCase())
);
}

if (tags) {
return tags.some((tag) => doc.tags.includes(tag));
}

return true;
});

return query.paginate(paginationOpts);
},
});

export const getLeadsMetadata = userQuery({
args: {},
handler: async (ctx) => {
const allLeads = await ctx.db
.query("leads")
.withIndex("by_user", (q) => q.eq("userId", ctx.user._id))
.collect();

const totalCount = allLeads.length;
const allTags = Array.from(
new Set(allLeads.flatMap((lead) => lead.tags || [])),
).sort();

const currentMonth = new Date().getMonth();
const currentYear = new Date().getFullYear();
const thisMonthCount = allLeads.filter((lead) => {
const leadDate = new Date(lead._creationTime);
return (
leadDate.getMonth() === currentMonth &&
leadDate.getFullYear() === currentYear
);
}).length;

return {
totalCount,
allTags,
thisMonthCount,
};
},
});
export const getLeadsPaginated = userQuery({
args: {
paginationOpts: paginationOptsValidator,
search: v.optional(v.string()),
tags: v.optional(v.array(v.string())),
},
handler: async (ctx, args) => {
const { search = "", tags, paginationOpts } = args;

const query = paginator(ctx.db, schema)
.query("leads")
.withIndex("by_user", (q) => q.eq("userId", ctx.user._id))
.filterWith(async (doc) => {
if (search) {
return (
doc.email.toLowerCase().includes(search.toLowerCase()) ||
doc.phone.includes(search) ||
doc.name.toLowerCase().includes(search.toLowerCase())
);
}

if (tags) {
return tags.some((tag) => doc.tags.includes(tag));
}

return true;
});

return query.paginate(paginationOpts);
},
});

export const getLeadsMetadata = userQuery({
args: {},
handler: async (ctx) => {
const allLeads = await ctx.db
.query("leads")
.withIndex("by_user", (q) => q.eq("userId", ctx.user._id))
.collect();

const totalCount = allLeads.length;
const allTags = Array.from(
new Set(allLeads.flatMap((lead) => lead.tags || [])),
).sort();

const currentMonth = new Date().getMonth();
const currentYear = new Date().getFullYear();
const thisMonthCount = allLeads.filter((lead) => {
const leadDate = new Date(lead._creationTime);
return (
leadDate.getMonth() === currentMonth &&
leadDate.getFullYear() === currentYear
);
}).length;

return {
totalCount,
allTags,
thisMonthCount,
};
},
});
cant use search index on 3 fields at once also string array for tags
Jamal
Jamal•3mo ago
Are you using convex-helpers in this code? I dont reconize the paginator function
v 💻
v 💻OP•3mo ago
yeah
Jamal
Jamal•3mo ago
I still think you can do your filtering in the javascript and avoid this complexity. .withIndex("by_user", (q) => q.eq("userId", ctx.user._id)) this line already eliminates a large amount of data on your table before the data is filtered in js..
ian
ian•3mo ago
Generally you need to define the indexes on the fields you want. Here's some patterns that might help: - defineTable(...).index("by_user_year_month", ["userId", "year", "month"]) and store the year & month explicitly - db.query("leads").withIndex("by_user", q => q.eq("userId", ctx.user._id).gte(monthStartUTC).lt(nextMonthStartUTC) For string array filtering, the best practice is to have a "join table" where you store the tags, then look up by each tag you're looking for.
v 💻
v 💻OP•3mo ago
how
ian
ian•3mo ago
For searching against lowercase strings, you can store a lowercase version of the field in your table, and have an index on that field to do .eq
Jamal
Jamal•3mo ago
Ah right, I forgot its recommended to avoid nested objects and deep arrays
v 💻
v 💻OP•3mo ago
i see
ian
ian•3mo ago
You can also do a text search over the tags that mostly works, but that's a bit of a hack
v 💻
v 💻OP•3mo ago
and for count with filtering? i have to get documents again just for count because i need count of unfiltered and unpaignated also filtered and unpaginated i just want to filter by tags not text search
ian
ian•3mo ago
yeah but text search over a field like "tagA tabB tagC" happens to work ok for searching "tagC" - but the best practice is a join table If you need high efficiency counting, you can: - denormalize the count into another table - use the Sharded Counter component to track counts for various types - use the Aggregate component to track all the leads, and depending on your sort order, allows you to get overall count, count for various subsets, as well as percentiles & more. It does the tracking of all the counts in a btree
v 💻
v 💻OP•3mo ago
ah i can add aggregate component down the line? it will prepopulate?
ian
ian•3mo ago
You're responsible for backfilling & migrating data
v 💻
v 💻OP•3mo ago
ah
ian
ian•3mo ago
But once you do you can set up a database trigger to keep it up to date Something most people don't know about SQL is that doing COUNT is often a full scan of the range. So for application-centric (OLTP) workloads doing some normalization helps a lot I've gott run, but hope this is a good starter And thanks @Jamal for all the help!

Did you find this page helpful?