Convex Community
Created by Kenni on 12/3/2024 in #support-community
How to query items where array field contains a specific value?
I have a paginated query where I need to filter items based on whether an array field contains a specific category ID. Currently, my query only returns items where the array exactly matches [categoryId], but I need it to return items where categoryId is one of potentially many elements in the array. Here's my current query:
const baseQuery = ctx.db
  .query("items")
  .withIndex("by_owner_and_categories_and_timestamp", (q) =>
    q
      .eq("ownerId", args.ownerId)
      .eq("categories", [args.categoryId]) // Only matches the exact array [categoryId]
      .lt("timestamp", Number.MAX_SAFE_INTEGER)
  )
  .order("desc");
For example, if I have these items in my database:
// Item 1 - Currently NOT returned, but I want it to be
{
  ownerId: "owner1",
  categories: ["category1", "category2"],
  timestamp: 123
}

// Item 2 - Currently returned
{
  ownerId: "owner1",
  categories: ["category1"],
  timestamp: 456
}
When querying for categoryId: "category1", I want both items to be returned since they both contain "category1" in their categories array. Is there a way to achieve this kind of "contains" functionality for array fields in Convex? I need this to work with pagination too. Any suggestions would be appreciated!
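One way to get "contains" semantics with pagination, sketched here under assumptions rather than taken from this thread: since index equality on an array field matches the whole array, denormalize membership into a join table with one row per (item, category) pair, and paginate over that table instead of over items. The table name itemCategories, the index by_owner_category_timestamp, and the field types below are all illustrative.

// Sketch only: assumes a hypothetical join table "itemCategories" in schema.ts
// with one row per (item, category) pair, and an index
// "by_owner_category_timestamp" on ["ownerId", "categoryId", "timestamp"].
import { v } from "convex/values";
import { paginationOptsValidator } from "convex/server";
import { query } from "./_generated/server";

export const itemsByCategory = query({
  args: {
    ownerId: v.string(),
    categoryId: v.string(),
    paginationOpts: paginationOptsValidator,
  },
  handler: async (ctx, args) => {
    // Paginate over the join table: each row points at one item whose
    // categories array contains categoryId, whatever else the array holds.
    const result = await ctx.db
      .query("itemCategories")
      .withIndex("by_owner_category_timestamp", (q) =>
        q.eq("ownerId", args.ownerId).eq("categoryId", args.categoryId)
      )
      .order("desc")
      .paginate(args.paginationOpts);

    // Resolve each join row to its item document (entries can be null if an
    // item was deleted without cleaning up its join rows).
    const items = await Promise.all(
      result.page.map((row) => ctx.db.get(row.itemId))
    );
    return { ...result, page: items };
  },
});

The trade-off is that mutations which add or remove a category must also insert or delete the matching itemCategories rows, in exchange for a query that stays fully indexed and pagination-friendly.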
4 replies
Convex Community
Created by Kenni on 11/20/2024 in #support-community
Handling "Too many reads" Error During Mass Deletion in Convex
Hello Convex team! 👋 I'm encountering a read-limit error while implementing parent-child record deletion in our application. Would appreciate guidance on the best approach.

The Error
Failed to delete parent: ConvexError: [CONVEX M(parent:deleteParent)]
[Request ID: 4d1ba8b070820b62] Server Error
Uncaught ConvexError: Too many reads in a single function execution (limit: 4096).
Consider using smaller limits in your queries, paginating your queries, or using indexed queries with a selective index range expressions.
Current Implementation

Here's a simplified version of our deletion code:
export const deleteParent = mutationWithRLS({
  args: { parentId: v.id("parents") },
  handler: async (ctx, { parentId }) => {
    try {
      await Promise.all([
        // Delete first-level children
        deleteInBatches({
          ctx,
          tableName: "childrenA",
          indexName: "by_parent",
          fieldName: "parentId",
          fieldValue: parentId,
        }),
        // Delete second-level children
        deleteInBatches({
          ctx,
          tableName: "childrenB",
          indexName: "by_parent",
          fieldName: "parentId",
          fieldValue: parentId,
        }),
        // Delete associated records
        deleteInBatches({
          ctx,
          tableName: "associatedRecords",
          indexName: "by_parent",
          fieldName: "parentId",
          fieldValue: parentId,
        }),
      ]);

      await ctx.db.delete(parentId);
      return { success: true };
    } catch (error) {
      throw new ConvexError(`Failed to delete parent: ${error.message}`);
    }
  },
});

// Our batch deletion implementation
async function deleteInBatches({
  ctx,
  tableName,
  indexName,
  fieldName,
  fieldValue,
}) {
  const BATCH_SIZE = 100;
  let hasMore = true;

  while (hasMore) {
    const batch = await ctx.db
      .query(tableName)
      .withIndex(indexName, (q) => q.eq(fieldName, fieldValue))
      .take(BATCH_SIZE);

    if (batch.length === 0) break;

    for (const item of batch) {
      await ctx.db.delete(item._id);
    }

    hasMore = batch.length === BATCH_SIZE;
  }
}
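For context on why the error fires: every document returned by .take() counts toward the 4096-document read limit, and all three deleteInBatches calls above run inside the same mutation, so their reads accumulate across tables. A bounded variant that stops at a fixed read budget and reports leftover work might look like the following sketch (the deleteUpTo name and the budget parameter are illustrative, not from this thread):

// Sketch: a budgeted variant of deleteInBatches. All reads inside one
// mutation share the 4096-document limit, so this version stops once it
// has consumed its share and tells the caller whether documents remain.
async function deleteUpTo({ ctx, tableName, indexName, fieldName, fieldValue, budget }) {
  let deleted = 0;
  while (deleted < budget) {
    const batch = await ctx.db
      .query(tableName)
      .withIndex(indexName, (q) => q.eq(fieldName, fieldValue))
      .take(Math.min(100, budget - deleted));

    if (batch.length === 0) return { done: true, deleted };

    for (const doc of batch) {
      await ctx.db.delete(doc._id);
    }
    deleted += batch.length;
  }
  return { done: false, deleted }; // budget spent; caller should continue later
}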
Specific Questions

1. Read Limit Management
   - What's the recommended approach to stay within the 4096-read limit for large deletions?
   - Should the deletion be split across multiple mutation calls? (See the scheduler sketch after this list.)
   - Is there a way to monitor the read count during execution?

2. Batch Processing
   - What's the optimal batch size for avoiding read limits?
   - Is sequential processing better than parallel (Promise.all) for related tables?
   - How should we handle pagination for large-scale deletions?

3. Best Practices
   - What's the recommended pattern for deleting deeply nested data structures?
   - Should we implement a job system for large deletions?
   - Are there specific indexing strategies for optimization?

Current Scale

- An average parent record has:
  - 100-200 type A children
  - 50-100 type B children
  - 500-1000 associated records
- Each child might have 5-10 nested records

What We've Tried

1. Reduced the batch size to 50
2. Processed tables sequentially instead of in parallel
3. Used indexed queries

Would greatly appreciate guidance on:
1. The most efficient way to structure these deletions
2. Best practices for handling read limits
3. Recommended Convex features for this use case

Thank you! 🙏
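On splitting the work across mutations: a common Convex pattern, sketched here under assumptions, is an internal mutation that deletes a bounded chunk and reschedules itself with ctx.scheduler.runAfter until all children are gone, so each invocation gets a fresh read limit. The registration internal.parent.deleteParentBatch is hypothetical, and deleteUpTo is the budgeted helper sketched above.

// Sketch: splitting a large deletion across mutations with the scheduler.
// Assumes the deleteUpTo helper sketched earlier and that this function
// lives in convex/parent.ts, registered as internal.parent.deleteParentBatch.
import { v } from "convex/values";
import { internalMutation } from "./_generated/server";
import { internal } from "./_generated/api";

export const deleteParentBatch = internalMutation({
  args: { parentId: v.id("parents") },
  handler: async (ctx, { parentId }) => {
    const tables = ["childrenA", "childrenB", "associatedRecords"];
    for (const tableName of tables) {
      const { done } = await deleteUpTo({
        ctx,
        tableName,
        indexName: "by_parent",
        fieldName: "parentId",
        fieldValue: parentId,
        budget: 1000, // stays well under the 4096-read limit per invocation
      });
      if (!done) {
        // Budget exhausted: continue in a fresh mutation with a fresh limit.
        await ctx.scheduler.runAfter(0, internal.parent.deleteParentBatch, {
          parentId,
        });
        return;
      }
    }
    // All child tables drained; safe to remove the parent itself.
    await ctx.db.delete(parentId);
  },
});

Processing tables sequentially, as above, also makes the read accounting easier to reason about than Promise.all, since the budget is consumed one table at a time.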
7 replies