samvitr, 2w ago

How to query messages only from a user’s friends (~1000 friends/user, ~100k messages/topic).

I’m considering moving a chat app to Convex but have some concerns about performance I was hoping for some clarity on. In our app, each user only sees messages from people who are their friends, for each channel. Example: a channel has ~100k messages, and a user may have ~1000 friends. I need all messages in that channel written by authors in that user’s friend list. Simplified schema:
friendships:
  user: Id<"users">
  friend: Id<"users">
  // index: by_user
messages:
  channel: Id<"channel">
  authorId: Id<"users">
  createdAt: number
  // index: by_channel_createdAt
What I expected to do is the equivalent of SQL:
SELECT *
FROM messages
WHERE channel = <channelId>
  AND authorId IN (friendIds)
But Convex filters don’t support an “in” operator for a large dynamic array. My options seem to be a huge OR chain:
q.or(...friendIds.map(fid => q.eq(q.field("authorId"), fid)))
Surely this is not reasonable for ~1000 IDs? Or fetch all recent messages by index and filter in JS:
const recent = await ctx.db
  .query("messages")
  .withIndex("by_channel_createdAt", q => q.eq("channel", channel))
  .order("desc")
  .take(2000);

const friendSet = new Set(friendIds);
return recent.filter(m => friendSet.has(m.authorId));
This only works if I pick an arbitrary recent window. It does not solve searching through all 100k messages without scanning everything.

My question: what is the recommended Convex approach for “get all messages in channel X where authorId is in a large dynamic list (~1000 items)” without scanning the entire channel and without generating a massive OR chain?

I get that I can denormalize this and keep a view for each user, where every message written is duplicated per user to make it O(1) at read time. But this could mean ~1000x more writes per message, which doesn't make sense to do.
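To make the trade-off in the second option concrete, here is a minimal sketch in plain TypeScript, using an in-memory array as a stand-in for the channel index (it does not call any Convex API, and `recentFromFriends`, `batchSize`, and `scanned` are hypothetical names). Instead of a fixed window, it reads newest-first in batches and stops once enough friend messages are found. In the worst case it still walks the whole channel.

```typescript
// Hypothetical in-memory stand-in for the messages table; not Convex API.
type Message = { channel: string; authorId: string; createdAt: number };

// Walk the channel newest-first in fixed-size batches, keep only messages
// written by friends, and stop once `limit` matches have been collected.
function recentFromFriends(
  channelMessagesDesc: Message[], // assumed sorted by createdAt, descending
  friendIds: Set<string>,
  limit: number,
  batchSize = 500,
): { results: Message[]; scanned: number } {
  const results: Message[] = [];
  let scanned = 0;
  while (scanned < channelMessagesDesc.length && results.length < limit) {
    const batch = channelMessagesDesc.slice(scanned, scanned + batchSize);
    for (const m of batch) {
      scanned++;
      if (friendIds.has(m.authorId)) {
        results.push(m);
        if (results.length === limit) break;
      }
    }
  }
  return { results, scanned };
}
```

The `scanned` count shows the cost: the fewer of the channel's authors are friends, the more rows are read per returned message.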
3 Replies
Convex Bot, 2w ago
Thanks for posting in <#1088161997662724167>. Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets. - Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.) - Use search.convex.dev to search Docs, Stack, and Discord all at once. - Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI. - Avoid tagging staff unless specifically instructed. Thank you!
BrianOnDaBianchi
I think you can use convex-helpers stream filters to do this. The index would be on channel and authorId; when you stream the query, you can filter it in TypeScript with something like authorIdSet.has(post.authorId). https://stack.convex.dev/merging-streams-of-convex-data
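One way to read the "index on channel and authorId" suggestion is as one ordered index range per friend, merged by createdAt. The sketch below illustrates only that general idea in plain TypeScript over in-memory arrays; it does not use the convex-helpers API, and `mergeNewestFirst` and the per-author arrays are hypothetical stand-ins for indexed reads.

```typescript
type Message = { channel: string; authorId: string; createdAt: number };

// Each inner array stands in for one index range read: a single friend's
// messages in one channel, newest first (the index on channel, authorId,
// createdAt is an assumption, not something the schema above defines).
function mergeNewestFirst(perAuthor: Message[][], limit: number): Message[] {
  const cursors = perAuthor.map(() => 0); // next unread position per stream
  const merged: Message[] = [];
  while (merged.length < limit) {
    let best = -1;
    for (let i = 0; i < perAuthor.length; i++) {
      const candidate = perAuthor[i][cursors[i]];
      if (candidate === undefined) continue; // this stream is exhausted
      if (
        best === -1 ||
        candidate.createdAt > perAuthor[best][cursors[best]].createdAt
      ) {
        best = i;
      }
    }
    if (best === -1) break; // every stream is exhausted
    merged.push(perAuthor[best][cursors[best]]);
    cursors[best]++;
  }
  return merged;
}
```

In this sketch only the friends' messages are touched, never the rest of the channel. The linear search for the newest head costs one comparison per friend per output row, so with ~1000 friends a heap would be the natural refinement.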
BrianOnDaBianchi
Another thing you can do is apply rules from convex-helpers to the query and only allow read access to the author's friends; then just query as normal and the rules will prevent the irrelevant messages from being returned. Rules are row-based.
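As a rough illustration of a row-based read rule, here is a plain TypeScript sketch. The `ReadRule` type, `friendsOnlyRule`, and `readWithRule` are hypothetical names for this example and are not the convex-helpers API.

```typescript
type Message = { channel: string; authorId: string; createdAt: number };

// Hypothetical rule shape: decide per row whether the viewer may read it.
type ReadRule<T> = (row: T) => boolean;

// A message is readable only if its author is in the viewer's friend set.
function friendsOnlyRule(friendIds: Set<string>): ReadRule<Message> {
  return (m) => friendIds.has(m.authorId);
}

// Apply the rule to the rows produced by an ordinary query.
function readWithRule<T>(rows: Iterable<T>, rule: ReadRule<T>): T[] {
  const visible: T[] = [];
  for (const row of rows) {
    if (rule(row)) visible.push(row);
  }
  return visible;
}
```

Note that in this sketch every row is still read before the rule hides it, so the rule handles visibility rather than reducing how much of the channel is scanned.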
