Vector Search with relational filter
Given this schema, how do I only retrieve embeddings for notes that belong to this particular user?
Do I need to put the userId into the embedding table as well?
10 Replies
Embeddings tables are usually reached through the parent table, e.g., you get the user's notes and then look up the embeddings for those notes. But if you want just the embeddings in a query, without the notes, adding the userId to the embeddings table is the best way.
But I'd generally recommend treating an embeddings table as an extension of the data it represents. The point of having a separate embeddings table is to allow the non-embedding data to be pulled down without the embeddings for bandwidth efficiency.
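To make the two access patterns above concrete, here is a minimal in-memory sketch. It does not use the Convex API; the `Note` and `NoteChunk` types, the sample rows, and both function names are hypothetical stand-ins for the two tables.

```typescript
// Hypothetical in-memory stand-ins for a notes table and an embeddings table.
interface Note {
  id: string;
  userId: string;
  title: string;
}

interface NoteChunk {
  id: string;
  noteId: string; // parent note (one note has many chunks)
  userId: string; // denormalized copy of the parent note's userId
  embedding: number[];
}

const notes: Note[] = [
  { id: "n1", userId: "alice", title: "Groceries" },
  { id: "n2", userId: "bob", title: "Ideas" },
];

const chunks: NoteChunk[] = [
  { id: "c1", noteId: "n1", userId: "alice", embedding: [1, 0] },
  { id: "c2", noteId: "n1", userId: "alice", embedding: [0, 1] },
  { id: "c3", noteId: "n2", userId: "bob", embedding: [1, 1] },
];

// Pattern 1: start from the parent table, then collect the chunks of those notes.
function chunksViaNotes(userId: string): NoteChunk[] {
  const noteIds = new Set(
    notes.filter((n) => n.userId === userId).map((n) => n.id),
  );
  return chunks.filter((c) => noteIds.has(c.noteId));
}

// Pattern 2: skip the parent table and filter on the duplicated userId.
function chunksViaUserId(userId: string): NoteChunk[] {
  return chunks.filter((c) => c.userId === userId);
}
```

Both functions return the same chunks for a given user. The second one never touches the notes table, which is what a filter inside a search needs.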
How would I change my schema to do that? Considering that I split notes into text chunks before embedding them, so there is a 1-to-many relationship.
Ah you’re just searching, I don’t know why I thought you were querying embeddings directly apart from search. At any rate, yeah you’ll want to add the user id to the embedding and filter on that in your search query.
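A sketch of what that schema could look like, assuming one embeddings row per text chunk. The table names, field names, index names, and the 1536 dimensions are assumptions for illustration, not taken from the schema in the question.

```typescript
// convex/schema.ts (a sketch; table, field, and index names are assumptions)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  notes: defineTable({
    userId: v.string(),
    title: v.string(),
    content: v.string(),
  }).index("by_userId", ["userId"]),

  // One row per text chunk, so a single note can have many embeddings.
  noteEmbeddings: defineTable({
    noteId: v.id("notes"),
    userId: v.string(), // duplicated from the parent note so search can filter on it
    content: v.string(),
    embedding: v.array(v.float64()),
  })
    .index("by_noteId", ["noteId"])
    .vectorIndex("by_embedding", {
      vectorField: "embedding",
      dimensions: 1536, // assumption: must match the embedding model in use
      filterFields: ["userId"],
    }),
});
```

With userId listed in filterFields, a search run from an action can restrict results to one user by passing filter: (q) => q.eq("userId", userId) in the options to ctx.vectorSearch("noteEmbeddings", "by_embedding", ...), alongside the query vector and a limit.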
So the userId has to be in both the note and the note embedding?
i.e.
That kind of duplication feels a bit ugly compared to relational databases
As Convex is a relational database (and just runs on top of one), would you be willing to show the Postgres or equivalent schema you'd use on those systems? It would help me understand what you're not getting with Convex right now.
With Prisma, I would do something like this:
Not sure what SQL query this translates to. But the userId is only stored in the note and not in the embeddings.
@jamwt Can you help me with this? I'm preparing a tutorial for YouTube but I'm stuck here
What I need is a relation query
Adding userId to both tables is how you do a relation query here. It seems clunky, but that's just because Convex APIs are low level. Prisma is doing something very similar under the hood and exposing it as findMany through their ORM.
Honestly, this is probably a good chance to point out this low-level aspect to viewers in your video, since it comes up in a number of places. Convex isn't an ORM, but an ORM could be built on top of it. Ents (in maintenance mode) is a good example of this: https://labs.convex.dev/convex-ents
This "low level" concept is mentioned explicitly in the Architecture section of Zen of Convex: https://docs.convex.dev/understanding/zen
Thank you for the clarification! That's a good idea to mention it!
I'm fine with the double userId, I just wanted to make sure I'm not missing anything
It's a bit disorienting for sure. But yeah, a little denormalization can go a long way with Convex.