Intelligent Fig (11mo ago)

Vector search advanced filtering

filter: (q) =>
  q.or(q.eq("cuisine", "French"), q.eq("mainIngredient", "butter")),
The above is what we can currently achieve using filters in vector search. Can we expect more advanced filtering anytime soon? Here is an example of what I would like to achieve (an "and" operator, plus greater-than and less-than to filter numbers), e.g.:
filter: (q) =>
  q.and(q.eq("category", "cosmetics"), q.and(q.gt("price", 10), q.lt("price", 50)))
7 Replies
Michal Srb (11mo ago)
Out of all the items in the cosmetics category that match a given vector search, how many are in the price range you're trying to filter, and how many are outside of it? Say between 10% and 90% are in range. Then filtering in JS only has to load at most 10x more results. It's likely this will work just fine for your use case.
const results = await ctx.vectorSearch("items", "embedding", {
  vector: embedding,
  limit: 16,
  filter: (q) => q.eq("category", "cosmetics"),
});
// Note: vectorSearch results contain only _id and _score, so in practice the
// matching documents need to be loaded before filtering on price.
return results.filter((item) => item.price > 10 && item.price < 50);
@sujayakar can speak to whether more advanced filtering will be technically possible in the future.
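A minimal sketch of the over-fetch-then-filter idea, assuming the search hits carry only `_id` and `_score` and that the matching documents have already been loaded separately. The `SearchHit` and `Item` types and the `filterHitsByPrice` helper are hypothetical names for illustration, not part of the Convex API.

```typescript
// Hypothetical shapes: a vector search hit and a loaded document.
type SearchHit = { _id: string; _score: number };
type Item = { _id: string; price: number };

// Keep only items strictly inside (minPrice, maxPrice), preserving the
// relevance order of the search hits.
function filterHitsByPrice(
  hits: SearchHit[],
  items: Item[],
  minPrice: number,
  maxPrice: number,
): Item[] {
  // Index the loaded documents by id so lookups follow hit order.
  const byId = new Map<string, Item>();
  for (const item of items) {
    byId.set(item._id, item);
  }

  const inRange: Item[] = [];
  for (const hit of hits) {
    const item = byId.get(hit._id);
    if (item !== undefined && item.price > minPrice && item.price < maxPrice) {
      inRange.push(item);
    }
  }
  return inRange;
}
```

If too few results survive the price filter, the search limit can be raised and the filter applied again.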
Intelligent Fig (OP, 11mo ago)
Thanks for the input. If you are looking to support advanced filtering, prioritising an "and" operator would make a huge difference. For example, the filter below returns documents where either condition matches, not only those where both match:
filter: (q) =>
  q.or(q.eq("category", "Cosmetics"), q.eq("subCategory", "Makeup")),
In a practical scenario, the user should be able to apply multiple filters (in this case, the user selects the category and sub-category dropdowns and then types a string into the search bar). So support for something like the filter below would be great:
filter: (q) =>
  q.and(q.eq("category", "Cosmetics"), q.eq("subCategory", "Makeup")),
Michal Srb (11mo ago)
As with my previous suggestion, you can use an "or" filter in the vector search and then apply the "and" condition in JS.
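A rough sketch of that second step, assuming the documents matched by the "or" filter have already been loaded. The `Product` type and `keepMatchingBoth` helper are hypothetical names used only for illustration.

```typescript
// Hypothetical document shape for the category/subCategory example.
type Product = { _id: string; category: string; subCategory: string };

// The vector search "or" filter lets through documents matching either field;
// this keeps only the ones that match both.
function keepMatchingBoth(
  docs: Product[],
  category: string,
  subCategory: string,
): Product[] {
  return docs.filter(
    (doc) => doc.category === category && doc.subCategory === subCategory,
  );
}
```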
ian (11mo ago)
I believe "and" filters are an ongoing research area for vector databases due to the structure of the data. Products that offer this as a built-in part of their syntax end up doing heavy manual filtering under the hood and are slower. In Convex you can do your own filtering in code, since the code runs so close to the database.
Intelligent Fig (OP, 11mo ago)
That makes sense. Can we perform vector search on a specific set of documents? Let's say you have an array of IDs of Convex documents: can we perform vector search only on those documents?
ian (11mo ago)
The cool thing about normalized vectors (like OpenAI's) is that you can compare them with a simple dot product. That is very fast if you have the arrays in memory, so for a known set of documents you can load their embeddings and rank them yourself.
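A small sketch of that in-memory ranking, assuming every embedding is already normalized to unit length so that the dot product equals cosine similarity. The `EmbeddedDoc` type and both function names are illustrative, not from any library.

```typescript
// Hypothetical shape: a document id plus its unit-length embedding.
type EmbeddedDoc = { id: string; embedding: number[] };

// Dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Score each document against the query and return the top `limit` matches,
// highest similarity first.
function rankBySimilarity(
  query: number[],
  docs: EmbeddedDoc[],
  limit: number,
): { id: string; score: number }[] {
  return docs
    .map((doc) => ({ id: doc.id, score: dot(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

This is a linear scan, so it suits a modest, known set of documents rather than a whole table.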
sujayakar (11mo ago)
One other idea: if you know you're always going to search for category AND subCategory, you could concatenate them into a single field (i.e. set document.filterKey = category + ':' + subCategory) when inserting, declare filterKey as a filter field in the schema, and then do a single q.eq on that field. As ian mentioned, supporting arbitrary AND expressions when filtering is still ongoing research, but this workaround will work if it's a fixed set of columns.
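A minimal sketch of the composite key, assuming the same `:` separator described above. `makeFilterKey` is a hypothetical helper. The same function would be used both when inserting a document and when building the search filter, so the two always agree.

```typescript
// Build the combined key stored on each document as `filterKey`.
// Assumes category values never contain ":"; otherwise two different
// (category, subCategory) pairs could produce the same key.
function makeFilterKey(category: string, subCategory: string): string {
  return category + ":" + subCategory;
}

// On insert (sketch): store filterKey alongside the other fields, e.g.
//   { category, subCategory, filterKey: makeFilterKey(category, subCategory), ... }
// On search (sketch): a single equality filter stands in for the "and", e.g.
//   filter: (q) => q.eq("filterKey", makeFilterKey("Cosmetics", "Makeup"))
```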
