Vector search advanced filtering
The above is what we can currently achieve using filters in vector search.
Can we expect more advanced filtering anytime soon?
Here is an example of what I would like to achieve (an 'and' operator, plus greater-than and less-than operators to filter numbers).
eg.
7 Replies
Out of all the items in the cosmetics category that match a given vector search, how many are in the price range you're trying to filter, and how many are outside of it?
Say it's between 10% and 90%. Then filtering in JS only has to load at most 10x more results. It's likely this will work just fine for your use case.
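To make the over-fetch-then-filter idea concrete, here is a minimal sketch. It assumes you have already loaded the documents returned by a vector search on the category; the Product shape, the field names, and the filterByPrice helper are hypothetical and only for illustration.

```typescript
// Hypothetical shape of a document loaded after a vector search.
interface Product {
  id: string;
  category: string;
  price: number;
  score: number; // similarity score returned by the vector search
}

// Keep only candidates whose price is inside [minPrice, maxPrice],
// order them by descending score, and return at most `limit` of them.
function filterByPrice(
  candidates: Product[],
  minPrice: number,
  maxPrice: number,
  limit: number
): Product[] {
  return candidates
    .filter((p) => p.price >= minPrice && p.price <= maxPrice)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Over-fetched candidates (more than we need), all in the same category.
const candidates: Product[] = [
  { id: "a", category: "cosmetics", price: 5, score: 0.95 },
  { id: "b", category: "cosmetics", price: 20, score: 0.9 },
  { id: "c", category: "cosmetics", price: 60, score: 0.85 },
  { id: "d", category: "cosmetics", price: 45, score: 0.8 },
  { id: "e", category: "cosmetics", price: 30, score: 0.7 },
];

const top = filterByPrice(candidates, 10, 50, 2);
console.log(top.map((p) => p.id));
```

If only about 10% of the category falls in the price range, you would ask the vector search for roughly 10x the number of results you finally want, then trim in code like this.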
@sujayakar can speak to whether more advanced filtering will be technically possible in the future.
Thanks for the input.
If you are looking to support advanced filtering, prioritising an 'and' operator would make a huge difference.
For example,
the filter below returns documents where just one of the filters matches, rather than requiring both.
In a practical scenario, a user should be able to apply multiple filters (in this case, the user selects category and sub category dropdowns, and then proceeds to input a string in the search bar).
So, support for something like the filter below would be great:
Similarly to my previous suggestion, you can use an 'or' filter, then filter in JS for the 'and'.
I believe 'and' filtering is an ongoing research problem for vector databases due to the structure of the data. Products that offer this as a built-in part of their syntax end up doing heavy manual filtering under the hood and being slower. You can do your own filtering in code in Convex, since the code runs so close to the database.
That makes sense. Can we perform vector search on a specific set of documents? Let's say you have an array of ids of Convex documents: can we perform vector search only on those documents?
The cool thing about normalized vectors (like OpenAI's) is that you can compute the similarity between them with a simple dot product. That is very fast if you have the arrays in memory.
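As a rough sketch of that in-memory approach: assuming you have already loaded the documents for your array of ids, and that their embeddings are normalized to unit length, the dot product equals the cosine similarity. The Doc shape and the rankBySimilarity helper below are hypothetical names for illustration.

```typescript
// Hypothetical shape: a document id plus its (unit-length) embedding.
interface Doc {
  id: string;
  embedding: number[];
}

// Dot product of two equal-length vectors. For unit-length vectors
// this is the same value as their cosine similarity.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Score every document against the query and return the best `limit`.
function rankBySimilarity(
  query: number[],
  docs: Doc[],
  limit: number
): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: dot(query, d.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Tiny 2-dimensional example; real embeddings have many more dimensions.
const docs: Doc[] = [
  { id: "x", embedding: [1, 0] },
  { id: "y", embedding: [0, 1] },
  { id: "z", embedding: [0.6, 0.8] },
];

const ranked = rankBySimilarity([1, 0], docs, 2);
console.log(ranked.map((r) => r.id));
```

This is a brute-force scan, so it suits a small, known set of documents rather than a whole table.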
One other idea is that if you know you're always going to search for category AND subCategory, you could concatenate them into a field (i.e. set document.filterKey = category + ':' + subCategory) when inserting, declare filterKey as a filter field in the schema, and then do a single q.eq on that field.
As Ian mentioned, it's still ongoing research to support arbitrary AND expressions when filtering, but that workaround will work if it's a fixed set of columns.
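A minimal sketch of the concatenated-key workaround might look like the following. The makeFilterKey helper is a hypothetical name, and the Convex usage shown in the comments (table name, index name, limit) is an assumption for illustration rather than code from this thread.

```typescript
// Build the combined key stored on each document at insert time.
// Assumes neither value contains ':'; otherwise two different pairs
// could produce the same key.
function makeFilterKey(category: string, subCategory: string): string {
  return category + ":" + subCategory;
}

// At insert time (hypothetical document shape):
const doc = {
  category: "cosmetics",
  subCategory: "lipstick",
  filterKey: makeFilterKey("cosmetics", "lipstick"),
};

// At query time, the same helper produces the value to match with a
// single equality filter. Assumed usage inside a Convex action, with
// "products" and "by_embedding" as placeholder names:
//
//   ctx.vectorSearch("products", "by_embedding", {
//     vector,
//     limit: 16,
//     filter: (q) => q.eq("filterKey", makeFilterKey("cosmetics", "lipstick")),
//   });

console.log(doc.filterKey);
```

Using one helper for both writes and reads keeps the key format consistent, which matters because the filter is an exact equality match.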