Hey Convex! Are more query options coming to vector search anytime soon? I've been trying to figure out a good way to do a vector search on a composite key, but it seems like the vector query builder only supports `eq` and `or`, so it doesn't seem like it's currently possible to filter by fieldA and fieldB. Am I missing something obvious?
hey @Foxxy -- we didn't add AND support for vector search since it's still a bit of an open research problem to support it efficiently at scale. (can point to some papers if you're interested!)
one workaround for now is that if you know you're always going to filter by both fieldA and fieldB, you can create a new field with their concatenation and then filter by that.
so, if they're both strings, you can store `fieldC: fieldA + '-' + fieldB` and then have a single filter condition on fieldC for your query. lmk if that helps!
I'd be interested if you have a link handy to read about it sometime! And yeah, the new composite key to search on one field was my first thought, but I wanted to double check before going too far. Thanks!
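For anyone landing here later, the concatenation workaround above could look roughly like the sketch below. The helper names (`compositeKey`, `withCompositeKey`), the `Doc` shape, and the sample values are hypothetical and not part of Convex. The separator check is an added assumption: with a plain `'-'` join, two different pairs can produce the same fieldC if fieldA itself contains a `-`.

```typescript
// Hypothetical helper, not part of Convex: builds the value stored in fieldC.
const SEPARATOR = "-";

function compositeKey(fieldA: string, fieldB: string): string {
  // If fieldA could contain the separator, distinct pairs could collide:
  // ("a-b", "c") and ("a", "b-c") would both produce "a-b-c".
  if (fieldA.includes(SEPARATOR)) {
    throw new Error(`fieldA must not contain "${SEPARATOR}"`);
  }
  return fieldA + SEPARATOR + fieldB;
}

// Hypothetical document shape; only fieldC is used as the vector search filter.
interface Doc {
  fieldA: string;
  fieldB: string;
  fieldC: string;
  embedding: number[];
}

// Derive fieldC when writing the document so it matches fieldA and fieldB.
function withCompositeKey(doc: Omit<Doc, "fieldC">): Doc {
  return { ...doc, fieldC: compositeKey(doc.fieldA, doc.fieldB) };
}

// At query time, the single filter condition would then look roughly like:
//   filter: (q) => q.eq("fieldC", compositeKey("acme", "red"))
const example = withCompositeKey({
  fieldA: "acme",
  fieldB: "red",
  embedding: [0.1, 0.2],
});
console.log(example.fieldC);
```

fieldC would also need to be listed as a filter field on the vector index, and it has to be rewritten whenever fieldA or fieldB changes.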
https://qdrant.tech/articles/filtrable-hnsw/ is a cool article on how qdrant implements filtering by modifying their HNSW index -- there's a neat result from percolation theory that explains why they can maintain high recall even up to a filter eliminating ~80% of results.
the filtered search track from neurips last year (https://big-ann-benchmarks.com/neurips23.html) is very relevant, and the winner looks promising, but there's not many details available yet other than the slides: https://big-ann-benchmarks.com/neurips23_slides/IVF_2_filter_Ben.pdf
the more established result is Filtered-DiskANN from microsoft research (https://harsha-simhadri.org/pubs/Filtered-DiskANN23.pdf), but they limit the number of distinct filter values to ~1000.