Foxxy (8mo ago)

Hey Convex! Are more query options coming to vector search anytime soon? I've been trying to figure out a good way to do a vector search on a composite key, but it seems like the vector query builder only supports eq and or filters, so it doesn't seem currently possible to filter by both fieldA and fieldB. Am I missing something obvious?
3 Replies
sujayakar (8mo ago)
hey @Foxxy -- we didn't add AND support for vector search since it's still a bit of an open research problem to support it efficiently at scale. (can point to some papers if you're interested!) one workaround for now is that if you know you're always going to filter by both fieldA and fieldB, you can create a new field with their concatenation and then filter by that. so, if they're both strings, you can store fieldC: fieldA + '-' + fieldB and then have a single filter condition on fieldC for your query. lmk if that helps!
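A minimal sketch of that workaround in plain TypeScript, in case it helps. The `Doc` shape and the helper names (`compositeKey`, `compositeKeySafe`, `withCompositeKey`) are hypothetical, made up for illustration. The JSON-encoded variant is an extra assumption, not something suggested in this thread. It only matters if either field can itself contain the '-' separator.

```typescript
// Hypothetical document shape; fieldA/fieldB/fieldC follow the names used in the thread.
type Doc = { fieldA: string; fieldB: string; fieldC: string; embedding: number[] };

// Build the composite filter value as suggested above: fieldA + '-' + fieldB.
// Caveat: ambiguous if either field can contain '-' ("a-b","c" vs "a","b-c").
function compositeKey(fieldA: string, fieldB: string): string {
  return fieldA + "-" + fieldB;
}

// Alternative (an assumption, not from the thread): JSON-encode the pair so
// different (fieldA, fieldB) pairs always produce different keys.
function compositeKeySafe(fieldA: string, fieldB: string): string {
  return JSON.stringify([fieldA, fieldB]);
}

// Derive fieldC at write time so it cannot drift from fieldA and fieldB.
function withCompositeKey(doc: Omit<Doc, "fieldC">): Doc {
  return { ...doc, fieldC: compositeKey(doc.fieldA, doc.fieldB) };
}
```

On the Convex side, fieldC would then presumably be listed as a filter field on the vector index and matched with a single eq condition, using the same helper to build the value at query time.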
Foxxy (OP, 8mo ago)
I'd be interested if you have a link handy to read about it sometime! And yeah, the new composite key to search on one field was my first thought, but I wanted to double check before going too far. Thanks!
sujayakar (8mo ago)
https://qdrant.tech/articles/filtrable-hnsw/ is a cool article on how qdrant implements filtering by modifying their HNSW index -- there's a neat result from percolation theory that explains why they can maintain high recall even when a filter eliminates up to ~80% of results. the filtered search track from neurips last year (https://big-ann-benchmarks.com/neurips23.html) is very relevant, and the winner looks promising, but there aren't many details available yet other than the slides (https://big-ann-benchmarks.com/neurips23_slides/IVF_2_filter_Ben.pdf). the more established result is Filtered-DiskANN from microsoft research (https://harsha-simhadri.org/pubs/Filtered-DiskANN23.pdf), but they limit the number of distinct filter values to ~1000.
