Multiple vector indexes on the same table
I have a bunch of embeddings stored in a table. I want to be able to filter those on different columns depending on the scenario. I assume I'm meant to create two different vector indexes, but I keep getting an error when I try:
The indexes I have on that table are:
If I remove the second vectorIndex then it works. Any ideas what I'm doing wrong?
7 Replies
Interesting, this looks to me like it should work, so we'll look into it. But it also looks like you could use a single index:
.vectorIndex("by_embedding", {vectorField: "embedding", dimensions: 1536, filterFields: ["clerkUserId", "documentId"]})
For each vector search query you can filter on any subset of the filter fields.
To expand on Lee's point a bit, a vectorIndex is different from a normal index in the way that filterFields work. You don't have to specify them in prefix order. You can filter on just documentId or just clerkUserId. As a result, you don't benefit from having two indexes on the same field. Of course, let us know if you have a use case that would benefit from multiple indexes that I haven't thought of.
So specifying both fields in the one index ["clerkUserId", "documentId"] seems to work, thanks!
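For reference, the single-index setup suggested in this thread might look roughly like this in a schema file. This is only a sketch: the table name chunks, the text field, and the string types for clerkUserId and documentId are assumptions, since the original schema isn't shown in the thread.

```typescript
// convex/schema.ts (sketch; table name and non-index fields are assumptions)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  // "chunks" is a hypothetical table name.
  chunks: defineTable({
    text: v.string(), // hypothetical payload field
    clerkUserId: v.string(), // assumed to be stored as a string
    documentId: v.string(), // assumed to be stored as a string
    embedding: v.array(v.float64()),
  }).vectorIndex("by_embedding", {
    vectorField: "embedding",
    dimensions: 1536,
    // One index covers both scenarios: a query may filter on
    // either field by itself, in any order.
    filterFields: ["clerkUserId", "documentId"],
  }),
});
```

With an index like this, one search could pass filter: (q) => q.eq("documentId", documentId) to ctx.vectorSearch and another could pass (q) => q.eq("clerkUserId", clerkUserId), without needing a second vector index.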
This mostly meets my needs - the only thing I'm missing is the ability to do ANDs in vector searches (IntelliSense and the docs both seem to indicate it's not available). When I'm filtering by documentId I would have liked to be able to also filter by clerkUserId to make sure I haven't stuffed up and am only looking at documents for the current user 🙂
Gotcha. For that case, you could check the results when they come back, which would require fetching the document for the id returned from the search.
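A minimal sketch of that after-the-fact check, assuming the search returns ids with scores and that the matching rows have already been fetched into a map keyed by id. The type names and the keepOwnedHits helper are hypothetical and not part of any library.

```typescript
// Hypothetical shapes: a search hit carries an id and a score, and each
// stored row carries the owner's clerkUserId alongside its documentId.
type SearchHit = { id: string; score: number };
type StoredChunk = { id: string; clerkUserId: string; documentId: string };

// Keep only the hits whose stored row belongs to the given user.
// Hits whose row is missing from the map are dropped as well.
function keepOwnedHits(
  hits: SearchHit[],
  rowsById: Map<string, StoredChunk>,
  clerkUserId: string,
): SearchHit[] {
  return hits.filter((hit) => {
    const row = rowsById.get(hit.id);
    return row !== undefined && row.clerkUserId === clerkUserId;
  });
}
```

The search itself would still filter by documentId; this step only confirms that every returned row also belongs to the current user, so in the normal case it removes nothing.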
@sshader and @sujayakar looked into supporting ANDs for filters on vector search, and iirc we discovered that it's an open research problem (as long as you want any kind of guarantees about the accuracy or runtime).
If the research problem is solved, we'll work on supporting it 😛. Isn't it fun working on cutting edge AI stuff?
Yeah, and checking the results after the fact is fine for this use case - hopefully it has no effect 😉
FWIW it looks like Pinecone has support for ANDs. See: https://docs.pinecone.io/docs/metadata-filtering#querying-an-index-with-metadata-filters
Yep, I believe it can be implemented, but not with a reasonable time complexity 😅