21 replies

vectorSearch filter limitations

Hi!

I'm building a multi-user app for image library management and am incorporating vector search to retrieve similar images.

Everything was actually quite smooth at first: I defined my images table to have an ownerId and used that same field as a filter field for the vectorIndex. As a basic filter this worked splendidly for access control.

      const vectorResults = await ctx.vectorSearch("images", "by_embedding", {
        vector: result.embeddings,
        limit: requestedLimit,
        filter: (q) => q.eq("ownerId", userId)
      });

The problem appeared once I began incorporating search into album management.

Goal 1: Search for relevant images within an album
Goal 2: Find relevant images that are NOT already in this album (to add them in)
Design Restriction: Images can belong to multiple albums - that state is tracked in another table, not alongside the images themselves.

Goal 1 is currently implemented by searching across all of the user's images and then filtering the results by album membership. This approach gets starved of results by the 256-result return limit once the album gets large (if relevant images that are not in the album rank above everything inside it).

Goal 2 is similar: find all of the user's images and filter out the ones currently in the album. This starves once the album is full of highly ranked images.
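
To make the starvation concrete, here is a minimal sketch in plain TypeScript with no Convex calls. The `Hit` type, `topK`, and `postFilter` are hypothetical stand-ins for the vector search and the after-the-fact membership filter, and the scores are made up for illustration.

```typescript
// Hypothetical shape of one ranked search hit.
type Hit = { id: string; score: number };

// Cap on results from a single search, per the 256 limit described above.
const MAX_RESULTS = 256;

// Stand-in for the vector search: best scores first, truncated to the cap.
function topK(hits: Hit[], limit: number): Hit[] {
  return [...hits]
    .sort((a, b) => b.score - a.score)
    .slice(0, Math.min(limit, MAX_RESULTS));
}

// Goal 1: keep hits in the album (inAlbum = true).
// Goal 2: keep hits NOT in the album (inAlbum = false).
function postFilter(hits: Hit[], albumIds: Set<string>, inAlbum: boolean): Hit[] {
  return hits.filter((h) => albumIds.has(h.id) === inAlbum);
}

// 300 images with strictly decreasing scores: img0 is the best match.
const library: Hit[] = Array.from({ length: 300 }, (_, i) => ({
  id: `img${i}`,
  score: 1 - i / 300,
}));

// The album holds only the 44 lowest-ranked images (img256..img299).
const album = new Set(library.slice(256).map((h) => h.id));

const ranked = topK(library, 256);
const inAlbumResults = postFilter(ranked, album, true);
const notInAlbumResults = postFilter(ranked, album, false);

console.log(inAlbumResults.length, notInAlbumResults.length); // prints "0 256"
```

Goal 1 returns nothing here even though the album has 44 images, because all 256 slots went to images outside it. Flip the setup (album holds the top 256) and Goal 2 starves the same way.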

I've been thinking about how to get around this.
- Brute force with paginating vectorSearch results
- Returning more than 256 (just kicks the can down the road)

But that's inefficient.

The only solution I thought of involved a lot of data duplication: copying image embeddings multiple times (one per album) and using ownerId:albumId as the filter key / index instead.
But that still only really addresses Goal 1 (I'd be able to cope with max 256 results + no starvation), not Goal 2.
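
Roughly what I have in mind for the duplication idea, as a plain TypeScript sketch. The row shapes and the `ownerAlbumKey` field name are hypothetical (only ownerId exists in my schema today), and this only builds the duplicated rows; it does not touch Convex.

```typescript
// Hypothetical row shapes; every field name except ownerId is an assumption.
type Image = { id: string; ownerId: string; embedding: number[] };
type Membership = { imageId: string; albumId: string };
type AlbumEmbeddingRow = {
  imageId: string;
  ownerAlbumKey: string; // combined filter value, e.g. "user1:albumA"
  embedding: number[];
};

// Build the combined ownerId:albumId filter value.
function ownerAlbumKey(ownerId: string, albumId: string): string {
  return `${ownerId}:${albumId}`;
}

// Emit one duplicated embedding row per (image, album) membership.
function expandRows(images: Image[], memberships: Membership[]): AlbumEmbeddingRow[] {
  const byId = new Map<string, Image>(
    images.map((img): [string, Image] => [img.id, img]),
  );
  const rows: AlbumEmbeddingRow[] = [];
  for (const m of memberships) {
    const img = byId.get(m.imageId);
    if (!img) continue; // skip memberships that point at a missing image
    rows.push({
      imageId: img.id,
      ownerAlbumKey: ownerAlbumKey(img.ownerId, m.albumId),
      embedding: img.embedding,
    });
  }
  return rows;
}

const images: Image[] = [
  { id: "img1", ownerId: "user1", embedding: [0.1, 0.2] },
  { id: "img2", ownerId: "user1", embedding: [0.3, 0.4] },
];
const memberships: Membership[] = [
  { imageId: "img1", albumId: "albumA" },
  { imageId: "img1", albumId: "albumB" },
  { imageId: "img2", albumId: "albumA" },
];

const rows = expandRows(images, memberships);
console.log(rows.map((r) => `${r.imageId} -> ${r.ownerAlbumKey}`));
```

Each image's embedding is stored once per album it belongs to (img1 appears twice above), which is the duplication cost. An equality filter on this key would scope a search to a single album for Goal 1, but I don't see how it could express "not in this album" for Goal 2.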

Some advice / guidance would be greatly appreciated! I'd really prefer not to move vector search outside Convex!
