Khalil, 6mo ago

vector search stopped working

My code has not changed for a few weeks, and now the vector search method inside actions is no longer returning results when using the filter property.
10 Replies
Khalil (OP), 6mo ago
const results = await vectorStore.similaritySearch(args.query, 5, {
  filter: (q) => q.eq("metadata.chatbotId", args.chatbotId),
});
The results are always empty. I triple-checked that metadata.chatbotId exists for the value I am passing, and again, my code has not changed; this used to work some time ago. This is using the LangChain Convex integration. I also tried vanilla vectorSearch, with no luck.
Emma, 6mo ago
What version of convex are you on?
Khalil (OP), 6mo ago
"convex": "^1.13.0",
Emma, 6mo ago
What does your code using vanilla vector search look like? (if you feel comfortable sharing)
Khalil (OP), 6mo ago
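// convex/ai.ts
import { v } from "convex/values";
import { internalAction } from "./_generated/server";
// cohereClient and DEFAULT_EMBEDDING_MODEL are defined elsewhere in the project (not shown).

export const search = internalAction({
  args: {
    query: v.string(),
    chatbotId: v.id("chatbot"),
  },
  handler: async (ctx, args) => {
    // Embed the query text with Cohere.
    const embeddings = await cohereClient.embed({
      texts: [args.query],
      model: DEFAULT_EMBEDDING_MODEL,
      inputType: "search_query",
    });
    const [vector] = embeddings.embeddings as number[][];

    // Vector search restricted to this chatbot's documents.
    const results = await ctx.vectorSearch("document", "byEmbeddings", {
      vector,
      filter: (q) => q.eq("metadata.chatbotId", args.chatbotId),
      limit: 5,
    });

    return results;
  },
});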
// convex/schema.ts
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  document: defineTable({
    text: v.string(),
    embeddings: v.array(v.float64()),
    metadata: v.object({
      chatbotId: v.id("chatbot"),
    }),
  })
    .vectorIndex("byEmbeddings", {
      vectorField: "embeddings",
      filterFields: ["metadata.chatbotId"],
      dimensions: 1024,
    })
    .index("by_chatbot", ["metadata.chatbotId"]),
});
when I remove the filter, it works, but of course I need that filter to only search based on a chatbot's resources
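Since the unfiltered search still returns results, one possible stopgap (not something suggested in this thread) is to over-fetch without the filter and then filter in application code. The sketch below shows only the filtering step as a plain function. The type names, the function name postFilterByChatbot, and the idea of first loading the matching documents into a Map are all assumptions made for illustration; ctx.vectorSearch itself returns only _id and _score for each hit, so the documents would have to be loaded separately.

```typescript
// Hypothetical shapes for illustration; these are not Convex types.
type ScoredId = { _id: string; _score: number };
type Doc = { _id: string; text: string; metadata: { chatbotId: string } };

// Keep only hits whose document belongs to `chatbotId`, preserving the
// incoming (score) order, and stop once `limit` hits have been collected.
function postFilterByChatbot(
  hits: ScoredId[],
  docsById: Map<string, Doc>,
  chatbotId: string,
  limit: number,
): ScoredId[] {
  const kept: ScoredId[] = [];
  for (const hit of hits) {
    if (kept.length >= limit) break;
    const doc = docsById.get(hit._id);
    // Hits whose document was not loaded are skipped rather than guessed at.
    if (doc !== undefined && doc.metadata.chatbotId === chatbotId) {
      kept.push(hit);
    }
  }
  return kept;
}

// Assumed flow inside the action (not shown as runnable code):
//   1. call ctx.vectorSearch without `filter` and with a larger `limit`,
//   2. load those documents through an internal query,
//   3. pass both to postFilterByChatbot(hits, docsById, args.chatbotId, 5).
```

The trade-off is that over-fetching can still miss relevant documents when most of the nearest neighbours belong to other chatbots, so this is only a temporary measure compared with a working index filter.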
Emma, 6mo ago
that's helpful to know! I'll try to repro and figure out why filtering is broken
presley, 6mo ago
I was able to reproduce. The issue seems related to the fact that we are filtering on a nested field (or at least I couldn't reproduce it without nesting). I am investigating to find the root cause. As a stopgap, if you remove the nesting, my bet is it will work, but I'm not sure how viable that is for you.
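For reference, removing the nesting might look roughly like the sketch below. It assumes chatbotId is moved from metadata to a top-level field, which is a hypothetical schema change that would also require migrating existing documents; the rest mirrors the schema posted above.

```typescript
// convex/schema.ts (sketch: chatbotId moved out of `metadata` to the top level)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  document: defineTable({
    text: v.string(),
    embeddings: v.array(v.float64()),
    // Assumption: existing documents are migrated so this field is populated.
    chatbotId: v.id("chatbot"),
  }).vectorIndex("byEmbeddings", {
    vectorField: "embeddings",
    // Top-level field instead of "metadata.chatbotId".
    filterFields: ["chatbotId"],
    dimensions: 1024,
  }),
});

// The filter in the action would then reference the top-level field:
//   filter: (q) => q.eq("chatbotId", args.chatbotId),
```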
Khalil (OP), 6mo ago
It was working some time ago! My guess is that something changed recently
presley, 6mo ago
The bug is related to nested fields. We are working on rolling out a fix. The issue is a bit tricky: we always showed recently added results (from within the last hour), but because the nesting was handled incorrectly, we weren't showing historical results. So when you tested, it might have appeared to work, which I think led to the confusion (and to us not catching it).

We have fixed the issue and deployed a fix. Again, it only affected filtering on nested fields for data older than 1h. For historical data to become available, you should drop and re-add the index. We will look into doing this automatically next week, but it's easier to do it yourself for now. For example, you can remove the filter field from the index, deploy, and then add the field back. Alternatively, you could add another filter field such as "text" and deploy with `filterFields: ["metadata.chatbotId", "text"]`. Adding the extra field forces the index to rebuild. Later you can remove the "text" filter field again. Thanks for the report, and sorry about the trouble!
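The second option, temporarily adding an extra filter field so the index rebuilds, could look like this when applied to the schema posted earlier in the thread. It is a sketch of the intermediate deploy only, not a permanent schema, and the comments are illustrative.

```typescript
// convex/schema.ts (temporary: deploy this, then remove "text" and deploy again)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  document: defineTable({
    text: v.string(),
    embeddings: v.array(v.float64()),
    metadata: v.object({
      chatbotId: v.id("chatbot"),
    }),
  })
    .vectorIndex("byEmbeddings", {
      vectorField: "embeddings",
      // "text" is added only to change the index definition and force a rebuild.
      filterFields: ["metadata.chatbotId", "text"],
      dimensions: 1024,
    })
    .index("by_chatbot", ["metadata.chatbotId"]),
});
```

Once the rebuilt index has finished backfilling, the "text" entry can be dropped from filterFields in a follow-up deploy, as described above.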
Khalil (OP), 6mo ago
wow that was a fast resolution, thank you!
