Storing OpenAI embeddings: "length limit exceeded"
I'm generating OpenAI embeddings in an action, and when I try to pass them to a mutation for saving I run into a "length limit exceeded" error.
I'm assuming my only option is to pass them to the mutation in batches? Hoping to avoid that, since it's not ideal from a transactional point of view.
In terms of the quantity I'm trying to pass to the mutation: it's just 100 vectors, plus the original text, which is approximately 69k characters long.
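For context, a minimal sketch of the pattern being described, with hypothetical names (`internal.embeddings.saveAll`, `originalText`, and `allEmbeddings` are made up for illustration):

```ts
// Inside a Convex action, after generating the embeddings:
// passing all 100 vectors plus the raw text to a single mutation call
// is what triggers the "length limit exceeded" error described above.
await ctx.runMutation(internal.embeddings.saveAll, {
  text: originalText,        // ~69k characters
  embeddings: allEmbeddings, // 100 vectors
});
```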
Hey @yarrichar, for simplicity I would handle each embedding in its own mutation call.
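A minimal sketch of that suggestion, assuming a Node action, the official `openai` client, and a hypothetical `internal.embeddings.save` mutation (none of these names come from the thread):

```ts
"use node";
import OpenAI from "openai";
import { v } from "convex/values";
import { action } from "./_generated/server";
import { internal } from "./_generated/api";

export const embedAndStore = action({
  args: { chunks: v.array(v.string()) },
  handler: async (ctx, { chunks }) => {
    const openai = new OpenAI();
    const response = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: chunks,
    });
    // One mutation call per embedding keeps each argument payload small,
    // at the cost of losing a single transaction for the whole batch.
    for (let i = 0; i < response.data.length; i++) {
      await ctx.runMutation(internal.embeddings.save, {
        chunk: chunks[i],
        embedding: response.data[i].embedding,
      });
    }
  },
});
```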
Yeah, that's what I ended up doing. It's a bit of an annoying limit, though, since it (1) makes it basically impossible to save updated embeddings without a window of either duplicated or missing data, and (2) greatly increases the number of function calls I have to use.
I also believe I'm saving less than both the max DB persist limit and the max function argument size (8 MiB): about 22 KB per vector × 100, plus ~70 KB for the raw text.
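For reference, plugging in those numbers: 100 vectors × ~22 KB ≈ 2.2 MB, plus ~70 KB of raw text, comes to roughly 2.3 MB, well under the 8 MiB function argument limit.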
I guess I'd hit that with larger documents anyway though
This does sound like a bug on our side, will investigate, thanks!
Are you calling this from a "use node" action by any chance?
Yeah, I am calling it from "use node".
@Michal Srb now I'm getting a different error when trying to do the save. It was a bug on my end that I was trying to save over 1k at once, but I guess I could get there if I have multiple people saving at the same time... Is one way to handle this to schedule each of the saves (with timeout = 0)?
Is one way to handle this to schedule each of the saves (with timeout = 0)?
Yeah, or manage the batch size (say 20 at a time).
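A sketch combining both ideas, splitting the rows into batches of 20 and scheduling one mutation per batch with a delay of 0 (the `internal.embeddings.saveBatch` mutation and the field names are hypothetical):

```ts
import { v } from "convex/values";
import { action } from "./_generated/server";
import { internal } from "./_generated/api";

const BATCH_SIZE = 20;

export const storeEmbeddings = action({
  args: {
    rows: v.array(
      v.object({ chunk: v.string(), embedding: v.array(v.float64()) })
    ),
  },
  handler: async (ctx, { rows }) => {
    // Schedule one mutation per batch. runAfter(0, ...) enqueues the work
    // immediately, but each batch still runs as its own transaction, so no
    // single mutation writes too many documents.
    for (let i = 0; i < rows.length; i += BATCH_SIZE) {
      await ctx.scheduler.runAfter(0, internal.embeddings.saveBatch, {
        rows: rows.slice(i, i + BATCH_SIZE),
      });
    }
  },
});
```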
Example of batching: https://github.com/langchain-ai/langchainjs/blob/d7a9803ed4dbaae0202617cebf73b170f5db0335/libs/langchain-community/src/vectorstores/convex.ts#L178-L191