Storing OpenAI embeddings: "length limit exceeded"
I'm generating OpenAI embeddings in an action, and when I try to pass them to a mutation for saving I run into a "length limit exceeded" error.
I'm assuming my only option is to pass them to the mutation in batches? Hoping to avoid that, since it's not ideal from a transactional point of view.
In terms of the quantity I'm trying to pass to the mutation: it's just 100 vectors, plus the original text, which is approximately 69k characters long.
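For context, a minimal sketch of the pattern being described, with hypothetical names (`internal.embeddings.saveAll`, `originalText`, and `allEmbeddings` are made up for illustration):

```ts
// Inside a Convex action, after generating the embeddings:
// passing all 100 vectors plus the raw text to a single mutation call
// is what triggers the "length limit exceeded" error described above.
await ctx.runMutation(internal.embeddings.saveAll, {
  text: originalText,        // ~69k characters
  embeddings: allEmbeddings, // 100 vectors
});
```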
Hey @yarrichar, for simplicity I would handle each embedding in its own mutation call.
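A minimal sketch of that suggestion, assuming a Node action, the official `openai` client, and a hypothetical `internal.embeddings.save` mutation (none of these names come from the thread):

```ts
"use node";
import OpenAI from "openai";
import { v } from "convex/values";
import { action } from "./_generated/server";
import { internal } from "./_generated/api";

export const embedAndStore = action({
  args: { chunks: v.array(v.string()) },
  handler: async (ctx, { chunks }) => {
    const openai = new OpenAI();
    const response = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: chunks,
    });
    // One mutation call per embedding keeps each argument payload small,
    // at the cost of losing a single transaction for the whole batch.
    for (let i = 0; i < response.data.length; i++) {
      await ctx.runMutation(internal.embeddings.save, {
        chunk: chunks[i],
        embedding: response.data[i].embedding,
      });
    }
  },
});
```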
Yeah, that's what I ended up doing. It's a bit of an annoying limit, though, since it (1) makes it basically impossible to save updated embeddings without a window of either duplicated or missing data, and (2) greatly increases the number of function calls I have to use.
I also believe I'm saving less than both the max DB persist limit and the max function argument size (8 MiB): about 22 KB per vector × 100, plus ~70 KB for the raw text.
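For reference, plugging in those numbers: 100 vectors × ~22 KB ≈ 2.2 MB, plus ~70 KB of raw text, comes to roughly 2.3 MB, well under the 8 MiB function argument limit.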
I guess I'd hit that with larger documents anyway though
This does sound like a bug on our side, will investigate, thanks!
Are you calling this from a "use node" action by any chance?
Yeah, I am calling it from "use node".
@Michal Srb now I'm getting a different error when trying to do the save. It was a bug on my end that I was trying to save over 1k at once, but I guess I could get there if I have multiple people saving at the same time... Is one way to handle this to schedule each of the saves (with timeout = 0)?
Is one way to handle this to schedule each of the saves (with timeout = 0)?
Yeah, or manage the batch size (say 20 at a time).
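A sketch combining both ideas, splitting the rows into batches of 20 and scheduling one mutation per batch with a delay of 0 (the `internal.embeddings.saveBatch` mutation and the field names are hypothetical):

```ts
import { v } from "convex/values";
import { action } from "./_generated/server";
import { internal } from "./_generated/api";

const BATCH_SIZE = 20;

export const storeEmbeddings = action({
  args: {
    rows: v.array(
      v.object({ chunk: v.string(), embedding: v.array(v.float64()) })
    ),
  },
  handler: async (ctx, { rows }) => {
    // Schedule one mutation per batch. runAfter(0, ...) enqueues the work
    // immediately, but each batch still runs as its own transaction, so no
    // single mutation writes too many documents.
    for (let i = 0; i < rows.length; i += BATCH_SIZE) {
      await ctx.scheduler.runAfter(0, internal.embeddings.saveBatch, {
        rows: rows.slice(i, i + BATCH_SIZE),
      });
    }
  },
});
```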
Example of batching: https://github.com/langchain-ai/langchainjs/blob/d7a9803ed4dbaae0202617cebf73b170f5db0335/libs/langchain-community/src/vectorstores/convex.ts#L178-L191