yokoli
yokoli • 17mo ago

Alternatively, would love to get help on how to use Transformers.js with Convex to generate embeddings. I kept getting this strange error on Convex when trying to use @xenova/transformers 🧵
yokoli
yokoli OP • 17mo ago
What I have:

import { pipeline } from '@xenova/transformers';

export async function fetchEmbeddingBatchLocal(
  texts: string[],
): Promise<{ embeddings: number[][] }> {
  const generateEmbedding = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
  // Generate a vector using Transformers.js
  const outputs = await Promise.all(
    texts.map((text) =>
      generateEmbedding(text, {
        pooling: 'mean',
        normalize: true,
      }),
    ),
  );
  // Extract the embedding output
  const embeddings: number[][] = outputs.map((output) => Array.from(output.data));
  return {
    embeddings,
  };
}

export async function fetchEmbeddingLocal(text: string) {
  const { embeddings } = await fetchEmbeddingBatchLocal([text]);
  return { embedding: embeddings[0] };
}

Getting "Uncaught Error: no available backend found. ERR: [wasm] TypeError: (0 , i.cpus) is not a function"
rkbh
rkbh • 17mo ago
Hey @yokoli, looking into this right now! Will get back to you in a bit.

Unfortunately, it doesn't look like we can support @xenova/transformers at this time 😦. At runtime, it depends on local onnx files which our bundler is unable to bundle at push time, leading to this runtime issue. We have an upcoming feature to support packages like this with dynamic runtime dependencies in our Node runtime in Convex 1.4 (coming soon). There is still, however, a 250 MB unzipped limit for packages enforced by AWS Lambda, which is likely to be exceeded by heavy packages like this one that include local ML model files. Supporting larger packages is an active area of work for us, so hopefully we can support this in the near future.
yokoli
yokoli OP • 17mo ago
Thanks for looking into this @rkbh! FWIW, @xenova/transformers' unpacked size seems to be only 45 MB, and the onnx files are ~110 MB in total, so it's just shy of the 250 MB limit. In this case, curious why using the Node runtime wouldn't work out of the box?

@sujayakar would be amazing if we can get this approach working so we don't have to bother you to extend the vector db dimensions 😆 😆 🙏 Thanks so much team!!!!
rkbh
rkbh • 17mo ago
@yokoli yep, the package itself is only 45 MB, but the dependencies it pulls in make your node_modules folder 254 MB (on OS X with @xenova/transformers=2.6.2), which just exceeds the Lambda limit.

The reason it doesn't work out of the box with Node is that our Node runtime uses a similar bundling process to the Convex runtime, which can't pick up those onnx files by default.
yokoli
yokoli OP • 17mo ago
Ah, I see. And there's no way to tell it to pick those up by updating the vite.config?
rkbh
rkbh • 17mo ago
Yep, there is a way: mark it as an external package (https://docs.convex.dev/functions/bundling#specifying-external-packages) in your convex.json. You would place this there:
{
"node": {
"externalPackages": ["*"]
}
}
You'll have to update to the newest Convex version 1.4.1 to use this feature btw.
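For reference, a minimal sketch of how a Node action could then consume the externally bundled package, assuming the helpers from the earlier snippet live in a hypothetical convex/embeddings.ts (the file names and import path below are illustrative, not from this thread):

// convex/generateEmbedding.ts (hypothetical file name)
// "use node" opts this file into the Node.js runtime, where externalPackages applies.
"use node";

import { action } from "./_generated/server";
import { v } from "convex/values";
import { fetchEmbeddingLocal } from "./embeddings"; // assumed location of the snippet above

export const embed = action({
  args: { text: v.string() },
  handler: async (_ctx, { text }) => {
    // Runs @xenova/transformers inside the Node action; still subject to the Lambda size limit.
    const { embedding } = await fetchEmbeddingLocal(text);
    return embedding;
  },
});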
rkbh
rkbh • 17mo ago
But this will hit the 250 MB Lambda limit. I tried it locally earlier 😦
yokoli
yokoli OP • 17mo ago
ahhhhhh got it 😢 thanks so much for looking into this though @rkbh
rkbh
rkbh • 17mo ago
Thanks sm for mentioning this, seems like it would be really useful to support this package (and ones like it). Hopefully we'll be able to soon!!
yokoli
yokoli OP • 17mo ago
And yeah, if we can support Transformers.js it'll unlock a plethora of AI use cases.

OK, another idea: what if Convex hosts a small out-of-the-box embeddings model for people to call from their Convex apps? I imagine this could look like a Convex function that only does generateEmbedding(text, { pooling: 'mean', normalize: true,
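Roughly what that proposal could look like from the app side, purely as a hypothetical sketch — the hostedEmbedding helper below is made up for illustration and is not an existing Convex API:

// Hypothetical sketch only; `hostedEmbedding` does NOT exist in Convex today.
import { action } from "./_generated/server";
import { v } from "convex/values";

// Imagined built-in: Convex hosts a small embedding model and exposes it as a helper.
declare function hostedEmbedding(
  text: string,
  opts: { pooling: "mean"; normalize: boolean },
): Promise<number[]>;

export const embed = action({
  args: { text: v.string() },
  handler: async (_ctx, { text }) => {
    return await hostedEmbedding(text, { pooling: "mean", normalize: true });
  },
});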
rkbh
rkbh • 17mo ago
This is something @Indy and @sujayakar were thinking about recently, I believe! I'll defer to them on this
Indy
Indy • 17mo ago
We've brainstormed about hosting some models directly, though we've seen most folks just happily use remote hosted models. But I do know there is some fascinating work happening in small embeddable models, and this could be a fascinating product to build out. Going to have to noodle on it some more.
yokoli
yokoli OP • 17mo ago
My pitch: we can unlock tons of latent demand by telling people "AI Town takes $0 to run". We can start small with an embeddings model, and devs can use that embedding model for AI Town or any other AI apps they build.
sujayakar
sujayakar • 17mo ago
One other idea @yokoli: could you use the setup you were using with llama2, but with a different model? Rather than using llama2's internal embedding, one idea would be to use an open embedding model like https://huggingface.co/BAAI/bge-large-en-v1.5 (1024 dimensions). I wonder if it'd perform better on the memory retrieval stuff we do in AI Town too, since it's purpose-trained for that.
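One way to try a model like that without bundling anything locally is to call a hosted inference endpoint from an action. A minimal sketch, assuming the Hugging Face Inference API's feature-extraction endpoint and an HF_API_KEY environment variable (both assumptions, not something discussed in this thread):

import { action } from "./_generated/server";
import { v } from "convex/values";

export const remoteEmbed = action({
  args: { text: v.string() },
  handler: async (_ctx, { text }) => {
    // Assumed endpoint shape for Hugging Face's hosted feature-extraction pipeline.
    const response = await fetch(
      "https://api-inference.huggingface.co/pipeline/feature-extraction/BAAI/bge-large-en-v1.5",
      {
        method: "POST",
        headers: {
          Authorization: `Bearer ${process.env.HF_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ inputs: text }),
      },
    );
    // bge-large-en-v1.5 should return a 1024-dimensional vector for a single input.
    const embedding: number[] = await response.json();
    return embedding;
  },
});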
yokoli
yokoli OP • 17mo ago
Unfortunately the BERT models (including the ones you sent above) don't yet work on Ollama for local inference 😦 that's why I was looking into either a smaller LLM with embeddings OR Transformers.js.
sujayakar
sujayakar • 17mo ago
ahh okay, makes sense.
