Alternatively, would love to get help on how to use transformer.js with Convex to generate embeddings. I kept getting this strange error on Convex when trying to use @xenova/transformers 🧵
what i have:
import { pipeline } from '@xenova/transformers';

export async function fetchEmbeddingBatchLocal(
  texts: string[],
): Promise<{ embeddings: number[][] }> {
  // Load the feature-extraction pipeline (fetches the ONNX model on first use)
  const generateEmbedding = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
  // Generate a vector for each text using Transformers.js
  const outputs = await Promise.all(
    texts.map((text) =>
      generateEmbedding(text, {
        pooling: 'mean',
        normalize: true,
      }),
    ),
  );
  // Extract the embedding data from each output tensor
  const embeddings: number[][] = outputs.map((output) => Array.from(output.data));
  return {
    embeddings,
  };
}

export async function fetchEmbeddingLocal(text: string) {
  const { embeddings } = await fetchEmbeddingBatchLocal([text]);
  return { embedding: embeddings[0] };
}
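One note on the snippet above (an assumed restructuring, not something from this thread): the pipeline could be cached at module scope so the model only loads once per warm process rather than on every call:

import { pipeline } from '@xenova/transformers';

// Cache the pipeline promise so the model is only loaded once per process.
// (Assumes the process stays warm between invocations; cold starts still pay the load cost.)
let embedderPromise: Promise<any> | null = null;

function getEmbedder(): Promise<any> {
  embedderPromise ??= pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
  return embedderPromise;
}

// Then inside fetchEmbeddingBatchLocal:
//   const generateEmbedding = await getEmbedder();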
Getting "Uncaught Error: no available backend found. ERR: [wasm] TypeError: (0 , i.cpus) is not a function"
Hey @yokoli, looking into this right now! Will get back to you in a bit
Unfortunately, it doesn't look like we can support @xenova/transformers at this time 😦. At runtime, it depends on local onnx files which our bundler is unable to bundle at push-time, leading to this runtime issue.
We have an upcoming feature to support packages like this with dynamic runtime dependencies in our Node runtime in Convex 1.4 (coming soon). There is still, however, a 250MB unzipped limit for packages enforced by AWS Lambda, which is likely to be exceeded by heavy packages like this which include local ML model files.
Supporting larger packages is an active area of work for us, so hopefully we can support this in the near future.
Thanks for looking into this @rkbh! fwiw, @xenova/transformers' unpacked size seems to be 45 MB? The onnx files are ~110 MB in total, so it's just shy of the 250 MB limit.
In this case, curious why using the node runtime wouldn't work out of the box?
@sujayakar would be amazing if we can get this approach working so we don't have to bother you to extend the vector db dimensions 🙏 🙏 🙏
Thanks so much team!!!!
@yokoli yep, the package itself is only 45 MB, but the dependencies it pulls in make your node_modules folder 254 MB (on OS X with @xenova/transformers=2.6.2), which just exceeds the Lambda limit.
The reason it doesn't work out of the box with node is that our Node runtime uses a similar bundling process to the Convex runtime, which can't pick up those onnx files by default.
ah i see. and there's no way to tell it to pick it up by updating the vite.config?
Yep, there is a way: mark it as an external package (https://docs.convex.dev/functions/bundling#specifying-external-packages) in your convex.json. You would place this there:
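A rough idea of what that convex.json entry could look like, based on the externalPackages format described in the bundling docs linked above (treat the docs as the source of truth):

{
  "node": {
    "externalPackages": ["@xenova/transformers"]
  }
}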
You'll have to update to the newest Convex version, 1.4.1, to use this feature btw.
But this will hit the 250 MB Lambda limit. I tried it locally earlier 😦
ahhhhhh got it 😢
thanks so much for looking into this though @rkbh
Thanks sm for mentioning this, seems like it would be really useful to support this package (and ones like it). Hopefully we'll be able to soon!!
and yeah if we can support transformer.js it'll unlock a plethora of AI use cases
ok another idea: what if convex hosts a small out-of-the-box embeddings model for people to call from their convex apps?
i imagine this could look like a convex function that only does:
generateEmbedding(text, {
  pooling: 'mean',
  normalize: true,
})
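A loose sketch of what that app-facing function might look like, assuming some hosted embeddings endpoint exists (EMBEDDINGS_URL and its request/response shape are hypothetical, not an actual Convex service):

// convex/embeddings.ts -- sketch only; the hosted endpoint is an assumption
import { action } from "./_generated/server";
import { v } from "convex/values";

export const embed = action({
  args: { text: v.string() },
  handler: async (_ctx, { text }) => {
    // Call a (hypothetical) hosted model that returns mean-pooled, normalized vectors
    const response = await fetch(process.env.EMBEDDINGS_URL!, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ texts: [text] }),
    });
    const { embeddings } = (await response.json()) as { embeddings: number[][] };
    return embeddings[0];
  },
});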
This is something @Indy and @sujayakar were thinking about recently, I believe! I'll defer to them on this
We've brainstormed about hosting some models directly. Though, we've seen most folks just happily use remote hosted models. But I do know there is some fascinating work happening in small embeddable models, and this could be a fascinating product to build out. Going to have to noodle on it some more.
my pitch: we can unlock tons of latent demand by telling people "ai town takes $0 to run". can start small with an embeddings model, and devs can use this embedding model for ai town or any other ai apps they build
one other idea @yokoli: could you use the setup you were using with llama2 but with a different model?
rather than using llama2's internal embedding, one idea would be to use an open embedding model like https://huggingface.co/BAAI/bge-large-en-v1.5 (1024 dimensions). I wonder if it'd perform better on the memory retrieval stuff we do in ai town too, since it's purpose-trained for that.
unfortunately the BERT models (including the ones you sent above) don't yet work on Ollama for local inference 😦 that's why I was looking into either a smaller LLM with embeddings OR transformer.js
ahh okay, makes sense.