Rodrigo-R
Rodrigo-R•7mo ago

Embeddings text-embedding-3-small vs text-embedding-ada-002

Hi, I just deleted all my embeddings and re-parsed everything using the cheaper text-embedding-3-small, but after the change my match rate went from 0.7 - 0.8 to 0.2 using vectorSearch with the same documents and same queries. The only thing I changed was the model. I'm using the same model to get embeddings for the documents and the query! What am I missing? 🙂 Regards!
4 Replies
ian
ian•7mo ago
I'd play around with some sample data: fetch a few samples, maybe just using fetch in a node shell, and manually validate. The dot product works as a similarity measure if the vectors are normalized. It's possible the model doesn't do as good of a job for your semantic space. If it's something Convex-specific I'd be surprised, but let us know if that's what you find.
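A minimal sketch of that manual check, assuming you already have two embedding vectors as plain number arrays. The helper names (`dot`, `norm`, `normalize`, `cosineSimilarity`) are hypothetical and not part of Convex or the OpenAI API; they only illustrate why the dot product equals cosine similarity once the vectors have unit length.

```typescript
// Hypothetical helpers for manually validating embedding similarity.

// Dot product of two vectors of equal length.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Euclidean length of a vector.
function norm(a: number[]): number {
  return Math.sqrt(dot(a, a));
}

// Scale a vector to unit length; rejects the zero vector.
function normalize(a: number[]): number[] {
  const n = norm(a);
  if (n === 0) {
    throw new Error("cannot normalize a zero vector");
  }
  return a.map((x) => x / n);
}

// Cosine similarity, valid whether or not the inputs are normalized.
function cosineSimilarity(a: number[], b: number[]): number {
  return dot(normalize(a), normalize(b));
}

// Toy vectors standing in for real embeddings (assumption: real ones
// would come from the embeddings endpoint and have many more dimensions).
const docVec = [3, 4];
const queryVec = [4, 3];

console.log("norms:", norm(docVec), norm(queryVec));
console.log("raw dot product:", dot(docVec, queryVec));
console.log("cosine similarity:", cosineSimilarity(docVec, queryVec));
```

If `norm` comes back close to 1 for the vectors you fetched, the raw dot product and the cosine similarity should agree. Keep in mind that absolute scores from two different embedding models are not on the same scale, so it may be more telling to check whether the expected documents still rank first than to compare raw values across models.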
ian
ian•7mo ago
Gist
Implementation of chat completions and embeddings for any OpenAI-co...
Implementation of chat completions and embeddings for any OpenAI-compliant services, using browser fetch and no imports/dependencies - llm.ts
ian
ian•7mo ago
And a post on embeddings in general I wrote a year ago: https://stack.convex.dev/the-magic-of-embeddings
The Magic of Embeddings
Embeddings, why they're useful, and how we can store and use them in Convex.
Rodrigo-R
Rodrigo-R OP•7mo ago
Thanks Ian, I was playing around, and so far I can't find anything different other than the actual model. I actually replaced my code with an example from Convex and got the same results. I'll play around a bit more.
