Matt Luo, 11mo ago

Vector Search | Convex Developer Hub

I'm a prospective Convex customer needing clarification on vector storage pricing. The tooltip for vector storage on the pricing page says: "Includes the size of any vectors within a document that are stored in a vector index." And in the Professional plan, the tooltip says "Then $10 per GB".
1) What exactly is being measured in GB? A literal interpretation is that the data in vectorized form is measured and billed. But is it really $10/GB for vector data?
2) Which technique does Convex use (or encourage the user to use) to convert business text data to vectors? I only found this in the docs: "If you're using embeddings, this dimension should match the size of your embeddings (e.g. 1536 for OpenAI)." https://docs.convex.dev/vector-search
3) Any general advice on how a Convex customer should forecast vector storage billing costs for a given business text data? For most customers, is it essentially forecasting the space needed for OpenAI embeddings? This seems like the most important financial planning task for a Convex customer with significant vector searching in the app.
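For context on the docs sentence quoted in question 2: the dimension is declared when the vector index is defined in the schema. The following is a minimal sketch, assuming a hypothetical `emails` table with hypothetical field names (`subject`, `body`, `embedding`) and a hypothetical index name; only the 1536 figure comes from the docs quote above.

```typescript
// convex/schema.ts (sketch; table, field, and index names are hypothetical)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  emails: defineTable({
    subject: v.string(),
    body: v.string(),
    // The embedding is stored on the document as an array of numbers.
    embedding: v.array(v.float64()),
  }).vectorIndex("by_embedding", {
    vectorField: "embedding",
    // Must match the length of the embeddings you store,
    // e.g. 1536 for the OpenAI embeddings mentioned in the docs.
    dimensions: 1536,
  }),
});
```

Convex does not generate the embeddings in this sketch; the vectors would be produced elsewhere (for example by an embedding API) and written into the `embedding` field.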
9 Replies
Tom Redman, 11mo ago
Hey @Matt Luo! Thanks for the ping - let me ask around to make sure I get you correct info. Hang tight!
Tom Redman, 11mo ago
1) I have an example for you. The pricing is not for vector data, but just the index data, which is usually much smaller than the vector data itself. Here's a vector index I have, on a table with 2,049 vectorized documents (each row is about 50 KB). On a database of ~2,049 rows (100 MB), the vector index is 24.01 MB.
[Two screenshots attached; no description available]
Tom Redman, 11mo ago
So here, the vector index size is about 25% of the primary documents db.
For 2) I've personally used the OpenAI embedding approach as described here: https://platform.openai.com/docs/guides/embeddings/embedding-models Do you want any more info beyond what's provided there?
For 3), you asked: "Any general advice on how a Convex customer should forecast vector storage billing costs for a given business text data? For most customers, is it essentially forecasting the space needed for OpenAI embeddings?"
This is largely up to you. It'll be the sum of the sizes of the rows you have stored in Convex. In one case I've built, I store the email body, subject, metadata AND its vectorized representation, which makes it really easy to look up, as it all stays within Convex. These rows are pretty data heavy (full email body, etc.) and each one is about 50 KB, so 2k documents is 100 MB. As mentioned above, the vector index for this table is about 24 MB. An alternative, if your documents are very large or need to be stored offsite for any reason, is to keep only the vector plus a reference to your external doc/data. This would keep data usage way down, but it would add fetching complexity on your end.
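As a rough way to forecast the index size from the numbers in this example, here is a back-of-the-envelope sketch. It rests on assumptions that are not confirmed anywhere in this thread: that the vectors have 1,536 dimensions (the thread only says OpenAI embeddings were used), that each component takes 8 bytes in the index, and that "MB" means 1024 × 1024 bytes. The function name is made up for illustration.

```typescript
// Assumptions (NOT confirmed by Convex): 8 bytes per vector component,
// and "MB" in the dashboard meaning 1024 * 1024 bytes.
const BYTES_PER_COMPONENT = 8;
const BYTES_PER_MB = 1024 * 1024;

// Hypothetical helper: raw size of all indexed vectors, in bytes.
function estimateVectorIndexBytes(documentCount: number, dimensions: number): number {
  return documentCount * dimensions * BYTES_PER_COMPONENT;
}

// 2,049 documents as in the example above; 1,536 dimensions is an assumption.
const indexBytes = estimateVectorIndexBytes(2049, 1536);
const indexMb = indexBytes / BYTES_PER_MB;

console.log(indexBytes);         // 25178112
console.log(indexMb.toFixed(2)); // 24.01
```

Under these assumptions the estimate lands on the 24.01 MB figure reported above, which suggests (but does not prove) that index size scales as document count × dimensions × 8 bytes. Treat it as a planning heuristic and check it against your own dashboard numbers.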
Matt Luo (OP), 11mo ago
Thanks @Tom Redman , that's a good example because it distinguishes vector data usage from vector index usage. But I still think an explicit breakdown of pricing would help. So, is the price of the vectorized representation of your email data charged at $0.20 per GB? And the $10/GB price is only applied to the 24.01 MB of vector index usage and nothing else?
Tom Redman, 11mo ago
This is correct, although I will double check right now just to make sure.
Matt Luo (OP), 11mo ago
If so, this sentence, "Includes the size of any vectors within a document that are stored in a vector index.", is probably causing confusion. It's ambiguous because one could interpret "Includes" to mean there is some superset of activity being priced at $10/GB, which could scare off a customer. Or it could mean "Only includes", which seems more likely but takes some Convex-specific knowledge to arrive at.
Tom Redman, 11mo ago
Totally agree. Let me get a definitive answer for you, and if that's the case, we'll also update the website to make that more clear.
@Matt Luo - FYI I have confirmed this is the case:
"So, is the price of the vectorized representation of your email data charged at $0.20 per GB? And the $10/GB price is only applied to the 24.01 MB of vector index usage and nothing else?"
I will get the website updated for clarity! Thank you
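To make the confirmed breakdown concrete, here is a small arithmetic sketch using only the two rates and two sizes quoted in this thread. It ignores whatever amount is included in the plan before the "Then $10 per GB" rate kicks in, assumes 1 GB = 1024 MB, and uses a hypothetical helper name.

```typescript
// Rates quoted in this thread.
const DOCUMENT_RATE_PER_GB = 0.2;    // stored documents, including the vector field
const VECTOR_INDEX_RATE_PER_GB = 10; // vector index only

// Assumption: 1 GB = 1024 MB. Any included plan allowance is ignored.
const MB_PER_GB = 1024;

// Hypothetical helper: cost of `sizeMb` megabytes at `ratePerGb` dollars per GB.
function costAtRate(sizeMb: number, ratePerGb: number): number {
  return (sizeMb / MB_PER_GB) * ratePerGb;
}

const documentCost = costAtRate(100, DOCUMENT_RATE_PER_GB);          // ~100 MB of rows
const vectorIndexCost = costAtRate(24.01, VECTOR_INDEX_RATE_PER_GB); // 24.01 MB index

console.log(`documents:    $${documentCost.toFixed(4)}`);
console.log(`vector index: $${vectorIndexCost.toFixed(4)}`);
```

At these sizes both amounts are cents, but the index is billed at 50 times the per-GB rate of document storage, so it dominates as the table grows.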
Matt Luo (OP), 11mo ago
Ok thanks, @Tom Redman! I hope I converted some more prospects for Convex. Great product!
Tom Redman, 11mo ago
Same here, thanks so much. Really appreciate you!
