8 Replies
Thanks for posting in <#1088161997662724167>.
Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets.
- Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.)
- Use search.convex.dev to search Docs, Stack, and Discord all at once.
- Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI.
- Avoid tagging staff unless specifically instructed.
Thank you!
Not at the moment, added feature request label to this post.
When I've wanted this, I've written a custom query in the dashboard UI - you can fetch from the table and do a
.includes
in typescriptany update on this?
.includes
is not really an option with thousands of entries
I thought Full Text Search works like that, but it only matches the start of words - any plans on extending that to match inside words? thanks!
https://docs.convex.dev/search/text-searchFull Text Search | Convex Developer Hub
Run search queries over your Convex documents
You can use indexes in the custom query if you can narrow it down some. Otherwise you could write an action that does the fuzzy search over pages of data, that you run from the dashboard.
Scanning pages of data is what we'd need to do as well, so it's just as (in)efficient to do it yourself with whatever other behavior you want. Or a query that uses the text search index (if using that index isn't supported in the dashboard)
We used to have text search for mid-words but its performance and behavior wasn't worth it on the infra side, and there wasn't much customer interest so we removed it.
Teams that want more powerful text search often set up Algolia with streaming export / periodic sync. Not sure if that's overkill for you
@Ian Algolia is definitely overkill in our case
other issue is pagination, it's weird to filter after querying, because I might load multiple pages without hits
my use case: we have something like 10 thousand products in a list, I load that paginated, and users should be able to search for any part of the name of the product
it's not a huge deal, but I'm used to SQL and looks like this is suboptimal compared to "where name like %text%"
Yeah I'd approach it differently if it's within-product or an analitical workload.
I can give some context on why doing this in Convex is similar to SQL, as well as some recommendations
sql comparison
Under the hood, the SQL "LIKE" fuzzy matching with an OLTP SQL database is a table scan (without full text indexing) - so it will also be reading many pages and potentially finding nothing for most pages. And it'd likely be loading each row fully into memory, similar to Convex, unless you set up a column-oriented store (OLAP).
These sorts of queries can work at smaller scale and can become highly problematic in an OLTP setting when there's read/write locks taking place, so you'd want to be doing that in pages within a transaction, and probably on a stale read replica. So there isn't a fundamental difference between the approach with an action.
tactical options
Assuming there will be ~10k products in a list, I would decide between these options:
- use the built-in prefix text search, and possibly index a field that has more partial words that you put in there - e.g. also put in the word stems, common misspellings, etc.
- use an action that makes as few as 2 queries (if the rows are small) sequentially to read all the data and post-filter. One nice thing here is that if the products aren't changing, the query will be cached for when the same string is submitted, so you'd only pay the bandwidth on the first pass.
- use elastic search or algolia, which aren't too expensive and have optimized the crap out of this and can also offer things like biasing the title over the description over comments etc.
- use streaming export to dump the products into an OLAP DB and make sql queries against it
- maybe unconventional, but you could be doing text search for immediate results on prefixes, and then paginating all products to the client when there's a search with no results. This seems a bit crazy until you take into account: 1. once they all loaded, every search is instant w/ no server round-trip. 2. Every page will likely be a cache hit unless you change a product, so they'll return quickly and cost no bandwidth. In pages of 1000 (scale depending on the payload size), it'd be 10 round trips to the server. But if the product info changes frequently (e.g. if the inventory is on the product table and not a separate table) then the queries might become expensive.
We want to add a DuckDB integration where you could query against a slightly-stale column-oriented replica of select convex tables, but we haven't shipped that. That might be the best option when that's available?
thanks a lot for the detailed explanation! I'll look into these options, but likely go with the simplest solution for now and we'll see how it feels day to day