David Alonso•7mo ago

Exponential number of indexes required?

I'm trying to build a component that shows data according to multiple filters, each of which can be set or unset. Can I just define one index containing all of these fields and then skip the equality filters (or set them to undefined) for the filters that are not set? If not, how should I construct my indexes? The list I'm filtering over is potentially very long.

27 Replies

RJ•7mo ago

I'm not sure, but maybe you could simulate this behavior by using q.gte and q.lte? I'll write an example…

people: defineTable({
  name: v.string(),
  age: v.number(),
})
  .index("by_age_and_name", ["age", "name"])

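// filter by name but not age
// (Number.MIN_VALUE is the smallest positive number, so this range is meant to match any positive age)
db.query("people").withIndex("by_age_and_name", (q) =>
  q.gte("age", Number.MIN_VALUE)
    .eq("name", "Joe")
)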

I'm not sure what the performance implications of this might be. I think the later in the chain you begin filtering, the worse the performance would be, but I'm not positive.

David AlonsoOP•7mo ago

Hmm I see, that's creative

RJ•7mo ago

As an aside, it would be nice if the sorting rules for each Convex data type were described in the docs (or if someone could point me to them, as I don't see them anywhere). I would guess that to get the same behavior with a string you would do this: q.gte("name", "").

Another hack of sorts that might make this feasible would be to have an index for each of the fields that you plan on filtering on, and have each index begin with a different field.

David AlonsoOP•7mo ago

I need infinite scroll pagination though in case you're suggesting joining them, which would get tricky

RJ•7mo ago

And then check which fields are set and use an index that has one of those fields in the first position. Why wouldn't this work in that case? Oh yeah, sorry, I'm not suggesting joining them, just choosing which index to use on the basis of the set fields. My b, didn't read your whole message.
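
The idea of choosing an index based on which filters are set could be sketched roughly as follows. This is plain TypeScript rather than Convex API code, and the `PeopleFilters` shape, the `city` field, the index names, and `chooseIndex` are all assumptions invented for the illustration.

```typescript
// Hypothetical filter shape: an unset field means "do not filter on this".
type PeopleFilters = { name?: string; age?: number; city?: string };

// Hypothetical index list: one index per filterable field,
// each beginning with a different field.
const INDEXES: { name: string; firstField: keyof PeopleFilters }[] = [
  { name: "by_name", firstField: "name" },
  { name: "by_age", firstField: "age" },
  { name: "by_city", firstField: "city" },
];

// Return the first index whose leading field has a filter set,
// or null when no filter is set (the caller would then scan without an index).
function chooseIndex(filters: PeopleFilters): string | null {
  for (const index of INDEXES) {
    if (filters[index.firstField] !== undefined) return index.name;
  }
  return null;
}
```

Any remaining set filters would still have to be applied after the index lookup, so this only narrows the scan by one field.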

David AlonsoOP•7mo ago

no worries, i'll give this a shot when i implement this and let you know

deen•7mo ago

You can't do this. Only the last part of an index query can be a gt/lt range; nothing else can follow it. I don't believe there's a way to specify "any" for an index field without just omitting it, which means that's the end of your query. I think the expected approach is to get as close as you can with an index, then use a filter for the rest. Finding the right balance will depend on your data.

Strings are sorted by ASCII value, the same as e.g. using .sort on an array of strings in JavaScript.

Your line of thinking is possible, and powerful. Look into lexicographical sorting, and Jira's implementation, LexoRank. I quickly decided the complexity was too high for my use cases. But it's a verrry interesting concept.
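
To make the rule above concrete, here is a minimal sketch that splits a set of filters into the part an index can serve (equality checks on leading fields, then at most one range) and the part left for post-filtering. It models the rule in plain TypeScript and does not call Convex; `FilterOp`, `QueryPlan`, and `planQuery` are hypothetical names.

```typescript
// Hypothetical description of a single filter on one field.
type FilterOp =
  | { field: string; kind: "eq"; value: unknown }
  | { field: string; kind: "range"; min?: number; max?: number };

interface QueryPlan {
  indexed: FilterOp[]; // can be expressed in the index range
  postFilter: FilterOp[]; // must be checked row by row afterwards
}

// Walk the index fields in order. Stop at the first unset field,
// and stop right after the first range, since nothing may follow it.
function planQuery(indexFields: string[], filters: FilterOp[]): QueryPlan {
  const byField = new Map<string, FilterOp>();
  for (const f of filters) byField.set(f.field, f);

  const indexed: FilterOp[] = [];
  for (const field of indexFields) {
    const f = byField.get(field);
    if (f === undefined) break; // omitted field ends the usable prefix
    indexed.push(f);
    if (f.kind === "range") break; // a range must be the last index step
  }
  const postFilter = filters.filter((f) => indexed.indexOf(f) === -1);
  return { indexed, postFilter };
}
```

With an index on `["age", "name"]` and only a `name` filter set, `indexed` comes back empty, which is why the order of fields in each index matters so much.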

David AlonsoOP•7mo ago

Maybe an unrelated question, but what if I wanted to keep track of aggregates for all possible combinations of filter values (including ones where some filters are unset)? How many aggregates would I need with the aggregate component?

RJ•7mo ago

Ah ok, that makes sense. Good to know! Would still be good to have documented.

ampp•7mo ago

I thought .gt and .gte could only follow a .eq, and you can't use them twice. So you really have to carefully plan out the use of the index.

RJ•7mo ago

Yeah, @deen pointed this out too. I believe I was wrong.

David AlonsoOP•7mo ago

anyone know this?

djbalin•7mo ago

Interesting, would like to know this as well. I guess at least you could create one aggregate that's a "chain" of some of your filters: a->b->c->d->.. and you can then query that aggregate by any number of filters beginning in the chain from a, if I'm not mistaken.

RJ•7mo ago

I don't, but this may be a good candidate for a new support thread (for better visibility, if nothing else)

lee•7mo ago

sounds like a math question, and i'm pretty sure the answer is exponential in the number of fields
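
A small counting sketch shows where the exponential growth comes from, assuming each of n filter fields is simply either set or unset. It also counts how many of those combinations a single chain a->b->c->d can answer as a prefix. The function names are made up for this illustration.

```typescript
// List every set/unset combination for the given filter fields.
// Each field doubles the number of combinations, giving 2^n in total.
function filterCombinations(fields: string[]): string[][] {
  let combos: string[][] = [[]];
  for (const field of fields) {
    combos = combos.concat(combos.map((c) => [...c, field]));
  }
  return combos;
}

// The non-empty prefixes that one chained index or aggregate can serve.
function chainPrefixes(fields: string[]): string[][] {
  return fields.map((_, i) => fields.slice(0, i + 1));
}

const fields = ["a", "b", "c", "d"];
console.log(filterCombinations(fields).length); // 16
console.log(chainPrefixes(fields).length); // 4
```

So with four fields a single chain covers 4 of the 15 non-empty combinations, and the gap widens quickly as fields are added.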

deen•7mo ago

I would love to hear if the team has any more specific advice on this problem in general - how to approach filtering documents based on many possible fields, that may or may not be specified. Obviously, tradeoffs must be made somewhere, it depends on your data etc., but this is an extremely common pattern found in pretty much any software bigger than a toy app. Other databases generally encapsulate many of these decisions from you, for better or worse. Convex asks you to think about and make these choices explicitly - but there is little guidance and plenty of confusion about how to approach this. I don't think the Convex story is very good here. How would you implement something like the Linear issues filter feature? Or how would the Convex team think about implementing a comprehensive search/filter system like this for say, Dropbox for instance? Any relevant wisdom would be greatly appreciated.

lee•7mo ago

I think the answer will be something like using an index for the most discriminating field and post-filtering with pagination for the rest (a la https://stack.convex.dev/complex-filters-in-convex ). But I'll check with the team to see if we can give a better answer.

Using TypeScript to Write Complex Query Filters

There’s a new Convex helper to perform generic TypeScript filters, with the same performance as built-in Convex filters, and unlimited potential.
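
The "index on the most discriminating field, then post-filter with pagination" approach can be sketched with an in-memory stand-in for the database. Nothing below is Convex API: `readPage` imitates reading one page from an index on a hypothetical `status` field, and `filteredPage` keeps reading pages until enough rows pass the remaining filters. The `Issue` shape and the numeric cursor are assumptions for the example.

```typescript
interface Issue { id: number; status: string; assignee: string }
interface Page<T> { page: T[]; continueCursor: number; isDone: boolean }

// Stand-in for an indexed, paginated read: rows with the given status,
// starting at `cursor` (an offset into the matching rows).
function readPage(rows: Issue[], status: string, cursor: number, numItems: number): Page<Issue> {
  const matching = rows.filter((r) => r.status === status);
  const page = matching.slice(cursor, cursor + numItems);
  const continueCursor = cursor + page.length;
  return { page, continueCursor, isDone: continueCursor >= matching.length };
}

// Read index pages until `wanted` rows survive the post-filter `keep`,
// advancing the cursor only past rows that were actually examined.
function filteredPage(
  rows: Issue[], status: string, keep: (row: Issue) => boolean,
  cursor: number, wanted: number, pageSize = 2,
): Page<Issue> {
  const kept: Issue[] = [];
  let next = cursor;
  let isDone = false;
  while (kept.length < wanted && !isDone) {
    const result = readPage(rows, status, next, pageSize);
    let consumed = 0;
    for (const row of result.page) {
      consumed += 1;
      if (keep(row)) kept.push(row);
      if (kept.length === wanted) break;
    }
    next += consumed;
    isDone = result.isDone && consumed === result.page.length;
  }
  return { page: kept, continueCursor: next, isDone };
}
```

The trade-off is that a very selective post-filter may read many index pages to fill one page of results, which is why the leading index field should discriminate as much as possible.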

David AlonsoOP•7mo ago

I think it would be cool as @deen suggests to write a stack post on how to build a production-grade use case that already exists (e.g. Linear) with indexes/filters in Convex

deen•7mo ago

Yeah, this is essentially what I suggested earlier. It requires having read a few different stack posts, and the lightbulb moment to combine them for this purpose, and probably just a bunch of Convex experience.

I don't know exactly why (maybe because it's so novel), but it didn't sink in for me for quite a while that, when I'm using Convex, it's really like just coding my own server in TypeScript (mainly). Breaking out of the "on rails" query/mutation/action pattern feels wrong and annoying at first (args types), but once you finally "get it", it's like seeing the Convex Matrix, and everything becomes so much more achievable. It's maybe not as obvious as the team thinks. I wonder how many people never make that leap.

Pagination in the docs is presented mainly as a frontend feature for React, with a small example at the end for use with something like Vite. But it's actually required for working with your own data once you get to an intermediate level, and it wasn't until after many re-reads of the Great Stack Post of All Time, "Stateful Online Migrations using Mutations", that this clicked for me. Also, be careful because Paginated Queries are a beta feature 😉

This definitely isn't Day 1 of Convex stuff, so I'm not exactly sure how to present it, but I think if you look at the most common questions that keep coming up here, making this leap is an issue. (see: "helper functions")

Indy•7mo ago

This is such incredible feedback @deen. First off, the team is well aware that Convex philosophically is non-obvious for a lot of folks. But as you mentioned, at some point it feels like seeing the matrix. That is why we wrote all these stack articles, but I acknowledge that the understanding is diffuse. There are a few things we'll be working on to help with this:

1. Generally update docs to nudge people towards patterns that help people build scalable apps. This is a balance, because the core docs are very focused on "this is how the api works."
2. Writing a more cohesive resource in a book-like style that will try to capture everything we've learned as a community together on walking up the ladder of feature complexity.
3. For this particular problem, we've had to delay the OLAP system we've mentioned in the community for a while as other priorities have come up. The idea is that you can get a delayed snapshot of your data for "large filter" applications that you can query more simply than explicit pagination. We'll make the tradeoff clear here.

Thanks for taking all this time and effort to learn Convex. We're going to make this better 🙂

deen•7mo ago

Please know I say all this with love! I've never spent so much time with a technology "product" before by choice. The Convex docs are actually really good, and stack is seriously incredible. I have to say, Ian in particular has such an incredible way of explaining a complex topic in a way that feels like it's personally tailored to my brain. I feel like I've learned so much that I can apply anywhere. Working with the Convex primitives makes navigating these interesting and challenging choices so much more manageable and ... fun? It's all kinda been like a delicious and addictive database onion to me.

Something in the nearer term that may be good bang for buck is adding some sort of "intermediate" level example/tutorial that's actually a part of the docs. I find myself jumping to these sections anywhere I encounter them pretty quickly. For me it's really valuable to see the pieces working together in context, even if I don't fully understand them yet. Much less hand-holdy than the Get Started tutorial, more like a clean implementation of a slightly more complex system, demonstrating some of these more advanced concepts working together. Even just seeing single line comments like // we define our validator here separately so we can infer the types for our helper functions can be such a powerful learning multiplier, one that makes non-obvious techniques feel natural.

Indy•7mo ago

You're not the first person to request this 🙂 Thanks for requesting this again, all the more motivation for us to put it together.

David AlonsoOP•6mo ago

I'd also love to see this! We've kinda outgrown the hello-world examples in the docs/stack at this point, which is why I'd love to see examples of how to implement features in well-known apps instead of just toy apps (which were definitely useful when getting started, don't get me wrong!).

I should also add: we love when you're opinionated in the docs on how you'd deal with more complex cases or structure things in a bigger codebase.

Indy•6mo ago

One thing I'll point out for now is that no matter what system (Convex or not), things kind of look the same. You break your app into small modules that are easier to reason about. Though we haven't released the authoring APIs for Convex components today, there is nothing stopping you from structuring your code as if it were broken up into components. So let's say you have something that deals with push notifications: you'd probably break it out into its own subfolder that roughly looks like this: https://github.com/get-convex/expo-push-notifications/tree/main/src%2Fcomponent and only talk to it through a specific "public" API. It doesn't have to be Convex functions as the API; it can very well be just TypeScript functions.
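
As a rough illustration of a module that is only reachable through a narrow "public" API of plain TypeScript functions, consider the sketch below. It is not taken from the linked repository: `NotificationsApi`, `createNotifications`, and the in-memory token store are hypothetical stand-ins.

```typescript
// Hypothetical public surface of a push-notification module.
interface NotificationsApi {
  registerToken(userId: string, token: string): void;
  notify(userId: string, title: string): string[];
}

function createNotifications(): NotificationsApi {
  // Private state: callers never touch this map directly.
  const tokensByUser = new Map<string, Set<string>>();

  // Private helper, deliberately left out of the returned API.
  function formatMessage(token: string, title: string): string {
    return `${token}: ${title}`;
  }

  return {
    registerToken(userId, token) {
      const tokens = tokensByUser.get(userId) ?? new Set<string>();
      tokens.add(token);
      tokensByUser.set(userId, tokens);
    },
    // Returns the messages that would be sent, one per registered token.
    notify(userId, title) {
      const tokens = tokensByUser.get(userId) ?? new Set<string>();
      return Array.from(tokens).map((t) => formatMessage(t, title));
    },
  };
}
```

The rest of the app would only ever call `registerToken` and `notify`, so the storage and formatting details can change without touching callers.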

djbalin•6mo ago

I second the notions above and encourage what you mention here @Indy:

Generally update docs to nudge people towards patterns that help people build scalable apps. This is a balance, because the core docs are very focused on "this is how the api works."

For me, this nudging and opinionation is one of the main reasons I've grown to love Convex. In our small startup team of 2 developers, for example, both of us are talented, but neither has a lot of experience in building and scaling production apps. We love that Convex is both a great product and a team that seems intent on communicating best practices and sharing their experience. We're excited to leverage you guys' decades of experience on huge projects through stack posts, nudging in the docs, and here on Discord.

Don't be afraid to be opinionated or provide suggestions!!! It's incredibly valuable for all of us, both for strictly learning to use Convex and its API and, much more valuably, for learning how to think about building a system and which common problems Convex alleviates. The API/SDK is so well-documented that it's very easy to deviate from your blanket recommendations whenever that makes sense for a specific use-case. We gobble up and learn from all your recommendations, stack posts, Discord comments, etc., and we lean on those as much as we can, and whenever our lack of experience leaves us in doubt. But there is no lock-in: you do a good job of stating that these are recommendations and outlining tradeoffs, so we confidently deviate from these whenever relevant for our product.

What you're doing is a huge gift to all junior/mid developers. Convex sits at an awesome abstraction level. Be opinionated and provide suggestions such that more junior developers can get onboard quickly and will feel safe in mentally offloading some tough decisions to you guys. By definition, experienced devs will find what they're looking for, and will probably also learn a thing or two in the process!

jamwt•6mo ago

we agree 100%, and "the book" is a very big deliverable we owe everyone ASAP. the only reason it doesn't exist yet is time and team! but it's important enough that we will shift onto it soon. Indy is focusing on tutorial and early onboarding now. "the book" is a focus we'll ideally get started on within the next few weeks and get the first version out in Q1 2025

ampp•6mo ago

Well if you want to do "a book" sometime 😁 i know someone at O'Reilly Media, but i'm sure these contacts are super easy to find given they are in the same city
