Support for overlap date range queries (w/start + end fields)
Hi there,
To my understanding, there is no way to to query entities (e.g. a booking {start:isostring, end:isostring}) that have both
start and end fields by a date range that isn't at risk of hitting the convex document limit, not by the documents within the date range, but by the nature of the amount of documents I have to scan so I can return a valid result due to how convex allows you to query the db.
- Where the risk is: writing a convex query to get all bookings that start before start AND end after end could likely exceed the 16k limit.
- For my usecase I want to surface all bookings between a range. I can get all that start between start and end and end between start and end without risk, but cannot get the bookings that start before start AND end after end before without risk of hitting doc limit.
- My understanding is you can't use an indexed query for booking.start <= end && booking.end >= start. Therefore I am forced to either collect all booking.start < start (which is a risk for doc limit) and from this result filter what also ends after end or the other way around (booking.end > end and then filter from this everything that also starts before start)
Am I incorrect or missing something? if not, are such range queries planned to be supported in the future?6 Replies
Couldn’t you do two indexed queries for the second case, (and merge results if required)?
Hey Tom, I'm not 100000% sure what you mean - I can do indexed queries to get all bookings that either start OR end between no problem. But to my knowledge there is no way to get all bookings that overlap (start before && end after) two dates in a scan count friendly way.
Appreciate your input - let me know if you think I've misunderstood! 🙂
@Kenny you might be able to do something like this.
@Tom Redman Yep exactly! I think there might be a bit of confusion here though. It's definitely possible to query for overlapping bookings in convex, but not in a scan friendly way (i.e. we have to scan too many documents). In this example you kindly provided, we get ALL docs before
twoWeeksAgo and ALL docs after twoWeeksFromNow. The issue is this could easily scan over the 16k doc scan limit. Ideally, convex would have an inbuilt method for allow me to check this e.g. on a e.g. ~composite an index .withIndex("dateRange", (q) => q.gte("end", start) && q.lte("end", end)).collect().
Apparently an R-Tree might be a viable data structure for this - I wonder if this could become a component in a similar way to the aggregate component
or maybe something akin to a GiST index in psqlAh, yes I see @Kenny. In theory the scan friendliest way to do this would be to paginate the results instead of collect(), in either case.
This is a cool problem though. I’m curious to understand how you end up resolving this. In a broad sense, it reminds me of this component by Sujay (Convex cofounder): https://www.convex.dev/components/geospatial
Feels like there is some (ahem) overlap.
ahhh @Tom Redman, Thank you for sharing! (and love the pun hahaha). I'm going to look into this, I'm guessing I'm not actually going to be limited to valid globe coordinates, and so this might potentially work. Appreciate you! 💯
Just concerned it might be more geared towards sphericalal or 2D positions, and 'd need 1D (height& width vs start,end).
I've earmarked it and will play around with it later this week
🤝