Night
Night•3y ago

Table size

In Convex is there a way to get the number of rows without having to query it? I am trying to select a random row in one of my tables the database
13 Replies
lee
lee•3y ago
hi! can you describe more about how getting the number of rows would help select a random row? If you don't mind using an interface that might change (isn't stable) and ignoring the typescript error, you could use await db.query('table').count().
Night
NightOP•3y ago
I'm using a Math.random() to generate a random integer between 0 and the length of the table
lee
lee•3y ago
we haven't invested in this count interface yet, so having more context on its use-cases will help prioritize
Night
NightOP•3y ago
I'm trying to select a random row in a table of texts to get a single column. My algorithm is: 1) Query texts and collect 2) Get length 3) Generate random index 4) Return that row So I'd need to query anyway. I'm just wondering if count() would be more optimized than collect().length
lee
lee•3y ago
if you've already queried and collected, can you get the length from that array? count() is constant time, while collect().length is linear. but count() is restricted to only work on entire tables
Night
NightOP•3y ago
So it would be linear either way
sujayakar
sujayakar•3y ago
yeah, for now @Night, I'd recommend just .collect()ing into an array, since convex's queries don't support offset, so even if you efficiently generated the random index from (3), jumping to the kth record would still be O(k) time. there's a trick we support for selecting a random record in logarithmic time, but I'll have to write it up, and I think it relies on internal APIs 😉
lee
lee•3y ago
the trick is almost possible with existing APIs. I'm imagining a db.query('table').withIndex('by_id', q => q.gt('_id', new Id('table', <randomly generated bytes>)).first() but generating the random bytes that make the query happy might be tricky. interesting thought exercise 😄
ian
ian•3y ago
Another idea is to have a field in your documents that is an integer id. Every time you add one, you find the row with the highest id and add your document with that + 1. Because of Convex's serializable isolation, this is safe. You could have an index on this new field, to make finding the largest one fast. Then that row also has the count by default, and you could generate a random int at or below it to fetch a random row. Note: this assumes you don't delete rows
Night
NightOP•3y ago
Thanks for all these possible methods! I'll probably stick with collect for now and if there's a performance problem I'll try out that indexing strategy because it's very unlikely I'd ever need to delete the rows in this table. What exactly is offsetting?
ian
ian•3y ago
Essentially querying a table by an numerical counter, like an array index - generically it's an offset from some location within an index. that's my understanding, anyways. i'm no db expert though
Night
NightOP•3y ago
When you create an index on a column it doesn't scan the entire table though, right? Using withIndex()
lee
lee•3y ago
That's correct, withIndex doesn't scan the entire table, just the part of the table specified in the index filter

Did you find this page helpful?