Hmza · 5mo ago

One function cost 17 GB of reads in one day

I have developed an app that is being used by a team. The max I had hit so far was 3 GB per day. But yesterday the reads were somehow 17 GB. Can I request a little review to see what went wrong? I did figure out the function and fixed some issues, but I still want to be sure the reason shown in the dashboard is correct.
[screenshot attached]
Hmza (OP) · 5mo ago
Also, the difference in function calls and documents read isn't that big: nothing like the 12 GB of difference that the bandwidth shows.
[screenshots attached]
ian · 5mo ago
Have you looked around at the breakdown of function calls? If you find the function that is chewing through a ton of bandwidth, maybe you'll find a query that is using filter instead of withIndex or something. Or a paginated query that is constantly re-fetching a big page on new inserts (where a smaller page size could help).
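A minimal sketch of that difference, assuming a hypothetical messages table with a channel field and a by_channel index declared in the schema (none of this is from Hmza's app):

```ts
import { query } from "./_generated/server";
import { v } from "convex/values";

export const listByChannel = query({
  args: { channel: v.string() },
  handler: async (ctx, { channel }) => {
    // Bandwidth-heavy version: .filter scans the whole table, so every
    // document it reads counts against database bandwidth.
    // const all = await ctx.db
    //   .query("messages")
    //   .filter((q) => q.eq(q.field("channel"), channel))
    //   .collect();

    // Indexed version: only the matching documents are read, assuming
    // the schema declares .index("by_channel", ["channel"]).
    return await ctx.db
      .query("messages")
      .withIndex("by_channel", (q) => q.eq("channel", channel))
      .collect();
  },
});
```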
ian · 5mo ago
[Link: Queries that scale] As your app grows from tens to hundreds to thousands of users, there are some techniques that will keep your database queries snappy and efficient. I…
ian · 5mo ago
Under the "Functions breakdown by project" section there is a drop-down for "Database Bandwidth", in case you didn't notice it; I didn't notice it originally.
Hmza (OP) · 5mo ago
Hey Ian, thanks for the helpful insight. I actually found the function that ruined the bandwidth; it was being called regularly. One question I couldn't find an answer to yet: let's say this is my query:

```ts
const clients = await Promise.all(
  clientIds.map((clientId) =>
    ctx.db
      .query("clients")
      .withIndex("by_id", (q) => q.eq("id", clientId))
      .first()
  )
);
```

I get clientIds from another query similar to this. As Convex doesn't have an .in operator, I think this loop goes through the clientIds array to get all clients, right? What's a better way to do this same thing in Convex? Appreciate your help. I don't want to do batch queries, as this is already going into pagination next.
BakemonoHouse · 5mo ago
Just curious, is the clientId the ID generated by Convex? If so, why not use ctx.db.get(clientId) instead?
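A minimal sketch of that suggestion, assuming the clientIds really are Convex document IDs (the getClients helper name is made up for illustration):

```ts
import { Doc, Id } from "./_generated/dataModel";
import { QueryCtx } from "./_generated/server";

// ctx.db.get is a direct lookup by document ID, so each client costs
// exactly one document read -- no index scan involved.
async function getClients(ctx: QueryCtx, clientIds: Id<"clients">[]) {
  const clients = await Promise.all(clientIds.map((id) => ctx.db.get(id)));
  // Drop IDs that no longer resolve to a document.
  return clients.filter((c): c is Doc<"clients"> => c !== null);
}
```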
Hmza (OP) · 5mo ago
The client ID is not a Convex ID in this case, because the requirement is that a client ID should be 4 characters max so it can be referenced and searched easily (it's a CRM used by non-technical people, and to track down their clients they use these reference numbers rather than client information).
BakemonoHouse · 5mo ago
Got it. I saw the q.eq on the by_id index and assumed it was the Convex ID. Is there any other index you can use to narrow down the search first, then filter using clientId instead?
Hmza (OP) · 5mo ago
Yeah, the problem is that clients can't be narrowed down any more. This particular piece of code is used to get the daily to-do list for the sales team, so we need to get all possible clients first, then the leads related to them, and then the tasks related to each lead. There is a narrowing-down for leads which I'm applying, but tasks don't have any narrowing down either. Shit, that's bad, right Bakemono? 😂
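For reference, a clients → leads → tasks fan-out like this can at least stay on an index at every hop, so only related documents are read. A sketch under assumed field names (clientId, leadId) and index names (by_clientId, by_leadId); none of these are confirmed by the thread:

```ts
import { Doc } from "./_generated/dataModel";
import { QueryCtx } from "./_generated/server";

// Fan out clients -> leads -> tasks, keeping each hop on an index
// instead of scanning the leads or tasks tables.
async function tasksForClients(ctx: QueryCtx, clients: Doc<"clients">[]) {
  const leads = (
    await Promise.all(
      clients.map((client) =>
        ctx.db
          .query("leads")
          .withIndex("by_clientId", (q) => q.eq("clientId", client._id))
          .collect()
      )
    )
  ).flat();

  return (
    await Promise.all(
      leads.map((lead) =>
        ctx.db
          .query("tasks")
          .withIndex("by_leadId", (q) => q.eq("leadId", lead._id))
          .collect()
      )
    )
  ).flat();
}
```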
BakemonoHouse · 5mo ago
I went through some scenarios in my head, but I keep coming back to the index solution. Funnily enough, my product is also a CRM and has exactly the same business process, but I have not experienced this kind of bandwidth.
Hmza (OP) · 5mo ago
What is the approximate amount of data in your database that's being read every day right now? I ask because mine is already in production and used every day: approx 17k documents read per day. Mine has 30 staff members as users.
BakemonoHouse · 5mo ago
Yeah, we are in production too. Slightly larger user base, but a lower read/write count and bandwidth in terms of docs. A few hundred DAU atm.
Hmza (OP) · 5mo ago
Well, I think I need to audit my Convex backend, because I know I'm doing some stupid things in there. Plus I'm using React Query Builder, which is an advanced search over approx 3.5k records right now, and it's being used about every 5 minutes. I need to scale.
BakemonoHouse · 5mo ago
Yeah, it's definitely difficult in NoSQL. I do a lot of bending around the relational schema, or just bend the UX a bit to limit large queries.
Hmza (OP) · 5mo ago
I only scaled the to-do list today, and look at this: a big difference already in "Action compute", even with more usage today.
[screenshot attached]
Hmza (OP) · 5mo ago
yeah that's actually good
BakemonoHouse · 5mo ago
Haha, the eternal whack-a-mole, for us anyway. I optimized something last time and ended up jacking up the number of function calls.
jamwt · 5mo ago
A big emphasis in H2 this year / H1 next year is both guides on optimizing/scaling and better telemetry in the dashboard to make this easier.
Hmza (OP) · 5mo ago
goated
jamwt · 5mo ago
We actually have some users driving real, high-DAU traffic and doing it very efficiently. But it has required 1:1 consulting sessions with the Convex team, because those guides and tools are missing right now.
Hmza (OP) · 5mo ago
Totally agree. How can one request that? I won't do it right now, but it will definitely be good once I've properly done what I can on my own.
BakemonoHouse · 5mo ago
super exciting news 🙈
jamwt · 5mo ago
@Hmza if you're a Pro customer, create a support ticket in your dashboard asking for an architecture review session. We can follow up and arrange something! We do these "production reviews" periodically with teams.
Hmza (OP) · 5mo ago
Of course I'm on Pro. Thanks Jamie, I'll do that soon.
ampp · 5mo ago
Does this include making some of that telemetry available within the API? I'd like to make the actual usage data available to at least the owners of our organization accounts, and possibly do user-based token rate limiting based on real weights.
ian · 5mo ago
You can access a lot of information with Log Streams already. The usage fields on function_execution events include:
- database_read_bytes: number
- database_write_bytes: number; this and database_read_bytes make up the database bandwidth used by the function
- file_storage_read_bytes: number
- file_storage_write_bytes: number; this and file_storage_read_bytes make up the file bandwidth used by the function
- vector_storage_read_bytes: number
- vector_storage_write_bytes: number; this and vector_storage_read_bytes make up the vector bandwidth used by the function
- action_memory_used_mb: number; for actions, the memory used in MiB. This combined with execution_time_ms makes up the action compute.

Source: https://docs.convex.dev/production/integrations/log-streams/#function_execution-events
ian · 5mo ago
If you log data about a user, you can correlate users to function request_ids, and thereby associate them with the function usage.
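As a sketch of the receiving end, here is what aggregating per-function read bandwidth from those events could look like. The event type below is an assumption pieced together from the fields quoted above, not a verified schema; check the Log Streams docs for the exact payload:

```ts
// Reduced shape of a function_execution log stream event (assumed).
type FunctionExecutionEvent = {
  topic: "function_execution";
  function: { path: string; request_id: string };
  usage: { database_read_bytes: number; database_write_bytes: number };
};

// Running total of database read bandwidth per function path.
const readBytesByFunction = new Map<string, number>();

function recordEvent(event: FunctionExecutionEvent) {
  const prev = readBytesByFunction.get(event.function.path) ?? 0;
  readBytesByFunction.set(
    event.function.path,
    prev + event.usage.database_read_bytes
  );
}
```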
