One function cost 17 GB of reads in one day
I have developed an app that is being used by a team. The max I had hit before was about 3 GB per day, but yesterday the reads were somehow 17 GB. Can I request a quick review to see what went wrong? I did identify the function and fixed some issues, but I still want to be sure the reason shown in the dashboard is correct.
27 Replies
Also, the difference in function calls and documents read is not that big. Nothing that explains 12 GB of difference in bandwidth.
Have you looked around at the breakdown of function calls? If you find the function that is chewing through a ton of bandwidth, maybe you'll find a query that is using filter instead of withIndex, or something. Or a paginated query that is constantly re-fetching a big page on new inserts (where a smaller page size could help). In case this is helpful: https://stack.convex.dev/queries-that-scale
Queries that scale
As your app grows from tens to hundreds to thousands of users, there are some techniques that will keep your database queries snappy and efficient. I’...
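To make the filter-vs-withIndex point concrete, here is a toy model in plain TypeScript (NOT the Convex API, just an illustration of the read-bandwidth difference with made-up sample data): a filter over a full scan reads every document before discarding non-matches, while an index narrows the read set first.

```typescript
// Toy model (plain TypeScript, not Convex) of the bandwidth difference.
type Task = { assignee: string; title: string };

// Simulated table
const tasks: Task[] = [
  { assignee: "amy", title: "call client" },
  { assignee: "bob", title: "send quote" },
  { assignee: "amy", title: "follow up" },
];

// Simulated index on the assignee field
const byAssignee = new Map<string, Task[]>();
for (const t of tasks) {
  byAssignee.set(t.assignee, [...(byAssignee.get(t.assignee) ?? []), t]);
}

// filter-style: scans the whole table, so every doc counts as a read
function filterScan(assignee: string): { read: number; rows: Task[] } {
  return {
    read: tasks.length,
    rows: tasks.filter((t) => t.assignee === assignee),
  };
}

// index-style: jumps straight to the matching range; only those docs are read
function indexLookup(assignee: string): { read: number; rows: Task[] } {
  const rows = byAssignee.get(assignee) ?? [];
  return { read: rows.length, rows };
}
```

Both return the same rows, but the filter version pays read bandwidth for the whole table on every call, which is exactly the kind of gap that shows up as surprise bandwidth on a frequently polled query.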
Under the "Functions breakdown by project" section there is a drop-down for "Database Bandwidth" in case you didn't notice that - I didn't notice it originally
Hey Ian, thanks for the helpful insight. I actually found the function that was eating the bandwidth; it was being called regularly.
One question I couldn't find an answer to yet: let's say this is my query
const clients = await Promise.all(
  clientIds.map((clientId) =>
    ctx.db
      .query("clients")
      .withIndex("by_id", (q) => q.eq("id", clientId))
      .first()
  )
);
Now, I get clientIds from another query similar to this. Since Convex doesn't have an .in operator, my understanding is that this loop goes through the clientIds array to get all the clients, right?
What's a better way to do the same thing in Convex?
appreciate your help
I don't want to do batch queries, as this is already going into pagination next.
Just curious, is the clientId the ID generated by Convex? If so, why not use ctx.db.get(clientId) instead?
The client ID is not a Convex ID in this case, because the requirement is that a client ID be 4 letters max so it can be referenced and searched easily. (It's a CRM used by non-technical people; to track down their clients they use these reference numbers rather than full client information.)
Got it, I saw the q.eq(_id) and assumed it was the Convex ID. Is there any other index you can use to narrow down the search first, then filter using clientId instead?
Yeah, the problem is that clients cannot be narrowed down any further.
This particular piece of code is used to get the daily to-do list for the sales team, so we need to get all possible clients first, then the leads related to them, and then the tasks related to each lead.
There is a narrowing step for leads, which I'm applying, but tasks also don't have anything to narrow down by.
Pretty bad, right bakemono 😂
I went through some scenarios in my head, but I keep coming back to the index solution. Funnily enough, my product is also a CRM with exactly the same business process, but I have not experienced this kind of bandwidth.
What is the approximate amount of data in your database right now?
That is being read every day, I mean.
Because mine is already in production and being used every day: approx 17k documents read per day.
Mine has 30 staff members.
Users, I mean.
Yeah, we are in production too. Slightly larger user base, but lower read/write counts and bandwidth in terms of docs.
A few hundred DAU at the moment.
Well, I think I need to audit my Convex backend, because I know I'm doing some stupid things in there. Plus I'm using react query builder, which is an advanced search over approx 3.5k records right now, and it's being used about every 5 minutes.
I need to scale.
Yeah, it's definitely difficult in NoSQL. I do a lot of bending around a relational schema, or just bend the UX a bit to limit large queries.
I only scaled the to-do list today, and look at this. Big difference already in "Action compute", even with more usage today.
yeah that's actually good
Haha, the eternal whack-a-mole for us anyway. I optimized something last time and ended up jacking up the number of function calls.
a big emphasis in H2 this year / H1 next year is both guides on optimizing/scaling and better telemetry in the dashboard to make it easy
goated
we actually have some users driving real, high DAU traffic and doing it very efficiently. but it's required 1:1 consulting sessions with the convex team b/c those guides and tools are missing right now
Totally agree. How can one request that? I won't do it right now, but it will definitely be good once I've properly done what I can myself.
super exciting news 🙈
@Hmza if you're a Pro customer, create a support ticket in your dashboard asking for an architecture review session. We can follow up and arrange something!
we do these "production reviews" periodically with teams
Of course I'm a Pro, thanks Jamie, will do that soon.
Does this include making some of that telemetry available within the API? I'd like to make the actual usage data available to at least the owners of our organization accounts, and possibly do user-based token rate limiting based on real weights.
You can access a lot of information with Log Streams already:
usage:
- database_read_bytes: number
- database_write_bytes: number, this and database_read_bytes make up the database bandwidth used by the function
- file_storage_read_bytes: number
- file_storage_write_bytes: number, this and file_storage_read_bytes make up the file bandwidth used by the function
- vector_storage_read_bytes: number
- vector_storage_write_bytes: number, this and vector_storage_read_bytes make up the vector bandwidth used by the function
- action_memory_used_mb: number, for actions, the memory used in MiB. This combined with execution_time_ms makes up the action compute.
Source: https://docs.convex.dev/production/integrations/log-streams/#function_execution-events
Log Streams | Convex Developer Hub
Configure logging integrations for your Convex deployment
If you log data about a user, you can correlate users to function request_ids, and thereby associate them with function usage.
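As a sketch of what you could already build on top of those events (the event shape below is a simplified assumption; real function_execution events carry more fields than this), here is a small aggregator that totals database bandwidth per function from a log stream, which is one way to find the heavy hitters yourself:

```typescript
// Hedged sketch: aggregate database bandwidth per function from
// function_execution log-stream events. The interface is a simplified
// assumption of the event shape documented in the Convex log streams docs.
interface UsageEvent {
  function_path: string;
  usage: {
    database_read_bytes: number;
    database_write_bytes: number;
  };
}

function bandwidthByFunction(events: UsageEvent[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const e of events) {
    // read + write bytes together make up the function's database bandwidth
    const bytes = e.usage.database_read_bytes + e.usage.database_write_bytes;
    totals.set(e.function_path, (totals.get(e.function_path) ?? 0) + bytes);
  }
  return totals;
}
```

Feeding it the events your log stream sink receives would give a per-function ranking similar to the dashboard's "Functions breakdown" view, but in a form you control.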