One function cost 17 GB of reads in one day
I have developed an app that is being used by a team. The max I had hit before was about 3 GB per day, but yesterday the reads were somehow 17 GB. Can I request a quick review to see what went wrong? I did identify the function and fixed some issues, but I still want to be sure the reason shown in the dashboard is correct.
27 Replies
Also, the difference in function calls and documents read is not that big. Nothing that explains 12 GB of difference in bandwidth.
Have you looked around at the breakdown of function calls? If you find the function that is chewing through a ton of bandwidth, maybe you'll find a query that is using filter instead of withIndex, or something. Or a paginated query that is constantly re-fetching a big page on new inserts (where a smaller page size could help). In case this is helpful: https://stack.convex.dev/queries-that-scale
Queries that scale
As your app grows from tens to hundreds to thousands of users, there are some techniques that will keep your database queries snappy and efficient. I’...
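To make the filter-vs-withIndex point concrete, here is a toy model in plain TypeScript (NOT the Convex API, just an illustration of the read-bandwidth difference with made-up sample data): a filter over a full scan reads every document before discarding non-matches, while an index narrows the read set first.

```typescript
// Toy model (plain TypeScript, not Convex) of the bandwidth difference.
type Task = { assignee: string; title: string };

// Simulated table
const tasks: Task[] = [
  { assignee: "amy", title: "call client" },
  { assignee: "bob", title: "send quote" },
  { assignee: "amy", title: "follow up" },
];

// Simulated index on the assignee field
const byAssignee = new Map<string, Task[]>();
for (const t of tasks) {
  byAssignee.set(t.assignee, [...(byAssignee.get(t.assignee) ?? []), t]);
}

// filter-style: scans the whole table, so every doc counts as a read
function filterScan(assignee: string): { read: number; rows: Task[] } {
  return {
    read: tasks.length,
    rows: tasks.filter((t) => t.assignee === assignee),
  };
}

// index-style: jumps straight to the matching range; only those docs are read
function indexLookup(assignee: string): { read: number; rows: Task[] } {
  const rows = byAssignee.get(assignee) ?? [];
  return { read: rows.length, rows };
}
```

Both return the same rows, but the filter version pays read bandwidth for the whole table on every call, which is exactly the kind of gap that shows up as surprise bandwidth on a frequently polled query.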
Under the "Functions breakdown by project" section there is a drop-down for "Database Bandwidth" in case you didn't notice that - I didn't notice it originally
Hey Ian, thanks for the helpful insight. I actually found the function that was eating the bandwidth; it was being called regularly.
One question I couldn't find an answer to yet: let's say this is my query
const clients = await Promise.all(
  clientIds.map((clientId) =>
    ctx.db
      .query("clients")
      .withIndex("by_id", (q) => q.eq("id", clientId))
      .first()
  )
);
Now, I get clientIds from another query similar to this. Since Convex doesn't have an .in operator, my understanding is that this loop goes through the clientIds array to get all the clients, right?
What's a better way to do the same thing in Convex?
appreciate your help
I don't want to do batch queries, as this is already going into pagination next.
Just curious, is the clientId the ID generated by Convex? If so, why not use ctx.db.get(clientId) instead?
The client ID is not a Convex ID in this case, because the requirement is that a client ID be 4 letters max so it can be referenced and searched easily. (It's a CRM used by non-technical people; to track down their clients they use these reference numbers rather than full client information.)
Got it, I saw the q.eq(_id) and assumed it was the Convex ID. Is there any other index you can use to narrow down the search first, then filter using clientId instead?
Yeah, the problem is that clients cannot be narrowed down any further.
This particular piece of code is used to get the daily to-do list for the sales team, so we need to get all possible clients first, then the leads related to them, and then the tasks related to each lead.
There is a narrowing step for leads, which I'm applying, but tasks also don't have anything to narrow down by.
Pretty bad, right bakemono 😂
I went through some scenarios in my head, but I keep coming back to the index solution. Funnily enough, my product is also a CRM with exactly the same business process, but I have not experienced this kind of bandwidth.
What is the approximate amount of data in your database right now?
That is being read every day, I mean.
Because mine is already in production and being used every day: approx 17k documents read per day.
Mine has 30 staff members.
Users, I mean.
Yeah, we are in production too. Slightly larger user base, but lower read/write counts and bandwidth in terms of docs.
A few hundred DAU at the moment.
Well, I think I need to audit my Convex backend, because I know I'm doing some stupid things in there. Plus I'm using react query builder, which is an advanced search over approx 3.5k records right now, and it's being used about every 5 minutes.
I need to scale.
Yeah, it's definitely difficult in NoSQL. I do a lot of bending around a relational schema, or just bend the UX a bit to limit large queries.
I only scaled the to-do list today, and look at this. Big difference already in "Action compute", even with more usage today.
yeah that's actually good
Haha, the eternal whack-a-mole for us anyway. I optimized something last time and ended up jacking up the number of function calls.
a big emphasis in H2 this year / H1 next year is both guides on optimizing/scaling and better telemetry in the dashboard to make it easy
goated
we actually have some users driving real, high DAU traffic and doing it very efficiently. but it's required 1:1 consulting sessions with the convex team b/c those guides and tools are missing right now
Totally agree. How can one request that? I won't do it right now, but it will definitely be good once I've properly done what I can myself.
super exciting news 🙈
@Hmza if you're a Pro customer, create a support ticket in your dashboard asking for an architecture review session. We can follow up and arrange something!
we do these "production reviews" periodically with teams
Of course I'm a Pro, thanks Jamie, will do that soon.
Does this include making some of that telemetry available within the API? I'd like to make the actual usage data available to at least the owners of our organization accounts, and possibly do user-based token rate limiting based on real weights.
You can access a lot of information with Log Streams already:
usage:
- database_read_bytes: number
- database_write_bytes: number, this and database_read_bytes make up the database bandwidth used by the function
- file_storage_read_bytes: number
- file_storage_write_bytes: number, this and file_storage_read_bytes make up the file bandwidth used by the function
- vector_storage_read_bytes: number
- vector_storage_write_bytes: number, this and vector_storage_read_bytes make up the vector bandwidth used by the function
- action_memory_used_mb: number, for actions, the memory used in MiB. This combined with execution_time_ms makes up the action compute.
Source: https://docs.convex.dev/production/integrations/log-streams/#function_execution-events
Log Streams | Convex Developer Hub
Configure logging integrations for your Convex deployment
If you log data about a user, you can correlate users to function request_ids, and thereby associate them with function usage.
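As a sketch of what you could already build on top of those events (the event shape below is a simplified assumption; real function_execution events carry more fields than this), here is a small aggregator that totals database bandwidth per function from a log stream, which is one way to find the heavy hitters yourself:

```typescript
// Hedged sketch: aggregate database bandwidth per function from
// function_execution log-stream events. The interface is a simplified
// assumption of the event shape documented in the Convex log streams docs.
interface UsageEvent {
  function_path: string;
  usage: {
    database_read_bytes: number;
    database_write_bytes: number;
  };
}

function bandwidthByFunction(events: UsageEvent[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const e of events) {
    // read + write bytes together make up the function's database bandwidth
    const bytes = e.usage.database_read_bytes + e.usage.database_write_bytes;
    totals.set(e.function_path, (totals.get(e.function_path) ?? 0) + bytes);
  }
  return totals;
}
```

Feeding it the events your log stream sink receives would give a per-function ranking similar to the dashboard's "Functions breakdown" view, but in a form you control.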