Caching system failure.
I already mentioned some of these issues in this GitHub post, but I’ll go into more detail here.
I've spent days trying to debug Convex's caching behavior, and at this point, it seems increasingly likely that the problems aren't coming from my end.
There are multiple issues with the caching system, including:
* Paginated query caching only working sporadically — sometimes it only works once every few days.
* Auth sessions refreshing too frequently, which seems to unnecessarily invalidate queries. According to the documentation, refreshes should happen every hour, but as shown in the screenshot, one occurred after just 40 minutes. (It even refreshed the authRefreshTokens more than 2,000 times in less than a month, even though the tokens are not expired.)
* Queries getting invalidated without any mutations. As you can see, nothing in the app has changed, yet almost every query is being re-run.
* Table updates invalidate all queries referencing that table, even if the updated field isn't part of the query result.
This makes it really hard to rely on the caching layer to reduce bandwidth or improve performance — especially in production scenarios.
I just refreshed my browser, and ... another refreshSession, with the caching working this time. It's absolutely not idempotent

(And you can see that this time the getImagesPaginated wasn't cached somehow, even though nothing had changed, I didn't even scroll the page)
---
Now I refreshed the page and the getCover functions are invalidated

Literally no mutations were made between the screenshots.
can you share the code in the query?
that's the getCover query
and the getImages internal query
gotcha. and these are all the same logged in user in the logs?
I'm in a dev environment, I only have my browser signed in
got it. and none of these libraries rely on randomness?
My auth query (basically the tutorial one)
I'm not using any external library
My sanitizeImage function just sanitizes the data
in the log, I see a lot of queries all executing at the same time -- those are all the same projectId?
They do not have the same projectId
Let's say I have a bunch of projects with cover images. I removed the images from the getProjects query and created a custom getCover query to retrieve the covers one by one.
This prevents the getProjects query from reloading ALL the projects when a cover changes.
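A rough sketch of what that split looks like (simplified; the real table and index names in my project differ):

```ts
import { v } from "convex/values";
import { query } from "./_generated/server";

export const getProjects = query({
  args: {},
  handler: async (ctx) => {
    // Project docs no longer embed cover data, so a cover change
    // doesn't invalidate this query.
    return await ctx.db.query("projects").collect();
  },
});

export const getCover = query({
  args: { projectId: v.id("projects") },
  handler: async (ctx, { projectId }) => {
    // Only the query for the changed project's cover gets invalidated.
    return await ctx.db
      .query("covers")
      .withIndex("by_project", (q) => q.eq("projectId", projectId))
      .unique();
  },
});
```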
final question -- no code changes in between as well, right? any code change invalidates all the caches
no code changes like mutations you mean ?
oh, no, I didn't push any updates

If you're referring to this:
nope, this doesn't happen between the screenshots

Mhh, right click open link to see it better
There is no mutation in between the cache invalidations
the stripe get products doesn't mutate anything by the way
okay I downloaded this. which particular timestamps do you want me to focus on? I see a cluster of requests at 1:34:29, a mutation at 1:45:00, then more queries at 1:45:00 and 1:45:01
I think the 1:34:29 seems interesting since it starts with a getCover that is not cached, but the other ones are
how big are these results, btw? like # of KB. and is this the cloud product, or self-hosted image?
I'll check that (I'm using your cloud solution)
also, free account or convex pro? just trying to rule out cache eviction
It's even more interesting that the getAccount is cached at 1:34:29 and then uncached at 1:34:29 later on
free until pushed in production
I saved logs that include:
- getTasks
- getResults
- getProject
- getImages
All of this is around 84kb
I see auth:store being called -- that probably invalidates the account, I would guess
ah. so images are just paths or something. no image data. this should all fit in the cache no problem
just making sure
Yeah, but if you look at 1:45:00 it's being called too, and yet it DOESN'T invalidate the getAccount query
Yes i'm using my own S3 solution

the getAccount may already have been in flight. I'll ask one of our backend experts to take a look at this thread to help clarify. the timestamp may be when the request started, not when it ended.
correct, it seems to be when it started (I made a github post about it) https://github.com/get-convex/convex-backend/issues/88#issuecomment-2854644235
not sure. let me stop speculating and get an engineer to look at this
I see indy answered the post there. to elaborate on your follow up question
invalidation is done purely on the basis of index ranges
it doesn't care what fields you do or do not use within the document
so yes, if you wanted to avoid invalidation, you'd have to break that record out into a separate table/record
the system doesn't introspect the javascript runtime to determine what fields mattered. it's purely based on index ranges visited by database calls
How Convex Works
A deep dive on how Convex's reactive database works internally.
I'm guessing the log line stuff is just due to the timeline and the "sequential" nature of the loglines being a little misleading since many things are running concurrently, especially when tightly grouped
but I'll ping someone to take a look at this
I'll look into that
thank you very much !
From what I've read, there is no performance cost in splitting tables and merging the results in a Promise.all(). Am I right?
https://github.com/get-convex/convex-backend/issues/95#issuecomment-2892478284
Like in this example
I mean, I don't Promise.all in this example
but you got me 🤣
yeah, the speed will be around the same if you make the fetches concurrent, and the cache invalidation benefits will apply
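For reference, a sketch of what that concurrent merge could look like (table/index names are assumed; note the merged query itself is still invalidated when either table changes, but queries that only read projects aren't affected by progress updates):

```ts
import { v } from "convex/values";
import { query } from "./_generated/server";

export const getProjectWithProgress = query({
  args: { projectId: v.id("projects") },
  handler: async (ctx, { projectId }) => {
    // Both reads run concurrently, so latency is roughly that of the slower one.
    const [project, progress] = await Promise.all([
      ctx.db.get(projectId),
      ctx.db
        .query("projectProgress")
        .withIndex("by_project", (q) => q.eq("projectId", projectId))
        .unique(),
    ]);
    return { project, progress };
  },
});
```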
---Oh, and one thing that might help: the auth:store is regularly pinging authRefreshTokens (in the health tab), causing a conflict. I don't think that's part of Convex Auth's normal behavior---
Irrelevant
Thanks again
Sorry to butt in, but I'm having a very similar issue with the cache. This example is simply a mutation with a query updating its state due to the mutation; however, changing the score from 10 to 9 and back to 10 doesn't cache the result of 10 at all. I have checked that the data being sent when the score is set to 10 is exactly the same both times, yet still no cache.
Another extremely weird state, which may suggest that the cache is working but isn't being reported in the logs: I can change these scores quickly and get some of the requests to presumably be out of sync. Once the state is then changed (by changing another subject's score), it reverts to what seems like a cached version? However, in the logs there is no sign of any cache hit from any of those queries.
Mhhhh
It looks like you're mutating in onValueChange. It looks to me like there's some useEffect update behavior where the mutation is sent, the data changes, and another mutation is sent
Can you not use an input, but a button instead and click the button to trigger the mutation?
And see if it produces the same problem
Whenever you have multiple clients connected, the value will change, and therefore every client might trigger the onValueChange, which will trigger the mutation and break everything
Use a defaultValue instead of a value, and a key={currentValue}
This will force the input to rerender when the value changes but won't trigger the onValueChange
(if the problem comes from there)
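Something along these lines (rough sketch of the defaultValue + key idea; the mutation path and prop names are placeholders):

```tsx
import { useMutation } from "convex/react";
import { api } from "../convex/_generated/api";
import { Id } from "../convex/_generated/dataModel";

export function ScoreInput({ subjectId, score }: { subjectId: Id<"subjects">; score: number }) {
  const setScore = useMutation(api.scores.setScore);
  return (
    <input
      type="number"
      // key forces a re-mount when the value coming from Convex changes...
      key={score}
      // ...while defaultValue keeps the input uncontrolled, so a remote
      // change doesn't fire onChange and trigger yet another mutation.
      defaultValue={score}
      onChange={(e) => setScore({ subjectId, score: Number(e.target.value) })}
    />
  );
}
```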
Well yeah, of course it's not the most efficient way of doing things; I really do need to fix it, and thanks for the help with that. But when it is in the broken state, my point is that it is pulling that data from somewhere. It isn't my browser's cache, as it still happens with cache disabled. And when looking at the websocket response from Convex, it's showing a seemingly cached value, but no cache was hit?

I am so confused... All the debug button is doing is similar to what you said: just manually calling the mutation, in this case setting score 1 to 85. Yet score 1 from a completely different subject is being reverted to a cached state? I didn't even need to spam requests to make it break
Meanwhile from just this amount of debugging:

OK, I just wanted to check if it was React triggering unwanted mutations, but it does indeed seem to come from Convex's caching layer
If you refresh your browser the value that went from 17 to 16 goes back to 17?
Or it stays at 16
Yep..
This is so bizarre
It's been an hour as well since I last tried but I guess the invisible cache is still there
Yep it stays at 16? Or it correctly retrieves the value from convex db
Sorry, it flicks back to the incorrect value. This is a new recording after a refresh
So it doesn't retrieve the value from convex db?
Looks like the initial load is cached but after changing it isn't cached
To be honest your logs all seem fine: you're mutating, so your queries change and therefore won't be cached. That's normal behavior to me
Well it is supposedly, in the logs it isn't indicating a cache and I looked at the websocket messages and it's returning it as if it's the "correct" value
Wait, what is your mutation code?
Are you reverting changes in your mutation?
Maybe your mutation updates another row in the table?
I removed all other rows in the table
Could it be the JSON just disallowing the cache?
I used to have an updated_at which I removed because I thought it was the cause of the no cache
What are you doing with the subjects field?
I don't think there is such a thing as disallowing the cache.
When you change a row in Convex, every query using that row will automatically update and send the response (as uncached) to any live client. The following requests (such as when you refresh the browser) will be cached, until you mutate the row again.
Oh interesting
So would what I'm experiencing be normal then?
Well, what I see is that when you mutate a field, another field that isn't supposed to change changes?
Yes, that's a bug I found along the way when trying to find out why my bandwidth usage was so high.
I think you're just not sending the right values in your array, log the args in your mutation
And as for your bandwidth usage, you HAVE to split your queries to the LOWEST level possible
What I mean by that, ONLY QUERY the bare minimum you need
And as for your tables SPLIT THEM as much as you can
So that if you update a row, only queries that need this row will rerender
Example:
We might want to do something like this :
Project(Id, name, progress)
BUT since the progress often changes we should do this instead
Project(id, name)
ProjectProgress(projectId, progress)
So only queries that need the progress will update when the progress changes
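In schema terms, roughly (sketch; field names are illustrative):

```ts
// convex/schema.ts
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  // Rarely-changing data: queries that only read this table are not
  // invalidated when the progress updates.
  projects: defineTable({
    name: v.string(),
  }),
  // Frequently-changing data lives in its own table.
  projectProgress: defineTable({
    projectId: v.id("projects"),
    progress: v.number(),
  }).index("by_project", ["projectId"]),
});
```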
Yeah I should've made more of an effort migrating from my other database, I could just query it all at once and be using 0.1% of the bandwidth per month
Very true
I just had a look at what's being sent through the websocket link and you're right, it's sending through the old data, I'll try and find out why now.
Yeah, I was having the same issue migrating from Prisma. I'm rewriting it all because I'm having the same issue as you have. The docs aren't clear on things
Only update the fields you need. And query the db to fill the other fields. You can also just not send the field key and it won't change the value
This will also reduce the bandwidth, because if the user clicks a button it doesn't send ALL the page inputs
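Something like this on the mutation side (sketch; the field names are made up):

```ts
import { v } from "convex/values";
import { mutation } from "./_generated/server";

export const updateProject = mutation({
  args: {
    projectId: v.id("projects"),
    // Optional args: the client only sends the fields the user touched.
    name: v.optional(v.string()),
    description: v.optional(v.string()),
  },
  handler: async (ctx, { projectId, ...fields }) => {
    // db.patch only writes the keys that are present, so omitted fields
    // keep their current value.
    await ctx.db.patch(projectId, fields);
  },
});
```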
Yeah, just crazy that the tiniest amount of data is causing the bandwidth to skyrocket. I'll probably add debouncing to it as well to limit requests.
Adding a debounce might slow your app too...
If a user clicks a button, it will wait for 300ms before sending the update and then 300ms-700ms to get the update. So whenever a user clicks on a button it will take 1s to update the UI
You should even add optimistic update which will indeed reduce the bandwidth BUT will add a complexity layer
Well I got the idea from this thread: https://discord.com/channels/1019350475847499849/1318355759561445447
That's a bit different in here
You're querying not mutating
Wait you're right
Wait I am too
Throttling Requests by Single-Flighting
For write-heavy applications, use single flighting to dynamically throttle requests. See how we implement this with React hooks for Convex.
You indeed have to have some optimistic update
Let's say you click 3 times within a second (1,2,3)
You only want to send 3, but you want the user to show (1,2,3)
So you have to unlink the input from convex (by using defaultValue instead of value)
Use a debounce as you say
And set a key={value} to update the input once the data changes
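A rough sketch of the debounce part (the delay and api path are placeholders; the Stack post linked above has a fancier single-flight version):

```ts
import { useRef } from "react";
import { useMutation } from "convex/react";
import { api } from "../convex/_generated/api";
import { Id } from "../convex/_generated/dataModel";

export function useDebouncedSetScore(subjectId: Id<"subjects">, delayMs = 500) {
  const setScore = useMutation(api.scores.setScore);
  const timer = useRef<ReturnType<typeof setTimeout>>();

  return (score: number) => {
    // Each change resets the timer, so only the last value within the
    // window actually gets sent to Convex.
    clearTimeout(timer.current);
    timer.current = setTimeout(() => setScore({ subjectId, score }), delayMs);
  };
}
```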
Oh wow great find
Because if another tab is open and you're using defaultValue, it won't update that tab once the value changes
It will only update when the key={value} (which comes from Convex) changes
You'll have a 1s delay between your tabs but you will drastically reduce the bandwidth
Yeah I don't mind the delay at all, it could be 10s for all I care. It just greatly improves the UX for my site as I implement it as an "Auto Save" feature.
Yeah, obviously, and I feel like not having to invalidate all the queries when performing a mutation is a huge win
But, the trade-off is that we REALLY have to code better
Yeah, I was hoping for a smooth copy/paste and change of some functions for it to operate normally. Definitely learnt my lesson after this though
@Jamie Now you made me wonder if this function can even be cached at all by your system ?
I'd better split it into getByRunIds, getByIds, etc
I know you've been going deep on this, but let's recap:
The function above is cached based on:
- Its arguments. If one function is called with runIds and another with projectId, they're cached separately.
- What ranges of documents it reads - not only the documents it gets back, but the range. So if you ask for the "first 100" and one is inserted at the beginning, it's invalidated.
- What auth token is retrieving it (iff it reads the ctx.auth)
- Including any runQuery it runs. Doing ctx.runQuery from within a query doesn't currently do sub-query caching (iirc).
So when one of the documents you query for updates, it will invalidate and re-run.
If you're iterating and going through many auth tokens, it will invalidate and re-run.
These are all requirements for correctness ("perfect" caching).
You're smart to separate your high frequency updates so you don't invalidate big queries.
Also limiting the "page size" so when one document changes you only have to fetch a smaller number of documents.
I cover some of this in this Stack post
A couple notes:
- You already fetched the tasks in project above from the tasks table, so there's no need to do an individual ctx.runQuery.
- If there is, you should factor out the function to get a task (rough sketch below). Calling it via ctx.runQuery has no benefit currently.
- One benefit of calling the function directly is I believe the db.get(taskId) will get a cached value since it was already read by the project query.
- Another benefit is it doesn't spin up a new isolate (isolated JS container)
- If you return IDs to a client and it subscribes to them, those queries won't be invalidated unless the task changes. However, if the query that gets the IDs is reading all the documents to get the IDs, that one is still invalidated and doing all the reads, so doing super granular queries on the client doesn't buy you much unless you have a bunch of sidecar data for each task.
- If you don't want query subscriptions on the client and want to have an explicit refresh button or polling or some other inconsistent-but-maybe-cheaper approach, you can always use await client.query instead of useQuery.
Queries that scale
As your app grows from tens to hundreds to thousands of users, there are some techniques that will keep your database queries snappy and efficient. I’...
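Roughly what that factored-out helper could look like (a sketch; the helper and table names are made up):

```ts
import { v } from "convex/values";
import { query, QueryCtx } from "./_generated/server";
import { Id } from "./_generated/dataModel";

// Plain helper: no extra isolate, and the db.get can be served from the
// query's existing reads if the document was already fetched.
export async function getTaskHelper(ctx: QueryCtx, taskId: Id<"tasks">) {
  return await ctx.db.get(taskId);
}

export const getTask = query({
  args: { taskId: v.id("tasks") },
  handler: (ctx, { taskId }) => getTaskHelper(ctx, taskId),
});

export const getProjectTasks = query({
  args: { taskIds: v.array(v.id("tasks")) },
  handler: async (ctx, { taskIds }) => {
    // Call the helper directly instead of ctx.runQuery(api.tasks.getTask, ...).
    return await Promise.all(taskIds.map((id) => getTaskHelper(ctx, id)));
  },
});
```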
To recap your initial concerns:
- Paginated query caching only working sporadically — sometimes it only works once every few days.
- The cache helper's paginated query solves the cache busting behavior, and I assume most of this was due to refresh tokens / other invalidation mentioned above.
- Auth sessions refreshing too frequently, which seems to unnecessarily invalidate queries. According to the documentation, refreshes should happen every hour, but as shown in the screenshot, one occurred after just 40 minutes. (It even refreshed the authRefreshTokens more than 2,000 times in less than a month, even though the tokens are not expired.)
- Have you gotten to the bottom of this? I recall Convex Auth used to refresh the token on each page load, so if you're constantly refreshing in a dev environment, this would track. Not a production issue I've heard of for any large-scale current customers, so this would be surprising.
- Queries getting invalidated without any mutations. As you can see, nothing in the app has changed, yet almost every query is being re-run.
- Is this explained by auth invalidation / args / changes to unrelated fields?
- Table updates invalidate all queries referencing that table, even if the updated field isn't part of the query result.
- As discussed above, a document update doesn't invalidate a whole table, but it does invalidate queries that read the document even if the field isn't queried
I do agree that selective field querying is a good optimization to add btw - but currently the query ranges are stored in a highly efficient data structure that is at the document range granularity to compute invalidations.
One thing to do about the progress query: query the progress alone from the client, as its own separate query. By the look of the getProjectAndProgress query above, the whole thing is invalidated when the progress changes.
Hey @Ian,
Just to make sure:
- I understand how the caching works (based on the user ctx, and the args)
- I refactored the getTasks into multiple queries
- as you mentioned, the user account gets refreshed frequently, which seems to be normal in dev mode based on what you said
- BUT, as shown in the screenshot, queries don't get cached EVEN when the user is not refreshed
The database doesn't change AT ALL, yet, when I refresh the page sometimes it gets the cached version, sometimes not
Also, the "auth caching invalidation thing" would be correct IF all of my queries were invalidated
But that's not the case: some are, some aren't, then they are, then they aren't, and both are using the user ctx
And the getCover wasn't being called for different projects on each invocation?
It is called for different projects, but the project didn't change
Nor any table in my database
And they were cached just 10s before
And you didn't have many tabs open with different auth tokens?
Can it be a Next.js setup error? I believe that if the args sent and the user token are the same there is no reason to uncache things, so it shouldn't come from Next?
When your auth token changes, each project query will be a cache miss
1 client, 1 browser, no action, no mutation
BUT then why are my other queries using the user ctx correctly cached when the auth token changes?
Also here, covers are cached even on token refresh
As I mentioned, caching makes no sense at all; there is no recognizable pattern in my project
What I would do personally would be to start simple and try to investigate what the difference is. It seems that doing many project fetches for different project IDs accounts for why there are many projects being called after an auth invalidation.
You can also try commenting out the auth parts temporarily to see if it reproduces without auth
It's not only this query, ALL my queries are invalidating for no reason
Even actions
Cached actions I mean
Actions are never cached
With the helper, inner cached functions
Used in actions
I understand your frustration with not being able to make sense of caching. I have recommendations for how to diagnose it, if you're willing to try them.
Just take a look at this, pick any query (getAccount, getCover, getProjects, etc) and just check each time they are being called. Try to guess whether it will be cached or not, you will fail each time 🥲
Yep Tell me 🙂
Also, publicQueries (not using any user ctx) are getting uncached too for no reason
That image was downsampled by Discord and is very hard to read, but I believe you that without more context it's hard to figure out why some things are cached and some things aren't
right click : open link
this will help !
it did! thanks.
One thing to note that I think will help explain what we're seeing: when you call ctx.runQuery from an action, I believe it will cache the sub-query.
So one action could have seeded all of the images. And the sub-query isn't touching auth, so it was cached for all users at once. This is different from ctx.runQuery within a query/mutation, if I'm recalling correctly
what I tried:
- removing inner queries
- removing authQueries in favor of publicQueries (so no user context)
- removing args (making no-args queries to check if caching works)
and surely many other things; whatever you mention, I think I would have already tried it
ok, well, actually my action is calling stripe and nothing else, so... it doesn't touch any other part of the app
How many of these are being run from Next.js server-side vs. client-side?
I'm curious how many are using the websocket client vs. http client
mhhh, I'd say 90% are websocket
Are the getCover calls over HTTP?
nope, using useQuery
Have you tried console.log of the args to see if each getCover is for the same vs. different args/ documents it scans?
I even tried queries with no args
they are getting uncached too
example: the getProfile query ALWAYS retrieves the same profile (and has a plain args text in it that NEVER changes)
I have my projects A, B, C and it retrieves my covers for projects A, B, C, so it's not the same args, but when my queries encounter the same args they should retrieve the cached version, right?
If they're calling query->query, there isn't sub-query caching last I checked
we cannot use internal queries ?
if so, why is it cached sometimes even with internal queries ?
So you could have many queries calling the same query and it wouldn't be cached. I generally strongly advise against doing ctx.runQuery from within mutations or queries
then why is it cached sometimes ?
If it's called from actions that might populate the cache maybe? I wouldn't over-optimize here and just do the query itself. The bandwidth will likely cost less than the function call
also my getCost is not using ANY sub-queries, nor authQueries, nor anything
it's getCost:client in the screenshot (I renamed publicQueries to unauth, and the default query({}) to publicQuery({}) for better naming)
it just returns plain JSON
Best Practices | Convex Developer Hub
This is a list of best practices and common anti-patterns around using Convex.
I have no action other than the stripe one in the screenshot, and the stripe one is only doing a stripe api call
Yes, but it says that it runs in the same context/transaction, so it should be cached?
no that's not the behavior - the same transaction means it'll read a consistent view of the database, but any in-memory caching is not shared between isolates
btw, just to step back here, since I know some of this frustration is coming from a hesitation in launching to production. One thing that can help take the pressure off the current investigation would be to figure out the impact of the caching behavior on doing a launch as-is.
What's the actual risk / cost we're talking about here? Do you expect nontrivial users / usage on day one? Would it exceed the included pro limits and be a nontrivial amount of usage pricing? I share your curiosity in understanding the internal caching logic, as well as understand there's a desire to predict pricing. But for most customers this ends up being the least of their concerns. It may end up that these hours of debugging cost more in salary than you're saving with any usage pricing above the included limits.
And then the investigation can be more paced - making a simple repo so you can play around with different behaviors and do any optimization you need based on real user usage
Based on my single usage, our monthly cost that we expected to be $30 would actually go up to $200-$600
I used 1GB of bandwidth in 4 days: 5MB write, 1GB read
Ah so this is projecting your dev usage - and then multiplying by some number of users?
low average paid users, we made a huge update, so I cannot really compare how the free users will behave
Ah, you have many existing users. That makes more sense
I have around 3,000 users
Congrats!
thank you 🙂
but yeah, if none of my users are hitting the cache the bandwidth cost will be insane
I had redis and supabase and we were around 6gb of bandwidth per month FOR ALL USERS
but here I used 1gb in 4 days JUST ME
so I'm gonna check the internalQueries and mutations things and extract them into helper functions,
but as mentioned
this is not using any internal queries, mutations, etc
and yet, not cached
Have you investigated the usage graphs in the dashboard btw? Under project settings
That can give you a better rundown of usage in the past day, since you've been making optimizations around project progress invalidation etc.
And more info about which functions are most worth optimizing
at first I thought I lowered the bandwidth usage, but there are no real improvements
the usePaginated ones were the biggest problems; now that we have a usePaginatedCached helper on the front end it might help, but still, all the other functions will add up, and with 3000 users it will burn our bill
I would:
1. Reduce the _internal pattern to make helper functions instead
2. Split out the project progress query to not also read the project and be queried straight from the client, not from other queries, if this isn't already the case.
3. Make the project progress a public query, provided there's no secret data in there. In general if there are endpoints that can be shared between users that don't expose sensitive data, you'll benefit more from caching without them being authenticated.
And also a reminder from above in case it got lost:
- If you don't want query subscriptions on the client and want to have an explicit refresh button or polling or some other inconsistent-but-cheaper approach, you can always use await client.query instead of useQuery and choose how often to fetch it (rough sketch below).
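A sketch of that polling approach (the interval and function path are placeholders):

```tsx
import { useEffect, useState } from "react";
import { useConvex } from "convex/react";
import { api } from "../convex/_generated/api";
import { Id } from "../convex/_generated/dataModel";

export function usePolledProgress(projectId: Id<"projects">, intervalMs = 10_000) {
  const convex = useConvex();
  const [progress, setProgress] = useState<number | null>(null);

  useEffect(() => {
    let cancelled = false;
    const tick = async () => {
      // One-shot fetch: no subscription, so no invalidation-driven refetches.
      const result = await convex.query(api.projects.getProgress, { projectId });
      if (!cancelled) setProgress(result);
    };
    tick();
    const id = setInterval(tick, intervalMs);
    return () => {
      cancelled = true;
      clearInterval(id);
    };
  }, [convex, projectId, intervalMs]);

  return progress;
}
```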
the project progress has been extracted since then! 🙂
And that's been sent to the client directly, ya? not via a runQuery?
what's the point of not using authenticated queries for the project if only the user can read it ?
If it's user-specific and not shared between users, not much. just possible caching between auth refresh
yeah I'm already using queries in server components for that
using useQuery for this, since I want the browser to be updated upon changes
well that's not the problem here since in my screenshot even after auth refresh it's still cached........ AND sometimes not 🥲
And sometimes no auth refresh and not cached 🥲
Also, I'm doing this for a public library (of images), and it doesn't get cached either...
whatever, I'm gonna check tomorrow (it's 12am); I'll extract the internal queries into helpers, and I'll update you
we were planning to make the migration during this week... let's see if it fixes everything
(spoiler: I have HUGE DOUBTS, given I have tried not using internal queries, using public queries, and it still didn't work)
Can it be on the Next.js side (a <Provider/> problem)? Do you have logs of the data sent to your server for my queries?
There are log streams for granular usage data, but to log inputs/outputs, you need to add logging yourself.
I'd recommend adding extensive logging, dumping it to a text file, and doing some investigation on that.
NextJS could certainly be a culprit here in some way.
What would be the things to log ?
- args
- user token
that's it ?
I'll remove any unnecessary providers (posthog, etc) and check
and just sanity checking - you don't have streaming import set up / modifying things directly from the dashboard / etc?
Good luck getting to the bottom of it tomorrow - it's a holiday here and I'm also keen to get back to it.
If you have a simple repro in a repo we could play with, that would help. We have a lot of tests and dashboards around caching behavior, so an insidious nondeterministic bug there feels unlikely.
But there well may be something unexpected, like running into LRU cache size limits, or possible optimizations around not caching things based on some heuristics that I'm unaware of. Understanding how that affects a project like yours is valuable.
Hopefully you get enough confidence to start rolling it out soon - that's always exciting
Mhhh, nope, I didn't dig into the streaming stuff so I don't think I have anything like that set up. I also didn't interact with the Convex dashboard during the screenshots
Axiom is pretty slick and my platform of choice for log stream data. You can build nice dashboards & alerting just from logging JSON from your code / string parsing.
Enjoy your holidays!
I'll try to dig into it (and hopefully find the problem). I'll keep you updated with it, I hope this can help people
Minor update :
- I extracted most (if not ALL) of my _internal queries into function helpers.
- I didn't extract my _internal mutations since they ... mutate anyway...
I'm gonna run like this for the next few days and see if there are noticeable cache issues
I was thinking about helpers to make it easier to separate the args/handler definitions from exposing something publicly...
What I do today:
which allows calling things directly and sharing the args types. But it means writing the types explicitly for the handler. I wonder how useful this would be to folks:
no typescript typing, with full type safety!
I made a GitHub Issue for it, for anyone who wants to chime in
Oh @WeamonZ I just remembered another source of caching that could be making a big difference in Dev for you!
The cache is also invalidated if the function changes - so args, auth, and function definition. (oh, and also if it reads an environment variable that changes - basically anything it reads + the code it runs)
So during a dev loop you might be iterating on code, and it'll cause re-execution of queries every time npx convex dev syncs the code. Obv. this is less of an issue in prod.
And I don't know if we have the cache key be a hash of the function and all its dependencies, or if we just invalidate all queries after a push.
Interesting!
I extracted everything into a functions folder for each "services" (auth, subscription, projects, etc)
And I kept my queries into a "router" folder.
In the end I think I prefer extracting the helpers into their own files so that they don't share too much with the router (which handles auth, etc) 🤔
I think enforcing a way to code is better for beginners, to be fair. I'm more on the opinionated side, rather than giving 304903 ways of doing the same thing.
I prefer the first way; it explicitly allows extracting the helper from the query.
Also, the second format has downsides such as exposing/requiring custom query args
yeah i'd use this for places where you're expecting to expose them as functions, aside from other helper functions. defining args when they're not actually validated is funky
Example:
I prefer to explicitly require a userId in my helper rather than relying on the user context
this helps dissociate the helper from the user ctx
And this reduces the amount of code from getUserProfile and getCurrentUserProfile to just getProfile, for example; this deduplicates code
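For example, a sketch of that shape (table/index names made up; assumes Convex Auth's getAuthUserId):

```ts
import { v } from "convex/values";
import { query, QueryCtx } from "./_generated/server";
import { Id } from "./_generated/dataModel";
import { getAuthUserId } from "@convex-dev/auth/server";

// The helper takes an explicit userId instead of reading ctx.auth,
// so it works for both "current user" and "any user" queries.
export async function getProfile(ctx: QueryCtx, userId: Id<"users">) {
  return await ctx.db
    .query("profiles")
    .withIndex("by_user", (q) => q.eq("userId", userId))
    .unique();
}

export const getCurrentUserProfile = query({
  args: {},
  handler: async (ctx) => {
    const userId = await getAuthUserId(ctx);
    if (userId === null) return null;
    return await getProfile(ctx, userId);
  },
});

export const getUserProfile = query({
  args: { userId: v.id("users") },
  handler: (ctx, { userId }) => getProfile(ctx, userId),
});
```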
Mhh ok, I'll keep that in mind, but in my tests I didn't make any changes to the codebase, to make sure no cache was revalidated
@WeamonZ hey wanted to follow up - we've been doing some investigation and you're right, there's something odd going on here. It seems to especially be prevalent with endpoints that use the built-in .paginate invalidating the query cache.. we're digging in and don't have a root cause but wanted to say sorry & thanks for hanging in there. Hopefully the cost is still acceptable for shipping, and the good news will be it'll only get cheaper once we get a pagination fix.
I haven't looked into it, but I strongly suspect that using the paginator / stream helpers will not have this issue as they don't use the built-in .paginate under the hood.
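Untested, but the shape would be roughly this (from memory; double-check against the convex-helpers docs before relying on it):

```ts
import { paginationOptsValidator } from "convex/server";
import { paginator } from "convex-helpers/server/pagination";
import { query } from "./_generated/server";
import schema from "./schema";

export const getImagesPaginated = query({
  args: { paginationOpts: paginationOptsValidator },
  handler: async (ctx, { paginationOpts }) => {
    // paginator() pages over plain index ranges instead of the built-in
    // .paginate, so it shouldn't hit the journalled-query caching issue.
    return await paginator(ctx.db, schema)
      .query("images")
      .order("desc")
      .paginate(paginationOpts);
  },
});
```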
It's not gapless / perfect pagination though - so I'd do some testing with it. e.g. you might want to be explicitly setting endCursor on the client-side to avoid inserts/deletes that happen at the page boundary
Aaaah! I'm glad you found something!
Indeed, I used to have a lot of paginated queries at first (especially to display the user library), but the caching was WAY WORSE than the current one (that I think is still bugged... Or at least in dev..). I reached the 1Gb bandwidth in like 3-4 days... I decided to remove most of them in my app until it's fixed because the costs would have been absurd.
Now as for the rest, I really hope my app won't act like in dev.
I still have 200mb of reads in 2 days.. So let's say I only have around 600 users visiting during the month, each using 30% of the bandwidth I use. It would be:
600 x 0.3 x 100mb/day x 30 days = 540,000mb
So around 540gb? When my current app (which uses the same system, just based on Postgres, but invalidates things at the same time) only uses 6gb for thousands of users.
(540gb - 50gb included) x $0.20 = 490 x $0.20 = $98
I would have to pay ~$120 each month for so few users
I don't quite understand how the bandwidth is calculated; how can I go from 6gb to 500gb? I'm using indexes everywhere...
I read an article about the pagination system but I didn't quite understand it all, ahaha
Have you set up Log Streams? That has per-function call db bandwidth information. And the homepage tries to surface slow / heavy queries, and the usage graphs try to break it down by function, but I agree knowing the performance is important here
Not yet, you mentioned Axiom and I'll check up on that.
I refactored some code and I'm seeing more and more cached functions.
I had a getUserSubscription querying the current subscription at runtime (like expiration date > Date.now()); I'm now relying on a scheduler to update it, so it's hitting the cache more
Merged some mutations into a big one (when the user uploads files, I now update the database all at once instead of per file). This reduces the amount of cache invalidations too
As for the paginated queries, they literally never hit the cache
Actually, the biggest improvements to caching have been related to payments. There was some mention of cache eviction, which might have been part of the issue — or maybe I'm just imagining things
By the way, when I try to modify the date range to start at, like, 26-29, IT WON'T LET ME; instead it sets it to 24-26. I can't modify the startAt. So... it's pretty hard for me to see the changes in bandwidth

But the read/write ratio is still pretty insane 😭😭😭😭.

Also, weirdly, now I don't see the cache invalidation upon session refresh in dev.
As we mentioned previously, this would/could cause query invalidation when the token gets refreshed, but it doesn't seem to be the case anymore 🤔
Which is a good thing, cause it's unwanted
Nevermind, it still occurs 🤣😭
Why tie queries to the refreshToken rather than the sessionAccount tho 🤔
yeah it's the auth jwt (the refresh token isn't sent - I think is saved as an http-only cookie?). the rationale being that when the JWT expires we technically don't know if you still have access. But I'd LOVE to reconcile the JWT validation, so if it results in the same user & fields, it will be a cache hit. The challenge is when the sub field is a session ID vs. just a user ID. So because the session ID changes, the query is invalidated b/c that's an argument.
Jamie Turner (@jamwt)
Mistake I just made: I didn't give our power users the benefit of the doubt.
I swore our caching system was tested within an inch of its life, and was skeptical of reports of a regression where paginated queries aren't caching. Turns out I was wrong.
Will fix ASAP.

GitHub
Cache journalled queries under both the original and final QueryJournal (#37676)
🎊🎊🎊🎊🎊🎊🎊🎊🎊🎊
So glad you found the source of the problem !!
Thank you for trusting me and for being so open to discussion!