conradkoh
conradkoh11mo ago

Stale while revalidate for re-subscribed queries

Between page navigations, pages mount queries that subscribe or unsubscribe. This results in loaders always triggering on a new subscription when a user navigates.

Proposal: To improve the time to render, serve stale data from an in-memory cache while the query subscription is still pending, especially if the user was on that page in the same session. In this case, we use non-persistent storage for the objects, so a session is assumed to last until the user refreshes the page.

Here's a proof of concept: https://github.com/conradkoh/baby-tracker/blob/master/apps/mobile/lib/convex/use_query_swr.tsx#L5-L46

This solution duplicates in memory the objects returned from the server. If there were first-class support for such caching from the library, similar to how react-query does it, but with real time features, it would be amazing.
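The proposal can be sketched framework-free as a small stale-while-revalidate helper. This is an illustrative sketch, not the API of the linked proof of concept: the cache key combines the query name with its serialized arguments, and the stale value is served only while the fresh result is still pending (i.e. `undefined`).

```typescript
// Session-scoped (in-memory) cache for query results, keyed by query
// name plus serialized arguments. All names here are illustrative.
const resultCache = new Map<string, unknown>();

function cacheKey(queryName: string, args: Record<string, unknown>): string {
  // JSON.stringify is key-order sensitive, so sort keys for a stable key.
  const sorted = Object.keys(args)
    .sort()
    .map((k) => [k, args[k]]);
  return `${queryName}:${JSON.stringify(sorted)}`;
}

// Serve the stale value (if any) while `fresh` is still undefined,
// and record the fresh value once the subscription delivers it.
function staleWhileRevalidate<T>(
  queryName: string,
  args: Record<string, unknown>,
  fresh: T | undefined
): T | undefined {
  const key = cacheKey(queryName, args);
  if (fresh !== undefined) {
    resultCache.set(key, fresh);
    return fresh;
  }
  return resultCache.get(key) as T | undefined;
}
```

A hook would call this with the current `useQuery` result: the first visit renders a loader (nothing cached), and revisits within the session render the last-seen value until the subscription catches up.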
ballingt
ballingt11mo ago
This is cool to see, there's definitely room for this approach. What do you think of staying subscribed instead of using stale data? Like useQuery(api.foo.bar, {}, {staySubscribedFor5SecondsAfterUnmount: true}) Do you want this for identical queries, or queries with changing arguments?
conradkoh
conradkohOP11mo ago
i’m getting really nice ux since making this change, because it also helps if the app goes into the background for a long period of time. in my case this is a mobile app so it makes sense. I would say i haven’t rationalized whether the caching in the paginated hook is a good idea - I’m leaning towards it being a bad one haha. but for the normal useQuery one, I would say that a lot can be gained from the caching mechanisms common in existing web systems. this typically centers around users who have been to the app before, but either
1. have an internet connection that went away
2. have come back to a new session, but data was not updated since last seen (backend cache would likely be more appropriate in this case!)
one super exciting thing about convex’s architecture is that the whole database is one with the backend system. so i think the cache invalidation optimization should be possible (even in userland). the dream flow would be
1. compute query on first subscribe
2. serve from backend cache for all subsequent reads (aka 1 read incurred for a compute that may have taken large row scans)
3. invalidate cache when dependent data has been mutated. there are 2 possible ways to handle this after:
   a. let it get recomputed by the next subscriber who loads the query
   b. update on write (for fast initial load performance)
of course these would have to assume some things - like function purity. I personally would like to see the trade offs made explicit (just cache it even if the Date constructor is used or external systems are called that may have changed - maybe a new backend pureQuery handler constructor that fails if anything external is used), which unlocks super nice optimized performance that can only be maintained by an event driven db. sorry, realized it could be a little abstract 😅 i hope it makes at least some sense 😂 will share more if I eventually implement some of these with examples. also for this:
Do you want this for identical queries, or queries with changing arguments?
I would say identical arguments 🙂 that’s how the cache key is being constructed in the example.
ballingt
ballingt11mo ago
What you're describing is approximately how we intend for this to work! But in your case it sounds like our React client library is interacting poorly with your component structure. When you do page navigations, are these real navigations? or a client-side routing solution, where the page component where the useQuery lives is being invalidated, but shouldn't be?

Another way to think about "cache" is a live cache that is never stale. That's what I mean by subscribing: as long as an application expresses interest in a query key (subscribes to it) the query result will be cached and updates will be pushed to it. Big picture I absolutely see your point about a global cache to avoid throwing away the current query result for a query key — but I want to call this "staying subscribed" so we don't have to give up consistent app-wide updates, just to communicate the philosophy we try to express in our docs and libraries.

So the solution I'd want is a more powerful way to express which queries you care about. Since the component that owns the query is being unmounted in your client navigation, we need to persist the query by (today, awkward I agree) moving it up the component tree, or by holding on to that subscription another way. One way looks like: stay subscribed to a query for 5 seconds after unmount
const client = useConvex();
useEffect(() => {
const unsub = client
.watchQuery(api.foo.bar, {})
.onUpdate(()=>{});
return function cleanup() {
// unsubscribe 5 seconds after unmount
setTimeout(unsub, 5000);
}
}, [])
By not unsubscribing you could express that you want to stay subscribed forever, or with some more ergonomic APIs you could describe the conditions in which you'd like to be subscribed.

There's also the notion of not being subscribed (so saving on function invocations) but keeping the last received value, so that there's something to show when you do come back to a query. This has the cost of keeping the value in memory but doesn't have the cost of watching the result live. This could make sense for cost-saving, but it is harder to reason about stale data like this. Currently we think it should be an explicit choice to opt into stale data, because now data from one useQuery won't be from the same timestamp as data from another useQuery.
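The "stay subscribed for a grace period after unmount" idea can be sketched framework-agnostically as a reference-counted registry that tears a subscription down only after a delay with no remaining consumers. The class and method names are made up for illustration; this is not a Convex API:

```typescript
type Unsubscribe = () => void;

// Keeps a subscription alive for `lingerMs` after its last consumer
// releases it, so quick navigations away and back reuse the live result.
class LingeringSubscriptions {
  private refCounts = new Map<string, number>();
  private unsubs = new Map<string, Unsubscribe>();
  private timers = new Map<string, ReturnType<typeof setTimeout>>();

  constructor(private lingerMs: number) {}

  // Acquire a subscription for `key`; `subscribe` runs only when no
  // live subscription exists for that key.
  acquire(key: string, subscribe: () => Unsubscribe): void {
    const pending = this.timers.get(key);
    if (pending !== undefined) {
      clearTimeout(pending); // a consumer came back within the grace period
      this.timers.delete(key);
    }
    const count = this.refCounts.get(key) ?? 0;
    if (count === 0 && !this.unsubs.has(key)) {
      this.unsubs.set(key, subscribe());
    }
    this.refCounts.set(key, count + 1);
  }

  // Release one consumer; actually unsubscribe only after the grace
  // period passes with no new consumers.
  release(key: string): void {
    const count = (this.refCounts.get(key) ?? 1) - 1;
    this.refCounts.set(key, count);
    if (count > 0) return;
    this.timers.set(
      key,
      setTimeout(() => {
        this.unsubs.get(key)?.();
        this.unsubs.delete(key);
        this.timers.delete(key);
      }, this.lingerMs)
    );
  }

  isLive(key: string): boolean {
    return this.unsubs.has(key);
  }
}
```

A React hook could call `acquire` on mount and `release` in its cleanup, passing `() => client.watchQuery(...).onUpdate(...)` as the subscribe callback.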
conradkoh
conradkohOP11mo ago
thanks for taking time to read and share your thoughts. i think i understand you better now! rather than introduce concepts of caching, and then holding cached objects in memory, we want to just hold one reference and a subscription that updates the reference. sounds like a good idea, sorry i didn’t understand this sooner.

do you see any limitations to this paradigm for connections that have gone away for a long time, where there is no way for the client to have known what the up to date value is? I guess in this case it should always go to undefined to indicate loading before the data actually comes back, correct? perhaps this can be seen as a different use case altogether from the routing one, since the time between calls is vastly different. I might need to relook at how i’m thinking about this - you shared a pretty good alternative I hadn’t considered before. maybe i’m reaching too quickly for stuff that is familiar!
ballingt
ballingt11mo ago
Currently these queries would never switch to undefined; the connected state of the client would just change to disconnected. I think we need more info here! And it's not a complete offline solution. Glad this makes sense to you - hold onto your high expectations here (eg maybe you do want cache without subscription!) but this is the reason we've been approaching it from this direction.
conradkoh
conradkohOP11mo ago
mm yeah. for now, I've solved the flickering issue in the app, and I guess I don't mind paying the overhead of storing it in memory without garbage collection at this point (the app is super small). so it's still valuable for me, and I'm sure there are others who would have the same use case. but also this is my first react native app, and I haven't really tried convex in the web context yet 🙂 so perhaps the needs for each kind are different too! but I think on a high level, something that seems to be missing is a way to ensure that the first render is super fast. do you happen to have any recommendations on how to optimise this (or what the go-to design pattern would be here in the future with edge support)?
ballingt
ballingt11mo ago
of course these would have to assume some things - like function purity. I personally would like to see trade offs (just cache it even if Date constructor is used or external systems are called that may have changed - maybe a new backend pureQuery handler constructor that fails if anything external is used), which unlocks super nice optimized performance which only can be maintained by an event driven db.
Currently Convex uses a custom JavaScript runtime to enforce this purity so that there's no trust involved, and queries are not allowed to access external systems; today all queries are pureQuerys.
but I think on a high level, something that seems to be missing is a way to ensure that the first render is super fast. do you happen to have any recommendations on how to optimise this (or what the go to design pattern would be here in the future with edge support)?
We've prototyped systems to do edge caching, but regardless of those the current approach to first render being super fast is to server-side render.
conradkoh
conradkohOP11mo ago
Currently Convex uses a custom JavaScript runtime to enforce this purity so that there's no trust involved; and queries are not allowed to access external systems; today all querys are pureQuerys.
interesting, sounds like a great approach. however, I believe that objects like Date or functions like Math.random are not pure, so not really all queries today are pure queries. but if indeed they are completely pure, then caching on write sounds like it can be done safely. I'm wondering if a workaround can be to:
1. write the logic I might typically have as an action instead
2. create a new table that can act as a cache as a single row (although this will lead to complexity managing the sizes if the payload grows)
3. trigger the action to update the cache on write instead of on read
this will likely save some row scanning cost and time as well, perhaps? sounds a little convoluted, but do you think that's a good idea?
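Step 3 of this workaround (compute on write rather than on read) amounts to maintaining a materialized summary alongside the raw rows. A minimal in-memory sketch of that shape, with plain maps standing in for Convex tables and all names invented for illustration:

```typescript
type Message = { channel: string; length: number };

// "Raw rows" and the single precomputed "cache row" per channel.
const messages: Message[] = [];
const summaryByChannel = new Map<string, { total: number; count: number }>();

// On write, update both the raw data and the summary, so reads never
// have to re-scan the rows (this is what the mutation/action would do).
function writeMessage(msg: Message): void {
  messages.push(msg);
  const s = summaryByChannel.get(msg.channel) ?? { total: 0, count: 0 };
  summaryByChannel.set(msg.channel, {
    total: s.total + msg.length,
    count: s.count + 1,
  });
}

// O(1) read from the precomputed summary instead of scanning `messages`.
function readSummary(channel: string): { total: number; count: number } {
  return summaryByChannel.get(channel) ?? { total: 0, count: 0 };
}
```

The trade-off is exactly the one named above: every write pays a little extra, and the summary row grows in complexity with the payload, in exchange for reads that never scan.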
We've prototyped systems to do edge caching, but regardless of those the current approach to first render being super fast is to server-side render.
I guess this predominantly assumes a web-based implementation, that the server has a runtime (as opposed to those using just a frontend and hoping to use convex as THE backend), and also that the server has some sort of caching (since server-side renders will still likely be slow, given that there is no real time query subscription from something like NextJS that is running off functions). am I missing something?
jamwt
jamwt11mo ago
1. we use a known random seed, and fix the date at a known time, in order to preserve purity of queries
not sure I understand your point about edge caching. you can imagine all convex-managed edge computing regions as "clients" subscribed to queries on behalf of any connecting browser. therefore, they have all query results already available locally and so can prerender very quickly
in general, yeah - the goal is to use convex as the backend and we will just take care of all this. these strategies don't rely upon another server side environment
conradkoh
conradkohOP11mo ago
we use a known random seed, and fix the date at a known time in order to preserve purity of queries
thanks @jamwt! there was another thread where it was mentioned that if the Date was used, then results are not cached. https://discord.com/channels/1019350475847499849/1209831020820561931/1209858175793242162 if say the Date was fixed at a point in time, then the purity of queries should be preserved and it should be safe to hit the cache?
jamwt
jamwt11mo ago
we fix the date for a period of time. I think it's currently 5 seconds or 10 seconds, can't recall off-hand. which means queries that depend on the date will recalculate every ~N seconds (5-10)
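One plausible way to read this (an assumption about the mechanism, not Convex's actual implementation) is that the clock is quantized to a fixed window, so every query evaluated in the same window sees the same "now" and remains deterministic, hence cacheable, within it:

```typescript
// Quantize the real clock to a fixed window. The 5-second window
// mirrors the rough figure mentioned above; the mechanism is a sketch.
const WINDOW_MS = 5_000;

function pinnedNow(realNowMs: number): number {
  return Math.floor(realNowMs / WINDOW_MS) * WINDOW_MS;
}
```

Within one window every call returns the same value, so a cached result only needs to be invalidated (and the query re-run) when the window rolls over.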
conradkoh
conradkohOP11mo ago
you can imagine all convex-managed edge computing regions as "clients" subscribed to queries on behalf of any connecting browser. therefore, they have all queries results already available locally and so can prerender very quickly
oh okay, if this is the case I think it's much clearer to me! it'll be sort of pre-computed close to the edge so that clients can have that data fast - very nice 🙂
which means queries which depend on date will recalculate every ~N seconds (5-10)
this is really interesting. so do you mean a new state will be automatically recomputed every 5-10 seconds (e.g. consume one function invocation), and then check if the value was changed? and if there was a change, then subscribers to the query would get the value automatically? it sounds pretty expensive to do that though..
jamwt
jamwt11mo ago
correct. as long as there is at least one subscriber to this particular query (query name, params, new Date() in body), the date for that given set of (query name, params) is invalidated every N seconds. this will cause the function to be re-invoked with a newer new Date value.
re: expensive, i suppose it depends on the use pattern. it's somewhat rare for queries to call new Date though. mutations, yes, but queries, less so.
conradkoh
conradkohOP11mo ago
I was just trying it out to see the current behavior using a small test query
export const test = query({
args: {},
handler: async (ctx, args) => {
const date = new Date();
return date.getTime();
},
});
I tried subscribing to this in the client and can confirm that it doesn't get updated past the initial render.
jamwt
jamwt11mo ago
hmm, yeah. @sujayakar can probably clarify. it's possible the behavior has changed in the last year
conradkoh
conradkohOP11mo ago
also strictly speaking, I do think that allowing dates makes it not really pure, even though it can be managed to some degree by returning a fixed date. also, fixing Math.random to a specific seed will likely create more problems imo - in the worst case it can result in security issues. so I still think that the current query is not technically pure, since the output is not deterministic. there's nothing wrong with this tbh, but you'll be forced to take tradeoffs between correctness and efficiency, because only pure functions can truly be cached indefinitely, as long as the underlying data is not updated.
jamwt
jamwt11mo ago
it looks like it does only update new Date every 5s or so, but it doesn't push down the updates.
@conradkoh there's a whole write up we'll do about this at some point. even internally the arguments about determinism vs. purity etc etc get pretty subtle, especially when it comes down to what's purely external. but the key point is: sources of nondeterminism are determinized so that we can reproducibly and consistently calculate a result in a way that ensures the semantics for caching / change propagation and so on.
re: implications of knowing the seed, say more? seeded cryptographically secure PRNGs are pretty common. what issues are you anticipating?
conradkoh
conradkohOP11mo ago
so the main broad possible security issue i thought of was the case where a query uses a library that depends on the crypto package - in these cases, users are not concerned with the internal workings of the package, so the "bug" is not visible to them, since the library directly calls the runtime. so for any utility or tool where I am using the convex runtime, I have no way to be extra sure if the runtime is doing anything to affect the randomness that a third party package depends on. I'm aware that Math.random is not cryptographically secure, and that the crypto package can't be imported today - but just using this as an example
jamwt
jamwt11mo ago
yeah, at the end of the day, agree with you -- but the place the rubber meets the road is always about how carefully we "garden" the builtins to either swap them out with something determinized (Math.random) or remove them altogether b/c the semantics are incompatible (filesystem, say). that's a layer of our custom runtime we've put a lot of work into. we'll be open sourcing all this stuff soon so people can see how we've done it.

in the longer run, if someone needs something we've decided we can't safely determinize, they can always use node and opt into an effectful runtime environment like an action. and then we make no attempts to cache it or understand its data dependencies, idempotency, etc. but the intention is that it won't silently misbehave; the intention is that the bundling will fail b/c they cannot use that package or access that built-in, so it is designed to "fail fast" if they attempt to use something we haven't included in the basis of the determinized execution environment.

the last two years or so we've honed the balance between enabling new apis when we can and at other times hardening the determinism with more and more confidence. we're probably pretty close right now to as good as we're gonna get until we have something like a WASM runtime with stricter controls over syscalls and effects. that would probably be something like a convex 2.0 runtime that could potentially support many backend languages, but it's definitely some time off.

I am a little surprised that value isn't propagating every 5s though, b/c it is cached for 5s as you can see if you reload your app like crazy... so I don't know if that's an intentional change or a regression. I'll have to ask the team this week
conradkoh
conradkohOP11mo ago
mm got it! I validated the 5s cache - can confirm that it works! just doesn't auto update
jamwt
jamwt11mo ago
yeah. normally caching and subscription are the same thing in convex, it's literally how it all works... so I'm actually not sure how this isn't updating lol. now I'm vaguely recalling a decision from ~18 months ago that this is less surprising than having the value pushed every 5s to end apps if no params or dependent data have changed. but yeah, the 5s cache is in effect for efficiency reasons
conradkoh
conradkohOP11mo ago
apologies for dropping off halfway. I think overall, the current feature set works pretty well. Practically, there are 2 means of detecting whether a function is pure or not:
1. build time (this sounds like possibly the runtime 2.0 you were talking about)
2. runtime - this is actually what is already present today
regardless of the choice, the question still remains - how should the system handle impurity: accept it or reject it. I personally think that it would be unfair to expect the system to handle it automatically; while improving the dx at first, it would eventually ruin the dx when the bill comes or when trying to debug the system. so I think one of the examples below would be my ideal solution for the interim:
export const test = query({
args: {},
handler: async (ctx, args) => {
ctx.disableCache(); //without this line, anything impure should fail
const date = new Date();
return date.getTime();
},
});

export const test = pureQuery({ //pure queries should not allow anything impure
args: {},
handler: async (ctx, args) => {
const date = new Date();
return date.getTime();
},
});
option 1 has the problem of being a breaking change though, so option 2 sounds more practical as an interim before runtime 2.0. mm yeah for sure! agree completely. if this were to be done, likely the cache should be checked first to see whether the value changed before emitting the new state to subscribers. with that said, i wouldn’t dare to be too presumptuous in my understanding of the decisions made to solve this problem - you guys have been doing a great job so far, so i’d honestly trust the call you make on this over what i think tbh. just sharing some of the hiccups and surprises i met along the way. as someone who builds primarily for myself / for fun, i’m afraid that there will be some huge consumption of row reads that i have no way of resolving. so i’m just holding off releasing for now.
sujayakar
sujayakar11mo ago
yep, this is it. we may change this behavior, since it is a bit odd that observing time causes the function's value to change if you reload the page but doesn't push updates. @conradkoh, one issue we ran into is that it's hard to reason about what 3rd party NPM packages are doing. some of them call Date.now() or Math.random() at import time for, say, feature detection, which would make them always non-deterministic. perhaps we could work around this by having something like pureQuery that specifies the timestamp we'd like to pin at build time, and then we'd never move that forward.
conradkoh
conradkohOP10mo ago
hey @sujayakar. I agree with you - that it is hard to know whether a function from a 3rd party NPM package is deterministic or not. I feel that the fully reactive paradigm is challenged by determinism because of efficiency. more fundamentally, a function that depends on Math.random results in an infinite number of "updates" to be propagated. in a reactive model, I think it is fair to assume that every developer will try to trace an update to a trigger. in the event that I need a trigger run by the system and not a user, it will be fulfilled by a cron job. I would not expect a declarative query to set the refresh interval on my behalf, and execute it for me without considering the function's logic / context.
perhaps we could work around this by having something like pureQuery that specifies the timestamp we'd like to pin at build time, and then we'd never move that forward.
practically I don't think this works. take for example a simple use case where I have a function that takes a Date and adds 7 days:
function add7Days(d: Date) {
return new Date(d.getTime() + 7 * 24 * 60 * 60 * 1000);
}
More fundamentally, changing the behavior of Date or any other standard library changes the semantics of code that resides in third party packages (not just user code). That logic is completely transparent to me as a user - I used a package so I don't have to care about the inner logic.
on a design level, it would be nice to pin down what the ideal behavior of a non-deterministic query should be. is it (could be multiple):
a. regenerate based on a system defined interval
b. regenerate based on a user defined interval
c. do not regenerate - the user should create a cron / scheduled task to act as the trigger
d. fail completely at build time
e. fail completely at run time
personally what I am looking for is b + d. but since d needs runtime 2.0, b + e for now.
I was giving this a little more thought - it reminded me of one challenge that @ballingt raised previously. I think it isn't that simple to determine if a query is cacheable, if we can't determine all execution paths of the query because of runtime params. So I think what @jamwt shared was correct again - most likely this can only come with a runtime 2.0, or at least not without static detection of which impure system / runtime APIs are being called. in the example I shared here, pureQuery will still result in a runtime error if there are branches in logic, which is still no good.
jamwt
jamwt10mo ago
@conradkoh hey! you just make the function arguments input into the caching/propagation key, which is what convex does today so no need for runtime 2 for that
conradkoh
conradkohOP10mo ago
sorry I didn't quite get this. could you expand a little on it?
ballingt
ballingt10mo ago
@conradkoh Arguments are already part of the cache key, so Convex pretty much provides what you want right now by doing this at runtime. Every query is already perfectly cacheable — but you're right that handling of Date and Math.random() could be cleaner and more customizable, including being runtime errors if that's what the developer wants.
in the example I shared here, pureQuery will still result in a runtime error if there are branches in logic, which is still no good.
That's already what happens when you do a fetch() in a query: TypeScript isn't expressive enough to describe our various environments, so it has to be a runtime error. It would be great if this were statically analyzable, but when I write JavaScript, I expect runtime errors when I e.g. use a Node.js API in a React component in Next.js. This isn't so bad. You're asking for a way to enforce purity at build time; agreed we can't do that as long as we're using JavaScript — instead you'll have to look at the dashboard and see "this query is frequently being updated due to the Date call on this line," and change the code not to do that, or configure the behavior of Date for that function, e.g. to always return the same value. Convex could provide configurable behavior for Date and Math.random() because these are implemented in Convex-written Rust code, both for when to invalidate the cache and when to push down updates.
conradkoh
conradkohOP10mo ago
makes sense! thanks @ballingt, great summary of where we landed with the discussion.
instead you'll have to look at the dashboard and see "this query is frequently being updated due to the Date call on this line,"
by the way, is this a feature that already exists?
ballingt
ballingt10mo ago
This does not exist yet. We have folks thinking about this kind of thing, in particular with respect to OCC conflicts (spoilers for the upcoming client release) where one transaction couldn't complete because other transactions repeatedly conflicted with it. Describing why queries were invalidated is in this same family of features. We know it's important to customers in particular for billing reasons where you want to know exactly why a query wasn't a cache hit.
Indy
Indy10mo ago
As a very rudimentary indication we do give a cache hit percentage when you look at the query in the dashboard function page. But you'd have to understand what your expectation is for this query.
conradkoh
conradkohOP10mo ago
thanks @Indy! i saw it, i think it’ll be really helpful!
