Delta updates on query? - understanding bandwidth
Hello, I am building a card game with big ridiculous states (see picture).
I'm trying to understand data usage in detail again.
On games.getGame(id), does the subscription have to get the whole row every time? Are there partial updates and/or compression middleware that could allow for delta updates similar to git patches?
Is splitting up queries to only grab a single column or so a planned future optimization to reduce database bandwidth, or are there any other recommended steps to reduce bandwidth?
hey @cyremur -- we currently don't have a way to do partial reads or writes to a document but it's on our radar.
in the meantime, one recommendation is to break your document up into smaller pieces. for example, you could have a separate combatLog table that stores { gameId: Id<"games">, "action": ... } with an index on gameId. then, it'd be efficient to (1) append a new entry to the combat log and (2) only read a range of the log if needed.
thanks for the answer @sujayakar yeah I figured this was gonna be the answer. will keep it as is and then optimize when I go into alpha tests.
Even if I break out the tables though - and / or implement compression middleware - my use case would still benefit from delta updates:
The combat log, for example, is not just a log for analysis; it actually shows the opponent's last turns as part of the interface. See bottom right of this image.
In the schema, I have
combatLog: v.array(vCombatAction),
So in my dream scenario, we have almost like a query planner middleware that automatically checks on v.array or v.object columns whether it is worth it to send diffs instead of raw data. In this case, it could just broadcast something like APPEND("X passed turn") instead of resending the entire array. On the game field object, it could detect that only 2 fields have been changed and can send a diff instead of the entire document.
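A minimal sketch of what such a diff could look like if it were done by hand today. Everything here is hypothetical: the Patch format and the diffDoc and applyPatches helpers are invented for illustration and are not part of Convex. It only looks at top-level fields, treats an array that grew at the end as an append, and ignores removed fields.

```typescript
// Hypothetical patch format for illustration; this is not a Convex API.
type Patch =
  | { op: "set"; field: string; value: unknown }
  | { op: "append"; field: string; items: unknown[] };

type Doc = Record<string, unknown>;

// Compare two versions of a document field by field. Emits "append" when an
// array only grew at the end and "set" for any other changed field.
// Equality is checked via JSON.stringify, so it is sensitive to key order.
function diffDoc(prev: Doc, next: Doc): Patch[] {
  const patches: Patch[] = [];
  for (const field of Object.keys(next)) {
    const a = prev[field];
    const b = next[field];
    if (JSON.stringify(a) === JSON.stringify(b)) continue; // unchanged field
    const grewAtEnd =
      Array.isArray(a) &&
      Array.isArray(b) &&
      b.length > a.length &&
      JSON.stringify(b.slice(0, a.length)) === JSON.stringify(a);
    if (grewAtEnd) {
      patches.push({ op: "append", field, items: (b as unknown[]).slice((a as unknown[]).length) });
    } else {
      patches.push({ op: "set", field, value: b });
    }
  }
  return patches;
}

// Rebuild the next version from the previous one plus the patches.
function applyPatches(doc: Doc, patches: Patch[]): Doc {
  const out: Doc = { ...doc };
  for (const p of patches) {
    if (p.op === "append") {
      const existing = (out[p.field] as unknown[] | undefined) ?? [];
      out[p.field] = [...existing, ...p.items];
    } else {
      out[p.field] = p.value;
    }
  }
  return out;
}
```

With a document whose combatLog gains one entry and whose turn counter changes, diffDoc would return one append patch and one set patch, and the untouched fields would not be sent at all.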
Will likely implement a version of this AND/OR try breaking the documents into subdocuments when it becomes necessary to actually conserve bandwidth for me, but maybe it's a common enough use case to be worth an official middleware.
makes sense!
btw, we currently have pagination (https://docs.convex.dev/database/pagination) for querying stuff like a combatLog table and efficiently sending updates as it changes, but we have some improvements in mind to make it easier to use.
hm I kinda want to have all moves on screen the whole time, implementing this via the pagination concept feels like a bit of a misuse. Like I would ALWAYS have to trigger loadMore as soon as status hits "CanLoadMore"
First off, that game screen looks amazing! I'm not sure it's the kind of game I'd go for, but it definitely has a lot of visual appeal.
As for the data retrieval process, here's a rough idea: Use two queries.
First, this piggybacks on the table-splitting idea proposed by @sujayakar , so each move in the combat log would have to be its own document in a separate table.
The first query retrieves the entire collection of moves for a given game when the game loads (or perhaps use pagination to only get the most recent X moves, only loading moves older than that if the user requests). If it's the start of a game, it will be an empty array; otherwise an array of all moves up to the present time. Again, this query only runs once when the game first loads, and it stores the collected documents in a state variable, which is used to render the on-screen list.
The second query only retrieves the most recent move in the game using an index targeting the game ID, with the query ending in .order("desc").take(1). As each new record comes in via this query, it's appended to the full array in state.
This means that the only potentially-heavy query is the first one, but only if a game is in-progress and has lots of moves.
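A rough sketch of the client-side merge step in that two-query idea, kept free of any Convex imports so it stands alone. The Move shape and the mergeLatest helper are hypothetical; _id and _creationTime mirror the system fields Convex puts on documents. One caveat worth checking before relying on this: if more than one move is written between two updates of the latest-move query, take(1) would only surface the newest one, so moves may be skipped.

```typescript
// Hypothetical shape of one combat log document.
interface Move {
  _id: string;
  _creationTime: number;
  text: string;
}

// Merge the newest move (from the reactive "latest move" query) into the list
// that was loaded once at startup. Returns the same array instance when there
// is nothing new, so a state setter can skip the update.
function mergeLatest(moves: Move[], latest: Move | undefined): Move[] {
  if (latest === undefined) return moves; // query still loading, or no moves yet
  if (moves.some((m) => m._id === latest._id)) return moves; // already in the list
  return [...moves, latest];
}
```

In a React component this would typically run inside an effect that watches the latest-move query result, with the merged array stored in state.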
Would that work for this use case?
My idea for the one-time-only query came after reading about ConvexReactClient and its query method here: https://docs.convex.dev/api/classes/react.ConvexReactClient#query
FWIW, I've not actually used that method before, but it sounds like it would work.
Thanks for the input, once the game is stable enough (hopefully in like a month) I will start working on the more out there performance improvements.
makes sense -- I agree that from an API perspective, it's a bit awkward to be manually calling loadMore until you reach the end.
from a data loading perspective, however, it's accommodating the case where there are many (say thousands) of combat log entries in a game. then, it may make sense to stream them into the app in pages and not block interactivity on loading all of the log entries.
@sujayakar just started playtesting and hit this: 90 MB of bandwidth over 4 games. Optimization is starting to get more priority again for me.
also @Clever Tagline if you're interested in the game feel free to DM me
cool! do you have a sense for how much bandwidth it should take per game as a lower bound? we can then see what optimizations we’d need to get there.
also curious how the tables and queries are set up for the game — maybe there are some easy wins like the document splitting idea from before
Appreciate the info. As I said earlier, this isn't really my type of game. My wife might be interested in taking it for a spin, though. I'll show her the screenshot and if she's interested, I'll DM you.
Not really, but I'm reconsidering provisioning a little server that just keeps the state in memory and implements its own websocket API. Would be a lot more inconvenient though. Will look into splitting up the state more. But I guess 0.2 MB state * 100 game actions = 20 MB feels about right as a ballpark.
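The ballpark above can be written out as a small calculation. The state size and action count come from the estimate above; the per-action size of 200 bytes is a made-up number, included only to show the gap between re-sending the whole state and syncing just the action.

```typescript
// Bytes sent if the whole state document is re-sent after every action.
function fullStateBytes(stateBytes: number, actions: number): number {
  return stateBytes * actions;
}

// Bytes sent if only each action were synced.
function actionOnlyBytes(actionBytes: number, actions: number): number {
  return actionBytes * actions;
}

const BYTES_PER_MB = 1000 * 1000; // decimal megabytes, matching the round numbers used here

const STATE_BYTES = 200000; // 0.2 MB game state, from the estimate above
const ACTIONS = 100;        // game actions per game, from the estimate above
const ACTION_BYTES = 200;   // ASSUMPTION: invented size of one encoded action

// Full-state resend per game, in MB.
console.log(fullStateBytes(STATE_BYTES, ACTIONS) / BYTES_PER_MB);
// Action-only sync per game, in MB.
console.log(actionOnlyBytes(ACTION_BYTES, ACTIONS) / BYTES_PER_MB);
```

Under these assumptions the two numbers differ by the ratio of state size to action size, which is why shrinking or splitting the state document matters more than the number of actions.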
Appreciate all the attention you're giving even to more out there use cases like mine by the way 🙂
yeah, i’d be curious if we could find a way to make the bandwidth close to sizeof(action) * 100 actions. on one extreme convex could just sync an action log, but there should be ways to tweak the server data model to get close to this without fully upending everything.
I think the main issue is just that my gamestate representation is gratuitously verbose for development conveniences and I need to add encoding like chess notation to compress it. The json for the battlefield alone is way too big when it arguably can be downsized to 64 bytes for the terrain types and maybe a couple more bytes for hexId to entityId mapping. Really was designed for maximum typescript convenience and takes an awful amount of space.
That's what I'm considering quick wins atm.
At some point, I could go deep and only selectively load gamestate for the actions that need it and split up everything but that feels like a big challenge and will slow down development. Better encoding will be a data layer only change on the other hand and not touch game logic.
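A minimal sketch of the "one byte per hex" idea for the terrain layer, assuming a 64-hex board. The terrain names and the encodeTerrain and decodeTerrain helpers are hypothetical; the real game's terrain types are not shown in this thread. The game logic would keep working with the verbose typed form and only the data layer would call these.

```typescript
// ASSUMPTION: invented terrain list. The index in this array is the byte stored.
const TERRAINS = ["plain", "forest", "mountain", "water"] as const;
type Terrain = (typeof TERRAINS)[number];

// Pack one terrain per hex into one byte per hex.
function encodeTerrain(cells: Terrain[]): Uint8Array {
  return Uint8Array.from(cells, (t) => TERRAINS.indexOf(t));
}

// Unpack bytes back into terrain names, rejecting unknown codes.
function decodeTerrain(bytes: Uint8Array): Terrain[] {
  return Array.from(bytes, (b) => {
    const t = TERRAINS[b];
    if (t === undefined) throw new Error(`unknown terrain code ${b}`);
    return t;
  });
}

// Example: a 64-hex board becomes a 64-byte array.
const board: Terrain[] = Array.from({ length: 64 }, (_, i) => TERRAINS[i % TERRAINS.length]);
const packed = encodeTerrain(board);
console.log(packed.length);
```

The packed bytes could then go into a binary field (Convex has a v.bytes() validator for ArrayBuffer values, if that fits the schema) or be base64-encoded into a string. With only four terrain types, two bits per hex would be enough, but one byte per hex keeps the code simple.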
makes sense.
yeah, I think splitting stuff up into smaller documents will make a lot of stuff better automatically -- mutations will be cheaper when they only fetch what they need, and queries can be finer-grained, have fewer reactivity updates, get cached more effectively. but, understood how this then means pushing database access into your game logic.
on the other extreme, if you're storing everything in one big game state document, have you tried compressing the game state before writing it to the db? this is really quick & dirty, and it'll make the dashboard not that useful, but it could be worth trying.
i've used lz4js in queries/mutations and it works great:
was trying it with 10 repetitions of a 40KB json document (not a representative test for compression ratio, ofc)
I like the idea of this. However, I have also grown fond of the schema niceties. Is it possible to reuse my schema definition for validation of the uncompressed doc?
And thanks again for all the advice you've already given here.
hmm, not that I'm aware of. one idea for a workaround would be to switch to using zod validators. it'd be less nicely integrated with everything but still pretty close.
So I've tested this compression on the last three gamestates
Seeing an average ratio of 0.18
Based on those last three playtests and that compression ratio, my bandwidth per game is about:
Uncompressed 43MB
Compressed 8MB
So if I look at current pricing, that would be
UNCOMPRESSED $0.01/game
COMPRESSED $0.002/game
Honestly, that still feels ok for now and I think I'll just upgrade my account in january and eat the bandwidth cost...
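The per-game figures above can be reproduced with a short calculation. The 43 MB and 0.18 ratio come from the playtests above; the price of $0.20 per GB is an assumption inferred from the rounded dollar figures, so check the current pricing page before relying on it.

```typescript
// Apply a measured compression ratio (compressed size / original size).
function compressedMB(uncompressedMB: number, ratio: number): number {
  return uncompressedMB * ratio;
}

// Cost of a given amount of bandwidth at a flat per-GB price.
function costUSD(bandwidthMB: number, usdPerGB: number): number {
  return (bandwidthMB / 1024) * usdPerGB;
}

const USD_PER_GB = 0.2; // ASSUMPTION: inferred from the rounded figures, not a quoted price
const PER_GAME_MB = 43; // uncompressed bandwidth per game, from the playtests
const RATIO = 0.18;     // average compression ratio, from the playtests

const perGameCompressedMB = compressedMB(PER_GAME_MB, RATIO);
console.log(perGameCompressedMB.toFixed(2));                      // compressed MB per game
console.log(costUSD(PER_GAME_MB, USD_PER_GB).toFixed(3));         // uncompressed cost per game
console.log(costUSD(perGameCompressedMB, USD_PER_GB).toFixed(3)); // compressed cost per game
```

Under that price assumption, the results round to the 8 MB, $0.01, and $0.002 quoted above.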