cyremur · 3mo ago

Delta updates on query? - understanding bandwidth

Hello, I am building a card game with big, ridiculous states (see picture). I'm trying to understand data usage in detail again. On games.getGame(id), does the subscription have to get the whole row every time? Are there partial updates and/or compression middleware that could allow for delta updates similar to git patches? Is a future optimization to split up queries to only grab a single column or so in order to reduce database bandwidth, or are there any other recommended steps to reduce bandwidth?
[image: screenshot of the game state]
Convex Bot · 3mo ago
Thanks for posting in <#1088161997662724167>. Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets.
- Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.)
- Use search.convex.dev to search Docs, Stack, and Discord all at once.
- Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI.
- Avoid tagging staff unless specifically instructed.
Thank you!
sujayakar · 3mo ago
hey @cyremur -- we currently don't have a way to do partial reads or writes to a document, but it's on our radar. in the meantime, one recommendation is to break your document up into smaller pieces. for example, you could have a separate combatLog table that stores { gameId: Id<"games">, action: ... } with an index on gameId. then it'd be efficient to (1) append a new entry to the combat log and (2) only read a range of the log if needed.
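A minimal sketch of that split, with illustrative names (the by_game index and the appendAction / recentActions functions are assumptions, not from the thread):

// convex/schema.ts (sketch)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  combatLog: defineTable({
    gameId: v.id("games"),
    action: v.any(), // stand-in for a real vCombatAction validator
  }).index("by_game", ["gameId"]),
});

// convex/combatLog.ts (sketch)
import { v } from "convex/values";
import { mutation, query } from "./_generated/server";

// (1) appending writes only the new entry, not the whole game document
export const appendAction = mutation({
  args: { gameId: v.id("games"), action: v.any() },
  handler: async (ctx, args) => {
    await ctx.db.insert("combatLog", args);
  },
});

// (2) reading a range only touches the documents it returns
export const recentActions = query({
  args: { gameId: v.id("games"), limit: v.number() },
  handler: async (ctx, args) => {
    return await ctx.db
      .query("combatLog")
      .withIndex("by_game", (q) => q.eq("gameId", args.gameId))
      .order("desc")
      .take(args.limit);
  },
});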
cyremur (OP) · 3mo ago
thanks @sujayakar, yeah, I figured this would be the answer. will keep it as is and then optimize when I go into alpha tests.

Even if I break out the tables, though - and/or implement compression middleware - my use case would still benefit from delta updates: the combat log, for example, is not just a log for analysis, but actually shows the opponent's last turns as part of the interface (see bottom right of this image). In the schema, I have combatLog: v.array(vCombatAction).

So in my dream scenario, there's almost a query-planner middleware that automatically checks v.array or v.object columns to decide whether it's worth sending diffs instead of raw data. In this case, it could just broadcast something like APPEND("X passed turn") instead of resending the entire array, as sketched below. On the game field object, it could detect that only two fields have changed and send a diff instead of the entire document.

Will likely implement a version of this and/or try breaking the documents into subdocuments when it becomes necessary to actually conserve bandwidth for me, but maybe it's a common enough use case to be worth an official middleware.
[image: game screenshot with the combat log in the bottom right]
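Nothing like this diffing middleware exists in Convex today; a toy sketch of what applying such deltas on the client could look like, with a completely made-up patch format:

// Hypothetical delta format for v.array / v.object fields (illustrative only)
type Delta =
  | { op: "append"; field: string; value: unknown }
  | { op: "set"; field: string; value: unknown };

// Apply a batch of deltas to a cached game state instead of
// replacing the whole document on every update.
function applyDeltas(state: Record<string, any>, deltas: Delta[]): Record<string, any> {
  for (const d of deltas) {
    if (d.op === "append") {
      // e.g. { op: "append", field: "combatLog", value: "X passed turn" }
      state[d.field] = [...(state[d.field] ?? []), d.value];
    } else {
      // e.g. only the two changed fields of the game field object
      state[d.field] = d.value;
    }
  }
  return state;
}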
sujayakar · 3mo ago
makes sense! btw, we currently have pagination (https://docs.convex.dev/database/pagination) for querying stuff like a combatLog table and efficiently sending updates as it changes, but we have some improvements in mind to make it easier to use.
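A sketch of what that pagination could look like against a split-out combatLog table (table and index names carried over from the sketch above, so still assumptions):

// convex/combatLog.ts (sketch): paginated log query
import { paginationOptsValidator } from "convex/server";
import { v } from "convex/values";
import { query } from "./_generated/server";

export const list = query({
  args: { gameId: v.id("games"), paginationOpts: paginationOptsValidator },
  handler: async (ctx, args) => {
    return await ctx.db
      .query("combatLog")
      .withIndex("by_game", (q) => q.eq("gameId", args.gameId))
      .order("desc")
      .paginate(args.paginationOpts);
  },
});

// React side (sketch): Convex only re-sends pages whose contents changed
import { usePaginatedQuery } from "convex/react";
import { api } from "../convex/_generated/api";
import { Id } from "../convex/_generated/dataModel";

function CombatLogList({ gameId }: { gameId: Id<"games"> }) {
  const { results, status, loadMore } = usePaginatedQuery(
    api.combatLog.list,
    { gameId },
    { initialNumItems: 20 }
  );
  // render `results`; call loadMore(20) while status === "CanLoadMore"
  return null;
}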
cyremur (OP) · 3mo ago
hm, I kinda want to have all moves on screen the whole time; implementing this via the pagination concept feels like a bit of a misuse. I would ALWAYS have to trigger loadMore as soon as status hits "CanLoadMore".
Clever Tagline · 3mo ago
First off, that game screen looks amazing! I'm not sure it's the kind of game I'd go for, but it definitely has a lot of visual appeal.

As for the data retrieval process, here's a rough idea: use two queries. This piggybacks on the table-splitting idea proposed by @sujayakar, so each move in the combat log would have to be its own document in a separate table.

The first query retrieves the entire collection of moves for a given game when the game loads (or perhaps use pagination to only get the most recent X moves, only loading older moves if the user requests them). At the start of a game it will be an empty array; otherwise, an array of all moves up to the present time. This query only runs once when the game first loads, and it stores the collected documents in a state variable, which is used to render the on-screen list.

The second query only retrieves the most recent move in the game, using an index targeting the game ID, with the query ending in .order("desc").take(1). As each new record comes in via this query, it's appended to the full array in state.

This means the only potentially heavy query is the first one, and only if a game is in progress and has lots of moves. Would that work for this use case?
Clever Tagline · 3mo ago
My idea for the one-time-only query came after reading about ConvexReactClient and its query method here: https://docs.convex.dev/api/classes/react.ConvexReactClient#query FWIW, I've not actually used that method before, but it sounds like it would work.
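A sketch of that two-query setup, building on the earlier combatLog sketch (recentActions) and adding a hypothetical latestAction query:

// convex/combatLog.ts (sketch): reactive query for just the newest move
import { v } from "convex/values";
import { query } from "./_generated/server";

export const latestAction = query({
  args: { gameId: v.id("games") },
  handler: async (ctx, args) => {
    return await ctx.db
      .query("combatLog")
      .withIndex("by_game", (q) => q.eq("gameId", args.gameId))
      .order("desc")
      .take(1);
  },
});

// React side (sketch): one-shot history fetch + reactive tip subscription
import { useEffect, useState } from "react";
import { useConvex, useQuery } from "convex/react";
import { api } from "../convex/_generated/api";
import { Doc, Id } from "../convex/_generated/dataModel";

function useCombatLog(gameId: Id<"games">) {
  const convex = useConvex();
  const [moves, setMoves] = useState<Doc<"combatLog">[]>([]);

  // runs once per game: load the existing history via ConvexReactClient.query
  useEffect(() => {
    convex
      .query(api.combatLog.recentActions, { gameId, limit: 1000 })
      .then((history) => setMoves(history.reverse()));
  }, [convex, gameId]);

  // reactive: append each new move as it arrives
  const latest = useQuery(api.combatLog.latestAction, { gameId });
  useEffect(() => {
    const tip = latest?.[0];
    if (tip) {
      setMoves((prev) =>
        prev.some((m) => m._id === tip._id) ? prev : [...prev, tip]
      );
    }
  }, [latest]);

  return moves;
}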
cyremur (OP) · 3mo ago
Thanks for the input. Once the game is stable enough (hopefully in like a month), I'll start working on the more out-there performance improvements.
sujayakar · 3mo ago
makes sense -- I agree that from an API perspective, it's a bit awkward to be manually calling loadMore until you reach the end. from a data loading perspective, however, it's accommodating the case where there are many (say thousands) of combat log entries in a game. then, it may make sense to stream them into the app in pages and not block interactivity on loading all of the log entries.
cyremur (OP) · 2w ago
@sujayakar just started playtesting and hit this: 90MB of bandwidth over 4 games. Optimization is starting to get higher priority for me again.
[image: dashboard screenshot showing bandwidth usage]
cyremur (OP) · 2w ago
also @Clever Tagline if you're interested in the game feel free to DM me
sujayakar · 2w ago
cool! do you have a sense for how much bandwidth it should take per game as a lower bound? we can then see what optimizations we’d need to get there. also curious how the tables and queries are set up for the game — maybe there are some easy wins like the document splitting idea from before
Clever Tagline
Appreciate the info. As I said earlier, this isn't really my type of game. My wife might be interested in taking it for a spin, though. I'll show her the screenshot and if she's interested, I'll DM you.
cyremur (OP) · 2w ago
Not really, but I'm reconsidering provisioning a little server that just keeps the state in memory and implements its own websocket API. That would be a lot more inconvenient, though. Will look into splitting up the state more. But I guess 0.2MB state * 100 game actions for 20MB feels ballpark correct. Appreciate all the attention you're giving even to more out-there use cases like mine, by the way 🙂
sujayakar · 2w ago
yeah, i’d be curious if we could find a way to make the bandwidth close to sizeof(action) * 100 actions. on one extreme convex could just sync an action log, but there should be ways to tweak the server data model to get close to this without fully upending everything.
cyremur (OP) · 2w ago
I think the main issue is just that my gamestate representation is gratuitously verbose for development convenience, and I need to add an encoding like chess notation to compress it. The JSON for the battlefield alone is way too big, when it arguably could be downsized to 64 bytes for the terrain types plus maybe a couple more bytes for the hexId-to-entityId mapping. It was really designed for maximum TypeScript convenience and takes an awful amount of space. That's what I'm considering a quick win at the moment, as illustrated below. At some point, I could go deep and only selectively load gamestate for the actions that need it and split everything up, but that feels like a big challenge and would slow down development. Better encoding, on the other hand, will be a data-layer-only change and won't touch game logic.
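To illustrate the kind of win available there (terrain names and board size below are invented, not from the game):

// Illustrative compact encoding: one byte per hex instead of a verbose JSON object
const TERRAIN = ["desert", "oasis", "mountain", "water"] as const;
type Terrain = (typeof TERRAIN)[number];

// 64 hexes -> 64 bytes; the Uint8Array's buffer could be stored
// in a Convex v.bytes() field as an ArrayBuffer.
function encodeTerrain(board: Terrain[]): Uint8Array {
  return Uint8Array.from(board.map((t) => TERRAIN.indexOf(t)));
}

function decodeTerrain(bytes: Uint8Array): Terrain[] {
  return Array.from(bytes, (b) => TERRAIN[b]);
}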
sujayakar · 2w ago
makes sense. yeah, I think splitting stuff up into smaller documents will make a lot of stuff better automatically -- mutations will be cheaper when they only fetch what they need, and queries can be finer-grained, have fewer reactivity updates, and get cached more effectively. but understood how this then means pushing database access into your game logic. on the other extreme, if you're storing everything in one big game state document, have you tried compressing the game state before writing it to the db? this is really quick & dirty, and it'll make the dashboard not that useful, but it could be worth trying. i've used lz4js in queries/mutations and it works great:
import { v } from "convex/values";
import { mutation } from "./_generated/server";
import * as lz4 from "lz4js";

// stand-in for a real document; in this test it was a ~40KB JSON string
const example = JSON.stringify({ some: "game state" });

export const compressionTest = mutation({
  args: {
    repetitions: v.number(),
  },
  handler: async (ctx, args) => {
    // build a payload of `repetitions` copies of the example document
    const s = [];
    for (let i = 0; i < args.repetitions; i++) {
      s.push(example);
    }
    console.time("encode");
    const encoder = new TextEncoder();
    const buf = encoder.encode(s.join("\n"));
    console.timeEnd("encode");

    console.time("compress");
    const compressed = lz4.compress(buf);
    console.timeEnd("compress");

    console.time("decompress");
    const decompressed = lz4.decompress(compressed);
    console.timeEnd("decompress");

    // round-trip check: decompressing must reproduce the original bytes
    if (
      decompressed.length !== buf.length ||
      !decompressed.every((value, index) => value === buf[index])
    ) {
      throw new Error("Decompressed data does not match original data");
    }

    console.log(`[lz4] Compressed ${(buf.length / 1024).toFixed(2)}KB to ${(compressed.length / 1024).toFixed(2)}KB (ratio: ${(compressed.length / buf.length).toFixed(2)})`);
  },
});
was trying it with 10 repetitions of a 40KB json document (not a representative test for compression ratio, ofc)
encode: 0ms
compress: 14ms
decompress: 8ms
[lz4] Compressed 407.22KB to 15.33KB (ratio: 0.04)
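Applied to a single big game-state document, the write/read path could look roughly like this. Table and field names (games, stateBlob) are placeholders; the query returns the compressed blob as-is and the client decompresses, so both database and sync bandwidth see the compressed size:

// convex/games.ts (sketch): store and serve the game state lz4-compressed
import { v } from "convex/values";
import { mutation, query } from "./_generated/server";
import * as lz4 from "lz4js";

export const saveState = mutation({
  args: { gameId: v.id("games"), state: v.any() },
  handler: async (ctx, args) => {
    const raw = new TextEncoder().encode(JSON.stringify(args.state));
    const compressed = lz4.compress(raw);
    // v.bytes() fields hold ArrayBuffers; slice to get an exact-size copy
    await ctx.db.patch(args.gameId, { stateBlob: compressed.slice().buffer });
  },
});

export const loadState = query({
  args: { gameId: v.id("games") },
  handler: async (ctx, args) => {
    const game = await ctx.db.get(args.gameId);
    return game?.stateBlob ?? null;
  },
});

// Client side (sketch): decompress after reading
//   const raw = lz4.decompress(new Uint8Array(stateBlob));
//   const state = JSON.parse(new TextDecoder().decode(raw));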
cyremur (OP) · 2w ago
I like the idea of this. However, I've also grown fond of the schema niceties. Is it possible to reuse my schema definition to validate the uncompressed doc? And thanks again for all the advice you've already given here.
sujayakar · 2w ago
hmm, not that I'm aware of. one idea for a workaround would be to switch to using zod validators. it'd be less nicely integrated with everything but still pretty close.
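A sketch of that workaround with plain zod; the shape here is a made-up stand-in for the real game-state schema:

import { z } from "zod";

// Stand-in for the real game-state schema
const GameState = z.object({
  turn: z.number(),
  combatLog: z.array(z.object({ action: z.string() })),
});

// After decompressing, parse to recover both runtime validation and types
function validateState(json: string): z.infer<typeof GameState> {
  return GameState.parse(JSON.parse(json));
}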
cyremur (OP) · 7d ago
So I've tested this compression on the last three gamestates:
PS C:\Users\cyremur\oasis-dreams> npx ts-node .\compress.ts
encode: 0.803ms
compress: 6.71ms
decompress: 3.181ms
[lz4] Compressed 45.02KB to 7.59KB (ratio: 0.17)
encode: 0.424ms
compress: 0.879ms
decompress: 0.254ms
[lz4] Compressed 29.43KB to 5.54KB (ratio: 0.19)
encode: 0.552ms
compress: 0.886ms
decompress: 0.33ms
[lz4] Compressed 41.22KB to 7.23KB (ratio: 0.18)
Seeing an average ratio of ~0.18. Based on those last three playtests and that compression ratio, my bandwidth per game is about:
- uncompressed: 43MB
- compressed: 8MB
So if I look at current pricing, that would be:
- uncompressed: $0.01/game
- compressed: $0.002/game
Honestly, that still feels OK for now, and I think I'll just upgrade my account in January and eat the bandwidth cost...