cyremur · 3mo ago

Delta updates on query? - understanding bandwidth

Hello, I am building a card game with big, ridiculous states (see picture). I'm trying to understand data usage in detail again. On games.getGame(id), does the subscription have to get the whole row every time? Are there partial updates and/or compression middleware that could allow for delta updates similar to git patches? Is a future optimization to split up queries to only grab a single column or so in order to reduce database bandwidth, or are there any other recommended steps to reduce bandwidth?
[image: screenshot of the game state]
Convex Bot · 3mo ago
Thanks for posting in <#1088161997662724167>. Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets.
- Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.)
- Use search.convex.dev to search Docs, Stack, and Discord all at once.
- Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI.
- Avoid tagging staff unless specifically instructed.
Thank you!
sujayakar · 3mo ago
hey @cyremur -- we currently don't have a way to do partial reads or writes to a document, but it's on our radar. in the meantime, one recommendation is to break your document up into smaller pieces. for example, you could have a separate combatLog table that stores { gameId: Id<"games">, action: ... } with an index on gameId. then it'd be efficient to (1) append a new entry to the combat log and (2) only read a range of the log if needed.
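A minimal sketch of that split, with illustrative names (the by_game index and the appendAction / recentActions functions are assumptions, not from the thread):

// convex/schema.ts (sketch)
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  combatLog: defineTable({
    gameId: v.id("games"),
    action: v.any(), // stand-in for a real vCombatAction validator
  }).index("by_game", ["gameId"]),
});

// convex/combatLog.ts (sketch)
import { v } from "convex/values";
import { mutation, query } from "./_generated/server";

// (1) appending writes only the new entry, not the whole game document
export const appendAction = mutation({
  args: { gameId: v.id("games"), action: v.any() },
  handler: async (ctx, args) => {
    await ctx.db.insert("combatLog", args);
  },
});

// (2) reading a range only touches the documents it returns
export const recentActions = query({
  args: { gameId: v.id("games"), limit: v.number() },
  handler: async (ctx, args) => {
    return await ctx.db
      .query("combatLog")
      .withIndex("by_game", (q) => q.eq("gameId", args.gameId))
      .order("desc")
      .take(args.limit);
  },
});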
cyremur (OP) · 3mo ago
thanks @sujayakar, yeah, I figured this would be the answer. will keep it as is and then optimize when I go into alpha tests.

Even if I break out the tables, though - and/or implement compression middleware - my use case would still benefit from delta updates: the combat log, for example, is not just a log for analysis, but actually shows the opponent's last turns as part of the interface (see bottom right of this image). In the schema, I have combatLog: v.array(vCombatAction).

So in my dream scenario, there's almost a query-planner middleware that automatically checks v.array or v.object columns to decide whether it's worth sending diffs instead of raw data. In this case, it could just broadcast something like APPEND("X passed turn") instead of resending the entire array, as sketched below. On the game field object, it could detect that only two fields have changed and send a diff instead of the entire document.

Will likely implement a version of this and/or try breaking the documents into subdocuments when it becomes necessary to actually conserve bandwidth for me, but maybe it's a common enough use case to be worth an official middleware.
[image: game screenshot with the combat log in the bottom right]
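Nothing like this diffing middleware exists in Convex today; a toy sketch of what applying such deltas on the client could look like, with a completely made-up patch format:

// Hypothetical delta format for v.array / v.object fields (illustrative only)
type Delta =
  | { op: "append"; field: string; value: unknown }
  | { op: "set"; field: string; value: unknown };

// Apply a batch of deltas to a cached game state instead of
// replacing the whole document on every update.
function applyDeltas(state: Record<string, any>, deltas: Delta[]): Record<string, any> {
  for (const d of deltas) {
    if (d.op === "append") {
      // e.g. { op: "append", field: "combatLog", value: "X passed turn" }
      state[d.field] = [...(state[d.field] ?? []), d.value];
    } else {
      // e.g. only the two changed fields of the game field object
      state[d.field] = d.value;
    }
  }
  return state;
}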
sujayakar · 3mo ago
makes sense! btw, we currently have pagination (https://docs.convex.dev/database/pagination) for querying stuff like a combatLog table and efficiently sending updates as it changes, but we have some improvements in mind to make it easier to use.
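A sketch of what that pagination could look like against a split-out combatLog table (table and index names carried over from the sketch above, so still assumptions):

// convex/combatLog.ts (sketch): paginated log query
import { paginationOptsValidator } from "convex/server";
import { v } from "convex/values";
import { query } from "./_generated/server";

export const list = query({
  args: { gameId: v.id("games"), paginationOpts: paginationOptsValidator },
  handler: async (ctx, args) => {
    return await ctx.db
      .query("combatLog")
      .withIndex("by_game", (q) => q.eq("gameId", args.gameId))
      .order("desc")
      .paginate(args.paginationOpts);
  },
});

// React side (sketch): Convex only re-sends pages whose contents changed
import { usePaginatedQuery } from "convex/react";
import { api } from "../convex/_generated/api";
import { Id } from "../convex/_generated/dataModel";

function CombatLogList({ gameId }: { gameId: Id<"games"> }) {
  const { results, status, loadMore } = usePaginatedQuery(
    api.combatLog.list,
    { gameId },
    { initialNumItems: 20 }
  );
  // render `results`; call loadMore(20) while status === "CanLoadMore"
  return null;
}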
cyremur (OP) · 3mo ago
hm, I kinda want to have all moves on screen the whole time; implementing this via the pagination concept feels like a bit of a misuse. I would ALWAYS have to trigger loadMore as soon as status hits "CanLoadMore".
Clever Tagline · 3mo ago
First off, that game screen looks amazing! I'm not sure it's the kind of game I'd go for, but it definitely has a lot of visual appeal.

As for the data retrieval process, here's a rough idea: use two queries. This piggybacks on the table-splitting idea proposed by @sujayakar, so each move in the combat log would have to be its own document in a separate table.

The first query retrieves the entire collection of moves for a given game when the game loads (or perhaps use pagination to only get the most recent X moves, only loading older moves if the user requests them). At the start of a game it will be an empty array; otherwise, an array of all moves up to the present time. This query only runs once when the game first loads, and it stores the collected documents in a state variable, which is used to render the on-screen list.

The second query only retrieves the most recent move in the game, using an index targeting the game ID, with the query ending in .order("desc").take(1). As each new record comes in via this query, it's appended to the full array in state.

This means the only potentially heavy query is the first one, and only if a game is in progress and has lots of moves. Would that work for this use case?
Clever Tagline · 3mo ago
My idea for the one-time-only query came after reading about ConvexReactClient and its query method here: https://docs.convex.dev/api/classes/react.ConvexReactClient#query FWIW, I've not actually used that method before, but it sounds like it would work.
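A sketch of that two-query setup, building on the earlier combatLog sketch (recentActions) and adding a hypothetical latestAction query:

// convex/combatLog.ts (sketch): reactive query for just the newest move
import { v } from "convex/values";
import { query } from "./_generated/server";

export const latestAction = query({
  args: { gameId: v.id("games") },
  handler: async (ctx, args) => {
    return await ctx.db
      .query("combatLog")
      .withIndex("by_game", (q) => q.eq("gameId", args.gameId))
      .order("desc")
      .take(1);
  },
});

// React side (sketch): one-shot history fetch + reactive tip subscription
import { useEffect, useState } from "react";
import { useConvex, useQuery } from "convex/react";
import { api } from "../convex/_generated/api";
import { Doc, Id } from "../convex/_generated/dataModel";

function useCombatLog(gameId: Id<"games">) {
  const convex = useConvex();
  const [moves, setMoves] = useState<Doc<"combatLog">[]>([]);

  // runs once per game: load the existing history via ConvexReactClient.query
  useEffect(() => {
    convex
      .query(api.combatLog.recentActions, { gameId, limit: 1000 })
      .then((history) => setMoves(history.reverse()));
  }, [convex, gameId]);

  // reactive: append each new move as it arrives
  const latest = useQuery(api.combatLog.latestAction, { gameId });
  useEffect(() => {
    const tip = latest?.[0];
    if (tip) {
      setMoves((prev) =>
        prev.some((m) => m._id === tip._id) ? prev : [...prev, tip]
      );
    }
  }, [latest]);

  return moves;
}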
cyremur (OP) · 3mo ago
Thanks for the input. Once the game is stable enough (hopefully in like a month), I'll start working on the more out-there performance improvements.
sujayakar · 3mo ago
makes sense -- I agree that from an API perspective, it's a bit awkward to be manually calling loadMore until you reach the end. from a data loading perspective, however, it's accommodating the case where there are many (say thousands) of combat log entries in a game. then, it may make sense to stream them into the app in pages and not block interactivity on loading all of the log entries.
cyremur (OP) · 2w ago
@sujayakar just started playtesting and hit this: 90MB of bandwidth over 4 games. Optimization is starting to get higher priority for me again.
[image: dashboard screenshot showing bandwidth usage]
cyremur (OP) · 2w ago
also @Clever Tagline if you're interested in the game feel free to DM me
sujayakar · 2w ago
cool! do you have a sense for how much bandwidth it should take per game as a lower bound? we can then see what optimizations we’d need to get there. also curious how the tables and queries are set up for the game — maybe there are some easy wins like the document splitting idea from before
Clever Tagline
Appreciate the info. As I said earlier, this isn't really my type of game. My wife might be interested in taking it for a spin, though. I'll show her the screenshot and if she's interested, I'll DM you.
cyremur (OP) · 2w ago
Not really, but I'm reconsidering provisioning a little server that just keeps the state in memory and implements its own websocket API. That would be a lot more inconvenient, though. Will look into splitting up the state more. But I guess 0.2MB state * 100 game actions for 20MB feels ballpark correct. Appreciate all the attention you're giving even to more out-there use cases like mine, by the way 🙂
sujayakar · 2w ago
yeah, i’d be curious if we could find a way to make the bandwidth close to sizeof(action) * 100 actions. on one extreme convex could just sync an action log, but there should be ways to tweak the server data model to get close to this without fully upending everything.
cyremur (OP) · 2w ago
I think the main issue is just that my gamestate representation is gratuitously verbose for development convenience, and I need to add an encoding like chess notation to compress it. The JSON for the battlefield alone is way too big, when it arguably could be downsized to 64 bytes for the terrain types plus maybe a couple more bytes for the hexId-to-entityId mapping. It was really designed for maximum TypeScript convenience and takes an awful amount of space. That's what I'm considering a quick win at the moment, as illustrated below. At some point, I could go deep and only selectively load gamestate for the actions that need it and split everything up, but that feels like a big challenge and would slow down development. Better encoding, on the other hand, will be a data-layer-only change and won't touch game logic.
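To illustrate the kind of win available there (terrain names and board size below are invented, not from the game):

// Illustrative compact encoding: one byte per hex instead of a verbose JSON object
const TERRAIN = ["desert", "oasis", "mountain", "water"] as const;
type Terrain = (typeof TERRAIN)[number];

// 64 hexes -> 64 bytes; the Uint8Array's buffer could be stored
// in a Convex v.bytes() field as an ArrayBuffer.
function encodeTerrain(board: Terrain[]): Uint8Array {
  return Uint8Array.from(board.map((t) => TERRAIN.indexOf(t)));
}

function decodeTerrain(bytes: Uint8Array): Terrain[] {
  return Array.from(bytes, (b) => TERRAIN[b]);
}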
sujayakar · 2w ago
makes sense. yeah, I think splitting stuff up into smaller documents will make a lot of stuff better automatically -- mutations will be cheaper when they only fetch what they need, and queries can be finer-grained, have fewer reactivity updates, and get cached more effectively. but understood how this then means pushing database access into your game logic. on the other extreme, if you're storing everything in one big game state document, have you tried compressing the game state before writing it to the db? this is really quick & dirty, and it'll make the dashboard not that useful, but it could be worth trying. i've used lz4js in queries/mutations and it works great:
import { v } from "convex/values";
import { mutation } from "./_generated/server";
import * as lz4 from "lz4js";

// stand-in for a real document; in this test it was a ~40KB JSON string
const example = JSON.stringify({ some: "game state" });

export const compressionTest = mutation({
  args: {
    repetitions: v.number(),
  },
  handler: async (ctx, args) => {
    // build a payload of `repetitions` copies of the example document
    const s = [];
    for (let i = 0; i < args.repetitions; i++) {
      s.push(example);
    }
    console.time("encode");
    const encoder = new TextEncoder();
    const buf = encoder.encode(s.join("\n"));
    console.timeEnd("encode");

    console.time("compress");
    const compressed = lz4.compress(buf);
    console.timeEnd("compress");

    console.time("decompress");
    const decompressed = lz4.decompress(compressed);
    console.timeEnd("decompress");

    // round-trip check: decompressing must reproduce the original bytes
    if (
      decompressed.length !== buf.length ||
      !decompressed.every((value, index) => value === buf[index])
    ) {
      throw new Error("Decompressed data does not match original data");
    }

    console.log(`[lz4] Compressed ${(buf.length / 1024).toFixed(2)}KB to ${(compressed.length / 1024).toFixed(2)}KB (ratio: ${(compressed.length / buf.length).toFixed(2)})`);
  },
});
was trying it with 10 repetitions of a 40KB json document (not a representative test for compression ratio, ofc)
encode: 0ms
compress: 14ms
decompress: 8ms
[lz4] Compressed 407.22KB to 15.33KB (ratio: 0.04)
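Applied to a single big game-state document, the write/read path could look roughly like this. Table and field names (games, stateBlob) are placeholders; the query returns the compressed blob as-is and the client decompresses, so both database and sync bandwidth see the compressed size:

// convex/games.ts (sketch): store and serve the game state lz4-compressed
import { v } from "convex/values";
import { mutation, query } from "./_generated/server";
import * as lz4 from "lz4js";

export const saveState = mutation({
  args: { gameId: v.id("games"), state: v.any() },
  handler: async (ctx, args) => {
    const raw = new TextEncoder().encode(JSON.stringify(args.state));
    const compressed = lz4.compress(raw);
    // v.bytes() fields hold ArrayBuffers; slice to get an exact-size copy
    await ctx.db.patch(args.gameId, { stateBlob: compressed.slice().buffer });
  },
});

export const loadState = query({
  args: { gameId: v.id("games") },
  handler: async (ctx, args) => {
    const game = await ctx.db.get(args.gameId);
    return game?.stateBlob ?? null;
  },
});

// Client side (sketch): decompress after reading
//   const raw = lz4.decompress(new Uint8Array(stateBlob));
//   const state = JSON.parse(new TextDecoder().decode(raw));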
cyremur (OP) · 2w ago
I like the idea of this. However, I've also grown fond of the schema niceties. Is it possible to reuse my schema definition to validate the uncompressed doc? And thanks again for all the advice you've already given here.
sujayakar · 2w ago
hmm, not that I'm aware of. one idea for a workaround would be to switch to using zod validators. it'd be less nicely integrated with everything but still pretty close.
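A sketch of that workaround with plain zod; the shape here is a made-up stand-in for the real game-state schema:

import { z } from "zod";

// Stand-in for the real game-state schema
const GameState = z.object({
  turn: z.number(),
  combatLog: z.array(z.object({ action: z.string() })),
});

// After decompressing, parse to recover both runtime validation and types
function validateState(json: string): z.infer<typeof GameState> {
  return GameState.parse(JSON.parse(json));
}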
cyremur (OP) · 7d ago
So I've tested this compression on the last three gamestates:
PS C:\Users\cyremur\oasis-dreams> npx ts-node .\compress.ts
encode: 0.803ms
compress: 6.71ms
decompress: 3.181ms
[lz4] Compressed 45.02KB to 7.59KB (ratio: 0.17)
encode: 0.424ms
compress: 0.879ms
decompress: 0.254ms
[lz4] Compressed 29.43KB to 5.54KB (ratio: 0.19)
encode: 0.552ms
compress: 0.886ms
decompress: 0.33ms
[lz4] Compressed 41.22KB to 7.23KB (ratio: 0.18)
Seeing an average ratio of ~0.18. Based on those last three playtests and that compression ratio, my bandwidth per game is about:
- uncompressed: 43MB
- compressed: 8MB
So if I look at current pricing, that would be:
- uncompressed: $0.01/game
- compressed: $0.002/game
Honestly, that still feels OK for now, and I think I'll just upgrade my account in January and eat the bandwidth cost...