Danylo
Convex Community, 3mo ago
6 replies

LLM / Streaming

We currently run a game mode where users interact with an LLM while simultaneously viewing the live-streamed responses of their opponent.

This system is managed via WebSockets today, but I believe Convex could be a great fit. We do have some concerns about concurrency and function limits, which I've outlined below:

Main concerns:
- Real-time performance: Due to the competitive nature of the game, we can’t afford to introduce noticeable buffering or delays in updates.
- Concurrency: We anticipate a high volume of tokens being streamed simultaneously. Each match involves two LLMs and two players, all potentially streaming tokens in real time, and we expect many matches to run concurrently.

Billing question:
If we use Convex to stream LLM outputs from the server to the client, sending each token individually, how would billing work? Specifically, would each streamed token count as a separate Function Call (i.e., one mutation/query per token)?
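
To make the billing question concrete, here is a minimal sketch of the two write patterns being compared: one write per token versus buffering tokens and writing in batches. This is not Convex code. `Write`, `TokenBatcher`, and `countWrites` are hypothetical names, and the `write` callback only stands in for whatever call would persist the text (for example, a single mutation). Whether each such write is billed as one Function Call is exactly the open question above.

```typescript
// Hypothetical sink: stands in for whatever call persists a chunk of text.
type Write = (chunk: string) => void;

// Buffers incoming tokens and flushes once the buffer holds `maxChars` characters.
class TokenBatcher {
  private buffer = "";
  private readonly write: Write;
  private readonly maxChars: number;

  constructor(write: Write, maxChars: number) {
    this.write = write;
    this.maxChars = maxChars;
  }

  push(token: string): void {
    this.buffer += token;
    if (this.buffer.length >= this.maxChars) this.flush();
  }

  flush(): void {
    if (this.buffer.length === 0) return; // nothing pending, no write issued
    this.write(this.buffer);
    this.buffer = "";
  }
}

// Counts how many writes a token stream would trigger for a given batch size.
function countWrites(tokens: string[], maxChars: number): number {
  let writes = 0;
  const batcher = new TokenBatcher(() => { writes += 1; }, maxChars);
  for (const token of tokens) batcher.push(token);
  batcher.flush(); // flush whatever is left when the stream ends
  return writes;
}

const tokens = ["Hel", "lo", ", ", "wor", "ld", "!"];
console.log(countWrites(tokens, 1)); // one write per token: 6
console.log(countWrites(tokens, 8)); // batched: 2
```

With `maxChars` set to 1 every token is written individually; a larger threshold trades fewer writes for more latency before the opponent sees the text, which is the tension between the billing question and the real-time concern.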