Streaming more than text
Working with the get-convex persistent-text-streaming component has been great; I'd appreciate some advice on handling non-text chunk appending. https://github.com/get-convex/persistent-text-streaming/blob/6c14033c16871f71fb27457c59ab1c227444999b/src/react/index.ts#L10C1-L12C2
Currently it's very straightforward to add basic chat via this component. However, when it comes to streaming the current state of generation for things like tool calls (e.g. web browsing), it's not as clear how to handle it.
Are there plans to include more robust support for streaming different types of objects to the client, rather than just text? Or is there a recommended way to handle this with the current component?
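For context, the kind of thing I'd otherwise end up doing with the current component is encoding typed events as JSON lines inside the text stream and parsing them back out on the client. Just an illustration, the names and shapes here are made up:

```ts
// Hypothetical workaround over the existing text stream: serialize typed
// events as JSON lines on the server, then decode them on the client.
type StreamEvent =
  | { type: "text"; delta: string }
  | { type: "tool_call"; name: string; status: "started" | "finished" };

// Producer side: append one JSON line per event to the text stream.
const encodeEvent = (event: StreamEvent): string =>
  JSON.stringify(event) + "\n";

// Consumer side: split the accumulated stream body back into events.
const decodeEvents = (body: string): StreamEvent[] =>
  body
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as StreamEvent);
```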
On another note, I wonder whether this window check should be moved into a useEffect instead, to avoid having to dynamically import the component in Next.js: https://github.dev/get-convex/persistent-text-streaming/blob/6c14033c16871f71fb27457c59ab1c227444999b/src/react/index.ts#L10-L12
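Rough sketch of what I mean (not the actual library source), where the window check only runs after mount:

```tsx
// Sketch only: defer the window check to an effect so the module can be
// imported during Next.js SSR without a dynamic import.
import { useEffect, useState } from "react";

export function useIsBrowser(): boolean {
  const [isBrowser, setIsBrowser] = useState(false);
  useEffect(() => {
    // Effects never run during SSR, so touching `window` here is safe.
    setIsBrowser(typeof window !== "undefined");
  }, []);
  return isBrowser;
}
```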
3 Replies
Hi. Streaming more than text gets a little tricky, just because text streaming is really all the LLMs produce. You can represent an agentic-type flow with tool calls as a series of steps, where the LLM steps are each individual instances of persistent-text-streaming.
Things like "calling tool X" in response to an LLM tool invocation can probably just be synced down as regular DB objects, so the whole sequence becomes an array of steps: some of those steps are text streams, and others are just cards pulled from the DB.
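Something like this, purely as a sketch of the shape; the table and field names are made up and not part of the component:

```ts
// convex/schema.ts — a sketch of the "array of steps" idea. Everything here
// is illustrative; persistent-text-streaming doesn't prescribe this schema.
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  threads: defineTable({ title: v.string() }),
  steps: defineTable({
    threadId: v.id("threads"),
    order: v.number(),
    // An "llm" step owns a persistent-text-streaming stream id; a "tool"
    // step is just a plain document you render as a card.
    kind: v.union(v.literal("llm"), v.literal("tool")),
    streamId: v.optional(v.string()), // set when kind === "llm"
    toolName: v.optional(v.string()), // set when kind === "tool"
    toolStatus: v.optional(v.string()), // e.g. "running" | "done"
    toolResult: v.optional(v.any()),
  }).index("by_thread", ["threadId", "order"]),
});
```

The client subscribes to the ordered steps for a thread and renders each one: LLM steps mount the streaming hook, tool steps just display their current status from the DB.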
Hmm, are you suggesting having a separate "steps" table or something to store each step, and letting the persistent-text-streaming component just handle the text? Wouldn't it be simpler to add chunks to the same chunks table as they come in, just tagged with a type, since the LLM provider will be streaming chunks at that level anyway?

Yeah, that could be useful. I misunderstood your original question/statement, actually. I think you'd end up wanting both a lot of the time, because you'd have streaming chunks (including, say, tool calls) from each LLM invocation, but then also the actual tool calls themselves happening on the Convex server, and then the resumption of the stream.
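If you did go the single-table route, the chunk rows would probably end up looking something like this (again, just a sketch with illustrative names):

```ts
// Hypothetical discriminated union for a shared chunks table, where non-text
// chunks carry their own payload alongside the text deltas.
type Chunk =
  | { streamId: string; kind: "text"; content: string }
  | { streamId: string; kind: "tool_call"; tool: string; args: unknown }
  | { streamId: string; kind: "tool_result"; tool: string; result: unknown };

// Client side, you'd fold the ordered chunks into something renderable.
function renderChunks(chunks: Chunk[]): string[] {
  return chunks.map((chunk) => {
    switch (chunk.kind) {
      case "text":
        return chunk.content;
      case "tool_call":
        return `[calling ${chunk.tool}…]`;
      case "tool_result":
        return `[${chunk.tool} finished]`;
    }
  });
}
```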
feel free to open tickets for any of this stuff on the repo!