Llm.ts wrapper
I have an llm.ts file that just thinly wraps the HTTP API with some nice types, retries, batching, etc.
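Roughly the shape of it, as a minimal sketch (not the actual file; the endpoint, types, and backoff numbers here are placeholders):
```ts
// Minimal sketch of a thin LLM wrapper: typed chat calls plus retries
// with exponential backoff. Endpoint and types are illustrative.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };
type ChatRequest = { model: string; messages: ChatMessage[]; temperature?: number };
type ChatResponse = { choices: { message: ChatMessage }[] };

export async function chatCompletion(
  req: ChatRequest,
  { retries = 3, baseDelayMs = 500 } = {}
): Promise<ChatResponse> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify(req),
    });
    if (res.ok) return (await res.json()) as ChatResponse;
    // Only retry transient failures: rate limits and 5xx server errors.
    const transient = res.status === 429 || res.status >= 500;
    if (!transient || attempt >= retries) {
      throw new Error(`LLM request failed: ${res.status} ${await res.text()}`);
    }
    await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
  }
}
```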
I’m intrigued; would you mind sharing this? Have you built somewhat complex use cases with this (e.g., agentic workflows)?
here's one version of it in AI Town: https://github.com/a16z-infra/ai-town/blob/main/convex/util/llm.ts
Examples doing some RAG:
https://github.com/a16z-infra/ai-town/blob/08e3f419ba3f20ce46c63f8157a0ad223f0261d0/convex/agent/memory.ts#L325
https://github.com/a16z-infra/ai-town/blob/08e3f419ba3f20ce46c63f8157a0ad223f0261d0/convex/agent/conversation.ts#L13
AI Town characters are agents at the end of the day, but that whole control flow is done with regular code, not an AI-specific framework.
I prefer to work with prompts and data streams directly when I can.
Another version: https://github.com/get-convex/llama-farm-chat/blob/main/shared/llm.ts (llama-farm-chat uses locally-hosted LLMs to power a cloud-hosted webapp)
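As a rough illustration of what "regular code" control flow means here, a hypothetical sketch (not the AI Town source; `fetchRelevantMemories` and `llmChat` are stand-ins):
```ts
// Hypothetical sketch of an agent step as plain control flow: look up
// memories, build a prompt, call the LLM. No agent framework involved.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Stand-ins for a vector search over stored memories and an LLM call.
declare function fetchRelevantMemories(agent: string, history: ChatMessage[]): Promise<string[]>;
declare function llmChat(messages: ChatMessage[]): Promise<string>;

async function agentReply(agent: string, history: ChatMessage[]): Promise<string> {
  const memories = await fetchRelevantMemories(agent, history);
  return llmChat([
    {
      role: "system",
      content: `You are ${agent}. Relevant memories:\n${memories.join("\n")}`,
    },
    ...history,
  ]);
}
```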
I'm curious about the streams. I'm such an efficiency geek, I want to redesign the streaming part to only send the update; right now the code sends the entire pagination page on each update. I'm curious if you had thought about this, or how you would do it. It seems more straightforward from the db side, but I'm thinking it's more work on the rendering Next.js side...
Linked article: AI Chat with HTTP Streaming ("By leveraging HTTP actions with streaming, this chat app balances real-time responsiveness with efficient bandwidth usage. Users receive character-by-...")
Not quite, I was thinking of breaking the message up into message fragments, so each fragment is inserted into the table. Then when it completes, the message table is updated. When I watch Axiom with any of these message systems, you are usually sending the entire pagination set's worth of data on each re-render, so 10 messages x ~60 updates. That isn't nice on the bandwidth counter.
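A rough sketch of that fragment-per-row idea against Convex (table, index, and function names are hypothetical, and the schema would need the `by_message` index defined):
```ts
// Hypothetical sketch: each streamed chunk becomes its own row, and the
// message row is finalized once at the end. Assumes a "messageFragments"
// table with a "by_message" index in the schema.
import { mutation, query } from "./_generated/server";
import { v } from "convex/values";

export const appendFragment = mutation({
  args: { messageId: v.id("messages"), index: v.number(), text: v.string() },
  handler: async (ctx, { messageId, index, text }) => {
    await ctx.db.insert("messageFragments", { messageId, index, text });
  },
});

export const finalizeMessage = mutation({
  args: { messageId: v.id("messages"), fullText: v.string() },
  handler: async (ctx, { messageId, fullText }) => {
    // One write of the complete text; fragments can then be cleaned up.
    await ctx.db.patch(messageId, { text: fullText, complete: true });
  },
});

export const listFragments = query({
  args: { messageId: v.id("messages") },
  handler: async (ctx, { messageId }) => {
    return await ctx.db
      .query("messageFragments")
      .withIndex("by_message", (q) => q.eq("messageId", messageId))
      .collect();
  },
});
```
The open question is whether a reactive query over the fragments would still re-send the full set on each insert, which would be the same bandwidth problem in a different shape.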
With the linked article's approach, you could stream to the client, then only write the message once to the database at the end, so one client would get streamed results and others would see it all at once (or chunked by sentence, etc.).
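In sketch form, assuming a Convex httpAction (`streamCompletion` and `internal.messages.send` are hypothetical stand-ins):
```ts
// Sketch: stream deltas to the requesting client over HTTP while
// buffering them, then persist the finished message with one mutation.
import { httpAction } from "./_generated/server";
import { internal } from "./_generated/api";

// Hypothetical stand-in for a streaming LLM call yielding text deltas.
declare function streamCompletion(prompt: string): AsyncIterable<string>;

export const chat = httpAction(async (ctx, request) => {
  const { prompt } = await request.json();
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      let fullText = "";
      for await (const delta of streamCompletion(prompt)) {
        fullText += delta;
        controller.enqueue(encoder.encode(delta));
      }
      // Everyone else sees the message in a single update at the end.
      await ctx.runMutation(internal.messages.send, { text: fullText });
      controller.close();
    },
  });
  return new Response(stream, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
});
```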
Ah, yeah, I guess I need to look much closer. I was thinking of this in the context of work stealing with efficient streaming, so I probably have to be a bit more creative. Thanks for mentioning that.