I assume if my app is AI-heavy, I'd spend a lot of compute GB-hours waiting for the responses? Is there some workaround for that?
To be more specific, it's not a chat where the data gets streamed to the user. More like a request gets submitted, processed via AI and stored in the db
But they're pretty heavy requests that can take a couple of minutes
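For reference, a minimal sketch of the flow described above: a Convex action that calls an LLM and stores the result via a mutation. The function and table names (processRequest, results.save) and the use of the OpenAI SDK are assumptions for illustration, not anything from this thread; the corresponding internalMutation isn't shown.

```ts
// convex/ai.ts — hypothetical sketch of "request submitted, processed via AI, stored in the db"
import { action } from "./_generated/server";
import { internal } from "./_generated/api";
import { v } from "convex/values";
import OpenAI from "openai";

export const processRequest = action({
  args: { prompt: v.string() },
  handler: async (ctx, { prompt }) => {
    const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

    // The action awaits the model here; per the discussion below, this
    // wall-clock wait still counts toward Convex action compute time.
    const completion = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: prompt }],
    });
    const answer = completion.choices[0].message.content ?? "";

    // Persist the result so clients can read it reactively later,
    // rather than streaming it back to the user.
    await ctx.runMutation(internal.results.save, { prompt, answer });
  },
});
```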
waiting for an llm response does not use billable cpu compute time
i might be confusing this with cloudflare though. so let me double check.
do i have to use convex's LLM library? How does it even tell that it's not using CPU?
i was wrong. cloudflare doesn’t bill “wall time” while the request is waiting, but convex does. i think they are working on this? but hard to say when or if it’ll change.
Damn
Thanks for checking
pretty sure it has to do with the fact that they need to keep the websocket connection open, so it is still using server resources while waiting :/