AI Response Streaming Pattern and Costs
Do you have any plans to treat patches and reads affected as a different billing category? I think the mental model of doing it like this (via DB) is brilliant, but I’m concern I’ll be doing bad optimizations just to be more long term cost conscious.
How are you/should I be thinking about this?
Should I architect this type of stuff with a web socket instead and only persist once the stream is done? Thanks in advance advance. p
