Best way to rate limit all mutations
Is there a recommended way to wrap all mutations in my Convex backend with some rate limiting? Should I just use convex-helpers and wrap the mutation function with my own custom one that rate limits by userId?
just looking for suggestions. I guess I'd hate for someone to abuse my system and add in millions of images or records with a script or something.
10 Replies
My current approach is a custom helper I invoke in my mutations; it just stores a record in a rateLimits table and resets the timestamp once the window has elapsed
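Roughly this shape, if anyone wants it (the table, index, and field names are just mine; assumes a "rateLimits" table with an index on name + userId):
```ts
import { MutationCtx } from "./_generated/server";

// Hypothetical fixed-window check: one row per (name, userId) with
// fields { name, userId, count, windowStart }.
async function checkRateLimit(
  ctx: MutationCtx,
  name: string,
  userId: string,
  max: number,
  windowMs: number
) {
  const existing = await ctx.db
    .query("rateLimits")
    .withIndex("by_name_user", (q) => q.eq("name", name).eq("userId", userId))
    .unique();
  const now = Date.now();
  if (!existing || now - existing.windowStart >= windowMs) {
    // Window elapsed (or first call ever): reset the counter.
    if (existing) {
      await ctx.db.patch(existing._id, { count: 1, windowStart: now });
    } else {
      await ctx.db.insert("rateLimits", { name, userId, count: 1, windowStart: now });
    }
    return;
  }
  if (existing.count >= max) {
    throw new Error("Rate limit exceeded");
  }
  await ctx.db.patch(existing._id, { count: existing.count + 1 });
}
```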
Yeah, here's the API I'm thinking about when I get around to building a rate limiting helper:
- Configure namespaces with settings like
createEvent: { fillRate: 10, cap: 100, period: "minute" }
That means: for a given createEvent key, the safe limit is 10 units per minute. Unused units accumulate (like rollover minutes) up to 100, so if you've been idle for the last 10 minutes, you'll have 100 units to fire in a burst all at once.
- To "consume" a rate limit (open to suggestions on verb choice), you can call consume("createEvent", planId, 5)
- or for the normal case of consuming one, just consume("createEvent", planId)
It will return (or should it throw with these values?):
1. whether the rate limit has been exceeded
2. when the request might next be serviceable based on the fill rate & current value
How does that sound? Any special features you'd want?
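To make that concrete, a call site might look like this (hypothetical; none of it exists yet):
```ts
// Hypothetical shape of the proposed API.
declare function consume(
  name: "createEvent",
  key: string,
  count?: number
): Promise<{ ok: true } | { ok: false; retryAt: number }>;
declare const planId: string;

// Spend 5 units against this plan's createEvent bucket.
const result = await consume("createEvent", planId, 5);
if (!result.ok) {
  // retryAt: when the request could next succeed, per the fill rate.
  throw new Error(`Rate limited, retry at ${new Date(result.retryAt)}`);
}
```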
The data structure is a "token bucket" where the "filling" is continuous and doesn't require a cron.
Note that it will not prevent the mutation from being called in the first place. But it prevents firing off any actions / writes / storage.
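The refill is just arithmetic on a stored timestamp, something like:
```ts
// Sketch: refill computed lazily from elapsed time on each access,
// so no cron is needed to top the bucket up.
type Bucket = { tokens: number; updatedAt: number };

function refill(b: Bucket, fillPerMs: number, cap: number, now = Date.now()): Bucket {
  return {
    tokens: Math.min(cap, b.tokens + (now - b.updatedAt) * fillPerMs),
    updatedAt: now,
  };
}
```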
Another fun extension of this I thought of this morning: if you don't rely on the return value of the mutation/action that you're trying to consume from, you can use the scheduler to schedule a retry for the time when the tokens will become available, and earmark those tokens in the bucket by recording them as a token "debt"
Mutation A wants 20 tokens, but there's only 5 and it refills at 1 per second. So it schedules itself for 15 seconds out, and adds 20 to the "debt"
5 seconds later, Mutation B wants 5 tokens, and there are 10 tokens, but it sees the debt of 20 so it schedules itself for 15 seconds out.
etc.
10 seconds later Mutation A wakes up and no one has been able to decrement, so it consumes the 20 down to 0 and runs.
5 seconds later Mutation B wakes up and there are 5 which it decrements to 0.
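A rough sketch of that bookkeeping (hypothetical names like retryMutation; assume the refill sketch above runs first, and that bucket docs hold { tokens, debt, updatedAt }):
```ts
import { internal } from "./_generated/api";
import { MutationCtx } from "./_generated/server";
import { Id } from "./_generated/dataModel";

async function consumeOrDefer(
  ctx: MutationCtx,
  bucketId: Id<"rateLimits">,
  want: number,
  fillPerMs: number,
  maxDebt: number
): Promise<boolean> {
  const bucket = await ctx.db.get(bucketId);
  if (!bucket) throw new Error("unknown bucket");
  const available = bucket.tokens - bucket.debt;
  if (available >= want) {
    // Enough tokens right now: spend them and run.
    await ctx.db.patch(bucketId, { tokens: bucket.tokens - want });
    return true;
  }
  if (bucket.debt + want > maxDebt) {
    // The queue is already long enough; fail instead of adding latency.
    throw new Error("Rate limited: max debt reached");
  }
  // Earmark the tokens and wake up when they'll have accrued.
  await ctx.db.patch(bucketId, { debt: bucket.debt + want });
  const waitMs = (want - available) / fillPerMs;
  await ctx.scheduler.runAfter(waitMs, internal.example.retryMutation, {
    bucketId,
    want, // the retry should subtract `want` from both tokens and debt
  });
  return false;
}
```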
You can set a "max debt" so you don't accumulate too long a queue (because this will impact latency for all requests in the meantime), after which it will just refuse to take on debt and you should fail the request.
I mean, that all sounds good. The cap stuff with burst capacity sounds useful but might be more than my needs personally. I think throwing an error is enough for my needs, but I could maybe see how others might want a timestamp or a "try again in 4 seconds" type of return value
Would we need something similar for queries, or is that silly because Convex caches query responses?
Queries wouldn't be able to "consume" tokens. Thankfully they also can't kick off a call to an LLM
They could observe how many tokens are left for a client to try to throttle itself.
The goal with the burst stuff: if you say "you can send 5 messages per minute" and they send none the first minute, then 6 the next minute, it gives some headroom while still allowing no more than 5/min on average
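In token-bucket numbers (assuming you start with 5 tokens and cap at 10):
```ts
// fillRate 5/min, cap 10:
// t = 0:   tokens = 5   (idle all of minute one)
// t = 60s: tokens = min(10, 5 + 5) = 10  -> a burst of 6 succeeds
// Long-run consumption still can't exceed the 5/min fill rate.
```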
Yeah I think the burst stuff is good to have, or at least an option to opt into
Hi @ian - your suggestions on the rate limit helpers look great. It seems similar to Upstash's sliding-window rate limit library, but your phrasing and framing of rollover units is easier to reason about than Upstash's sliding window of time
Also, an Upstash cold start for the sliding window takes a lot of "commands": I have seen about 10 commands for one user submission on the frontend
Interesting - I haven't used Upstash yet, so I'd be interested if there are any nice features I could incorporate. If anyone wants me to prioritize this, let me know! It's on my list, but has gotten pre-empted
I'm currently working on a revamp of wrapping database operations, which you can use for RLS but also for database triggers, denormalization, etc. Then I was thinking of making a headless version of the work stealing pattern I just wrote a couple of Stack posts about.
@ian if possible could you prioritise this? It's a much-needed feature for all AI apps. Also, could you give your suggestion for a free AI tool without any user registration: how can we rate limit it to prevent abuse?
For a free tool there’s a couple layers you could add:
- Basic: use a client-generated session ID that they're rate limited on.
- Medium protection: you'd need to rate limit authorizing new session IDs for malicious users who figure out how to keep minting more. Maybe a speed bump where you can't use a new session ID for 15s would discourage script kiddies
- Leaky but handy: connect the session ID to an IP via a single call to a Next.js API endpoint that reads the IP and sends it to Convex to associate (rough sketch below). You then rate limit from Convex using the sessionId once it has an associated IP. Or put a Convex HTTP endpoint behind a Cloudflare proxy that writes the IP into a header (haven't done this but assume it's straightforward)
- Robust: authorize the session ID after they submit a captcha. This is probably your best bet
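A minimal sketch of the "leaky but handy" option, assuming a Next.js App Router route and a placeholder sessions.associateIp mutation on the Convex side:
```ts
// app/api/associate-ip/route.ts (Next.js App Router)
import { NextRequest, NextResponse } from "next/server";
import { ConvexHttpClient } from "convex/browser";
import { api } from "@/convex/_generated/api";

const convex = new ConvexHttpClient(process.env.NEXT_PUBLIC_CONVEX_URL!);

export async function POST(req: NextRequest) {
  const { sessionId } = await req.json();
  // x-forwarded-for is set by most hosts' proxies; only trust it behind one.
  const ip = req.headers.get("x-forwarded-for")?.split(",")[0] ?? "unknown";
  // "sessions.associateIp" is a placeholder mutation that records the pair,
  // after which you rate limit on the sessionId in Convex.
  await convex.mutation(api.sessions.associateIp, { sessionId, ip });
  return NextResponse.json({ ok: true });
}
```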
@Web Dev Cody I got around to making the rate limit library. Would love to know what you think:
It's in convex-helpers@0.1.38.
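The basic shape, roughly (double-check the post and the convex-helpers README for the exact API; this is from memory):
```ts
import { defineRateLimits } from "convex-helpers/server/rateLimit";

const MINUTE = 60 * 1000;

// Define named limits once, then consume them inside mutations.
// (The helper also ships table definitions to add to your schema.)
const { rateLimit } = defineRateLimits({
  sendMessage: { kind: "token bucket", rate: 10, period: MINUTE, capacity: 20 },
});

// In a mutation, keyed per user:
// const { ok, retryAt } = await rateLimit(ctx, { name: "sendMessage", key: userId });
```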
More info in this new Stack post: https://stack.convex.dev/rate-limiting
Implementing Rate Limiting with only two numbers
Implementing application rate limiting when you have fast access to a database with strong ACID guarantees. Token bucket and fixed window, with fairne...