Possible issue with LRU and shared memory zones?
Maxim Dounin
mdounin at mdounin.ru
Sat Sep 21 17:13:46 UTC 2024
Hello!
On Sat, Sep 21, 2024 at 03:14:08PM +0200, Eirik Øverby via nginx wrote:
> Hi!
>
> We've used nginx on our FreeBSD systems for what feels like forever, and
> love it. Over the last few years we've been hit by pretty massive DDoS
> attacks, and have been employing various tricks in nginx to fend them off.
> One of them is, of course, rate limiting.
>
> Given a config like:
> limit_req_zone $request zone=unique_request_5:100m rate=5r/s;
>
> and then
> limit_req zone=unique_request_5 burst=50 nodelay;
>
> we're getting messages like this:
> could not allocate node in limit_req zone "unique_request_5"
>
> We see this on an idle node that only gets very sporadic requests. However,
> this is preceded by a DDoS attack several hours earlier, which consisted of
> requests hitting this exact location block with short requests like
> POST /foo/bar?token=DEADBEEF
>
> When, after a few million requests like this in a short timespan, a "normal"
> request comes in - *much* longer than the DDoS request - e.g.
> POST /foo/bar?token=DEADBEEF&moredata=foo&evenmoredata=bar
>
> this is immediately REJECTED by the rate limiter, and we get the
> aforementioned error in the log.
>
> The current theory, supported by consulting with FreeBSD developers far more
> educated and experienced than myself, is that something is going wrong with
> the LRU allocator: Since nearly all of the shared memory zone was filled
> with short requests, freeing up one (or even two) of them will not be
> sufficient for these new requests. Only an nginx restart clears this up.
>
> Is there anything we can do to avoid this? I know the API for clearing and
> monitoring the shared memory zones until now has only been available in
> nginx plus - but we are strictly on a FOSS-only diet so using anything like
> that is obviously out of the question.
I think your interpretation is (mostly) correct, and the issue
here is that all shared memory zone pages are occupied by small
slab allocations. As such, the slab allocator cannot fulfill a
request for a larger allocation. Freeing a few limit_req nodes
doesn't fix this, at least not immediately, since each page holds
multiple nodes and is only returned to the allocator once all of
them are freed.
This is especially likely to be seen if $request is indeed very
large (larger than 2k, assuming a 4k page size), so the slab
allocator cannot serve it from the existing slabs at all and has
to fall back to allocating full pages.
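To illustrate with the requests quoted above (a rough sketch,
assuming 4k pages and borrowing the 64-byte node size used as an
example below): the short attack requests all produce nodes of
the same small slot size, so each page holds dozens of them, and
a few million such requests claim every page in the zone for that
slot size. The later, longer request needs a larger slot size
(or, for keys over 2k, whole pages), and since no completely free
page is left to start a new slab from, its allocation fails even
though the zone is full of nodes that could in principle be
expired.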
Eventually this should fix itself - each request frees up to 5
limit_req nodes (usually just 2 expired nodes, but more may be
cleared if the first allocation attempt fails). This might take a
while though, since clearing even one page may require a lot of
limit_req nodes to be freed: one page holds 64 64-byte nodes, but
since nodes are cleared in LRU order rather than page by page,
freeing 64 nodes might not be enough to empty any single page.
In the worst case this will require something like 63 * (number of
pages) nodes to be freed. For a 100m shared zone this gives
1612800 nodes, and hence about 800k requests. This probably
explains why it looks like only a restart clears things up.
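(The arithmetic, assuming the default 4k page size: 100m / 4k =
25600 pages; 63 * 25600 = 1612800 nodes; at roughly 2 expired
nodes freed per request, that's about 800k requests.)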
This could probably be somewhat improved by adjusting the number
of nodes limit_req normally clears - but the number shouldn't be
too large either, since clearing many nodes per request can open
an additional DoS vector, and hence this cannot guarantee an
allocation anyway. Something like "up to 16 normally, up to 128
in case of an allocation failure" might be a way to go though.
Another solution might be to improve the configuration so that
all limit_req nodes require an equal (or close) amount of memory -
this is usually true when $binary_remote_addr is used as the
limit_req key, but certainly not for $request. A trivial fix that
comes to mind is to use some hash, such as MD5, and limit on the
hash instead. This ensures a fixed allocation size for every
limit_req node and eliminates the problem completely.
With standard modules, this can be done with embedded Perl, such
as:
perl_set $request_md5 'sub {
    use Digest::MD5 qw(md5);
    my $r = shift;
    # md5() returns the 16-byte binary digest, which keeps the key short
    return md5($r->variable("request"));
}';
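The zone would then be keyed on the hashed variable instead
(reusing the zone name and size from your config above), e.g.:

limit_req_zone $request_md5 zone=unique_request_5:100m rate=5r/s;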
(Note though that Perl might not be the best solution for DoS
protection, as it implies noticeable overhead.)
With 3rd party modules, set_misc would probably be the most
appropriate, e.g. "set_md5 $request_md5 $request;".
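Roughly, with the same limit_req_zone keyed on $request_md5 as
above (an untested sketch; it assumes nginx is built with the
ngx_devel_kit and set_misc modules, and "/foo/" stands in for
whatever location the limit_req applies to):

location /foo/ {
    # set_md5 runs at the rewrite phase, so $request_md5 is set
    # before limit_req checks the zone at the preaccess phase
    set_md5 $request_md5 $request;
    limit_req zone=unique_request_5 burst=50 nodelay;
}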
Hope this helps.
--
Maxim Dounin
http://mdounin.ru/