[PATCH] Upstream: max_fails doesn't respect the next_upstream config
Maxim Dounin
mdounin at mdounin.ru
Thu Oct 30 16:26:39 UTC 2025
Hello!
On Thu, Oct 30, 2025 at 02:48:54PM +0800, solomon wrote:
> # HG changeset patch
> # User Zhefeng Chen <logchen2009 at gmail.com>
> # Date 1761731339 -28800
> # Wed Oct 29 17:48:59 2025 +0800
> # Node ID b81b918263d445c32cb2cff5e2724e3cac62dab3
> # Parent c3be8460587196f376bccf29817c3a6523e18150
> Upstream: max_fails doesn't respect the next_upstream config
>
> In the description of `max_fails`, it says:
> What is considered an unsuccessful attempt is defined by the
> `proxy_next_upstream`, `fastcgi_next_upstreami`,
> `uwsgi_next_upstream`, `scgi_next_upstream`,
> `memcached_next_upstream`, and `grpc_next_upstream` directives.
>
> But actually 403 and 404 are always considered an successful
> attempt, while other cases are always considered an
> unsuccessful attempt.
>
> The `ngx_http_upstream_free_round_robin_peer` function depends
> on this to update the health check state.
The documentation of the proxy_next_upstream directive (and other
directives mentioned) further clarifies the behaviour
(http://freenginx.org/r/proxy_next_upstream):
: The directive also defines what is considered an unsuccessful
: attempt of communication with a server. The cases of error,
: timeout and invalid_header are always considered unsuccessful
: attempts, even if they are not specified in the directive. The
: cases of http_500, http_502, http_503, http_504, and http_429 are
: considered unsuccessful attempts only if they are specified in the
: directive. The cases of http_403 and http_404 are never considered
: unsuccessful attempts.
Note that error, timeout, and invalid_header cases are always
considered unsuccessful (because they are, indeed, unsuccessful,
and there isn't much to be done here). This is what you probably
mean by "other cases are always considered an unsuccessful
attempt". This doesn't apply to all other cases though -
http_500, http_502, http_503, http_504, and http_429 are only
considered unsuccessful if they are explicitly specified in the
directive. These are valid responses, but at the same time these
are error responses, so depending on the particular use case
[free]nginx can either accept them as is (and send to the client)
or try to fallback to a different server. See
ngx_http_upstream_test_next() for details on where these codes are
checked.
Note well that 403 and 404 are explicitly excluded from being
considered unsuccessful attempts. This behaviour is intentional
and designed to support some common use cases, notably:
1. When resources are distributed among multiple servers, and
"proxy_next_upstream http_404;" is used to iterate over available
servers till the requested resource is found. Switching off a
server if the resource is not found in such a setup is not desired
for obvious reasons.
2. Similarly, but with resources rsynced to all servers, so
"proxy_next_upstream http_404 http_403;" makes it possible to
fallback to other servers if the resource is not yet synced to a
particular server (or the new directory is being synced, see
5231:05c53652e7b4).
Could you please elaborate a bit more on why do you suggest to
change this behaviour? And how the mentioned use cases are
expected to be handled with the new behaviour? Is it intentional
that error, timeout, and invalid_header cases won't be considered
unsuccessful unless explicitly specified in the
proxy_next_upstream directive with the proposed change?
And invalid_header won't be considered unsuccessful by default?
[...]
--
Maxim Dounin
http://mdounin.ru/
More information about the nginx-devel
mailing list