grpc, does not exclude server for upstream on 5xx

Maxim Dounin mdounin at mdounin.ru
Mon Feb 3 22:29:31 UTC 2025


Hello!

On Mon, Feb 03, 2025 at 03:30:12PM +0300, Николай Токачев wrote:

>    Hello!
> 
>    There is an urgent need to balance gRPC traffic using nginx. I set up a
>    simple test environment with two gRPC services. One of the services is
>    behind a proxy (nginx).
> 
>    Here is the load balancer configuration:
>    upstream grpc_upstream {
>      server        192.168.100.30:8080 max_fails=1 fail_timeout=15s;
>      server        192.168.100.11:30002 max_fails=1 fail_timeout=15s;
>    }
> 
>    server {
>           listen                                 80;
>           http2                                  on;
>           server_name                            nginx.home.local;
>           gzip                                   off;
>           charset                                utf-8;
>           large_client_header_buffers            8 256k;
> 
> 
>           location / {
>             grpc_buffer_size       256k;
>             grpc_next_upstream     error timeout invalid_header http_504 http_503 http_502 http_500;
>             grpc_set_header        X-Real-IP $remote_addr;
>             grpc_set_header        X-Forwarded-For $remote_addr;
>             grpc_set_header        X-Forwarded-Host $host;
>             grpc_pass_header       Set-Cookie;
>             grpc_pass              grpc://grpc_upstream;
>           }
>    }
> 
> 
>    When the gRPC service behind the proxy server is turned off, our nginx
>    load balancer receives 502 errors but does not remove the server from
>    the upstream. Half of the requests fail.
> 
>    Here is the output from tail -f /var/log/nginx/access.log:
>    192.168.100.11 - - [01/Feb/2025:12:02:06 +0300] "POST /helloworld.Greeter/SayHello HTTP/2.0" 502 157 "-" "grpcurl/v1.9.1 grpc-go/1.61.0" "-"
>    192.168.100.11 - - [01/Feb/2025:12:02:07 +0300] "POST /helloworld.Greeter/SayHello HTTP/2.0" 200 15 "-" "grpcurl/v1.9.1 grpc-go/1.61.0" "-"
>    192.168.100.11 - - [01/Feb/2025:12:02:08 +0300] "POST /helloworld.Greeter/SayHello HTTP/2.0" 502 157 "-" "grpcurl/v1.9.1 grpc-go/1.61.0" "-"
>    192.168.100.11 - - [01/Feb/2025:12:02:09 +0300] "POST /helloworld.Greeter/SayHello HTTP/2.0" 200 15 "-" "grpcurl/v1.9.1 grpc-go/1.61.0" "-"
>    192.168.100.11 - - [01/Feb/2025:12:02:10 +0300] "POST /helloworld.Greeter/SayHello HTTP/2.0" 502 157 "-" "grpcurl/v1.9.1 grpc-go/1.61.0" "-"
>    192.168.100.11 - - [01/Feb/2025:12:02:11 +0300] "POST /helloworld.Greeter/SayHello HTTP/2.0" 200 15 "-" "grpcurl/v1.9.1 grpc-go/1.61.0" "-"
> 
>    I found a similar case here:
>    https://trac.nginx.org/nginx/ticket/2060.
> 
>    Could you please let me know if there are any plans to address this
>    issue?

Thanks for the reminder, this certainly needs to be fixed.
I'll take a look.

In the meantime, a simple enough workaround is to have the proxy
server close the connection instead of returning 502 when an error
occurs.  In such a configuration the load balancer will be able to
properly mark the proxy server as failed, and will stop sending
further requests to it until fail_timeout expires.
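
As a rough illustration of that workaround, and assuming the
intermediate proxy is itself nginx (as in the original report), the
sketch below shows one way it might be expressed.  The listen port,
the backend address and the @close location name are hypothetical
placeholders, not taken from the report.  It relies on the documented
non-standard code 444, which makes nginx close the connection without
sending a response.  How this plays out over HTTP/2 may depend on the
nginx version, so it is worth testing before relying on it.

```nginx
# Sketch for the intermediate proxy server, NOT the load balancer.
# Port 8080 and the backend address 127.0.0.1:50051 are hypothetical.
server {
    listen  8080;
    http2   on;

    location / {
        grpc_pass   grpc://127.0.0.1:50051;

        # Send errors generated by this proxy to the named location
        # below instead of returning them to the load balancer.
        error_page  502 504 = @close;
    }

    location @close {
        # 444: close the connection without sending a response.
        return 444;
    }
}
```

With something like this in place, the load balancer should see a
failed connection rather than a 502 response, which is what allows
max_fails and fail_timeout in the upstream block to take effect.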

-- 
Maxim Dounin
http://mdounin.ru/

