[PATCH]HTTP/2 connection not properly closing during graceful shutdown

Maxim Dounin mdounin at mdounin.ru
Thu May 16 18:06:28 UTC 2024


Hello!

On Wed, May 15, 2024 at 11:46:29AM +0800, Kasei Wang wrote:

> Hello,
> 
> I found that there is a slight probability of HTTP/2 connections not
> properly closing during graceful shutdown, leading to worker processes
> in shutting down state remaining stuck for an extended period. After
> investigation, the issue appears to stem from the following:
> 
> 1. worker processes in shutting down state use
> ngx_close_idle_connections to close all idle connections, including
> HTTP/2 connections.
> 2. For HTTP/2 connections, c->idle is set to true in ngx_http_v2_init.
> According to the explanation in
> <https://hg.nginx.org/nginx/rev/5e95b9fb33b7>, GOAWAY should be sent to
> all HTTP/2 connections.
> 3. There might be a time gap between ngx_event_accept and
> ngx_http_v2_init. For TLS connections, ngx_http_v2_init will be
> executed after ALPN received, and for plaintext http2 connections,
> ngx_http_v2_init will be executed after parsing the http2 preface. If
> ngx_close_idle_connections is executed between ngx_event_accept and
> ngx_http_v2_init, there's a possibility that c->idle of some
> connections is set to true AFTER ngx_close_idle_connections, causing
> those connections to not enter the GOAWAY process and leading to the
> aforementioned problem.
> 
> To verify this, I've written a simple HTTP/2 client. This program will
> wait 15 seconds after TCP connection establishment before starting to
> send data. The purpose of sleep is to to raise the probability of
> encountering the issue. You can reproduce the problem by executing
> "nginx -s reload" during this 15-second wait. If you're interested, you
> can try my test program
> (<https://github.com/kaseiwang/http2-grace-shutdown-test>) to reproduce
> the issue.
> 
> The following patch would call ngx_http_v2_finalize_connection to close
> http2 connections which is initialized after
> ngx_close_idle_connections.
> 
> And here are some previous discussion on the another maillist:
> <https://mailman.nginx.org/pipermail/nginx-devel/2024-April/TSBIREWXEN7B4U7SRSR4LRJDPOR3XH4Q.html
> >
> 
> Please confirm if this issue exists, review my analysis and the patch
> if possible. Thank you very much.

I think your analysis is correct.  Thanks for catching this.

> # HG changeset patch
> # User Kasei Wang <kasei at kasei.im>
> # Date 1715744317 -28800
> #      Wed May 15 11:38:37 2024 +0800
> # Node ID 7f18358cc012039f6bb4e502a6f39a7bd50d669b
> # Parent  4cfd64a8330dd5bacd839796732fd8c49d87e86e
> HTTP/2: close http2 connections initialized during graceful shutdown.
> 
> In some rare cases, a HTTP/2 connections can be initialized during a
> graceful shutdown. Now close such an connection to avoid unexcepted
> delays in the graceful shutdown.
> 
> diff -r 4cfd64a8330d -r 7f18358cc012 src/http/v2/ngx_http_v2.c
> --- a/src/http/v2/ngx_http_v2.c Tue May 14 17:47:44 2024 +0300
> +++ b/src/http/v2/ngx_http_v2.c Wed May 15 11:38:37 2024 +0800
> @@ -304,6 +304,11 @@
>      c->idle = 1;
>      ngx_reusable_connection(c, 0);
> 
> +    if (ngx_exiting) {
> +        ngx_http_v2_finalize_connection(h2c, NGX_HTTP_V2_NO_ERROR);
> +        return;
> +    }
> +
>      if (c->buffer) {
>          p = c->buffer->pos;
>          end = c->buffer->last;

The patch looks working, though in the case in question the code 
will do various otherwise unneeded things, notably will send 
SETTINGS and WINDOW_UPDATE frames to the client.  OTOH, avoiding 
these might be tricky and will require more complex changes, and 
probably don't worth the effort.

Another approach might be to allow just one request if a 
connection is established just before the shutdown, much like 
HTTP/1.x code does.  Something as simple as the following patch 
should do the trick:

diff --git a/src/http/v2/ngx_http_v2.c b/src/http/v2/ngx_http_v2.c
--- a/src/http/v2/ngx_http_v2.c
+++ b/src/http/v2/ngx_http_v2.c
@@ -1336,7 +1336,8 @@ ngx_http_v2_state_headers(ngx_http_v2_co
     clcf = ngx_http_get_module_loc_conf(h2c->http_connection->conf_ctx,
                                         ngx_http_core_module);
 
-    if (clcf->keepalive_timeout == 0
+    if (ngx_exiting
+        || clcf->keepalive_timeout == 0
         || h2c->connection->requests >= clcf->keepalive_requests
         || ngx_current_msec - h2c->connection->start_time
            > clcf->keepalive_time)


What do you think?

-- 
Maxim Dounin
http://mdounin.ru/



More information about the nginx-devel mailing list