[PATCH]HTTP/2 connection not properly closing during graceful shutdown
Maxim Dounin
mdounin at mdounin.ru
Tue May 21 09:26:38 UTC 2024
Hello!
On Fri, May 17, 2024 at 04:35:57PM +0800, Kasei Wang wrote:
> Hello!
>
> On Fri, May 17, 2024 at 2:06 AM Maxim Dounin <mdounin at mdounin.ru> wrote:
> >
> > Hello!
> >
> > On Wed, May 15, 2024 at 11:46:29AM +0800, Kasei Wang wrote:
> >
> > > Hello,
> > >
> > > I found that there is a slight probability of HTTP/2 connections not
> > > properly closing during graceful shutdown, leading to worker processes
> > > in shutting down state remaining stuck for an extended period. After
> > > investigation, the issue appears to stem from the following:
> > >
> > > 1. worker processes in shutting down state use
> > > ngx_close_idle_connections to close all idle connections, including
> > > HTTP/2 connections.
> > > 2. For HTTP/2 connections, c->idle is set to true in ngx_http_v2_init.
> > > According to the explanation in
> > > <https://hg.nginx.org/nginx/rev/5e95b9fb33b7>, GOAWAY should be sent to
> > > all HTTP/2 connections.
> > > 3. There might be a time gap between ngx_event_accept and
> > > ngx_http_v2_init. For TLS connections, ngx_http_v2_init will be
> > > executed after ALPN received, and for plaintext http2 connections,
> > > ngx_http_v2_init will be executed after parsing the http2 preface. If
> > > ngx_close_idle_connections is executed between ngx_event_accept and
> > > ngx_http_v2_init, there's a possibility that c->idle of some
> > > connections is set to true AFTER ngx_close_idle_connections, causing
> > > those connections to not enter the GOAWAY process and leading to the
> > > aforementioned problem.
> > >
> > > To verify this, I've written a simple HTTP/2 client. This program will
> > > wait 15 seconds after TCP connection establishment before starting to
> > > send data. The purpose of sleep is to to raise the probability of
> > > encountering the issue. You can reproduce the problem by executing
> > > "nginx -s reload" during this 15-second wait. If you're interested, you
> > > can try my test program
> > > (<https://github.com/kaseiwang/http2-grace-shutdown-test>) to reproduce
> > > the issue.
> > >
> > > The following patch would call ngx_http_v2_finalize_connection to close
> > > http2 connections which is initialized after
> > > ngx_close_idle_connections.
> > >
> > > And here are some previous discussion on the another maillist:
> > > <https://mailman.nginx.org/pipermail/nginx-devel/2024-April/TSBIREWXEN7B4U7SRSR4LRJDPOR3XH4Q.html
> > > >
> > >
> > > Please confirm if this issue exists, review my analysis and the patch
> > > if possible. Thank you very much.
> >
> > I think your analysis is correct. Thanks for catching this.
> >
> > > # HG changeset patch
> > > # User Kasei Wang <kasei at kasei.im>
> > > # Date 1715744317 -28800
> > > # Wed May 15 11:38:37 2024 +0800
> > > # Node ID 7f18358cc012039f6bb4e502a6f39a7bd50d669b
> > > # Parent 4cfd64a8330dd5bacd839796732fd8c49d87e86e
> > > HTTP/2: close http2 connections initialized during graceful shutdown.
> > >
> > > In some rare cases, a HTTP/2 connections can be initialized during a
> > > graceful shutdown. Now close such an connection to avoid unexcepted
> > > delays in the graceful shutdown.
> > >
> > > diff -r 4cfd64a8330d -r 7f18358cc012 src/http/v2/ngx_http_v2.c
> > > --- a/src/http/v2/ngx_http_v2.c Tue May 14 17:47:44 2024 +0300
> > > +++ b/src/http/v2/ngx_http_v2.c Wed May 15 11:38:37 2024 +0800
> > > @@ -304,6 +304,11 @@
> > > c->idle = 1;
> > > ngx_reusable_connection(c, 0);
> > >
> > > + if (ngx_exiting) {
> > > + ngx_http_v2_finalize_connection(h2c, NGX_HTTP_V2_NO_ERROR);
> > > + return;
> > > + }
> > > +
> > > if (c->buffer) {
> > > p = c->buffer->pos;
> > > end = c->buffer->last;
> >
> > The patch looks working, though in the case in question the code
> > will do various otherwise unneeded things, notably will send
> > SETTINGS and WINDOW_UPDATE frames to the client. OTOH, avoiding
> > these might be tricky and will require more complex changes, and
> > probably don't worth the effort.
> >
> > Another approach might be to allow just one request if a
> > connection is established just before the shutdown, much like
> > HTTP/1.x code does. Something as simple as the following patch
> > should do the trick:
> >
> > diff --git a/src/http/v2/ngx_http_v2.c b/src/http/v2/ngx_http_v2.c
> > --- a/src/http/v2/ngx_http_v2.c
> > +++ b/src/http/v2/ngx_http_v2.c
> > @@ -1336,7 +1336,8 @@ ngx_http_v2_state_headers(ngx_http_v2_co
> > clcf = ngx_http_get_module_loc_conf(h2c->http_connection->conf_ctx,
> > ngx_http_core_module);
> >
> > - if (clcf->keepalive_timeout == 0
> > + if (ngx_exiting
> > + || clcf->keepalive_timeout == 0
> > || h2c->connection->requests >= clcf->keepalive_requests
> > || ngx_current_msec - h2c->connection->start_time
> > > clcf->keepalive_time)
> >
> >
> > What do you think?
[...]
> Thanks for your response.
>
> Actually, I was worried that sending GOAWAY on stream 0 might cause
> compatibility issues for some clients. There are so many client
> implementations, and I can't be sure that all clients can correctly
> retry the request in this situation.
>
> The disadvantage of your approach is that the worker process may still
> need to wait for the first request to arrive. This is not a very long
> time, and it should be limited by the client_header_timeout, which is
> 60s by default. So I think the waiting is acceptable. The advantage of
> this approach is that the process to send GOAWAY frame would be more
> similar to the existing reload process, reducing the possibility of
> affecting client compatibility.
>
> Overall, I prefer your approach because it minimizes the change impact
> for the client.
Thanks for sharing.
For completeness, full patch with commit log below.
# HG changeset patch
# User Maxim Dounin <mdounin at mdounin.ru>
# Date 1716283099 -10800
# Tue May 21 12:18:19 2024 +0300
# Node ID 429d40e8275d2606da7cb710de5bc40a905fe52f
# Parent 46ecad404a296042c0088e699f275a92758e5ab9
HTTP/2: handling of connections initialized during shutdown.
If an HTTP/2 connection opened before a graceful shutdown, but
ngx_http_v2_init() is called after idle connections were closed, such
a connection ended up being open till closed by the client (or up to
keepalive_time), delaying shutdown.
With this change, such connections are allowed to serve just one request,
much like it happens in HTTP/1.x, and closed afterwards.
Reported by Kasei Wang,
https://freenginx.org/pipermail/nginx-devel/2024-May/000277.html
diff --git a/src/http/v2/ngx_http_v2.c b/src/http/v2/ngx_http_v2.c
--- a/src/http/v2/ngx_http_v2.c
+++ b/src/http/v2/ngx_http_v2.c
@@ -1336,7 +1336,8 @@ ngx_http_v2_state_headers(ngx_http_v2_co
clcf = ngx_http_get_module_loc_conf(h2c->http_connection->conf_ctx,
ngx_http_core_module);
- if (clcf->keepalive_timeout == 0
+ if (ngx_exiting
+ || clcf->keepalive_timeout == 0
|| h2c->connection->requests >= clcf->keepalive_requests
|| ngx_current_msec - h2c->connection->start_time
> clcf->keepalive_time)
--
Maxim Dounin
http://mdounin.ru/
More information about the nginx-devel
mailing list