[PATCH 1 of 3] Correctly calculate and set Age header

Hiroaki Nakamura hnakamur at gmail.com
Mon Jul 1 13:45:59 UTC 2024


Hello

2024年6月27日(木) 9:32 Maxim Dounin <mdounin at mdounin.ru>:
> [...]
> The commit message certainly needs more work.  Still, you may want
> to postpone it, as well as code modifications, before various
> conceptual questions are cleaned up.

I have updated the commit log again.

> > > > @@ -75,6 +77,8 @@
> > > >      time_t                           error_sec;
> > > >      time_t                           last_modified;
> > > >      time_t                           date;
> > > > +    time_t                           response_time;
> [...]
> The "date" field is set when the cache item is created, and
> can be updated on cache item revalidation (with
> "proxy_cache_revalidate on").  It is not updated when sending a
> cached response.

Thank you, I had misunderstood this.
I take this into consideration and I found that we can keep
ngx_http_file_cache_header_t as is if we ignore
corrected_initial_age concept and just use the Age header from the
upstream as is (see below).

> [...]
> Note that "too many debug logging" comment still applies.

I have deleted debug logs.

> [...]
>
> > > > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_upstream.c
> > > > --- a/src/http/ngx_http_upstream.c      Thu Jun 20 20:26:24 2024 +0900
> > > > +++ b/src/http/ngx_http_upstream.c      Thu Jun 20 20:26:41 2024 +0900
> > > > @@ -1549,6 +1559,7 @@
> > > >      ngx_memzero(u->state, sizeof(ngx_http_upstream_state_t));
> > > >
> > > >      u->start_time = ngx_current_msec;
> > > > +    u->request_time = ngx_time();
> > >
> > > Adding a duplicate start time shouldn't be needed.
> >
> > It is not a duplicate since u->request_time is real clock time as opposed to
> > u->start_time is monotonic clock.
> >
> > We need a real clock time of response_time since we calculate
> > the difference from the upstream date value.
> >
> > Also we use the calculation response_time - request_time
> > when the upstream date is not present or not well synchronized (see below),
> > so we also need request_time to be a real clock time.
>
> Actually, monotonic clock is much better suited for time
> difference calculation, and that's exactly the reason why it is
> used for u->start_time.  And, since at any particular moment you
> know u->start_time, current monotonic time and current wall clock
> time, it is trivial to calculate wall clock start time if needed
> (but I doubt it's actually needed, see below).
>
> [...]
>
> > > > @@ -2648,7 +2663,12 @@
> > > >              valid = ngx_http_file_cache_valid(u->conf->cache_valid,
> > > >                                                u->headers_in.status_n);
> > > >              if (valid) {
> > > > -                valid = now + valid;
> > > > +                ngx_log_debug3(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
> > > > +                               "adjust cache valid_sec:%O, "
> > > > +                               "valid:%O, init_age:%d for 304",
> > > > +                               now + valid - r->cache->corrected_initial_age,
> > > > +                               valid, r->cache->corrected_initial_age);
> > > > +                valid = now + valid - r->cache->corrected_initial_age;
> > >
> > > Not sure if Age calculations should apply to caching time as
> > > calculated with proxy_cache_valid directives, but if it does, it
> > > should apply consistently, in all cases where proxy_cache_valid
> > > times are used.  At least ngx_http_upstream_intercept_errors()
> > > case seems to be ignored in the patch.
> >
> > I have updated ngx_http_upstream_intercept_errors as well.
> >
> > > Also, if Age applies, there should be a way to ignore it,
> > > similarly to how Cache-Control can be ignored with the
> > > proxy_ignore_headers directive.
> >
> > I do not know how to do this.
> > Maybe add
> >     { ngx_string("Age"), NGX_HTTP_UPSTREAM_IGN_AGE },
> > to ngx_http_upstream_ignore_headers_masks?
>
> If at all, yes, this probably should be implemented like this.
>
> Still, for now I'm no convinced that Age needs to be taken into
> account for proxy_cache_valid.
>
> For example, consider a setup with two caching servers: an edge
> cache, a large central cache, and an origin server.  Assuming a
> response without any Cache-Control headers, something like
>
>     proxy_cache_valid 200 1h;
>
> on the edge cache, and
>
>     proxy_cache_valid 200 1d;
>
> on the large cache.
>
> With the suggested patch there will be Age header generated for
> responses returned by the large cache, effectively preventing
> caching on the edge cache.
>
> While such (obviously suboptimal and not desired) behaviour can be
> prevented by using "proxy_ignore_headers Age;" (assuming it is
> implemented), this would require extra otherwise unneeded
> configuration.  Further, this will prevent proper use of the Age
> header for Cache-Control-based caching for other requests.
>
> A better approach might be to limit the Age header to
> Cache-Control-based caching only.  Not sure though, and it
> certainly needs more analysis.

Thank you for your explanation.

I think now I understand an important difference between
proxy_cache_valid and Cache-Control.
The start time of the cache period of proxy_cache_valid is when nginx
receives a response while the start time of Cache-Control
max-age and s-maxage is when sending a response from the
origin server.

I have modified the patch to limit adjusting valid_sec by decreasing
the value of Age header to Cache-Control-based caching only.

> [...]
> This looks slightly more reasonable.  At least, this won't result
> in obviously bogus age values if clocks are not synchronized.
>
> Still, a simpler and mostly equivalent solution might be to avoid
> the whole "corrected_initial_age" concept, and use just Age as is,
> combined with the time passed since the cache item was
> created/revalidated.  The time in question is when the response
> headers were received, and usually a good approximation of the
> time when the response was actually generated.

Alright, I have changed the patch to ignore apparent_age and
response_delay and just use the upstream age.

When sending a cached response, the Age value is set to
the initial Age value plus resident time.

In this implementation, the modification of ngx_http_file_cache_header_t
is not needed since we can just use the value of the date field and
the Age header value stored in the cached response.

>
> > > > +#if (NGX_HTTP_CACHE)
> > > > +    if (r->cache) {
> > > > +        r->cache->response_time = response_time;
> > > > +        r->cache->corrected_initial_age = corrected_initial_age;
> > > > +        if (u->headers_in.adjusting_valid_sec) {
> > > > +            r->cache->valid_sec -= corrected_initial_age;
> > >
> > > The "adjusting_valid_sec" logic looks unclear (probably at least
> > > needs a better name) and fragile.  If I'm reading the code
> > > correctly, it will do the wrong thing at least if the upstream
> > > server returns something like:
> > >
> > > Cache-Control: max-age=60
> > > X-Accel-Expires: @<time>
> > >
> > > since u->headers_in.adjusting_valid_sec won't be cleared.
> >
> > I found that we must ignore Expires when Cache-Control with max-age or s-maxage
> > is set according to https://www.rfc-editor.org/rfc/rfc9111#name-expires:
> >
> > "If a response includes a Cache-Control header field with the max-age directive
> > (Section 5.2.2.1), a recipient MUST ignore the Expires header field. Likewise,
> > if a response includes the s-maxage directive (Section 5.2.2.10), a shared cache
> > recipient MUST ignore the Expires header field."
>
> The Expires header and X-Accel-Expires header are different
> headers.  The X-Accel-Expires header is a non-standard header
> which takes precedence over both Expires and Cache-Control.

I have fixed the patch to reset the relative_freshness flag for the
X-Accel-Expires header.

>
> [...]
>
> > > > +            ngx_log_debug2(NGX_LOG_DEBUG_HTTP, u->peer.connection->log, 0,
> > > > +                           "http upstream adjusted cache "
> > > > +                           "valid_sec:%O, init_age:%O",
> > > > +                           r->cache->valid_sec, corrected_initial_age);
> > > > +        }
> > > > +    }
> > > > +#endif
> > >
> > > If I'm reading the code correctly, this will add the Age
> > > header to responses which are not cached.  This somewhat
> > > contradicts to what RFC 9111 says in the description of the Age
> > > header (https://www.rfc-editor.org/rfc/rfc9111.html#section-5.1-6):
> > >
> > > : The presence of an Age header field implies that the response was
> > > : not generated or validated by the origin server for this request.
> > >
> > > Please also note that in many cases [free]nginx is expected to be
> > > seen as the origin server, even if it caches some upstream server
> > > responses according to the configuration (in many cases, ignoring
> > > Cache-Control or Expires headers).  In this case using the
> > > Age header even on cached responses might not be desired.  It
> > > would be great to dig further into possible use cases and make
> > > sure all are properly covered.
> >
> > I have changed to not add Age if the calculated value is zero and also
> > updated tests.
>
> Note that this somewhat contradicts special handling of "Age: 0"
> in HTTP/3 part of your patch.
>
> Also, if I'm reading the patch correctly, as currently implemented
> the code might result in non-zero Age even for directly returned
> non-cached responses.

I have changed the modification of the Age header as below:

* pass through the Age header from the upstream when receiving
  a response from the upstream.
* set the Age header (may be 0) when sending a cached response.

>
> Also, "Age: 0" might legitimately appear for cached responses
> during the first second after the response was cached.  Not sure
> if it needs to be returned, but given the about RFC quote it might
> be usable for some.

Yes, I think "Age: 0" is valid in this case.

* https://www.rfc-editor.org/rfc/rfc9111#name-age
  Age = delta-seconds
  delta-seconds = 1*DIGIT
* https://www.rfc-editor.org/rfc/rfc9204.html#static-table
  The value of 0 for the Age header can be omitted.


>
> Note well that the "would be great to dig further" comment still
> applies.
>
> [...]
>
> Hope this helps.
>
> --
> Maxim Dounin
> http://mdounin.ru/

Best regards,

--
Hiroaki Nakamura


More information about the nginx-devel mailing list