[PATCH 1 of 3] Correctly calculate and set Age header
Hiroaki Nakamura
hnakamur at gmail.com
Tue Jun 25 13:47:55 UTC 2024
Hello,
Thank you for your detailed review!
2024年6月24日(月) 9:02 Maxim Dounin <mdounin at mdounin.ru>:
>
> Hello!
>
> On Thu, Jun 20, 2024 at 08:39:43PM +0900, Hiroaki Nakamura wrote:
> Note that it might be a good idea to update the commit log to
> follow style rules, such as:
>
> : Cache: added calculation of the Age header.
> :
> : Implement the calculation of the Age header as specified in
> : RFC 9111 "HTTP Caching", https://www.rfc-editor.org/rfc/rfc9111.html
I have updated the commit message.
> > @@ -75,6 +77,8 @@
> > time_t error_sec;
> > time_t last_modified;
> > time_t date;
> > + time_t response_time;
>
> Note that the "response_time" field being added seems to be
> exactly equivalent to the exiting "date" field.
Yes, the "response_time" is equivalent to the "date" field when
receiving a response from the upstream.
However the 'date" is updated when sending a cached response.
We need to keep the cache creation time.
I have renamed "response_time" to "creation_time" to make it clearer.
> [diff]
> showfunc=1
I have added this config to my ~/.hgrc.
> Note that this uses data from the cache node, which is not
> available if the cache node was loaded from disk, for example, on
> server restart, and this will certainly lead to incorrect results.
> If I'm reading the code correctly, it will result in seconds since
> the Epoch in the Age header on all responses after a restart.
>
> This suggests that either things should be implemented quite
> differently, or this patch needs to be merged with the second one,
> which implements loading relevant information from the cache
> header.
I have merged this patch with the second one.
> > + ngx_log_debug3(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
> > + "http file cache send, resp:%O, resident:%d, age:%d",
> > + c->response_time, resident_time, current_age);
> Please also note that using "%d" for time_t is wrong, as time_t
> size might be different from the size of int. For time_t, "%T"
> should be used instead.
I have changed to use "%T".
> > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_request.h
> > --- a/src/http/ngx_http_request.h Thu Jun 20 20:26:24 2024 +0900
> > +++ b/src/http/ngx_http_request.h Thu Jun 20 20:26:41 2024 +0900
> > @@ -290,6 +290,7 @@
> > off_t content_offset;
> > time_t date_time;
> > time_t last_modified_time;
> > + off_t age_n;
>
> Using off_t here looks wrong, as off_t is a type for file offsets,
> and not for dates.
I have changed the type of age_n to time_t.
> > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_upstream.c
> > --- a/src/http/ngx_http_upstream.c Thu Jun 20 20:26:24 2024 +0900
> > +++ b/src/http/ngx_http_upstream.c Thu Jun 20 20:26:41 2024 +0900
> > @@ -1549,6 +1559,7 @@
> > ngx_memzero(u->state, sizeof(ngx_http_upstream_state_t));
> >
> > u->start_time = ngx_current_msec;
> > + u->request_time = ngx_time();
>
> Adding a duplicate start time shouldn't be needed.
It is not a duplicate since u->request_time is real clock time as opposed to
u->start_time is monotonic clock.
We need a real clock time of response_time since we calculate
the difference from the upstream date value.
Also we use the calculation response_time - request_time
when the upstream date is not present or not well synchronized (see below),
so we also need request_time to be a real clock time.
> > @@ -2529,6 +2541,8 @@
> > return;
> > }
> >
> > + ngx_http_upstream_update_age(r, u, ngx_time());
>
> Note that the "ngx_time()" as an argument looks unneeded, it
> should be better to obtain the time in the function itself.
I have changed not to pass ngx_time() as an argument.
> Also, from semantic point of view this probably should be in
> ngx_http_upstream_process_headers(), and not a dedicated function
> call.
I have moved the code for
* resident_time = now - response_time;
* current_age = corrected_initial_age + resident_time;
from ngx_http_cache_send to ngx_http_upstream_process_headers.
As for ngx_http_upstream_update_age, I tried to move it into
ngx_http_upstream_process_headers(), but I could not make tests pass.
So I keep ngx_http_upstream_update_age to be called from
ngx_http_upstream_process_header and ngx_http_upstream_test_next.
> > @@ -2648,7 +2663,12 @@
> > valid = ngx_http_file_cache_valid(u->conf->cache_valid,
> > u->headers_in.status_n);
> > if (valid) {
> > - valid = now + valid;
> > + ngx_log_debug3(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
> > + "adjust cache valid_sec:%O, "
> > + "valid:%O, init_age:%d for 304",
> > + now + valid - r->cache->corrected_initial_age,
> > + valid, r->cache->corrected_initial_age);
> > + valid = now + valid - r->cache->corrected_initial_age;
>
> Not sure if Age calculations should apply to caching time as
> calculated with proxy_cache_valid directives, but if it does, it
> should apply consistently, in all cases where proxy_cache_valid
> times are used. At least ngx_http_upstream_intercept_errors()
> case seems to be ignored in the patch.
I have updated ngx_http_upstream_intercept_errors as well.
> Also, if Age applies, there should be a way to ignore it,
> similarly to how Cache-Control can be ignored with the
> proxy_ignore_headers directive.
I do not know how to do this.
Maybe add
{ ngx_string("Age"), NGX_HTTP_UPSTREAM_IGN_AGE },
to ngx_http_upstream_ignore_headers_masks?
> > @@ -2672,6 +2692,59 @@
> > }
> >
> >
> > +static void
> > +ngx_http_upstream_update_age(ngx_http_request_t *r, ngx_http_upstream_t *u,
> > + time_t now)
> > +{
> > + time_t response_time, date, apparent_age, response_delay, age_value,
> > + corrected_age_value, corrected_initial_age;
> > +
> > + /*
> > + * Update age response header.
> > + * https://www.rfc-editor.org/rfc/rfc9111.html#name-calculating-age
> > + */
> > + response_time = now;
> > + if (u->headers_in.date != NULL) {
> > + date = ngx_parse_http_time(u->headers_in.date->value.data,
> > + u->headers_in.date->value.len);
> > + if (date == NGX_ERROR) {
> > + date = now;
> > + }
> > + } else {
> > + date = now;
> > + }
> > + apparent_age = ngx_max(0, response_time - date);
> > +
> > + response_delay = response_time - u->request_time;
> > + age_value = u->headers_in.age_n != -1 ? u->headers_in.age_n : 0;
> > + corrected_age_value = age_value + response_delay;
> > +
> > + corrected_initial_age = ngx_max(apparent_age, corrected_age_value);
> > + r->headers_out.age_n = corrected_initial_age;
>
> Note that this approach, as described in RFC 9111 as a possible
> way to calculate the Age header, implies that time on both proxy
> server and the origin server must be "reasonably well
> synchronized", which is not really the case in many practical
> situations.
>
> If the time on the origin server is mostly arbitrary, for example,
> in the past, apparent_age might be very large, leading to a very
> large corrected_initial_age as well, preventing caching.
>
> As such, I would rather consider a more robust algorithm instead.
Yes, if the upstream date is too old, then apparant_age becomes too big.
How about using the following calculation?
if upstream date is between request_time and response_time, we assume
the upstream's clock is reasonably well synchronized and use
corrected_initial_age = age_value + (response_time - date);
if upstream date is not present or not between request_time
and response_time, we do not rely on the upstream date and use
corrected_initial_age = age_value + (response_time - u->request_time);
When sending a cached response, we calculate the Age
as specified in RFC 9111:
resident_time = now - response_time;
current_age = corrected_initial_age + resident_time;
where response_time is cache_creation_time.
I have updated my patchset to use the above calculation.
> > +#if (NGX_HTTP_CACHE)
> > + if (r->cache) {
> > + r->cache->response_time = response_time;
> > + r->cache->corrected_initial_age = corrected_initial_age;
> > + if (u->headers_in.adjusting_valid_sec) {
> > + r->cache->valid_sec -= corrected_initial_age;
>
> The "adjusting_valid_sec" logic looks unclear (probably at least
> needs a better name) and fragile. If I'm reading the code
> correctly, it will do the wrong thing at least if the upstream
> server returns something like:
>
> Cache-Control: max-age=60
> X-Accel-Expires: @<time>
>
> since u->headers_in.adjusting_valid_sec won't be cleared.
I found that we must ignore Expires when Cache-Control with max-age or s-maxage
is set according to https://www.rfc-editor.org/rfc/rfc9111#name-expires:
"If a response includes a Cache-Control header field with the max-age directive
(Section 5.2.2.1), a recipient MUST ignore the Expires header field. Likewise,
if a response includes the s-maxage directive (Section 5.2.2.10), a shared cache
recipient MUST ignore the Expires header field."
I have renamed adjusting_valid_sec to relative_freshness and added guard for
Expires and X-Accel-Expires: @<time> like below:
/*
* https://www.rfc-editor.org/rfc/rfc9111#name-expires
*
* If a response includes a Cache-Control header field with the max-age
* directive (Section 5.2.2.1), a recipient MUST ignore the Expires
* header field. Likewise, if a response includes the s-maxage directive
* (Section 5.2.2.10), a shared cache recipient MUST ignore the Expires
* header field.
*/
if (!u->headers_in.relative_freshness) {
r->cache->valid_sec = expires;
}
> > + ngx_log_debug2(NGX_LOG_DEBUG_HTTP, u->peer.connection->log, 0,
> > + "http upstream adjusted cache "
> > + "valid_sec:%O, init_age:%O",
> > + r->cache->valid_sec, corrected_initial_age);
> > + }
> > + }
> > +#endif
>
> If I'm reading the code correctly, this will add the Age
> header to responses which are not cached. This somewhat
> contradicts to what RFC 9111 says in the description of the Age
> header (https://www.rfc-editor.org/rfc/rfc9111.html#section-5.1-6):
>
> : The presence of an Age header field implies that the response was
> : not generated or validated by the origin server for this request.
>
> Please also note that in many cases [free]nginx is expected to be
> seen as the origin server, even if it caches some upstream server
> responses according to the configuration (in many cases, ignoring
> Cache-Control or Expires headers). In this case using the
> Age header even on cached responses might not be desired. It
> would be great to dig further into possible use cases and make
> sure all are properly covered.
I have changed to not add Age if the calculated value is zero and also
updated tests.
> > @@ -5319,6 +5397,39 @@
> >
> >
> > static ngx_int_t
> > +ngx_http_upstream_process_age(ngx_http_request_t *r,
> > + ngx_table_elt_t *h, ngx_uint_t offset)
> > +{
> > + ngx_http_upstream_t *u;
> > +
> > + u = r->upstream;
> > +
> > + if (u->headers_in.age) {
> > + ngx_log_error(NGX_LOG_ERR, r->connection->log, 0,
> > + "upstream sent duplicate header line: "%V: %V", "
> > + "previous value: "%V: %V"",
> > + &h->key, &h->value,
> > + &u->headers_in.age->key,
> > + &u->headers_in.age->value);
> > + return NGX_HTTP_UPSTREAM_INVALID_HEADER;
> > + }
> > +
> > + h->next = NULL;
> > + u->headers_in.age = h;
> > + u->headers_in.age_n = ngx_atoof(h->value.data, h->value.len);
> > +
> > + if (u->headers_in.age_n == NGX_ERROR) {
> > + ngx_log_error(NGX_LOG_ERR, r->connection->log, 0,
> > + "upstream sent invalid "Age" header: "
> > + ""%V: %V"", &h->key, &h->value);
> > + return NGX_HTTP_UPSTREAM_INVALID_HEADER;
> > + }
>
> Note that returning errors here violates RFC 9111 SHOULD
> recommendations, notably
> (https://www.rfc-editor.org/rfc/rfc9111.html#section-5.1):
>
> : Although it is defined as a singleton header field, a cache
> : encountering a message with a list-based Age field value SHOULD
> : use the first member of the field value, discarding subsequent
> : ones.
>
> : If the field value (after discarding additional members, as per
> : above) is invalid (e.g., it contains something other than a
> : non-negative integer), a cache SHOULD ignore the field.
>
> Unless there are good reasons, a better solution might be to
> report warnings and follow the recommendations instead.
I have changed the invalid Age value and duplicated Age headers
to report warnings.
> > diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_upstream.h
> > --- a/src/http/ngx_http_upstream.h Thu Jun 20 20:26:24 2024 +0900
> > +++ b/src/http/ngx_http_upstream.h Thu Jun 20 20:26:41 2024 +0900
> > @@ -287,14 +287,17 @@
> >
> > ngx_table_elt_t *cache_control;
> > ngx_table_elt_t *set_cookie;
> > + ngx_table_elt_t *age;
> >
> > off_t content_length_n;
> > time_t last_modified_time;
> > + off_t age_n;
>
> See above, "off_t" looks wrong.
I have changed it to time_t.
I am sending updated two patches:
* [PATCH 1 of 2] Cache: added calculation of the Age header
* [PATCH 2 of 2] Tests: Update and add tests for Age header
Could you take another look?
Thank you,
Hiroaki Nakamura
More information about the nginx-devel
mailing list