[PATCH 1 of 3] Correctly calculate and set Age header
Maxim Dounin
mdounin at mdounin.ru
Sun Jun 23 23:42:20 UTC 2024
Hello!
On Thu, Jun 20, 2024 at 08:39:43PM +0900, Hiroaki Nakamura wrote:
> # HG changeset patch
> # User Hiroaki Nakamura <hnakamur at gmail.com>
> # Date 1718882801 -32400
> # Thu Jun 20 20:26:41 2024 +0900
> # Branch correct_age
> # Node ID c81df54e3d0333c26d4296792dc0df767b386f91
> # Parent 73929a4f3447d558747623884b5ba281c13332d8
> Correctly calculate and set Age header.
> Implement the calculation of the Age header as specified in
> "RFC 9111: HTTP Caching"
> https://www.rfc-editor.org/rfc/rfc9111.html
Thanks for the patches.
Note that it might be a good idea to update the commit log to
follow style rules, such as:
: Cache: added calculation of the Age header.
:
: Implement the calculation of the Age header as specified in
: RFC 9111 "HTTP Caching", https://www.rfc-editor.org/rfc/rfc9111.html
It also might be also a good idea to add additional implementation
details which might worth explaining, especially given that RFC
911 does not really specify how the Age header is to be
calculated, but rather gives some guidelines on possible
approaches.
See below for some more comments.
>
> diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_cache.h
> --- a/src/http/ngx_http_cache.h Thu Jun 20 20:26:24 2024 +0900
> +++ b/src/http/ngx_http_cache.h Thu Jun 20 20:26:41 2024 +0900
> @@ -59,6 +59,8 @@
> size_t body_start;
> off_t fs_size;
> ngx_msec_t lock_time;
> + time_t response_time;
> + time_t corrected_initial_age;
> } ngx_http_file_cache_node_t;
>
>
> @@ -75,6 +77,8 @@
> time_t error_sec;
> time_t last_modified;
> time_t date;
> + time_t response_time;
Note that the "response_time" field being added seems to be
exactly equivalent to the exiting "date" field.
> + time_t corrected_initial_age;
>
> ngx_str_t etag;
> ngx_str_t vary;
> diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_file_cache.c
> --- a/src/http/ngx_http_file_cache.c Thu Jun 20 20:26:24 2024 +0900
> +++ b/src/http/ngx_http_file_cache.c Thu Jun 20 20:26:41 2024 +0900
> @@ -971,6 +971,8 @@
> fcn->uniq = 0;
> fcn->body_start = 0;
> fcn->fs_size = 0;
> + fcn->response_time = 0;
> + fcn->corrected_initial_age = 0;
>
> done:
>
> @@ -980,6 +982,8 @@
Nitpicking: please consider adding
[diff]
showfunc=1
to your Mercurial configuration to ensure that function names are
shown in diffs, this makes reviews easier.
>
> c->uniq = fcn->uniq;
> c->error = fcn->error;
> + c->response_time = fcn->response_time;
> + c->corrected_initial_age = fcn->corrected_initial_age;
> c->node = fcn;
>
> failed:
> @@ -1624,6 +1628,7 @@
> ngx_int_t
> ngx_http_cache_send(ngx_http_request_t *r)
> {
> + time_t resident_time, current_age;
> ngx_int_t rc;
> ngx_buf_t *b;
> ngx_chain_t out;
> @@ -1646,6 +1651,17 @@
> return NGX_HTTP_INTERNAL_SERVER_ERROR;
> }
>
> + /*
> + * Update age response header.
> + * https://www.rfc-editor.org/rfc/rfc9111.html#name-calculating-age
> + */
> + resident_time = ngx_time() - c->response_time;
> + current_age = c->corrected_initial_age + resident_time;
> + r->headers_out.age_n = current_age;
Note that this uses data from the cache node, which is not
available if the cache node was loaded from disk, for example, on
server restart, and this will certainly lead to incorrect results.
If I'm reading the code correctly, it will result in seconds since
the Epoch in the Age header on all responses after a restart.
This suggests that either things should be implemented quite
differently, or this patch needs to be merged with the second one,
which implements loading relevant information from the cache
header.
> + ngx_log_debug3(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
> + "http file cache send, resp:%O, resident:%d, age:%d",
> + c->response_time, resident_time, current_age);
Note that there seems to be somewhat too many debug logging in the
patch.
Please also note that using "%d" for time_t is wrong, as time_t
size might be different from the size of int. For time_t, "%T"
should be used instead.
See ngx_sprintf() comment in src/core/ngx_string.c for the full
list of supported formats and corresponding types.
> +
> rc = ngx_http_send_header(r);
>
> if (rc == NGX_ERROR || rc > NGX_OK || r->header_only) {
> diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_header_filter_module.c
> --- a/src/http/ngx_http_header_filter_module.c Thu Jun 20 20:26:24 2024 +0900
> +++ b/src/http/ngx_http_header_filter_module.c Thu Jun 20 20:26:41 2024 +0900
> @@ -322,6 +322,10 @@
> len += sizeof("Last-Modified: Mon, 28 Sep 1970 06:00:00 GMT" CRLF) - 1;
> }
>
> + if (r->headers_out.age_n != -1) {
> + len += sizeof("Age: ") - 1 + NGX_OFF_T_LEN + 2;
> + }
> +
> c = r->connection;
>
> if (r->headers_out.location
> @@ -518,6 +522,10 @@
> *b->last++ = CR; *b->last++ = LF;
> }
>
> + if (r->headers_out.age_n != -1) {
> + b->last = ngx_sprintf(b->last, "Age: %O" CRLF, r->headers_out.age_n);
> + }
> +
> if (host.data) {
>
> p = b->last + sizeof("Location: ") - 1;
> diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_request.c
> --- a/src/http/ngx_http_request.c Thu Jun 20 20:26:24 2024 +0900
> +++ b/src/http/ngx_http_request.c Thu Jun 20 20:26:41 2024 +0900
> @@ -646,6 +646,7 @@
> r->headers_in.keep_alive_n = -1;
> r->headers_out.content_length_n = -1;
> r->headers_out.last_modified_time = -1;
> + r->headers_out.age_n = -1;
>
> r->uri_changes = NGX_HTTP_MAX_URI_CHANGES + 1;
> r->subrequests = NGX_HTTP_MAX_SUBREQUESTS + 1;
> diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_request.h
> --- a/src/http/ngx_http_request.h Thu Jun 20 20:26:24 2024 +0900
> +++ b/src/http/ngx_http_request.h Thu Jun 20 20:26:41 2024 +0900
> @@ -290,6 +290,7 @@
> off_t content_offset;
> time_t date_time;
> time_t last_modified_time;
> + off_t age_n;
Using off_t here looks wrong, as off_t is a type for file offsets,
and not for dates.
(Not sure it needs to be here at all though, a better solution
might be to rewrite/add the header within the upstream module.)
> } ngx_http_headers_out_t;
>
>
> diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_special_response.c
> --- a/src/http/ngx_http_special_response.c Thu Jun 20 20:26:24 2024 +0900
> +++ b/src/http/ngx_http_special_response.c Thu Jun 20 20:26:41 2024 +0900
> @@ -581,6 +581,7 @@
>
> r->headers_out.content_length_n = -1;
> r->headers_out.last_modified_time = -1;
> + r->headers_out.age_n = -1;
> }
>
>
> diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_upstream.c
> --- a/src/http/ngx_http_upstream.c Thu Jun 20 20:26:24 2024 +0900
> +++ b/src/http/ngx_http_upstream.c Thu Jun 20 20:26:41 2024 +0900
> @@ -50,6 +50,8 @@
> ngx_http_upstream_t *u);
> static ngx_int_t ngx_http_upstream_test_next(ngx_http_request_t *r,
> ngx_http_upstream_t *u);
> +static void ngx_http_upstream_update_age(ngx_http_request_t *r,
> + ngx_http_upstream_t *u, time_t now);
> static ngx_int_t ngx_http_upstream_intercept_errors(ngx_http_request_t *r,
> ngx_http_upstream_t *u);
> static ngx_int_t ngx_http_upstream_test_connect(ngx_connection_t *c);
> @@ -132,6 +134,8 @@
> ngx_table_elt_t *h, ngx_uint_t offset);
> static ngx_int_t ngx_http_upstream_process_vary(ngx_http_request_t *r,
> ngx_table_elt_t *h, ngx_uint_t offset);
> +static ngx_int_t ngx_http_upstream_process_age(ngx_http_request_t *r,
> + ngx_table_elt_t *h, ngx_uint_t offset);
> static ngx_int_t ngx_http_upstream_copy_header_line(ngx_http_request_t *r,
> ngx_table_elt_t *h, ngx_uint_t offset);
> static ngx_int_t
> @@ -319,6 +323,10 @@
> ngx_http_upstream_copy_header_line,
> offsetof(ngx_http_headers_out_t, content_encoding), 0 },
>
> + { ngx_string("Age"),
> + ngx_http_upstream_process_age, 0,
> + ngx_http_upstream_ignore_header_line, 0, 0 },
> +
> { ngx_null_string, NULL, 0, NULL, 0, 0 }
> };
>
> @@ -499,6 +507,7 @@
>
> u->headers_in.content_length_n = -1;
> u->headers_in.last_modified_time = -1;
> + u->headers_in.age_n = -1;
>
> return NGX_OK;
> }
> @@ -1068,6 +1077,7 @@
> ngx_memzero(&u->headers_in, sizeof(ngx_http_upstream_headers_in_t));
> u->headers_in.content_length_n = -1;
> u->headers_in.last_modified_time = -1;
> + u->headers_in.age_n = -1;
>
> if (ngx_list_init(&u->headers_in.headers, r->pool, 8,
> sizeof(ngx_table_elt_t))
> @@ -1549,6 +1559,7 @@
> ngx_memzero(u->state, sizeof(ngx_http_upstream_state_t));
>
> u->start_time = ngx_current_msec;
> + u->request_time = ngx_time();
Adding a duplicate start time shouldn't be needed.
>
> u->state->response_time = (ngx_msec_t) -1;
> u->state->connect_time = (ngx_msec_t) -1;
> @@ -2008,6 +2019,7 @@
> ngx_memzero(&u->headers_in, sizeof(ngx_http_upstream_headers_in_t));
> u->headers_in.content_length_n = -1;
> u->headers_in.last_modified_time = -1;
> + u->headers_in.age_n = -1;
>
> if (ngx_list_init(&u->headers_in.headers, r->pool, 8,
> sizeof(ngx_table_elt_t))
> @@ -2529,6 +2541,8 @@
> return;
> }
>
> + ngx_http_upstream_update_age(r, u, ngx_time());
Note that the "ngx_time()" as an argument looks unneeded, it
should be better to obtain the time in the function itself.
Also, from semantic point of view this probably should be in
ngx_http_upstream_process_headers(), and not a dedicated function
call.
> +
> ngx_http_upstream_send_response(r, u);
> }
>
> @@ -2615,6 +2629,7 @@
> "http upstream not modified");
>
> now = ngx_time();
> + ngx_http_upstream_update_age(r, u, now);
>
> valid = r->cache->valid_sec;
> updating = r->cache->updating_sec;
> @@ -2648,7 +2663,12 @@
> valid = ngx_http_file_cache_valid(u->conf->cache_valid,
> u->headers_in.status_n);
> if (valid) {
> - valid = now + valid;
> + ngx_log_debug3(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
> + "adjust cache valid_sec:%O, "
> + "valid:%O, init_age:%d for 304",
> + now + valid - r->cache->corrected_initial_age,
> + valid, r->cache->corrected_initial_age);
> + valid = now + valid - r->cache->corrected_initial_age;
Not sure if Age calculations should apply to caching time as
calculated with proxy_cache_valid directives, but if it does, it
should apply consistently, in all cases where proxy_cache_valid
times are used. At least ngx_http_upstream_intercept_errors()
case seems to be ignored in the patch.
Also, if Age applies, there should be a way to ignore it,
similarly to how Cache-Control can be ignored with the
proxy_ignore_headers directive.
> }
> }
>
> @@ -2672,6 +2692,59 @@
> }
>
>
> +static void
> +ngx_http_upstream_update_age(ngx_http_request_t *r, ngx_http_upstream_t *u,
> + time_t now)
> +{
> + time_t response_time, date, apparent_age, response_delay, age_value,
> + corrected_age_value, corrected_initial_age;
> +
> + /*
> + * Update age response header.
> + * https://www.rfc-editor.org/rfc/rfc9111.html#name-calculating-age
> + */
> + response_time = now;
> + if (u->headers_in.date != NULL) {
> + date = ngx_parse_http_time(u->headers_in.date->value.data,
> + u->headers_in.date->value.len);
> + if (date == NGX_ERROR) {
> + date = now;
> + }
> + } else {
> + date = now;
> + }
> + apparent_age = ngx_max(0, response_time - date);
> +
> + response_delay = response_time - u->request_time;
> + age_value = u->headers_in.age_n != -1 ? u->headers_in.age_n : 0;
> + corrected_age_value = age_value + response_delay;
> +
> + corrected_initial_age = ngx_max(apparent_age, corrected_age_value);
> + r->headers_out.age_n = corrected_initial_age;
Note that this approach, as described in RFC 9111 as a possible
way to calculate the Age header, implies that time on both proxy
server and the origin server must be "reasonably well
synchronized", which is not really the case in many practical
situations.
If the time on the origin server is mostly arbitrary, for example,
in the past, apparent_age might be very large, leading to a very
large corrected_initial_age as well, preventing caching.
As such, I would rather consider a more robust algorithm instead.
> +
> + ngx_log_debug8(NGX_LOG_DEBUG_HTTP, u->peer.connection->log, 0,
> + "http upstream set age:%O, req:%O, resp:%O, date:%O, "
> + "a_age:%O, resp_delay:%O, u_age:%O, c_age:%O",
> + corrected_initial_age, u->request_time, response_time, date,
> + apparent_age, response_delay, u->headers_in.age_n,
> + corrected_age_value);
> +
> +#if (NGX_HTTP_CACHE)
> + if (r->cache) {
> + r->cache->response_time = response_time;
> + r->cache->corrected_initial_age = corrected_initial_age;
> + if (u->headers_in.adjusting_valid_sec) {
> + r->cache->valid_sec -= corrected_initial_age;
The "adjusting_valid_sec" logic looks unclear (probably at least
needs a better name) and fragile. If I'm reading the code
correctly, it will do the wrong thing at least if the upstream
server returns something like:
Cache-Control: max-age=60
X-Accel-Expires: @<time>
since u->headers_in.adjusting_valid_sec won't be cleared.
> + ngx_log_debug2(NGX_LOG_DEBUG_HTTP, u->peer.connection->log, 0,
> + "http upstream adjusted cache "
> + "valid_sec:%O, init_age:%O",
> + r->cache->valid_sec, corrected_initial_age);
> + }
> + }
> +#endif
If I'm reading the code correctly, this will add the Age
header to responses which are not cached. This somewhat
contradicts to what RFC 9111 says in the description of the Age
header (https://www.rfc-editor.org/rfc/rfc9111.html#section-5.1-6):
: The presence of an Age header field implies that the response was
: not generated or validated by the origin server for this request.
Please also note that in many cases [free]nginx is expected to be
seen as the origin server, even if it caches some upstream server
responses according to the configuration (in many cases, ignoring
Cache-Control or Expires headers). In this case using the
Age header even on cached responses might not be desired. It
would be great to dig further into possible use cases and make
sure all are properly covered.
> +}
> +
> +
> static ngx_int_t
> ngx_http_upstream_intercept_errors(ngx_http_request_t *r,
> ngx_http_upstream_t *u)
> @@ -2747,6 +2820,7 @@
> status);
> if (valid) {
> r->cache->valid_sec = ngx_time() + valid;
> + u->headers_in.adjusting_valid_sec = 1;
> }
> }
>
> @@ -2952,6 +3026,7 @@
> r->headers_out.status_line = u->headers_in.status_line;
>
> r->headers_out.content_length_n = u->headers_in.content_length_n;
> + r->headers_out.age_n = u->headers_in.age_n;
>
> r->disable_not_modified = !u->cacheable;
>
> @@ -4616,6 +4691,7 @@
>
> if (valid) {
> r->cache->valid_sec = ngx_time() + valid;
> + u->headers_in.adjusting_valid_sec = 1;
> r->cache->error = rc;
> }
> }
> @@ -4891,6 +4967,7 @@
> }
>
> r->cache->valid_sec = ngx_time() + n;
> + u->headers_in.adjusting_valid_sec = 1;
> u->headers_in.expired = 0;
> }
>
> @@ -5053,6 +5130,7 @@
>
> default:
> r->cache->valid_sec = ngx_time() + n;
> + u->headers_in.adjusting_valid_sec = 1;
> u->headers_in.no_cache = 0;
> u->headers_in.expired = 0;
> return NGX_OK;
> @@ -5319,6 +5397,39 @@
>
>
> static ngx_int_t
> +ngx_http_upstream_process_age(ngx_http_request_t *r,
> + ngx_table_elt_t *h, ngx_uint_t offset)
> +{
> + ngx_http_upstream_t *u;
> +
> + u = r->upstream;
> +
> + if (u->headers_in.age) {
> + ngx_log_error(NGX_LOG_ERR, r->connection->log, 0,
> + "upstream sent duplicate header line: "%V: %V", "
> + "previous value: "%V: %V"",
> + &h->key, &h->value,
> + &u->headers_in.age->key,
> + &u->headers_in.age->value);
> + return NGX_HTTP_UPSTREAM_INVALID_HEADER;
> + }
> +
> + h->next = NULL;
> + u->headers_in.age = h;
> + u->headers_in.age_n = ngx_atoof(h->value.data, h->value.len);
> +
> + if (u->headers_in.age_n == NGX_ERROR) {
> + ngx_log_error(NGX_LOG_ERR, r->connection->log, 0,
> + "upstream sent invalid "Age" header: "
> + ""%V: %V"", &h->key, &h->value);
> + return NGX_HTTP_UPSTREAM_INVALID_HEADER;
> + }
Note that returning errors here violates RFC 9111 SHOULD
recommendations, notably
(https://www.rfc-editor.org/rfc/rfc9111.html#section-5.1):
: Although it is defined as a singleton header field, a cache
: encountering a message with a list-based Age field value SHOULD
: use the first member of the field value, discarding subsequent
: ones.
: If the field value (after discarding additional members, as per
: above) is invalid (e.g., it contains something other than a
: non-negative integer), a cache SHOULD ignore the field.
Unless there are good reasons, a better solution might be to
report warnings and follow the recommendations instead.
> +
> + return NGX_OK;
> +}
> +
> +
> +static ngx_int_t
> ngx_http_upstream_copy_header_line(ngx_http_request_t *r, ngx_table_elt_t *h,
> ngx_uint_t offset)
> {
> diff -r 73929a4f3447 -r c81df54e3d03 src/http/ngx_http_upstream.h
> --- a/src/http/ngx_http_upstream.h Thu Jun 20 20:26:24 2024 +0900
> +++ b/src/http/ngx_http_upstream.h Thu Jun 20 20:26:41 2024 +0900
> @@ -287,14 +287,17 @@
>
> ngx_table_elt_t *cache_control;
> ngx_table_elt_t *set_cookie;
> + ngx_table_elt_t *age;
>
> off_t content_length_n;
> time_t last_modified_time;
> + off_t age_n;
See above, "off_t" looks wrong.
[...]
Hope this helps.
--
Maxim Dounin
http://mdounin.ru/
More information about the nginx-devel
mailing list