[PATCH] Add markdown to mime.types

Maxim Dounin mdounin at mdounin.ru
Wed Aug 28 01:24:25 UTC 2024


Hello!

On Tue, Aug 27, 2024 at 02:08:31PM +0200, Andrea Pappacoda wrote:

> Hi all!
> 
> This is in reply to a relatively old patch adding "text/markdown" to
> mime.types.
> 
> On Tue, 14 Sep 2021 20:08:15 +0300, Maxim Dounin wrote:
> > Hello!
> > 
> > On Thu, Sep 09, 2021 at 10:35:32PM -0400, Abe Massry wrote:
> > 
> > > # HG changeset patch
> > > # User Abe Massry <a at abemassry.com>
> > > # Date 1631238770 14400
> > > #      Thu Sep 09 21:52:50 2021 -0400
> > > # Branch update-mime-types
> > > # Node ID 95a61e228bc19f6b9917671dfd2e6ff52e3e0294
> > > # Parent  a525013b82967148e6e4b7e0eadd23e288001816
> > > Add markdown to mime.types
> > > 
> > > > In the Chromium browser a warning is displayed if a markdown
> > > mime type does not appear in the list of mime types on the server.
> > > The browser attempts to download the file but gives a warning
> > > saying that this type of file is usually displayed in the
> > > browser.
> > > 
> > > Files with a mime type of markdown and a file extension of `.md`
> > > should be displayed as plain text in the browser and this
> > > change adds that to the default mime types that will ship with
> > > nginx.
> > > 
> > > diff -r a525013b8296 -r 95a61e228bc1 conf/mime.types
> > > --- a/conf/mime.types	Tue Sep 07 18:21:03 2021 +0300
> > > +++ b/conf/mime.types	Thu Sep 09 21:52:50 2021 -0400
> > > @@ -9,6 +9,7 @@
> > >      application/atom+xml                             atom;
> > >      application/rss+xml                              rss;
> > > 
> > > +    text/markdown                                    md;
> > >      text/mathml                                      mml;
> > >      text/plain                                       txt;
> > >      text/vnd.sun.j2me.app-descriptor                 jad;
> > > 
> > 
> > A side note: the "text/markdown" specification says that the charset
> > attribute is required, and this is not something nginx provides unless
> > the charset module is explicitly used.
> > 
> > (see RFC 7763 and/or
> > https://www.iana.org/assignments/media-types/text/markdown)
> 
> While it is true that the IANA assignment says that the charset
> attribute is required, it does so because of RFC 6838. This RFC, in
> fact, specifies that *all* text/* MIME types "MUST" specify the charset,
> unless that information is already present in the file format itself,
> like required by XML.

RFC 6838 says:

:    If a "charset" parameter is specified, it SHOULD be a required
:    parameter, eliminating the options of specifying a default value.  If
:    there is a strong reason for the parameter to be optional despite
:    this advice, each subtype MAY specify its own default value, or
:    alternatively, it MAY specify that there is no default value.
:    Finally, the "UTF-8" charset [RFC3629] SHOULD be selected as the
:    default.  See [RFC6657] for additional information on the use of
:    "charset" parameters in conjunction with subtypes of text.
:
:    Regardless of what approach is chosen, all new text/* registrations
:    MUST clearly specify how the charset is determined; relying on the
:    US-ASCII default defined in Section 4.1.2 of [RFC2046] is no longer
:    permitted.  If explanatory text is needed, this SHOULD be placed in
:    the additional information section of the registration.

That is, registrations "MUST clearly specify how the charset is 
determined", but the "charset" parameter is not a required 
parameter unless defined as such by a particular registration.

Further, these are requirements for new registrations.  Existing 
registrations already either define charset handling explicitly or 
rely on the US-ASCII default mentioned in the quote.

This is explained in more detail in RFC 6657, which says:

:    Regardless of what approach is chosen, all new "text/*" registrations
:    MUST clearly specify how the charset is determined; relying on the
:    default defined in Section 4.1.2 of [RFC2046] is no longer permitted.
:    However, existing "text/*" registrations that fail to specify how the
:    charset is determined still default to US-ASCII.

And in particular about text/plain:

:    The default "charset" parameter value for "text/plain" is unchanged
:    from [RFC2046] and remains as "US-ASCII".

> Hence, in a way, (free)nginx is already going against RFC 6838 whenever
> sending any text/* MIME type, but it cannot really do in any other way-
> only the user can know the actual charset of a given file.

I don't think this conclusion is correct, see above.  More 
specifically, in freenginx mime.types there are the following 
text/* types:

    text/html
    text/css
    text/xml
    text/mathml
    text/plain
    text/vnd.sun.j2me.app-descriptor
    text/vnd.wap.wml
    text/x-component

None of these types define "charset" as a required parameter.

OTOH, I agree that a required charset parameter is something that 
cannot be reasonably provided by a server unless explicitly 
configured.  And the basic question is how to handle such 
requirements.  Possible options include:

- avoiding such types,

- ignoring the requirement,

- introducing some way to provide server-specific default value 
  for such types.

Given that .md files are not really used when building sites, 
avoiding the type might be the simplest option, and that's what 
implicitly happens now.
 
> So in my opinion the markdown MIME type should be added, and it is up to
> the user to comply with RFC 6838. Maybe it'd make sense to mention that
> in the mime.types file.

Just ignoring the requirement might also be an option - I don't 
see any issues with existing browsers.  Still, this would be an 
obvious RFC 7763 violation in the common case, which might 
backfire at some point.

Also, the remaining question is whether text/markdown needs to be 
added at all.  It does not seem to be meaningfully used when 
building sites, is not in Apache mime.types, and currently I'm not 
able to reproduce the warning claimed in the above commit log with 
Chromium.

Still, I tend to think it should be added, and probably with both 
".md" and ".markdown" extensions, as Markdown is becoming more and 
more popular.

Also, it is probably worth adding it to the default charset_types 
list as well, thus making it easier to add the "charset" attribute 
as required by RFC 7763.

Here is a patch:

# HG changeset patch
# User Maxim Dounin <mdounin at mdounin.ru>
# Date 1724807989 -10800
#      Wed Aug 28 04:19:49 2024 +0300
# Node ID 1729a20708cff738afa5cc32e77b2b50a4b5d91d
# Parent  d6f75dd66761c10d4bfb257ae70a212411b6a69b
MIME: added text/markdown type.

Added text/markdown type for the ".md" and ".markdown" extensions
(https://www.iana.org/assignments/media-types/text/markdown).

Additionally, text/markdown is added to the default charset_types list
of the charset module, making it easier to provide the "charset"
parameter, which is defined as REQUIRED for text/markdown.

Prodded by Andrea Pappacoda,
http://freenginx.org/pipermail/nginx-devel/2024-August/000486.html

diff --git a/conf/mime.types b/conf/mime.types
--- a/conf/mime.types
+++ b/conf/mime.types
@@ -9,6 +9,7 @@ types {
     application/atom+xml                             atom;
     application/rss+xml                              rss;
 
+    text/markdown                                    md markdown;
     text/mathml                                      mml;
     text/plain                                       txt;
     text/vnd.sun.j2me.app-descriptor                 jad;
diff --git a/src/http/modules/ngx_http_charset_filter_module.c b/src/http/modules/ngx_http_charset_filter_module.c
--- a/src/http/modules/ngx_http_charset_filter_module.c
+++ b/src/http/modules/ngx_http_charset_filter_module.c
@@ -127,6 +127,7 @@ static ngx_str_t  ngx_http_charset_defau
     ngx_string("text/html"),
     ngx_string("text/xml"),
     ngx_string("text/plain"),
+    ngx_string("text/markdown"),
     ngx_string("text/vnd.wap.wml"),
     ngx_string("application/javascript"),
     ngx_string("application/rss+xml"),



-- 
Maxim Dounin
http://mdounin.ru/

