Sample pages for various character sets

Content Negotiation

Content negotiation uses features of the Apache server to serve a document based on natural language. The browser sends an Accept-Language request header (seen by CGI scripts as HTTP_ACCEPT_LANGUAGE) and the server uses a type-map file to find the correct file. Type-map files look like this:

URI: start; vary="type,language"

URI: dk.html
Content-type: text/html
Content-language: da

URI: de.html
Content-type: text/html
Content-language: de

URI: jp.html
Content-type: text/html; charset=iso-8859-1
Content-language: ja

This is known to work with Mosaic-2.6-L10N+. The desired languages are entered in the Accept-languages box, and the server serves the first one that matches. E.g., if the browser sends an HTTP_ACCEPT_LANGUAGE header of "en,fr,de" (or "en-GB,fr,de", etc.) and the server has French, Japanese, and German available, the negotiation will yield French, since this is the first one to match. Mosaic-L10N doesn't, however, select the correct charset using the charset parameter (though it does auto-detect Japanese).
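The first-match behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Apache's actual algorithm: the function name first_match is hypothetical, tags are compared as exact case-insensitive strings, and any ";q=" parameter is simply dropped.

```python
def first_match(accept_language, available):
    """Return the first requested language tag the server has, else None.

    Hypothetical helper: exact, case-insensitive tag comparison only.
    """
    # Split the comma-separated header value; drop whitespace and any ";q=..." part.
    requested = [item.split(";")[0].strip().lower()
                 for item in accept_language.split(",")]
    offered = {tag.lower() for tag in available}
    # Walk the browser's list in order and stop at the first language on offer.
    for tag in requested:
        if tag in offered:
            return tag
    return None


# The server has French, Japanese and German, as in the example above.
print(first_match("en,fr,de", ["fr", "ja", "de"]))
print(first_match("en-GB,fr,de", ["fr", "ja", "de"]))
```

Both calls select "fr": no English variant is on offer, so the first requested language that the server has is French.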

Language negotiation is now supported in Netscape. Use Options|General|Language preference for Win/Mac. Under Unix set the X resource (in .Xdefaults) Netscape*httpAcceptLanguage, e.g.
"Netscape*httpAcceptLanguage: fr, en-US"

(This was previously stated incorrectly). Specifying a quality factor, e.g.

Content-type: text/html; qs=0.5
does not work properly with HTTP_ACCEPT_LANGUAGE. It is not currently (1996, Apache 1.0.0) possible to generate a weighted preference: the qs factor overrides the browser preference given by the HTTP_ACCEPT_LANGUAGE ordering. It might still be used in a case where one language variant on the server is so bad that it should only be served to someone who can't read anything else. A blank qs is equivalent to a qs of 1.0. Here, Pig Latin is served to someone who requests x-pig-latin and no other listed language.

Quality specification in this way works properly with Arena to select content type, e.g. to prefer JPEG over GIF or PDF over PostScript. In that case Arena generates an HTTP_ACCEPT header such as "text/html;q=0.8,text/pdf;q=0.6"
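To make the q values in such a header concrete, here is a minimal Python sketch that splits an Accept-style header into (type, q) pairs, most preferred first. The function name parse_accept is hypothetical, and the sketch assumes a well-formed header; an entry with no q parameter is given 1.0, as HTTP specifies.

```python
def parse_accept(header):
    """Split an Accept-style header into (media_type, q) pairs, highest q first."""
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue  # skip empty items such as a trailing comma
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


for media_type, q in parse_accept("text/html;q=0.8,text/pdf;q=0.6"):
    print(media_type, q)
```

For the header quoted above, text/html (q=0.8) is listed ahead of text/pdf (q=0.6).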

To see what languages your browser accepts, see e.g. http://vancouver-webpages.com/cgi-bin/test.cgi

Character sets

A charset modifier may be appended to the Content-type header, like this:
Content-type: text/plain;charset=x-euc-jp
This may be generated using the Asis feature in Apache (set in srm.conf).

This is known to work with Netscape 2.0 for X11. It overrides the language selection and automatically selects the correct font. Netscape 2.0 (X11) currently uses these locales. Netscape 3.0 does things slightly differently; this page lists the languages and charsets it currently (3.0b4) understands.

Most other browsers, however, misinterpret the charset parameter and think the document is a binary file.

Netscape 3.0 appears to understand this parameter in a META tag, e.g.

<META HTTP-EQUIV="Content-type" CONTENT="text/html;charset=x-euc-jp">
which will probably be ignored by other browsers, and so may be safe to use (assuming the server does not parse HTTP-EQUIV headers into real HTTP headers).
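Whether the value arrives in a real Content-type header or in the CONTENT attribute of a META tag, the charset is carried the same way, as a parameter after the semicolon. A minimal Python sketch of pulling it out follows; the function name charset_of is hypothetical, and the sketch does no validation of the charset name.

```python
def charset_of(content_type, default=None):
    """Return the charset parameter of a Content-type value, or default if absent."""
    # Everything after the first ";" is a list of name=value parameters.
    for param in content_type.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            # Strip surrounding whitespace and optional double quotes.
            return value.strip().strip('"')
    return default


print(charset_of("text/html;charset=x-euc-jp"))
print(charset_of("text/html"))
```

The first call yields x-euc-jp; the second finds no charset parameter and returns the default (None here), leaving the caller to decide what to assume.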

Language Samples - Content Negotiated

Samples using Content-Negotiation
Here is the Var file.

en-CA,en-GB,en-US,fr,de, Var file
en-CA,en-GB,en-US,fr,de using qs prefers en-CA, Var file
en-CA,en-GB,en-US,fr,de using qs prefers en-GB, Var file
en-CA,en-GB,en-US,fr,de using qs prefers en-US, Var file
en-CA,en-GB,en-US,fr,de using qs prefers French (fr), Var file

Language Samples - text/html with and without charset

GIF of samples

Survey

Please let me know if your browser handles charset switching and/or content-negotiation.

European Servers

Other Multi-lingual URLs