How to set fallback encoding to UTF-8 in Firefox?
Setting the fallback encoding to UTF-8 in Firefox has been deliberately blocked; see bugzilla.mozilla.org/show_bug.cgi?id=967981#c4.
Two ways around this that I've been looking at are:
1] Apply some trivial patches to the source and build Firefox yourself, to add a Unicode [UTF-8] option to the Preferences|Content|Fonts & Colors|Advanced|"Fallback Text Encoding" drop-down menu (a build sketch follows below).
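For reference, a rough sketch of that build workflow on Linux (the patches themselves aren't shown here, and the exact steps depend on your setup):
hg clone https://hg.mozilla.org/mozilla-central
cd mozilla-central
# apply your patches here, then:
./mach bootstrap    # fetch build dependencies
./mach build
./mach run          # launch the freshly built Firefox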
2] Run a local [Apache] httpd server and set up a name-based virtual host, utfx, for the UTF-8 encoded files in directory /my/utf-8/files. The server can then send a UTF-8 charset HTTP header, which Firefox will honour, displaying the files as UTF-8 encoded. Of course, the actual file encoding has to be UTF-8!
a) /etc/httpd/httpd.conf - add:
<VirtualHost *:80>
    # This first-listed virtual host is also the default for *:80
    ServerName localhost
    DocumentRoot "/srv/httpd/htdocs"
</VirtualHost>

<VirtualHost *:80>
    ServerName utfx
    DocumentRoot "/my/utf-8/files"
    <Directory "/my/utf-8/files">
        Options Indexes
        Require all granted
    </Directory>
    ## show UTF-8 characters in file names:
    IndexOptions Charset=UTF-8
    ## for files with extension txt or html:
    AddCharset UTF-8 txt html
    ## for extensionless files, force text/plain with a UTF-8 charset ...
    <Files *>
        ForceType 'text/plain; charset=UTF-8'
    </Files>
    ## ... but undo that for files that do have an extension:
    <Files *\.*>
        ForceType None
    </Files>
</VirtualHost>
(Re)start the server with apachectl restart or apachectl graceful.
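If the server refuses to (re)start, the configuration syntax can be checked first (assuming the stock apachectl wrapper):
apachectl configtest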
b) /etc/hosts - add the domain name for accessing the UTF-8 encoded files:
127.0.0.1 utfx
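That the name resolves can be verified with getent (assuming a glibc system):
getent hosts utfx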
The Content-Type header being sent by the server can be checked with wget -S <URL>:
wget -S http://utfx/test{æø,.txt,.html} 2>&1 >/dev/null | grep Content-Type
for the three file types (testæø, test.txt, test.html).
The output should be:
Content-Type: text/plain; charset=utf-8
Content-Type: text/plain; charset=utf-8
Content-Type: text/html; charset=utf-8
c) about:config - add New|Boolean:
browser.fixup.domainwhitelist.utfx "true"
then just enter utfx in the Firefox address bar to get the file listing.
Update: this has been fixed since Firefox 66:
"UTF-8-encoded HTML (and plain text) files loaded from file: URLs are now supported without <meta charset="utf-8"> or the UTF-8 BOM."
https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/66#HTML
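A quick way to verify this on Firefox 66+ (file name and contents are just an example):
printf 'æøå\n' > /tmp/utf8-test.txt    # BOM-less UTF-8, no meta tag
firefox file:///tmp/utf8-test.txt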
Historical information from 2016
The reasoning behind this behavior seems to be described in Mozilla bugs 815551 (Autodetect UTF-8 by default) and 1071816 (Support loading BOMless UTF-8 text/plain files from file: URLs).
As far as I understand it, it basically boils down to "one should always specify the encoding, as detection is too unreliable":
- For non-local content you should leverage the protocol. With HTTP this means providing the correct charset in the Content-Type header.
- For HTML content you may additionally declare the encoding in the markup, i.e. <meta charset="utf-8" />.
- And for anything else the only standard way left is to specify a BOM (see the shell sketch below).
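As a sketch of that last point, the UTF-8 BOM (bytes EF BB BF) can be prepended from the shell; the file name here is just an example:
printf '\357\273\277' | cat - notes.txt > notes.tmp && mv notes.tmp notes.txt    # EF BB BF in octal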
Mozilla devs seem to be open to a patch that adds a preference setting, so one day it might be possible to open local BOM-less UTF-8 documents in Firefox.
As I commented on your question, I was struggling to achieve the same thing, with the purpose of correctly displaying partial HTML (the encoding is known, but there is no meta tag declaring it) from Mutt in Firefox through Mailcap.
In the end I've figured out a command that works, and which may help you too:
uconv --add-signature -f %{charset} -t UTF-8 %s | sponge %s && firefox -new-tab %s & sleep 5
I've discovered that when your UTF-8 encoded file contains a BOM, Firefox assumes it's UTF-8. So I've used the uconv command to add the BOM signature. Here %{charset} is the input charset and %s is the filename. The sponge tool (from the moreutils package) helps change the file in place, and the sleep is just there so that Mutt doesn't delete the file before Firefox finishes loading it.
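Outside of Mailcap, with the placeholders filled in, the same trick might look like this (file name and source charset are just examples):
# convert from Latin-1 to UTF-8 and prepend the BOM, rewriting the file in place
uconv --add-signature -f ISO-8859-1 -t UTF-8 part.html | sponge part.html
firefox -new-tab part.html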
I have not found any other option to set a fallback encoding in Firefox.