Proper .htaccess config for Next.js SSG
- (Wouldn't it be better for NextJS to create an article\index.html instead of a file in the root directory?)
Yes! And Next can do that for you:
It is possible to configure Next.js to export pages as index.html files and require trailing slashes, so that /about becomes /about/index.html and is routable via /about/. This was the default behavior prior to Next.js 9. To switch back and add a trailing slash, open next.config.js and enable the exportTrailingSlash config:
module.exports = {
  exportTrailingSlash: true,
}
If you request /article and /article exists as a physical directory then Apache's mod_dir will (by default) append the trailing slash in order to "fix" the URL. This is achieved with a 301 permanent redirect, so it will be cached by the browser.
However, having a physical directory with the same basename as a file, while also using extensionless URLs, creates an ambiguity: is /article supposed to access the directory /article/ or the file /article.html? You don't seem to want to allow direct access to directories anyway, so that would seem to resolve that ambiguity.
To prevent Apache mod_dir appending the trailing slash to directories we need to disable DirectorySlash. For example:
DirectorySlash Off
But as mentioned, if you have previously visited /article then the redirect to /article/ will have been cached by the browser, so you'll need to clear the browser cache before this will be effective.
Since you are removing the file extension you also need to ensure that MultiViews is disabled, otherwise, mod_negotiation will issue an internal subrequest for the underlying file, and potentially conflict with mod_rewrite. MultiViews is disabled by default, although some shared hosts do enable it for some reason. From the output you are getting it doesn't look like MultiViews is enabled, but better to be sure...
# Ensure that MultiViews is disabled
Options -MultiViews
However, if you need to be able to access the directory itself then you will need to manually append the trailing slash with an internal rewrite. Although this does not seem to be a requirement here. You should, however, ensure that directory listings are disabled:
# Disable directory listings
Options -Indexes
Attempting to access any directory that does not contain a DirectoryIndex document (and does not ultimately map to a file - see below) will return a 403 Forbidden response, instead of a directory listing.
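If direct access to directories were required, the internal rewrite mentioned above might look something like the following. This is only a sketch, and it assumes that DirectorySlash Off and RewriteEngine On are already set earlier in the same .htaccess file:

```apache
# Assumption: "DirectorySlash Off" and "RewriteEngine On" appear earlier in the file.
# Internally append the trailing slash (no external redirect) when the
# request maps to a physical directory and does not already end in a slash.
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*[^/])$ $1/ [L]
```

mod_dir can then serve the DirectoryIndex document from that directory, if one exists. This is not needed for the setup discussed here.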
Note that the only difference that could occur between following a link to domain/article, refreshing the page and manually typing domain/article is caching... either by the browser or any intermediary proxy caches. (Unless you have JavaScript that intercepts the click event on the anchor?!)
You do still need to rewrite requests from /foo to /foo.html OR /foo to /foo/index.html (see below), depending on how you have configured your site. Although it would be preferable that you choose one or the other, rather than both (as you seem to imply could be the case).
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
It is unclear how this is seemingly "working" for you currently - unless you are seeing a cached response? When you request /article, the first condition fails because this exists as a physical directory and the rule is not processed. Even with MultiViews enabled, mod_dir will take priority and append the trailing slash.
The second condition, which checks for the existence of the .html file, isn't necessarily checking the same file that is being rewritten to. eg. If you request /foo/bar, where /foo.html exists, but there is no physical directory /foo, then the RewriteCond directive checks for the existence of /foo.html, which is successful, but the request is internally rewritten to /foo/bar.html (from the captured RewriteRule pattern). This results in an internal rewrite loop and a 500 error response being returned to the client. See my answer to the following ServerFault question that goes into more detail behind what is actually happening here.
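As a rough illustration of the loop described above, the sequence can be traced in comments. This is a sketch of the expected behaviour, not actual server output, and <docroot> is just a placeholder for the document root path:

```apache
# Rule under discussion:
#   RewriteCond %{REQUEST_FILENAME} !-d
#   RewriteCond %{REQUEST_FILENAME}\.html -f
#   RewriteRule ^(.*)$ $1.html
#
# Request: /foo/bar   (only /foo.html exists; there is no /foo directory)
#   REQUEST_FILENAME          -> <docroot>/foo   (/bar is treated as path-info)
#   <docroot>/foo       !-d   -> true
#   <docroot>/foo.html  -f    -> true
#   rewritten to              -> /foo/bar.html
#
# Next pass: /foo/bar.html
#   REQUEST_FILENAME          -> <docroot>/foo   (again)
#   both conditions           -> true
#   rewritten to              -> /foo/bar.html.html
#
# ...and so on, until the internal redirect limit is hit and a 500 is returned.
```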
We can also make a further optimisation if we assume that any URL that contains what looks like a file extension (eg. your static resources .css, .js and image files) should be ignored, otherwise we are performing filesystem checks on every request, which is relatively expensive.
So, in order to map (internally rewrite) requests of the form /article to /article.html and /article/somearticle to /article/somearticle.html you would need to modify the above rule to read something like:
# Rewrite /foo to /foo.html if it exists
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
RewriteRule !\.\w{2,4}$ %{REQUEST_URI}.html [L]
There is no need to backslash escape a literal dot in the RewriteCond TestString - the dot carries no special meaning here; it's not a regex.
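To make the distinction concrete, here is a small illustrative fragment (not part of the final directives) that contrasts the TestString with the CondPattern. The environment variable name HAS_HTML_FILE is hypothetical and exists only so the conditions have a rule to attach to:

```apache
# TestString (1st argument): a plain string with variable expansion,
# so the dot before "html" is just a literal dot.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
# CondPattern (2nd argument): a regex by default, so a literal dot
# does need to be backslash escaped here.
RewriteCond %{REQUEST_URI} !\.html$
# Hypothetical rule: only flags the request, no rewrite is performed.
RewriteRule ^ - [E=HAS_HTML_FILE:1]
```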
Then, to handle requests of the form /foo that should map to /foo/index.html you can do something like the following:
# Rewrite /foo to /foo/index.html if it exists
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}/index.html -f
RewriteRule !\.\w{2,4}$ %{REQUEST_URI}/index.html [L]
Ordinarily, you would allow mod_dir to serve the DirectoryIndex (eg. index.html), but having omitted the trailing slash from the directory, this can be problematic.
Summary
Bringing the above points together, we have:
# Disable directory indexes and MultiViews
Options -Indexes -MultiViews
# Prevent mod_dir appending a slash to directory requests
DirectorySlash Off
RewriteEngine On
# Rewrite /foo to /foo.html if it exists
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
RewriteRule !\.\w{2,4}$ %{REQUEST_URI}.html [L]
# Otherwise, rewrite /foo to /foo/index.html if it exists
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}/index.html -f
RewriteRule !\.\w{2,4}$ %{REQUEST_URI}/index.html [L]
This could be further optimised, depending on your site structure and whether you are adding any more directives to the .htaccess file. For example:
1. You could check for file extensions on the requested URL at the top of the file to prevent any further processing. The RewriteRule regex on each subsequent rule could then be "simplified".
2. Requests that include a trailing slash could be blocked or redirected (to remove the trailing slash).
3. If the request is for a .html file then redirect to the extensionless URL. This is made slightly more complicated if you are dealing with both /foo.html and /foo/index.html. But this is only really necessary if you are changing an existing URL structure.
For example, implementing #1 and #2 above, would enable the directives to be written like so:
# Disable directory indexes and MultiViews
Options -Indexes -MultiViews
# Prevent mod_dir appending a slash to directory requests
DirectorySlash Off
RewriteEngine On
# Prevent any further processing if the URL already ends with a file extension
RewriteRule \.\w{2,4}$ - [L]
# Redirect any requests to remove a trailing slash
RewriteRule (.*)/$ /$1 [R=301,L]
# Rewrite /foo to /foo.html if it exists
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule (.*) $1.html [L]
# Otherwise, rewrite /foo to /foo/index.html if it exists
RewriteCond %{DOCUMENT_ROOT}/$1/index.html -f
RewriteRule (.*) $1/index.html [L]
Always test with a 302 (temporary) redirect before changing to a 301 (permanent) redirect in order to avoid caching issues.
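The third optimisation in the list above (redirecting direct requests for .html files to the extensionless URL) is not implemented in the directives shown. A minimal sketch of one way to do it follows, assuming these rules are placed immediately after RewriteEngine On, before the rule that stops processing for URLs ending in a file extension. The check against the REDIRECT_STATUS environment variable is there so that only the initial client request is redirected, not requests that have already been internally rewritten to the .html file:

```apache
# Assumption: placed directly after "RewriteEngine On", ahead of the
# "URL already ends with a file extension" rule.

# Redirect a direct request for /index.html to the site root
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^index\.html$ / [R=302,L]

# Redirect direct requests for /foo/index.html to /foo
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(.+)/index\.html$ /$1 [R=302,L]

# Redirect direct requests for /foo.html to /foo
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(.+)\.html$ /$1 [R=302,L]
```

These use a 302 for testing, as advised above. Whether the extra redirects are worthwhile depends on whether the .html URLs were ever publicly linked.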