Single Page App + Amazon S3 + Amazon CloudFront + Prerender.io - how to set up?
It's hard to use Prerender.io with a site served statically from Amazon S3.
You could stand up an nginx or Apache server in front of S3:
    https://myapp.com
    -> https://mynginx-server.com
    -> https://myBucket.s3-eu-west-1.amazonaws.com/index.html
This solution is less than ideal because every request goes through that single server first, so you lose the closest-location benefit of CloudFront.
This is a good article about a custom solution: http://www.dave.cx/post/23/prerendering-angular-s3/
David was able to generate the static HTML pages and save them in S3, then use CloudFlare to detect _escaped_fragment_ in the URL and redirect the request to the static HTML on S3.
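The detect-and-redirect step can be sketched as a small mapping function. This is only an illustration of the idea, not David's actual code: the function name `snapshotKeyFor`, the `snapshots/` prefix and the `.html` suffix are assumptions made up for the example.

```javascript
// Hypothetical helper: map a crawler URL containing _escaped_fragment_
// to the S3 key of a pre-generated HTML snapshot.
// The 'snapshots/' prefix and '.html' suffix are assumed naming conventions.
function snapshotKeyFor(url) {
  const parsed = new URL(url);
  const fragment = parsed.searchParams.get('_escaped_fragment_');
  if (fragment === null) {
    return null; // not a crawler request; serve the normal single page app
  }
  // An empty fragment means the crawler wants the page at the URL path itself.
  const route = fragment !== '' ? fragment : parsed.pathname;
  // Strip leading and trailing slashes so the key has a predictable shape.
  const trimmed = route.replace(/^\/+|\/+$/g, '');
  return 'snapshots/' + (trimmed === '' ? 'index' : trimmed) + '.html';
}
```

With these assumptions, a request for https://myapp.com/?_escaped_fragment_=/user/bob would map to the key snapshots/user/bob.html, while a request without the parameter returns null and is served normally.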
You can use Lambda@Edge to configure CloudFront to send crawler HTTP requests directly to prerender.io.
The basic idea is to have a viewer-request handler that sets a custom HTTP header on requests that should be sent to prerender.io. For example, this Lambda@Edge code:
'use strict';
/* change the version number below whenever this code is modified */
exports.handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    const headers = request.headers;
    const user_agent = headers['user-agent'];
    const host = headers['host'];
    if (user_agent && host) {
        // The "i" flag makes the match case-insensitive, so "Twitterbot" and "twitterbot" both match.
        if (/baiduspider|Facebot|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator/i.test(user_agent[0].value)) {
            // '${PrerenderToken}' is a placeholder: substitute your own prerender.io token before deploying.
            headers['x-prerender-token'] = [{ key: 'X-Prerender-Token', value: '${PrerenderToken}'}];
            // Remember the original host so the origin-request handler can build the prerender URL.
            headers['x-prerender-host'] = [{ key: 'X-Prerender-Host', value: host[0].value}];
        }
    }
    callback(null, request);
};
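To try a handler like this locally before deploying, it helps to know the shape of the event CloudFront passes in. The following is a minimal sketch of a mock event plus a tiny runner; `mockViewerRequest` and `invoke` are hypothetical helper names, and the mock contains only the fields the handler above reads, not the full Lambda@Edge event.

```javascript
// Build a minimal mock of the viewer-request event CloudFront passes to Lambda@Edge.
// Header names are lowercased keys; each value is an array of { key, value } pairs.
function mockViewerRequest(userAgent, host) {
  return {
    Records: [{
      cf: {
        request: {
          uri: '/',
          method: 'GET',
          headers: {
            'user-agent': [{ key: 'User-Agent', value: userAgent }],
            host: [{ key: 'Host', value: host }]
          }
        }
      }
    }]
  };
}

// Hypothetical runner: call a callback-style handler and return the request it produces.
// It only works for handlers that invoke the callback synchronously.
function invoke(handler, event) {
  let result;
  handler(event, {}, (err, request) => {
    if (err) throw err;
    result = request;
  });
  return result;
}
```

For example, invoke(exports.handler, mockViewerRequest('facebookexternalhit/1.1', 'myapp.com')) should return a request whose headers include x-prerender-token and x-prerender-host, while a regular browser user agent should leave the headers unchanged.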
The CloudFront distribution must be configured to forward the X-Prerender-Host and X-Prerender-Token headers to the origin.
Finally, an origin-request handler changes the origin server if X-Prerender-Token is present:
'use strict';
/* change the version number below whenever this code is modified */
exports.handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    // Only requests tagged by the viewer-request handler are rerouted.
    if (request.headers['x-prerender-token'] && request.headers['x-prerender-host']) {
        request.origin = {
            custom: {
                domainName: 'service.prerender.io',
                port: 443,
                protocol: 'https',
                readTimeout: 20,
                keepaliveTimeout: 5,
                customHeaders: {},
                // TLSv1.2 added here because TLSv1 and TLSv1.1 are widely deprecated.
                sslProtocols: ['TLSv1', 'TLSv1.1', 'TLSv1.2'],
                // CloudFront prepends this path to the request URI; %3A%2F%2F is the URL-encoded "://".
                path: '/https%3A%2F%2F' + request.headers['x-prerender-host'][0].value
            }
        };
    }
    callback(null, request);
};
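To make the path setting above concrete: CloudFront prepends the origin path to the request URI, so prerender.io receives the original page URL, URL-encoded, as the first part of the path. A small sketch of the resulting URL follows; `prerenderUrl` is a hypothetical helper used only for illustration and is not part of the handler.

```javascript
// Sketch: reconstruct the URL CloudFront would request from the prerender origin.
// The origin path ('/https%3A%2F%2F' + host) is prepended to the request URI.
function prerenderUrl(host, uri, querystring) {
  const originPath = '/https%3A%2F%2F' + host;
  const query = querystring ? '?' + querystring : '';
  return 'https://service.prerender.io' + originPath + uri + query;
}
```

For a crawler requesting https://myapp.com/user/bob, prerenderUrl('myapp.com', '/user/bob') gives https://service.prerender.io/https%3A%2F%2Fmyapp.com/user/bob.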
There's a fully worked example at: https://github.com/jinty/prerender-cloudfront
Have a look at the full solution in the following article, which creates snapshots of your website with Grunt and serves them to search engines with nothing more than Amazon S3:
AngularJS SEO for static webpages (S3 CDN)
I managed to do this without using Prerender at all, by creating an AWS Lambda function that:
- Is mapped via an API Gateway catch-all proxy
- Requests the origin page from CloudFront (it is actually always the same index.html)
- Studies the path and figures out what resource the page is about (in my case it is simply /user/{name}, so I only have to handle one use case)
- Makes a REST API request to get the dynamic data for the user
- Regex-replaces the existing meta fields with the dynamic ones
- Returns the new index file with the new meta tags
Configure a new origin (the new Lambda function behind API Gateway) and a behaviour (map /user/* requests to this new origin). Be sure to use the "HTTPS Only" Origin Protocol Policy for the origin: API Gateway is HTTPS-only, so with any other policy the request gets redirected, and the redirect causes the hostname to change.
(If you used the redirect by accident, you will need to invalidate "/*", because due to some CloudFront bug the configuration change alone will not help; I spent multiple hours debugging this last night.)
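The Lambda function described in this answer could look roughly like the sketch below. It is not the author's actual code: the two URLs, the og:title and og:description properties, and the displayName and bio fields are all assumptions made up for illustration. It also assumes a Node.js 18+ runtime (for the global fetch) and the API Gateway Lambda proxy integration, which passes the request path as event.path.

```javascript
// Assumed URLs: replace with your own CloudFront domain and REST API.
const INDEX_URL = 'https://myapp.com/index.html';
const API_URL = 'https://api.example.com/users/';

// Extract the user name from a path such as /user/alice (null if it does not match).
function userNameFromPath(path) {
  const match = /^\/user\/([^\/?#]+)/.exec(path);
  return match ? decodeURIComponent(match[1]) : null;
}

// Escape text for safe use inside a double-quoted HTML attribute value.
function escapeAttr(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Replace the content of existing <meta property="..." content="..."> tags.
function injectMeta(html, meta) {
  return Object.keys(meta).reduce((out, property) => {
    const pattern = new RegExp(
      '(<meta\\s+property="' + property + '"\\s+content=")[^"]*(")', 'i');
    // A replacer function avoids special handling of "$" in the new value.
    return out.replace(pattern, (m, open, close) => open + escapeAttr(meta[property]) + close);
  }, html);
}

// API Gateway proxy handler; in Lambda, export it as exports.handler.
async function handler(event) {
  const name = userNameFromPath(event.path);
  let html = await (await fetch(INDEX_URL)).text();
  if (name !== null) {
    const res = await fetch(API_URL + encodeURIComponent(name));
    if (res.ok) {
      const user = await res.json();
      html = injectMeta(html, {
        'og:title': user.displayName, // assumed field name
        'og:description': user.bio    // assumed field name
      });
    }
  }
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'text/html; charset=utf-8' },
    body: html
  };
}
```

Keeping the path parsing and the meta replacement in separate pure functions makes them easy to test without network access; if the API lookup fails, the handler simply returns the unmodified index.html.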