How to prevent tracking sensitive data in URLs?
Since this is your own SPA, you could solve the problem by switching from GET requests (which carry their parameters in the URL) to POST requests (which carry them in the request body). I do not know Hotjar, but if the tracking service can be told to analyze URLs only, that would be an option worth considering.
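A minimal sketch of the GET-to-POST idea, assuming a hypothetical endpoint /api/subscribe and a hypothetical JSON field named email (neither comes from the question). The helper only builds the request, so the sensitive value ends up in the body and never in the URL:

```javascript
// Hypothetical endpoint and field name, for illustration only.
function buildEmailRequest(email) {
  return {
    // The URL carries no query string, so there is nothing sensitive to track.
    url: "/api/subscribe",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // The sensitive value travels in the request body instead.
      body: JSON.stringify({ email }),
    },
  };
}

// Usage in the browser (not executed here):
//   const { url, init } = buildEmailRequest("jane@example.com");
//   fetch(url, init);
```

This only helps for requests your own code sends; values that arrive in the address bar (such as an OAuth callback fragment) still need one of the other approaches.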
Another frequently used option is to obfuscate the parameters in the URL; see e.g. Best way to obfuscate an e-mail address on a website? However, that is never a really safe solution for sensitive data, since the deciphering step is too easy, in particular if your man-in-the-middle has every request ever sent to your SPA.
Edit: I just found that Hotjar allows RegEx, so I assume you can enter a regular expression for the URL parts to exclude.
The general syntax /foo/bar/ means that foo should be replaced by bar. In our case we want to delete the given snippet, which is why it is /foo//.
For the given case of the access token, the regular expression would be
/callback#access_token=[a-zA-Z0-9]{15}//
and respectively for the email part of the URL
/\?email=(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])//
This second RegEx is partially taken from How to validate an email address using a regular expression?
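To check what such a deletion does before entering anything in the tracker's settings, the same "replace with nothing" idea can be tried in plain JavaScript. This is only a sketch: scrubUrl is a hypothetical helper, the patterns are deliberately simpler than the ones above (any token length, any email parameter value up to the next & or #), and whether Hotjar accepts the same syntax is not something I have verified.

```javascript
// Simplified patterns, assumed for illustration; adjust to your real URLs.
const ACCESS_TOKEN = /#access_token=[a-zA-Z0-9]+/;
const EMAIL_PARAM = /[?&]email=[^&#]*/;

// Hypothetical helper: delete the sensitive parts, i.e. replace them with "".
function scrubUrl(url) {
  return url.replace(ACCESS_TOKEN, "").replace(EMAIL_PARAM, "");
}

console.log(scrubUrl("https://example.com/callback#access_token=abcDEF123456789"));
console.log(scrubUrl("https://example.com/signup?email=jane%40example.com"));
```

Note that if email is the first of several parameters, removing it this way leaves the rest of the query without its leading "?", so a real implementation would need to handle that case.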
It seems reasonable to assume that tracking scripts will read window.location.href or something similar to get the current URL, which they then store.
So a possible solution would be to create a dynamic scope that has a different value for window.location.href (with all sensitive info filtered out).
This is how it might work:
// get the tracker script as a string, so you can eval it in a dynamic scope
let trackerScript = 'console.log("Tracked url:", window.location.href)';
// now let's lock it up
function trackerJail() {
  // this local `window` shadows the global one for the eval'd code
  let window = {
    location: {
      // put your filtered url here
      href: "not so fast mr.bond"
    }
  };
  // a direct eval runs the script inside this function's scope
  eval(String(trackerScript));
}
trackerJail();
If the tracking snippet is wrapped in a function, it might be possible to create a dynamic scope for it without running eval by overriding its prototype instead. But I'm not sure you can count on tracker scripts being wrapped in a neat function you can modify.
Also, there are a couple more ways the script might try to access the URL (document.URL and document.location, for example), so make sure to cover all the exits.
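As a rough sketch of covering more than one exit at once, the jail can hand the script fake window, document and location objects as function parameters. The helper names makeFakeGlobals and runJailed are made up for this example, and the jail is not airtight: a script can still reach the real globals through globalThis, self, top or this, so treat it as an illustration of the shadowing idea rather than a security boundary.

```javascript
// Hypothetical helper: stand-ins that expose only the filtered URL.
function makeFakeGlobals(filteredUrl) {
  const location = { href: filteredUrl, toString() { return filteredUrl; } };
  const document = { URL: filteredUrl, documentURI: filteredUrl, location, referrer: "" };
  const window = { location, document };
  return { window, document, location };
}

// Hypothetical helper: run the tracker source with the fakes in scope.
function runJailed(scriptSource, filteredUrl) {
  const { window, document, location } = makeFakeGlobals(filteredUrl);
  // The parameter names shadow the real globals inside the script body.
  const jailed = new Function("window", "document", "location", scriptSource);
  return jailed(window, document, location);
}

// A stand-in "tracker" that tries three different ways to read the URL.
const snoop = "return [window.location.href, document.URL, String(location)];";
console.log(runJailed(snoop, "https://example.com/callback"));
```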
If you control the page and the order of scripts, you could read the data from the URL and then delete it before anything else can get to it.
proofOfConcept.html
<script id="firstThingToLoad.js">
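  // runs first: log the full URL, including the secret
  console.log(window.location.href);
  const keyRegex = /key=[^&]*/;
  // match() returns an array (or null); the matched text is at index 0
  const key = window.location.href.match(keyRegex);
  console.log("I have key", key);
  // rewrite the address bar without the key, without reloading the page
  const href = window.location.href.replace(keyRegex, "");
  history.replaceState({}, "", href);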
</script>
<script id="someSnoopyCode.js">
console.log("I'm snooping: ", window.location.href);
</script>
<body>
<a href="/?key=secret">Link to private</a>
</body>
Of course the "Link to private" link should not exist as is. Also, this does break refresh and most navigation in general, though there are ways to catch and save that.
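One possible variation, sketched with the standard URL API instead of a regex so that no stray "?" or "&" is left behind. The helper name extractAndStrip and the idea of keeping the value in sessionStorage to survive a refresh are my own assumptions, not something the approach above requires:

```javascript
// Hypothetical helper: pull one query parameter out of a URL and
// return both its value and the URL without it.
function extractAndStrip(href, paramName) {
  const url = new URL(href);
  const value = url.searchParams.get(paramName); // null if absent
  url.searchParams.delete(paramName);
  return { value, cleanedHref: url.href };
}

// Browser-only part: must run before any tracking script loads.
if (typeof window !== "undefined" && typeof history !== "undefined") {
  const { value, cleanedHref } = extractAndStrip(window.location.href, "key");
  if (value !== null) {
    // Assumption: sessionStorage is an acceptable place to keep the key,
    // so a refresh of the cleaned URL can still find it.
    sessionStorage.setItem("key", value);
    history.replaceState({}, "", cleanedHref);
  }
}
```

Any script on the page can read sessionStorage too, so this only keeps the value out of URL-based tracking; it does not hide it from code running in the page.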