How do I choose whether to trust a particular website?
I believe in cross-checking between sources of information. Most of them start on the internet, but we have more and more ways to link virtual resources (e.g. websites) with the physical world. Here are a few checks I use to decide whether I can trust a website. I hope it helps :)
- Does the site look professionally made?
- Does the site have a certificate? Is it valid? Please keep in mind that invalid certificates are unfortunately very common.
- If there is a login/e-business part in the site, is it using SSL?
- Do I find any positive/negative feedback on this website on forums/mailing lists?
- Does the whois.net record tell me anything that would help me trust the domain owner?
- Is there an address in the contact section of the website? Using a search engine, can I find other businesses at the same address? Can I find a telephone number for another company there and check with them whether the site is legit?
- Can I see the building on Google Street View? Does this building give me confidence?
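The certificate check in the list above can be done by hand in the browser, but here is a minimal Python sketch of the same idea. The function names `fetch_certificate` and `days_until_expiry` are my own, and the sketch only covers what the standard `ssl` module verifies by default (certificate chain, hostname, validity dates). It says nothing about whether the people behind the site are honest.

```python
import socket
import ssl
import time


def fetch_certificate(host, port=443, timeout=5.0):
    """Connect over TLS and return the server certificate as a dict.

    Raises ssl.SSLCertVerificationError if the chain or hostname
    does not verify against the system's trusted CAs.
    """
    context = ssl.create_default_context()  # verification is on by default
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def days_until_expiry(cert, now=None):
    """Return the number of days until the certificate's notAfter date.

    A negative result means the certificate has already expired.
    """
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    if now is None:
        now = time.time()
    return (expires - now) / 86400
```

Calling `days_until_expiry(fetch_certificate("example.com"))` (the hostname is just a placeholder) needs network access; wrap it in a `try`/`except ssl.SSLCertVerificationError` if you want to report invalid certificates rather than crash on them.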
A bit more advanced:
- If there is a login feature, do simple injection techniques work?
- Browsing through the HTML/JavaScript code, do I see any reason not to trust that website? Here I'm looking for visible access control flaws, stored logins and passwords...
- If the website is using a CMS (e.g. WordPress, SPIP), is it a "safe" version?
There has been some interesting research on trust metrics: Attack Resistant Trust Metrics, A Model for Trust Metrics Analysis, How to incorporate revocation status information into the trust metrics for public-key certification.
The basic concept is building a set of links from you to your target. Unfortunately, web sites don't embody trust. People are the appropriate trust originators. This is a concept easily lost in the modern era, where technology is portrayed as sterile and disconnected from its creators and manufacturers.
However, all web sites are made by people, at least as far as I know. The problem on some sites is that the content creator is not always identified, and links are not vouched for.
One principle I apply when trying to gain confidence in a piece of data is diversity of sources. If I am seeking an evaluation of a tool, I try to get opinions from experts, amateur enthusiasts, and publications.
For example, if I were interested in buying a new camera, I would look at user forums, professional blogs, and magazines that cover photography. Of course each group could give wrong or misleading opinions, but the likelihood of at least two being wrong or misleading is smaller than the likelihood of any one group being wrong or misleading.
Note that this does not guarantee good results. It only gives me a higher level of assurance that the evaluation is correct.
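To put rough numbers on the camera example, here is a small Python sketch. It assumes the three groups err independently and with the same probability, which is a simplification of mine (real sources often copy each other), and the 20% error rate is made up purely for illustration.

```python
from math import comb


def prob_at_least_k_wrong(p_wrong, n=3, k=2):
    """Probability that at least k of n independent sources are wrong,
    when each source is wrong with probability p_wrong (binomial model)."""
    return sum(
        comb(n, i) * p_wrong**i * (1 - p_wrong) ** (n - i)
        for i in range(k, n + 1)
    )


# Illustrative figure only: each group is wrong 20% of the time.
p = 0.2
print(f"one group wrong:        {p:.3f}")
print(f"at least two of three:  {prob_at_least_k_wrong(p):.3f}")
```

With these assumptions a majority of the three groups is wrong about 10.4% of the time, against 20% for a single group. The advantage only holds while each source is wrong less than half the time; above that, polling more sources makes a wrong majority more likely, not less.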
Second, I devote more resources to collecting evaluations of high-value data. Not all questions deserve thorough vetting. Opinions on which web comic will make you laugh don't require more checking because you can trivially confirm the claim yourself. Likewise, purchases under $20 US rarely require much thought unless they may have an impact on my health or safety. For example: is the generic painkiller as safe as the name brand? Or, should I buy the USB flash drive from vendor A or vendor B?
The question isn't quite right: Trust isn't binary. I think you really want to know "How can I decide how much to trust a particular website?"
In the end I think it comes down to how much I must trust each particular site.
The sites that I have to trust the most (bank, brokerage, etc.) I have a physical offline relationship with. The companies that run those websites have a significant offline reputation and presence; they correspond with me on dead trees via snail mail and they have a phone number where I can talk to a human. This is somewhat outside the scope of the question since you have said the only information you have is from the web itself, but even then you can verify the physical presence via multiple avenues (Google, WHOIS, Wikipedia, online reviews -- e.g. bank comparison websites, etc.). Also, if I have to "fully" trust a site -- for financial info or other sensitive data -- then I am unlikely to do so unless my reason to trust them is backed up by a significant, trustworthy offline presence.
The sites that I trust the least are those that I don't have to trust. JimBob's Game Zone & Happy Fun Time, for example: hmm, you say you've got this really fun game that I've got to enable Java to play? Sorry, but no thanks. (The same is true even for somewhat more reputable sites that still have, say, Java-based financial calculators that I'd like to use. I can find an alternative that doesn't expose me to a huge attack surface.)
In between there are sites that you have to trust somewhat, but not fully. In other words, you may need to expose yourself to some risk. For example, H&R Block has a calculator that estimates how much tax I owe for last year. I have to enable scripting and Flash for their calculator to work, and I have to enter personal data (e.g. income, family status) -- and it has to be accurate (though not perfectly precise) in order for me to get the answer I want. I don't have to give any identification, so beyond the exposure to scripting and some limited data it's an acceptable risk; I may access it via a proxy to hide my data from my ISP and hide my location from the site.
Other sites want to collect tons of data from you, they want to identify you personally, etc., and they give you a service in return. Facebook or Google, for example; to a lesser extent, Stack Exchange. I, for one, choose to actively distrust the omnipresent sites: this takes effort since you have to block multiple domains via NoScript or other browser plugins to prevent the company from tracking you around the web. I choose not to use Facebook. I use multiple Google products, but here I've chosen to pay for their service with my data; I still block their tracking domains when I am not using their products.
The issue at hand in the linked question is whether to trust an Ubuntu ISO image downloaded from a particular website. Based on what I've written above, I think the answer is that you don't have to trust it much, so you shouldn't. ("Trust, but verify.") Download the image from anywhere reputable enough that you don't waste your time and bandwidth. I'd probably pull the torrent: you don't have to trust any single site. Then verify the hash via multiple channels: check Canonical's website, check other websites, use multiple proxies to check those websites (so you are less likely to be MITM'd), ask on IRC, call a friend, etc.
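The hash step above could look roughly like this in Python. This is only a sketch: I'm assuming the published checksum is SHA-256 (check which algorithm the checksum file you find actually uses), and `sha256_of_file` and `hash_confirmed` are names I made up. You still have to collect the expected hashes yourself, through channels that are as independent of each other as you can manage.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a multi-gigabyte ISO never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_confirmed(path, published_hashes):
    """True only if every hash gathered from the separate channels
    (website, mirror, IRC, a friend...) matches the downloaded file.

    An empty list counts as unconfirmed: no evidence is not a match.
    """
    if not published_hashes:
        return False
    actual = sha256_of_file(path)
    return all(actual == h.strip().lower() for h in published_hashes)
```

If `hash_confirmed` returns False, that alone doesn't tell you whether the download or one of the channels is the bad one, but either way the image shouldn't be used until the disagreement is explained.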