How can I prevent spam on sites which I control?
The following list is organized by relative ease of implementation, maintenance cost, and effectiveness at spam prevention:
Disable all user-generated content
This is a scorched-earth solution that prevents a user community from growing around your site. However, it is also guaranteed to save you the time and effort of dealing with spam or spam prevention.
Short of disabling user-generated content, there is no guaranteed way to prevent all spam (or other unwanted content) from appearing. However, a solution that deters most spammers should be sufficient if you also give your site's visitors the option to flag content as spam.
Outsource user-generated content management
Services like Disqus allow webmasters to outsource the screening, storage, and publication of user-generated comments. (Note: Using a third-party service requires extra configuration to ensure that comments are indexed by search engines.)
CAPTCHA
Per Wikipedia, CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart". Any automated test designed to prevent a computer from posting content is a CAPTCHA: this includes forcing users to read letters, numbers, and words out of images, do simple word puzzles or math questions, or otherwise "prove" they are people.
CAPTCHAs have two main disadvantages:
Most forms of CAPTCHA annoy users.
They are not 100% protective: many of these tests can be completed by computers if a competent programmer decides to invest enough time and effort in the problem.
Q&A CAPTCHA
The most effective CAPTCHA for small sites is the question-and-answer (Q&A) CAPTCHA. A Q&A CAPTCHA is a question that the website asks the user to answer. The question is something that anyone visiting the site would know, but that a generic spambot would not. An example question for a site about SEO would be "What does SEO stand for?". The average reader of that site could answer it easily, but a spambot that was not written specifically for the site could not.
NOTE: questions like "what is 1 + 1" do not work well, because they are often used, and the people who build spambots program them to answer such questions correctly.
However, if your site gets a lot of traffic, spammers will program their robots to answer those questions automatically, and the Q&A CAPTCHA will no longer be effective.
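As a rough illustration of the Q&A CAPTCHA described above, here is a minimal Python sketch. The question bank, the function names, and the choice to accept both spellings of "optimization" are assumptions made for the example; a real site would store the asked question in the user's session.

```python
import random

# Hypothetical question bank: each question maps to the answers we accept.
QUESTIONS = {
    "What does SEO stand for?": {
        "search engine optimization",
        "search engine optimisation",
    },
}


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so minor typing differences pass."""
    return " ".join(answer.lower().split())


def pick_question() -> str:
    """Choose a question to show next to the form."""
    return random.choice(list(QUESTIONS))


def is_correct(question: str, answer: str) -> bool:
    """Return True if the answer matches one of the accepted answers."""
    accepted = QUESTIONS.get(question, set())
    return normalize(answer) in accepted
```

Rotating through several site-specific questions makes it a little harder for a spammer to hard-code a single answer.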
Hidden Field
If you have a form and you don't want spammers to be able to use it, a good way to stop them is a hidden field. These are very simple to set up: add a redundant field to your form, hide it with CSS (or JavaScript), and reject any submission that contains a value in that field. Normal users cannot see the field and will leave it alone, but many of the programs used by spammers will fill it in, because they do not process CSS or JavaScript. To catch spambots that do load CSS or JavaScript, you can add a further, visible field with a request to leave it empty. Any human visitor will leave it empty, and you can easily block the bots that put data in it. Keep in mind that a visible "leave this empty" field may make the site look unprofessional.
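The server-side half of the hidden-field check can be very small. The sketch below assumes the submitted form arrives as a dictionary of strings and that the hidden field is named "website"; both are illustrative choices, not part of any particular framework.

```python
def is_probable_bot(form: dict, honeypot_field: str = "website") -> bool:
    """Return True when the hidden honeypot field was filled in.

    Human visitors never see the field, so any non-blank value
    suggests an automated submission.
    """
    value = form.get(honeypot_field, "")
    return bool(value.strip())
```

A form handler would call `is_probable_bot(form)` first and discard the submission when it returns True.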
Traffic and Content Analysis
Spammers have a limited number of networks and machines to post from (which they will typically use until they no longer work). Traffic analysis solutions gather data from a large number of hosts to determine whether a post contains known spam content or comes from a known spammer's host or network.
There are a variety of third-party CAPTCHA and traffic analysis solutions that are free (or cheap) to use, and most open source content management software includes integrated modules for services like Akismet and reCAPTCHA.
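Services like Akismet draw on data from many sites, which a single site cannot reproduce. The idea of rejecting posts from known spammer networks can still be sketched locally with Python's standard `ipaddress` module. The network list below uses reserved documentation ranges as placeholders and is purely an assumption for the example.

```python
import ipaddress

# Placeholder blocklist (documentation-only ranges, not real spam sources).
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def is_blocked(ip: str) -> bool:
    """Return True if the address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in BLOCKED_NETWORKS)
```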
Block words commonly contained in spam
If you notice that spam on your website commonly contains words or phrases that legitimate users don't (or wouldn't) use, such as "free links to your site", then blocking posts that contain them is an effective solution. If you are worried about causing problems for users who have a legitimate reason to use those words, you can set the filter to ignore posts from established users.
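A phrase filter with an exemption for established users might look like the following sketch. The phrase list, the post-count threshold of 20, and the function names are all assumptions chosen for illustration.

```python
# Hypothetical list of phrases seen in spam on this site.
BLOCKED_PHRASES = ["free links to your site"]


def contains_blocked_phrase(text: str) -> bool:
    """Case-insensitive match that also ignores extra whitespace."""
    normalized = " ".join(text.lower().split())
    return any(phrase in normalized for phrase in BLOCKED_PHRASES)


def should_reject(text: str, author_post_count: int, established_after: int = 20) -> bool:
    """Reject blocked phrases unless the author is an established user."""
    if author_post_count >= established_after:
        return False
    return contains_blocked_phrase(text)
```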
rel="nofollow"
Spammers tend to focus on sites which allow them to post links which search engines will follow (thus improving the search rank of the site they are advertising).
You can make your site less attractive to spammers by adding rel="nofollow" to any links included in user-generated content. However, this approach may not work: most spam is automated, and spammers have no way of knowing whether or not a site uses rel="nofollow" links.
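If your site builds the HTML for user-submitted links itself, adding the attribute is straightforward. The helper below is a hypothetical sketch that also escapes the URL and label; most content management systems already do something similar for you.

```python
from html import escape


def render_user_link(url: str, label: str) -> str:
    """Build an anchor tag for a user-submitted link with rel="nofollow"."""
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True), escape(label)
    )
```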
Moderation by Users
Content can be posted by anyone. However, once the content is displayed on the site, it can also be flagged as spam and removed. (This option only works in practice if visitors perceive spam to be relatively uncommon: if spam is allowed to outnumber useful comments, most visitors will not bother flagging it.)
Gamification
Gamification is a great way to motivate users to report spam. Consider adding a "flag weight" feature to your site: the more spam users report, the more points they get. This makes hunting down spam more fun and gives people who report spam bragging rights, which in turn encourages users to report more of it.
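One possible reading of "flag weight" is sketched below: each user's flags count for more once a moderator has confirmed their earlier reports, and a post is hidden when the combined weight of its flags reaches a threshold. The threshold, the class, and the reward rule are assumptions for illustration, not a description of any existing system.

```python
from collections import defaultdict

HIDE_THRESHOLD = 3  # assumed combined flag weight at which a post is hidden


class FlagTracker:
    def __init__(self):
        self.weights = defaultdict(lambda: 1)  # user -> current flag weight
        self.flags = {}  # post_id -> {user: weight when the flag was cast}

    def flag(self, user, post_id):
        """Record one flag per user per post, at the user's current weight."""
        self.flags.setdefault(post_id, {})[user] = self.weights[user]

    def is_hidden(self, post_id):
        """A post is hidden once its flags add up to the threshold."""
        return sum(self.flags.get(post_id, {}).values()) >= HIDE_THRESHOLD

    def confirm_spam(self, post_id):
        """Reward every user who flagged a post that turned out to be spam."""
        for user in self.flags.get(post_id, {}):
            self.weights[user] += 1
```

The weights can double as the points shown on a leaderboard of spam reporters.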
Moderation by Administrators
A human must review every item of content before it is published on the site. While this does not prevent spam from being posted, it does prevent spam from being displayed to the site's visitors (thus reducing the value of the site to human spammers).
User Registration
User registration is an improvement over CAPTCHA because users only have to prove that they are human once before being allowed to comment at their convenience. It is not technically a different form of spam prevention, but it does make it easier to remove spam created by a specific user or group of users (as identified by username, e-mail, IP address, or other identifying factor).
Moderate New Users
Instead of approving every post, an administrator can review new user registrations to determine whether or not to approve a user based upon whether or not the user's registration is consistent with identified spammers or automated spambots.
Limit New User Capabilities
Human spammers will rarely return to accounts they have created if they cannot post spam freely from them. Require new users to create a set number of posts (if the community has the ability to flag spam) and/or wait a set amount of time before restrictions on posting links or multiple posts are lifted.
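A restriction like this usually comes down to a single check before a post containing links is accepted. In the sketch below, the minimum of 5 posts and 3 days are arbitrary example values, and both conditions are required; a site could just as well require only one of them.

```python
from datetime import datetime, timedelta

MIN_POSTS = 5                        # assumed number of posts before links are allowed
MIN_ACCOUNT_AGE = timedelta(days=3)  # assumed waiting period


def may_post_links(post_count: int, registered_at: datetime, now: datetime) -> bool:
    """Allow links only for accounts that are old enough and active enough."""
    old_enough = (now - registered_at) >= MIN_ACCOUNT_AGE
    return post_count >= MIN_POSTS and old_enough
```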
Charge Users For Membership
If you charge for membership, even if the fee is small, spammers will be forced to weigh the cost of membership against the value of posting spam at your site (and pass over your site in favor of easier targets).
Invite Only
If you only allow people who have been invited by other users to register, this will severely cut down on spam (humans usually don't invite robots).
The following techniques come from the BOTCHA project for Drupal.
HoneyPot
An implementation of a honeypot trap. A field is added to the form with a certain initial value, which JavaScript then modifies. Any form submission whose calculated value does not match the expected value is treated as spam.
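The following Python sketch shows the general shape of such a check on the server. It is not BOTCHA's actual code: the seed generation, the "reverse the string" transformation, and the function names are assumptions chosen to keep the example short. The page's JavaScript would have to apply the same transformation to the field before the form is submitted.

```python
import secrets


def issue_seed() -> str:
    """Create the initial field value; store it in the session for later."""
    return secrets.token_hex(8)


def expected_value(seed: str) -> str:
    """Mirror of the client-side JavaScript step (here: reverse the string)."""
    return seed[::-1]


def is_spam(seed: str, submitted: str) -> bool:
    """Treat the submission as spam unless the field holds the computed value."""
    return submitted != expected_value(seed)
```

A bot that posts the form without running JavaScript sends back the untouched seed (or nothing), so the comparison fails.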
HoneyPot2
The same as above, except that the calculation uses data from CSS rather than the value of a particular field.
ObscureUrl
Similar to HoneyPot2: the value constructed by JavaScript is compared with the expected value. The difference is that the initial value is passed through a GET parameter.
Conclusion
Most webmasters will find that a mix of the solutions listed above (with the exception of disallowing user-generated content) works best for their site. At least one solution must be implemented to prevent automated spam from choking visitors' discussions.
We recently eliminated the spam from our Contact Us form with a very simple implementation. We added an input that was labeled "URL:" in the HTML form and made it invisible to the real users. Then, in the form processor, we check to see if it has a value and act accordingly.
The spambots take the bait all the time; they put in a URL to some spammy site. Our script sees that and throws away the comment (actually, we recycle the bits because we're trying to be a greener eco-friendly sort of company). For a while, we'd still store the offending comment in a database table for review but would refuse to email the results anywhere. That's how we know it worked.
With this simple method we went from around 30+ spam "Contact Us" messages a day to ZERO.
Good luck with whatever you choose!