Is link building a white hat SEO activity?
To Begin
I must apologize to those who are in favor of TL;DRs. I have included a list of links Google does not like at the top just for you!
This is one of my famous long answers. But it is important, and sometimes a short answer simply will not do. If you know me, sometimes I leave comments that just say "Short answer: No." (or Yes, as the occasion dictates). This is not one of those occasions.
Links Google does not like.
This is just a quick list of link types that Google does not like and would be seen as Black Hat. There may be more.
- Link Exchanges: Trading links in a quid pro quo fashion. You link to me and I will link to you.
- Paid Links: Links that you purchase from a company or person.
- Links in exchange for services or products: Any link where products or services are given in return for the link. Another quid pro quo.
- Article Posting: Guest posting is one thing, however, article submission/marketing sites where the same article appears in several places is not appreciated.
- Automated Links: Where links are created using automation, often by software.
- Widget or bug based links: Where a JavaScript program or similar device creates links dynamically/automatically.
- Advertorials: Articles that are designed to be an advertisement. Nofollow should be used (see the example just after this list).
- Link advertisements: This is where an advertisement link can pass value. Nofollow should be used.
- Commercially branded links: Any links that are designed to have specific commercial value. Nofollow should be used.
- Web Directories: While there are high quality directories, some offer little or no value. Directory submission, especially mass submission, should be approached with caution and consideration. Nofollow should be used.
- Sitewide links: Any link within a header, footer, or sidebar that creates links on every page or very many pages.
- Blogroll links: Often this is a set of links in one blog to other blogs.
- Forum and Blog Signatures: Links created in a user's signature block.
- Forum and Blog comments: Excessive use of links in forums and blog comments. Often this is a backlink marketing method: participating in forums, blogs, or even site comment sections simply to leave a link.
- Overly optimized link text: Links whose anchor text is designed to rank for specific keywords.
- Contextually irrelevant links: Links that are not related to the topic of the linking page and/or the target page.
- Geographically irrelevant links: Links on a geographically targeted web page where the target page does not suit that region.
- Language irrelevant links: Links on a web page in one language where the target page is in another language.
- Link velocity: Spikes in backlink velocity that signal rapid backlink building.
- Link Networks: Participating in a link building scheme or network.
- Press Release Sites: Using a PR site to build traffic and links.
- Bookmark Sites: Using a bookmarking site to build traffic and links.
- Resource Pages: While creating resource pages can be a perfectly acceptable activity for a site owner, these links can be seen as spammy. Resource pages are often too broad and large; smaller lists are preferred. Nofollow should be used.
- Excessive social media links: It is advisable to participate in social media, however, creating many social media accounts can be seen as gaming.
- Single or limited keyword links: Links with one or a few keywords are seen as gaming. Fully semantic link text is preferred.
- Links outside of content: While this is perfectly acceptable in some cases, excessive links just outside of content, often following content, can be seen as gaming.
- Ad links: Some sites participate in adware mechanisms that link keywords with pop-up style JavaScript ads.
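Several items above end with "Nofollow should be used." Mechanically, that just means adding rel="nofollow" to the anchor tag so the link does not pass rank. A minimal example (the URL is hypothetical):

```html
<!-- A paid, advertorial, or directory link that should not pass rank -->
<a href="https://example-sponsor.com/" rel="nofollow">Our sponsor</a>
```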
Things have really changed, especially over the past few years and certainly over the past decade. And you are right about most SEOs: they have yet to catch up with the reality that faces us today. The primary advice of concern is, you guessed it, inbound links (backlinks). What is funny is exactly how far behind SEO advice really is. Landmark changes are made, and SEOs seem to be five years or more behind the change.
Let us not kid ourselves, this is a HUGE topic that can likely benefit from a bit of history. I will offer some of that. Then I will get into the conundrum that we all face as best I know how (and yes, with some bias). So please forgive me if I do not get it all right or cover every aspect. Like I said, this is a huge topic. Plus, I am getting old.
Historical View
I often refer to the original research paper written in 1998 by two Stanford University students, Sergey Brin and Lawrence Page. This was the beginning of Google and tells us a lot about how Google thinks. I also sometimes refer to landmark points in Google's history where significant changes have occurred. I know for some of you this is getting old, but please stick (I have learned not to say bear/bare) with me. In the original paper, it would not surprise you, PageRank using backlinks is discussed. But what is said is often ignored.
The old model of a PR6 page with 2 outgoing links each passing PR3 was never really accurate. While to some degree this does happen, there is far more to the story. For example, if you think about it, as each page links to another there is value that cannot be calculated appropriately without revisiting the metrics for each page over and over again. This recalculation actually occurs. But also think about this: any site with a large amount of PageRank would then pass huge amounts of rank unnaturally. Under this model, it would be almost impossible for lower ranking sites to gain rank short of links from what Google terms super authorities. To combat this, authority caps are created for each site/page with large authority scores. Additionally, each link is evaluated, as it exists, for quality based upon some simple criteria and given a score from 0 to 0.9. What is actually passed is rank based upon the authority cap, modified by the link score. It is easy to see that the PR6 page is not passing PR3, but something much lower. The effect of this is to create a more natural curve in the PageRank scheme that can be adjusted.
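For those who want the arithmetic, the original paper defines PageRank roughly as:

$$PR(A) = (1 - d) + d \sum_{i=1}^{n} \frac{PR(T_i)}{C(T_i)}$$

where the $T_i$ are the pages linking to $A$, $C(T_i)$ is the number of outbound links on $T_i$, and $d$ is a damping factor (around 0.85). Under the capped, per-link-scored scheme described above, what a single link from page $p$ passes would look something like:

$$\text{passed}(p \to A) = \frac{\min\!\big(PR(p),\ \text{cap}(p)\big) \cdot q_{p \to A}}{C(p)}, \qquad q \in [0, 0.9]$$

The cap and quality score notation here is my sketch of the mechanics just described, not a published Google formula, but it makes plain why the PR6 page with two outgoing links passes far less than PR3 per link.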
In the old days, way back when wheels were square, it was possible to create several sites that link to each other in an effort to build a rank profile. Spammers know this all too well. In order for this to work, one would have to create a super authority site that then passes rank sparingly and strategically to specific sites that then link to each other. It is a combination of a "hub and spoke" topology with a "mesh" topology, where the super authority site is used as a hub to seed the mesh. I did this, admittedly, back in the days before it was really a problem. But it was not long before spammers caught on. Of course, I dismantled my link scheme at that point. DUH! I am not stupid.
So what did Google do? It took advantage of information that was easily collected. At that time in history, there were limited options for registering a site. Network Solutions, for example, was the only place where .com, .org, and .net sites could be registered. Also, at that time, registrations were not private; they were public. It was far easier to create relationships (called clusters in semantics or realms in science) between sites using registration information, which is exactly what Google did. For each site within the search engine, the registration information was collected and written to a special semantic database that would create relationships using any and all of the information available within the registration. As well, Google began to map IP addresses to sites, web hosts, companies, ASNs (AS Numbers) based upon APNIC registrations, GEO locations including countries and locales, and so on. Relationships between sites and networks were also part of the semantic database. Part of what was needed was a way of establishing trust for sites, and hence the beginning of TrustRank, which is often totally ignored within the SEO world. While it is far harder to make these relationships these days, Google found it advantageous to become a registrar in order to bypass private registration and add new tools to the mix, including information found on the sites themselves. I will not get into all of this, but suffice it to say that Google can go as far as attributing an unsigned work to an author using some of these techniques. But do know this: sites associated with other lower quality sites, or on lower quality hosts or networks, do suffer a knock in their TrustScore. This also goes to elements found in the registration information. Often, Google can spot a potential spam site solely based upon registration information, host, or network. It really does work.
Another part of the process was to look at link patterns using established pattern maps based upon known link scheme techniques. As well, using AI and learning methods, new patterns can be discovered, confirmed, and the pattern database updated automatically.
This is something that Google keeps a human hand in. Google has been using people to review site quality as part of seeding AI systems that determine a site's quality and to find new link patterns. From time to time, Google has even solicited users to help identify low quality sites and linking schemes for just this reason. It uses this new information not only to study, target, and update link patterns, but also to seed the AI learning to spot other schemes that may exist.
So what is someone to do with all of this knowledge?
A company can, and should, be able to link to its other sites without fear of a penalty, and it can, as long as it stays within limits that any thoughtful and honest webmaster would follow. As well, inbound links that follow this same mantra should be fine. Using this massive semantic database and link pattern database, Google does a fairly decent job of identifying sites that link to each other naturally. But there is a new problem.
What is New
Linking schemes are getting more and more sophisticated and complex. Some of us know about the J.C. Penney link scheme from a few years ago. This got Google paranoid and busy as it set out to end this problem once and for all. But this is no easy task, as you can imagine.
As well, there are tons of sites created to gain search traffic that generate massive numbers of links to other sites. We know these sites: the SEO performance sites that scrape Alexa and other data and link to the original sites they reference. We can add keyword sites, whois sites, and so on. Google sees these as very low quality, with some exceptions such as domaintools.com and robtex.com. As well, the original rank passing algorithm was seen as a motivator for seeking unnatural links from directory sites, blogs, forums, etc.
While there were some battles won over these sites and links for ranking, Google, rightly so, has decided that natural links are preferred. As part of this, Google has discounted inbound links as a ranking signal. I assume that step one was as simple as adjusting the authority caps. Links created by these junk sites, fully out of the control of the site owner and once simply accepted, are now being looked at as a very real part of the ranking system. It was always discussed within the SEO community that a healthy ratio of good links versus low value links, as compared to other sites, was something to achieve. But today (meaning since March 2015), with the disavow tool, it seems that Google prefers that these low quality links not exist, and applies downward pressure so that webmasters mitigate these links using the tool. The idea seems to be to clean up the link index. For example, topalternate.com and aolstalker.com can create hundreds and thousands of links to any site. With the disavow tool, Google sees an opportunity that webmasters should take advantage of to control how Google sees inbound links. While on one hand low value links are of no harm, Google prefers that they disappear.
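If you have never used it, the disavow file itself is just a plain text file uploaded through Search Console: one entry per line, a domain: prefix to disavow an entire domain, full URLs for individual pages, and lines starting with # treated as comments. Using the scraper domains mentioned above (the single URL is a made-up example):

```text
# Scraper/stat sites generating links I never asked for
domain:topalternate.com
domain:aolstalker.com
# One specific spammy page rather than a whole domain
http://spam-example.com/cheap-links.html
```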
Google prefers natural links.
The Conundrum
It is getting far more difficult for a webmaster to build links. While SEOs discuss endlessly all the methods of creating inbound links, it has become almost impossible to do so, and has been for a very long time. As you point out, while there are webmasters who are approachable, most are far too busy building their own content and inbound links. Ironic, huh? It has become a bit of a feeding frenzy. I stopped seeking inbound links many years ago when the effort resulted in so few links. I also stopped when it became clear from some of the Google patents that links of this type would be further scrutinized.
However, there is good news. Sorta.
Part of what Google is thinking is that engagement is king along with content. Creating a buzz is all the buzz. Social media is seen as the new link building and authority scheme. Google is looking more and more toward social media, authority, and engagement. The problem is this: not all sites are appropriate for the new social media paradigm, including highly valuable and truly authoritative (in regard to content) sites. For example, think about all of the research written by professors, engineers, and true thought leaders (including Google engineers). Research is hardly exciting, nor is a professor likely to engage in social media to promote content for the masses when it is clearly for a segmented market. This is one area where Google has gotten it wrong and has optimized this form of authoritative content almost out of existence in just the past 3 years.
What supplants inbound links in Google's eyes is social engagement. There is a duality to this effort: on one hand, Google prefers social engagement; on the other hand, Google discounts links from Twitter, Facebook, and the like, preferring links from LinkedIn, Google+, and so on, only recently looking to Facebook once again where access is limited. It has always been my belief that Twitter does more for social engagement than any other social media, and yet these links are temporary, fleeting, and still discounted.
Google took a big leap into this train of thought on June 17th, 2015 with what some are calling the News/Trend update. While little is known about this update, one thing is perfectly clear: trend sites rank higher. With so many trend based queries, Google wanted to make sure these sites rank high to answer the call. And so if your site is not trend based, you have seen a knock in your positions beginning on June 17th. This is a hint, folks!
Where Inbound Links Still Work.
In one word (mostly): Branding. Or something similar. While it is difficult to gain traction in branding a site that is not a business or product brand (after all, you have to create a buzz), following branding signals really helps. Specifically: being mentioned within content, being mentioned in relation to another brand, being mentioned as a useful product, etc. With so many bloggers out there, what moves the needle is having outstanding content that people want to link to fully on their own. Think about how many times Moz is quoted and linked to. We think of this site as a super authority, but it did not start out that way. As each article was written and discovered, links were made. As more links were made and people followed these links, webmasters became more willing to create similar links. Think about it this way: in the online SEO game, there are so many people wanting to enter the market and amass a fortune for themselves. Most are not original thinkers and parrot the same ole' cr@p du jour over and over again. Many are just chasing SEO headlines and linking to sites with authority to: one, show their prowess within the community; two, show their knowledge on the subject; three, add their particular take on the subject, thus establishing themselves as a thought leader; and four, hopefully gain authority from whom they link to (as this is also seen as a factor in Google's eyes). But while we know that 99% of these sites are still junk, these sites easily outrank others that are fact based and thoughtful in their presentation, avoiding the trends and, heaven forbid, maybe having some real authority. It seems like the junk mongers are winning.
Not every content market lends itself to this level of activity. However, where it does, there is no doubt that social media can get your content in front of those who will link to your content and those who want to get into the game. Tempting for an old guy like me, but a pain in the arse if you hate being social.
Where a site succeeds in creating valuable inbound links is by being popular, trendy, and socially engaged, so that people want to link to it and promote its work. <rant>
Oh yes. It also helps to repeat what passes as common socially liberal knowledge, even if it is pure bull. No high brow stuff. Just grab for the low hanging fruit and present it in an engaging way with pretty pictures (another Google ranking factor). While we are at it, go ahead and make a video (yet another factor). Go ahead and dance and sing your own lyrics to an already popular tune. That really seems to help! Let's homogenize the Internet, shall we, and fill it with useless junk?</rant>
(God that was FUN!)
Black Hat versus White Hat
So as far as Black Hat versus White Hat, you now know the difference. Linking schemes, no matter how small, are Black Hat, while links that are natural and honest are White Hat. It really is that simple. And Google has put a target on the back of anyone who seeks an advantage that is even a little bit shady.
Please feel free to tweet this answer. ;-) (just kidding)
Link building is not black hat SEO unless you are artificially creating links to your web pages (e.g., link farms). Asking other websites to link to you is perfectly fine. After all, you can ask for a link, but it doesn't mean you're going to get it.
The reason it is perfectly fine is, like you said, that it is all about quality links. Not coincidentally, the higher the quality of the link, the more difficult it will be to get. Getting low quality links is easier, and you may get lots of them, but because their quality is so low you won't see much of an improvement in your rankings, if you see any at all. The quality links, the ones that make a real difference in your rankings, are difficult to get because not only are there fewer of them available, but the sites offering them tend to have quality content and won't link to a poor website, as it would reflect poorly on them. That's why these links are so valuable: not only is the page the link sits on high quality, but it forces the page being linked to to be high quality as well. (A good rule of thumb: the easier it is to get a link, the lower the quality of the link.)
You mentioned using high quality content to naturally attract high quality links. That's exactly how you should approach it. Pandering for low quality links has a poor ROI. Writing quality content and letting the links "come to you" is the better long term strategy.
I would avoid directories and the like. If you have quality content, approach similar sites in your niche and show them your content; if they like it, they are likely to link to it or share it. Reaching out to people in your market and telling them about your content is a great way to garner links, instead of simply asking for them.