Why aren't search engines indexing my content?
There are a number of reasons your content may not appear in search engine results. Note, however, that a search engine's index may contain pages that it doesn't display on its results pages.
How to tell if your content is actually indexed
It may actually be difficult to tell if your content is indexed.
- Search for all the documents from your site and see how many are listed.
  - Google: enter site:example.com (where example.com is your domain; there must not be any space after the colon)
  - Bing: enter site:example.com
  - Yahoo: enter site:example.com (or use the advanced search form)
- Search for a specific document by choosing a unique sentence of eight to twelve words and searching for that sentence in quotes. For example, to find this document, you might search for "number of reasons your content may not appear in search engine results".
  In addition to the above, you can search for keywords using the inurl: and intitle: operators. For example, try something like: keyword with another keyword inurl:example.com. This will bring up only pages that are indexed for the specified domain.
- Log into webmaster tools to see stats from the search engine itself about how many pages are indexed from the site.
  - Google Webmaster Tools: information is available under "Health" » "Index Status". If you have submitted site maps, you can also see how many documents in each site map file have been indexed.
  - Bing Webmaster Tools
In some cases, documents may not appear to be indexed according to one of these methods, yet can be found in the index using another. For example, webmaster tools may report that few documents are indexed even when you can search for sentences from those documents and find them on the search engine. In such a case, the documents are actually indexed.
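The manual searches described above can be assembled programmatically if you need to check many domains or sentences. The following is a minimal sketch; the helper names are hypothetical, and the search URLs assume the usual q= query parameter on the Google and Bing search pages. It only builds URLs to open in a browser and does not fetch or parse results.

```python
from urllib.parse import urlencode

# Search page base URLs (assumed: both accept the query in a "q" parameter).
SEARCH_PAGES = {
    "google": "https://www.google.com/search",
    "bing": "https://www.bing.com/search",
}

def site_query(domain):
    """List pages indexed from one domain (no space after the colon)."""
    return f"site:{domain}"

def exact_sentence_query(sentence):
    """Search for a unique sentence as an exact, quoted phrase."""
    return f'"{sentence}"'

def keyword_in_domain_query(keywords, domain):
    """Search for keywords, restricted to URLs containing the domain."""
    return f"{keywords} inurl:{domain}"

def search_url(engine, query):
    """Build a URL that can be opened in a browser for the given query."""
    return f"{SEARCH_PAGES[engine]}?{urlencode({'q': query})}"

if __name__ == "__main__":
    print(search_url("google", site_query("example.com")))
    print(search_url("bing", exact_sentence_query(
        "number of reasons your content may not appear in search engine results")))
    print(search_url("google", keyword_in_domain_query("keyword", "example.com")))
```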
How content becomes indexed
Before search engines index content, they must find it using a web crawler. You should check your webserver's logs to see whether search engines' crawlers (identified by their user agent – e.g. Googlebot, Bing/MSNbot) are visiting your site.
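As a rough illustration of checking logs for crawler visits, the sketch below counts hits per crawler in access log lines. It assumes the common "combined" log format, where the user agent is the last quoted field on each line; the token list and the sample lines (which use documentation-reserved IP addresses) are made up for the example. Keep in mind that user agent strings can be spoofed, so this is only a first check.

```python
import re
from collections import Counter

# Lowercase substrings to look for in the User-Agent field.
# This list is an assumption for the example; extend it as needed.
CRAWLER_TOKENS = {
    "Googlebot": "googlebot",
    "Bingbot": "bingbot",
    "MSNBot": "msnbot",
}

# In the combined log format the user agent is the last quoted field.
USER_AGENT_PATTERN = re.compile(r'"([^"]*)"\s*$')

def count_crawler_hits(log_lines):
    """Return a Counter mapping crawler name to number of requests."""
    hits = Counter()
    for line in log_lines:
        match = USER_AGENT_PATTERN.search(line)
        if not match:
            continue  # line does not end with a quoted field
        user_agent = match.group(1).lower()
        for name, token in CRAWLER_TOKENS.items():
            if token in user_agent:
                hits[name] += 1
                break
    return hits

# Illustrative sample lines, not real log data.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /page HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '203.0.113.5 - - [10/Oct/2023:13:57:12 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

if __name__ == "__main__":
    for crawler, count in count_crawler_hits(SAMPLE_LOG).items():
        print(crawler, count)
```

To run it against a real log, pass an open file object instead of SAMPLE_LOG, since the function only needs an iterable of lines.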
Larger search engines like Google and Bing typically crawl sites frequently, but the crawler may not know about a new site. You can notify search engines of the existence of your site by registering as its webmaster (Google Webmaster Tools, Bing Webmaster Tools) or, if the search engine does not provide this facility, by submitting a link to its crawlers (e.g. Yahoo).
How long has your site/content been online?
Search engines may index content very quickly after it has been found; however, these updates are occasionally delayed. Smaller search engines can also be much less responsive and take weeks to index new content.
If your content has only been online for several days and does not have any links from other sites (or its links come from sites which crawlers do not visit frequently) it is probably not indexed. If your site hasn't been live for more than a few months, the search engines may not trust it enough to index much content from it yet.
Has the content been excluded by the webmaster?
This step is especially important if you are taking over a site from someone else and there is an issue with a specific page or directory: check for robots.txt and META robots exclusions and remove them if you want crawlers to index the content being excluded.
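Both kinds of exclusion can be checked with Python's standard library. The sketch below is one possible approach, not a complete audit: it tests a URL against robots.txt rules using urllib.robotparser and looks for a noindex directive in a page's META robots tag. The function names, sample rules, and sample HTML are hypothetical, and the robots.txt and HTML are passed in as strings so that the example runs without network access.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

def blocked_by_robots_txt(robots_txt, user_agent, url):
    """Return True if the given robots.txt text disallows url for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)

class MetaRobotsFinder(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]

def has_noindex(html):
    """Return True if the page's META robots tag contains noindex."""
    finder = MetaRobotsFinder()
    finder.feed(html)
    return "noindex" in finder.directives

# Illustrative inputs, not taken from a real site.
SAMPLE_ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"
SAMPLE_HTML = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'

if __name__ == "__main__":
    print(blocked_by_robots_txt(SAMPLE_ROBOTS_TXT, "Googlebot",
                                "https://example.com/private/page.html"))
    print(has_noindex(SAMPLE_HTML))
```

Note that the two mechanisms differ: robots.txt stops compliant crawlers from fetching a URL, while a META robots noindex is only seen if the page can be fetched.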
Is there a technical issue preventing your content from being indexed?
If you have an established site but specific content is not being indexed (there are no web crawler hits on the URLs where the content resides) the webmaster tools provided by Google and Bing may provide useful diagnostic information.
Google's Crawl Errors documentation provides extensive background on common problems for web crawlers which prevent content from being indexed and, if you use Google Webmaster Tools, you will receive an alert if any of these issues are detected on your site.
Correct errors and misconfigurations as quickly as possible to ensure that all of your site's content is indexed.
Is the content low quality?
Search engines don't index most pages they crawl. They only index the highest quality content. Search engines will not index content if:
- It is spam, gibberish, or nonsense.
- It is found elsewhere. When search engines find duplicate content, they choose only one of the duplicates to index. Usually that is the original that has more reputation and links.
- It is thin. It needs more than a couple lines of original text. Preferably much more. Automatically created pages with little content such as a page for each of your users are unlikely to get indexed.
- It doesn't have enough reputation or links. A page may be buried too deep in your site to rank. Any page without external links and more than a few clicks from the home page is unlikely to get indexed.
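The last point, about pages buried more than a few clicks from the home page, can be checked by computing each page's click depth. The sketch below runs a breadth-first search over an internal link graph. The link map, the function name, and the threshold of three clicks are assumptions made for the example; search engines do not publish a specific cutoff.

```python
from collections import deque

def click_depths(links, home):
    """Return the minimum number of clicks from home to each reachable page.

    links maps a page to the list of pages it links to.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical internal link structure for a small site.
SITE_LINKS = {
    "/": ["/about", "/blog"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/blog/post-1/comments"],
    "/blog/post-1/comments": ["/blog/post-1/comments/page-2"],
}

MAX_CLICKS = 3  # assumed reading of "a few clicks"

if __name__ == "__main__":
    depths = click_depths(SITE_LINKS, "/")
    deep_pages = [page for page, depth in depths.items() if depth > MAX_CLICKS]
    print(deep_pages)
```

Pages that never appear in the result are not reachable from the home page through internal links at all, which is worth investigating separately.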
Is some of your content indexed, but not all?
If your site has hundreds of pages, Google will almost never choose to index every single page. If your site has tens of thousands of pages, it is very common for Google to choose to index only a small portion of those pages.
Google chooses the number of pages to index from a site based on the site's overall reputation and the quality of its content. Google typically indexes a larger percentage of a site over time as the site's reputation grows.