Ask Google to remove thousands of pages from its index after cleaning up a hacked site
Google will automatically remove pages that now return a 404 status. They will get removed 24 hours after Googlebot next tries to crawl them. If you want to speed up the process slightly, return a "410 Gone" status for those URLs instead. Then they will be removed as soon as they are next crawled, without the one-day grace period.
The only problem is that it may take Googlebot months to get around to crawling all those dead pages. If you want to speed the crawling up, you have two options:
- Submit each URL individually to the Google Search Console URL removal tool.
- Create a temporary sitemap of all the dead URLs and add that sitemap to Google Search Console. (reference)
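For the sitemap option, a minimal sketch of how the file could be generated is below. It assumes you already have the dead URLs in a plain text file, one per line. The function name `make_sitemap` and the file names in the usage line are just placeholders I made up, not anything Google requires.

```shell
# Sketch: write a minimal XML sitemap to stdout from a list of URLs (one per line).
# Usage: make_sitemap URL_LIST_FILE   (the file name is a placeholder)
make_sitemap() {
  url_list=$1
  printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
  printf '%s\n' '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  # Drop empty lines, escape &, < and > so the URLs are valid XML text,
  # then wrap each remaining line in <url><loc>...</loc></url>.
  sed -e '/^$/d' \
      -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' \
      -e 's|^|  <url><loc>|' -e 's|$|</loc></url>|' "$url_list"
  printf '%s\n' '</urlset>'
}
```

You would call it as `make_sitemap dead-urls.txt > dead-urls-sitemap.xml`, upload the result to your site, and submit it in Search Console. The sitemap protocol caps a single file at 50,000 URLs, so a longer list would need to be split first (for example with `split -l 50000`).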
To get a list of all the URLs, I would suggest using your server logs. They will have a more complete record of the URLs than a site: search or Google Search Console. I would use grep on the command line. If all the URLs are similar to the URL you posted, you could come up with a regular expression pattern for them. That URL is 31 characters long, made up of letters, dashes, and digits, and it ends in a number. Maybe something like this, which looks for a slash, then 15 to 30 of those characters, followed by a dash and 4 to 10 digits:
grep -oE '/[0-9a-z-]{15,30}-[0-9]{4,10}' /var/log/apache2/example.com.log
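The same URL will usually show up many times in a log, and grep only prints the path part. A rough sketch of how the matches could be turned into a deduplicated list of full URLs is below. The function name `extract_dead_urls` is made up for illustration, `https://example.com` stands in for your own host, and the pattern is only the guess described above, so check the output by hand before submitting anything.

```shell
# Sketch: print the unique dead URLs found in an access log, one per line.
# Usage: extract_dead_urls LOGFILE SITE   (both arguments are placeholders)
extract_dead_urls() {
  log_file=$1
  site=$2
  # -o prints only the matched text, -E enables extended regular expressions.
  # sort -u removes duplicates; sed prepends the scheme and host to each path.
  grep -oE '/[0-9a-z-]{15,30}-[0-9]{4,10}' "$log_file" \
    | sort -u \
    | sed "s|^|$site|"
}
```

For example, `extract_dead_urls /var/log/apache2/example.com.log https://example.com > dead-urls.txt` would leave one URL per line in `dead-urls.txt`. Keep in mind the pattern can also match legitimate pages or referrer paths that happen to have the same shape.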
This problem won't be solved by pinging Google to recrawl your site or by resubmitting the sitemap, because that would only get the new URLs indexed. It would not delete the old/dummy ones.
The URL removal tool in Webmaster Tools is the only way to ask Google to remove links from its index. However, it only allows one link at a time to be submitted for removal.
To get around this, you can use a Chrome extension to automate the process. It is a paid tool (about $9) on the Chrome extensions store, but you can get it for free on GitHub.
- Go to this link.
- Download the .zip file.
- Extract it and import it into Chrome's extensions.
Now reload your URL removal tab and you will see an option to upload a .csv or .xls file.
Download the list of URLs that you need to delete from Search Console and upload the file here. (These links will be excluded from your sitemap, so you should be able to find the list of these URLs easily.)
Then let the tool do its job. It will take a while, depending on the number of links you have.