Should I block the Yandex Bot?

Should I block Yandex?

Why?
First, if the bot is a legitimate search engine bot (and nothing else), they won't hack you. If not, blocking a User agent won't help, they'll just use another one.
If your password is good, fail2ban is configured, the software is up to date etc., just let them try. If not, you need to fix that, independent of any Yandex bots.

To make sure the problem is actually Yandex, try disallowing it in robots.txt and see whether the requests stop.
If they don't stop, it's not Yandex.

(I set up a new web server a few weeks ago. One hour after it went online, before it even had a domain, a "Googlebot" started trying SQL injections against a non-existent WordPress installation. It was fun to watch, as there were no other HTTP requests. But I did not block Google because of that.)


I agree with @deviantfan's answer, and specifically with this point:

First, if the bot is a legitimate search engine bot (and nothing else), they won't hack you. If not, blocking a User agent won't help, they'll just use another one.

I would also like to point out that Yandex, like search engine bots in general, might not be intentionally trying to access your backend. Remember that bots crawl sites by following links. Imagine the bad guys put some of your backend's URLs on some other website's pages; the search engine indexes those pages and then tries to follow the links from there. It will look like the search engine is trying to access your backend, but it is just crawling the net: it does not know that this is your backend.

A similar thing might happen by accident. Let's say a non-tech-savvy user posts a URL in some forum that is only accessible when you are logged in. While crawling, the search engine will try to follow that link, and you will end up with log entries like the ones I assume you saw.

UPDATE: I think you might want to add a rule to your robots.txt that disallows Yandex from accessing specific URLs. By the way, you had better define the rule specifically under its name. I am not sure, but it might happen that YandexBot ignores User-agent: *. So you can do something like this (adjusted to your backend URLs):

User-agent: Yandex
Disallow: /admin/*

This way you disallow it from trying to access backend URLs matching that pattern, while it (YandexBot) remains free to crawl the other pages of your website.
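To see which paths a pattern such as /admin/* would cover, here is a minimal sketch of the wildcard matching that the major crawlers commonly document: the rule is matched from the start of the path, * stands for any run of characters, and a trailing $ anchors the end. The function name is made up for illustration, and this is an assumption about how such rules are typically interpreted, not Yandex's actual implementation.

```python
import re


def robots_pattern_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt-style path pattern covers the given path.

    Hypothetical helper: assumes prefix matching, '*' as a wildcard and an
    optional trailing '$' as an end anchor.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' in place of each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only anchors at the start, which gives prefix semantics.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    for candidate in ("/admin/login", "/admin/", "/blog/post"):
        print(candidate, robots_pattern_matches("/admin/*", candidate))
```

With the rule above, /admin/login and /admin/ are covered while /blog/post is not, so the rest of the site stays crawlable.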


You should not block the legitimate Yandex bot, but you can verify that it is in fact the legitimate bot, and not someone just using the Yandex User-Agent.

From: https://yandex.com/support/webmaster/robot-workings/check-yandex-robots.xml

  • Determine the IP address of the user-agent in question using your server logs. All Yandex robots are represented by a set User agent.
  • Use a reverse DNS lookup of the received IP address to determine the host domain name.
  • After determining the host name, you can check whether or not it belongs to Yandex. All Yandex robots have names ending in 'yandex.ru','yandex.net' or 'yandex.com'. If the host name has a different ending, the robot does not belong to Yandex.
  • Finally, make sure that the name is correct. Use a forward DNS lookup to get the IP address corresponding to the host name. It should match the IP address used in the reverse DNS lookup. If the IP addresses do not match it means that the host name is fake.
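The steps above can be sketched in Python roughly as follows. This is only an illustration, not an official Yandex tool: the function names are made up, and the suffix check uses a leading dot (".yandex.ru" rather than "yandex.ru") as my own tightening, so that a name like "notyandex.ru" does not pass. The resolvers are passed in as parameters so the logic can be tried without network access.

```python
import socket

# Suffixes taken from the Yandex instructions, with a leading dot added
# (an assumption of this sketch) so that "notyandex.ru" is rejected.
YANDEX_SUFFIXES = (".yandex.ru", ".yandex.net", ".yandex.com")


def reverse_lookup(ip: str) -> str:
    """Reverse DNS: IP address -> host name (PTR record)."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> set:
    """Forward DNS: host name -> set of IP address strings."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def is_yandex_bot(ip: str, reverse=reverse_lookup, forward=forward_lookup) -> bool:
    """Apply the reverse-then-forward check to an IP taken from the logs."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False
    if not host.endswith(YANDEX_SUFFIXES):
        return False
    try:
        # The forward lookup must lead back to the IP we started from.
        return ip in forward(host)
    except OSError:
        return False
```

Calling is_yandex_bot("192.0.2.10") with the default resolvers performs real DNS queries; 192.0.2.10 is just a documentation placeholder, so substitute an address from your own logs. IPv6 addresses are compared as plain strings here, so a differently formatted but equivalent address would not match.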

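In fact, almost all big search engines provide similar ways of verifying the User-Agent. This works because whoever controls the reverse DNS for an IP address can make it return any host name they like, including one ending in yandex.com, but they cannot make the forward DNS lookup of that name resolve back to their own IP address.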