Is Zabbix the right tool for me?
Solution 1:
I think it would be best to concentrate on answering the specific questions you had, taking into account the size of your planned deployment (~10 monitored hosts).
What are the general disadvantages of Zabbix?
- it won't automagically figure out what to monitor or when to alert you; you will have to decide which metrics you are interested in and configure them up front
- debugging leaves something to be desired, although for such a small environment, help options like the forum and the IRC channel should easily suffice
Does Zabbix have a small footprint on boxes it is monitoring?
Yes, definitely. Zabbix can monitor using methods like SNMP and simple network checks (is a port open?), and it also has a native agent for many platforms. As the agent is written in C, it has an extremely small footprint (as opposed to a bunch of interpreted scripts...). You can easily combine different checks on a single monitored host. Note that you are not limited to monitoring servers; you can also add network devices and other things.
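To make the "is a port open?" idea concrete, here is a minimal Python sketch of that kind of agentless check. It is only an illustration of the concept, not how Zabbix implements its simple checks; the function name `port_is_open` and the 2-second default timeout are assumptions made up for this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration only; nothing runs on the
    monitored host, the check is done entirely from the outside.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling something like `port_is_open("192.0.2.10", 22)` (a placeholder address) would tell you whether SSH is reachable on that host, without installing anything on it.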
Do I really need to set up an entire other server for it? I currently have a server that is under very light load -- can I dual-purpose it?
That depends: if it's running one of the operating systems supported by the Zabbix server, then definitely. For an environment of that size the requirements will be really low. Use the default templates only as a guideline; it's better to create your own with longer intervals between checks. Basically, Zabbix consists of 3 components - DB, frontend, server. If you wish, you can reuse an existing database server and an existing web server in the company for the first two components, and then run the Zabbix server on any supported platform - that's a perfectly valid configuration.
Any specific queries would be very welcome in #zabbix on Freenode.
Solution 2:
I have been using Zabbix for 2 years now; before that I used Nagios.
In my opinion, the big difference is this: with Nagios you get a status (OK/WARNING/CRITICAL), while with Zabbix you get data (integer, float, string...).
That is a real advantage for Zabbix because:
- you can graph any (numerical) data without 'creating/defining' a graph
- you can 'easily' define alerts/triggers from more than one data value
Using the agent to collect basic system data quickly and easily is also very nice.
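As a rough illustration of why raw data is more flexible than a status, here is a small Python sketch of an alert condition built from two collected values. This is not Zabbix trigger expression syntax; the function name, the metric names and the thresholds are all hypothetical and chosen only for the example.

```python
def trigger_fires(cpu_load: float, free_mem_mb: float,
                  load_limit: float = 5.0, mem_limit_mb: float = 100.0) -> bool:
    """Alert only when the load is high AND free memory is low at the same time.

    Hypothetical thresholds. With status-only checks, each metric would be
    reduced to OK/WARNING/CRITICAL on its own before you could combine them;
    with raw values you can combine them however you like.
    """
    return cpu_load > load_limit and free_mem_mb < mem_limit_mb
```

With these assumed thresholds, a busy box that still has plenty of free memory does not alert, and neither does an idle box that is low on memory; only the combination does.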
Disadvantages of Zabbix:
- less well known than Nagios
- it uses a database to store configuration & data (more difficult to back up & manipulate than flat files)
Solution 3:
What are your goals for monitoring? Uptime? Performance? Billing metrics? Some of the utilities you listed above are better for each of those uses, and some are worse.
For uptime assurance, we use monit, which is both free and simple to set up on Unix/Linux systems. That utility monitors whether a process is alive and ensures that it's not using more than its fair share of resources (CPU, memory) -- and if it's misbehaving, monit will restart the process.
For performance monitoring, I suggest munin. It is easy to configure, and uses perl/bash/python/whatever as a data collection method. Munin can collect performance data from multiple machines in one place, and builds easy-to-understand graphs.
For billing metrics (bandwidth consumption), I suggest PRTG. It's not free, but provides professional-level reports and statistics that can easily be used as part of your customer's billing report, if you do that sort of thing. We replaced our Zabbix installation, which required the use of agents on each monitored machine, with PRTG, which uses SNMP, and we have never looked back.
I have also used Zenoss, which was very nice and was simple to install and configure. However, it required a long training period to learn how to get all the metrics we needed.
Solution 4:
I use Zabbix to monitor our company's infrastructure (which is only 6 servers + all the networking stuff). I've had Zabbix for over two years and it works great. I like the fact that it's all in one app and doesn't require installing tons of plugins. The interface doesn't win any design awards, but it is laid out surprisingly well in terms of functionality. I've had some intermittent hardware problems on our servers in the past, and having lots of historical data in Zabbix definitely helped a lot in straightening them out.
Some versions seemed to have stability issues and crashed once in a while, but monit took care of that.
I recommend putting Zabbix on a separate box (some decommissioned server hardware from 3-4 years ago will work pretty well). The application itself is not very heavy, but it does put a significant strain on the database (MySQL in my case) - saving all the historical data doesn't come cheap.
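To get a feel for where that database load comes from, here is a back-of-the-envelope Python sketch. It assumes every monitored item stores one value per check interval; the host count, items per host and intervals in the usage note are made-up numbers, not measurements from my setup.

```python
SECONDS_PER_DAY = 86_400


def values_per_second(hosts: int, items_per_host: int, interval_s: float) -> float:
    """Average number of new values written per second.

    Assumes (for illustration) that every item on every host is checked
    once per `interval_s` seconds and each check stores one value.
    """
    return hosts * items_per_host / interval_s


def values_per_day(hosts: int, items_per_host: int, interval_s: float) -> float:
    """Approximate number of history values stored per day."""
    return values_per_second(hosts, items_per_host, interval_s) * SECONDS_PER_DAY
```

For example, 10 hosts with 100 items each checked every 60 seconds works out to roughly 16.7 new values per second, or about 1.44 million values a day; stretching the interval to 300 seconds cuts that to about 288 thousand a day. Check intervals and how long you keep history are the main knobs for database load.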