Monitoring Your Infrastructure with Zabbix

The last time I had an enterprise system monitoring implementation to work with, it was already set up and the organization had been using it for several years.

This time, I got to do a little research and set up a system at home to monitor my various "machines." After playing around with Cacti, I decided that seeing my logs fill with PHP deprecation warnings was a little too scary and started up a Zabbix server.

Zabbix definitely takes more resources, both on the server and on the machines being monitored. Like most monitoring systems, it uses a database backend and has a nice clean web frontend, so no real differences there. By default, you can graph any piece of data you gather (including the version of Zabbix, if you want it), and you can create more complex graphs fairly easily. It took me a few minutes to get a cheap load average graph that works quite well. Paging is actually a more robust "notification" system that supports email, SMS, Jabber, and custom scripts, so you can extend it to do whatever you can think of (fax? log to a file? print a .pdf from your boss' printer whenever a server goes down?).
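To make the "log to a file" idea concrete: Zabbix hands a custom alert script the recipient, subject, and message body as command-line arguments. Here's a minimal sketch of what one could look like; the log path is an arbitrary choice of mine, not anything Zabbix mandates.

```python
#!/usr/bin/env python3
"""Minimal sketch of a Zabbix custom alert script.

Zabbix invokes scripts from its alert-script directory with three
arguments: recipient, subject, and message body. The log file path
below is a placeholder for illustration.
"""
import sys
import time

LOG_FILE = "/tmp/zabbix-alerts.log"  # hypothetical destination


def log_alert(recipient, subject, body, log_file=LOG_FILE):
    """Append one tab-separated alert line to the log and return it."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    line = "%s\t%s\t%s\t%s\n" % (stamp, recipient, subject, body)
    with open(log_file, "a") as f:
        f.write(line)
    return line


if __name__ == "__main__" and len(sys.argv) >= 4:
    to, subject, body = sys.argv[1:4]
    log_alert(to, subject, body)
```

From there it's a small jump to faxing, printing, or whatever else you can script.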

For the truly lazy, Zabbix offers a .vmdk built on SuSE that you can just fire up and have a running Zabbix server.

Monitoring, graphing, paging...these are all the same tasks that any monitoring system does.

Where Zabbix's architecture differs from Nagios, Zenoss, Cacti, et al., is that it's a client-server system. This took me a while to grok, since "client" and "server" aren't what you'd normally expect them to be (think iSCSI). The "Zabbix Server" box dials out to an (access-list restricted) open port on the "client" and retrieves the data from a running service. There's even a Windows version of the client available, if you're into that sort of thing.

This means that there is a daemon running on the client to gather data and send it to the server, and a daemon running on the server to periodically go and nab the data from the clients. The clients need to know where to trust connections from (by IP, so watch out if you have AAAA records), and then the hosts have to be manually added to the Zabbix server. For rolling out to an existing infrastructure, this is more work than a pure SNMP-monitoring solution that can just scan your network and learn all your hosts.
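The client side of that trust relationship lives in the agent's config file. A minimal sketch (the IP, port, and hostname here are placeholders, not from my setup):

```ini
# /etc/zabbix/zabbix_agentd.conf (illustrative excerpt)

# Only accept polling connections from the Zabbix server's IP
Server=192.168.0.5

# Default agent port; firewall it off from everything else
ListenPort=10050

# Must match the host name you manually add on the Zabbix server
Hostname=vm1
```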

A side-effect of this is that Zabbix has a proxy server available. You could have a host on each subnet (or at each site) gather all the information, and then submit bulk updates to a central Zabbix server, or implement even more tiers of proxying if your infrastructure requires it.

The biggest "gotcha" to just getting a Zabbix server running is that the front-end isn't necessarily bundled with the server. The only requirement for the web front-end is that it can talk to the database, so you can run the front-end, Zabbix server, and database on 3 different systems if you want to. The front-end is all PHP, so your existing Apache server is more than adequate.

Zabbix also does some basic host-based intrusion detection. If you're going to be running something more beefy (Samhain/Beltane, for example), then this is likely just a waste of cycles. Still, this is leaps and bounds above the option of no host-based intrusion detection.



Duplicate IPv6 Address Detection in Karmic

I've finally gotten my little empire of virtual machines somewhat stable, but a little wrench in the works has been the extra flag I keep needing for ssh: -4.

Sure, I can log in without it, but the systems will try using IPv6 first, since they all have IPv6 addresses in DNS, and then fall back to IPv4 when that times out. I could use my .ssh/config file to force IPv4, but that gets rid of the nagging reminder that I need to fix it.

Turns out IPv6 is broken by default in Karmic (and hopefully fixed in Lucid, but I'm not holding my breath). One of the "features" of IPv6 is Duplicate Address Detection; you should never have the problem of having the same IPv6 address allocated to 2 computers unless you do it intentionally. Of course, EUI-64 addressing should take care of that anyway...

Unfortunately, Karmic's implementation of DAD is horribly broken. Even with unique entries in DNS and a properly configured rtadvd, I kept getting logs filled with:

Apr 25 13:49:59 vm1 kernel: [ 6851.010394] eth0: IPv6 duplicate address detected!
Apr 25 13:50:01 vm2 kernel: [147906.360484] eth0: IPv6 duplicate address detected!
Apr 25 13:50:00 vm3 kernel: [315943.970527] eth0: IPv6 duplicate address detected!

So, I tried to configure IPv6 manually. But Ubuntu is smarter than me (or so it thinks). Once it's detected that the address is a duplicate, you're not allowed to actually assign it to an interface. All the packets went out with a source IP of ::1, so they all went out the lo interface, which effectively firewalls the whole IPv6 stack from the outside world.

The only solution is to turn off Duplicate Address Detection.

sysctl net.ipv6.conf.all.dad_transmits=0
sysctl net.ipv6.conf.all.accept_dad=0


Well, not quite. See, "all" in sysctl doesn't actually mean "all." I'm not sure what it means, but it seems to be the exact opposite of "all," i.e. "none."

What I actually had to do was explicitly disable DAD on both lo and eth0. lo shouldn't need it, since all IPs on loopback are the same box anyway. If I see an IPv6 loopback packet storm, maybe I'll think about turning DAD back on for lo.

The final sysctl.conf:

net.ipv6.conf.all.dad_transmits = 0
net.ipv6.conf.all.accept_dad = 0
net.ipv6.conf.lo.dad_transmits = 0
net.ipv6.conf.lo.accept_dad = 0
net.ipv6.conf.eth0.dad_transmits = 0
net.ipv6.conf.eth0.accept_dad = 0

Phew! Good thing for me I have this all in Puppet, so it's easy to replicate across all my hosts:

exec { "sysctl -p":
  cwd         => '/',
  path        => '/bin:/sbin:/usr/bin:/usr/sbin',
  refreshonly => true,
  subscribe   => File[sysctl_config],
}


Copy-on-write Semantics on DABL

I've been thinking of a way to do this for a while, and finally got some time to hack out a prototype yesterday.

First, the problem statement. We have a piece of data (street address) that needs to be associated with multiple entities. Basically, I want a many:many relationship between them, but with a catch. Once the address is attached to an entity, it should be manageable on a per-entity basis.

The obvious solution is to make a copy of the data and just attach the copy to the second entity. That's fine, but it's not unrealistic to have 1,000 entities with the same information. That means 1,000 copies of the data, just in case it might change.

So, I implemented a copy-on-write save() method for DABL, our pet ORM/MVC.

(editor's note: DABL is now hosted on github)

The basic idea is that we need to generate a new ID for the record, then pass that new record off to a subset of the related records. If all the related records are to be modified, there's nothing to be gained (except revision history; if you want that, it's trivial) from creating a new record. Rather than letting the database update all the related records, we have to do it ourselves since we only want to CASCADE some of them.

For the source of my prototype pet project, see DABL cow-save() or fork my nonblocking-random repository on github. The basic database setup is a Userdata table which has many Users. Userdata holds the equivalent of GECOS information (firstname, lastname), while User holds the login information (username, password). The production version of this has much more data in the related table and is written in Propel, which makes it significantly uglier.
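The core of the idea is language-agnostic, so here's a rough Python sketch of it; this is my illustration of the technique, not DABL's actual API. Userdata is the shared record, and each User row points at it via a userdata_id foreign key:

```python
# Illustrative copy-on-write save, not DABL's real save() method.
# Records are plain dicts; Users reference Userdata by "userdata_id".
import itertools

_next_id = itertools.count(100)  # stand-in for database ID generation


def cow_save(userdata, changes, users_to_update, all_users):
    """Apply `changes` to `userdata`, but only as seen by `users_to_update`.

    If every user referencing this record is being updated, modify it
    in place (nothing to gain from a copy). Otherwise, clone it under a
    new ID and repoint just the chosen users at the clone -- the manual,
    selective version of an ON UPDATE CASCADE.
    """
    referencing = [u for u in all_users if u["userdata_id"] == userdata["id"]]
    if {u["username"] for u in users_to_update} == {u["username"] for u in referencing}:
        userdata.update(changes)  # all referrers change: update in place
        return userdata
    clone = dict(userdata, id=next(_next_id), **changes)
    for user in users_to_update:  # cascade only the selected users
        user["userdata_id"] = clone["id"]
    return clone
```

If two users share an address and only one moves, the mover gets a clone while the other keeps the original; once the last referrer is updated, the save degrades gracefully to an in-place write.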



Managing your Checking Account with PocketMoney

Balancing your checkbook is one of those chores, like taxes, that everyone hates but does anyway. Of course, unlike taxes, you slack off for a month and the feds don't come busting down your door. So you slack off for another month...

If you're like me, you have several abortive attempts to use the check register that came with your checkbook to manage your money. My current check register even includes a handy three-year calendar for 2003, 2004, and 2005. The only real use I have for this ancient piece of banking technology is to track when I've overdrawn my account. After each time that happens, there's another 1/4 to 1/2 page of studious usage until I start forgetting again.

Part of the problem with using a check register is that hardly anyone even takes checks anymore. I write around 30 or 40 checks a year; all of them are for paying bills, so I don't even bring my checkbook with me when I go out. I also don't generally carry cash, so that means a lot of miscellaneous debit and credit card transactions. And, let's be honest, who actually wants to carry a checkbook just to record all of those so they can balance their checkbook at the end of the month?

Of course, one thing I do always have with me is my iPhone. And, as they say, there's an app for that (several of them, actually). PocketMoney is the one that I finally settled on. There's a free trial, so really there's no excuse not to at least take it for a test drive. Training myself to use this was definitely easier than training to use a check register, and the built-in budgeting tools are impressive. When I've got the app handy, "I'll record this $1.40 at Starbucks when I get home" is really hard to justify.

One of PocketMoney's most useful features is the different budgeting schedules. If you get paid bi-weekly, enter that as the repeat period. The program will automatically pro-rate your budget based on how many weeks (or partial weeks) are in the given month. Enter Rent monthly, Groceries weekly, Salary bi-weekly, and Auto Insurance quarterly, and the app will take care of all the calculations to give you correct budget breakdowns by any time range you choose.
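I don't know PocketMoney's exact formula, but the arithmetic behind pro-rating is simple enough to sketch: normalize each budget line to a daily rate from its repeat period, then scale by the number of days in whatever window you're looking at.

```python
# My guess at the pro-rating arithmetic, not PocketMoney's actual code:
# every budget item becomes a daily rate, then scales to the window.

PERIOD_DAYS = {
    "weekly": 7.0,
    "bi-weekly": 14.0,
    "monthly": 365.25 / 12,   # average month length
    "quarterly": 365.25 / 4,
}


def prorated(amount, period, days):
    """Portion of `amount` attributable to a window of `days` days."""
    return amount * days / PERIOD_DAYS[period]


def budget_for_range(items, days):
    """Total budget over `days` days for (amount, period) line items."""
    return sum(prorated(amount, period, days) for amount, period in items)
```

So a $100/week grocery budget contributes about $430 to an average month, and a quarterly insurance bill contributes roughly a third of its amount; mixing periods in one view just means summing the scaled rates.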

For me, though, the real power of this app comes when combined with online banking and online bill-pay. The recurring transactions feature lets you schedule automatically repeating transactions, so you never forget about an automatic withdrawal again. Online banking means that I can balance my "checkbook" against my bank account daily, weekly, or just whenever I remember and have an extra 10 minutes to spend on it. For me, this is a whole lot more convenient than remembering at the beginning of the month when I get my bank statement in the mail. Plus, since I have everything synchronized, I know how much money I don't have and can keep from overdrawing yet again.