There are lots of load balancing products on the market today. Some are good, some are bad. Opinions aside, one such product is the Citrix Netscaler.
Now, the Netscaler has built-in health checking for many different types of backend service protocols: HTTP, FTP, DNS, and more. However, one protocol not included is NTP. If you happen to be load balancing a farm of NTP servers, and want to ensure you’re not routing to a hosed NTP daemon (even if it’s ping-able), it would be nice to be able to health check the service and take it out of rotation if it's not responding.
Luckily, the Netscaler provides (with documentation) a very flexible way to write your own custom health checks. Using the system, you can test whatever you want, in any way you want, with just a little Perl. Even better, the custom health check subsystem provides a few useful things to simplify the whole process, including:
- passing in the service IP, and port, as arguments to your script
- passing in a custom argument string (key value pairs, user/pass info, whatever)
- handling the timeout for you from a higher level; if your check doesn’t respond within -respTimeout seconds, it will be considered a failure.
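As a rough illustration of those inputs, here is how a probe function might unpack them. The key=value;key=value layout of the argument string and the parse_script_args helper are assumptions made up for this example; the Netscaler just hands over whatever string you configured.

```perl
use strict;
use warnings;

# Hypothetical helper: split a "key=value;key=value" string into a hash.
# The separator characters are an assumption, not a Netscaler convention.
sub parse_script_args {
    my ($raw) = @_;
    my %opts;
    for my $pair (split /;/, defined $raw ? $raw : '') {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $opts{$key} = defined $value ? $value : '';
    }
    return %opts;
}

# A probe function receives the service IP, the port, and the argument string.
sub show_inputs {
    my ($ip, $port, $raw_args) = @_;
    my %opts = parse_script_args($raw_args);
    return "ip=$ip port=$port keys=" . join(',', sort keys %opts);
}

# Sample values only; 192.0.2.10 is a documentation address.
print show_inputs('192.0.2.10', 123, 'user=monitor;pass=secret'), "\n";
```

The NTP check in this post doesn't need any custom arguments, so it ignores the third one entirely.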
Now, for just the basics, this monitor will actually be pretty easy to put together. We are given the IP/port to send an NTP request to, and we don’t need to worry about a timeout, because the underlying subsystem will handle that for us. All we really need to do is send a request and hope for a response!
That being said, we’ll need to do a couple things to get to that point. These things are:
- Load the health check subsystem module.
- Load the Socket module, and create a socket to send the request with (using the parameters passed in as args).
From that point, we can build and send our request. Since we don’t need to worry about handling a timeout, we can just sit there waiting for a response. First up:
use IO::Socket;
use Netscaler::KAS;
We’ll need both of these modules – Netscaler::KAS taps into the health check subsystem, and IO::Socket is so we can send our UDP-based request.
Next is our function, which we’ll just call ntp_probe. It takes the service IP ($_[0]) and the service port ($_[1]) as arguments, which get passed in by probe().
So, the magic-fu to this is that, later on, we will pass our ntp_probe function as a coderef to the KAS probe() command. That will ultimately result in our function being called with the arguments passed in; it is similar to this:
$custom_function->($host,$port,$args);
If you look at the KAS code (and since it’s Perl, you can), you can see the exact line; this just gives you an idea of what’s going on.
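If you want to play with that calling convention without a Netscaler at hand, here is a small stand-in for the dispatch step. Note that fake_probe is a hypothetical substitute written for illustration, not the real KAS probe(); it only mimics the call shown above and the (status, message) return convention.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the KAS dispatcher: calls the coderef with
# (ip, port, args) and turns its return value into a readable verdict.
sub fake_probe {
    my ($custom_function, $host, $port, $args) = @_;
    my ($status, $message) = $custom_function->($host, $port, $args);
    return $status == 0
        ? "UP"
        : "DOWN: " . (defined $message ? $message : "no reason given");
}

# Two toy probe functions following the 0 = success, 1 = failure convention.
sub always_ok   { return 0; }
sub always_fail { return (1, "Host or port not specified."); }

print fake_probe(\&always_ok,   '192.0.2.10', 123, ''), "\n";
print fake_probe(\&always_fail, '',           123, ''), "\n";
```

Swapping in your own function for always_ok is a cheap way to exercise a monitor from a regular shell before copying it to the box.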
In any case, after that, we’ll create a new UDP socket, send the NTP request message down it, and either:
- return 1 if something wasn’t defined (IP, port, or sock) – failure!
- return 0 if we got a response – success!
Again, we don’t need to worry about timeouts, since the subsystem will handle that for us. So, here’s the function in its entirety:
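sub ntp_probe {
    my $host = $_[0];    # service IP, passed in by probe()
    my $port = $_[1];    # service port, passed in by probe()
    # 48-byte NTP request; the first byte is octal 010 (version 1, mode 0)
    my $req = "\010" . "\0" x 47;
    if (!$host || !$port) {
        return (1, "Host or port not specified.");
    }
    my $sock = IO::Socket::INET->new(
        Proto    => "udp",
        PeerAddr => $host,
        PeerPort => $port,
    );
    if ($sock) {
        my $reply;
        # send() and recv() return undef on error (for example, when an
        # ICMP port-unreachable comes back), so check both before passing
        if (!defined($sock->send($req))) {
            return (1, "Could not send NTP request");
        }
        if (!defined($sock->recv($reply, 48)) || !length($reply)) {
            return (1, "No NTP response");
        }
        return 0;
    }
    return (1, "Could not create socket");
}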
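If you want the check to be stricter than "did anything come back", one possible extension is to look at the first two bytes of the reply. The sketch below is not part of the monitor itself; check_ntp_reply is a hypothetical helper, and it assumes the standard NTP header layout (leap indicator, version, and mode packed into byte 0, stratum in byte 1).

```perl
use strict;
use warnings;

# Hypothetical helper: returns (0) if the reply looks like a healthy NTP
# server response, or (1, reason) otherwise.
sub check_ntp_reply {
    my ($reply) = @_;
    return (1, "Short or empty reply")
        if !defined $reply || length($reply) < 48;

    my ($flags, $stratum) = unpack("C C", $reply);
    my $leap = ($flags >> 6) & 0x3;   # 3 means the clock is unsynchronized
    my $mode = $flags & 0x7;          # 4 means "server"

    return (1, "Reply is not a server-mode packet") if $mode != 4;
    return (1, "Server reports unsynchronized clock") if $leap == 3;
    return (1, "Invalid stratum $stratum") if $stratum == 0 || $stratum >= 16;
    return 0;
}

# Fabricated sample packets, for illustration only.
my $good = pack("C C", 0x1C, 2) . "\0" x 46;   # LI 0, version 3, mode 4, stratum 2
my $bad  = pack("C C", 0xDC, 0) . "\0" x 46;   # LI 3, version 3, mode 4, stratum 0

print join(" ", check_ntp_reply($good)), "\n";
print join(" ", check_ntp_reply($bad)), "\n";
```

To wire something like this in, ntp_probe would need to read the full 48 bytes of the reply and return whatever the helper gives back. Whether your NTP servers answer the minimal request with a well-formed server-mode packet is worth verifying first.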
Not bad, was it? Now that we have loaded our modules, and defined a function, the next step is to call the probe() method, with a coderef to our new function, to tell KAS what function to use for probing:
probe(\&ntp_probe);
And you’re off to the races! Just don’t forget to start your script with #!/usr/bin/perl, and it should be good to go.
Now, you might be asking “how do I actually configure the Netscaler to use this?” That’s also just as easy. Let’s pretend our new script is called customNTP.pl. Just plop it into the /nsconfig/monitors/ directory, and do something like this from the CLI:
add lb monitor "custom-ntp-mon" USER -scriptName customNTP.pl -scriptArgs 1 -dispatcherIP 127.0.0.1 -dispatcherPort 3013 -interval 20 -respTimeout 3
With that line, you will now have a working NTP health check (20 second intervals, 3 second timeout). Just bind it to your service(s) that need it and enjoy! Note that the type of monitor is USER; this is the type used for custom health checks.
Well, that’s all for this evening. There are bits and pieces of info like this around the web, on the Citrix AppExpert site, etc. And, even though it wasn’t too difficult to figure out, hopefully this will help someone out some day.
Take care!



