Splunk-o-rama
Mar. 24th, 2011 02:56 pm
Here's a quick example of why I like Splunk so much:
We have a whole heap of websites. Some unknown number of them are no longer used -- either they never went live, or they've been replaced -- but we don't know which ones. And of course one doesn't want to accidentally nuke an active site being used by clients.
So.
The web hosting boxes are running Splunk in forwarder mode, watching all of /var/local/apache/logs. This tree contains one directory per website, with each directory containing all the Apache logs for that site.
Pointing Splunk at that directory with the default settings didn't work out so well. It picked up every log in there, including files we don't care about like mod_jk.log, and it wasn't correctly classifying all the access logs, so not all of them were parsed to present useful fields like "clientip" in the search tool.
Change inputs.conf to read thus:
[monitor:///var/local/apache/logs]
disabled = false
followTail = 0
sourcetype = access_common
whitelist = .*access_log$
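A quick sanity check of that whitelist pattern, sketched in Python: Splunk matches the regex against the full pathname of each candidate file, so rotated copies and the other log files fall through. The example paths below are invented for illustration.

```python
import re

# The whitelist regex from the inputs.conf stanza above.
whitelist = re.compile(r".*access_log$")

# Invented example paths of the sort that live under the monitored tree.
paths = [
    "/var/local/apache/logs/siteA/access_log",    # current access log
    "/var/local/apache/logs/siteA/access_log.1",  # rotated copy
    "/var/local/apache/logs/siteA/mod_jk.log",    # noise
    "/var/local/apache/logs/siteB/error_log",     # noise
]

# Only paths ending in "access_log" survive the whitelist.
matched = [p for p in paths if whitelist.search(p)]
```

The `$` anchor is what keeps the rotated `access_log.1` files out.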
Then restart the forwarders and flush the index on the master; after that, only the current access_log for each website is watched, and it is always parsed as access_common.
This will have to be left running for a week or two to get enough data to be sure, but once the index is properly seeded the following search command will produce a lovely table suitable for finding the inactive sites:
sourcetype="access_common" | stats dc(clientip) by source
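The aggregation that search performs can be sketched in Python: count the distinct client IPs seen per source file. The sample events below are invented stand-ins for the indexed access_common data; "clientip" is simply the first whitespace-delimited field of common log format.

```python
from collections import defaultdict

# Invented (source_file, log_line) pairs standing in for indexed events.
events = [
    ("/var/local/apache/logs/siteA/access_log",
     '192.0.2.1 - - [24/Mar/2011:10:00:00 +0000] "GET / HTTP/1.1" 200 512'),
    ("/var/local/apache/logs/siteA/access_log",
     '192.0.2.2 - - [24/Mar/2011:10:01:00 +0000] "GET / HTTP/1.1" 200 512'),
    ("/var/local/apache/logs/siteB/access_log",
     '192.0.2.1 - - [24/Mar/2011:10:02:00 +0000] "GET / HTTP/1.1" 200 512'),
]

# clientip is the first field of a common-log-format line.
clients = defaultdict(set)
for source, line in events:
    clients[source].add(line.split()[0])

# Equivalent of: stats dc(clientip) by source
for source, ips in sorted(clients.items()):
    print(source, len(ips))
```

Sites whose distinct-client count stays near zero over the observation window are the candidates for decommissioning.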
Could this have been done with a different tool, or scripted from scratch? Absolutely! But Splunk is much more generic: the skills one picks up working with it on this specific problem apply to any other task where log analysis is the answer.
I love this tool, and wish it were easier to sell to management. Unfortunately the price tag is pretty high, and since it's not something they use personally, attempts to get them to want to pay for it have not been entirely successful. I suspect there are a lot of companies where the free 500MB/day version is getting significant use by people like me, in the vague hope that its usefulness will eventually become so obvious to those who control the budgets that they'll realise they really ought to just pay up and do it right.
(no subject)
Date: 2011-03-24 08:22 am (UTC)
All the quadro-gazillion options weren't as easily found out, though, and after an hour or two I gave up in disgust, and went back to swatch.
(no subject)
Date: 2011-03-24 08:34 am (UTC)
And the web UI is very "discoverable".
So I'm really not seeing how it's difficult to figure out.
In the same boat
Date: 2011-03-24 03:32 pm (UTC)
Kevin
http://google.com/profiles/kefoster
(no subject)
Date: 2011-03-24 11:51 pm (UTC)
(no subject)
Date: 2011-03-24 11:57 pm (UTC)
But bear in mind that you'll get locked out of searches if you routinely exceed 500MB/day of data unless you're paying for the Enterprise version.
Setup is a complete doddle, so there's very little reason not to play around with it. The documentation is pretty good, too.
(no subject)
Date: 2011-03-25 08:37 am (UTC)
Probably it's like Perl - for some people everything just falls into place naturally, for others everything's completely and utterly alien.
(no subject)
Date: 2011-03-25 09:01 am (UTC)
Maybe it didn't do that with the version you looked at. I didn't start using it until 3.0 came along, which was a couple of years ago now. By that point the documentation was very clear and straightforward. The only spot where I could see some people having difficulty was with getting remote syslog data into it, because to do that without buying a license you need to set up syslog-ng and have that push data into a FIFO for Splunk to read.
You add data by going into the settings section and poking at "Inputs". You search on that data by typing stuff in the textarea. You refine the search by typing more stuff, or by clicking on things (e.g., an IP address) to get results that match that term.
Most of it is about querying the data you've stored. Getting data in is a really small part of the whole.
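The syslog-ng-into-a-FIFO arrangement mentioned above might look roughly like this. This is a sketch only: the source/destination names and the FIFO path are invented, and the exact syslog-ng source definition depends on how your network feeds arrive.

```
# syslog-ng side: invented names, rough sketch only.
source s_net { udp(ip(0.0.0.0) port(514)); };
destination d_splunk_fifo { pipe("/var/run/splunk-syslog.fifo"); };
log { source(s_net); destination(d_splunk_fifo); };

# Splunk side (inputs.conf): read the same FIFO as an input.
[fifo:///var/run/splunk-syslog.fifo]
sourcetype = syslog
```

The FIFO has to exist (e.g. created with mkfifo) before either side starts reading or writing it.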
(no subject)
Date: 2011-03-25 09:19 am (UTC)
LOTS of docs on how to do advanced queries and stuff, but nothing at all that I could find (in 1-2 hours of poking) on how to throw the data I want to then massage into it in the first place - should I look in the web-UI, some config file buried in the filesystem, or what?
Which left me with a feeling exactly like the camel book - I heard about Perl, bought the book, and after the first couple of pages was 900% more in the dark about it than before. (In the case of Perl it's because I know about variables, pointers, functions, double-pointers and all that from C, but didn't even know how to spell "scalar" (and it went steeply downhill from there - "that's how you deal with a hashref" - uh, and what the fuck is that?). The book wouldn't have been any less useful for me if it was written in Neanderthal. Completely different vocabulary than anything I'd encountered before, and no explanation whatsoever for the (to me!) confusing things.)