Revision

  $Id: webjob-run-tcpdump.base,v 1.9 2006/03/14 20:13:47 klm Exp $

Purpose

  This recipe demonstrates how to concurrently run tcpdump on a group
  of IDS sensors to collect network traffic. Once the captures are
  complete, any output is transferred to the WebJob server where it
  can be further processed and disseminated.

Motivation

  This recipe grew out of the need to quickly identify all IDS
  sensors that were in a position to capture traffic that matched a
  specific tcpdump filter.

Requirements

  Cooking with this recipe requires an operational WebJob server. If
  you do not have one of those, refer to the instructions provided in
  the README.INSTALL file that comes with the source distribution.
  The latest source distribution is available here:

    http://sourceforge.net/project/showfiles.php?group_id=40788

  Each client must be running UNIX and have basic system utilities
  and WebJob (1.5.0 or higher) installed. The server must be running
  UNIX and have basic system utilities, Apache, and WebJob (1.5.0 or
  higher) installed.

  The commands presented throughout this recipe were designed to be
  executed within a Bourne shell (i.e., sh or bash).

  This recipe assumes that you have read and implemented the
  following recipe:

    http://webjob.sourceforge.net/Files/Recipes/webjob-run-periodic.txt

Time to Implement

  Assuming that you have satisfied all the requirements/prerequisites,
  this recipe should take less than one hour to implement.

Solution

  The following steps describe how to implement this recipe.

  1. Create the following symlink in your commands directory. Also,
     make sure that your s_hlc_any script has a revision number of
     1.1.1.1 or higher.

       # ln -s s_hlc_any s_hlc_tcpdump

  2. Edit the hourly script (1.4 or higher) and add your list of
     hostnames to one of the unused groups. If all existing groups
     are in use, you may need to create a new one (use a meaningful
     name and/or comment your modifications in the script).

     For example, suppose you wanted to add a sniffer group for all
     East coast sensors. To do this, you could declare the following
     variable in GetHostGroups():

       EAST_SNIFFERS="sensor1 sensor2 sensor3"

     Then, you'd need to update the MY_GROUPS variable as follows:

       MY_GROUPS="MY_GROUP1 MY_GROUP2 EAST_SNIFFERS"

     After that, you must modify the appropriate case statements in
     the regular and oneshot routines to recognize the new group. The
     example shown below inserts a job to capture 1000 packets
     to/from the IP address 1.1.1.1. The recommended location for
     jobs such as this is RunOneShotJobs().

       for GROUP in `GetHostGroups` ; do
         case "${GROUP}" in
         MY_GROUP1)
           : # REPLACE WITH ONE OR MORE MY_GROUP1 JOBS
           ;;
         MY_GROUP2)
           : # REPLACE WITH ONE OR MORE MY_GROUP2 JOBS
           ;;
         EAST_SNIFFERS)
           (
             egrep -i -v '(RunTimeLimit|TimeoutSignal)' ${WEBJOB_HOME}/etc/upload.cfg ;
             echo "RunTimeLimit=3590" ;
             echo "TimeoutSignal=2" ;
           ) | \
             ${WEBJOB_HOME}/bin/webjob -e -f - s_hlc_tcpdump -n -i eth1 -s 0 -c 1000 -w - 'ip host 1.1.1.1' &
           ;;
         esac
       done

     There are a few things to point out with the above job:

     a) The default run timer and timeout signal, if they exist, are
        removed using egrep, and new/replacement values are added.
        This is done to prevent jobs from stacking up on the sensor
        -- that can happen if you schedule multiple oneshots, or if
        you run jobs on a periodic basis (e.g., hourly). When the run
        timer is set as shown, webjob will automatically send a
        SIGINT to tcpdump after 3590 seconds.

        Note: The default TimeoutSignal is SIGKILL. In this case,
        sending the default signal would have worked, but any
        unwritten packets would be lost.
        Since tcpdump handles SIGINTs, we use them instead.

     b) The '-c' option is set to 1000 to limit the number of packets
        collected. If you're just trying to locate where a certain
        type of traffic is flowing, you don't need to capture a lot
        of it.

     c) The tcpdump job is set to run in the background. This allows
        the hourly script to continue without waiting for the sniffer
        job to complete.

     d) The '-s' option may not work on older versions of tcpdump.
        Check your man pages.

     e) The '-w' option is configured to write binary pcap data to
        stdout, so expect binary data to be uploaded to the server.

Closing Remarks

  Make sure RunTimeLimit is set to a value that is less than the run
  period (e.g., hourly). If that is not done, jobs could stack up on
  the clients. You should also run s_hlc_ps on an hourly basis -- the
  information provided by this command would be useful if you suspect
  that jobs are, in fact, stacking up. Run the following command to
  create s_hlc_ps (do this in the appropriate commands directory):

    # ln -s s_hlc_any s_hlc_ps

  Note: A race condition can occur if you run this job on a periodic
  basis with a RunTimeLimit that is equal (or very close) to the job
  period. For example, if your job period is hourly, then
  RunTimeLimit should be less than 3600 -- 10-30 seconds less should
  be sufficient. If this is not done, you may experience errors
  similar to this:

    Main(): WebJobDoRunStage(): WebJobDoKidStage(): execv(): s_hlc_tcpdump, No such file or directory

  Typically, this means that s_hlc_tcpdump is being deleted by one
  job while another job is getting ready (or already trying) to
  execute it -- hence, the race.

  Once the captures have been uploaded, one way to examine them on
  the server is sketched in the appendix below.

Credits

  This recipe was brought to you by Klayton Monroe.

References

  tcpdump(1)
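Appendix

  The sketch below shows one way the uploaded captures might be
  examined once they reach the server. It is only a sketch: the
  incoming directory (/var/webjob/incoming) and the filename pattern
  (s_hlc_tcpdump*.out) are assumptions; substitute whatever matches
  your server's layout and naming conventions.

    #!/bin/sh
    #
    # For each uploaded capture, read it back with tcpdump and report
    # how many packets it contains. Sensors that were in a position
    # to see traffic matching the filter will report nonzero counts.
    #
    INCOMING_DIR=/var/webjob/incoming    # assumed location; adjust as needed
    for FILE in ${INCOMING_DIR}/*/s_hlc_tcpdump*.out ; do
      [ -f "${FILE}" ] || continue       # skip if the glob matched nothing
      COUNT=`tcpdump -n -r "${FILE}" 2>/dev/null | wc -l`
      echo "${FILE}: ${COUNT} packet(s)"
    done

  A nonzero count identifies a sensor that was in a position to see
  the traffic of interest, which was the original motivation for this
  recipe.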