Revision

  $Id: webjob-run-tcpdump.base,v 1.9 2006/03/14 20:13:47 klm Exp $

Purpose

  This recipe demonstrates how to concurrently run tcpdump on a group
  of IDS sensors to collect network traffic. Once the captures are
  complete, any output is transferred to the WebJob server where it
  can be further processed and disseminated.

Motivation

  This recipe grew out of the need to quickly identify all IDS
  sensors that were in a position to capture traffic that matched a
  specific tcpdump filter.

Requirements

  Cooking with this recipe requires an operational WebJob server. If
  you do not have one of those, refer to the instructions provided in
  the README.INSTALL file that comes with the source distribution.
  The latest source distribution is available here:

    http://sourceforge.net/project/showfiles.php?group_id=40788

  Each client must be running UNIX and have basic system utilities
  and WebJob (1.5.0 or higher) installed. The server must be running
  UNIX and have basic system utilities, Apache, and WebJob (1.5.0 or
  higher) installed.

  The commands presented throughout this recipe were designed to be
  executed within a Bourne shell (i.e., sh or bash).

  This recipe assumes that you have read and implemented the
  following recipe:

    http://webjob.sourceforge.net/Files/Recipes/webjob-run-periodic.txt

Time to Implement

  Assuming that you have satisfied all the requirements/prerequisites,
  this recipe should take less than one hour to implement.

Solution

  The following steps describe how to implement this recipe.

  1. Create the following symlink in your commands directory. Also,
     make sure that your s_hlc_any script has a revision number of
     1.1.1.1 or higher.

       # ln -s s_hlc_any s_hlc_tcpdump

  2. Edit the hourly script (1.4 or higher) and add your list of
     hostnames to one of the unused groups. If all existing groups
     are in use, you may need to create a new one (use a meaningful
     name and/or comment your modifications in the script).

     For example, suppose you wanted to add a sniffer group for all
     East coast sensors. To do this, you could declare the following
     variable in GetHostGroups():

       EAST_SNIFFERS="sensor1 sensor2 sensor3"

     Then, you'd need to update the MY_GROUPS variable as follows:

       MY_GROUPS="MY_GROUP1 MY_GROUP2 EAST_SNIFFERS"

     After that, you must modify the appropriate case statements in
     the regular and oneshot routines to recognize the new group. The
     example shown below inserts a job to capture 1000 packets
     to/from the IP address 1.1.1.1. The recommended location for
     jobs such as this is RunOneShotJobs().

       for GROUP in `GetHostGroups` ; do
         case "${GROUP}" in
         MY_GROUP1)
           : # REPLACE WITH ONE OR MORE MY_GROUP1 JOBS
           ;;
         MY_GROUP2)
           : # REPLACE WITH ONE OR MORE MY_GROUP2 JOBS
           ;;
         EAST_SNIFFERS)
           (
             egrep -i -v '(RunTimeLimit|TimeoutSignal)' ${WEBJOB_HOME}/etc/upload.cfg ;
             echo "RunTimeLimit=3590" ;
             echo "TimeoutSignal=2" ;
           ) | \
             ${WEBJOB_HOME}/bin/webjob -e -f - s_hlc_tcpdump -n -i eth1 -s 0 -c 1000 -w - 'ip host 1.1.1.1' &
           ;;
         esac
       done

     There are a few things to point out with the above job:

     a) The default run timer and timeout signal, if they exist, are
        removed using egrep, and new/replacement values are added.
        This is done to prevent jobs from stacking up on the sensor
        -- that can happen if you schedule multiple oneshots, or if
        you run jobs on a periodic basis (e.g., hourly). When the run
        timer is set as shown, webjob will automatically send a
        SIGINT to tcpdump after 3590 seconds.

        Note: The default TimeoutSignal is SIGKILL. In this case,
        sending the default signal would have worked, but any
        unwritten packets would be lost.
        Since tcpdump handles SIGINTs, we use them instead.

     b) The '-c' option is set to 1000 to limit the number of packets
        collected. If you're just trying to locate where a certain
        type of traffic is flowing, you don't need to capture a lot
        of it.

     c) The tcpdump job is set to run in the background. This allows
        the hourly script to continue without waiting for the sniffer
        job to complete.

     d) The '-s' option may not work on older versions of tcpdump.
        Check your man pages.

     e) The '-w' option is configured to write binary pcap data to
        stdout, so expect binary data to be uploaded to the server.

Closing Remarks

  Make sure RunTimeLimit is set to a value that is less than the run
  period (e.g., hourly). If that is not done, jobs could stack up on
  the clients. You should also run s_hlc_ps on an hourly basis -- the
  information provided by this command would be useful if you suspect
  that jobs are, in fact, stacking up. Run the following command to
  create s_hlc_ps (do this in the appropriate commands directory):

    # ln -s s_hlc_any s_hlc_ps

  Note: A race condition can occur if you run this job on a periodic
  basis with a RunTimeLimit that is equal (or very close) to the job
  period. For example, if your job period is hourly, then
  RunTimeLimit should be less than 3600 -- 10-30 seconds less should
  be sufficient. If this is not done, you may experience errors
  similar to this:

    Main(): WebJobDoRunStage(): WebJobDoKidStage(): execv(): s_hlc_tcpdump, No such file or directory

  Typically, this means that s_hlc_tcpdump is being deleted by one
  job while another job is getting ready (or already trying) to
  execute it -- hence, the race.

  Once the captures have been uploaded, one way to examine them on
  the server is sketched in the appendix below.

Credits

  This recipe was brought to you by Klayton Monroe.

References

  tcpdump(1)
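Appendix

  The sketch below shows one way the uploaded captures might be
  examined once they reach the server. It is only a sketch: the
  incoming directory (/var/webjob/incoming) and the filename pattern
  (s_hlc_tcpdump*.out) are assumptions; substitute whatever matches
  your server's layout and naming conventions.

    #!/bin/sh
    #
    # For each uploaded capture, read it back with tcpdump and report
    # how many packets it contains. Sensors that were in a position
    # to see traffic matching the filter will report nonzero counts.
    #
    INCOMING_DIR=/var/webjob/incoming    # assumed location; adjust as needed
    for FILE in ${INCOMING_DIR}/*/s_hlc_tcpdump*.out ; do
      [ -f "${FILE}" ] || continue       # skip if the glob matched nothing
      COUNT=`tcpdump -n -r "${FILE}" 2>/dev/null | wc -l`
      echo "${FILE}: ${COUNT} packet(s)"
    done

  A nonzero count identifies a sensor that was in a position to see
  the traffic of interest, which was the original motivation for this
  recipe.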