Revision

  $Id: webjob-pad-rsync.base,v 1.5 2005/07/18 22:07:40 klm Exp $

Purpose

  This recipe demonstrates how to synchronize files using rsync,
  OpenSSH, WebJob, and PaD. The rsync process is driven with WebJob
  and secured through OpenSSH. Dynamic authentication, through PaD,
  eliminates the need to maintain extra SSH credentials on the client.

Motivation

  The motivation for this recipe was to create a synchronization
  process that is efficient, encrypted, secure, scalable, and
  manageable. Independently, rsync provides an efficient method for
  synchronizing files, OpenSSH provides secure shell access, WebJob
  provides a scalable framework that is conducive to centralized
  management, and PaD turns regular files into self-extracting
  executables. In combination, these tools get the job done and get
  it done pretty well.

Requirements

  Cooking with this recipe requires an operational WebJob server. If
  you do not have one of those, refer to the instructions provided in
  the README.INSTALL file that comes with the source distribution. The
  latest source distribution is available here:

    http://sourceforge.net/project/showfiles.php?group_id=40788

  Each client must be running UNIX (or Cygwin on Win2K) and have basic
  system utilities, OpenSSH, rsync, and WebJob installed.

  The server must be running UNIX and have basic system utilities, PaD
  utilities, OpenSSH, rsync, and WebJob (1.5.0 or higher) installed.

  The commands presented throughout this recipe were designed to be
  executed within a Bourne shell (i.e., sh or bash).

Solution

  The solution is to periodically download webjob_rsync_id.pad and
  execute the desired rsync command. The following steps describe how
  to implement this solution.

  1. Create a locked user account called 'rsync' on the WebJob server.
     This account must be locked to prevent unnecessary logins and
     password authentication. All authentication for this recipe will
     be done using SSH keys.

     Before you begin, set the following environment variables in your
     shell, as they will be referenced throughout the remainder of the
     recipe.

       # RSYNC_ID="873"
       # RSYNC_USER="rsync"
       # RSYNC_GROUP="rsync"
       # RSYNC_HOME="/usr/home/${RSYNC_USER}"

     FreeBSD:

       # pw groupadd ${RSYNC_GROUP} -g ${RSYNC_ID}
       # pw useradd ${RSYNC_USER} -u ${RSYNC_ID} -g ${RSYNC_GROUP} -d ${RSYNC_HOME} -s /bin/sh -c "Rsync User"

     Linux and Solaris:

       # groupadd -g ${RSYNC_ID} ${RSYNC_GROUP}
       # useradd -u ${RSYNC_ID} -g ${RSYNC_GROUP} -d ${RSYNC_HOME} -s /bin/sh -c "Rsync User" ${RSYNC_USER}

     Create a root-owned home directory for the rsync user. This
     directory should be root-owned so that the rsync user, alone,
     does not have sufficient privileges to create or modify dot
     files.

       # mkdir -p ${RSYNC_HOME}
       # find ${RSYNC_HOME} -type d -exec chmod 750 {} \;
       # find ${RSYNC_HOME} -type f -exec chmod 640 {} \;
       # chown -R 0:${RSYNC_GROUP} ${RSYNC_HOME}

  2. Create an SSH key pair for the rsync user on the WebJob server.
     Then, add the public key along with any options you desire to
     rsync's authorized_keys file.

     The key you create in this step will not be protected with a
     passphrase. Therefore, we recommend that you restrict its use by
     applying various key options. This is important because the
     private key will be vulnerable to capture on the client during
     job execution. Details about the various key options can be found
     in the "AUTHORIZED_KEYS FILE FORMAT" section of the sshd(8) man
     page.

       # KEY_NAME="webjob_rsync_id"
       # KEY_TYPE="rsa"
       # ALLOWED_HOSTS="1.1.1.1"
       # KEY_OPTIONS="from=\"${ALLOWED_HOSTS}\",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty"
       # mkdir -p ${RSYNC_HOME}/.ssh
       # ssh-keygen -q -t ${KEY_TYPE} -C ${KEY_NAME} -f ${RSYNC_HOME}/.ssh/${KEY_NAME} -N ""
       # echo ${KEY_OPTIONS} `cat ${RSYNC_HOME}/.ssh/${KEY_NAME}.pub` > ${RSYNC_HOME}/.ssh/authorized_keys
       # find ${RSYNC_HOME}/.ssh -type d -exec chmod 750 {} \;
       # find ${RSYNC_HOME}/.ssh -type f -exec chmod 640 {} \;
       # chown -R 0:${RSYNC_GROUP} ${RSYNC_HOME}/.ssh

  3. Create a data tree for the rsync user. This tree will be used as
     a sink or source depending on whether rsync is doing a push or a
     pull, respectively.

       # RSYNC_TREE=${RSYNC_HOME}/data
       # mkdir -p ${RSYNC_TREE}
       # find ${RSYNC_TREE} -type d -exec chmod 750 {} \;
       # find ${RSYNC_TREE} -type f -exec chmod 640 {} \;
       # chown -R ${RSYNC_USER}:${RSYNC_GROUP} ${RSYNC_TREE}

  4. Create a PaD script that contains the private SSH key. Put this
     script in the appropriate commands directory. Set the
     WEBJOB_CLIENT and WEBJOB_COMMANDS environment variables as
     appropriate for your server.

     Note: If you only want this key to be available to a particular
     client, don't set WEBJOB_CLIENT to 'common'. Instead, set it to
     the specific client ID you want to use -- e.g., 'client_0001'.

       # WEBJOB_CLIENT=common
       # WEBJOB_COMMANDS=/var/webjob/profiles/${WEBJOB_CLIENT}/commands
       # pad-make-script -c ${RSYNC_HOME}/.ssh/${KEY_NAME} | sed 's/PAD_UMASK-022/PAD_UMASK-077/g;' > ${WEBJOB_COMMANDS}/${KEY_NAME}.pad
       # chown 0:0 ${WEBJOB_COMMANDS}/${KEY_NAME}.pad
       # chmod 644 ${WEBJOB_COMMANDS}/${KEY_NAME}.pad

     More details on PaD can be found here:

       http://webjob.sourceforge.net/WebJob/PayloadAndDelivery.shtml

  5. Create a cron job on each client that periodically executes rsync
     by way of webjob_rsync_id.pad. The arguments passed to
     webjob_rsync_id.pad should form a valid rsync command, but with
     any quotes escaped.

     The following examples demonstrate how to run rsync in archive
     mode (-a) over SSH (-e) and with compression (-z) enabled. Note
     that the PaD script will replace the %payload token with
     webjob_rsync_id prior to executing rsync.

     Here's an example rsync push (client --> server):

       0 * * * * ${WEBJOB_HOME=/usr/local/webjob}/bin/webjob -e -f ${WEBJOB_HOME}/etc/upload.cfg webjob_rsync_id.pad rsync -avze \"ssh -i %payload -o BatchMode=yes\" /data rsync@1.1.1.1:data/

     Here's an example rsync pull (server --> client):

       0 * * * * ${WEBJOB_HOME=/usr/local/webjob}/bin/webjob -e -f ${WEBJOB_HOME}/etc/upload.cfg webjob_rsync_id.pad rsync -avze \"ssh -i %payload -o BatchMode=yes\" rsync@1.1.1.1:data /data/

     Note: Many cron implementations treat an unescaped '%' in the
     command field as a newline (see crontab(5)). If yours does, write
     \%payload in the crontab entry.

     One problem to watch out for is OpenSSH's host key checking. If
     the server's key isn't known to the client, OpenSSH's default
     action will be to ask, and that could cause the job to hang.
     There are two ways to avoid this situation: make sure the
     client's known_hosts file contains the appropriate key, or
     disable StrictHostKeyChecking. As a fallback plan, you can also
     set WebJob's RunTimeLimit control to abort the job after a
     specified amount of time.

Closing Remarks

  Using cron to invoke webjob_rsync_id.pad is not conducive to a
  centralized management scheme unless crontabs are also being
  centrally managed (e.g., via WebJob). A better approach would be to
  periodically (e.g., hourly) run a centrally managed meta script that
  runs webjob_rsync_id.pad as a subtask.

  The SSH keys used in this recipe are dynamic in the sense that they
  do not persist on the client. However, they are static on the
  server. A better approach would be to have the server automatically
  generate a new key for each job that is executed.

Credits

  This recipe was brought to you by Klayton Monroe and Andy Bair.

Appendix 1

  The following script is a work-in-progress, and it needs a lot more
  polishing and error handling. Send your patches ;)

  The following command will extract the webjob_setup_rsync_account
  script.

    sed -e '1,/^--- webjob_setup_rsync_account ---$/d; /^--- webjob_setup_rsync_account ---$/,$d' webjob-pad-rsync.txt > webjob_setup_rsync_account

--- webjob_setup_rsync_account ---
#!/bin/sh -e
######################################################################
#
# $Id: webjob_setup_rsync_account,v 1.5 2005/07/29 21:57:35 klm Exp $
#
######################################################################
#
# Copyright 2004-2004 The WebJob Project, All Rights Reserved.
#
######################################################################
#
# Purpose: Prepare a WebJob server for rsync access.
#
######################################################################

IFS=' '

PATH=/sbin:/usr/sbin:/usr/local/sbin:/bin:/usr/bin:/usr/local/bin:${WEBJOB_HOME=/usr/local/webjob}/bin

PROGRAM=`basename $0`

Usage()
{
  echo 1>&2
  echo "Usage: ${PROGRAM} [-a address] [-c client-id] [-d webjob-basedir] [-g group] [-h home] [-i id] [-t key-type] [-u user] -s {freebsd|linux|solaris}" 1>&2
  echo 1>&2
  exit 1
}

while getopts "a:c:d:g:h:i:s:t:u:" OPTION ; do
  case "${OPTION}" in
  a)
    ALLOWED_HOSTS="${OPTARG}"
    ;;
  c)
    WEBJOB_CLIENT="${OPTARG}"
    ;;
  d)
    WEBJOB_BASEDIR="${OPTARG}"
    ;;
  g)
    TARGET_GROUP="${OPTARG}"
    ;;
  h)
    TARGET_HOME="${OPTARG}"
    ;;
  i)
    TARGET_ID="${OPTARG}"
    ;;
  s)
    SYSTEM_TYPE="${OPTARG}"
    ;;
  t)
    KEY_TYPE="${OPTARG}"
    ;;
  u)
    TARGET_USER="${OPTARG}"
    ;;
  *)
    Usage
    ;;
  esac
done

if [ ${OPTIND} -le $# ] ; then
  Usage
fi

if [ -z "${SYSTEM_TYPE}" ] ; then
  Usage
fi

TARGET_ID=${TARGET_ID-873}
TARGET_USER=${TARGET_USER-rsync}
TARGET_GROUP=${TARGET_GROUP-rsync}
TARGET_HOME=${TARGET_HOME-/usr/home/${TARGET_USER}}
TARGET_TREE=${TARGET_HOME}/data

#
# Run some conflict tests. Each test is written as an AND-OR list
# because this script runs under 'sh -e', and a bare egrep that finds
# no match (exit status 1) would otherwise terminate the script.
#
egrep ${TARGET_ID} /etc/passwd > /dev/null 2>&1 && UID_EXISTS=0 || UID_EXISTS=1
egrep ${TARGET_USER} /etc/passwd > /dev/null 2>&1 && USER_EXISTS=0 || USER_EXISTS=1
egrep ${TARGET_ID} /etc/group > /dev/null 2>&1 && GID_EXISTS=0 || GID_EXISTS=1
egrep ${TARGET_GROUP} /etc/group > /dev/null 2>&1 && GROUP_EXISTS=0 || GROUP_EXISTS=1

if [ ${UID_EXISTS} -eq 0 -o ${GID_EXISTS} -eq 0 -o ${USER_EXISTS} -eq 0 -o ${GROUP_EXISTS} -eq 0 -o -d ${TARGET_HOME} ] ; then
  echo "${PROGRAM}: You must remove any uid, gid, user, group, or home dir conflicts before this script will run." 1>&2
  exit 2
fi

#
# Create the rsync account. Unsupported system types are rejected
# rather than silently skipped.
#
case "${SYSTEM_TYPE}" in
freebsd)
  pw groupadd ${TARGET_GROUP} -g ${TARGET_ID}
  pw useradd ${TARGET_USER} -u ${TARGET_ID} -g ${TARGET_GROUP} -d ${TARGET_HOME} -s /bin/sh -c "Rsync User"
  ;;
linux|solaris)
  groupadd -g ${TARGET_ID} ${TARGET_GROUP}
  useradd -u ${TARGET_ID} -g ${TARGET_GROUP} -d ${TARGET_HOME} -s /bin/sh -c "Rsync User" ${TARGET_USER}
  ;;
*)
  Usage
  ;;
esac

#
# Create directory structure.
#
DIRS="${TARGET_HOME} ${TARGET_HOME}/.ssh ${TARGET_TREE}"

for DIR in ${DIRS} ; do
  mkdir -p ${DIR}
done

#
# Create SSH key pair.
#
KEY_NAME="webjob_${TARGET_USER}_id"
KEY_TYPE=${KEY_TYPE-rsa}
ALLOWED_HOSTS=${ALLOWED_HOSTS-1.1.1.1}
KEY_OPTIONS="from=\"${ALLOWED_HOSTS}\",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty"
ssh-keygen -q -t ${KEY_TYPE} -C ${KEY_NAME} -f ${TARGET_HOME}/.ssh/${KEY_NAME} -N ""
echo ${KEY_OPTIONS} `cat ${TARGET_HOME}/.ssh/${KEY_NAME}.pub` > ${TARGET_HOME}/.ssh/authorized_keys

#
# Tidy up perms
#
find ${TARGET_HOME} -type d -exec chmod 750 {} \;
find ${TARGET_HOME} -type f -exec chmod 640 {} \;

#
# Tidy up ownerships
#
chown -R 0:${TARGET_GROUP} ${TARGET_HOME}
chown -R ${TARGET_USER}:${TARGET_GROUP} ${TARGET_TREE}

#
# Package private SSH key as a PaD file and install in the specified
# commands directory.
#
WEBJOB_CLIENT=${WEBJOB_CLIENT-common}
WEBJOB_BASEDIR=${WEBJOB_BASEDIR-/var/webjob}
WEBJOB_COMMANDS=${WEBJOB_BASEDIR}/profiles/${WEBJOB_CLIENT}/commands
pad-make-script -c ${TARGET_HOME}/.ssh/${KEY_NAME} | sed 's/PAD_UMASK-022/PAD_UMASK-077/g;' > ${WEBJOB_COMMANDS}/${KEY_NAME}.pad
chown 0:0 ${WEBJOB_COMMANDS}/${KEY_NAME}.pad
chmod 644 ${WEBJOB_COMMANDS}/${KEY_NAME}.pad
--- webjob_setup_rsync_account ---
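
Appendix 2

  The conflict tests in webjob_setup_rsync_account use egrep to look
  for the ID, user, and group anywhere in /etc/passwd and /etc/group.
  As a result, an ID of 873 would also match 8730, and a user of
  'rsync' would also match 'rsyncd' or a home directory that contains
  that string. The following is a minimal sketch of a stricter,
  field-based test. The function name field_in_use is hypothetical
  and is not part of WebJob, and the sketch assumes a POSIX awk (one
  that supports the -v option).

```shell
#!/bin/sh
# Sketch only: field_in_use is a hypothetical helper, not part of WebJob.
#
# Usage: field_in_use file field-number value
#
# Returns 0 if the given field of the colon-delimited file equals the
# given value exactly on at least one line. Returns 1 otherwise.
field_in_use()
{
  # Appending "" to both sides forces a string comparison, so that
  # values such as 873 and 0873 are not treated as equal.
  awk -F: -v n="$2" -v want="$3" '
    $n "" == want "" { found = 1 ; exit }
    END { exit !found }
  ' "$1"
}
```

  In both /etc/passwd and /etc/group, field 1 holds the name and
  field 3 holds the numeric ID, so a conflict test for the defaults
  used in this recipe could be written as
  'field_in_use /etc/passwd 3 873' or 'field_in_use /etc/group 1 rsync'.
  Because the function is meant to be called as the condition of an
  if statement or as part of an AND-OR list, a non-zero return value
  should not terminate a script that runs under 'sh -e'.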