In modern DevOps environments, maintaining the integrity and availability of Jenkins data is essential. One of the key aspects of Jenkins administration is backing up the Jenkins home directory, which contains important configurations, plugins, and job data. In this blog post, we will explore how to automate Jenkins backups with a shell script built around rsync, keep an eye on the backup process through logs, and send real-time notifications to Slack.
Why Backup Jenkins Home Data?
The Jenkins home directory is the heart of your Jenkins instance. It contains:
Job configurations and builds
Installed plugins and configurations
User settings and credentials
Build logs and artifacts
Regular backups of this directory ensure that in case of a failure (e.g., hardware issues, accidental deletion), you can easily restore Jenkins to its previous state.
The Backup Strategy
Our backup strategy involves the following steps:
Sync Jenkins Home Data: We use rsync, a powerful file synchronization tool, to copy the Jenkins home directory (/mnt/jenkins-home-data-pvc) to a backup destination (e.g., an NFS mount).
Handle Mounts: We check whether the Jenkins PVC uses the local-storage class and whether the backup destination is an NFS mount, and use that to decide the direction of the backup (whether to sync from source to destination or vice versa).
Retries and Error Handling: We ensure reliability by implementing retry logic. If the backup fails, the script retries the operation up to three times before sending an alert.
Notifications: We integrate with Slack to send real-time notifications in case of errors or interruptions during the backup process.
The Script Breakdown
Here’s an overview of the key components of the backup script.
1. Configuring Paths and Variables
The script starts by defining several key paths:
# Path to the Kubernetes config file
CONFIG="/path/to/your/kubeconfig"
# Log directory for backup logs
LOG_DIR="/var/log/jenkins-backup-logs"
# Source and destination directories for rsync
SOURCE="/mnt/jenkins-home-data-pvc" # Source path (e.g., Jenkins PVC)
DESTINATION="/data/starstore/jenkins2/JENKINS-HOME" # Destination path (e.g., NFS mount)
# Mount point directory (e.g., where the NFS is mounted)
MOUNT_POINT="/data/starstore/jenkins2"
# Slack webhook URL for notifications (replace with your actual webhook URL)
SLACK_WEBHOOK_URL="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
These paths define the source and destination for the backup, the log directory for backup logs, and the Slack webhook URL for sending notifications.
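The script also references a handful of logging and Slack helpers (log_info, log_warn, log_error, send_slack_message) that are not part of the configuration block above. Their exact implementation is not shown in this post; a minimal sketch, assuming the loggers simply timestamp lines into $LOG_FILE and the Slack helper posts a JSON payload to the incoming webhook (the log_msg wrapper below is introduced purely for illustration), could look like this:
# Shared logger: append a timestamped, level-tagged line to the current log file
log_msg() {
    echo "$(date +'%Y-%m-%d %H:%M:%S') [$1] $2" >> "$LOG_FILE"
}
log_info()  { log_msg "INFO" "$1"; }
log_warn()  { log_msg "WARN" "$1"; }
log_error() { log_msg "ERROR" "$1"; }
# Post a plain-text message to the Slack incoming webhook
send_slack_message() {
    curl -s -X POST -H 'Content-type: application/json' \
        --data "{\"text\": \"$1\"}" "$SLACK_WEBHOOK_URL" > /dev/null
}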
2. Rsync Functions for Backup
The script uses rsync to perform the backup. It defines two primary functions to handle syncing:
Sync from Source to Destination: This syncs data from the Jenkins home directory to the backup destination.
run_rsync_to_destination() {
    local retries=3
    local count=0
    log_info "Starting rsync from source to destination: rsync -Pavhz --delete $SOURCE/ $DESTINATION/"
    while (( count < retries )); do
        # Trailing slashes sync the directory contents (including dotfiles) and let --delete
        # prune files that no longer exist on the source; stdout and stderr are timestamped
        # and appended to the log file.
        if rsync -Pavhz --delete "$SOURCE"/ "$DESTINATION"/ \
            > >(while read -r line; do echo "$(date +'%Y-%m-%d %H:%M:%S') $line"; done >> "$LOG_FILE") 2>&1; then
            log_info "rsync to destination completed successfully"
            return 0
        else
            log_error "rsync to destination encountered an error, attempt $((count + 1))/$retries"
            count=$((count + 1))
            sleep 60 # Wait for 60 seconds before retrying
        fi
    done
    log_error "rsync to destination failed after $retries attempts"
    send_slack_message "Jenkins Home DIR sync to NFS encountered an error after $retries attempts"
    return 1
}
Sync from Destination to Source: If the Jenkins PVC is not backed by local storage, this function restores data from the backup destination back into the source path.
run_rsync_to_source() {
    local retries=3
    local count=0
    log_info "Starting rsync from destination to source: rsync -Pavhz --delete $DESTINATION/ $SOURCE/"
    while (( count < retries )); do
        if rsync -Pavhz --delete "$DESTINATION"/ "$SOURCE"/ \
            > >(while read -r line; do echo "$(date +'%Y-%m-%d %H:%M:%S') $line"; done >> "$LOG_FILE") 2>&1; then
            log_info "rsync to source completed successfully"
            return 0
        else
            log_error "rsync to source encountered an error, attempt $((count + 1))/$retries"
            count=$((count + 1))
            sleep 60 # Wait for 60 seconds before retrying
        fi
    done
    log_error "rsync to source failed after $retries attempts"
    send_slack_message "Jenkins Home DIR sync to local encountered an error after $retries attempts"
    return 1
}
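Before leaving these functions to run unattended, it is worth verifying the paths and permissions with a one-off dry run; rsync's -n (--dry-run) flag reports what would be transferred without changing anything:
# Dry run: show what would be copied or deleted without touching the destination
rsync -Pavhzn --delete "$SOURCE"/ "$DESTINATION"/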
3. Mount and Storage Class Checks
Before performing the backup, the script verifies the storage class of the Jenkins PVC and checks if the destination is an NFS mount:
# Function to check if the Kubernetes storage class is local-storage
is_local_storage() {
    kubectl --kubeconfig "$CONFIG" get pvc -n jenkins -o=jsonpath='{.items[*].spec.storageClassName}' | grep -q "local-storage"
}
# Function to check if the destination is an NFS mount
is_nfs_mount() {
    if mountpoint -q "$MOUNT_POINT"; then
        if grep "$MOUNT_POINT" /proc/mounts | grep -q nfs; then
            return 0 # It is an NFS mount
        else
            log_error "$MOUNT_POINT is a mount point but not an NFS mount"
            send_slack_message "$MOUNT_POINT is a mount point but not an NFS mount"
            return 1
        fi
    else
        log_error "$MOUNT_POINT is not a mount point"
        send_slack_message "$MOUNT_POINT is not a mount point"
        return 1
    fi
}
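Both checks are easy to reproduce by hand when debugging the script; the commented output line below is illustrative:
# Print the storage class of the PVCs in the jenkins namespace
kubectl --kubeconfig "$CONFIG" get pvc -n jenkins -o=jsonpath='{.items[*].spec.storageClassName}'
# local-storage
# Confirm the destination is a mount point and that the mount is NFS
mountpoint "$MOUNT_POINT"
grep "$MOUNT_POINT" /proc/mounts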
4. Signal Handling and Notifications
The script also handles interruptions (e.g., from SIGINT, SIGTERM, SIGHUP) and sends the last few lines of the log to Slack:
# Function to handle signals (SIGINT, SIGTERM, SIGHUP)
handle_signal() {
    local signal_name="$1"
    log_warn "Received $signal_name signal. Notifying..."
    # Get the last 10 lines of the log file for debugging
    local log_snippet
    log_snippet=$(tail -n 10 "$LOG_FILE")
    # Send the log snippet along with the message
    send_slack_message "rsync process received $signal_name signal and was interrupted. Last log lines:\n\`\`\`$log_snippet\`\`\`"
}
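The post does not show how this handler is wired up; in bash that is normally done with trap near the top of the script, along these lines:
# Forward each signal to handle_signal with a human-readable name
trap 'handle_signal SIGINT' INT
trap 'handle_signal SIGTERM' TERM
trap 'handle_signal SIGHUP' HUP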
5. Main Loop and Timing
Finally, the script runs in a loop, checking the storage configuration and executing the backup. It pauses for 6 minutes between each iteration to allow for regular backups:
main_loop() {
    while true; do
        # Update the log file name each loop iteration to reflect the current date
        LOG_FILE="$LOG_DIR/jenkins-backup-$(date +'%Y-%m-%d').log"
        log_info "Checking if the storage class is local-storage..."
        if is_local_storage; then
            log_info "Checking if $MOUNT_POINT is an NFS mount..."
            if is_nfs_mount; then
                log_info "Storage class is local-storage and $MOUNT_POINT is an NFS mount. Starting rsync from source to destination..."
                if run_rsync_to_destination; then
                    log_info "Sync to destination completed successfully. Waiting for next sync interval..."
                else
                    log_error "Sync to destination failed. Retrying in the next interval."
                fi
            else
                log_warn "Skipping rsync because $MOUNT_POINT is not an NFS mount."
            fi
        else
            log_info "Checking if $MOUNT_POINT is an NFS mount..."
            if is_nfs_mount; then
                log_info "Storage class is not local-storage and $MOUNT_POINT is an NFS mount. Starting rsync from destination to source..."
                if run_rsync_to_source; then
                    log_info "Sync to source completed successfully. Waiting for next sync interval..."
                else
                    log_error "Sync to source failed. Retrying in the next interval."
                fi
            else
                log_warn "Skipping rsync because $MOUNT_POINT is not an NFS mount."
            fi
        fi
        sleep 360 # Sleep for 6 minutes (360 seconds) before the next sync
    done
}
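The full script presumably ends by creating the log directory and starting the loop; a minimal entry point would be:
# Entry point: make sure the log directory exists, then run the loop forever
mkdir -p "$LOG_DIR"
main_loop
Because main_loop never returns, run the script as a long-lived process (for example, a systemd service or a container sidecar) rather than from cron.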
Conclusion
This script automates the backup process for Jenkins home data, ensuring your Jenkins instance is always backed up and recoverable. By using rsync, we can efficiently synchronize data between local storage and NFS, while Slack notifications keep you informed of the backup status. Because the script runs in a continuous loop, launching it once as a long-running service keeps your Jenkins environment well-protected against data loss.