Backing up audits in Apache Solr
In self-managed deployments, audit logs are stored in Apache Solr, which is used by Privacera Portal to display audit logs. If you want to back up the audit logs in Apache Solr, you can use the script provided in this section.
It is highly recommended to configure the Audit Server to send audit logs to external storage, such as GCS, ADLS, or S3.
Prerequisites
Prerequisite | Description |
---|---|
Apache Solr | In self-managed deployments, Apache Solr is installed by default. |
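The backup script in this section also calls curl and jq, so it may be worth confirming that both are installed on the machine that will run it. The following is a minimal sketch of such a check; the function names `missing_tools` and `preflight` are hypothetical and are not part of the Privacera script.

```shell
#!/bin/bash
# Hypothetical helper: print the name of every listed command
# that cannot be found on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "${tool}" >/dev/null 2>&1; then
            echo "${tool}"
        fi
    done
}

# Hypothetical pre-flight check for the two tools the backup script calls.
preflight() {
    local missing
    missing=$(missing_tools curl jq | tr '\n' ' ')
    if [ -n "${missing}" ]; then
        echo "Missing required tools: ${missing}" >&2
        return 1
    fi
    echo "All required tools are available."
}
```

Calling `preflight` before running the backup prints a confirmation, or lists the missing tools on stderr and returns a non-zero status.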
Backup Script
To manually back up audits from the ranger_audits collection, follow these steps:
- Create a script file named backup_solr_docs.sh with the following content. Set SOLR_URL to your Solr URL and ensure it is reachable from the machine where the script is executed. The script calls curl and jq, so both must be installed on that machine.
Script to Back Up Audits in Apache Solr
backup_solr_docs.sh |
---|
| #!/bin/bash
# Solr configuration
# Update with your Solr URL and make sure it is accessible from the machine where the script is executed
SOLR_URL="https://localhost:8983/solr" # Update with your Solr URL
COLLECTION_NAME="ranger_audits" # Replace with your collection name
DATE_FIELD="evtTime" # Replace with your date field name
# Input arguments: start and end date
START_DATE=$1
END_DATE=$2
OUTPUT_DIR=${3:-"./solr_backup"}
# Create output directory if it doesn't exist
mkdir -p "${OUTPUT_DIR}"
# Function to get the total number of documents
get_total_documents() {
# Build the date-range filter from the start and end date arguments
FILTER="${DATE_FIELD}:[${START_DATE}T00:00:00Z TO ${END_DATE}T23:59:59Z]"
RESPONSE=$(curl -s -G "${SOLR_URL}/${COLLECTION_NAME}/select" --data-urlencode "q=*:*" --data-urlencode "fq=${FILTER}" --data "rows=0")
TOTAL_DOCS=$(echo "${RESPONSE}" | jq '.response.numFound')
if [[ -z "${TOTAL_DOCS}" || "${TOTAL_DOCS}" == "null" ]]; then
# Report on stderr so the message is not captured as the document count
echo "Error: Unable to determine total documents. Check query or Solr response." >&2
return 1
fi
echo "${TOTAL_DOCS}"
}
# Backup documents with pagination
backup_documents() {
# Stop if the document count could not be determined
TOTAL_DOCS=$(get_total_documents) || exit 1
echo "Total documents to back up: ${TOTAL_DOCS}"
# Solr pagination parameters
BATCH_SIZE=1000
START=0
while [ "${START}" -lt "${TOTAL_DOCS}" ]; do
echo "Backing up documents ${START} to $((${START} + ${BATCH_SIZE}))..."
# Query parameters are pre-encoded to handle special characters; the page size is set only by BATCH_SIZE
QUERY="fq=${DATE_FIELD}%3A%5B${START_DATE}T00%3A00%3A00Z%20TO%20${END_DATE}T23%3A59%3A59Z%5D&indent=true&q.op=OR&q=*%3A*&sort=${DATE_FIELD}%20desc&start=${START}&rows=${BATCH_SIZE}"
# Define output file
OUTPUT_FILE="${OUTPUT_DIR}/backup_${START}_to_$((${START} + ${BATCH_SIZE})).json"
# Execute curl with the encoded query; -f makes curl return non-zero on HTTP errors
if curl -sf "${SOLR_URL}/${COLLECTION_NAME}/select?${QUERY}" -o "${OUTPUT_FILE}"; then
echo "Backup saved to ${OUTPUT_FILE}"
else
echo "Error: Backup failed for documents ${START} to $((${START} + ${BATCH_SIZE}))"
fi
# Increment start for the next batch
START=$((START + BATCH_SIZE))
done
echo "Backup completed. Files saved in ${OUTPUT_DIR}"
}
# Main execution
if [ -z "${START_DATE}" ] || [ -z "${END_DATE}" ]; then
echo "Usage: $0 <start_date> <end_date> [output_dir]"
exit 1
fi
backup_documents
|
- Make the Script Executable
Bash |
---|
| chmod +x backup_solr_docs.sh
|
- Run the Script
Bash |
---|
| # Usage: ./backup_solr_docs.sh <start_date> <end_date> [output_dir], with dates in YYYY-MM-DD format
./backup_solr_docs.sh 2024-01-01 2024-12-31 /home/privacera
|
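After the run, the output directory should contain one JSON file per batch, named after the batch offsets (for example, backup_0_to_1000.json). The sketch below shows one way to confirm that no batch file is missing or empty. It assumes the script's default batch size of 1000 and takes the document count that the script prints as "Total documents to back up". The helper names `expected_backup_files` and `check_backup_dir` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the file names the backup script writes
# for a given document count (batch size defaults to 1000).
expected_backup_files() {
    local total_docs=$1
    local batch_size=${2:-1000}
    local start=0
    while [ "${start}" -lt "${total_docs}" ]; do
        echo "backup_${start}_to_$((start + batch_size)).json"
        start=$((start + batch_size))
    done
}

# Hypothetical helper: report every expected file that is missing or empty.
# Returns 0 when all files are present and non-empty, 1 otherwise.
check_backup_dir() {
    local output_dir=$1
    local total_docs=$2
    local missing=0
    local name
    for name in $(expected_backup_files "${total_docs}"); do
        if [ ! -s "${output_dir}/${name}" ]; then
            echo "Missing or empty: ${output_dir}/${name}"
            missing=$((missing + 1))
        fi
    done
    [ "${missing}" -eq 0 ]
}
```

For the run shown above, `check_backup_dir /home/privacera <total_docs>` would list any batch files that need to be fetched again. The check only confirms that the files exist and are non-empty; it does not validate their contents.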