Backup & Recovery
This page describes the backup strategies and disaster recovery procedures available for AutoMem deployments. It covers the three-layer backup architecture, automated backup methods, configuration options, backup formats, and four recovery paths for different failure scenarios. For monitoring backup health and detecting data drift, see Health Monitoring.
Backup Architecture Overview
AutoMem implements a defense-in-depth backup strategy with three independent layers:
graph TB
subgraph "Layer 1: Persistent Volumes"
RailwayVol["Railway Volume Snapshots<br/>Automatic every 24h<br/>Recovery: 5 minutes"]
VolumeFeatures["- One-click restore<br/>- Railway Dashboard access<br/>- Included with Railway Pro<br/>- FalkorDB only"]
end
subgraph "Layer 2: Dual Storage Redundancy"
FalkorDB["FalkorDB<br/>Canonical record<br/>Graph + metadata"]
Qdrant["Qdrant<br/>Backup record<br/>Vectors + payloads"]
RecoverQdrant["recover_from_qdrant.py<br/>Recovery: 10 minutes<br/>Success rate: 99.7%"]
FalkorDB <-->|"Redundancy"| Qdrant
Qdrant --> RecoverQdrant
RecoverQdrant --> FalkorDB
end
subgraph "Layer 3: Automated Backups"
GitHubWorkflow[".github/workflows/backup.yml<br/>Every 6 hours<br/>Free tier: 2000 min/month"]
BackupScript["backup_automem.py<br/>Compressed JSON exports"]
LocalBackups["./backups/<br/>Last 7-14 backups"]
S3Backups["S3 Storage<br/>Cross-region replication<br/>Recovery: 30 minutes"]
GitHubWorkflow --> BackupScript
BackupScript --> LocalBackups
BackupScript --> S3Backups
end
Failure["Data Loss Event"] --> Layer1Check{"Layer 1<br/>Available?"}
Layer1Check -->|Yes| RailwayVol
Layer1Check -->|No| Layer2Check{"Layer 2<br/>Available?"}
Layer2Check -->|Yes| RecoverQdrant
Layer2Check -->|No| Layer3Check{"Layer 3<br/>Available?"}
Layer3Check -->|Yes| S3Backups
Layer3Check -->|No| DataLoss["Complete data loss<br/>Start fresh"]
Layer Responsibilities
| Layer | Mechanism | Recovery Speed | Scope | Platform Lock |
|---|---|---|---|---|
| Infrastructure | Railway volume snapshots | Instant | FalkorDB only | Yes (Railway) |
| Dual Storage | Real-time dual writes | Immediate | Both databases | No |
| Application | Script exports | Minutes | Both databases | No |
| Automated | Scheduled execution | N/A (prevention) | Both databases | No |
Infrastructure Backups (Layer 1)
Railway Volume Snapshots
Railway provides automatic volume backups for the FalkorDB persistent volume configured in the deployment template.
The FalkorDB service uses a persistent volume mounted at /var/lib/falkordb/data. The Redis persistence settings ensure data durability with the following configuration:
- RDB snapshots every 60 seconds if at least 1 write
- AOF (Append-Only File) for write-ahead logging
- fsync every second for durability
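As a quick verification, these persistence settings can be read back over the Redis protocol. A minimal sketch using redis-py (the values in the comments are what the configuration above should produce, but confirm against your own REDIS_ARGS):

```python
import os

import redis

# Connect to FalkorDB (Redis-compatible) using the same variables the backup scripts use.
r = redis.Redis(
    host=os.environ["FALKORDB_HOST"],
    port=int(os.environ.get("FALKORDB_PORT", "6379")),
    password=os.environ.get("FALKORDB_PASSWORD"),
    decode_responses=True,
)

# CONFIG GET returns the live persistence settings.
print(r.config_get("save"))         # e.g. {'save': '60 1'}            -> RDB snapshot every 60s if >=1 write
print(r.config_get("appendonly"))   # e.g. {'appendonly': 'yes'}       -> AOF enabled
print(r.config_get("appendfsync"))  # e.g. {'appendfsync': 'everysec'} -> fsync every second
```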
Accessing and restoring snapshots:
- Railway Dashboard → FalkorDB service
- “Backups” tab shows snapshot history
- One-click restore from any snapshot
Limitations:
- Only covers FalkorDB (Qdrant not included)
- Cannot export or download backups
- Platform-locked to Railway
- Best for quick recovery from recent failures
Application-Level Backups (Layer 3)
backup_automem.py Script
The core backup script at scripts/backup_automem.py exports data from both FalkorDB and Qdrant to compressed JSON files.
Backup Format
FalkorDB Export Structure:
The FalkorDB export captures the entire Redis keyspace including:
- Memory nodes with all properties
- Relationship edges
- Metadata and indices
- Graph structure information
Qdrant Export Structure:
The Qdrant export includes:
- Vector embeddings (768-dimensional or 1024-dimensional depending on provider)
- Payload data (memory content, metadata, tags)
- Point IDs mapped to memory IDs
- Collection configuration
Both exports are compressed using gzip with .json.gz extension, typically achieving 70-80% compression ratio.
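The compression step is plain gzip over the serialized JSON. A minimal sketch of writing one timestamped export (not the exact backup_automem.py implementation, but the same idea):

```python
import gzip
import json
from datetime import datetime, timezone
from pathlib import Path

def write_export(records: list, output_dir: str, prefix: str) -> Path:
    """Serialize records to JSON and gzip them into a timestamped file."""
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    out_dir = Path(output_dir) / prefix
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{prefix}_{timestamp}.json.gz"
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        json.dump(records, fh)
    return path

# Example: write_export(qdrant_points, "./backups", "qdrant")
# produces ./backups/qdrant/qdrant_20251020_143000.json.gz
```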
Command-Line Usage
```bash
# Basic backup - creates timestamped backups in ./backups/falkordb/ and ./backups/qdrant/
python scripts/backup_automem.py

# With retention policy - deletes backups older than 7 days after creating new ones
python scripts/backup_automem.py --cleanup --keep 7

# Custom directory
python scripts/backup_automem.py --output /path/to/backups

# With S3 upload - requires boto3 and AWS credentials set via environment variables
python scripts/backup_automem.py --s3-bucket automem-backups
```
Local Filesystem Storage
Backups are written to timestamped subdirectories:
```
backups/
├── falkordb/
│   ├── falkordb_20251020_143000.json.gz
│   ├── falkordb_20251020_203000.json.gz
│   └── ...
└── qdrant/
    ├── qdrant_20251020_143000.json.gz
    ├── qdrant_20251020_203000.json.gz
    └── ...
```
The --cleanup --keep N flag removes backups older than N days based on filename timestamp parsing.
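A sketch of how that timestamp-based cleanup can be implemented (the script's actual logic may differ in details):

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path

def cleanup_old_backups(backup_dir: str, keep_days: int) -> None:
    """Delete backups whose filename timestamp is older than keep_days."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=keep_days)
    for path in Path(backup_dir).glob("**/*.json.gz"):
        # Filenames look like falkordb_20251020_143000.json.gz
        stem = path.name[: -len(".json.gz")]
        try:
            _, date_part, time_part = stem.rsplit("_", 2)
            created = datetime.strptime(f"{date_part}_{time_part}", "%Y%m%d_%H%M%S")
        except ValueError:
            continue  # skip files that don't follow the naming scheme
        if created.replace(tzinfo=timezone.utc) < cutoff:
            path.unlink()
```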
S3 Cloud Storage
```
s3://automem-backups/
├── falkordb/
│   ├── falkordb_20251020_143000.json.gz
│   └── ...
└── qdrant/
    ├── qdrant_20251020_143000.json.gz
    └── ...
```
S3 Cost Estimation:
| Component | Formula | Example |
|---|---|---|
| Storage | $0.023/GB/month | 100MB backup = $0.0023/month |
| PUT requests | $0.005/1000 requests | 4 backups/day = $0.60/year |
| GET requests (restore) | $0.0004/1000 requests | Negligible |
| Total (100MB, every 6h) | - | ~$0.30/month |
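When --s3-bucket is set, the upload itself is a single boto3 call per file; a sketch (key layout assumed to mirror the falkordb/ and qdrant/ prefixes above):

```python
from pathlib import Path

import boto3  # credentials and region come from the AWS_* environment variables

def upload_backup(local_path: str, bucket: str = "automem-backups") -> str:
    """Upload a compressed backup, keeping its falkordb/ or qdrant/ prefix as the S3 key."""
    path = Path(local_path)
    key = f"{path.parent.name}/{path.name}"  # e.g. qdrant/qdrant_20251020_143000.json.gz
    boto3.client("s3").upload_file(str(path), bucket, key)
    return f"s3://{bucket}/{key}"
```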
Automated Backup Methods (Layer 4)
GitHub Actions Workflow
The recommended automation method uses GitHub Actions to run backups on a schedule without consuming Railway resources.
sequenceDiagram
participant GHA as "GitHub Actions<br/>backup.yml"
participant TCP as "Railway TCP Proxy<br/>monorail.proxy.rlwy.net"
participant FDB as "FalkorDB<br/>:6379"
participant QDR as "Qdrant Cloud<br/>HTTPS API"
participant Script as "backup_automem.py"
participant S3 as "S3 Bucket<br/>automem-backups"
Note over GHA: Triggered every 6 hours<br/>or manually via workflow_dispatch
GHA->>GHA: Checkout code
GHA->>GHA: Install dependencies<br/>requirements.txt + boto3
Note over GHA,TCP: Connectivity check (critical)
GHA->>GHA: Validate FALKORDB_HOST<br/>Must NOT be *.railway.internal
GHA->>TCP: Test TCP connection<br/>timeout 10s
TCP->>FDB: Forward connection
FDB-->>TCP: Connection OK
TCP-->>GHA: ✅ Connectivity verified
Note over GHA,S3: Backup execution
GHA->>Script: Execute with env vars<br/>FALKORDB_*, QDRANT_*, AWS_*
Script->>TCP: Connect via TCP Proxy
TCP->>FDB: Redis protocol commands
FDB-->>TCP: Export graph data
TCP-->>Script: Graph JSON
Script->>QDR: HTTPS GET /collections/memories/points
QDR-->>Script: Vector data with payloads
Script->>Script: Compress to .json.gz<br/>backups/falkordb/<br/>backups/qdrant/
alt S3 Upload Enabled
Script->>S3: Upload via boto3<br/>s3://automem-backups/
S3-->>Script: Upload complete
end
Script-->>GHA: Exit 0 (success)
GHA->>GHA: Log backup summary<br/>File sizes and timestamps
The workflow is defined in .github/workflows/backup.yml and triggers every 6 hours or manually via workflow_dispatch.
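The connectivity check exists because GitHub runners cannot resolve Railway's private hostnames; the backup must go through the public TCP proxy. A Python sketch of an equivalent pre-flight check (the workflow itself may implement this step differently):

```python
import os
import socket
import sys

host = os.environ["FALKORDB_HOST"]
port = int(os.environ.get("FALKORDB_PORT", "6379"))

# *.railway.internal only resolves inside Railway's private network, never from GitHub runners.
if host.endswith(".railway.internal"):
    sys.exit(f"FALKORDB_HOST={host} is a private hostname; use the TCP proxy domain instead")

# Plain TCP connect with a 10-second timeout, mirroring the workflow's check.
try:
    with socket.create_connection((host, port), timeout=10):
        print(f"Connectivity to {host}:{port} verified")
except OSError as exc:
    sys.exit(f"Cannot reach {host}:{port}: {exc}")
```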
Required GitHub Secrets
| Secret | Purpose | Example | Used By |
|---|---|---|---|
| FALKORDB_HOST | Railway TCP proxy domain | monorail.proxy.rlwy.net | redis.Redis() connection |
| FALKORDB_PORT | Railway TCP proxy port | 12345 | redis.Redis() connection |
| FALKORDB_PASSWORD | FalkorDB authentication | Generated by Railway | redis.Redis(password=) |
| QDRANT_URL | Qdrant endpoint | https://xyz.qdrant.io | QdrantClient(url=) |
| QDRANT_API_KEY | Qdrant authentication | API key from Qdrant Cloud | QdrantClient(api_key=) |
| AWS_ACCESS_KEY_ID | S3 upload (optional) | AWS credentials | boto3.client('s3') |
| AWS_SECRET_ACCESS_KEY | S3 upload (optional) | AWS credentials | boto3.client('s3') |
| AWS_DEFAULT_REGION | S3 region (optional) | us-east-1 | boto3.client('s3', region_name=) |
The TCP proxy endpoint is found in Railway Dashboard → FalkorDB service → Settings → Networking → TCP Proxy.
Railway Backup Service
For users who prefer Railway-hosted backups, scripts/Dockerfile.backup provides a containerized backup service that runs continuously.
The Dockerfile defines a Python 3.11 Alpine container that installs dependencies, copies the backup script, creates the output directory, and runs an infinite loop with backup and sleep cycles.
Railway deployment configuration:
- Builder: Dockerfile
- Dockerfile Path: scripts/Dockerfile.backup
- Root Directory: / (project root)
- Environment Variables: Same as memory-service: FALKORDB_HOST, FALKORDB_PORT, FALKORDB_PASSWORD, QDRANT_URL, QDRANT_API_KEY, plus optional AWS credentials
Resource usage: Approximately $1-2/month on Railway Pro (minimal CPU/memory during sleep cycles).
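Conceptually, the container's entrypoint is just a backup-then-sleep loop. A sketch of the equivalent logic in Python (the actual Dockerfile may express this as a shell loop, and the interval here is an assumption matching the 6-hour schedule):

```python
import subprocess
import time

BACKUP_INTERVAL_SECONDS = 6 * 60 * 60  # assumed interval, matching the scheduled workflow

while True:
    # Same script and flags as the other automation paths; env vars are inherited from Railway.
    subprocess.run(
        ["python", "scripts/backup_automem.py", "--cleanup", "--keep", "7"],
        check=False,  # a failed run should not kill the service; it retries on the next cycle
    )
    time.sleep(BACKUP_INTERVAL_SECONDS)
```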
Backup Configuration Reference
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| FALKORDB_HOST | Yes | - | FalkorDB hostname or IP |
| FALKORDB_PORT | Yes | 6379 | FalkorDB Redis port |
| FALKORDB_PASSWORD | Yes | - | FalkorDB authentication password |
| FALKORDB_GRAPH | No | memories | Graph database name |
| QDRANT_URL | Yes* | - | Qdrant endpoint URL |
| QDRANT_API_KEY | Yes* | - | Qdrant API authentication |
| QDRANT_COLLECTION | No | memories | Qdrant collection name |
| AWS_ACCESS_KEY_ID | No | - | For S3 upload |
| AWS_SECRET_ACCESS_KEY | No | - | For S3 upload |
| AWS_DEFAULT_REGION | No | us-east-1 | S3 region |
*Qdrant variables optional if system is running without vector storage.
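These variables feed directly into the database clients the scripts construct. A sketch of the typical wiring (using redis-py and qdrant-client, the libraries referenced in the secrets table above):

```python
import os

import redis
from qdrant_client import QdrantClient

# FalkorDB speaks the Redis protocol, so redis-py connects to it directly.
falkordb = redis.Redis(
    host=os.environ["FALKORDB_HOST"],
    port=int(os.environ.get("FALKORDB_PORT", "6379")),
    password=os.environ["FALKORDB_PASSWORD"],
    decode_responses=True,
)
graph_name = os.environ.get("FALKORDB_GRAPH", "memories")

# Qdrant is optional: skip it when running without vector storage.
qdrant = None
if os.environ.get("QDRANT_URL"):
    qdrant = QdrantClient(
        url=os.environ["QDRANT_URL"],
        api_key=os.environ.get("QDRANT_API_KEY"),
    )
collection = os.environ.get("QDRANT_COLLECTION", "memories")
```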
Retention Policy Recommendations
| Use Case | Backup Frequency | Retention Period | Storage Location | Estimated Cost |
|---|---|---|---|---|
| Personal/Development | Every 24 hours | 7 days | Local only | $0 |
| Team/Small Production | Every 6 hours | 14 days | Local + S3 | ~$0.50/month |
| Production | Every 1-6 hours | 30 days | S3 with versioning | ~$2-5/month |
| Enterprise | Every 1 hour | 90 days + archive | S3 + cross-region | ~$10-20/month |
Backup Method Comparison
| Method | Scope | Speed | Automation | Platform Lock | Cost | Best For |
|---|---|---|---|---|---|---|
| Railway Volumes | FalkorDB only | Instant | Automatic | Yes | Included | Quick recovery |
| GitHub Actions | Both databases | 5-10 min | Scheduled | No | Free | Most users |
| Railway Service | Both databases | 5-10 min | Continuous | Partial | $1-2/mo | Railway-centric |
| Manual Script | Both databases | 5-10 min | Manual | No | Free | Development |
Disaster Recovery
Recovery Decision Matrix
| Failure Scenario | Data Available | Recovery Method | Primary Tool | RTO |
|---|---|---|---|---|
| FalkorDB data loss | Qdrant intact | Qdrant-based rebuild | recover_from_qdrant.py | 2-5 min |
| FalkorDB persistence disabled | Qdrant intact | Qdrant-based rebuild | recover_from_qdrant.py | 2-5 min |
| Qdrant data loss | FalkorDB intact | Background re-embedding | Enrichment queue | 30-60 min |
| Both databases corrupted | Backup files (S3/local) | File restoration | restore_from_backup.py + recovery | 10-20 min |
| Railway volume failure | Railway snapshots | Volume restore | Railway dashboard | 5-10 min |
| Drift detected (5-50%) | Both databases available | Selective sync | health_monitor.py --auto-recover | 1-2 min |
Recovery Decision Tree
flowchart TD
Start["Data Loss Detected"] --> Assess["Assess Damage"]
Assess --> FalkorLost{"FalkorDB<br/>lost?"}
Assess --> QdrantLost{"Qdrant<br/>lost?"}
FalkorLost -->|Yes| QdrantIntact{"Qdrant<br/>intact?"}
FalkorLost -->|No| NoAction1["No action needed"]
QdrantIntact -->|Yes| RecoverQdrant["Use recover_from_qdrant.py<br/>⚡ 10 min / 99.7% success"]
QdrantIntact -->|No| BothLost["Both databases lost"]
QdrantLost -->|Yes| FalkorIntact{"FalkorDB<br/>intact?"}
QdrantLost -->|No| NoAction2["No action needed"]
FalkorIntact -->|Yes| RebuildQdrant["Rebuild Qdrant from FalkorDB<br/>POST /admin/reembed<br/>⚡ 15-20 min"]
FalkorIntact -->|No| BothLost
BothLost --> RailwayBackup{"Railway volume<br/>backups?"}
RailwayBackup -->|Yes| RailwayRestore["Railway Dashboard restore<br/>⚡ 5 min / FalkorDB only<br/>Then rebuild Qdrant"]
RailwayBackup -->|No| S3Backup{"S3 or local<br/>backups?"}
S3Backup -->|Yes| S3Restore["1. Restore Qdrant from S3<br/>2. recover_from_qdrant.py<br/>⏱️ 30 min total"]
S3Backup -->|No| CompleteFailure["Complete data loss<br/>Start fresh"]
RecoverQdrant --> Verify["Verify Integrity<br/>Compare counts<br/>Check sample memories"]
RebuildQdrant --> Verify
RailwayRestore --> Verify
S3Restore --> Verify
Method 1: Qdrant-Based Recovery
This is the fastest and most reliable recovery method. It uses Qdrant’s vector payloads to rebuild the entire FalkorDB graph structure.
When to use:
- FalkorDB lost all data (container restart, persistence misconfiguration)
- FalkorDB corrupted but Qdrant intact
- Health monitor detects >50% drift with Qdrant having more data
Recovery process details:
| Function | Purpose | Code Location |
|---|---|---|
| qdrant_client.scroll() | Fetch all vectors with payloads | Qdrant SDK call |
| _filter_reserved_fields() | Remove type, confidence from metadata | scripts/recover_from_qdrant.py |
| MERGE (m:Memory) | Rebuild memory nodes | Cypher query in recovery loop |
| Relationship extraction | Parse metadata.relationships array | Recovery loop logic |
Execution steps:
```bash
# Step 1: Verify Qdrant availability
curl https://your-qdrant-url/collections/memories

# Step 2: Run recovery script with database environment variables
FALKORDB_HOST=falkordb.railway.internal \
FALKORDB_PORT=6379 \
FALKORDB_PASSWORD=your-password \
QDRANT_URL=https://your-qdrant-url \
QDRANT_API_KEY=your-key \
python scripts/recover_from_qdrant.py

# Step 3: Monitor recovery progress in stdout
```
Known limitations:
- 99.7% recovery rate: In testing, 2/780 memories failed due to malformed data
- Relationship loss: If relationships were stored only in FalkorDB (not in Qdrant payload), they won’t be recovered
- Recent writes: Memories written in the last 2 seconds (before embedding completes) may be missing from Qdrant
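At its core, the recovery is a scroll over every Qdrant point followed by idempotent Cypher MERGEs into FalkorDB. A simplified sketch of that loop (assuming the falkordb and qdrant-client Python libraries; payload field names follow the structure described on this page, and the real script additionally filters reserved fields and restores relationships):

```python
import os

from falkordb import FalkorDB
from qdrant_client import QdrantClient

qdrant = QdrantClient(url=os.environ["QDRANT_URL"], api_key=os.environ.get("QDRANT_API_KEY"))
db = FalkorDB(
    host=os.environ["FALKORDB_HOST"],
    port=int(os.environ.get("FALKORDB_PORT", "6379")),
    password=os.environ["FALKORDB_PASSWORD"],
)
graph = db.select_graph(os.environ.get("FALKORDB_GRAPH", "memories"))

offset, recovered = None, 0
while True:
    # Scroll through all points; vectors are not needed to rebuild the graph.
    points, offset = qdrant.scroll(
        collection_name=os.environ.get("QDRANT_COLLECTION", "memories"),
        limit=256,
        offset=offset,
        with_payload=True,
        with_vectors=False,
    )
    for point in points:
        payload = point.payload or {}
        # MERGE keeps the rebuild idempotent if recovery has to be re-run.
        graph.query(
            "MERGE (m:Memory {id: $id}) "
            "SET m.content = $content, m.importance = $importance, m.created_at = $created_at",
            params={
                "id": payload.get("memory_id", str(point.id)),
                "content": payload.get("content", ""),
                "importance": payload.get("importance"),
                "created_at": payload.get("created_at"),
            },
        )
        recovered += 1
    if offset is None:  # Qdrant signals the end of the scroll with a null offset
        break

print(f"Recovered {recovered} memories into FalkorDB")
```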
Method 2: Backup File Restoration
Used when both databases are lost or corrupted. Restores from compressed JSON backups stored locally or in S3.
When to use:
- Both FalkorDB and Qdrant lost
- Recovery from specific point in time needed
- Testing disaster recovery procedures
Backup file structure:
```
backups/
├── falkordb/
│   ├── falkordb_20251020_143000.json.gz
│   └── falkordb_20251020_083000.json.gz
└── qdrant/
    ├── qdrant_20251020_143000.json.gz
    └── qdrant_20251020_083000.json.gz
```
Each Qdrant backup contains an array of point objects with id, vector (a 768- or 1024-dimensional float array, depending on the embedding provider), and payload (content, memory_id, tags, importance, type, created_at).
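A sketch of what inspecting one of these Qdrant backup files looks like before restoring it (restore_from_backup.py's actual interface may differ):

```python
import gzip
import json

# Load a compressed Qdrant export and sanity-check its contents.
with gzip.open("backups/qdrant/qdrant_20251020_143000.json.gz", "rt", encoding="utf-8") as fh:
    points = json.load(fh)

print(f"{len(points)} points in backup")
sample = points[0]
print(sample["id"], len(sample["vector"]), sample["payload"]["content"][:80])
# The restore script then upserts each point back into the Qdrant collection.
```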
Restoration Flow
sequenceDiagram
participant Operator
participant S3 as S3 Bucket / Local
participant Restore as restore_from_backup.py
participant Qdrant
participant Recovery as recover_from_qdrant.py
participant FalkorDB
Operator->>S3: Download backup<br/>aws s3 cp or local copy
Operator->>Restore: Execute with backup path
Restore->>Restore: Decompress JSON.gz<br/>gzip.open()
Restore->>Restore: Parse JSON array<br/>json.load()
loop For each point
Restore->>Qdrant: Upsert point<br/>qdrant_client.upsert()
end
Restore-->>Operator: Qdrant restoration complete<br/>780 points restored
Operator->>Recovery: python recover_from_qdrant.py
Recovery->>Qdrant: Scroll all points
Recovery->>FalkorDB: Rebuild graph
Recovery-->>Operator: Full recovery complete
Execution steps:
```bash
# Step 1: Download backup files from S3
aws s3 cp s3://automem-backups/qdrant/qdrant_20251020_143000.json.gz ./

# Step 2: Decompress backup
gunzip qdrant_20251020_143000.json.gz

# Step 3: Restore to Qdrant
python scripts/restore_from_backup.py --file qdrant_20251020_143000.json

# Step 4: Rebuild FalkorDB from Qdrant
python scripts/recover_from_qdrant.py
```
Recovery Time Estimate
| Database Size | Decompress | Qdrant Upload | FalkorDB Rebuild | Total |
|---|---|---|---|---|
| 100 memories | <1 sec | 5 sec | 10 sec | ~15 sec |
| 1,000 memories | 2 sec | 30 sec | 60 sec | ~2 min |
| 10,000 memories | 10 sec | 5 min | 10 min | ~15 min |
| 100,000 memories | 30 sec | 30 min | 60 min | ~90 min |
Method 3: Railway Volume Restore
Uses Railway’s built-in volume snapshots for instant recovery of FalkorDB data.
When to use:
- FalkorDB volume corruption
- Need to rollback to specific snapshot
- Quick recovery without external dependencies
Restoration steps:
- Log in to Railway dashboard, navigate to FalkorDB service, click “Volumes” tab
- View available snapshots (sorted by date), note timestamp and size
- Click “Restore” next to chosen snapshot and confirm (irreversible action)
- Railway stops the FalkorDB service, replaces volume with snapshot data, restarts the service
- Verify service health with curl https://your-automem-url/health
Limitations:
- Railway-locked: Cannot export snapshots outside Railway platform
- FalkorDB only: Does not restore Qdrant data
- Snapshot frequency: Default 24-hour intervals (may lose up to 24 hours of data)
- Retention policy: Depends on Railway plan (Pro plan: 30 days)
Method 4: Drift-Based Selective Recovery
Uses the health monitor to detect and automatically fix inconsistencies between FalkorDB and Qdrant without full recovery.
When to use:
- Health monitor detects 5-50% drift
- Both databases online but inconsistent
- Partial write failures suspected
- Preventive maintenance
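Under the hood, selective recovery starts from a count comparison between the two stores and re-syncs only the difference; a simplified sketch of that comparison (assuming the falkordb and qdrant-client libraries), followed by the commands that enable it:

```python
import os

from falkordb import FalkorDB
from qdrant_client import QdrantClient

db = FalkorDB(
    host=os.environ["FALKORDB_HOST"],
    port=int(os.environ.get("FALKORDB_PORT", "6379")),
    password=os.environ["FALKORDB_PASSWORD"],
)
graph = db.select_graph(os.environ.get("FALKORDB_GRAPH", "memories"))
qdrant = QdrantClient(url=os.environ["QDRANT_URL"], api_key=os.environ.get("QDRANT_API_KEY"))

# Count memories on each side and report drift as a percentage.
graph_count = graph.query("MATCH (m:Memory) RETURN count(m)").result_set[0][0]
qdrant_count = qdrant.count(collection_name=os.environ.get("QDRANT_COLLECTION", "memories")).count

drift = abs(graph_count - qdrant_count) / max(graph_count, qdrant_count, 1) * 100
print(f"FalkorDB={graph_count} Qdrant={qdrant_count} drift={drift:.1f}%")
# Drift above the 5% threshold is what triggers selective sync in the health monitor.
```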
```bash
# Enable auto-recovery on health monitor service
HEALTH_MONITOR_AUTO_RECOVER=true

# Or trigger manual recovery
python scripts/health_monitor.py --auto-recover
```
Recovery Validation
After any recovery procedure, verify system integrity before returning to production.
graph TB
Start[Recovery Complete]
Start --> V1{1. Database<br/>Connectivity}
V1 -->|Pass| V2{2. Memory<br/>Count Match}
V1 -->|Fail| Fix1[Check connection strings<br/>Verify credentials]
V2 -->|Pass| V3{3. Relationship<br/>Integrity}
V2 -->|Fail| Fix2[Re-run recovery<br/>Check for errors]
V3 -->|Pass| V4{4. Embedding<br/>Coverage}
V3 -->|Fail| Fix3[Investigate missing edges<br/>Check entity extraction]
V4 -->|Pass| V5{5. Search<br/>Functionality}
V4 -->|Fail| Fix4[Re-run embedding worker<br/>Verify Qdrant indexes]
V5 -->|Pass| Complete[✓ Validation Complete<br/>Resume Operations]
V5 -->|Fail| Fix5[Test vector search<br/>Rebuild Qdrant collection]
Fix1 --> V1
Fix2 --> V2
Fix3 --> V3
Fix4 --> V4
Fix5 --> V5
Validation Procedures
1. Database Connectivity:
```bash
curl https://your-automem-url/health
```
2. Memory Count Verification:
```bash
# Check FalkorDB count matches Qdrant count
curl https://your-automem-url/health | jq '.statistics'
```
3. Relationship Integrity:
```bash
# Use analyze endpoint to check relationship distribution
curl https://your-automem-url/analyze | jq '.relationship_types'
```
4. Embedding Coverage:
```bash
# Verify all memories have embeddings
curl https://your-automem-url/analyze | jq '.embedding_coverage'
```
5. Search Functionality:
```bash
# Test recall with a known query
curl -X GET "https://your-automem-url/recall?query=test" \
  -H "Authorization: Bearer your-api-token"
```
Post-Recovery Monitoring
Monitor the system for 24 hours after recovery:
| Metric | Check Frequency | Threshold | Action if Exceeded |
|---|---|---|---|
| Drift percentage | Every 5 min | >5% | Investigate write failures |
| Enrichment queue depth | Every 15 min | >100 | Check worker health |
| Embedding queue depth | Every 15 min | >500 | Verify OpenAI API key |
| API error rate | Every 5 min | >1% | Check logs for errors |
| Response time (p95) | Every 15 min | >2s | Investigate slow queries |
Recovery Time Objectives (RTO)
| Scenario | Method | Detection | Execution | Validation | Total RTO |
|---|---|---|---|---|---|
| FalkorDB lost, Qdrant intact | Qdrant recovery | Immediate | 2-3 min | 1 min | 3-4 min |
| FalkorDB persistence disabled | Qdrant recovery | Immediate | 2-3 min | 1 min | 3-4 min |
| Both databases corrupted | Backup restore | Varies | 5-10 min | 2 min | 7-12 min |
| Railway volume failure | Volume restore | Immediate | 3-5 min | 1 min | 4-6 min |
| 5-20% drift detected | Selective sync | Auto (5 min) | 1-2 min | 1 min | 7-8 min |
| Qdrant lost, FalkorDB intact | Re-embedding | Immediate | 30-60 min | 5 min | 35-65 min |
Recovery Point Objective (RPO):
- Best case: 0 seconds (Qdrant recovery, dual storage)
- Typical case: 2 seconds (embedding queue latency)
- Worst case: 6 hours (last automated backup)
Troubleshooting Recovery Failures
Recovery Script Fails with Connection Error
Problem: recover_from_qdrant.py exits with “Failed to connect to FalkorDB”
Solution: Verify FALKORDB_HOST, FALKORDB_PORT, and FALKORDB_PASSWORD environment variables match the FalkorDB service configuration. For Railway deployments, use the internal hostname falkordb.railway.internal (or TCP proxy endpoint for external access).
Qdrant Recovery Shows 0% Success Rate
Problem: Script reports “Recovered 0/780 memories”
Solutions:
- If collection doesn’t exist: Use backup restoration method (Method 2)
- If payloads are missing: Qdrant was used for vectors only; recovery not possible without payloads
Memory Types Corrupted After Recovery
Problem: /analyze shows invalid memory types like “str”, “int”, “boolean”
Cause: Using old version of recover_from_qdrant.py without RESERVED_FIELDS filtering
Solution: Update to v0.5.0+ of AutoMem. If already on old version, run scripts/cleanup_memory_types.py to fix corrupted types.
Drift Persists After Recovery
Problem: Health monitor still reports >5% drift after running recovery
Solutions:
- If FalkorDB has more: Recent Qdrant writes may have failed; check Qdrant API key
- If Qdrant has more: FalkorDB may be read-only; check that REDIS_ARGS includes --save and --appendonly
- If inconsistent: May need bidirectional sync; open a GitHub issue
Testing Recovery Procedures
Regular recovery testing ensures procedures work when needed.
Recommended Testing Schedule
| Environment | Test Frequency | Test Type | Data Source |
|---|---|---|---|
| Development | Weekly | Full recovery | Test data |
| Staging | Monthly | Full recovery | Production replica |
| Production | Quarterly | Validation only | Verify backup integrity |
Verifying Backup Integrity
```bash
# Check backup files exist and have reasonable sizes
ls -lh backups/falkordb/ backups/qdrant/

# Validate memory counts from backup
zcat backups/qdrant/qdrant_latest.json.gz | jq 'length'

# Test decompression
gunzip -t backups/falkordb/falkordb_latest.json.gz && echo "OK"
```