
Admin Operations

Administrative endpoints require elevated privileges (ADMIN_API_TOKEN) for managing enrichment processing and embedding generation. These operations are intended for maintenance, debugging, and bulk data operations.

For standard memory operations (store, recall, update, delete), see Memory Operations. For consolidation scheduling, see Consolidation Operations.


Authentication

Admin operations require dual authentication:

  1. Standard API Token (AUTOMEM_API_TOKEN) — Required for all endpoints except /health
  2. Admin Token (ADMIN_API_TOKEN) — Additional token for privileged operations
| Token Type | Header Methods | Query Parameter | Environment Variable |
| --- | --- | --- | --- |
| API Token | `Authorization: Bearer <token>` or `X-API-Key: <token>` | `?api_key=<token>` | `AUTOMEM_API_TOKEN` |
| Admin Token | `X-Admin-Token: <token>` or `X-Admin-Api-Key: <token>` | `?admin_token=<token>` | `ADMIN_API_TOKEN` |
```mermaid
graph TB
    subgraph "Authentication Layers"
        Layer1["Layer 1: API Token<br/>AUTOMEM_API_TOKEN<br/>Guards all endpoints except /health"]
        Layer2["Layer 2: Admin Token<br/>ADMIN_API_TOKEN<br/>Guards privileged operations"]
    end

    subgraph "Protected Operations"
        Reprocess["/enrichment/reprocess"]
        Reembed["/admin/reembed"]
    end

    subgraph "Public Operations (no auth)"
        Status["/enrichment/status"]
        Health["/health"]
    end

    Layer1 --> Layer2
    Layer2 --> Reprocess
    Layer2 --> Reembed
```
| Status Code | Response | Meaning |
| --- | --- | --- |
| 401 | `{"error": "Unauthorized"}` | Missing or invalid AUTOMEM_API_TOKEN |
| 401 | `{"error": "Admin authorization required"}` | Missing or invalid ADMIN_API_TOKEN |
| 403 | `{"error": "Admin token not configured"}` | Server has no ADMIN_API_TOKEN environment variable set |
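The error table above can be sketched as a small authorization helper. This is a hypothetical reconstruction of the server-side check; the function name `check_admin_auth` and its return shape are assumptions, not the service's actual API.

```python
# Hypothetical sketch of the dual-token admin check implied by the error
# table above; names and return shape are assumptions.
import hmac
import os

def check_admin_auth(headers: dict, query: dict):
    """Return (status, body) on failure, or None if the request is authorized."""
    expected = os.environ.get("ADMIN_API_TOKEN")
    if not expected:
        # Server-side misconfiguration: admin operations are disabled entirely.
        return 403, {"error": "Admin token not configured"}
    supplied = (
        headers.get("X-Admin-Token")
        or headers.get("X-Admin-Api-Key")
        or query.get("admin_token")
    )
    # Constant-time comparison avoids leaking token prefixes via timing.
    if not supplied or not hmac.compare_digest(supplied, expected):
        return 401, {"error": "Admin authorization required"}
    return None
```

Note that the 403 case fires before any token comparison: if the server has no ADMIN_API_TOKEN configured, no supplied token can succeed.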

GET /enrichment/status

Authentication: None required

Purpose: Monitor the enrichment pipeline’s health and processing statistics. This endpoint provides visibility into background processing without requiring authentication.

```json
{
  "status": "running",
  "queue_depth": 3,
  "pending": 2,
  "inflight": 1,
  "processed": 1247,
  "failed": 3
}
```
| Field | Type | Description |
| --- | --- | --- |
| status | string | "running" if the enrichment worker is active, "stopped" if the worker thread is dead |
| queue_depth | integer | Total jobs in queue (pending + inflight) |
| pending | integer | Memories waiting to be processed |
| inflight | integer | Memories currently being processed |
| processed | integer | Total enrichment attempts completed since service start |
| failed | integer | Total failed enrichment attempts since service start |
```shell
curl "https://your-automem-instance/enrichment/status"
```
| Observation | Likely Cause | Action |
| --- | --- | --- |
| status: "stopped" | Worker thread crashed | Check application logs for exceptions; restart the service |
| queue_depth increasing | Worker processing slower than intake | Monitor inflight; check for spaCy or OpenAI issues |
| High failed count | Enrichment logic errors | Review application logs; check Qdrant connectivity |
| inflight stuck | Worker deadlocked | Restart the enrichment worker or the service |
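The troubleshooting table above can be automated. Below is a sketch of a health check that classifies a `/enrichment/status` payload; the alert thresholds (queue depth of 100, 5% failure ratio) are illustrative assumptions, not service defaults.

```python
# Sketch of a health check applying the troubleshooting table to a
# /enrichment/status payload; thresholds are illustrative assumptions.
def diagnose_enrichment(status: dict, max_queue_depth: int = 100,
                        max_failure_ratio: float = 0.05) -> list:
    alerts = []
    if status.get("status") == "stopped":
        alerts.append("worker thread dead: check logs and restart the service")
    if status.get("queue_depth", 0) > max_queue_depth:
        alerts.append("queue backing up: intake is outpacing the worker")
    attempts = status.get("processed", 0)
    if attempts and status.get("failed", 0) / attempts > max_failure_ratio:
        alerts.append("high failure rate: check Qdrant connectivity and logs")
    return alerts
```

Feeding the example payload above (running, queue_depth 3, 3 failures out of 1247 attempts) through this check yields no alerts.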

POST /enrichment/reprocess

Authentication: API token + Admin token

Purpose: Force re-enrichment of specific memories. Useful after:

  • Updating enrichment logic or configuration
  • Adding spaCy model capabilities
  • Fixing corrupted enrichment metadata
  • Recovering from systematic enrichment failures
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| ids | array[string] | Yes | Non-empty list of memory UUIDs to reprocess |

Reprocessing always forces re-queuing regardless of current pending/in-flight state.

```shell
curl -X POST https://your-automem-instance/enrichment/reprocess \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "X-Admin-Token: YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "ids": [
      "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "b2c3d4e5-f6a7-8901-bcde-f12345678901"
    ]
  }'
```
| Field | Type | Description |
| --- | --- | --- |
| status | string | Always "queued" |
| count | integer | Number of memories added to the enrichment queue |
| ids | array[string] | Memory UUIDs that were queued |
```json
{
  "status": "queued",
  "count": 2,
  "ids": [
    "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "b2c3d4e5-f6a7-8901-bcde-f12345678901"
  ]
}
```
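For large reprocessing jobs, it can help to chunk the ID list into several requests. The helper below is a hypothetical client-side sketch: the 100-IDs-per-call chunk size is an assumption, and only the endpoint path and headers come from the example above.

```python
# Hypothetical client helper for /enrichment/reprocess; chunking at 100 IDs
# per call is an assumption, not a documented server limit.
import json

def build_reprocess_batches(base_url: str, api_token: str, admin_token: str,
                            ids: list, chunk: int = 100) -> list:
    """Split a large ID list into request descriptors for the reprocess endpoint."""
    if not ids:
        raise ValueError("ids must be non-empty")  # mirrors server-side validation
    headers = {
        "Authorization": f"Bearer {api_token}",
        "X-Admin-Token": admin_token,
        "Content-Type": "application/json",
    }
    return [
        {"url": f"{base_url}/enrichment/reprocess",
         "headers": headers,
         "body": json.dumps({"ids": ids[i:i + chunk]})}
        for i in range(0, len(ids), chunk)
    ]
```

Each descriptor can then be sent with any HTTP client; splitting the work keeps a single failed call from discarding the whole batch.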
```mermaid
graph TB
    Request["POST /enrichment/reprocess<br/>{ids: [...]}"]
    Auth["_require_admin_token()"]
    Validate["Validate Request"]

    subgraph "Queueing Loop"
        Enqueue["enqueue_enrichment()<br/>with forced=True"]
        IncrQueued["queued_count++"]
    end

    Response["Return 202:<br/>status, count, ids"]

    Request --> Auth
    Auth --> Validate
    Validate --> Enqueue
    Enqueue --> IncrQueued
    IncrQueued --> Response
```

The reprocessing operation performs these steps:

  1. Validation Phase — Validates that the ids array is non-empty
  2. Queueing — Calls enqueue_enrichment(memory_id, forced=True, attempt=0) which:
    • Acquires state.enrichment_lock
    • Adds memory ID to state.enrichment_pending
    • Puts EnrichmentJob(memory_id, attempt=0, forced=True) in queue
  3. Background Processing — The enrichment_worker() thread picks up jobs and calls enrich_memory() which:
    • Extracts entities via spaCy (if installed)
    • Creates temporal PRECEDED_BY edges
    • Finds semantic neighbors via Qdrant
    • Creates SIMILAR_TO relationships
    • Detects patterns and creates EXEMPLIFIES edges
    • Updates metadata.enriched_at timestamp
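The queueing step (step 2 above) can be sketched as follows. This is a minimal reconstruction from the steps listed, not the service's actual code; the dedup-unless-forced behavior and `EnrichmentJob` field defaults are assumptions consistent with the description.

```python
# Minimal sketch of enqueue_enrichment() as described in step 2 above;
# the dedup-unless-forced behavior is an assumption.
import queue
import threading
from dataclasses import dataclass

@dataclass
class EnrichmentJob:
    memory_id: str
    attempt: int = 0
    forced: bool = False

class EnrichmentState:
    def __init__(self):
        self.enrichment_lock = threading.Lock()
        self.enrichment_pending = set()
        self.enrichment_queue = queue.Queue()

def enqueue_enrichment(state: EnrichmentState, memory_id: str,
                       forced: bool = False, attempt: int = 0) -> bool:
    with state.enrichment_lock:
        # Skip duplicates unless the caller (e.g. /enrichment/reprocess)
        # forces re-queuing.
        if memory_id in state.enrichment_pending and not forced:
            return False
        state.enrichment_pending.add(memory_id)
    state.enrichment_queue.put(EnrichmentJob(memory_id, attempt, forced))
    return True
```

The lock guards the pending set so concurrent API requests cannot double-queue the same memory, while `forced=True` deliberately bypasses that dedup for admin reprocessing.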

POST /admin/reembed

Authentication: API token + Admin token

Purpose: Regenerate embeddings for all memories in batches. Critical for:

  • Migrating to a different embedding model
  • Recovering from Qdrant data loss
  • Fixing corrupted embeddings
  • Bulk embedding generation after initial import
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| batch_size | integer | No | Embeddings per OpenAI API call. Default: 32; max recommended: 100 |
| limit | integer | No | Maximum memories to process. If omitted, all memories in the database are processed |
| force | boolean | No | Re-embed memories even if embeddings already exist. Default: false |
```shell
curl -X POST https://your-automem-instance/admin/reembed \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "X-Admin-Token: YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"batch_size": 32}'
```
```mermaid
graph TB
    Request["POST /admin/reembed<br/>{batch_size: 32, limit: 1000}"]
    Auth["_require_admin_token()"]
    Init["Initialize OpenAI client<br/>init_openai()"]

    FetchIDs["Query FalkorDB:<br/>MATCH (m:Memory)<br/>RETURN m.id<br/>LIMIT $limit"]

    subgraph "Batch Processing Loop"
        Slice["Slice next batch_size IDs"]
        FetchContent["Query content for batch:<br/>MATCH (m:Memory)<br/>WHERE m.id IN $ids<br/>RETURN m.id, m.content"]

        CallOpenAI["OpenAI API:<br/>embeddings.create()<br/>model=text-embedding-3-small<br/>input=[contents]"]

        UpdateQdrant["Qdrant: upsert()<br/>PointStruct<br/>id, vector, payload"]

        IncrCount["processed_count += batch_size"]
    end

    Response["Return summary:<br/>status, processed, failed,<br/>total, batch_size,<br/>metadata_preserved"]

    Request --> Auth
    Auth --> Init
    Init --> FetchIDs
    FetchIDs --> Slice
    Slice --> FetchContent
    FetchContent --> CallOpenAI
    CallOpenAI --> UpdateQdrant
    UpdateQdrant --> IncrCount
    IncrCount -->|More batches?| Slice
    IncrCount -->|Done| Response
```
| Field | Type | Description |
| --- | --- | --- |
| status | string | Result status |
| processed | integer | Memories successfully re-embedded |
| failed | integer | Memories that failed re-embedding |
| total | integer | Total memory count in the database at operation start |
| batch_size | integer | Batch size used (from the request, or the default of 32) |
| metadata_preserved | boolean | Whether existing metadata was preserved during re-embedding |
```json
{
  "status": "success",
  "processed": 1000,
  "failed": 0,
  "total": 1000,
  "batch_size": 32,
  "metadata_preserved": true
}
```

Phase 1: Memory Enumeration

Fetches all memory IDs (or up to limit) from FalkorDB:

```cypher
MATCH (m:Memory) RETURN m.id LIMIT $limit
```

Phase 2: Batch Content Retrieval

For each batch of batch_size IDs, queries FalkorDB for content. Missing memories are logged but don’t halt processing.

Phase 3: OpenAI Embedding Generation

Generates embeddings for the entire batch in a single API call. OpenAI’s text-embedding-3-small model, as used here:

  • Dimension: 1024 (shortened via the API’s dimensions parameter; the model’s native size is 1536)
  • Context window: 8191 tokens
  • Cost: $0.00002 per 1K tokens
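A quick cost estimate can be derived from the figures above. Note that the $0.06 per 1000 memories quoted in the batch-size table below implies an average of roughly 3,000 tokens per memory at this rate; that average is a property of the example data, not a fixed constant.

```python
# Back-of-the-envelope cost estimate from the pricing above; the per-memory
# token average is whatever your data actually contains.
def reembed_cost_usd(memory_count: int, avg_tokens: int,
                     usd_per_1k_tokens: float = 0.00002) -> float:
    return memory_count * avg_tokens / 1000 * usd_per_1k_tokens
```

For example, 1000 memories averaging 3,000 tokens each cost about $0.06, matching the table below; shorter memories cost proportionally less.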

Phase 4: Qdrant Update

Embeddings are written to Qdrant only. Qdrant failures are logged but don’t halt the operation (graceful degradation). FalkorDB graph data is not modified by this operation.

| Batch Size | OpenAI API Calls (1000 memories) | Approx. Time | Cost (1000 memories) |
| --- | --- | --- | --- |
| 10 | 100 | ~5 minutes | $0.06 |
| 32 | 32 | ~2 minutes | $0.06 |
| 50 | 20 | ~1 minute | $0.06 |
| 100 | 10 | ~30 seconds | $0.06 |

Recommendations:

  • Default 32 balances API call overhead and failure blast radius
  • Use 100 for large migrations (>10K memories) with stable OpenAI access
  • Use 10 during testing or with rate-limited OpenAI keys
  • Monitor processed count to detect stalls mid-operation
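The batch loop from the flow diagram above can be sketched with the OpenAI and Qdrant calls injected as plain functions, which makes the failure-isolation property (a failed batch costs only that batch) easy to see. This is a simplified reconstruction, not the service's implementation.

```python
# Sketch of the reembed batch loop; embed_batch and upsert stand in for the
# OpenAI and Qdrant calls so the control flow is visible and testable.
def reembed_all(memory_ids, fetch_content, embed_batch, upsert, batch_size=32):
    processed = failed = 0
    for i in range(0, len(memory_ids), batch_size):
        batch = memory_ids[i:i + batch_size]
        contents = fetch_content(batch)      # {id: content}; missing IDs dropped
        ids = [m for m in batch if m in contents]
        failed += len(batch) - len(ids)      # deleted between enumerate and fetch
        if not ids:
            continue
        try:
            vectors = embed_batch([contents[m] for m in ids])
            upsert(ids, vectors)
            processed += len(ids)
        except Exception:
            failed += len(ids)               # logged and skipped; loop continues
    return {"processed": processed, "failed": failed, "total": len(memory_ids)}
```

Because errors are caught per batch, a batch_size of 10 limits the blast radius of one failure to 10 memories, while 100 trades that safety for fewer API round-trips.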

The operation continues even if individual batches fail:

| Error | Cause | Behavior |
| --- | --- | --- |
| OpenAI API rate limit | Exceeded quota | Retries with exponential backoff (handled by the OpenAI SDK) |
| Missing memory content | Deleted between enumeration and fetch | Logged, skipped; processing continues |
| Qdrant connection failure | Network issue or Qdrant down | Logged; remaining batches still run (graceful degradation) |
| Invalid content format | Null or non-string content | Logged, skipped |

All errors are logged with structured context:

```python
logger.exception("Failed to generate embeddings for batch", extra={"batch_ids": ids})
```

POST /admin/sync

Authentication: API token + Admin token

Purpose: Perform non-destructive drift repair between FalkorDB and Qdrant. Detects and reconciles discrepancies without deleting data.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| batch_size | integer | No | Memories to process per batch. Default: 32 |
| dry_run | boolean | No | If true, report drift without making changes. Default: false |
```shell
curl -X POST https://your-automem-instance/admin/sync \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "X-Admin-Token: YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"batch_size": 32, "dry_run": false}'
```
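At its core, drift detection compares the ID sets held by the two stores. The sketch below is a hypothetical illustration of what a dry run might report; the function name and report fields are assumptions, not the endpoint's actual response shape.

```python
# Hypothetical sketch of the drift check behind /admin/sync: compare the ID
# sets in FalkorDB and Qdrant. Report fields are assumptions.
def detect_drift(falkordb_ids: set, qdrant_ids: set) -> dict:
    return {
        "missing_in_qdrant": sorted(falkordb_ids - qdrant_ids),   # candidates for re-embedding
        "orphaned_in_qdrant": sorted(qdrant_ids - falkordb_ids),  # vectors with no graph node
        "in_sync": len(falkordb_ids ^ qdrant_ids) == 0,
    }
```

Running with dry_run: true would surface this kind of report without touching either store; a non-dry run would then reconcile the differences non-destructively.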

Security Considerations

Admin operations can:

  • Force expensive OpenAI API calls (re-embedding entire database)
  • Trigger resource-intensive enrichment reprocessing
  • Access operational metrics (enrichment statistics)

Without admin token protection, a compromised API token could:

  1. Generate thousands of dollars in OpenAI costs via repeated re-embedding
  2. Overload enrichment workers with duplicate jobs
  3. Enumerate all memory IDs via reprocess endpoint
| Practice | Rationale | Implementation |
| --- | --- | --- |
| Separate tokens | Limits blast radius of an API token compromise | Use different values for AUTOMEM_API_TOKEN and ADMIN_API_TOKEN |
| Rotate periodically | Reduces the window of exposure | Regenerate tokens monthly; update all clients |
| Restrict admin access | Minimizes privilege-escalation risk | Share the admin token only with the operations team |
| Use headers, not query params | Prevents token leakage in logs | Prefer the Authorization: Bearer and X-Admin-Token headers |
| Monitor admin operations | Detects anomalous usage | Alert on high-frequency /admin/reembed calls |
| Audit admin calls | Forensic capability | Log admin operations with IP, timestamp, and token hash |
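The audit-logging practice in the last row can be sketched as follows: record a short hash fingerprint of the token rather than the token itself, so logs can correlate calls to a credential without becoming a credential leak. The record fields are illustrative assumptions.

```python
# Sketch of the audit-log practice above: log admin calls with a token
# fingerprint, never the raw token. Field names are illustrative.
import hashlib
import json
from datetime import datetime, timezone

def audit_admin_call(endpoint: str, client_ip: str, admin_token: str) -> str:
    record = {
        "endpoint": endpoint,
        "ip": client_ip,
        "ts": datetime.now(timezone.utc).isoformat(),
        # First 12 hex chars of SHA-256: enough to correlate calls to a
        # credential, useless for replaying it.
        "token_hash": hashlib.sha256(admin_token.encode()).hexdigest()[:12],
    }
    return json.dumps(record)
```

Emitting these as structured log lines makes the "alert on high-frequency /admin/reembed calls" practice a simple query over the audit stream.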

Workflow: Bulk Re-enrichment

After spaCy model upgrades or enrichment logic changes:

```shell
# 1. Get all memory IDs that need reprocessing
MEMORY_IDS=$(curl -s "https://your-instance/recall?limit=100&query=*" \
  -H "Authorization: Bearer $TOKEN" | jq -c '[.results[].memory.memory_id]')

# 2. Reprocess them (re-queuing is always forced for this endpoint)
curl -X POST https://your-instance/enrichment/reprocess \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-Admin-Token: $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"ids\": $MEMORY_IDS}"
```

Workflow: Embedding Model Migration

When switching from text-embedding-3-small to text-embedding-3-large (3072-d):

  1. Update VECTOR_SIZE environment variable
  2. Recreate the Qdrant collection with new dimensions
  3. Run /admin/reembed with batch_size=50 to regenerate all embeddings

Workflow: Qdrant Recovery

If Qdrant data is corrupted or lost but FalkorDB is intact:

```shell
curl -X POST https://your-automem-instance/admin/reembed \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "X-Admin-Token: YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"batch_size": 50}'
```

See Operations / Health for complete recovery procedures.