Memory Model
This page describes the complete data model for memories in AutoMem, including their structure, properties, classification taxonomy, and storage representation. For information about how memories relate to each other, see Relationship Types. For details on how memories are searched and retrieved, see Hybrid Search.
Overview
Every memory in AutoMem is a structured data object with both required and optional properties. Memories are classified into specific types (Decision, Pattern, Preference, etc.), enriched with metadata, and stored redundantly in both FalkorDB (graph) and Qdrant (vector) databases. The memory model supports temporal validity windows, confidence scoring, and hierarchical tag organization.
graph LR
Memory["Memory Node"]
Memory --> id["id: UUID"]
Memory --> content["content: string"]
Memory --> type["type: Memory Type"]
Memory --> timestamp["timestamp: ISO datetime"]
Memory --> importance["importance: 0.0-1.0"]
Memory --> confidence["confidence: 0.0-1.0"]
Memory --> tags["tags: string[]"]
Memory --> metadata["metadata: JSON"]
Memory --> embedding["embedding: float[768]"]
Memory Schema
Core Properties
All memories contain the following properties:
| Property | Type | Required | Description |
|---|---|---|---|
| id | string (UUID) | Yes | Unique identifier, auto-generated if not provided |
| content | string | Yes | The actual memory text (minimum 1 character) |
| type | string | Yes | Classification from MEMORY_TYPES set |
| confidence | float | Yes | Classification confidence score (0.0-1.0) |
| importance | float | Yes | User-specified or derived importance (0.0-1.0, default: 0.5) |
| timestamp | string (ISO 8601) | Yes | When the memory was created or occurred |
| tags | array[string] | No | Hierarchical categorization tags (e.g., ["project:automem", "decision"]) |
| tag_prefixes | array[string] | No | Pre-computed lowercase tag prefixes for fast filtering |
| metadata | object | No | Flexible JSON object for custom fields |
| embedding | array[float] | No | 768-dimensional vector for semantic search |
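As a rough illustration of the table above, the core properties could be modeled as a Python dataclass. This is only a sketch: the `Memory` class name, the `_utc_now` helper, and the checks in `__post_init__` are assumptions made for this example, not AutoMem's actual implementation. Only the field names, the `importance` default of 0.5, and the value ranges come from this page.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional

# The seven types listed in the taxonomy on this page.
MEMORY_TYPES = {
    "Decision", "Pattern", "Preference", "Style", "Habit", "Insight", "Context",
}


def _utc_now() -> str:
    """Hypothetical helper: current time as an ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Memory:
    # Required properties
    content: str
    type: str
    confidence: float
    importance: float = 0.5  # documented default
    id: str = field(default_factory=lambda: str(uuid.uuid4()))  # auto-generated
    timestamp: str = field(default_factory=_utc_now)
    # Optional properties
    tags: list[str] = field(default_factory=list)
    tag_prefixes: list[str] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)
    embedding: Optional[list[float]] = None

    def __post_init__(self) -> None:
        # Illustrative checks mirroring the documented constraints.
        if len(self.content) < 1:
            raise ValueError("content must contain at least 1 character")
        if self.type not in MEMORY_TYPES:
            raise ValueError(f"type must be one of {sorted(MEMORY_TYPES)}")
        for name in ("confidence", "importance"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be within 0.0-1.0")
```

Creating `Memory(content="Chose PostgreSQL over MongoDB", type="Decision", confidence=0.9)` fills in `id`, `timestamp`, and `importance` automatically.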
Temporal Properties
| Property | Type | Required | Description |
|---|---|---|---|
| t_valid | string (ISO 8601) | No | When this memory becomes valid (future-dated memories) |
| t_invalid | string (ISO 8601) | No | When this memory expires or becomes invalid |
| updated_at | string (ISO 8601) | Yes | Last modification timestamp |
| last_accessed | string (ISO 8601) | Yes | Last retrieval or access timestamp |
Enrichment Properties
| Property | Type | Description |
|---|---|---|
| enriched_at | string (ISO 8601) | When the enrichment pipeline processed this memory |
| enrichment_attempts | integer | Number of enrichment attempts (max 3) |
| summary | string | Auto-generated summary (first sentence, max 240 chars) |
| entities | object | Extracted entities: tools, projects, people, concepts, organizations |
State Properties
| Property | Type | Description |
|---|---|---|
| archived | boolean | Whether the memory is archived (excluded from search by default) |
| relevance_score | float | Dynamic score computed by the consolidation engine |
Memory Type Taxonomy
AutoMem classifies all memories into one of seven semantic types. The classification system is defined in MEMORY_TYPES:
Type Definitions
graph TB
Memory["Memory"]
Decision["Decision<br/>Choices made, selected options"]
Pattern["Pattern<br/>Recurring behaviors, typical approaches"]
Preference["Preference<br/>Likes/dislikes, favorites"]
Style["Style<br/>Communication approach, formatting"]
Habit["Habit<br/>Regular routines, repeated actions"]
Insight["Insight<br/>Discoveries, learnings, realizations"]
Context["Context<br/>Situational background, circumstances"]
Memory --> Decision
Memory --> Pattern
Memory --> Preference
Memory --> Style
Memory --> Habit
Memory --> Insight
Memory --> Context
| Type | Description | Example Content |
|---|---|---|
| Decision | Strategic or technical choices, selected options | "Chose PostgreSQL over MongoDB for ACID compliance" |
| Pattern | Recurring behaviors, typical approaches, consistent tendencies | "Usually write integration tests before deploying" |
| Preference | Likes/dislikes, favorites, personal or team tastes | "Prefer async/await over callbacks for clarity" |
| Style | Communication approach, formatting preferences, tone | "Write commit messages in imperative mood" |
| Habit | Regular routines, repeated actions, scheduled activities | "Run pytest every morning before starting work" |
| Insight | Discoveries, learnings, realizations, key findings | "Realized embedding batching reduces costs by 40%" |
| Context | Situational background, circumstances, what was happening | "During database migration sprint in Q3 2024" |
Classification System
Architecture
AutoMem employs a hybrid classification system that balances speed, cost, and accuracy:
- Explicit Type (preferred): Client provides the `type` field in the POST request
- Regex Pattern Matching (fast, free): Checks content against predefined patterns
- LLM Fallback (accurate, ~$0.02/month): Uses GPT-4o-mini when patterns don’t match
graph TB
Start["Memory Content"]
Explicit{"Type<br/>provided?"}
Validate{"Valid<br/>type?"}
Regex["Check PATTERNS dict"]
Match{"Regex<br/>match?"}
LLM["Call GPT-4o-mini<br/>$0.0001 per request"]
LLMValid{"Valid<br/>type?"}
Decision["Validated Type<br/>+ Confidence"]
Fallback["Context<br/>confidence: 0.5"]
Start --> Explicit
Explicit -->|Yes| Validate
Explicit -->|No| Regex
Validate -->|Yes| Decision
Validate -->|No| Fallback
Regex --> Match
Match -->|Yes| Decision
Match -->|No| LLM
LLM --> LLMValid
LLMValid -->|Yes| Decision
LLMValid -->|No| Fallback
Pattern-Based Classification
The MemoryClassifier class defines regex patterns for each memory type:
| Type | Example Patterns |
|---|---|
| Decision | r"decided to", r"chose (\w+) over", r"going with", r"opted for" |
| Pattern | r"usually", r"typically", r"tend to", r"often", r"consistently" |
| Preference | r"prefer", r"like.*better", r"favorite", r"rather than" |
| Style | r"wrote.*in.*style", r"communicated", r"formatted as" |
| Habit | r"always", r"every time", r"daily", r"weekly", r"routine" |
| Insight | r"realized", r"discovered", r"learned that", r"figured out" |
| Context | r"during", r"while working on", r"in the context of", r"when" |
Confidence Calculation:
- Base confidence: `0.6` for a single pattern match
- Boost: `+0.1` for each additional pattern match (max `0.95`)
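The pattern matching and confidence rule above can be sketched as follows. The `PATTERNS` entries are copied from the example table, so the real dictionary may contain more patterns. The function names, the lowercasing step, and the tie-breaking rule (the type with the most matching patterns wins) are assumptions for illustration, not the confirmed behavior of `MemoryClassifier`.

```python
import re

# Example patterns copied from the table above; the real PATTERNS dict may differ.
PATTERNS = {
    "Decision": [r"decided to", r"chose (\w+) over", r"going with", r"opted for"],
    "Pattern": [r"usually", r"typically", r"tend to", r"often", r"consistently"],
    "Preference": [r"prefer", r"like.*better", r"favorite", r"rather than"],
    "Style": [r"wrote.*in.*style", r"communicated", r"formatted as"],
    "Habit": [r"always", r"every time", r"daily", r"weekly", r"routine"],
    "Insight": [r"realized", r"discovered", r"learned that", r"figured out"],
    "Context": [r"during", r"while working on", r"in the context of", r"when"],
}


def match_confidence(match_count: int) -> float:
    """0.6 for one match, +0.1 per additional match, capped at 0.95."""
    return round(min(0.95, 0.6 + 0.1 * (match_count - 1)), 2)


def classify_with_patterns(content: str):
    """Return (type, confidence), or None when no pattern matches."""
    text = content.lower()  # assumption: matching is case-insensitive
    best_type, best_count = None, 0
    for memory_type, patterns in PATTERNS.items():
        count = sum(1 for pattern in patterns if re.search(pattern, text))
        if count > best_count:  # assumption: the type with the most matches wins
            best_type, best_count = memory_type, count
    if best_type is None:
        return None  # caller would fall through to the LLM or the default
    return best_type, match_confidence(best_count)
```

A `None` result corresponds to the "no regex match" branch in the flow diagram, where classification continues with the LLM fallback.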
LLM Classification
When regex patterns don’t match (approximately 30% of cases), the system falls back to GPT-4o-mini:
System Prompt Structure:
You are a memory classification system. Classify each memory into exactly ONE of these types:
- Decision: Choices made, selected options, what was decided
- Pattern: Recurring behaviors, typical approaches, consistent tendencies
- Preference: Likes/dislikes, favorites, personal tastes
- Style: Communication approach, formatting, tone used
- Habit: Regular routines, repeated actions, schedules
- Insight: Discoveries, learnings, realizations, key findings
- Context: Situational background, what was happening, circumstances

Return JSON with: {"type": "<type>", "confidence": <0.0-1.0>}

Request Format:
- Model: `gpt-4o-mini`
- Temperature: `0.3` (low randomness for consistent output, though not strictly deterministic)
- Max tokens: `50`
- Response format: JSON object
- Input limit: First 1000 characters of content
Validation:
- Returned type must exist in `MEMORY_TYPES`
- Invalid types fall back to `"Context"` with confidence `0.5`
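The validation step can be sketched as a small pure function that takes the raw JSON string returned by the model. The function name `parse_llm_classification` is hypothetical. Treating malformed JSON or a missing confidence the same way as an invalid type, and clamping the confidence to the 0.0-1.0 range, are assumptions; this page only specifies the fallback for invalid types.

```python
import json

MEMORY_TYPES = {
    "Decision", "Pattern", "Preference", "Style", "Habit", "Insight", "Context",
}
FALLBACK = ("Context", 0.5)  # documented fallback for invalid types


def parse_llm_classification(raw_response: str):
    """Validate the model's JSON reply and return (type, confidence)."""
    try:
        data = json.loads(raw_response)
    except (json.JSONDecodeError, TypeError):
        return FALLBACK  # assumption: unparseable replies use the same fallback
    if not isinstance(data, dict):
        return FALLBACK
    memory_type = data.get("type")
    if not isinstance(memory_type, str) or memory_type not in MEMORY_TYPES:
        return FALLBACK
    try:
        confidence = float(data.get("confidence"))
    except (TypeError, ValueError):
        return FALLBACK  # assumption: a missing or non-numeric confidence is invalid
    # Assumption: out-of-range confidences are clamped rather than rejected.
    return memory_type, max(0.0, min(1.0, confidence))
```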
The full classification sequence, including how POST /memory decides between explicit type, regex, and LLM:
sequenceDiagram
participant Client
participant API as "/memory endpoint"
participant Classifier as "MemoryClassifier"
participant OpenAI as "OpenAI GPT-4o-mini"
participant Falkor as "FalkorDB"
Client->>API: "POST /memory<br/>{content, type?}"
alt type explicitly provided
API->>API: "Use provided type"
else type not provided
API->>Classifier: "classify(content)"
Classifier->>Classifier: "Try pattern matching"
alt patterns match
Classifier-->>API: "type, confidence"
else no pattern match AND use_llm=true
Classifier->>OpenAI: "GPT-4o-mini classification"
OpenAI-->>Classifier: "type, confidence"
Classifier-->>API: "type, confidence"
else no LLM
Classifier-->>API: "Context (default), 0.5"
end
end
API->>Falkor: "MERGE Memory node with type"
Falkor-->>API: "memory_id"
API-->>Client: "201 Created"
Temporal Validity Model
Memories support optional temporal validity windows:
Validity Properties
| Property | Purpose | Use Case |
|---|---|---|
| t_valid | Earliest time this memory is valid | Future-dated reminders, scheduled knowledge |
| t_invalid | Latest time this memory is valid | Expiring credentials, time-bound decisions |
Query Behavior
By default, /recall excludes memories where:
- `now < t_valid` (not yet valid)
- `now >= t_invalid` (expired)
Override: Use time_query or explicit start/end parameters to include expired memories.
Example Scenarios:
- Future-dated memory: Set `t_valid` to a future date; the memory is not searchable until that date arrives.
- Expiring credential: Set `t_invalid` to an expiry date; the memory is automatically excluded from search after that date.
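The default exclusion rule can be expressed as a predicate over a memory's validity window. This is a minimal sketch: the function names `is_valid_at` and `_parse_ts` are hypothetical, memories are represented as plain dictionaries, and interpreting timezone-naive timestamps as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def _parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string into an aware datetime (None passes through)."""
    if value is None:
        return None
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed


def is_valid_at(memory: dict, now: datetime) -> bool:
    """True when the memory would be included by the default rule."""
    t_valid = _parse_ts(memory.get("t_valid"))
    t_invalid = _parse_ts(memory.get("t_invalid"))
    if t_valid is not None and now < t_valid:
        return False  # not yet valid
    if t_invalid is not None and now >= t_invalid:
        return False  # expired
    return True
```

Note that the upper bound is exclusive: a memory whose `t_invalid` equals the current time is already treated as expired.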
Metadata Structure
The metadata field is a flexible JSON object for storing arbitrary key-value pairs. Common patterns:
Enrichment Metadata
Automatically added by the enrichment pipeline:
| Field | Type | Description |
|---|---|---|
| metadata.enriched_at | string | ISO timestamp of enrichment |
| metadata.enrichment_attempts | integer | Retry count (max 3) |
| metadata.summary | string | Auto-generated summary |
| metadata.entities | object | Extracted entities by category |
The metadata.entities object is structured by entity category:
{
  "entities": {
    "tools": ["pytest", "docker"],
    "projects": ["automem"],
    "people": ["sarah"],
    "concepts": ["embedding", "vector search"],
    "organizations": ["OpenAI"]
  }
}
Reserved Fields
The following fields are reserved and should not be placed in metadata (they are top-level properties):
- `type`, `confidence`, `content`, `timestamp`, `tags`, `tag_prefixes`
- `importance`, `embedding`, `id`, `archived`, `relevance_score`
- `t_valid`, `t_invalid`, `updated_at`, `last_accessed`
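A client could check a metadata object against the reserved names before sending it. The sketch below is a hypothetical client-side helper (`split_reserved` is not an AutoMem function); this page does not say how the server treats reserved keys found in metadata.

```python
# Reserved top-level property names, as listed above.
RESERVED_FIELDS = frozenset({
    "type", "confidence", "content", "timestamp", "tags", "tag_prefixes",
    "importance", "embedding", "id", "archived", "relevance_score",
    "t_valid", "t_invalid", "updated_at", "last_accessed",
})


def split_reserved(metadata: dict):
    """Return (clean_metadata, misplaced), where misplaced holds reserved keys."""
    clean = {k: v for k, v in metadata.items() if k not in RESERVED_FIELDS}
    misplaced = {k: v for k, v in metadata.items() if k in RESERVED_FIELDS}
    return clean, misplaced
```

A caller might move the `misplaced` entries to the top level of the request body, or reject the input, depending on how strict the client should be.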
Storage Representation
FalkorDB (Graph Database)
Memories are stored as nodes with the Memory label. Node properties directly map to the memory schema.
Graph Features:
- Relationships connect memories (see Relationship Types)
- Cypher queries enable graph traversal
- Indexes on `tags`, `tag_prefixes`, `timestamp`, `importance`
Qdrant (Vector Database)
Memories are stored as points with payloads mirroring FalkorDB properties.
Qdrant Features:
- Keyword indexes on `tags` and `tag_prefixes` for fast filtering
- HNSW index on vectors for approximate nearest neighbor search
- Payload filters support complex boolean queries
Dual Storage Strategy:
- FalkorDB is the source of truth (required)
- Qdrant is optional but enables semantic search
- Data is written to both databases synchronously
- Qdrant can rebuild FalkorDB if graph data is lost
Memory Lifecycle
stateDiagram-v2
[*] --> Created: POST /memory
Created --> EnrichmentQueued: Immediate<br/>Queue job
EnrichmentQueued --> Enriching: Worker picks up
Enriching --> Enriched: Success<br/>Entities extracted<br/>Relationships created
Enriching --> EnrichmentFailed: Failure<br/>Retry (max 3)
EnrichmentFailed --> Enriching: Backoff retry
EnrichmentFailed --> Abandoned: Max attempts
Enriched --> Active: Searchable<br/>In graph
Active --> Updated: PATCH /memory/:id
Updated --> ReEnrichment: Content changed
ReEnrichment --> Active
Active --> Decaying: Consolidation<br/>relevance_score decreases
Decaying --> Archived: Forget task<br/>Low relevance
Archived --> [*]: DELETE /memory/:id
Active --> [*]: DELETE /memory/:id
Lifecycle Stages
- Created (`POST /memory`):
  - Memory written to FalkorDB
  - Embedding queued for generation
  - Enrichment job queued
  - Returns `201 Created` immediately
- Enrichment Queued:
  - Job added to in-memory queue
  - Worker thread polls queue every 2 seconds
  - Max 3 enrichment attempts with exponential backoff
- Enriching:
  - Entity extraction (spaCy NER + regex)
  - Temporal links created (`OCCURRED_BEFORE`)
  - Semantic neighbors found (`SIMILAR_TO`)
  - Pattern detection (`EXEMPLIFIES`)
  - Summary generation (optional)
- Active:
  - Fully searchable via `/recall`
  - Included in graph queries
  - Participates in consolidation
- Decaying:
  - `relevance_score` decreases exponentially
  - Decay is applied by the hourly consolidation task
  - Formula: `relevance = base_score * exp(-decay_rate * age)`
- Archived:
  - Marked `archived: true`
  - Excluded from search by default
  - Can be restored or deleted
- Deleted:
  - Removed from both FalkorDB and Qdrant
  - Relationships cleaned up
  - Irreversible operation
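The decay formula from the Decaying stage, `relevance = base_score * exp(-decay_rate * age)`, is easy to evaluate directly. The sketch below assumes `age` is measured in days and uses an illustrative `decay_rate` of 0.1; this page does not state the actual units or rate, and the `half_life` helper is added only to show how the rate maps to a more intuitive quantity.

```python
import math


def decayed_relevance(base_score: float, decay_rate: float, age: float) -> float:
    """relevance = base_score * exp(-decay_rate * age)."""
    return base_score * math.exp(-decay_rate * age)


def half_life(decay_rate: float) -> float:
    """Age at which relevance falls to half of base_score: ln(2) / decay_rate."""
    return math.log(2) / decay_rate


# Illustrative values only: decay_rate=0.1 with age in days is an assumption.
for age_days in (0, 7, 30):
    print(age_days, round(decayed_relevance(1.0, 0.1, age_days), 3))
```

With these assumed values the half-life is about 6.9 days, so a memory that is never reinforced loses half its relevance roughly every week.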
Implementation Details
Memory Creation Flow
The complete flow from client request through classification, storage, and queue submission:
sequenceDiagram
participant C as Client
participant API as Flask API
participant CL as MemoryClassifier
participant F as FalkorDB
participant EQ as Enrichment Queue
participant EmQ as Embedding Queue
participant Q as Qdrant
C->>API: POST /memory<br/>{content, tags, importance}
alt Type provided
API->>API: Use explicit type
else Auto-classify
API->>CL: classify(content)
CL->>CL: Try regex patterns
alt Regex match
CL-->>API: (type, confidence)
else No match
CL->>CL: _classify_with_llm()
CL-->>API: (type, confidence)
end
end
API->>F: CREATE Memory node
F-->>API: memory_id
API->>EQ: Enqueue enrichment
alt Embedding provided
API->>Q: Store vector + payload
else No embedding
API->>EmQ: Queue for generation
end
API-->>C: 201 Created<br/>{memory_id, enrichment: queued}
Validation Rules
Content Validation
- Minimum length: 1 character (empty strings rejected)
- Maximum length: No hard limit (practical limit ~10KB for good embedding quality)
- Encoding: UTF-8
Type Validation
- Must be one of: `Decision`, `Pattern`, `Preference`, `Style`, `Habit`, `Insight`, `Context`
- Case-sensitive
- Invalid types return `400 Bad Request` with valid options listed
Confidence Validation
- Range: `0.0` to `1.0` (inclusive)
- Default: `0.9` if type explicitly provided, computed otherwise
- Non-numeric values rejected
Importance Validation
Section titled “Importance Validation”- Range:
0.0to1.0(inclusive) - Default:
0.5 - Used in search ranking and consolidation
Tag Validation
- Array of strings or comma-separated string
- Normalized to lowercase
- Hierarchical prefixes auto-computed (e.g., `"project:automem:api"` → `["project", "project:automem", "project:automem:api"]`)
- Empty tags filtered out
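The tag rules above can be sketched in two small functions. The names `normalize_tags` and `compute_tag_prefixes` are hypothetical, and two details are assumptions not stated on this page: duplicate tags are dropped, and surrounding whitespace is stripped before lowercasing.

```python
def normalize_tags(tags):
    """Accept a list of strings or a comma-separated string; lowercase, drop empties."""
    if isinstance(tags, str):
        tags = tags.split(",")
    normalized = []
    for tag in tags or []:
        cleaned = str(tag).strip().lower()
        if cleaned and cleaned not in normalized:  # de-duplication is an assumption
            normalized.append(cleaned)
    return normalized


def compute_tag_prefixes(tags):
    """Expand each tag into its hierarchical prefixes, splitting on ':'."""
    prefixes = []
    for tag in tags:
        parts = [part for part in tag.split(":") if part]
        for depth in range(1, len(parts) + 1):
            prefix = ":".join(parts[:depth])
            if prefix not in prefixes:
                prefixes.append(prefix)
    return prefixes
```

Storing the expanded prefixes alongside the tags lets a filter such as `project:automem` match every tag beneath it with a plain keyword lookup instead of a string-prefix scan.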
Timestamp Validation
- Must be ISO 8601 format
- Automatically converted to UTC with `+00:00` timezone
- Strings ending in `Z` converted to `+00:00`
- Invalid timestamps return `400 Bad Request`
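A minimal sketch of the timestamp normalization, using only the standard library. The function name `normalize_timestamp` is hypothetical, and treating timezone-naive input as UTC is an assumption. `datetime.fromisoformat` accepts only a subset of ISO 8601 on Python versions before 3.11, which is why the trailing `Z` is rewritten by hand here. A caller would translate the `ValueError` into the `400 Bad Request` response.

```python
from datetime import datetime, timezone


def normalize_timestamp(value: str) -> str:
    """Return the timestamp as an ISO 8601 string in UTC with a +00:00 offset."""
    if not isinstance(value, str) or not value.strip():
        raise ValueError("timestamp must be a non-empty ISO 8601 string")
    text = value.strip()
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"  # 'Z' suffix becomes an explicit offset
    parsed = datetime.fromisoformat(text)  # raises ValueError on invalid input
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc).isoformat()
```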
Embedding Validation
- Must be exactly 768 dimensions
- All values must be numeric
- Auto-generated if omitted (OpenAI `text-embedding-3-small` or deterministic placeholder)
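The embedding checks can be sketched as a single validation function. The name `validate_embedding` is hypothetical; rejecting booleans and non-finite values (`NaN`, infinity) goes beyond what this page states and is an assumption.

```python
import math
from numbers import Real

EMBEDDING_DIM = 768  # documented dimensionality


def validate_embedding(embedding) -> list[float]:
    """Return the embedding as a list of floats, or raise ValueError."""
    if not isinstance(embedding, (list, tuple)):
        raise ValueError("embedding must be a list of numbers")
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"embedding must have exactly {EMBEDDING_DIM} dimensions, got {len(embedding)}"
        )
    cleaned = []
    for index, value in enumerate(embedding):
        # bool is a subclass of int, so it is excluded explicitly (assumption).
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError(f"embedding[{index}] is not numeric")
        if not math.isfinite(value):  # assumption: NaN and infinity are rejected
            raise ValueError(f"embedding[{index}] is not finite")
        cleaned.append(float(value))
    return cleaned
```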