Hybrid Search

This document explains AutoMem’s hybrid search system, which combines semantic, lexical, graph, temporal, and metadata signals to retrieve and rank memories. The system implements a 9-component scoring algorithm that achieves 90.53% accuracy on the LoCoMo benchmark.

For information about the memory structure being searched, see Memory Model. For details on relationship types used in graph traversal, see Relationship Types. For API usage, see Recall Operations.


AutoMem’s recall system combines results from two complementary data stores and applies multi-dimensional scoring to rank results by relevance:

sequenceDiagram
    participant Client
    participant API as "/recall endpoint"
    participant Qdrant as "Qdrant Vector DB"
    participant Falkor as "FalkorDB Graph"
    participant Scorer as "_compute_metadata_score()"

    Client->>API: "GET /recall?query=X&tags=Y"

    par Vector Search
        API->>Qdrant: "search(embedding, limit)"
        Qdrant-->>API: "Similar vectors with scores"
    and Keyword Search
        API->>Falkor: "MATCH WHERE content CONTAINS X"
        Falkor-->>API: "Keyword matches with scores"
    end

    API->>API: "Merge and deduplicate results"

    loop For each result
        API->>Scorer: "compute score with all factors"
        Scorer-->>API: "Weighted final score"
    end

    API->>API: "Sort by final score DESC"
    API->>Falkor: "Fetch relationships for top results"
    Falkor-->>API: "Related memory edges"

    API-->>Client: "Ranked results with context"

graph TB
    Query["Search Query"]

    subgraph "Search Strategies"
        Vector["Vector Similarity<br/>Qdrant cosine search<br/>Weight: 0.35"]
        Keyword["Keyword Match<br/>FalkorDB text search<br/>Weight: 0.35"]
        Tag["Tag Overlap<br/>Prefix/exact matching<br/>Weight: 0.15"]
        Importance["Importance Score<br/>User-assigned value<br/>Weight: 0.10"]
        Confidence["Confidence Score<br/>Classification confidence<br/>Weight: 0.05"]
        Recency["Recency Score<br/>Time-based decay<br/>Weight: 0.10"]
        Exact["Exact Match<br/>Query in metadata<br/>Weight: 0.15"]
    end

    Result["Final Weighted Score"]

    Query --> Vector
    Query --> Keyword
    Query --> Tag
    Query --> Importance
    Query --> Confidence
    Query --> Recency
    Query --> Exact

    Vector --> Result
    Keyword --> Result
    Tag --> Result
    Importance --> Result
    Confidence --> Result
    Recency --> Result
    Exact --> Result

Vector search uses Qdrant to find memories with semantically similar embeddings. Query text is converted to a vector using the configured embedding provider, then a cosine similarity search retrieves the top candidates.

The _vector_search() function handles both explicit embeddings (passed by the client) and on-demand generation. If Qdrant is unavailable, the system degrades gracefully to graph-only mode.

Key behaviors:

  • Returns empty list if no query text and no explicit embedding provided
  • Applies tag filters via _build_qdrant_tag_filter()
  • Deduplicates results using seen_ids set
  • Attaches relations by calling _fetch_relations() for each hit
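
The behaviors above can be sketched roughly as follows; the function signature, the client object, and the hit shape are illustrative assumptions, not AutoMem's actual _vector_search() implementation:

```python
from typing import Any, Callable, Optional, Sequence

def vector_search(
    client: Any,                             # assumed Qdrant-like client exposing a search() method
    embed: Callable[[str], list],            # configured embedding provider
    query: str = "",
    embedding: Optional[Sequence[float]] = None,
    tag_filter: Optional[Any] = None,        # e.g. the output of a _build_qdrant_tag_filter() helper
    limit: int = 20,
) -> list:
    """Return candidate memories ranked by cosine similarity (hypothetical sketch)."""
    # No query text and no explicit embedding: nothing to search with.
    if embedding is None and not query.strip():
        return []

    # Prefer the client-supplied embedding; otherwise generate one on demand.
    vector = list(embedding) if embedding is not None else embed(query)
    hits = client.search(vector=vector, query_filter=tag_filter, limit=limit)

    # Deduplicate by memory id, mirroring the seen_ids set described above.
    seen_ids = set()
    results = []
    for hit in hits:
        if hit["id"] in seen_ids:
            continue
        seen_ids.add(hit["id"])
        results.append({"id": hit["id"], "score": hit["score"], "payload": hit.get("payload", {})})
    return results
```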

Keyword search performs text matching in FalkorDB’s graph store using Cypher queries. The system extracts keywords from the query, searches content and tags, and assigns scores based on match frequency.

The keyword scoring algorithm rewards both individual keyword matches and phrase matches:

| Match Type | Location | Score |
|---|---|---|
| Single keyword | content field | +2 |
| Single keyword | tags array | +1 |
| Full phrase | content field | +2 |
| Full phrase | tags array | +1 |

Fallback behavior: If the normalized query is empty or "*", the system calls _graph_trending_results() to return high-importance memories sorted by the sort parameter (time_asc, time_desc, updated_asc, updated_desc, or default importance ordering).
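
A rough re-implementation of the scoring table above (tokenization and the function name are assumptions; the real matching runs as Cypher inside FalkorDB):

```python
def keyword_score(query: str, content: str, tags: list) -> int:
    """Score a memory against a query using the match table above (illustrative)."""
    content_lc = content.lower()
    tags_lc = [t.lower() for t in tags]
    keywords = query.lower().split()

    score = 0
    for kw in keywords:
        if kw in content_lc:
            score += 2                            # single keyword in content: +2
        if any(kw in tag for tag in tags_lc):
            score += 1                            # single keyword in tags: +1

    phrase = query.lower().strip()
    if len(keywords) > 1:
        if phrase in content_lc:
            score += 2                            # full phrase in content: +2
        if any(phrase in tag for tag in tags_lc):
            score += 1                            # full phrase in tags: +1
    return score
```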

Graph traversal leverages FalkorDB’s typed relationship edges to find connected memories. This enables multi-hop reasoning and bridge discovery.

The _fetch_relations() helper queries all relationships for a given memory and returns a list of related memory summaries. Each relationship includes:

  • Target memory ID and summary
  • Relationship type (e.g., "PREFERS_OVER")
  • Strength value (metadata field on relationship edge)
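
A hedged sketch of what such a helper might execute; the Cypher shape, property names, and the graph.query() call are assumptions based on the description above:

```python
RELATIONS_CYPHER = """
MATCH (m:Memory {id: $memory_id})-[r]->(other:Memory)
RETURN other.id AS id, other.summary AS summary, type(r) AS rel_type, r.strength AS strength
"""

def fetch_relations(graph, memory_id: str, limit: int = 10) -> list:
    """Return single-hop related-memory summaries for a seed memory (illustrative)."""
    result = graph.query(RELATIONS_CYPHER, {"memory_id": memory_id})
    return [
        {"id": row[0], "summary": row[1], "type": row[2], "strength": row[3]}
        for row in result.result_set[:limit]
    ]
```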

Multi-hop patterns:

  1. Direct relations — Single-hop from seed memory
  2. Bridges — Memories that connect two or more seed results
  3. Entity expansion — Following entity:<type>:<slug> tags to related memories

Temporal scoring boosts memories that align with time-based query constraints. The system supports both absolute time ranges and relative expressions.

Supported time expressions:

  • Relative: last hour, last day, last week, last month, last year
  • Relative: this hour, this day, this week, this month, this year
  • Relative: next hour, next day, next week, next month, next year
  • Absolute: before 2025-02-01, after 2025-01-15
  • Range: between 2025-01-15 and 2025-01-20
  • Range: 2025-01-15 to 2025-01-20
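
A minimal sketch of resolving the relative expressions into an absolute window (the actual parser and its window boundaries may differ):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

RELATIVE_SPANS = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),   # approximation
    "year": timedelta(days=365),   # approximation
}

def resolve_relative(expression: str, now: Optional[datetime] = None) -> Optional[Tuple[datetime, datetime]]:
    """Turn e.g. 'last week' into a (start, end) pair, or None if unrecognized."""
    now = now or datetime.now(timezone.utc)
    parts = expression.lower().split()
    if len(parts) != 2 or parts[1] not in RELATIVE_SPANS:
        return None
    span = RELATIVE_SPANS[parts[1]]
    if parts[0] == "last":
        return now - span, now
    if parts[0] == "next":
        return now, now + span
    if parts[0] == "this":
        return now - span, now + span   # rough: a window centered on now
    return None
```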

Tag filters support both exact matching and prefix matching for hierarchical tag namespaces. The system normalizes tags to lowercase and computes prefixes for efficient filtering.

Tag prefix system: AutoMem automatically computes tag prefixes for efficient hierarchical filtering. For example, the tag "entity:person:sarah" generates prefixes: ["entity", "entity:person", "entity:person:sarah"].
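
For illustration, the prefix computation could look like this (an assumed implementation, not AutoMem's actual code):

```python
def tag_prefixes(tag: str) -> list:
    """'entity:person:sarah' -> ['entity', 'entity:person', 'entity:person:sarah']."""
    parts = tag.lower().split(":")
    return [":".join(parts[:i + 1]) for i in range(len(parts))]
```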

Example queries:

tags=slack&tag_match=prefix
→ Matches: slack:*, slack:U123:*, slack:channel-ops
tags=entity:person&tag_match=prefix&tag_mode=all
→ Matches: entity:person:*, entity:person:sarah, entity:person:john
tags=project&tags=decision&tag_mode=all
→ Matches: Memories tagged with both "project" AND "decision"

Three metadata fields contribute to the final score: importance, confidence, and recency.

Recency calculation: Recency uses an exponential decay function based on the time since last access (or creation if never accessed). Newer memories receive higher scores.

Default behavior:

  • Missing importance defaults to 0.5
  • Missing confidence defaults to 0.7
  • Missing last_accessed falls back to timestamp
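
A hedged sketch of the recency decay; the half-life constant is an illustrative assumption rather than AutoMem's actual parameter:

```python
import math
from datetime import datetime, timezone
from typing import Optional

def recency_score(
    last_accessed: Optional[datetime],
    timestamp: datetime,
    half_life_days: float = 30.0,      # assumed decay constant
    now: Optional[datetime] = None,
) -> float:
    """Return a 0-1 score that halves every half_life_days since last access."""
    now = now or datetime.now(timezone.utc)
    reference = last_accessed or timestamp      # fall back to creation time
    age_days = max((now - reference).total_seconds() / 86400.0, 0.0)
    return math.exp(-math.log(2) * age_days / half_life_days)
```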

The final score for each memory result combines nine weighted components. The weights are configurable via environment variables.

final_score =
vector_similarity × SEARCH_WEIGHT_VECTOR (default: 0.25)
+ keyword_score × SEARCH_WEIGHT_KEYWORD (default: 0.15)
+ relation_strength × 0.25 (hardcoded)
+ content_overlap × 0.25 (hardcoded)
+ temporal_alignment × SEARCH_WEIGHT_TEMPORAL (default: 0.15)
+ tag_match_score × SEARCH_WEIGHT_TAG (default: 0.10)
+ importance × SEARCH_WEIGHT_IMPORTANCE (default: 0.05)
+ confidence × SEARCH_WEIGHT_CONFIDENCE (default: 0.05)
+ recency_score × SEARCH_WEIGHT_RECENCY (default: 0.10)

| Component | Default Weight | Configurable | Description |
|---|---|---|---|
| Vector | 25% | SEARCH_WEIGHT_VECTOR | Semantic similarity from Qdrant |
| Keyword | 15% | SEARCH_WEIGHT_KEYWORD | Lexical matching score |
| Relation | 25% | No | Graph relationship strength |
| Content | 25% | No | Direct token overlap |
| Temporal | 15% | SEARCH_WEIGHT_TEMPORAL | Time alignment |
| Tag | 10% | SEARCH_WEIGHT_TAG | Tag filter matching |
| Importance | 5% | SEARCH_WEIGHT_IMPORTANCE | User-assigned priority |
| Confidence | 5% | SEARCH_WEIGHT_CONFIDENCE | Classification certainty |
| Recency | 10% | SEARCH_WEIGHT_RECENCY | Time-based decay |
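
A minimal sketch of that combination, reading the configurable weights from the environment with the defaults above (the surrounding plumbing is illustrative):

```python
import os

def _weight(name: str, default: float) -> float:
    return float(os.environ.get(name, default))

def final_score(c: dict) -> float:
    """c maps each of the nine component names to a 0-1 signal."""
    return (
        c.get("vector", 0.0) * _weight("SEARCH_WEIGHT_VECTOR", 0.25)
        + c.get("keyword", 0.0) * _weight("SEARCH_WEIGHT_KEYWORD", 0.15)
        + c.get("relation", 0.0) * 0.25                                   # hardcoded
        + c.get("content", 0.0) * 0.25                                    # hardcoded
        + c.get("temporal", 0.0) * _weight("SEARCH_WEIGHT_TEMPORAL", 0.15)
        + c.get("tag", 0.0) * _weight("SEARCH_WEIGHT_TAG", 0.10)
        + c.get("importance", 0.5) * _weight("SEARCH_WEIGHT_IMPORTANCE", 0.05)
        + c.get("confidence", 0.7) * _weight("SEARCH_WEIGHT_CONFIDENCE", 0.05)
        + c.get("recency", 0.0) * _weight("SEARCH_WEIGHT_RECENCY", 0.10)
    )
```
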
flowchart TD
    subgraph sources ["Data Source Scores"]
        VS["Vector Search<br/>Qdrant similarity<br/>0.0 - 1.0"]
        KS["Keyword Search<br/>TF-IDF score<br/>Normalized"]
        GS["Graph Score<br/>Importance fallback"]
    end

    subgraph weights ["Weight Application"]
        VW["× SEARCH_WEIGHT_VECTOR<br/>0.25"]
        KW["× SEARCH_WEIGHT_KEYWORD<br/>0.15"]
        RW["× relation_strength<br/>0.25"]
        CW["× content_overlap<br/>0.25"]
        TW["× SEARCH_WEIGHT_TEMPORAL<br/>0.15"]
        TagW["× SEARCH_WEIGHT_TAG<br/>0.10"]
        IW["× SEARCH_WEIGHT_IMPORTANCE<br/>0.05"]
        ConfW["× SEARCH_WEIGHT_CONFIDENCE<br/>0.05"]
        RecW["× SEARCH_WEIGHT_RECENCY<br/>0.10"]
    end

    subgraph combination ["Score Combination"]
        Sum["SUM all weighted<br/>components"]
        Normalize["Normalize to 0.0 - 1.0"]
    end

    subgraph output ["Final Ranking"]
        Sort["Sort by final_score DESC"]
        Dedup["Deduplicate by memory_id<br/>using seen_ids set"]
        Limit["Apply limit parameter"]
    end

    Results["Ranked results array"]

    VS --> VW
    KS --> KW
    GS --> RW
    VS --> CW
    VS --> TW
    VS --> TagW
    VS --> IW
    VS --> ConfW
    VS --> RecW

    VW --> Sum
    KW --> Sum
    RW --> Sum
    CW --> Sum
    TW --> Sum
    TagW --> Sum
    IW --> Sum
    ConfW --> Sum
    RecW --> Sum

    Sum --> Normalize
    Normalize --> Sort
    Sort --> Dedup
    Dedup --> Limit
    Limit --> Results

Multi-hop reasoning enables AutoMem to find memories that are indirectly related to the query through intermediate connections.

Bridge discovery identifies memories that connect multiple seed results, revealing hidden relationships.

Configuration:

  • expand_relations=true — Enable relation expansion (default: true)
  • expand_min_strength — Minimum relationship strength (0.0-1.0)
  • expand_min_importance — Minimum target memory importance (0.0-1.0)
  • RECALL_EXPANSION_LIMIT — Maximum expanded results (default: 25)

Bridge scoring: A bridge memory’s score is the sum of its relationship strengths to all seed memories. Higher scores indicate stronger connections.
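
An illustrative computation of that bridge score (the data shapes are assumptions):

```python
def bridge_scores(seed_ids: set, edges: list) -> dict:
    """edges holds (source_id, target_id, strength) tuples between memories."""
    scores = {}
    for source, target, strength in edges:
        # Only count edges from a seed memory to a non-seed candidate.
        if source in seed_ids and target not in seed_ids:
            scores[target] = scores.get(target, 0.0) + strength
    return scores

# A memory linked to two seeds with strengths 0.8 and 0.6 scores 1.4.
```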

Entity expansion follows entity tags to find related memories. This enables queries like “What is Sarah’s sister’s job?” to work across multiple memory hops.

How it works:

  1. Execute initial recall to get seed results
  2. For each seed result, extract entities using extract_entities()
  3. Convert extracted entities to entity:<type>:<slug> tags
  4. Query FalkorDB/Qdrant for memories with matching entity tags
  5. Merge entity-expanded results with seed results
  6. Apply expansion limits and filters
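
A hedged sketch of those six steps; extract_entities() is named in the text, but the other helpers, their signatures, and the data shapes are assumptions:

```python
def expand_by_entities(seed_results, extract_entities, search_by_tags, limit=25):
    """Follow entity:<type>:<slug> tags from seed results to related memories."""
    entity_tags = set()
    for memory in seed_results:                                            # step 2
        for entity in extract_entities(memory["content"]):
            entity_tags.add(f"entity:{entity['type']}:{entity['slug']}")   # step 3

    expanded = search_by_tags(sorted(entity_tags))                         # step 4

    seen = {m["id"] for m in seed_results}
    merged = list(seed_results)                                            # step 5
    for memory in expanded:
        if memory["id"] not in seen:
            seen.add(memory["id"])
            merged.append(memory)
        if len(merged) >= len(seed_results) + limit:                       # step 6
            break
    return merged
```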

Configuration:

  • expand_entities=true — Enable entity expansion (default: false)
  • entity_expansion=true — Alias for expand_entities
  • Entity expansion respects original tag filters for context scoping

Entity types extracted:

  • entity:person:<name> — People mentioned
  • entity:tool:<name> — Tools and technologies
  • entity:project:<name> — Projects and repositories
  • entity:concept:<name> — Concepts and ideas
  • entity:organization:<name> — Organizations

The recall endpoint orchestrates the entire hybrid search process.

Key decision points:

  1. Embedding generation: Skip if explicit embedding parameter provided
  2. Vector vs Keyword: Vector search requires Qdrant; keyword search always available
  3. Trending fallback: If query is empty or "*", use _graph_trending_results()
  4. Expansion order: Bridges first, then entity expansion
  5. Filter application: Seed results never filtered; expanded results respect min thresholds

| Variable | Default | Description |
|---|---|---|
| SEARCH_WEIGHT_VECTOR | 0.25 | Vector similarity contribution |
| SEARCH_WEIGHT_KEYWORD | 0.15 | Keyword match contribution |
| SEARCH_WEIGHT_TEMPORAL | 0.15 | Temporal alignment contribution |
| SEARCH_WEIGHT_TAG | 0.10 | Tag match contribution |
| SEARCH_WEIGHT_IMPORTANCE | 0.05 | Importance field contribution |
| SEARCH_WEIGHT_CONFIDENCE | 0.05 | Confidence field contribution |
| SEARCH_WEIGHT_RECENCY | 0.10 | Recency decay contribution |
| SEARCH_WEIGHT_EXACT | 0.10 | Exact phrase match boost |

| Variable | Default | Description |
|---|---|---|
| RECALL_MAX_LIMIT | 100 | Maximum results per recall request |
| RECALL_EXPANSION_LIMIT | 25 | Maximum expanded results (bridges + entities) |
| RECALL_RELATION_LIMIT | 10 | Maximum relations fetched per memory |

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | | Search query text |
| embedding | float[] | | Pre-computed embedding vector |
| tags | string[] | | Tag filters (comma-separated) |
| tag_mode | enum | "any" | Match mode: "any" or "all" |
| tag_match | enum | "prefix" | Match type: "prefix" or "exact" |
| exclude_tags | string[] | | Tags to exclude |
| time_query | string | | Temporal expression or ISO range |
| expand_relations | boolean | true | Enable bridge discovery |
| expand_entities | boolean | false | Enable entity expansion |
| expand_min_strength | float | | Minimum relation strength filter |
| expand_min_importance | float | | Minimum target importance filter |
| limit | integer | 20 | Maximum seed results |
| sort | enum | "score" | Sort order: "score", "time_asc", "time_desc", "updated_asc", "updated_desc" |
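
An example request using these parameters (the host, port, and exact response shape are assumptions for illustration):

```python
import requests

resp = requests.get(
    "http://localhost:8001/recall",
    params={
        "query": "What did Sarah decide about the migration?",
        "tags": "entity:person:sarah,decision",
        "tag_match": "prefix",
        "tag_mode": "any",
        "expand_relations": "true",
        "expand_entities": "true",
        "limit": 10,
        "sort": "score",
    },
    timeout=10,
)
for result in resp.json().get("results", []):
    print(result.get("score"), str(result.get("content", ""))[:80])
```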

AutoMem achieves 90.53% accuracy on the LoCoMo benchmark with the following category breakdown:

| Category | Accuracy | Notes |
|---|---|---|
| Complex Reasoning | 100.00% | Perfect score on multi-step queries |
| Open Domain | 95.84% | General knowledge recall |
| Temporal Understanding | 85.05% | Time-aware queries |
| Single-hop Recall | 79.79% | Basic fact retrieval |
| Multi-hop Reasoning | 50.00% | Bridge discovery (+12.5pp over baseline) |

Typical response times for different query patterns:

| Query Type | Typical Latency | Notes |
|---|---|---|
| Vector-only (Qdrant) | 20-50 ms | Semantic similarity only |
| Keyword-only (FalkorDB) | 30-80 ms | Graph keyword search |
| Hybrid (both stores) | 50-150 ms | Combined vector + keyword |
| With bridge expansion | 100-300 ms | Includes multi-hop traversal |
| With entity expansion | 150-400 ms | Includes entity tag queries |

Optimization tips:

  • Use the explicit embedding parameter to skip generation (saves 200-500ms); see the sketch below
  • Set tight expand_min_strength filters to reduce expansion overhead
  • Use limit parameter to reduce result set size
  • Enable Qdrant for semantic search; fallback to keyword-only is slower but functional
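
As a sketch of the first tip, the embedding can be computed once on the client and passed explicitly so the server skips generation; the serialization of the vector and the endpoint details are assumptions:

```python
import json
import requests

def recall_with_embedding(base_url: str, query: str, embed) -> list:
    """Recall with a client-side embedding so the server does not re-embed the query."""
    vector = embed(query)                                  # computed once on the client
    resp = requests.get(
        f"{base_url}/recall",
        params={
            "query": query,
            "embedding": json.dumps(vector),               # assumed serialization of float[]
            "limit": 10,
        },
        timeout=10,
    )
    return resp.json().get("results", [])
```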