Hybrid Search

This document explains AutoMem’s hybrid search system, which combines semantic, lexical, graph, temporal, and metadata signals to retrieve and rank memories. The system implements a 9-component scoring algorithm that achieves 90.53% accuracy on the LoCoMo benchmark.

For information about the memory structure being searched, see Memory Model. For details on relationship types used in graph traversal, see Relationship Types. For API usage, see Recall Operations.


AutoMem’s recall system combines results from two complementary data stores and applies multi-dimensional scoring to rank results by relevance:

sequenceDiagram
    participant Client
    participant API as "/recall endpoint"
    participant Qdrant as "Qdrant Vector DB"
    participant Falkor as "FalkorDB Graph"
    participant Scorer as "_compute_metadata_score()"

    Client->>API: "GET /recall?query=X&tags=Y"

    par Vector Search
        API->>Qdrant: "search(embedding, limit)"
        Qdrant-->>API: "Similar vectors with scores"
    and Keyword Search
        API->>Falkor: "MATCH WHERE content CONTAINS X"
        Falkor-->>API: "Keyword matches with scores"
    end

    API->>API: "Merge and deduplicate results"

    loop For each result
        API->>Scorer: "compute score with all factors"
        Scorer-->>API: "Weighted final score"
    end

    API->>API: "Sort by final score DESC"
    API->>Falkor: "Fetch relationships for top results"
    Falkor-->>API: "Related memory edges"

    API-->>Client: "Ranked results with context"

graph TB
    Query["Search Query"]

    subgraph "Search Strategies"
        Vector["Vector Similarity<br/>Qdrant cosine search<br/>Weight: 0.35"]
        Keyword["Keyword Match<br/>FalkorDB text search<br/>Weight: 0.35"]
        Tag["Tag Overlap<br/>Prefix/exact matching<br/>Weight: 0.15"]
        Importance["Importance Score<br/>User-assigned value<br/>Weight: 0.10"]
        Confidence["Confidence Score<br/>Classification confidence<br/>Weight: 0.05"]
        Recency["Recency Score<br/>Time-based decay<br/>Weight: 0.10"]
        Exact["Exact Match<br/>Query in metadata<br/>Weight: 0.15"]
    end

    Result["Final Weighted Score"]

    Query --> Vector
    Query --> Keyword
    Query --> Tag
    Query --> Importance
    Query --> Confidence
    Query --> Recency
    Query --> Exact

    Vector --> Result
    Keyword --> Result
    Tag --> Result
    Importance --> Result
    Confidence --> Result
    Recency --> Result
    Exact --> Result

Vector search uses Qdrant to find memories with semantically similar embeddings. Query text is converted to a vector using the configured embedding provider, then a cosine similarity search retrieves the top candidates.

The _vector_search() function handles both explicit embeddings (passed by the client) and on-demand generation. If Qdrant is unavailable, the system degrades gracefully to graph-only mode.

Key behaviors:

  • Returns empty list if no query text and no explicit embedding provided
  • Applies tag filters via _build_qdrant_tag_filter()
  • Deduplicates results using seen_ids set
  • Attaches relations by calling _fetch_relations() for each hit
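
The behaviors above can be sketched roughly as follows; the function signature, the client object, and the hit shape are illustrative assumptions, not AutoMem's actual _vector_search() implementation:

```python
from typing import Any, Callable, Optional, Sequence

def vector_search(
    client: Any,                             # assumed Qdrant-like client exposing a search() method
    embed: Callable[[str], list],            # configured embedding provider
    query: str = "",
    embedding: Optional[Sequence[float]] = None,
    tag_filter: Optional[Any] = None,        # e.g. the output of a _build_qdrant_tag_filter() helper
    limit: int = 20,
) -> list:
    """Return candidate memories ranked by cosine similarity (hypothetical sketch)."""
    # No query text and no explicit embedding: nothing to search with.
    if embedding is None and not query.strip():
        return []

    # Prefer the client-supplied embedding; otherwise generate one on demand.
    vector = list(embedding) if embedding is not None else embed(query)
    hits = client.search(vector=vector, query_filter=tag_filter, limit=limit)

    # Deduplicate by memory id, mirroring the seen_ids set described above.
    seen_ids = set()
    results = []
    for hit in hits:
        if hit["id"] in seen_ids:
            continue
        seen_ids.add(hit["id"])
        results.append({"id": hit["id"], "score": hit["score"], "payload": hit.get("payload", {})})
    return results
```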

Keyword search performs text matching in FalkorDB’s graph store using Cypher queries. The system extracts keywords from the query, searches content and tags, and assigns scores based on match frequency.

The keyword scoring algorithm rewards both individual keyword matches and phrase matches:

| Match Type | Location | Score |
|---|---|---|
| Single keyword | content field | +2 |
| Single keyword | tags array | +1 |
| Full phrase | content field | +2 |
| Full phrase | tags array | +1 |

Fallback behavior: If the normalized query is empty or "*", the system calls _graph_trending_results() to return high-importance memories sorted by the sort parameter (time_asc, time_desc, updated_asc, updated_desc, or default importance ordering).
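
A rough re-implementation of the scoring table above (tokenization and the function name are assumptions; the real matching runs as Cypher inside FalkorDB):

```python
def keyword_score(query: str, content: str, tags: list) -> int:
    """Score a memory against a query using the match table above (illustrative)."""
    content_lc = content.lower()
    tags_lc = [t.lower() for t in tags]
    keywords = query.lower().split()

    score = 0
    for kw in keywords:
        if kw in content_lc:
            score += 2                            # single keyword in content: +2
        if any(kw in tag for tag in tags_lc):
            score += 1                            # single keyword in tags: +1

    phrase = query.lower().strip()
    if len(keywords) > 1:
        if phrase in content_lc:
            score += 2                            # full phrase in content: +2
        if any(phrase in tag for tag in tags_lc):
            score += 1                            # full phrase in tags: +1
    return score
```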

Graph traversal leverages FalkorDB’s typed relationship edges to find connected memories. This enables multi-hop reasoning and bridge discovery.

The _fetch_relations() helper queries all relationships for a given memory and returns a list of related memory summaries. Each relationship includes:

  • Target memory ID and summary
  • Relationship type (e.g., "PREFERS_OVER")
  • Strength value (metadata field on relationship edge)
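
A hedged sketch of what such a helper might execute; the Cypher shape, property names, and the graph.query() call are assumptions based on the description above:

```python
RELATIONS_CYPHER = """
MATCH (m:Memory {id: $memory_id})-[r]->(other:Memory)
RETURN other.id AS id, other.summary AS summary, type(r) AS rel_type, r.strength AS strength
"""

def fetch_relations(graph, memory_id: str, limit: int = 10) -> list:
    """Return single-hop related-memory summaries for a seed memory (illustrative)."""
    result = graph.query(RELATIONS_CYPHER, {"memory_id": memory_id})
    return [
        {"id": row[0], "summary": row[1], "type": row[2], "strength": row[3]}
        for row in result.result_set[:limit]
    ]
```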

Multi-hop patterns:

  1. Direct relations — Single-hop from seed memory
  2. Bridges — Memories that connect two or more seed results
  3. Entity expansion — Following entity:<type>:<slug> tags to related memories

Temporal scoring boosts memories that align with time-based query constraints. The system supports both absolute time ranges and relative expressions.

Supported time expressions:

  • Relative: last hour, last day, last week, last month, last year
  • Relative: this hour, this day, this week, this month, this year
  • Relative: next hour, next day, next week, next month, next year
  • Absolute: before 2025-02-01, after 2025-01-15
  • Range: between 2025-01-15 and 2025-01-20
  • Range: 2025-01-15 to 2025-01-20
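
A minimal sketch of resolving the relative expressions into an absolute window (the actual parser and its window boundaries may differ):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

RELATIVE_SPANS = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),   # approximation
    "year": timedelta(days=365),   # approximation
}

def resolve_relative(expression: str, now: Optional[datetime] = None) -> Optional[Tuple[datetime, datetime]]:
    """Turn e.g. 'last week' into a (start, end) pair, or None if unrecognized."""
    now = now or datetime.now(timezone.utc)
    parts = expression.lower().split()
    if len(parts) != 2 or parts[1] not in RELATIVE_SPANS:
        return None
    span = RELATIVE_SPANS[parts[1]]
    if parts[0] == "last":
        return now - span, now
    if parts[0] == "next":
        return now, now + span
    if parts[0] == "this":
        return now - span, now + span   # rough: a window centered on now
    return None
```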

Tag filters support both exact matching and prefix matching for hierarchical tag namespaces. The system normalizes tags to lowercase and computes prefixes for efficient filtering.

Tag prefix system: AutoMem automatically computes tag prefixes for efficient hierarchical filtering. For example, the tag "entity:person:sarah" generates prefixes: ["entity", "entity:person", "entity:person:sarah"].
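
For illustration, the prefix computation could look like this (an assumed implementation, not AutoMem's actual code):

```python
def tag_prefixes(tag: str) -> list:
    """'entity:person:sarah' -> ['entity', 'entity:person', 'entity:person:sarah']."""
    parts = tag.lower().split(":")
    return [":".join(parts[:i + 1]) for i in range(len(parts))]
```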

Example queries:

tags=slack&tag_match=prefix
→ Matches: slack:*, slack:U123:*, slack:channel-ops
tags=entity:person&tag_match=prefix&tag_mode=all
→ Matches: entity:person:*, entity:person:sarah, entity:person:john
tags=project&tags=decision&tag_mode=all
→ Matches: Memories tagged with both "project" AND "decision"

Three metadata fields contribute to the final score: importance, confidence, and recency.

Recency calculation: Recency uses an exponential decay function based on the time since last access (or creation if never accessed). Newer memories receive higher scores.

Default behavior:

  • Missing importance defaults to 0.5
  • Missing confidence defaults to 0.7
  • Missing last_accessed falls back to timestamp
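
A hedged sketch of the recency decay; the half-life constant is an illustrative assumption rather than AutoMem's actual parameter:

```python
import math
from datetime import datetime, timezone
from typing import Optional

def recency_score(
    last_accessed: Optional[datetime],
    timestamp: datetime,
    half_life_days: float = 30.0,      # assumed decay constant
    now: Optional[datetime] = None,
) -> float:
    """Return a 0-1 score that halves every half_life_days since last access."""
    now = now or datetime.now(timezone.utc)
    reference = last_accessed or timestamp      # fall back to creation time
    age_days = max((now - reference).total_seconds() / 86400.0, 0.0)
    return math.exp(-math.log(2) * age_days / half_life_days)
```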

The final score for each memory result combines nine weighted components. The weights are configurable via environment variables.

final_score =
vector_similarity × SEARCH_WEIGHT_VECTOR (default: 0.25)
+ keyword_score × SEARCH_WEIGHT_KEYWORD (default: 0.15)
+ relation_strength × 0.25 (hardcoded)
+ content_overlap × 0.25 (hardcoded)
+ temporal_alignment × SEARCH_WEIGHT_TEMPORAL (default: 0.15)
+ tag_match_score × SEARCH_WEIGHT_TAG (default: 0.10)
+ importance × SEARCH_WEIGHT_IMPORTANCE (default: 0.05)
+ confidence × SEARCH_WEIGHT_CONFIDENCE (default: 0.05)
+ recency_score × SEARCH_WEIGHT_RECENCY (default: 0.10)

| Component | Default Weight | Configurable | Description |
|---|---|---|---|
| Vector | 25% | SEARCH_WEIGHT_VECTOR | Semantic similarity from Qdrant |
| Keyword | 15% | SEARCH_WEIGHT_KEYWORD | Lexical matching score |
| Relation | 25% | No | Graph relationship strength |
| Content | 25% | No | Direct token overlap |
| Temporal | 15% | SEARCH_WEIGHT_TEMPORAL | Time alignment |
| Tag | 10% | SEARCH_WEIGHT_TAG | Tag filter matching |
| Importance | 5% | SEARCH_WEIGHT_IMPORTANCE | User-assigned priority |
| Confidence | 5% | SEARCH_WEIGHT_CONFIDENCE | Classification certainty |
| Recency | 10% | SEARCH_WEIGHT_RECENCY | Time-based decay |
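
A minimal sketch of that combination, reading the configurable weights from the environment with the defaults above (the surrounding plumbing is illustrative):

```python
import os

def _weight(name: str, default: float) -> float:
    return float(os.environ.get(name, default))

def final_score(c: dict) -> float:
    """c maps each of the nine component names to a 0-1 signal."""
    return (
        c.get("vector", 0.0) * _weight("SEARCH_WEIGHT_VECTOR", 0.25)
        + c.get("keyword", 0.0) * _weight("SEARCH_WEIGHT_KEYWORD", 0.15)
        + c.get("relation", 0.0) * 0.25                                   # hardcoded
        + c.get("content", 0.0) * 0.25                                    # hardcoded
        + c.get("temporal", 0.0) * _weight("SEARCH_WEIGHT_TEMPORAL", 0.15)
        + c.get("tag", 0.0) * _weight("SEARCH_WEIGHT_TAG", 0.10)
        + c.get("importance", 0.5) * _weight("SEARCH_WEIGHT_IMPORTANCE", 0.05)
        + c.get("confidence", 0.7) * _weight("SEARCH_WEIGHT_CONFIDENCE", 0.05)
        + c.get("recency", 0.0) * _weight("SEARCH_WEIGHT_RECENCY", 0.10)
    )
```
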
flowchart TD
    subgraph sources ["Data Source Scores"]
        VS["Vector Search<br/>Qdrant similarity<br/>0.0 - 1.0"]
        KS["Keyword Search<br/>TF-IDF score<br/>Normalized"]
        GS["Graph Score<br/>Importance fallback"]
    end

    subgraph weights ["Weight Application"]
        VW["× SEARCH_WEIGHT_VECTOR<br/>0.25"]
        KW["× SEARCH_WEIGHT_KEYWORD<br/>0.15"]
        RW["× relation_strength<br/>0.25"]
        CW["× content_overlap<br/>0.25"]
        TW["× SEARCH_WEIGHT_TEMPORAL<br/>0.15"]
        TagW["× SEARCH_WEIGHT_TAG<br/>0.10"]
        IW["× SEARCH_WEIGHT_IMPORTANCE<br/>0.05"]
        ConfW["× SEARCH_WEIGHT_CONFIDENCE<br/>0.05"]
        RecW["× SEARCH_WEIGHT_RECENCY<br/>0.10"]
    end

    subgraph combination ["Score Combination"]
        Sum["SUM all weighted<br/>components"]
        Normalize["Normalize to 0.0 - 1.0"]
    end

    subgraph output ["Final Ranking"]
        Sort["Sort by final_score DESC"]
        Dedup["Deduplicate by memory_id<br/>using seen_ids set"]
        Limit["Apply limit parameter"]
    end

    Results["Ranked results array"]

    VS --> VW
    KS --> KW
    GS --> RW
    VS --> CW
    VS --> TW
    VS --> TagW
    VS --> IW
    VS --> ConfW
    VS --> RecW

    VW --> Sum
    KW --> Sum
    RW --> Sum
    CW --> Sum
    TW --> Sum
    TagW --> Sum
    IW --> Sum
    ConfW --> Sum
    RecW --> Sum

    Sum --> Normalize
    Normalize --> Sort
    Sort --> Dedup
    Dedup --> Limit
    Limit --> Results

Multi-hop reasoning enables AutoMem to find memories that are indirectly related to the query through intermediate connections.

Bridge discovery identifies memories that connect multiple seed results, revealing hidden relationships.

Configuration:

  • expand_relations=true — Enable relation expansion (default: true)
  • expand_min_strength — Minimum relationship strength (0.0-1.0)
  • expand_min_importance — Minimum target memory importance (0.0-1.0)
  • RECALL_EXPANSION_LIMIT — Maximum expanded results (default: 25)

Bridge scoring: A bridge memory’s score is the sum of its relationship strengths to all seed memories. Higher scores indicate stronger connections.
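
An illustrative computation of that bridge score (the data shapes are assumptions):

```python
def bridge_scores(seed_ids: set, edges: list) -> dict:
    """edges holds (source_id, target_id, strength) tuples between memories."""
    scores = {}
    for source, target, strength in edges:
        # Only count edges from a seed memory to a non-seed candidate.
        if source in seed_ids and target not in seed_ids:
            scores[target] = scores.get(target, 0.0) + strength
    return scores

# A memory linked to two seeds with strengths 0.8 and 0.6 scores 1.4.
```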

Entity expansion follows entity tags to find related memories. This enables queries like “What is Sarah’s sister’s job?” to work across multiple memory hops.

How it works:

  1. Execute initial recall to get seed results
  2. For each seed result, extract entities using extract_entities()
  3. Convert extracted entities to entity:<type>:<slug> tags
  4. Query FalkorDB/Qdrant for memories with matching entity tags
  5. Merge entity-expanded results with seed results
  6. Apply expansion limits and filters
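
A hedged sketch of those six steps; extract_entities() is named in the text, but the other helpers, their signatures, and the data shapes are assumptions:

```python
def expand_by_entities(seed_results, extract_entities, search_by_tags, limit=25):
    """Follow entity:<type>:<slug> tags from seed results to related memories."""
    entity_tags = set()
    for memory in seed_results:                                            # step 2
        for entity in extract_entities(memory["content"]):
            entity_tags.add(f"entity:{entity['type']}:{entity['slug']}")   # step 3

    expanded = search_by_tags(sorted(entity_tags))                         # step 4

    seen = {m["id"] for m in seed_results}
    merged = list(seed_results)                                            # step 5
    for memory in expanded:
        if memory["id"] not in seen:
            seen.add(memory["id"])
            merged.append(memory)
        if len(merged) >= len(seed_results) + limit:                       # step 6
            break
    return merged
```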

Configuration:

  • expand_entities=true — Enable entity expansion (default: false)
  • entity_expansion=true — Alias for expand_entities
  • Entity expansion respects original tag filters for context scoping

Entity types extracted:

  • entity:person:<name> — People mentioned
  • entity:tool:<name> — Tools and technologies
  • entity:project:<name> — Projects and repositories
  • entity:concept:<name> — Concepts and ideas
  • entity:organization:<name> — Organizations

The recall endpoint orchestrates the entire hybrid search process.

Key decision points:

  1. Embedding generation: Skip if explicit embedding parameter provided
  2. Vector vs Keyword: Vector search requires Qdrant; keyword search always available
  3. Trending fallback: If query is empty or "*", use _graph_trending_results()
  4. Expansion order: Bridges first, then entity expansion
  5. Filter application: Seed results never filtered; expanded results respect min thresholds

| Variable | Default | Description |
|---|---|---|
| SEARCH_WEIGHT_VECTOR | 0.25 | Vector similarity contribution |
| SEARCH_WEIGHT_KEYWORD | 0.15 | Keyword match contribution |
| SEARCH_WEIGHT_TEMPORAL | 0.15 | Temporal alignment contribution |
| SEARCH_WEIGHT_TAG | 0.10 | Tag match contribution |
| SEARCH_WEIGHT_IMPORTANCE | 0.05 | Importance field contribution |
| SEARCH_WEIGHT_CONFIDENCE | 0.05 | Confidence field contribution |
| SEARCH_WEIGHT_RECENCY | 0.10 | Recency decay contribution |
| SEARCH_WEIGHT_EXACT | 0.10 | Exact phrase match boost |

| Variable | Default | Description |
|---|---|---|
| RECALL_MAX_LIMIT | 100 | Maximum results per recall request |
| RECALL_EXPANSION_LIMIT | 25 | Maximum expanded results (bridges + entities) |
| RECALL_RELATION_LIMIT | 10 | Maximum relations fetched per memory |

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | | Search query text |
| embedding | float[] | | Pre-computed embedding vector |
| tags | string[] | | Tag filters (comma-separated) |
| tag_mode | enum | "any" | Match mode: "any" or "all" |
| tag_match | enum | "prefix" | Match type: "prefix" or "exact" |
| exclude_tags | string[] | | Tags to exclude |
| time_query | string | | Temporal expression or ISO range |
| expand_relations | boolean | true | Enable bridge discovery |
| expand_entities | boolean | false | Enable entity expansion |
| expand_min_strength | float | | Minimum relation strength filter |
| expand_min_importance | float | | Minimum target importance filter |
| limit | integer | 20 | Maximum seed results |
| sort | enum | "score" | Sort order: "score", "time_asc", "time_desc", "updated_asc", "updated_desc" |
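
An example request using these parameters (the host, port, and exact response shape are assumptions for illustration):

```python
import requests

resp = requests.get(
    "http://localhost:8001/recall",
    params={
        "query": "What did Sarah decide about the migration?",
        "tags": "entity:person:sarah,decision",
        "tag_match": "prefix",
        "tag_mode": "any",
        "expand_relations": "true",
        "expand_entities": "true",
        "limit": 10,
        "sort": "score",
    },
    timeout=10,
)
for result in resp.json().get("results", []):
    print(result.get("score"), str(result.get("content", ""))[:80])
```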

AutoMem achieves 90.53% accuracy on the LoCoMo benchmark with the following category breakdown:

| Category | Accuracy | Notes |
|---|---|---|
| Complex Reasoning | 100.00% | Perfect score on multi-step queries |
| Open Domain | 95.84% | General knowledge recall |
| Temporal Understanding | 85.05% | Time-aware queries |
| Single-hop Recall | 79.79% | Basic fact retrieval |
| Multi-hop Reasoning | 50.00% | Bridge discovery (+12.5pp over baseline) |

Typical response times for different query patterns:

| Query Type | Typical Latency | Notes |
|---|---|---|
| Vector-only (Qdrant) | 20-50 ms | Semantic similarity only |
| Keyword-only (FalkorDB) | 30-80 ms | Graph keyword search |
| Hybrid (both stores) | 50-150 ms | Combined vector + keyword |
| With bridge expansion | 100-300 ms | Includes multi-hop traversal |
| With entity expansion | 150-400 ms | Includes entity tag queries |

Optimization tips:

  • Use the explicit embedding parameter to skip generation (saves 200-500ms); see the sketch below
  • Set tight expand_min_strength filters to reduce expansion overhead
  • Use limit parameter to reduce result set size
  • Enable Qdrant for semantic search; fallback to keyword-only is slower but functional
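
As a sketch of the first tip, the embedding can be computed once on the client and passed explicitly so the server skips generation; the serialization of the vector and the endpoint details are assumptions:

```python
import json
import requests

def recall_with_embedding(base_url: str, query: str, embed) -> list:
    """Recall with a client-side embedding so the server does not re-embed the query."""
    vector = embed(query)                                  # computed once on the client
    resp = requests.get(
        f"{base_url}/recall",
        params={
            "query": query,
            "embedding": json.dumps(vector),               # assumed serialization of float[]
            "limit": 10,
        },
        timeout=10,
    )
    return resp.json().get("results", [])
```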