Replies: 5 comments
---
Metadata for extracted memories is essential for production use. Here's a comprehensive approach:

**Metadata Schema Design**

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional
from enum import Enum


class MemorySource(Enum):
    USER_EXPLICIT = "user_explicit"  # User directly stated
    USER_IMPLICIT = "user_implicit"  # Inferred from behavior
    AGENT_DERIVED = "agent_derived"  # Agent concluded
    EXTERNAL_API = "external_api"    # From an external source


class ConfidenceLevel(Enum):
    HIGH = "high"      # Directly stated
    MEDIUM = "medium"  # Reasonably inferred
    LOW = "low"        # Speculative


@dataclass
class MemoryMetadata:
    # Provenance
    source: MemorySource
    source_message_id: Optional[str] = None
    extraction_model: str = "gpt-4"

    # Confidence
    confidence: ConfidenceLevel = ConfidenceLevel.MEDIUM
    confidence_score: float = 0.8

    # Temporal
    created_at: datetime = field(default_factory=datetime.utcnow)
    valid_from: Optional[datetime] = None
    valid_until: Optional[datetime] = None  # For time-bound facts
    last_accessed: Optional[datetime] = None
    access_count: int = 0

    # Categorization
    category: str = "general"  # preference, fact, event, relationship
    tags: List[str] = field(default_factory=list)
    entities: List[str] = field(default_factory=list)

    # Versioning
    version: int = 1
    supersedes: Optional[str] = None      # ID of the memory this updates
    superseded_by: Optional[str] = None
```

**Usage in Retrieval**

```python
def smart_retrieve(query: str, user_id: str) -> List[Memory]:
    memories = mem0.search(query, user_id=user_id)

    # Filter and rank by metadata
    confidence_boost = {"high": 1.2, "medium": 1.0, "low": 0.8}
    scored = []
    for mem in memories:
        score = mem.score  # Base vector similarity

        # Boost recent memories
        age_days = (datetime.utcnow() - mem.metadata.created_at).days
        recency_boost = 1.0 / (1 + age_days * 0.1)

        # Penalize expired memories
        if mem.metadata.valid_until and mem.metadata.valid_until < datetime.utcnow():
            score *= 0.3

        final_score = score * recency_boost * confidence_boost[mem.metadata.confidence.value]
        scored.append((mem, final_score))

    # Sort by final score and return the memories themselves
    scored.sort(key=lambda x: x[1], reverse=True)
    return [mem for mem, _ in scored]
```

**Key Metadata to Track**
More on memory patterns: https://github.com/KeepALifeUS/autonomous-agents
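One practical detail the schema above glosses over: enums and datetimes aren't JSON-serializable, so the dataclass needs flattening before it can go into a vector-store payload. A minimal sketch (the trimmed-down classes and the `to_payload` helper are illustrative, not mem0 API):

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from enum import Enum
from typing import Any


class MemorySource(Enum):
    USER_EXPLICIT = "user_explicit"


class ConfidenceLevel(Enum):
    HIGH = "high"


@dataclass
class MemoryMetadata:
    source: MemorySource
    confidence: ConfidenceLevel = ConfidenceLevel.HIGH
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    tags: list = field(default_factory=list)


def to_payload(meta: MemoryMetadata) -> dict[str, Any]:
    """Flatten metadata into JSON-safe primitives for a vector-store payload."""
    def convert(v):
        if isinstance(v, Enum):
            return v.value          # store the enum's string value
        if isinstance(v, datetime):
            return v.isoformat()    # store datetimes as ISO-8601 strings
        return v
    return {k: convert(v) for k, v in asdict(meta).items()}


payload = to_payload(MemoryMetadata(source=MemorySource.USER_EXPLICIT))
# payload["source"] == "user_explicit"; payload["created_at"] is an ISO string
```

The same helper works for any field you add later, as long as new value types get a branch in `convert`.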
---
Big +1 on metadata support for memories! We've hit several use cases that needed this.

Implementation suggestion: we've implemented similar patterns at RevolutionAI for enterprise memory systems. The metadata layer is what makes memories actionable rather than just stored.
---
Metadata for extracted memories is essential for production use cases! Here is what has worked for us at RevolutionAI (https://revolutionai.io):

**Core metadata fields:**

```json
{
  "source": "conversation|document|tool_output",
  "timestamp": "ISO datetime",
  "confidence": "0.0-1.0",
  "user_id": "for multi-tenant",
  "session_id": "conversation context",
  "extraction_model": "gpt-4|llama3|etc",
  "tags": ["preference", "fact", "instruction"]
}
```

**Why each matters:**

Pro tip: add a `verified` boolean field. Letting users confirm or deny memories builds trust and improves quality over time.

What metadata fields are you finding most useful for your use case?
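To make the `verified` flag concrete, here is a minimal sketch of gating retrieval on it together with a confidence threshold (the memory-dict shape and the `filter_trusted` helper are illustrative, not mem0 API):

```python
def filter_trusted(memories, min_confidence=0.8, require_verified=False):
    """Keep memories suitable for high-stakes use: above a confidence
    threshold and, optionally, explicitly user-verified."""
    out = []
    for m in memories:
        meta = m.get("metadata", {})
        if meta.get("confidence", 0.0) < min_confidence:
            continue  # too speculative
        if require_verified and not meta.get("verified", False):
            continue  # user never confirmed this memory
        out.append(m)
    return out


mems = [
    {"text": "prefers dark mode", "metadata": {"confidence": 0.9, "verified": True}},
    {"text": "maybe vegetarian",  "metadata": {"confidence": 0.5, "verified": False}},
]
filter_trusted(mems, require_verified=True)  # keeps only the first memory
```

For routine retrieval you'd leave `require_verified=False` and reserve the strict mode for decisions where a wrong memory is costly.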
---
Metadata for extracted memories is crucial! At RevolutionAI (https://revolutionai.io) we heavily use metadata for filtering and retrieval.

**Our metadata schema:**

```python
class MemoryMetadata(BaseModel):
    source: str                  # conversation, document, observation
    confidence: float            # 0-1 extraction confidence
    timestamp: datetime
    context_id: str              # conversation/session ID
    tags: list[str]              # auto-extracted topics
    user_id: str                 # who this memory belongs to
    expires_at: datetime | None  # TTL for temporary info
```

**Use cases:**

```python
# Recent memories only
memories = mem0.search(
    query="meeting notes",
    filters={"timestamp": {"$gte": last_week}}
)

# Only use verified document memories, not conversation
memories = mem0.search(query, filters={"source": "document"})

# High-confidence memories only for critical decisions
memories = mem0.search(query, filters={"confidence": {"$gte": 0.8}})
```

Would love to see first-class metadata support in the extraction pipeline!
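The `expires_at` TTL field above also needs enforcement somewhere; if the store can't filter on it natively, a client-side sweep works. A sketch (the memory-dict shape and `drop_expired` helper are illustrative, not mem0 API):

```python
from datetime import datetime, timedelta, timezone


def drop_expired(memories, now=None):
    """Discard memories whose metadata expires_at (ISO-8601 string or None)
    lies in the past; None means the memory never expires."""
    now = now or datetime.now(timezone.utc)
    live = []
    for m in memories:
        exp = m.get("metadata", {}).get("expires_at")
        if exp is not None and datetime.fromisoformat(exp) <= now:
            continue  # TTL elapsed: skip this memory
        live.append(m)
    return live


now = datetime.now(timezone.utc)
mems = [
    {"text": "temporary promo code",
     "metadata": {"expires_at": (now - timedelta(days=1)).isoformat()}},
    {"text": "permanent preference", "metadata": {"expires_at": None}},
]
drop_expired(mems, now=now)  # keeps only the permanent preference
```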
---
Great use case! Metadata-rich memories are essential for educational apps.

**Current workaround: embed metadata in the text**

```python
import re

custom_prompt = """
Extract facts in this format:
[CATEGORY:progress|difficulty|misconception] [LEVEL:strong|medium|weak] fact text

Example: [CATEGORY:progress] [LEVEL:strong] Finished 50% of calculus chapter
"""

# Parse when retrieving
def parse_memory(text):
    match = re.match(r"\[CATEGORY:(\w+)\] \[LEVEL:(\w+)\] (.+)", text)
    if match is None:
        # Memory wasn't stored in the tagged format; return it untyped
        return {"category": None, "level": None, "text": text}
    return {
        "category": match.group(1),
        "level": match.group(2),
        "text": match.group(3),
    }
```

**Cleaner: post-process and store in the payload**

```python
class EnhancedMemory(Memory):
    def add(self, messages, user_id, **kwargs):
        # Extract with the custom prompt
        result = super().add(messages, user_id, **kwargs)

        # Parse and add metadata to the vector store payload
        for mem in result:
            parsed = self.extract_metadata(mem.text)
            self.vector_store.update_payload(
                mem.id,
                metadata=parsed
            )
        return result
```

**Ideal API (feature request):**

```python
memory.add(
    messages,
    user_id,
    extraction_schema={
        "text": str,
        "category": ["progress", "difficulty", "misconception"],
        "level": ["strong", "medium", "weak"]
    }
)
```

**For retrieval filtering:**

```python
memories = memory.search(
    query,
    user_id,
    filters={"category": "misconception", "level": "weak"}
)
```

We build educational AI at Revolution AI; structured metadata extraction would be a great addition to mem0.
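Until something like `extraction_schema` lands, the same contract can be enforced client-side during post-processing. A rough sketch, where a list value means "one of these strings" and a type value means an `isinstance` check (`validate_fact` is hypothetical, not a mem0 API):

```python
def validate_fact(fact: dict, schema: dict) -> dict:
    """Check an extracted fact against a lightweight schema: list values
    are enumerated choices, type values are isinstance constraints."""
    for key, rule in schema.items():
        if key not in fact:
            raise ValueError(f"missing field: {key}")
        value = fact[key]
        if isinstance(rule, list):
            if value not in rule:
                raise ValueError(f"{key}={value!r} not in {rule}")
        elif not isinstance(value, rule):
            raise TypeError(f"{key} must be {rule.__name__}")
    return fact


schema = {
    "text": str,
    "category": ["progress", "difficulty", "misconception"],
    "level": ["strong", "medium", "weak"],
}
validate_fact({"text": "Finished 50% of calculus chapter",
               "category": "progress", "level": "strong"}, schema)
```

Running this right after extraction catches LLM outputs that drift from the prompt format before they reach the vector store.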
---
Hi mem0 team,

I've been using `custom_fact_extraction_prompt` and really like how much control it gives over what gets stored as memory. I wanted to ask about a possible extension: allowing custom metadata to be attached to extracted facts.

In applications like personalized tutoring, memories represent a student's learning state, not just a static fact. Along with the extracted text, it's useful to capture:
For example:
Would it be feasible for `custom_fact_extraction_prompt` to optionally return metadata along with each fact? Something like:

```json
{
  "facts": [
    {
      "text": "Finished 50% of calculus chapter",
      "metadata": { "category": "progress", "level": "strong" }
    }
  ]
}
```

Alternatively, is there an existing or recommended pattern for attaching progress levels or custom categories to extracted memories?
Thanks!