Everything you need for AI agent memory management
Lloyd–Max quantization with DLQ residual correction reduces vector storage by up to 95% while maintaining cosine similarity accuracy.
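As a rough illustration of the idea, here is a minimal Lloyd–Max scalar quantizer in plain Python. It is a sketch under stated assumptions, not the product's implementation: the function names (`lloyd_max`, `quantize`, `dequantize`) are hypothetical, the codebook is trained on a flat list of sample values, and the DLQ residual correction step is not shown.

```python
from typing import List


def lloyd_max(samples: List[float], levels: int, iterations: int = 50) -> List[float]:
    """Return `levels` reconstruction values that locally minimize squared error."""
    lo, hi = min(samples), max(samples)
    # Start from evenly spaced reconstruction levels across the data range.
    codebook = [lo + (hi - lo) * (i + 0.5) / levels for i in range(levels)]
    for _ in range(iterations):
        # Assign each sample to its nearest level (equivalent to placing
        # decision boundaries midway between neighboring levels).
        buckets: List[List[float]] = [[] for _ in range(levels)]
        for x in samples:
            idx = min(range(levels), key=lambda i: abs(x - codebook[i]))
            buckets[idx].append(x)
        # Move each level to the mean of its assigned samples; keep empty levels as-is.
        updated = [sum(b) / len(b) if b else c for b, c in zip(buckets, codebook)]
        if updated == codebook:
            break  # converged
        codebook = updated
    return codebook


def quantize(vector: List[float], codebook: List[float]) -> List[int]:
    """Replace each component with the index of its nearest codebook level."""
    return [min(range(len(codebook)), key=lambda i: abs(x - codebook[i])) for x in vector]


def dequantize(codes: List[int], codebook: List[float]) -> List[float]:
    """Map stored indices back to approximate float values."""
    return [codebook[c] for c in codes]
```

Only the small integer codes and the shared codebook need to be stored, which is where the space saving comes from.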
Find memories by meaning, not just keywords. Cosine similarity search over compressed vectors, backed by a pgvector HNSW index, with relevance scoring.
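To make the scoring concrete, the sketch below ranks memories by cosine similarity using an exhaustive scan. It is an illustration only: `cosine_similarity` and `search` are hypothetical helper names, the vectors are tiny hand-written examples, and a production system would use an approximate index such as HNSW instead of scanning every vector.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query: Sequence[float],
    memories: List[Tuple[str, Sequence[float]]],
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the top_k (score, text) pairs, highest similarity first."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```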
Row-level security ensures complete data isolation between organizations and projects. Built for production multi-tenancy from day one.
Tag memories by agent, user, and session. Build persistent context for AI agents that remember across conversations and interactions.
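One way to picture tagging is as optional fields on each memory that lookups can filter on. The sketch below assumes a hypothetical `Memory` record and `filter_memories` helper; the field names `agent_id`, `user_id`, and `session_id` are illustrative and not taken from the actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Memory:
    content: str
    agent_id: Optional[str] = None
    user_id: Optional[str] = None
    session_id: Optional[str] = None


def filter_memories(
    memories: List[Memory],
    agent_id: Optional[str] = None,
    user_id: Optional[str] = None,
    session_id: Optional[str] = None,
) -> List[Memory]:
    """Keep memories matching every tag that was supplied; None means 'any'."""
    wanted = {"agent_id": agent_id, "user_id": user_id, "session_id": session_id}
    return [
        m
        for m in memories
        if all(value is None or getattr(m, name) == value for name, value in wanted.items())
    ]
```

Filtering on `user_id` alone, for example, returns everything remembered about that user across all sessions.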
Track memory changes over time with automatic versioning. Every update creates a new version, preserving the complete history.
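The append-only behavior described here can be sketched as follows. This is a minimal in-memory model with hypothetical names (`MemoryVersion`, `VersionedMemoryStore`); it only illustrates that updates add versions rather than overwrite them, and says nothing about how the service persists history.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class MemoryVersion:
    version: int
    content: str


@dataclass
class VersionedMemoryStore:
    _history: Dict[str, List[MemoryVersion]] = field(default_factory=dict)

    def put(self, memory_id: str, content: str) -> MemoryVersion:
        """Append a new version; earlier versions are never modified."""
        versions = self._history.setdefault(memory_id, [])
        entry = MemoryVersion(version=len(versions) + 1, content=content)
        versions.append(entry)
        return entry

    def latest(self, memory_id: str) -> MemoryVersion:
        return self._history[memory_id][-1]

    def history(self, memory_id: str) -> List[MemoryVersion]:
        """Return a copy so callers cannot rewrite the stored history."""
        return list(self._history[memory_id])
```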
Full REST API with TypeScript and Python SDKs. Store, search, update, and delete memories programmatically with typed clients.
Model Context Protocol server for direct integration with AI assistants like Claude. Expose memories as tools for LLM agents to use natively.
Async embedding pipeline with configurable worker pools. Non-blocking ingestion means your API stays fast while heavy compute runs in the background.
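The general pattern behind a pipeline like this is a queue drained by a fixed number of workers. The sketch below uses Python's standard `asyncio`; `fake_embed` is a stand-in for a real embedding call, and `run_pipeline` and its `num_workers` parameter are hypothetical names chosen for illustration.

```python
import asyncio
from typing import Dict, List, Tuple


async def fake_embed(text: str) -> List[float]:
    """Placeholder for a real embedding call (assumption: returns a float vector)."""
    await asyncio.sleep(0)  # yield control, as a real network or GPU call would
    return [float(len(text))]


async def worker(queue: asyncio.Queue, results: Dict[str, List[float]]) -> None:
    """Pull (memory_id, text) jobs off the queue until cancelled."""
    while True:
        memory_id, text = await queue.get()
        try:
            results[memory_id] = await fake_embed(text)
        finally:
            queue.task_done()


async def run_pipeline(
    items: List[Tuple[str, str]], num_workers: int = 2
) -> Dict[str, List[float]]:
    queue: asyncio.Queue = asyncio.Queue()
    results: Dict[str, List[float]] = {}
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(num_workers)]
    for item in items:
        queue.put_nowait(item)  # enqueueing returns immediately
    await queue.join()  # wait until every job has been processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

In a real service the request handler would only enqueue the job and return, while the workers run independently of the request.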
Simple, secure API key auth with per-key rate limiting. Create multiple keys with different permissions and rate limits per project.
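Per-key rate limiting is commonly built on a token bucket, and the sketch below shows that approach. Whether the service uses a token bucket is an assumption, and the class names (`TokenBucket`, `PerKeyLimiter`) and parameters are hypothetical. The clock is injectable so the behavior can be tested without sleeping.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    def __init__(
        self,
        rate_per_sec: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so short bursts are allowed
        self.updated = clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerKeyLimiter:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def configure(self, api_key: str, rate_per_sec: float, capacity: int) -> None:
        """Give each key its own independent limit."""
        self.buckets[api_key] = TokenBucket(rate_per_sec, capacity, self.clock)

    def allow(self, api_key: str) -> bool:
        """Unknown keys are rejected; known keys are checked against their bucket."""
        bucket = self.buckets.get(api_key)
        return bucket is not None and bucket.allow()
```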