Self-hosted AI platform for indexing and chatting with your documents.
No file limits. Complete privacy. Powered by Claude.
A personal story
I was building a "second brain" in Obsidian, based on the Zettelkasten method, for all of my research, but I was limited to simple keyword searches. Images needed additional OCR tools, and indexing content by storing tokens in markdown was less than ideal. None of Obsidian, Notion, or Evernote handled PDF searches well, especially locked PDFs. Obsidian is great for searching markdown, but I needed something more powerful.
Claude handled these complex file types beautifully, including read-only PDFs, but uploading to Claude Projects was a very manual process, and there was no API for automating it. Plus, my data would then live in two different places, Obsidian and Claude, which I wanted to avoid.
Khoj offered something attractive, but it had a 10MB file size limit, couldn't process large PDFs, and couldn't process any read-only (locked) PDFs.
Commercial solutions were costly and didn't fit my workflow. Obsidian is my chosen knowledge base and I needed direct integration with it. Additionally, my research is often digital notes and scribbles, incomplete thoughts in many cases. I didn't want my data available for others to see or for other systems to process.
So I built Knosi. A self-hosted platform with no file size limits, full API access for automation, and Claude's powerful PDF parsing that could handle even locked documents. Simple Python backend, Docker deployment, PostgreSQL with pgvector for embeddings. Just ~500 lines of code doing exactly what I needed—nothing more, nothing less.
Built for research. Shared with everyone.
If you're frustrated with file size limits, vendor lock-in, or tools that can't handle your documents, Knosi might be what you need too.
Everything you need for powerful document search
Self-hosted on your infrastructure. Your documents never leave your control. Full data sovereignty.
Upload PDFs of any size. The default 100MB upload limit is configurable, so you can raise it to fit your documents.
PDF parsing and retrieval-augmented generation (RAG) chat powered by Anthropic's Claude.
Advanced vector search with pgvector and customizable embedding models for precise results.
Customize chunk sizes, embedding models, rate limits, and more via environment variables.
Automatic document syncing with the Obsidian plugin and a Python filesystem watcher. Your knowledge base stays up to date.
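As a rough illustration of environment-driven configuration, the sketch below reads a few settings and uses them to split text into overlapping chunks. The variable names (CHUNK_SIZE, CHUNK_OVERLAP, MAX_UPLOAD_MB) and their defaults are assumptions made for this example, apart from the 100MB upload default mentioned above; Knosi's actual variable names and chunking strategy may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    chunk_size: int     # characters per chunk
    chunk_overlap: int  # characters shared between neighbouring chunks
    max_upload_mb: int  # upload size limit in megabytes


def load_settings(env=None) -> Settings:
    """Read settings from environment variables (names are hypothetical)."""
    env = os.environ if env is None else env
    settings = Settings(
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        max_upload_mb=int(env.get("MAX_UPLOAD_MB", "100")),
    )
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    return settings


def chunk_text(text: str, settings: Settings) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    step = settings.chunk_size - settings.chunk_overlap
    return [text[i:i + settings.chunk_size] for i in range(0, len(text), step)]
```

Passing a plain dictionary instead of os.environ, as in load_settings({"CHUNK_SIZE": "500"}), makes the behaviour easy to try without touching the real environment.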
Get started in three simple steps
Clone the repo and run docker compose up on your server. Configure with your Anthropic API key.
Use the web UI, Obsidian plugin, or Python filesystem watcher to upload and index your documents automatically.
Ask questions and get AI-powered answers grounded in your documents. Sources are cited automatically for verification.
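To make the "grounded answers with cited sources" step more concrete, here is a minimal sketch of the retrieval half of a RAG pipeline. It is not Knosi's code: the similarity ranking is done in plain Python as a stand-in for a pgvector query, the chunk dictionaries and function names are hypothetical, and the call to Claude is left out entirely.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(query_vec, chunks, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, c["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, retrieved):
    """Number each retrieved chunk so the answer can cite it as [1], [2], ..."""
    context = "\n".join(
        f"[{i}] ({c['source']}) {c['text']}"
        for i, c in enumerate(retrieved, start=1)
    )
    return (
        "Answer using only the numbered sources below and cite them like [1].\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The resulting prompt would then be sent to the model, and the bracketed numbers in its answer can be mapped back to the source files for verification.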
Multiple ways to sync your documents
Native Obsidian integration syncs your vault automatically. Changes are queued and uploaded in the background.
A Python script monitors any folder and automatically uploads new or modified documents to your knowledge base.
Simple drag-and-drop web UI for manual uploads. Real-time progress tracking with Server-Sent Events.
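The folder-monitoring idea can be sketched with nothing but the standard library. The example below polls a folder, compares modification times between snapshots, and hands new or modified files to an upload callback. It is an illustrative sketch, not the actual Knosi watcher: the function names, the polling approach, and the upload callback are all assumptions, and a real watcher might use OS-level file events instead.

```python
import os
import time


def snapshot(folder):
    """Map every file path under folder to its last-modified time (ns)."""
    state = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            try:
                state[path] = os.stat(path).st_mtime_ns
            except OSError:
                continue  # file vanished between listing and stat
    return state


def changed_files(previous, current):
    """Return paths that are new or whose modification time differs."""
    return sorted(p for p, mtime in current.items() if previous.get(p) != mtime)


def watch(folder, upload, interval=2.0, max_polls=None):
    """Poll folder and call upload(path) for each new or modified file.

    upload is a hypothetical callback, e.g. one that POSTs the file to
    your server. max_polls=None means run until interrupted.
    """
    previous = snapshot(folder)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = snapshot(folder)
        for path in changed_files(previous, current):
            upload(path)
        previous = current
        polls += 1
```

For a quick trial, watch("/path/to/notes", print) prints each changed path instead of uploading it.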
Index a wide variety of document formats
Knosi is fully open source. Deploy on your infrastructure, customize to your needs, contribute to the community.