What’s Missing in Today’s AI Coding Tools?
Over the past two years, we’ve seen a surge of “AI + coding” products. A representative example is Claude Code, which stands out for its clean interaction design and its ability to quickly generate code through conversational prompts. However, it comes with three well-known drawbacks that many developers have likely encountered.

Context Is Often Missing
In the traditional Claude Code / Codex CLI workflow:
- Input → A pile of code snippets you manually copy and paste.
- Model → Forced to guess the context based only on natural language and raw strings.
The result:
- Lost cross-file references: The LLM has no way to access the complete project context.
- Blurry semantic boundaries: The model only sees “a few isolated lines of code”, not the full business logic.
High Usage Costs
Claude is undeniably powerful, but the price can be a deal-breaking barrier for both enterprises and individual developers in everyday use. For example, with Claude Opus 4.1, base input tokens are priced at $15 / MTok and output tokens at $75 / MTok (Reference); just a few back-and-forth interactions can easily rack up tens of dollars. Cursor has become popular among developers and effectively addresses the context problem. However, since September 15, Cursor no longer allows subscribers to use Auto Mode without limits: Auto requests now count against your plan’s included usage quota. (Reference)

Privacy and Restriction Concerns
Cursor not only faces high usage costs; its privacy and closed-source limitations are equally concerning.
- Security risks: During the codebase indexing stage, the official documentation states that “embeddings and metadata are stored in the cloud.” While source code itself may not be stored, transmitting it over the network is unavoidable. (Reference)
- Closed-source limitations: Cursor operates as a closed-source black box, leaving no way to audit how it actually works. Persistent telemetry runs in the background, continuously sending analytics calls. Even when the app is idle, unexplained data streams maintain live connections to its servers. More concerning still, the platform repeatedly re-ingests your entire codebase without explicit consent; sometimes even a brief pause can trigger a full reprocessing.
Claude Context: Zilliz’s Open-Source Solution for Code Context
Claude Context is an open-source, MCP-compatible semantic code search engine. Whether you’re building a custom AI coding assistant from scratch or adding semantic awareness to existing agents like Claude Code or Gemini CLI, Claude Context provides the engine to make it possible. It runs locally, integrates seamlessly with your favorite tools and environments (such as VS Code and Chrome), and delivers powerful code-understanding capabilities without depending on closed-source, cloud-only platforms. (Reference)

At its core, Claude Context functions as a RAG (Retrieval-Augmented Generation) engine composed of three main modules:
- Text Processing: Syntax-based logic for segmenting code into meaningful chunks.
- Embedding Service: Integrates with popular cloud-hosted embedding providers.
- Vector Database: Connects to either local or cloud-hosted Milvus vector engines.
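
To make the three modules concrete, here is a minimal TypeScript sketch of how they compose into an index-then-search pipeline. The interfaces and function names are illustrative assumptions for explanation only, not Claude Context’s actual API.

```typescript
// Illustrative sketch only -- these interfaces are assumptions, not
// Claude Context's actual API.

interface Chunk {
  filePath: string;  // file the chunk came from
  content: string;   // the code text of the chunk
  startLine: number; // boundaries derived from syntax, not fixed windows
  endLine: number;
}

interface Embedder {
  // Embedding Service: turns text into dense vectors (cloud or local model).
  embed(texts: string[]): Promise<number[][]>;
}

interface VectorStore {
  // Vector Database: persists vectors + metadata, answers nearest-neighbor queries.
  insert(chunks: Chunk[], vectors: number[][]): Promise<void>;
  search(queryVector: number[], topK: number): Promise<Chunk[]>;
}

// Indexing: segment source files into syntax-aware chunks, embed, store.
async function indexCodebase(
  files: { path: string; source: string }[],
  chunker: (path: string, source: string) => Chunk[], // Text Processing module
  embedder: Embedder,
  store: VectorStore,
): Promise<void> {
  const chunks = files.flatMap((f) => chunker(f.path, f.source));
  const vectors = await embedder.embed(chunks.map((c) => c.content));
  await store.insert(chunks, vectors);
}

// Retrieval: embed the natural-language query and return the closest chunks,
// which the agent then pulls into its context window.
async function searchCode(
  query: string,
  embedder: Embedder,
  store: VectorStore,
  topK = 5,
): Promise<Chunk[]> {
  const [queryVector] = await embedder.embed([query]);
  return store.search(queryVector, topK);
}
```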

Enhancements for a Fully Local, Self-Hosted Code Context
To enable a fully local deployment of the coding agent, we enhanced Claude Context with the following two features:
- Vector Search with Postgres: No longer limited to a single vector database: you can run directly on Postgres, whether locally, in the cloud, or on your existing PG infrastructure. For example, spin up a Dockerized Postgres instance on your laptop, or use a lightweight Postgres service such as Relyt ONE (https://data.cloud/relytone) or Neon (https://neon.com/). A query sketch follows after this list.
- Encryption: Many companies worry about whether AI might “leak” source code. To address this, indexing can store data in encrypted form, with decryption happening only in local memory at runtime. This ensures your entire development workflow continues to operate strictly within a private boundary (see the second sketch below).
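
To make the Postgres option concrete, here is a minimal sketch of storing and querying code-chunk embeddings on Postgres with the pgvector extension, using the node-postgres (pg) client. The table name, columns, and embedding dimension are illustrative assumptions, not the schema the enhanced Claude Context actually uses.

```typescript
import { Client } from "pg";

// Minimal sketch: store code-chunk embeddings in Postgres via pgvector and
// run a cosine-distance nearest-neighbor query. Schema names are illustrative.
async function searchChunks(connectionString: string, queryVector: number[]) {
  const client = new Client({ connectionString });
  await client.connect();

  // One-time setup: enable pgvector and create a table for code chunks.
  await client.query("CREATE EXTENSION IF NOT EXISTS vector");
  await client.query(`
    CREATE TABLE IF NOT EXISTS code_chunks (
      id        BIGSERIAL PRIMARY KEY,
      file_path TEXT NOT NULL,
      content   TEXT NOT NULL,
      embedding vector(768)  -- must match your embedding model's dimension
    )`);

  // pgvector's <=> operator is cosine distance; smaller means more similar.
  const { rows } = await client.query(
    `SELECT file_path, content
       FROM code_chunks
      ORDER BY embedding <=> $1::vector
      LIMIT 5`,
    [JSON.stringify(queryVector)], // "[0.1,0.2,...]" parses as a vector literal
  );

  await client.end();
  return rows;
}
```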

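And a sketch of the encryption idea: chunk text is encrypted with a locally held key before it is written to the index, and decrypted only in local process memory at query time. This uses Node’s built-in crypto module with AES-256-GCM; the in-memory key here is a stand-in for whatever key management your environment actually uses.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

// Demo key only: a real deployment would load a persistent key from a local
// secret store rather than generating a throwaway one per process.
const KEY = randomBytes(32);

// Encrypt a chunk's text before it is written to the vector index.
function encryptChunk(plaintext: string) {
  const iv = randomBytes(12); // fresh nonce per chunk
  const cipher = createCipheriv("aes-256-gcm", KEY, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"), // integrity check
    data: data.toString("base64"),
  };
}

// Decrypt only in local memory, at retrieval time.
function decryptChunk(enc: { iv: string; tag: string; data: string }): string {
  const decipher = createDecipheriv("aes-256-gcm", KEY, Buffer.from(enc.iv, "base64"));
  decipher.setAuthTag(Buffer.from(enc.tag, "base64"));
  return Buffer.concat([
    decipher.update(Buffer.from(enc.data, "base64")),
    decipher.final(), // throws if the ciphertext was tampered with
  ]).toString("utf8");
}
```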
Hands On
To achieve a fully localized deployment of Claude Context and give Codex CLI stronger contextual capabilities, we use a local Ollama embedding model together with a Dockerized Postgres setup. With the following script, you can perform one-click installation and configuration:
- Ollama installation and embedding model download
- Postgres download and setup
- Codex configuration rendering: Claude Context MCP (a sketch of this rendering step follows this list)
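
For a sense of what the configuration-rendering step produces, the sketch below appends a Claude Context MCP server entry to Codex CLI’s ~/.codex/config.toml. The environment variable names here (in particular POSTGRES_URL for the Postgres-enabled build) and the default model are assumptions, not documented keys; treat the script itself as the source of truth.

```typescript
import { appendFileSync } from "fs";
import { homedir } from "os";
import { join } from "path";

// Sketch: render a Claude Context MCP server entry into Codex CLI's config.
// The env variable names below are assumptions, not documented keys.
function renderCodexMcpEntry(postgresUrl: string, model = "nomic-embed-text") {
  const entry = `
[mcp_servers.claude-context]
command = "npx"
args = ["@zilliz/claude-context-mcp@latest"]
env = { EMBEDDING_PROVIDER = "Ollama", OLLAMA_MODEL = "${model}", POSTGRES_URL = "${postgresUrl}" }
`;
  appendFileSync(join(homedir(), ".codex", "config.toml"), entry);
}
```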
Common Issues
- Ollama is not installed on macOS: On macOS, Ollama must be installed manually via the .dmg package (Ollama Download). After installation, re-run the script; note that you must manually run ollama once after the .dmg installation.
- Docker is not installed: If you run the configuration script without providing a postgres-url parameter, the script will automatically spin up a local Postgres instance as the vector engine. This requires a Docker environment, so you’ll need to install Docker first. If you prefer not to install Docker, you can create a Postgres database using a managed service such as Relyt ONE, and then provide the connection string when running the script. A sketch of this fallback logic appears below.
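As a rough sketch of that fallback behavior (the Docker image tag, port, and credentials are illustrative, not the script’s actual values):

```typescript
import { execSync } from "child_process";

// If no postgres-url was supplied, require Docker and start a throwaway
// pgvector-enabled Postgres locally; otherwise use the given connection string.
function resolvePostgresUrl(postgresUrl?: string): string {
  if (postgresUrl) return postgresUrl; // e.g. a Relyt ONE or Neon connection string

  try {
    execSync("docker --version", { stdio: "ignore" });
  } catch {
    throw new Error("Docker is required for the local Postgres fallback. Install Docker first.");
  }

  execSync(
    "docker run -d --name claude-context-pg -p 5432:5432 " +
      "-e POSTGRES_PASSWORD=postgres pgvector/pgvector:pg16",
  );
  return "postgresql://postgres:postgres@localhost:5432/postgres";
}
```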