Ollama Copilot integrates local LLMs from Ollama directly into VS Code, providing AI-powered code completion and an interactive chat experience with your own locally-running models.
- AI-powered code completions: Get contextual code suggestions as you type, with support for:
  - Variable and function name awareness
  - Context-aware completions based on surrounding code
  - Multi-line code suggestions
  - Language-specific completions
- Dedicated chat interface: Ask questions about your code and get detailed responses through:
  - Sidebar chat panel for quick access
  - Dedicated chat view for more detailed discussions
  - Real-time streaming responses
- Local model selection: Choose from any model installed in Ollama
- Context-aware assistance: The extension analyzes:
  - Selected code snippets
  - Specific files you choose
  - Your entire workspace
  - Function purposes and documentation
  - Variables in scope
- Privacy-focused: All processing happens locally on your machine through Ollama
- Customizable configuration: Set your preferred:
  - Default model
  - API host
  - Workspace context settings
- Ollama must be installed and running on your system
- You should have at least one model pulled in Ollama (see model recommendations)
- Install the extension from the VS Code marketplace
- Ensure Ollama is running in the background
- Select a default model when prompted (or set it later in settings)
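Before pulling models, it can help to confirm that Ollama is actually reachable. The quick check below is a minimal sketch that assumes a standard local install listening on the default port 11434.

```bash
# Confirm the Ollama CLI is installed
ollama --version

# Confirm the local API responds (this is the endpoint the extension talks to)
curl http://localhost:11434/api/version
```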
Before using the extension, you need to download at least one model in Ollama:
Command Line:
```bash
# Download a code-optimized model
ollama pull codellama:13b

# Download a general-purpose model
ollama pull llama3:8b

# List available models
ollama list
```
Web UI: You can also download models through the Ollama web interface at http://localhost:11434 if you have it enabled.
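If you prefer scripting over the CLI, Ollama's local HTTP API can also pull models. The snippet below is a sketch assuming the default endpoint; newer Ollama releases also accept a "model" field in place of "name".

```bash
# Pull a model through the local Ollama API instead of the CLI
curl http://localhost:11434/api/pull -d '{"name": "codellama:13b"}'
```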
Code completion is automatically active while you type. The extension analyzes your code context, including:
- Variables in scope
- Function declarations and parameters
- Comments and documentation
- Surrounding code context
Two ways to access the chat:
- Sidebar Chat: Quick access through the Ollama Chat icon in the activity bar
- Dedicated Chat Panel: Full-featured chat interface with more options
Chat features include:
- Model selection dropdown
- Context file management
- Code snippet integration
- Workspace context toggle
- Real-time streaming responses
The chat interface provides several ways to add context:
- Add Selected Code: Select code in your editor, then click the "📄" button
- Add File Context: Click the "📎" button to choose specific files
- Workspace Context: Enable the "@workspace" checkbox to analyze your entire workspace
Access these commands via the Command Palette (`Ctrl+Shift+P`, or `Cmd+Shift+P` on macOS):
- Open Ollama Chat Panel: Opens the chat panel as a separate view
- Select Default Model: Change the default Ollama model
- Search Available Models: List all available models in Ollama
- Clear Completion Cache: Clear the cached completions
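If you use one of these commands constantly, you can bind it to a key in keybindings.json. The command ID below is a placeholder, not the extension's documented ID; look up the real one in Preferences: Open Keyboard Shortcuts before copying this.

```json
// keybindings.json (VS Code accepts comments in this file)
// "ollama.openChatPanel" is a hypothetical command ID; check the actual one
// in the Keyboard Shortcuts editor.
[
  {
    "key": "ctrl+alt+o",
    "command": "ollama.openChatPanel"
  }
]
```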
Configure the extension through VS Code settings:
- Default Model: Set your preferred model (`ollama.defaultModel`)
- API Host: Configure the Ollama API endpoint (`ollama.apiHost`)
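In settings.json these look like the snippet below; the model name and host are only examples, so substitute whatever `ollama list` reports and wherever your Ollama instance actually listens.

```json
{
  // Model used for completions and chat; any model shown by `ollama list`
  "ollama.defaultModel": "codellama:13b",

  // Ollama API endpoint; this is the default local address
  "ollama.apiHost": "http://localhost:11434"
}
```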
For the best experience, we recommend:
- Code Completion: Models fine-tuned for code generation
  - `codellama:13b` or `codellama:34b`
  - `wizardcoder:13b` or `wizardcoder:34b`
- General Assistance: Larger models with broad knowledge
  - `llama3:8b` or `llama3:70b`
  - `mistral:7b` or `mixtral:8x7b`
- Specialized Tasks: Task-specific models
  - `deepseek-coder:6.7b` or `deepseek-coder:33b` for code
  - `phind-codellama:34b` for programming Q&A
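For example, to try one recommendation from each category (any of the tags above are pulled the same way):

```bash
ollama pull wizardcoder:13b      # completion-focused
ollama pull mistral:7b           # general assistance
ollama pull deepseek-coder:6.7b  # specialized coding model
```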
- No Suggestions Appearing:
  - Ensure Ollama is running (`ollama serve`)
  - Check that the model is properly loaded
  - Clear the completion cache and restart
- Slow Performance:
  - Try a smaller model
  - Reduce the context size
  - Clear the completion cache
- Model Not Found:
  - Verify the model is downloaded in Ollama
  - Check the model name spelling
  - Run `ollama list` to see available models
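When a model appears to be missing or failing to load, it is often fastest to rule out the model itself from a terminal; the model tag below is just an example.

```bash
# Run the model once outside VS Code to rule out download or loading problems
ollama run codellama:13b "Say hello"
```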
If the extension can't connect to Ollama:
- Verify Ollama is running (`ollama serve`)
- Check the API host setting (`ollama.apiHost`)
- Ensure port 11434 is accessible
- Restart VS Code if necessary
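To test connectivity outside VS Code, you can query the same endpoint the extension uses; a minimal check assuming the default host setting:

```bash
# Should return a JSON list of locally installed models
curl http://localhost:11434/api/tags
```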
All processing happens locally on your machine through your installed Ollama instance. No data is sent to external servers.
We welcome feedback and contributions! Please submit issues and pull requests on our GitHub repository.