Continue

Open-source VS Code and JetBrains extension that turns any local LLM (or remote API) into a Copilot-style code assistant. The wiki’s anchor entry for the self-hosted Copilot replacement thread — what you install when you want intelligent autocomplete and inline chat without sending code to GitHub or Anthropic.

Site: continue.dev
License: Apache 2.0
Editor support: VS Code, JetBrains IDEs (IntelliJ family). Neovim extension “in the works” per Wolfgang’s coverage, not yet shipped at the time of his August 2024 video — worth re-checking.
Backend support: any Ollama endpoint, any OpenAI-compliant API, plus first-class support for Anthropic, Mistral, OpenAI, and other SaaS providers (the wiki uses it with local backends, but the SaaS support is real)

What it does

Two distinct features bundled into one extension:

Tab autocomplete — Copilot-style inline suggestions as you type. Uses a separate, smaller “completion model” by convention (3B–7B), since latency matters more than depth here. You can specify any local model.
Chat panel — full conversation interface inside the IDE, for “explain this function,” “rewrite this loop,” “what’s wrong with this Ansible task,” etc. Uses a larger “chat model” by convention (34B+), since you’re willing to wait for better answers.

The split is configured in a JSON config file inside the extension settings. Per Wolfgang: “You’re probably gonna want to have a lighter 7B or even 3B model for auto-suggestions, and leave beefy 34B and 70B models for chat interactions.”

How Wolfgang uses it

From the Docker + Ollama + Continue self-hosting walkthrough:

Install Continue from the VS Code Marketplace
Open the JSON config (Continue’s settings panel)
Set the Ollama endpoint URL to the homelab server (e.g., http://192.168.1.x:11434)
Specify two models: tabAutocompleteModel: starcoder:3b and models: [codebooga, codellama:7b] for chat
Restart VS Code and start writing code

The extension immediately starts producing inline suggestions sourced from the local Ollama instance. Wolfgang’s test was an Ansible playbook — Continue correctly recognized the playbook context and started suggesting hosts: all, task structure, etc.

Why it matters in the wiki

Practical alternative to Claude Code for non-vibe-coders — Continue is the closer-to-classical-IDE tool. You’re not delegating code generation to an agent, you’re getting smarter autocomplete with the same code privacy as a local Ollama setup.
The local-Copilot replacement — combined with Ollama (or Docker Model Runner) and a GPU, it’s a complete self-hosted Copilot stack with no per-seat fees and no code leaving the machine.
GPU dependency is real — Wolfgang’s data shows that without a dedicated GPU (130W gaming rig vs 4.6W MiniPC), Continue produces unusable suggestions. The extension itself is free and lightweight; the cost is the GPU it talks to. See also Alex Ziskind’s vLLM analysis for why local code assistance is GPU-bound.

How it compares

	Continue	Claude Code	Cursor	GitHub Copilot
Form factor	IDE extension (VS Code, JetBrains)	CLI agent harness	Standalone editor	IDE extension (VS Code, JetBrains, Neovim)
Code generation style	Inline autocomplete + chat	Agentic file edits	Compose-style multi-file edits	Inline autocomplete + chat
Local model support	Yes (Ollama, any OpenAI API)	Via model substitution	Limited	No
Per-seat fee	None	Anthropic API costs	$20/mo+	$10/mo+
Best for	Self-hosted privacy + tab completion	Agentic workflows + planning discipline	Multi-file refactors with visual diffs	”Just install it and go” with cloud API

Continue’s positioning is closest to a JetBrains-feel autocomplete + chat experience with full local-model support. It’s not trying to replace Claude Code — different category.

AI For Dev

Explorer

continue

Continue

What it does

How Wolfgang uses it

Why it matters in the wiki

How it compares

See Also

Graph View

Table of Contents

Backlinks