
Custom Model Providers

Langcli allows you to configure custom model providers through the modelProviders setting in your settings.json. This lets you switch between different AI models (including models from LangRouter and your own custom models) using the /model command.

Overview

Use modelProviders to declare curated model lists per protocol type that the /model picker can switch between. Supported protocol types are openai, anthropic, and gemini. Each model definition requires an id and an envKey. Optional fields include name, description, baseUrl, and generationConfig.
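The overall shape is sketched below. The provider entry, id, and envKey values are placeholders for illustration; only id and envKey are required, and the remaining fields may be omitted:

{
  "modelProviders": {
    "openai": [
      {
        "id": "my-model",
        "envKey": "MY_PROVIDER_API_KEY",
        "name": "Optional display name",
        "description": "Optional description",
        "baseUrl": "https://example.com/v1",
        "generationConfig": {}
      }
    ]
  }
}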

Path to the settings.json configuration file

  • macOS/Linux/WSL: ~/.langcli/settings.json
  • Windows: C:\Users\%USERNAME%\.langcli\settings.json

Duplicate model ID under protocol type

Defining multiple models with the same id under a single protocol type (e.g., two entries with "id": "gpt-4o" in OpenAI) is currently not supported. If duplicates exist, the first one will take effect, and subsequent duplicates will be skipped with a warning. Note that the id field is used both as a configuration identifier and as the actual model name sent to the API, so using unique IDs (such as gpt-4o-creative, gpt-4o-balanced) is not a viable workaround. This is a known limitation that we plan to address in a future release.
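For example, the following sketch (placeholder names and keys) declares "gpt-4o" twice under openai; only the first entry takes effect, and the second is skipped with a warning:

{
  "modelProviders": {
    "openai": [
      { "id": "gpt-4o", "name": "GPT-4o (takes effect)", "envKey": "OPENAI_API_KEY" },
      { "id": "gpt-4o", "name": "GPT-4o (skipped)", "envKey": "OPENAI_API_KEY" }
    ]
  }
}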

Verify that the modifications to settings.json are valid.

After you modify settings.json, you can execute the following command to verify whether your changes are valid:

langcli doctor 

If there are any issues, this command will alert you. If there are no issues, the doctor command will only output Diagnostics and Updates information; simply press Enter to confirm.

Supported protocol types

The key of a modelProviders object must be a valid protocolType value. Currently supported protocol types include:

Protocol Type | Description
------------- | -----------
openai        | OpenAI-compatible APIs (local self-hosted servers like vLLM/Ollama, LiteLLM, OpenAI, Azure OpenAI)
anthropic     | Anthropic Claude API
gemini        | Google Gemini API

Invalid protocol type

If an invalid protocol type key is used (e.g., a typo like "openai-custom"), the configuration will be silently skipped and the models will not appear in the /model picker. Always use one of the supported protocol type values listed above.
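For instance, the configuration below (placeholder values) uses the unsupported key "openai-custom", so the entry is silently ignored and "gpt-4o" never appears in the /model picker:

{
  "modelProviders": {
    "openai-custom": [
      { "id": "gpt-4o", "envKey": "OPENAI_API_KEY" }
    ]
  }
}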

Configuration examples by protocol type

Below are complete configuration examples for different protocol types, showing the available parameters and their combinations.

Local Self-Hosted Models (via OpenAI-compatible API)

Most local inference servers (vLLM, Ollama, LM Studio, etc.) provide an OpenAI-compatible API endpoint. Configure them using the openai protocol type with a local baseUrl:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.langrouter.ai",
    "ANTHROPIC_AUTH_TOKEN": "your_langrouter_api_key",
    "OLLAMA_API_KEY": "ollama",
    "VLLM_API_KEY": "not-needed"
  },
  "model": "custom:qwen3.6:35b-a3b-q4_K_M",
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.6:35b-a3b-q4_K_M",
        "name": "Qwen3.6:35B A3B (Ollama)",
        "envKey": "OLLAMA_API_KEY",
        "baseUrl": "http://localhost:11434/v1",
        "generationConfig": {
          "timeout": 300000,
          "maxRetries": 2,
          "contextWindowSize": 32768,
          "samplingParams": {
            "temperature": 0.7,
            "top_p": 0.9,
            "max_tokens": 4096
          }
        }
      },
      {
        "id": "llama-3.1-8b",
        "name": "Llama 3.1 8B (vLLM)",
        "envKey": "VLLM_API_KEY",
        "baseUrl": "http://localhost:8000/v1",
        "generationConfig": {
          "timeout": 120000,
          "maxRetries": 2,
          "contextWindowSize": 128000,
          "samplingParams": {
            "temperature": 0.6,
            "max_tokens": 8192
          }
        }
      }
    ]
  }
}

OpenAI-compatible providers (openai)

This protocol type supports not only OpenAI's official API but also any OpenAI-compatible endpoint, including aggregated model providers like OpenRouter.

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.langrouter.ai",
    "ANTHROPIC_AUTH_TOKEN": "your_langrouter_api_key",
    "DEEPSEEK_API_KEY": "abc",
    "OPENAI_API_KEY": "sk-your-actual-openai-key-here",
    "OPENROUTER_API_KEY": "sk-or-your-actual-openrouter-key-here"
  },
  "model": "custom:gpt-4o",
  "modelProviders": {
    "openai": [
      {
        "id": "deepseek-v4-flash[1m]",
        "name": "deepseek-v4-flash[1m] (custom)",
        "envKey": "DEEPSEEK_API_KEY",
        "baseUrl": "https://api.deepseek.com",
        "generationConfig": {
          "timeout": 300000,
          "maxRetries": 2,
          "contextWindowSize": 1000000,
          "samplingParams": {
            "temperature": 0.7,
            "top_p": 0.9,
            "max_tokens": 384000
          }
        }
      },
      {
        "id": "deepseek-v4-pro[1m]",
        "name": "deepseek-v4-pro[1m] (custom)",
        "envKey": "DEEPSEEK_API_KEY",
        "baseUrl": "https://api.deepseek.com",
        "generationConfig": {
          "timeout": 300000,
          "maxRetries": 2,
          "contextWindowSize": 1000000,
          "samplingParams": {
            "temperature": 0.7,
            "top_p": 0.9,
            "max_tokens": 384000
          }
        }
      },
      {
        "id": "gpt-4o",
        "name": "GPT-4o",
        "envKey": "OPENAI_API_KEY",
        "baseUrl": "https://api.openai.com/v1",
        "generationConfig": {
          "timeout": 60000,
          "maxRetries": 3,
          "enableCacheControl": true,
          "contextWindowSize": 128000,
          "modalities": {
            "image": true
          },
          "customHeaders": {
            "X-Client-Request-ID": "req-123"
          },
          "extra_body": {
            "enable_thinking": true,
            "service_tier": "priority"
          },
          "samplingParams": {
            "temperature": 0.2,
            "top_p": 0.8,
            "max_tokens": 4096,
            "presence_penalty": 0.1,
            "frequency_penalty": 0.1
          }
        }
      },
      {
        "id": "gpt-4o-mini",
        "name": "GPT-4o Mini",
        "envKey": "OPENAI_API_KEY",
        "baseUrl": "https://api.openai.com/v1",
        "generationConfig": {
          "timeout": 30000,
          "samplingParams": {
            "temperature": 0.5,
            "max_tokens": 2048
          }
        }
      },
      {
        "id": "openai/gpt-4o",
        "name": "GPT-4o (via OpenRouter)",
        "envKey": "OPENROUTER_API_KEY",
        "baseUrl": "https://openrouter.ai/api/v1",
        "generationConfig": {
          "timeout": 120000,
          "maxRetries": 3,
          "samplingParams": {
            "temperature": 0.7
          }
        }
      }
    ]
  }
}

Anthropic (anthropic)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.langrouter.ai",
    "ANTHROPIC_AUTH_TOKEN": "your_langrouter_api_key",
    "MY_ANTHROPIC_API_KEY": "sk-ant-your-actual-anthropic-key-here"
  },
  "model": "custom:claude-3-opus",
  "modelProviders": {
    "anthropic": [
      {
        "id": "claude-3-5-sonnet",
        "name": "Claude 3.5 Sonnet",
        "envKey": "MY_ANTHROPIC_API_KEY",
        "baseUrl": "https://api.anthropic.com/v1",
        "generationConfig": {
          "timeout": 120000,
          "maxRetries": 3,
          "contextWindowSize": 200000,
          "samplingParams": {
            "temperature": 0.7,
            "max_tokens": 8192,
            "top_p": 0.9
          }
        }
      },
      {
        "id": "claude-3-opus",
        "name": "Claude 3 Opus",
        "envKey": "MY_ANTHROPIC_API_KEY",
        "baseUrl": "https://api.anthropic.com/v1",
        "generationConfig": {
          "timeout": 180000,
          "samplingParams": {
            "temperature": 0.3,
            "max_tokens": 4096
          }
        }
      }
    ]
  }
}

Google Gemini (gemini)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.langrouter.ai",
    "ANTHROPIC_AUTH_TOKEN": "your_langrouter_api_key",
    "GEMINI_API_KEY": "AIza-your-actual-gemini-key-here"
  },
  "model": "custom:gemini-2.0-flash",
  "modelProviders": {
    "gemini": [
      {
        "id": "gemini-2.0-flash",
        "name": "Gemini 2.0 Flash",
        "envKey": "GEMINI_API_KEY",
        "baseUrl": "https://generativelanguage.googleapis.com",
        "capabilities": {
          "vision": true
        },
        "generationConfig": {
          "timeout": 60000,
          "maxRetries": 2,
          "contextWindowSize": 1000000,
          "schemaCompliance": "auto",
          "samplingParams": {
            "temperature": 0.4,
            "top_p": 0.95,
            "max_tokens": 8192,
            "top_k": 40
          }
        }
      }
    ]
  }
}

Additional information

For local servers that don't require authentication, you can use any placeholder value for the API key:

# For Ollama (no auth required)
"OLLAMA_API_KEY": "ollama"

# For vLLM (if no auth is configured)
"VLLM_API_KEY": "not-needed"

Setting the extra_body parameter

The extra_body parameter is only supported for OpenAI-compatible providers (openai). It is ignored for Anthropic and Gemini providers.
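As an illustrative fragment, extra_body values are included in the outgoing request body for openai providers. The enable_thinking key below is just an example pass-through field and may not be honored by every endpoint:

{
  "generationConfig": {
    "extra_body": {
      "enable_thinking": true
    }
  }
}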

Setting the envKey parameter

The envKey field specifies the name of an environment variable, not the actual API key value. For the configuration to work, ensure the corresponding environment variable is set to your real API key. You can do this using the env field in settings.json (as shown in the examples above):

{
  "env": {
    "OPENAI_API_KEY": "sk-your-actual-key-here"
  }
}

Each custom model example includes an env field to illustrate how the API key should be configured.
