How to configure external OpenAI-compatible API Connectors
If you would like to use an external OpenAI-compatible API through the inference API instead of calling it directly, you can configure API keys and individual models to be forwarded.
Provide credentials
Create a secret containing the API keys and configuration, for example from a local file called secret-external-api-connectors.toml:
kubectl create secret generic inference-api-external-api-connectors \
  --from-file=external-api-connectors.toml=./secret-external-api-connectors.toml
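To double-check that the secret was created with the expected key, you can decode its contents. A quick sketch, assuming the key name from the command above:

# Show the secret and decode the stored configuration file
kubectl get secret inference-api-external-api-connectors \
  -o jsonpath='{.data.external-api-connectors\.toml}' | base64 -d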
Note that replacing an existing model or checkpoint with an external API connector currently requires manual steps and incurs downtime. See "Replacing an existing model or checkpoint" below for details.
The configuration file has the following structure:
[openai]
base_url = "https://api.openai.com/v1"
api_key = "[redacted]"
[openai.o3]
internal = "o3-mini-do-not-use"
external = "o3-mini"
[openai.gpt4]
internal = "gpt-4o-do-not-use"
external = "gpt-4o-2024-11-20"
[provider2]
base_url = "https://example.com/second/v7"
api_key = "[redacted]"
[provider2.deepseek]
internal = "deepseek-example"
external = "deepseek-r1"
Each provider requires a base_url and an api_key. Any number of models can be made available by providing a mapping between the internal and the external model name. The internal model name is also used as the name of the checkpoint.
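With the example configuration above, clients address a forwarded model by its internal name. A minimal sketch, assuming your inference API exposes an OpenAI-compatible chat completions endpoint at https://inference-api.example.com (the exact path may differ in your deployment):

# Request the model by its internal name; the inference API forwards it
# to the external provider under the external name "o3-mini"
curl -H "Authorization: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "o3-mini-do-not-use", "messages": [{"role": "user", "content": "Hello"}]}' \
  https://inference-api.example.com/chat/completions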
Activation in Helm config
To activate the feature, reference the secret in the inference-api section of your configuration:
inference-api:
  externalApiConnectors:
    enabled: true
    secretName: inference-api-external-api-connectors
After modifying the secret, the inference-api needs to be restarted for the settings to take effect.
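A minimal sketch of such a restart, assuming the service runs as a Deployment named inference-api (adjust the name to your deployment):

# Restart the pods so the updated secret is read again, then wait for completion
kubectl rollout restart deployment/inference-api
kubectl rollout status deployment/inference-api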
Replacing an existing model or checkpoint
1. Test whether your user has the required privileges. This manual migration requires a user with administrative permissions. You can issue DELETE https://inference-api.example.com/model-packages/some-non-existent-checkpoint to see if that is the case. If you get back an HTTP 404 Not Found, you are good to go. Example curl request:

   curl -H "Authorization: $TOKEN" -X DELETE \
     https://inference-api.example.com/model-packages/some-non-existent-checkpoint

2. Prepare the configuration for the external API connector as outlined above.

3. Shut down all workers serving the model or checkpoint you want to replace with the external API connector. Your model will not be available until step 5 is completed.

4. Delete the checkpoint: DELETE https://inference-api.example.com/model-packages/{checkpoint_name}. Example curl request:

   curl -H "Authorization: $TOKEN" -X DELETE \
     https://inference-api.example.com/model-packages/{checkpoint_name}

5. Deploy your new configuration for the inference-api service. This restarts the pod, and the external API connector should immediately create the checkpoint/model again (see the verification sketch below).
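To verify that the checkpoint came back, you can query the model list. This is a sketch assuming the inference API exposes an OpenAI-style model listing endpoint; the path may differ in your deployment:

# The internal model name (e.g. "o3-mini-do-not-use") should appear again
curl -H "Authorization: $TOKEN" https://inference-api.example.com/models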
Troubleshooting
If a model does not become available, inspect the logs of the inference-api pod for potential configuration issues.
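For example, assuming the pods carry the label app=inference-api (adjust the selector to your deployment):

# Tail recent logs and filter for connector-related messages
kubectl logs -l app=inference-api --tail=200 | grep -i "connector"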
Please note that not all parameters offered by OpenAI API endpoints are supported.