How to configure external OpenAI-compatible API Connectors
If you would like to use an external OpenAI-compatible API through the inference API instead of calling it directly, you can configure API keys and individual models to be forwarded.
Provide credentials
Create a secret containing the API keys and configuration, for example from a local file called secret-external-api-connectors.toml:
kubectl create secret generic inference-api-external-api-connectors \
  --from-file=external-api-connectors.toml=./secret-external-api-connectors.toml
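To double-check that the secret was created with the expected key, you can decode its contents. A quick sketch, assuming the key name from the command above:

# Show the secret and decode the stored configuration file
kubectl get secret inference-api-external-api-connectors \
  -o jsonpath='{.data.external-api-connectors\.toml}' | base64 -d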
Note that replacing an existing model or checkpoint with an external API connector currently requires manual steps and incurs downtime. See "Replacing an existing model or checkpoint" below for details.
The configuration file has the following structure:
[openai]
base_url = "https://api.openai.com/v1"
api_key = "[redacted]"
[openai.o3]
internal = "o3-mini-do-not-use"
external = "o3-mini"
[openai.gpt4]
internal = "gpt-4o-do-not-use"
external = "gpt-4o-2024-11-20"
[provider2]
base_url = "https://example.com/second/v7"
api_key = "[redacted]"
[provider2.deepseek]
internal = "deepseek-example"
external = "deepseek-r1"
Each provider requires a base_url and an api_key. Any number of models can be made available by providing a mapping between the internal and the external model name. The internal model name is also used as the name of the checkpoint.
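With the example configuration above, clients address a forwarded model by its internal name. A minimal sketch, assuming your inference API exposes an OpenAI-compatible chat completions endpoint at https://inference-api.example.com (the exact path may differ in your deployment):

# Request the model by its internal name; the inference API forwards it
# to the external provider under the external name "o3-mini"
curl -H "Authorization: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "o3-mini-do-not-use", "messages": [{"role": "user", "content": "Hello"}]}' \
  https://inference-api.example.com/chat/completions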
Activation in Helm config
To activate the feature, reference the secret in the inference-api section of your configuration:
inference-api:
  externalApiConnectors:
    enabled: true
    secretName: inference-api-external-api-connectors
After modifying the secret, the inference-api needs to be restarted for the settings to take effect.
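A minimal sketch of such a restart, assuming the service runs as a Deployment named inference-api (adjust the name to your deployment):

# Restart the pods so the updated secret is read again, then wait for completion
kubectl rollout restart deployment/inference-api
kubectl rollout status deployment/inference-api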
Replacing an existing model or checkpoint
1. Test whether your user has the required privileges. This manual migration requires a user with administrative permissions. You can issue DELETE https://inference-api.example.com/model-packages/some-non-existent-checkpoint to see if that is the case. If you get back an HTTP 404 Not Found, you are good to go. Example curl request:

   curl -H "Authorization: $TOKEN" -X DELETE \
     https://inference-api.example.com/model-packages/some-non-existent-checkpoint

2. Prepare the configuration for the external API connector as outlined above.

3. Shut down all workers serving the model or checkpoint you want to replace with the external API connector. Your model will not be available until step 5 is completed.

4. Delete the checkpoint: DELETE https://inference-api.example.com/model-packages/{checkpoint_name}. Example curl request:

   curl -H "Authorization: $TOKEN" -X DELETE \
     https://inference-api.example.com/model-packages/{checkpoint_name}

5. Deploy your new configuration for the inference-api service. This restarts the pod, and the external API connector should immediately create the checkpoint/model again (see the verification sketch below).
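To verify that the checkpoint came back, you can query the model list. This is a sketch assuming the inference API exposes an OpenAI-style model listing endpoint; the path may differ in your deployment:

# The internal model name (e.g. "o3-mini-do-not-use") should appear again
curl -H "Authorization: $TOKEN" https://inference-api.example.com/models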
Troubleshooting
If a model does not become available, inspect the logs of the inference-api pod for potential configuration issues.
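For example, assuming the pods carry the label app=inference-api (adjust the selector to your deployment):

# Tail recent logs and filter for connector-related messages
kubectl logs -l app=inference-api --tail=200 | grep -i "connector"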
Please note that not all parameters offered by OpenAI API endpoints are supported.