Introduction
Fortis exposes a single, OpenAI-compatible REST API for every model in the catalog. If you already use theopenai SDK, you only change the base URL and the API key — everything else works unchanged.
This guide takes you from zero to a streaming completion. You’ll need a Fortis account and an API key, which you can create from the dashboard.
Authentication
Authenticate every request with a bearer token. Store your key in an environment variable — never commit it to source control.
Keys are scoped per project and can be rotated at any time. Requests without a valid key return 401 Unauthorized.
Your first request
Send a chat completion with the official OpenAI SDK by overriding the base URL:
That’s the whole integration. The response shape matches the OpenAI spec, so existing parsing and tooling keep working.
Streaming responses
Set stream: trueto receive tokens as they’re generated — ideal for chat UIs where time-to-first-token matters.
Choosing a model
Pass any catalog model id as model. Start with fortis-l-8b for speed and cost, then scale up to fortis-l-70b for harder tasks. See the pricing page for per-model token rates.
Errors & retries
The API uses standard HTTP status codes. Retry 429 and 5xx responses with exponential backoff.
Rate limits scale with your plan. Every error body includes a machine-readable code and a human-readable message.
Next steps
You have a working endpoint. From here:
- Tune throughput and cost on the pricing page.
- Reserve dedicated capacity for production workloads.
- Wire structured outputs and tool calls into your app.
Questions? Reach the team in the developer Slack.