DOCS · QUICKSTART

Ship your first endpoint.

The Fortis API is OpenAI-compatible — point your existing SDK at our base URL and start serving inference in minutes.


Introduction

Fortis exposes a single, OpenAI-compatible REST API for every model in the catalog. If you already use the openai SDK, you only change the base URL and the API key; everything else works unchanged.

This guide takes you from zero to a streaming completion. You’ll need a Fortis account and an API key, which you can create from the dashboard.

Authentication

Authenticate every request with a bearer token. Store your key in an environment variable — never commit it to source control.

SHELL
export FORTIS_API_KEY="sk-fortis-..."

Keys are scoped per project and can be rotated at any time. Requests without a valid key return 401 Unauthorized.
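
If you call the API without the SDK, the bearer scheme means sending the key in an Authorization header. The following is a minimal sketch of that step. The helper name buildAuthHeaders is hypothetical and not part of any Fortis SDK; it takes an environment-like object so it can fail fast when the key is missing, rather than sending a request that would come back as 401.

```typescript
// Hypothetical helper (not part of any Fortis SDK): builds the headers for a
// bearer-authenticated JSON request from an environment-like object.
type Env = Record<string, string | undefined>;

function buildAuthHeaders(env: Env): Record<string, string> {
  const key = env.FORTIS_API_KEY;
  if (!key || key.trim() === "") {
    // Fail early instead of sending a request that would be rejected with 401.
    throw new Error("FORTIS_API_KEY is not set");
  }
  return {
    Authorization: `Bearer ${key}`,
    "Content-Type": "application/json",
  };
}
```

In Node, pass process.env as the argument: buildAuthHeaders(process.env).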

Your first request

Send a chat completion with the official OpenAI SDK by overriding the base URL:

TYPESCRIPT
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.FORTIS_API_KEY,
  baseURL: "https://api.fortis.dev/v1",
});

const res = await client.chat.completions.create({
  model: "fortis-l-70b",
  messages: [{ role: "user", content: "Explain inference in one line." }],
});

console.log(res.choices[0].message.content);

That’s the whole integration. The response shape matches the OpenAI spec, so existing parsing and tooling keep working.

Streaming responses

Set stream: true to receive tokens as they’re generated, which is ideal for chat UIs where time-to-first-token matters.

TYPESCRIPT
const stream = await client.chat.completions.create({
  model: "fortis-l-70b",
  messages: [{ role: "user", content: "Write a haiku about GPUs." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
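
Chat UIs usually need the complete reply as well, for example to store it in the conversation history. The sketch below extends the loop above: it forwards each token to a callback and returns the concatenated text. The names collectStream and ChunkLike are hypothetical; ChunkLike declares only the fields the loop reads, so the stream returned by the SDK should be accepted structurally.

```typescript
// Minimal chunk shape: only the fields the streaming loop reads.
interface ChunkLike {
  choices: Array<{ delta?: { content?: string | null } }>;
}

// Hypothetical helper: forwards each token to `onToken` as it arrives and
// returns the concatenated text once the stream ends.
async function collectStream(
  stream: AsyncIterable<ChunkLike>,
  onToken: (token: string) => void = () => {},
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    // Chunks without text (empty choices or an empty delta) are skipped.
    const token = chunk.choices[0]?.delta?.content ?? "";
    if (token) {
      onToken(token);
      full += token;
    }
  }
  return full;
}
```

With the stream from the example above, usage would look like: const text = await collectStream(stream, (t) => process.stdout.write(t));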

Choosing a model

Pass any catalog model ID as the model parameter. Start with fortis-l-8b for speed and cost, then scale up to fortis-l-70b for harder tasks. See the pricing page for per-model token rates.

Errors & retries

The API uses standard HTTP status codes. Retry 429 and 5xx responses with exponential backoff.

HTTP
200  OK             — completion returned
401  Unauthorized   — missing or invalid API key
429  Rate limited   — back off and retry
503  Overloaded     — capacity scaling, retry shortly
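
The retry rule above (429 and 5xx, with exponential backoff) can be sketched as follows. Everything here is illustrative rather than a Fortis recommendation: the helper names isRetryable, backoffDelayMs, and withRetries are hypothetical, the 500 ms base, 8 s cap, and 4 retries are assumed defaults, and the random jitter is an addition that keeps many clients from retrying in lockstep. The wrapper assumes failures are thrown as errors carrying a numeric status field.

```typescript
// Status codes the guide says to retry: 429 and any 5xx.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Delay before retry number `attempt` (0-based): a random value between 0 and
// min(capMs, baseMs * 2^attempt). Base and cap are assumed defaults.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: `call` performs one request and throws an error with
// a numeric `status` on failure (an assumption about the client in use).
async function withRetries<T>(
  call: () => Promise<T>,
  maxRetries = 4,
  delayFor: (attempt: number) => number = backoffDelayMs,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const status = (err as { status?: unknown } | null)?.status;
      const retryable = typeof status === "number" && isRetryable(status);
      // Give up on non-retryable errors (e.g. 401) or when retries run out.
      if (!retryable || attempt >= maxRetries) throw err;
      await sleep(delayFor(attempt));
    }
  }
}
```

The openai Node SDK already retries some failures on its own, configurable through its maxRetries client option, so a wrapper like this mainly matters for raw HTTP clients or custom retry policies.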

Rate limits scale with your plan. Every error body includes a machine-readable code and a human-readable message.
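
To act on the code and message fields in application code, a small parser helps. This guide does not specify the exact JSON layout of the error body, so the sketch below is an assumption: it accepts the two fields either at the top level or nested under an error key, which is the layout the OpenAI API uses. The names ApiErrorInfo and parseErrorBody are hypothetical.

```typescript
// ASSUMPTION: the exact layout of Fortis error bodies is not specified in
// this guide. This sketch accepts `code` and `message` either at the top
// level or nested under `error` (the layout the OpenAI API uses).
interface ApiErrorInfo {
  code: string;
  message: string;
}

function parseErrorBody(body: unknown): ApiErrorInfo | null {
  if (typeof body !== "object" || body === null) return null;
  const outer = body as Record<string, unknown>;
  // Prefer the nested object when present, otherwise read the top level.
  const inner =
    typeof outer.error === "object" && outer.error !== null
      ? (outer.error as Record<string, unknown>)
      : outer;
  const { code, message } = inner;
  if (typeof code === "string" && typeof message === "string") {
    return { code, message };
  }
  // Unknown layout: return null so callers can fall back to the HTTP status.
  return null;
}
```

Branch on code in program logic and show message to people; when the parser returns null, fall back to the HTTP status code.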

Next steps

You have a working endpoint. From here:

  • Tune throughput and cost on the pricing page.
  • Reserve dedicated capacity for production workloads.
  • Wire structured outputs and tool calls into your app.

Questions? Reach the team in the developer Slack.