I’ve been testing the new Fireworks Firepass, a $7/week pass that drops per-token costs to zero for reasoning models like Kimi K2.5 Turbo. I started with OpenCode, but my real goal was to use Kimi as a worker for complex “missions” in Factory Droid.
The catch is that Factory’s official docs recommend the standard OpenAI-compatible endpoint. That works for basic prompts, but Droid ends up ignoring the thinking tokens that drive the reasoning view in the terminal.
Here is the standard recommendation for ~/.factory/settings.json:
{
"model": "accounts/fireworks/routers/kimi-k2p5-turbo",
"baseUrl": "https://api.fireworks.ai/inference/v1",
"provider": "generic-chat-completion-api",
"apiKey": "fw_...",
"displayName": "Kimi K2.5 Turbo [OpenAI API]"
}
The problem is that the generic provider doesn’t pick up Fireworks’ reasoning tokens. Kimi sends those in a reasoning_content field that Droid simply ignores.
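To make the failure mode concrete, here is a minimal sketch of what gets dropped. The response shape is illustrative (the reasoning_content field name is what Fireworks returns for Kimi on the chat-completions endpoint; the actual values are made up):

```python
# Illustrative Fireworks chat-completions response (values are made up).
# Kimi's chain-of-thought arrives in "reasoning_content", a non-standard
# field sitting next to the standard "content".
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "The answer is 4.",
                "reasoning_content": "The user asked for 2 + 2, so I add them...",
            }
        }
    ],
    "usage": {"prompt_tokens": 346, "completion_tokens": 52},
}

message = sample_response["choices"][0]["message"]

# A generic OpenAI-style parser reads only the standard field...
visible_text = message.get("content", "")

# ...and silently drops the non-standard one.
thinking = message.get("reasoning_content")

print(visible_text)          # the only thing Droid surfaces
print(thinking is not None)  # the reasoning was there, just never read
```

This is why Test A below reports only input and output tokens: the thinking text never makes it past the parser.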
The A/B Test
I ran a headless test using droid exec with two different configs to see how Factory Droid handles the usage metadata.
Test A: OpenAI API (Standard)
- Provider: generic-chat-completion-api
- Base URL: https://api.fireworks.ai/inference/v1
Metadata Result:
"usage": {
"input_tokens": 346,
"output_tokens": 52
}
Verdict: Zero thinking tokens captured. The agent just sees one block of text. You don’t get the “Thinking” animation in the CLI.
Test B: Anthropic API (The Hack)
- Provider: anthropic
- Base URL: https://api.fireworks.ai/inference (note: no /v1 suffix)
Metadata Result:
"usage": {
"input_tokens": 7085,
"output_tokens": 50,
"thinking_tokens": 57
}
Verdict: Factory Droid successfully isolated the 57 thinking_tokens.
Why this works
Fireworks provides an Anthropic-compatible endpoint. Switching the provider to anthropic tells Droid to use its Claude-style message parser, which knows how to handle extended-thinking blocks. In my testing, the Fireworks endpoint also honors the thinking parameter and the thinkingMaxTokens budget.
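A sketch of the Anthropic-style request Droid ends up making once the provider is switched. The /v1/messages path and the thinking block follow Anthropic’s Messages API conventions; that Fireworks accepts them at this base URL is based on my testing, not official docs, and the helper below is mine, not part of Droid:

```python
# Base URL deliberately omits /v1: the Anthropic-style client appends
# its own paths (e.g. /v1/messages) to it.
BASE_URL = "https://api.fireworks.ai/inference"

def build_messages_request(prompt: str, thinking_budget: int = 16384) -> dict:
    """Assemble an Anthropic-style Messages payload with extended thinking on.

    Hypothetical helper for illustration; mirrors what the anthropic
    provider setting makes Droid send under the hood.
    """
    return {
        "url": f"{BASE_URL}/v1/messages",
        "body": {
            "model": "accounts/fireworks/routers/kimi-k2p5-turbo",
            # Anthropic requires max_tokens to exceed the thinking budget.
            "max_tokens": thinking_budget + 4096,
            "thinking": {"type": "enabled", "budget_tokens": thinking_budget},
            "messages": [{"role": "user", "content": prompt}],
        },
    }

req = build_messages_request("Summarize this repo's build steps.")
```

The thinking block is what unlocks the separate thinking_tokens entry in the usage metadata from Test B.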
The Optimal Config
Use this config for Kimi K2.5 Turbo on Factory Droid:
{
"model": "accounts/fireworks/routers/kimi-k2p5-turbo",
"baseUrl": "https://api.fireworks.ai/inference",
"apiKey": "fw_...",
"displayName": "Kimi K2.5 Turbo (Anthropic API)",
"provider": "anthropic",
"enableThinking": true,
"thinkingMaxTokens": 16384
}
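If you want to sanity-check the file before launching Droid, here is a quick script. The field names come from the config above; the validation rules are my own heuristics, not part of Factory’s tooling:

```python
import json
import tempfile
from pathlib import Path

# Keys the configs in this post always carry.
REQUIRED = {"model", "baseUrl", "apiKey", "provider"}

def check_settings(path: Path) -> list[str]:
    """Return a list of problems with a Factory settings file; empty means OK."""
    cfg = json.loads(path.read_text())
    problems = [f"missing key: {k}" for k in sorted(REQUIRED - cfg.keys())]
    if cfg.get("provider") == "anthropic" and cfg.get("baseUrl", "").endswith("/v1"):
        problems.append("baseUrl should not end in /v1 for the anthropic provider")
    if cfg.get("enableThinking") and "thinkingMaxTokens" not in cfg:
        problems.append("enableThinking is set but thinkingMaxTokens is missing")
    return problems

# Example: write the optimal config to a temp file and verify it passes.
good = {
    "model": "accounts/fireworks/routers/kimi-k2p5-turbo",
    "baseUrl": "https://api.fireworks.ai/inference",
    "apiKey": "fw_test",
    "provider": "anthropic",
    "enableThinking": True,
    "thinkingMaxTokens": 16384,
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(good, f)

problems = check_settings(Path(f.name))
print(problems)  # []
```

The /v1 check exists because that suffix is the single easiest way to break this setup: it works fine for the generic provider and silently fails for the anthropic one.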
This setup is the most reliable way I’ve found to use Kimi as a worker for Droid missions.