I’ve been testing the new Fireworks Firepass, a $7/week pass that drops per-token costs to zero for reasoning models like Kimi K2.5 Turbo. I started with OpenCode, but my real goal was to use Kimi as a worker for complex “missions” in Factory Droid.
The catch is that Factory’s official docs recommend the standard OpenAI-compatible endpoint. That works for basic prompts, but Droid ends up ignoring the thinking tokens that drive the reasoning view in the terminal.
Here is the standard recommendation for ~/.factory/settings.json:
{
"model": "accounts/fireworks/routers/kimi-k2p5-turbo",
"baseUrl": "https://api.fireworks.ai/inference/v1",
"provider": "generic-chat-completion-api",
"apiKey": "fw_...",
"displayName": "Kimi K2.5 Turbo [OpenAI API]"
}
The problem is that the generic provider doesn’t pick up Fireworks’ reasoning tokens. Kimi sends those in a reasoning_content field that Droid simply ignores.
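To make the failure mode concrete, here is a minimal sketch of what gets dropped. The response shape is illustrative (the reasoning_content field name is what Fireworks returns for Kimi on the chat-completions endpoint; the actual values are made up):

```python
# Illustrative Fireworks chat-completions response (values are made up).
# Kimi's chain-of-thought arrives in "reasoning_content", a non-standard
# field sitting next to the standard "content".
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "The answer is 4.",
                "reasoning_content": "The user asked for 2 + 2, so I add them...",
            }
        }
    ],
    "usage": {"prompt_tokens": 346, "completion_tokens": 52},
}

message = sample_response["choices"][0]["message"]

# A generic OpenAI-style parser reads only the standard field...
visible_text = message.get("content", "")

# ...and silently drops the non-standard one.
thinking = message.get("reasoning_content")

print(visible_text)          # the only thing Droid surfaces
print(thinking is not None)  # the reasoning was there, just never read
```

This is why Test A below reports only input and output tokens: the thinking text never makes it past the parser.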
The A/B Test
I ran a headless test using droid exec with two different configs to see how Factory Droid handles the usage metadata.
Test A: OpenAI API (Standard)
- Provider: generic-chat-completion-api
- Base URL: https://api.fireworks.ai/inference/v1
Metadata Result:
"usage": {
"input_tokens": 346,
"output_tokens": 52
}
Verdict: Zero thinking tokens captured. The agent just sees one block of text. You don’t get the “Thinking” animation in the CLI.
Test B: Anthropic API (The Hack)
- Provider: anthropic
- Base URL: https://api.fireworks.ai/inference (note: no /v1 suffix)
Metadata Result:
"usage": {
"input_tokens": 7085,
"output_tokens": 50,
"thinking_tokens": 57
}
Verdict: Factory Droid successfully isolated the 57 thinking_tokens.
Why this works
Fireworks provides an Anthropic-compatible endpoint. Switching the provider to anthropic tells Droid to use its Claude-style message parser, which knows how to handle extended-thinking blocks. In my testing, the Fireworks endpoint also honors the thinking parameter and the thinkingMaxTokens budget.
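A sketch of the Anthropic-style request Droid ends up making once the provider is switched. The /v1/messages path and the thinking block follow Anthropic’s Messages API conventions; that Fireworks accepts them at this base URL is based on my testing, not official docs, and the helper below is mine, not part of Droid:

```python
# Base URL deliberately omits /v1: the Anthropic-style client appends
# its own paths (e.g. /v1/messages) to it.
BASE_URL = "https://api.fireworks.ai/inference"

def build_messages_request(prompt: str, thinking_budget: int = 16384) -> dict:
    """Assemble an Anthropic-style Messages payload with extended thinking on.

    Hypothetical helper for illustration; mirrors what the anthropic
    provider setting makes Droid send under the hood.
    """
    return {
        "url": f"{BASE_URL}/v1/messages",
        "body": {
            "model": "accounts/fireworks/routers/kimi-k2p5-turbo",
            # Anthropic requires max_tokens to exceed the thinking budget.
            "max_tokens": thinking_budget + 4096,
            "thinking": {"type": "enabled", "budget_tokens": thinking_budget},
            "messages": [{"role": "user", "content": prompt}],
        },
    }

req = build_messages_request("Summarize this repo's build steps.")
```

The thinking block is what unlocks the separate thinking_tokens entry in the usage metadata from Test B.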
The Optimal Config
Use this config for Kimi K2.5 Turbo on Factory Droid:
{
"model": "accounts/fireworks/routers/kimi-k2p5-turbo",
"baseUrl": "https://api.fireworks.ai/inference",
"apiKey": "fw_...",
"displayName": "Kimi K2.5 Turbo (Anthropic API)",
"provider": "anthropic",
"enableThinking": true,
"thinkingMaxTokens": 16384
}
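If you want to sanity-check the file before launching Droid, here is a quick script. The field names come from the config above; the validation rules are my own heuristics, not part of Factory’s tooling:

```python
import json
import tempfile
from pathlib import Path

# Keys the configs in this post always carry.
REQUIRED = {"model", "baseUrl", "apiKey", "provider"}

def check_settings(path: Path) -> list[str]:
    """Return a list of problems with a Factory settings file; empty means OK."""
    cfg = json.loads(path.read_text())
    problems = [f"missing key: {k}" for k in sorted(REQUIRED - cfg.keys())]
    if cfg.get("provider") == "anthropic" and cfg.get("baseUrl", "").endswith("/v1"):
        problems.append("baseUrl should not end in /v1 for the anthropic provider")
    if cfg.get("enableThinking") and "thinkingMaxTokens" not in cfg:
        problems.append("enableThinking is set but thinkingMaxTokens is missing")
    return problems

# Example: write the optimal config to a temp file and verify it passes.
good = {
    "model": "accounts/fireworks/routers/kimi-k2p5-turbo",
    "baseUrl": "https://api.fireworks.ai/inference",
    "apiKey": "fw_test",
    "provider": "anthropic",
    "enableThinking": True,
    "thinkingMaxTokens": 16384,
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(good, f)

problems = check_settings(Path(f.name))
print(problems)  # []
```

The /v1 check exists because that suffix is the single easiest way to break this setup: it works fine for the generic provider and silently fails for the anthropic one.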
This setup is the most reliable way I’ve found to use Kimi as a worker for Droid missions.