Pay per token. Nothing else.
No monthly fees, no minimums, no hidden charges. You pay only for what you use: input and output tokens at transparent, per-model rates. Volume discounts apply automatically as your usage grows.
Pricing model
Bentoo AI charges per token for both prompt (input) and completion (output) tokens. Rates are set per model and are typically 40–87% lower than going directly to the provider. We achieve this through volume pricing and intelligent provider routing.
Popular model pricing
Prices are shown per 1M tokens. Output rates are typically 2–4x input rates.
See the full model registry for complete pricing on all 40+ models. Prices are updated in real time when providers change their rates.
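Because rates are quoted per 1M tokens, with separate input and output prices, the cost of a single request can be estimated with a little arithmetic. The sketch below is illustrative only: the function name `request_cost` and the rates in the example are placeholders, not actual Bentoo AI prices, so substitute the figures from the model registry.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens, output_tokens, input_rate_per_m, output_rate_per_m):
    """Return the USD cost of one request, given per-1M-token rates."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * Decimal(input_rate_per_m)
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * Decimal(output_rate_per_m)
    return input_cost + output_cost


# Placeholder rates (NOT real prices): $1.00 input and $3.00 output per 1M tokens.
cost = request_cost(12_000, 3_000, "1.00", "3.00")
print(f"Estimated cost: ${cost:.4f}")
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many requests.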
Free tier
- ~10M input tokens on Gemini 2.5 Flash
- ~1.6M input tokens on Claude Sonnet 4.6
- ~500K input tokens on GPT-5
- ~18M input tokens on DeepSeek V3
Free-tier accounts are rate-limited to 20 requests/minute and 200K tokens/day. Upgrade to Pro to remove these limits.
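To avoid hitting these limits, a client can keep its own count before sending a request. The following is a minimal sketch, assuming rolling 60-second and 24-hour windows; the way the service actually measures its windows is not stated here, and the class name `FreeTierBudget` is hypothetical.

```python
from collections import deque


class FreeTierBudget:
    """Client-side guard for the free-tier limits (assumes rolling windows)."""

    def __init__(self, max_requests_per_minute=20, max_tokens_per_day=200_000):
        self.max_rpm = max_requests_per_minute
        self.max_tpd = max_tokens_per_day
        self.request_times = deque()  # timestamps of requests in the last 60 s
        self.token_log = deque()      # (timestamp, tokens) pairs in the last 24 h

    def allow(self, now, tokens):
        """Record the request and return True if it fits both limits, else False."""
        # Drop entries that have aged out of each window.
        while self.request_times and now - self.request_times[0] >= 60:
            self.request_times.popleft()
        while self.token_log and now - self.token_log[0][0] >= 86_400:
            self.token_log.popleft()

        tokens_used = sum(used for _, used in self.token_log)
        if len(self.request_times) >= self.max_rpm:
            return False
        if tokens_used + tokens > self.max_tpd:
            return False

        self.request_times.append(now)
        self.token_log.append((now, tokens))
        return True
```

In real use, pass `time.monotonic()` as `now` and wait or back off whenever `allow` returns `False`. Taking the clock as an argument keeps the class easy to test.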
Billing & payments
Billing cycle
Usage is tracked in real time and billed to your account balance. Auto-topup can be enabled to maintain a minimum balance. Invoices are generated monthly and sent via email.
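As a rough mental model of how a balance with auto-topup behaves, the sketch below deducts a charge and then tops up in fixed increments until the minimum balance is restored. It is an assumption for illustration, not a description of Bentoo AI's billing implementation; the function name `charge` and its parameters are hypothetical.

```python
from decimal import Decimal


def charge(balance, cost, min_balance, topup_amount):
    """Deduct `cost`, then top up until the balance is at least `min_balance`.

    Returns a (new_balance, total_topped_up) tuple.
    """
    if topup_amount <= 0:
        raise ValueError("topup_amount must be positive")

    balance -= cost
    topped_up = Decimal("0")
    # Add fixed-size top-ups until the minimum balance is met again.
    while balance < min_balance:
        balance += topup_amount
        topped_up += topup_amount
    return balance, topped_up


new_balance, topped_up = charge(
    Decimal("10"), Decimal("7"), min_balance=Decimal("5"), topup_amount=Decimal("20")
)
print(new_balance, topped_up)
```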
Payment methods
We accept credit cards (Visa, Mastercard, Amex), PayPal, wire transfer (Enterprise), and crypto (USDC on Ethereum / Solana). All payments are processed securely through Stripe.
Usage dashboard
Track spend by model, by day, and by API key in the Bentoo AI dashboard. Set budget alerts to get notified when you hit 50%, 80%, or 100% of a custom threshold.
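The alert levels can be read as thresholds that fire once, when spend first reaches them. A minimal sketch of that logic follows, assuming alerts are evaluated between two consecutive spend readings; `crossed_alerts` is a hypothetical helper, not part of any Bentoo AI API.

```python
def crossed_alerts(previous_spend, current_spend, budget, levels=(50, 80, 100)):
    """Return the alert percentages reached between two spend readings.

    A level fires when spend moves from below it to at or above it, so each
    alert is reported once rather than on every subsequent reading.
    """
    return [
        pct
        for pct in levels
        if previous_spend * 100 < budget * pct <= current_spend * 100
    ]


# With a $100 threshold, going from $40 to $85 crosses the 50% and 80% levels.
print(crossed_alerts(40, 85, 100))
```

Comparing against integer percentages avoids floating-point surprises at the exact threshold values.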
Enterprise pricing
For teams using 100M+ tokens per month, we offer custom enterprise agreements with:
- Committed-use discounts up to an additional 35% off list pricing
- Dedicated capacity — reserved throughput for critical workloads
- Custom SLAs with 99.99% uptime guarantees
- SSO & SAML for team access control
- Zero-data-retention agreements
- Dedicated support with 1-hour response time
Contact sales@bentoo.ai for a custom quote.