Help me improve BoltAI

For a bug report, please include the following information:

  • Your OS version.

  • Your BoltAI app version.

  • Whether you are a Setapp user.

  • The AI provider & model you’re using.

  • Steps to reproduce the issue.

  • The error message.

In Progress

[Feature Request] Custom API Headers & Body Parameters per Model

## The Problem

Anthropic just released Claude Opus 4.6 Fast Mode — a research preview that delivers up to 2.5x faster output token generation. To use it, you need to pass two things that BoltAI currently doesn’t support:

1. A custom HTTP header: `anthropic-beta: fast-mode-2026-02-01`
2. A custom body parameter: `"speed": "fast"`

The model ID stays the same (`claude-opus-4-6`), so you can’t work around this by just adding a new model. Without these fields, the request goes through as standard Opus 4.6 — no speed boost, and no way to opt in.

This isn’t just about Fast Mode. Anthropic (and other providers) increasingly use beta headers and extra body parameters for new features:

- `anthropic-beta: context-1m-2025-08-07` — 1M token context window
- `anthropic-beta: prompt-caching-2024-07-31` — prompt caching
- `speed: "fast"` — fast inference mode
- `inference_geo: "us"` — data residency controls

Right now, none of these are accessible from BoltAI.

## The Suggestion

In the Edit Model screen, add two optional fields:

- Custom Headers — a key-value input (or raw text field) for extra HTTP headers.
- Custom Body Parameters — a key-value input (or JSON field) for additional parameters injected into the request body.

### Example Configuration

For Opus 4.6 Fast Mode:

|Field            |Value                                 |
|-----------------|--------------------------------------|
|Model ID         |`claude-opus-4-6`                     |
|Custom Header    |`anthropic-beta: fast-mode-2026-02-01`|
|Custom Body Param|`"speed": "fast"`                     |

For the 1M context window:

|Field        |Value                                  |
|-------------|---------------------------------------|
|Model ID     |`claude-opus-4-6`                      |
|Custom Header|`anthropic-beta: context-1m-2025-08-07`|

## Why It Fits BoltAI

BoltAI already gives power users granular control over model parameters like Top-P, Temperature, and Frequency Penalty. Custom headers and body params are the natural next step — they unlock every beta/preview feature from any provider without waiting for BoltAI to add explicit UI support for each one.

This is a forward-looking solution: instead of playing catch-up with every new API flag, you give users the tools to configure it themselves on day one.

## Priority

High — Fast Mode is available now and providers are shipping new beta features at an accelerating pace. Every week without this means users either can’t access new capabilities or have to fall back to curl/scripts.
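To make the suggestion concrete, here is a minimal sketch of what such a merge could look like internally. The header and body values come from this request; the helper itself, its name, and the defaults are illustrative assumptions, not BoltAI's actual code, and nothing is sent over the network.

```python
# Hypothetical sketch: merge per-model custom headers and body parameters
# into an Anthropic Messages API request before sending it.

def build_request(model_id, messages, custom_headers=None, custom_body=None):
    """Return (headers, body) for a Messages API call, with user overrides applied."""
    headers = {
        "content-type": "application/json",
        "anthropic-version": "2023-06-01",
        # API key omitted; the app would add "x-api-key" from its keychain.
    }
    body = {
        "model": model_id,
        "max_tokens": 1024,
        "messages": messages,
    }
    # User-supplied values are applied last, so beta headers and extra
    # parameters pass straight through to the provider.
    headers.update(custom_headers or {})
    body.update(custom_body or {})
    return headers, body

headers, body = build_request(
    "claude-opus-4-6",
    [{"role": "user", "content": "Hello"}],
    custom_headers={"anthropic-beta": "fast-mode-2026-02-01"},
    custom_body={"speed": "fast"},
)
```

Because the overrides are applied last, the same two UI fields would cover every future beta flag without code changes.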

Greg About 1 month ago

[Feature Request] "Paste long text as a file" with customizable threshold

I would love to see a feature that automatically converts large text pastes into an attached file. This is a game-changer for keeping conversation history readable on mobile and for taking full advantage of Prompt Caching.

The Suggestion:

  • Add a toggle: "Paste long text as a file".

  • The Key Detail: Add a slider or input field to set the threshold (e.g., from 500 to 10,000 characters).

  • Benefit: This allows power users to decide exactly when their code or documents should be minimized into a file icon versus staying as a readable message.

Why it fits BoltAI: Your app already provides incredibly granular control over model parameters like Top-P and Frequency Penalty. Adding a customizable threshold for text handling would perfectly align with this "Power User" philosophy and further differentiate BoltAI from simpler AI clients.

Sincerely,
Greg
iPhone 16 Pro
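The decision logic being asked for is small; here is a sketch under the assumptions that the slider range is the suggested 500 to 10,000 characters and that the default threshold is a made-up value. The function name and signature are illustrative.

```python
# Illustrative sketch of the suggested "paste long text as a file" setting.

def should_paste_as_file(text: str, threshold: int = 2000,
                         enabled: bool = True) -> bool:
    """Decide whether a paste should become a file attachment."""
    # Clamp the user-chosen threshold to the suggested slider range.
    threshold = max(500, min(10_000, threshold))
    return enabled and len(text) >= threshold
```

A 5,000-character code paste would collapse into a file icon, while a short question stays inline as a readable message.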

Greg About 1 month ago

Generation fails when app or chat is not active

I’ve encountered a few cases where I send a message and it fails to generate:

When the app is not active (switching apps or locking the phone after sending a message):

  • Received error “fetch failed: Network connection was lost” with Claude Sonnet 4.5

  • Received error “could not parse response” with gpt-5-mini

  • Did NOT receive an error with gpt-5

When opening a different chat in BoltAI while the response is being generated:

  • No error, no response with gpt-5, gpt-5-mini, and Claude Sonnet 4.5

Since tool calling and reasoning sometimes take a few minutes, this is a pretty big obstacle to multitasking. I’ll do some more testing later; it could totally be just those models or something about my setup.

harrisonfloam About 2 months ago

1

Better Shortcuts Support

Just wanted to say I've been trying out the app on TestFlight, and I'm really happy I found it. I've used other apps for a while, mostly Typing Mind, which is a web app. It's got some good stuff, but the main reason I started looking around was for something more native that integrates better with my system. There aren't many options like that for iPhone, so I was pretty excited to see BoltAI. Testing it out, it really feels like it has so much potential.

But there's one area I think could really make it shine: more Shortcuts actions. It would be awesome to be able to make requests/start chats without even opening the app. Say you select some text in any app; you could just share it to a custom shortcut you set up that processes the request and places the result in the clipboard. An option in that action to either start a new conversation or continue the last one would be awesome, and if the system prompt could also be set for that specific interaction, well, that'd be the cherry on top. A good, solid Shortcuts integration would really open up what you can do with a native app like this, letting it tie into other apps.

One more thing I'd love for the developer to think about is offering a lifetime option for cases where the user is using services with their own API tokens. I get that sustainability is tricky, but honestly, I just prefer avoiding subscriptions if I can.

Robert J. P. Oberg 6 months ago

2

"Pseudo-Realtime" voice chat via the ElevenLabs Conversational Agents API, using an API adapter to incorporate existing user app data/credentials and native app features

OpenAI is deprecating the classic voice mode in all their apps soon, and their new implementation of “Advanced Voice” is tuned to reduce their compute burden and is a large regression vs. the older AVM implementation. I encourage you to try their new voice mode; it’s quite horrible. There is a very hungry consumer base that will be displaced by this. What I’d suggest:

  • Entitlements to do background conversations (like ChatGPT.app can do).

  • Local VAD, preferably using “semantic detection,” though I don’t know how hard that is. If you need to use classic detection, expose parameters for the user to adjust sensitivity, and possibly implement some kind of “back-off” algorithm to dynamically set the noise floor. (Road noise while driving completely ruins the ChatGPT.app VAD.)

  • Use the ElevenLabs Conversational Agents API.

  • Incorporate any native/local future features, like access to Reminders, as custom tool calls.

  • Inject some extra config options to use the custom LLM option (piggyback off of the existing connections the user saved).

  • Integrate the existing MCP config with ElevenLabs’ Conversation Agent MCP config.

  • Disable the “default_model_personality” parameter, which makes the voice chat feel way too “customer service” and is what makes ElevenLabs’ default implementation via 11.ai feel “off” compared to how ChatGPT.app’s AVM used to feel.
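The “back-off” idea mentioned for classic VAD can be sketched simply: track a running noise floor with an exponential moving average and flag speech only when a frame’s energy rises well above it. This is a generic technique, not any ElevenLabs or BoltAI API; the class name, `alpha`, and `margin` values are all assumptions.

```python
# Minimal adaptive noise-floor sketch for energy-based (classic) VAD.

class AdaptiveVAD:
    def __init__(self, alpha=0.05, margin=3.0):
        self.alpha = alpha        # how quickly the floor adapts to ambient noise
        self.margin = margin      # frame must exceed floor * margin to be speech
        self.noise_floor = None

    def is_speech(self, frame_energy: float) -> bool:
        if self.noise_floor is None:
            self.noise_floor = frame_energy   # seed from the first frame
            return False
        speech = frame_energy > self.noise_floor * self.margin
        if not speech:
            # Only adapt on non-speech frames, so talking doesn't raise the floor.
            self.noise_floor += self.alpha * (frame_energy - self.noise_floor)
        return speech
```

With steady road noise, the floor creeps upward and sensitivity backs off, which is exactly the failure mode the post describes fixed sensitivity struggling with.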

kernkraft 7 months ago

Consider supporting the coreML version of Parakeet v3 for local ASR

Whisper is extremely slow by comparison, and while the ASR on macOS is actually decent now, those upgrades don’t seem to be coming to iOS devices (including the 16 GB iPad Pros). Even though I am a native English speaker and people IRL have no problem understanding me, the Apple ASR model on iOS literally can’t understand me well enough to finish two sentences properly.

I currently use the ~490 MB distilled Parakeet ASR model with a third-party app on macOS, and it is absolutely sufficient. If we extrapolate from my known transcription speed on an M3 Max (~250x realtime) and relative Geekbench Metal performance, an iPhone 15 Pro could transcribe audio 47x faster than realtime. So if a user was yapping for two whole minutes, Parakeet would transcribe it in about 2.5 s, plus probably an extra ~2-3 s the first time the model loads with the app in the foreground.

https://huggingface.co/FluidInference/parakeet-tdt-0.6b-v2-coreml

It claims max memory usage is 800 MB.
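Working the extrapolation through makes the claim easy to check. The 250x figure is the poster's own measurement and the 47x figure is their scaled estimate, not a benchmark; the arithmetic below just connects them.

```python
# Checking the post's extrapolation: at ~47x realtime, a 2-minute clip
# takes 120 / 47 ~= 2.55 s, matching the "about 2.5 s" claim.

m3_max_speed = 250           # x realtime on M3 Max (poster's measurement)
iphone_speed = 47            # x realtime on iPhone 15 Pro (poster's estimate)
audio_seconds = 2 * 60       # a 2-minute dictation

transcribe_seconds = audio_seconds / iphone_speed
relative_throughput = iphone_speed / m3_max_speed   # ~0.19 of the M3 Max
```

The implied ratio (~19% of M3 Max Metal throughput) is the assumption doing all the work; if the real ratio is lower, the transcription time scales up proportionally.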

kernkraft 7 months ago

2