"Pseudo-Realtime" voice chat via the ElevenLabs Conversational Agents API, using an API adapter to incorporate existing user app data/credentials and native app features

OpenAI is deprecating the classic voice mode in all of their apps soon. Their new implementation of “Advanced Voice” is tuned to reduce their compute burden and is a large regression compared to the older AVM implementation. I encourage you to try the new voice mode; it’s quite horrible.

There is a very hungry consumer base that will be displaced by this.

  • Entitlements to do background conversations (like ChatGPT.app can do).

  • Local VAD (voice activity detection), preferably using “semantic detection”, though I don’t know how hard that is. If you need to use classic detection, expose parameters for the user to adjust sensitivity, and possibly implement some kind of “back-off” algorithm to dynamically set the noise floor. (Road noise while driving completely ruins the ChatGPT.app VAD.)

  • Use the ElevenLabs Conversational Agents API

    • Incorporate any native/local future features, like access to reminders, as custom tool calls.

    • Inject some extra config options to:

      • Use the custom LLM option (piggybacking off the existing connections the user has saved)

      • Integrate the existing MCP config with ElevenLabs’ Conversational Agent MCP config

      • Disable the “default_model_personality” parameter, which makes the voice chat feel far too “customer service” and is what makes ElevenLabs’ default implementation via 11.ai feel “off” compared to how ChatGPT.app’s AVM used to feel.
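For the classic-detection fallback in the VAD item above, here is a rough sketch of what I mean by a user-adjustable sensitivity plus a “back-off” noise floor. This is only an illustration under my own assumptions, not how ChatGPT.app or ElevenLabs do it: the class name, parameter names, and default values are all made up, and it assumes mono audio frames of floats in [-1, 1]. Python is used purely for readability; a real app would do this in native code.

```python
import math
from dataclasses import dataclass, field


def rms_dbfs(samples):
    """Return the RMS level of one audio frame in dBFS (samples are floats in [-1, 1])."""
    if not samples:
        return -100.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-10))


@dataclass
class AdaptiveVad:
    """Hypothetical energy-based VAD with an adaptive noise floor."""
    sensitivity_db: float = 6.0  # user-adjustable margin above the floor that counts as speech
    floor_db: float = -60.0      # current noise-floor estimate, in dBFS
    rise_rate: float = 0.05      # slow upward tracking when non-speech frames get louder
    fall_rate: float = 0.5       # fast downward tracking when the room gets quieter
    backoff_frames: int = 150    # "speech" lasting this many frames is treated as noise
    _speech_run: int = field(default=0, init=False, repr=False)

    def process(self, samples):
        """Classify one frame as speech (True) or not (False) and update the floor."""
        level = rms_dbfs(samples)
        is_speech = level > self.floor_db + self.sensitivity_db
        if is_speech:
            self._speech_run += 1
            if self._speech_run >= self.backoff_frames:
                # Back-off: uninterrupted "speech" is probably steady noise
                # (e.g. road noise), so jump the floor up to the current level.
                self.floor_db = level
                self._speech_run = 0
                is_speech = False
        else:
            self._speech_run = 0
            # Only non-speech frames update the floor, so talking does not raise it.
            rate = self.rise_rate if level > self.floor_db else self.fall_rate
            self.floor_db += rate * (level - self.floor_db)
        return is_speech
```

The trade-off is that a very long uninterrupted utterance would also trigger the back-off, so `backoff_frames` and `sensitivity_db` are exactly the kind of parameters I would want exposed in settings.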

Status: In Review
Board: BoltAI Mobile
Date: 6 months ago
Author: kernkraft
