Wouldn’t it be fantastic to interact with our automations using our most intuitive method – our voice? And wouldn’t it be great to have a single Personal Assistant that can communicate with other existing Agents?
This is where Vagent excels. It's a streamlined Voice Interface that you and your entire team can use to talk to an AI Supervisor: essentially a personal AI Assistant with access to your custom workflows.
Watch the video below for a demonstration and a detailed explanation of its functionality.
To download the App, access the Docs, and obtain a Multi-Agent workflow template, please visit: https://vagent.io
Hi there,
Appreciate the excellent work and commitment to open-source. I've been using Vagent, and it's functioning flawlessly. I had an idea: what if we could enhance it by integrating OpenAI's real-time voice agent? This integration could enable us to connect tools that are executable in real-time via voice commands. For instance, we could configure callin.io workflows as webhooks and invoke them directly through the voice agent, eliminating the need for transcription. The agent could initiate workflows solely through voice commands, fostering a more fluid user experience. What are your thoughts on this concept?
Appreciate the feedback. Great to hear!
Integrating the new Realtime API is an interesting idea; I only recently learned about its release. It should cut response times by roughly 2 seconds.
Using predefined commands feels like a step back in my opinion. It brings to mind the DOS era, where users needed to memorize commands to operate a computer. In contrast, I've structured the application and multi-agent system to facilitate more natural conversations, eliminating the need to understand the underlying processes. This makes it user-friendly, even for non-technical colleagues.
I believe transitioning to low-latency models from Groq could be an effective way to accelerate the overall response time.
Fantastic work on this! It's great that you've made it open source as well.
Looking forward to experimenting with this in my upcoming workflow builds.
Fantastic work, thank you. I’m just starting to play with it. If you do make the switch to the Realtime API, it would be great if there was an option to use an Azure deployment of OpenAI Realtime API.
Wow, really nice work with great attention to detail!
On my wish list: an option to set the OpenAI base URL, so we can point it at a proxy like LiteLLM or a compatible provider such as OpenRouter.
LiteLLM can serve as a proxy for Azure OpenAI, including the realtime-preview model.
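As a rough illustration of why a configurable base URL covers so many cases (the endpoint, key, and model names here are placeholders): any OpenAI-compatible server can be targeted just by swapping the base URL in front of the standard route.

```python
# Sketch: build a request against any OpenAI-compatible endpoint
# (e.g. a local LiteLLM proxy or OpenRouter). All values are placeholders.
import json
import urllib.request

def chat_completion_request(base_url: str, api_key: str, model: str,
                            messages: list) -> urllib.request.Request:
    """POST to the standard /chat/completions route under a custom base URL."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )
```

Pointing `base_url` at a local proxy instead of api.openai.com would then route the same request through to Azure OpenAI or another backend without any other client changes.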
Thanks!
That's fantastic – precisely what I needed! However, I'm not an iOS user. Is there any possibility you'll be releasing this for Android in the near future?
Another vote for the Android version. I've seen someone mention they got it working, but it involved downloading a roughly 100MB app from Google Drive, which always feels a bit insecure (not making a judgment on its actual security, just the perception).
Android version: GitHub - octionic/vagent-android: A voice activated interface for your custom AI Agent.