xAI Releases Voice Agent Builder Beta; Grok Voice Surpasses GPT in Benchmarks


xAI announced on July 1 the launch of the Voice Agent Builder Beta, a fully no-code platform that lets users build enterprise-grade AI voice agents in 2 minutes from natural language prompts. The platform uses a single end-to-end Speech-to-Speech voice path tightly coupled with Grok Voice, which xAI says surpasses GPT in benchmarks.

τ-voice Bench: Grok Voice Think Fast 1.0 Surpasses GPT

xAI's AI voice agent building platform (Source: xAI website)

According to xAI's official release, Grok Voice Think Fast 1.0 ranks first on the τ-voice Bench voice benchmark leaderboard, ahead of Google Gemini 3.1 Flash Live and OpenAI GPT Realtime 1.5 in both response speed and reasoning capability.

xAI explained that Grok Voice is trained on real call scenarios chosen to be "the most difficult," including low-quality phone audio with noise, strong accents, user interruptions, and ambiguous commands. It natively supports more than 25 languages.

End-to-End Speech-to-Speech Architecture: Single Voice Path Replaces the Traditional STT+LLM+TTS Pipeline

According to xAI, traditional enterprise AI voice customer service chains together three independent systems: Speech-to-Text (STT), a Large Language Model (LLM), and Text-to-Speech (TTS). Each hop in this assembled pipeline adds latency, and the chain as a whole raises error rates and operational costs.

Voice Agent Builder adopts a single end-to-end Speech-to-Speech voice path tightly coupled with Grok Voice. Audio is processed along one path with no hand-offs between separate components, with the aim of reducing latency and the errors introduced when stages are chained together.
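The reasoning behind the multi-hop concern can be sketched with a simple model: in a chained pipeline the per-stage delays add up, and if stage errors are independent, the per-stage success rates multiply. The Python sketch below is illustrative only. The function names and all figures are hypothetical placeholders, not measurements of xAI's or any other vendor's system, and the independence assumption is a simplification.

```python
from math import prod


def cascade_latency_ms(stage_latencies_ms):
    """Total latency of a chained pipeline: each hop adds its own delay."""
    return sum(stage_latencies_ms)


def cascade_success_rate(stage_success_rates):
    """Chance that every stage handles a turn correctly,
    assuming stage errors are independent (a simplification)."""
    return prod(stage_success_rates)


# Hypothetical placeholder figures, not measurements of any real system.
stage_latency_ms = {"stt": 300, "llm": 500, "tts": 200}
stage_success = {"stt": 0.95, "llm": 0.95, "tts": 0.99}

print(cascade_latency_ms(stage_latency_ms.values()))              # 1000
print(round(cascade_success_rate(stage_success.values()), 3))     # 0.893
```

Under these placeholder numbers, three stages that each look acceptable on their own still combine into a full second of delay and roughly one flawed turn in ten, which is the kind of compounding a single voice path is meant to avoid.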

Knowledge Base, Tool Integration, Voice Cloning, and Phone Access: Four Core Feature Specifications

According to xAI's official feature description, the four core functional modules of Voice Agent Builder are as follows:

Knowledge Base: Supports uploading Word, Excel, PDF, JSON, and other formats, which can be organized into shared Collections across agents, ensuring consistency in product specifications and policies.

Tools & Connectors: Built-in Google/Outlook Calendar, web search, X (Twitter) search, and Notion; supports transfer to human agent, end call, and real-time team notifications.

Voice & Telephony: Offers over 80 built-in voices; supports brand voice cloning with just 2 minutes of audio; can obtain a free phone number provided by xAI, or connect to existing PBX systems via SIP.

Transparent Pricing: Compute API fee is $0.05 per minute, with no additional platform fee; when using a phone number provided by xAI, an additional communication fee of $0.01 per minute is charged.
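The quoted rates make cost estimates straightforward arithmetic. The Python sketch below uses only the two per-minute figures from the announcement; the function name is hypothetical, and it assumes billing is proportional to minutes used (the article does not state the billing granularity) and ignores any carrier charges on a customer's own SIP connection.

```python
from decimal import Decimal

# Rates quoted in the article (USD per minute).
COMPUTE_RATE = Decimal("0.05")
TELEPHONY_RATE = Decimal("0.01")  # applies only with an xAI-provided number


def estimate_call_cost(minutes, uses_xai_number=True):
    """Return the estimated cost in USD for a given number of call minutes.

    Assumes cost scales linearly with minutes; actual billing
    granularity is not specified in the article.
    """
    rate = COMPUTE_RATE + (TELEPHONY_RATE if uses_xai_number else Decimal("0"))
    return Decimal(str(minutes)) * rate


# 1,000 minutes on an xAI-provided number: 1000 * (0.05 + 0.01)
print(estimate_call_cost(1000))                          # 60.00
# 1,000 minutes over an existing SIP connection: 1000 * 0.05
print(estimate_call_cost(1000, uses_xai_number=False))   # 50.00
```

`Decimal` is used instead of floating point so that currency amounts such as 0.05 and 0.01 are represented exactly.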

Enterprise Security Mechanisms: Automatic Recording Transcripts, Tool Usage Logs, and Dialogue Boundary Settings

According to xAI's official announcement, Voice Agent Builder includes built-in monitoring mechanisms (Observability) and guardrails for enterprise users: each call is automatically recorded with a generated transcript; administrators can view the tools used by the AI during calls at any time; and strict dialogue boundaries can be set, such as prohibiting the AI from reading out customer credit card numbers or discussing off-topic political subjects.
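The announcement does not describe how these dialogue boundaries are enforced. As a rough illustration of one such rule, the Python sketch below redacts likely credit card numbers from a piece of text, such as a transcript line, by matching 13 to 19 digit sequences and confirming them with the Luhn checksum. This is a generic technique and an assumption about how a card-number boundary could work, not xAI's mechanism; every name in it is hypothetical.

```python
import re

# Candidate card numbers: 13-19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def passes_luhn(digits):
    """Return True if the digit string satisfies the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_card_numbers(text):
    """Replace likely card numbers in text with a placeholder."""
    def _replace(match):
        digits = re.sub(r"[ -]", "", match.group())
        # Leave digit runs that fail the checksum (order IDs, phone numbers) intact.
        return "[REDACTED]" if passes_luhn(digits) else match.group()
    return CARD_PATTERN.sub(_replace, text)


# 4111 1111 1111 1111 is a widely used test card number that passes Luhn.
print(redact_card_numbers("Your card 4111 1111 1111 1111 is on file."))
# Your card [REDACTED] is on file.
```

The checksum step keeps the filter from redacting every long digit sequence, at the cost of missing mistyped or partially spoken numbers.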

xAI stated in the official announcement: "Listening with your ears is more accurate than looking at benchmarks—build an agent and call with your most difficult workflow to try it out."

FAQ

What is the compute cost for xAI Voice Agent Builder?

According to xAI's official announcement, the compute API fee is $0.05 per minute, with no additional platform fee. Calls that use the free phone number provided by xAI incur an additional communication fee of $0.01 per minute.

How does Grok Voice Think Fast 1.0 perform on τ-voice Bench?

According to xAI's official release, Grok Voice Think Fast 1.0 surpasses Google Gemini 3.1 Flash Live and OpenAI GPT Realtime 1.5 on the τ-voice Bench benchmark, ranking first on the leaderboard in both response speed and reasoning capability.

Where can xAI Voice Agent Builder be tried now?

According to xAI's official announcement, the Voice Agent Builder Beta is now live on the xAI Console and open for trial.
