Daily WebRTC
Using Daily as a WebRTC transport for your Pipecat Cloud agents
Daily is a WebRTC platform that provides real-time voice and video capabilities to connect users with voice agents.
Pipecat Cloud offers a first-class integration with Daily, making it straightforward to deploy WebRTC-enabled agents without managing complex infrastructure yourself.
Using an Integrated Daily API Key
When you create a Pipecat Cloud account, you’re automatically provisioned with a Daily API key that’s fully integrated with the platform.
The integrated Daily API key provides:
- Zero configuration: No need to separately sign up for Daily or manage API keys
- Free voice minutes: Voice minutes for one human participant and one agent are free when using your Pipecat Cloud-provisioned Daily key
- Simplified deployment: Create Daily rooms and launch agents with a single command
- Built-in compatibility: All Pipecat base images work with Daily out of the box
While 1:1 voice minutes are included, additional Daily features like recording, transcription, and PSTN/SIP connections are billed according to Daily’s standard pricing.
Using Daily with Pipecat Cloud Agents
Starting an Agent with Daily Transport
When starting a Pipecat Cloud agent, you can specify Daily as the transport by passing the --use-daily flag to the CLI or by setting the equivalent parameters in SDK or REST API calls.
Starting an agent this way creates a Daily room and returns a URL you can open in your browser to talk to your agent.
Room customization features will be available in a future update to Pipecat Cloud.
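As a rough illustration of the REST route mentioned above, the sketch below builds (but does not send) a start request using only the Python standard library. The base URL, the /start path, and the createDailyRoom and body field names are assumptions made for this example and should be checked against the Pipecat Cloud REST API reference; build_start_request is a hypothetical helper, not part of any SDK.

```python
import json
import urllib.request
from typing import Optional

# Assumed base URL and path layout; verify against the REST API reference.
API_BASE = "https://api.pipecat.daily.co/v1/public"


def build_start_request(
    agent_name: str, public_api_key: str, data: Optional[dict] = None
) -> urllib.request.Request:
    """Build, without sending, a request that starts an agent with a Daily room."""
    payload = {
        "createDailyRoom": True,  # assumed field name for requesting a Daily room
        "body": data or {},       # assumed field for custom data passed to the bot
    }
    return urllib.request.Request(
        url=f"{API_BASE}/{agent_name}/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {public_api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder agent name and key, for illustration only.
    req = build_start_request("my-agent", "pk_example")
    print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would perform the call. Keeping construction separate from sending makes the payload easy to inspect before any room is created.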
Using Your Own Daily API Key
While Pipecat Cloud provides a Daily API key with included 1:1 voice minutes, you can optionally use your own Daily API key if you have specific requirements. When using your own key, all usage will be billed according to your Daily account’s pricing plan rather than being included with your Pipecat Cloud subscription.
WebRTC vs. WebSockets for Voice AI
When building voice AI applications, choosing the right transport technology is crucial:
- WebRTC is purpose-built for real-time audio and video communication between browsers and devices:
  - Optimized for real-time media streaming over unpredictable networks
  - Uses UDP, prioritizing speed over guaranteed packet delivery
  - Provides built-in echo cancellation and noise suppression
  - Intelligently adapts bitrate based on changing network conditions
  - Handles NAT traversal for connections across different networks
  - Includes device management for cameras and microphones
- WebSockets work well for server-to-server communication because:
  - They operate in controlled network environments with stable connections
  - When server-to-server latency is low, packet retransmission doesn’t add substantial delay
  - Server environments don’t need device access or media quality enhancements
  - They’re simpler to implement for pure data transmission
  - Many server platforms have built-in WebSocket support
For browser or mobile app voice applications, WebRTC delivers superior performance across varying network conditions. For more details, see How to Talk to an LLM with Your Voice.
When connecting to telephony systems like Twilio or implementing server-to-server communication where network conditions are controlled and reliable, WebSockets remain an appropriate choice. However, for any user-facing voice or video application, WebRTC offers significant advantages in quality and reliability.
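To make the retransmission trade-off concrete, here is a deliberately simplified model of one lost audio packet. It compares in-order reliable delivery (the behavior of TCP, which WebSockets run on) with unordered, unreliable delivery (closer to how WebRTC media travels over UDP). The packet interval, delays, and the assumption that a lost packet is recovered exactly one round trip later are illustrative numbers, not measurements.

```python
def availability_times(n_packets, interval_ms, one_way_ms, lost, rtt_ms, reliable):
    """Return when each packet becomes usable by the receiver (ms), or None if dropped.

    reliable=True models in-order delivery with retransmission.
    reliable=False models delivery where lost packets are simply skipped.
    """
    available = []
    latest = 0  # in reliable mode, nothing is released before earlier packets
    for i in range(n_packets):
        sent = i * interval_ms
        if i in lost:
            if not reliable:
                available.append(None)  # skipped; the audio layer conceals the gap
                continue
            # Simplifying assumption: the retransmitted copy lands one RTT late.
            arrived = sent + one_way_ms + rtt_ms
        else:
            arrived = sent + one_way_ms
        if reliable:
            latest = max(latest, arrived)  # head-of-line blocking
            available.append(latest)
        else:
            available.append(arrived)
    return available


if __name__ == "__main__":
    # Five 20 ms audio packets, 50 ms one-way delay, 100 ms RTT, packet 1 lost.
    print(availability_times(5, 20, 50, {1}, 100, reliable=True))
    # -> [50, 170, 170, 170, 170]
    print(availability_times(5, 20, 50, {1}, 100, reliable=False))
    # -> [50, None, 90, 110, 130]
```

In the reliable case a single loss holds back every later packet until the retransmission arrives, which a listener hears as a stall. In the unreliable case only one 20 ms frame is missing. With a round trip of a few milliseconds between servers, the same stall becomes negligible, which is why WebSockets stay reasonable there.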
How it Works
When you start an agent with Daily integration:
- Pipecat Cloud creates a Daily room using your API key (either the provisioned one or your custom key)
- A room URL and an access token are generated, which together provide access to the Daily room
- The agent is started and configured to join the Daily room
- Your code (in bot.py) receives the room URL and token as parameters
- The Pipecat framework automatically connects to the Daily room using these parameters
The basic flow is handled for you, allowing you to focus on building your agent’s conversation logic rather than WebRTC infrastructure.
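The handoff described in the steps above can be pictured with a small, dependency-free sketch. DailySessionArgs is a hypothetical stand-in for whatever argument object the platform passes to bot.py, and the bot function here only validates and returns the values; neither name is taken from the Pipecat or Pipecat Cloud APIs.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class DailySessionArgs:
    """Hypothetical stand-in for the parameters passed to bot.py."""
    room_url: str
    token: str


async def bot(args: DailySessionArgs) -> dict:
    """Entry point sketch: check the handoff values and return join settings."""
    if not args.room_url.startswith("https://"):
        raise ValueError("expected an https Daily room URL")
    if not args.token:
        raise ValueError("expected a non-empty access token")
    # A real bot would pass these values to the Daily transport at this point
    # and then run its conversation pipeline.
    return {"room_url": args.room_url, "token": args.token}


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    demo = DailySessionArgs("https://example.daily.co/demo-room", "demo-token")
    print(asyncio.run(bot(demo)))
```

The point of the sketch is the shape of the contract: the agent code never creates the room or mints the token itself, it only consumes the two values it is given.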