active-call: a very high-performance Voice Agent framework written in pure Rust

Posted by shenjinti on 2026-01-13 13:21

Tags: rust, voice-agent, webrtc, sip

https://github.com/restsend/active-call (stars welcome)

First, a quick look at it in action: Playbook demo

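The most important change in this release is that Silero VAD no longer depends on onnxruntime.

I rewrote the Silero VAD inference by hand in pure Rust. It is much faster (roughly 2.5x) and its memory footprint is very low: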

| VAD Engine | Implementation | Time (60 s of audio) | RTF (Ratio) | Note |
|---|---|---|---|---|
| TinySilero | Rust (Optimized) | ~60.0 ms | 0.0010 | >2.5x faster than ONNX |
| ONNX Silero | ONNX Runtime | ~158.3 ms | 0.0026 | Standard baseline |
| WebRTC VAD | C/C++ (Bind) | ~3.1 ms | 0.00005 | Legacy, less accurate |
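
For readers who want to check the RTF column: the real-time factor is processing time divided by the duration of the audio processed. The sketch below recomputes it from the timings in the table. The function names are my own for illustration and are not part of active-call, and the "streams per core" figure is only a VAD-only upper bound that ignores every other per-call cost.

```rust
/// Real-time factor: processing time divided by the duration of the audio processed.
fn rtf(processing_ms: f64, audio_ms: f64) -> f64 {
    processing_ms / audio_ms
}

/// Upper bound on how many real-time streams one core could run VAD for,
/// ignoring all other per-call work (codec, ASR, TTS, networking).
fn max_streams_per_core(rtf: f64) -> f64 {
    1.0 / rtf
}

fn main() {
    let audio_ms = 60_000.0; // 60 s of audio, as in the table
    // (engine name, measured time in ms), copied from the table
    let engines = [("TinySilero", 60.0), ("ONNX Silero", 158.3), ("WebRTC VAD", 3.1)];
    for (name, ms) in engines {
        let r = rtf(ms, audio_ms);
        println!(
            "{name:<12} RTF = {r:.5}, VAD-only streams per core <= {:.0}",
            max_streams_per_core(r)
        );
    }
    // Ratio of the two Silero timings from the table.
    println!("TinySilero vs ONNX Silero: {:.2}x", 158.3 / 60.0);
}
```

An RTF of 0.0010 means one second of audio costs about one millisecond of CPU, so VAD alone is far from being the bottleneck at a few hundred concurrent calls.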

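This release also introduces a brand-new playbook. The old WebSocket API was too raw, and trying out a complete voice agent flow with it was a hassle, so here is a reference setup that you can extend in Markdown: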

---
asr:
  provider: "aliyun"
llm:
  provider: "aliyun"
  model: "qwen-turbo"
tts:
  provider: "aliyun"
vad:
  provider: "silero"
denoise: true
greeting: "您好，我是您的AI助理，请问有什么可以帮您？"
interruption: "both"
recorder:
  recorderFile: "hello_{id}.wav"
---
# Role and Purpose
You are an intelligent, polite AI assistant. Your goal is to help users with their inquiries efficiently.

# Tool Usage
- When the user expresses a desire to end the conversation (e.g., "goodbye", "hang up", "I'm done"), you MUST provide a polite closing statement AND call the `hangup` tool.
- Always include your response text in the `text` field and any tool calls in the `tools` array.

# Example Response for Hanging Up:
```json
{
  "text": "很高兴能为您服务，如果您还有其他问题，欢迎随时联系。再见！",
  "tools": [{"name": "hangup"}]
}
```

---
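
The playbook above has two parts: a configuration header between the `---` lines (it looks like YAML front matter) and a Markdown body that serves as the prompt. Here is a minimal sketch of how such a file could be split, using only the standard library. `split_playbook` is a hypothetical helper, not active-call's actual loader, and it assumes `\n` line endings.

```rust
/// Splits a playbook into (front matter, body).
/// Returns None if the text does not start with a `---` line
/// or if no closing `---` line follows.
fn split_playbook(src: &str) -> Option<(&str, &str)> {
    let rest = src.strip_prefix("---\n")?;
    let mut offset = 0;
    for line in rest.split_inclusive('\n') {
        // The closing delimiter is a line consisting only of `---`.
        if line.trim_end() == "---" {
            let front = &rest[..offset];
            let body = &rest[offset + line.len()..];
            return Some((front, body));
        }
        offset += line.len();
    }
    None
}

fn main() {
    // A shortened version of the playbook shown above.
    let playbook = "---\nllm:\n  provider: \"aliyun\"\n  model: \"qwen-turbo\"\nvad:\n  provider: \"silero\"\n---\n# Role and Purpose\nYou are an intelligent, polite AI assistant.\n";
    let (config, prompt) =
        split_playbook(playbook).expect("playbook must start with front matter");
    println!("config:\n{config}");
    println!("prompt:\n{prompt}");
}
```

The config half would then go to a YAML parser and the body would be passed to the LLM as the system prompt; both of those steps are left out here.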

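It also keeps our biggest technical differentiator: it is the only Voice-Agent SDK with built-in SIP support, so it can connect directly to a SIP gateway and bridge AI with phone calls.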

Streaming LLM output is now implemented as well, so the agent can speak while it is still thinking.
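
One common way to get "speak while thinking" is to cut the streamed text into sentences and hand each one to TTS as soon as it is complete. The sketch below illustrates that idea only; `SentenceChunker` is a hypothetical type, and I am not claiming this is how active-call does it internally. Splitting on `.` is naive (it would also split "2.5"), so treat the punctuation rule as a placeholder.

```rust
/// Accumulates streamed LLM text and yields sentences as soon as they end.
struct SentenceChunker {
    buf: String,
}

impl SentenceChunker {
    fn new() -> Self {
        Self { buf: String::new() }
    }

    /// Feeds one streamed delta and returns any sentences it completed.
    fn push(&mut self, delta: &str) -> Vec<String> {
        let mut out = Vec::new();
        for ch in delta.chars() {
            self.buf.push(ch);
            // Naive sentence boundary: ASCII or full-width terminal punctuation.
            if matches!(ch, '.' | '!' | '?' | '。' | '！' | '？') {
                let sentence = self.buf.trim().to_string();
                if !sentence.is_empty() {
                    out.push(sentence);
                }
                self.buf.clear();
            }
        }
        out
    }

    /// Flushes trailing text that has no terminal punctuation; call at end of stream.
    fn finish(&mut self) -> Option<String> {
        let rest = self.buf.trim().to_string();
        self.buf.clear();
        if rest.is_empty() { None } else { Some(rest) }
    }
}

fn main() {
    let mut chunker = SentenceChunker::new();
    // Simulated deltas from a streaming LLM response.
    for delta in ["Hel", "lo! How can", " I help", " you today? Let me", " check"] {
        for sentence in chunker.push(delta) {
            println!("to TTS: {sentence}"); // hypothetical hand-off to the TTS stage
        }
    }
    if let Some(rest) = chunker.finish() {
        println!("to TTS: {rest}");
    }
}
```

With this shape, the first sentence can start playing while the model is still generating the rest, which is where most of the perceived latency win comes from.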

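Compared with frameworks such as Pipecat and LiveKit, active-call is more narrowly focused on voice calls. Its core strength is performance: a 2-core, 4 GB machine comfortably handles 200 concurrent calls and runs stably for a month or so without problems.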

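The basic WebSocket API is still there, of course, and lets you control a call over a WebSocket connection. It is a fairly low-level interface, and it is also the option developers like best.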

Judging by the call performance data, chatting with the AI is now quite comfortable: latency is mostly under 800 ms.

Ext Link: https://github.com/restsend/active-call/
