https://github.com/restsend/active-call (stars welcome!)
First, a quick look at how it performs:

The most important improvement in this release is that Silero VAD no longer depends on onnxruntime.
The Silero VAD inference was rewritten by hand in pure Rust. Performance improved substantially (roughly 2.5x), and memory usage is very low:
| VAD Engine | Implementation | Time (per 60 s audio) | RTF | Notes |
|---|---|---|---|---|
| TinySilero | Rust (optimized) | ~60.0 ms | 0.0010 | >2.5x faster than ONNX |
| ONNX Silero | ONNX Runtime | ~158.3 ms | 0.0026 | Standard baseline |
| WebRTC VAD | C/C++ (bindings) | ~3.1 ms | 0.00005 | Legacy, less accurate |
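As a side note, the RTF here is simply processing time divided by audio duration. Below is a rough sketch of how such a number can be measured; the `Vad` type is a stub standing in for the real engine, and the actual TinySilero API in active-call may differ:

```rust
use std::time::Instant;

/// Stub VAD: the real TinySilero engine in active-call has its own API;
/// this only sketches the shape of a frame-based detector.
struct Vad;

impl Vad {
    fn process_frame(&mut self, _frame: &[f32]) -> bool {
        // A real implementation would run the Silero model on the frame
        // and return whether it contains speech.
        false
    }
}

fn main() {
    const SAMPLE_RATE: usize = 16_000;
    const FRAME: usize = 512; // Silero-style models typically consume 512-sample frames at 16 kHz
    let audio = vec![0.0f32; SAMPLE_RATE * 60]; // 60 s of test audio

    let mut vad = Vad;
    let start = Instant::now();
    for frame in audio.chunks_exact(FRAME) {
        vad.process_frame(frame);
    }
    let elapsed = start.elapsed().as_secs_f64();

    // RTF = processing time / audio duration, e.g. 0.060 s / 60 s = 0.0010
    println!("elapsed = {:.1} ms, RTF = {:.4}", elapsed * 1e3, elapsed / 60.0);
}
```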
This release also introduces a brand-new playbook. The old WebSocket API was too low-level, and getting a full voice-agent flow running with it was cumbersome, so we now ship a reference setup that can be extended with plain Markdown:
````markdown
---
asr:
  provider: "aliyun"
llm:
  provider: "aliyun"
  model: "qwen-turbo"
tts:
  provider: "aliyun"
vad:
  provider: "silero"
  denoise: true
greeting: "您好,我是您的AI助理,请问有什么可以帮您?"
interruption: "both"
recorder:
  recorderFile: "hello_{id}.wav"
---

# Role and Purpose

You are an intelligent, polite AI assistant. Your goal is to help users with their inquiries efficiently.

# Tool Usage

- When the user expresses a desire to end the conversation (e.g., "goodbye", "hang up", "I'm done"), you MUST provide a polite closing statement AND call the `hangup` tool.
- Always include your response text in the `text` field and any tool calls in the `tools` array.

# Example Response for Hanging Up

```json
{
  "text": "很高兴能为您服务,如果您还有其他问题,欢迎随时联系。再见!",
  "tools": [{"name": "hangup"}]
}
```
````
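The playbook is just YAML frontmatter (engine configuration) followed by a Markdown body that becomes the system prompt. As an illustration only, here is one way such a file could be split and parsed with `serde`/`serde_yaml`; the structs below mirror the example above and are not necessarily the exact schema active-call uses:

```rust
use serde::Deserialize;

// Only a subset of the fields from the example above; the real schema lives in the repo.
#[derive(Debug, Deserialize)]
struct EngineConfig {
    provider: String,
    model: Option<String>,
}

#[derive(Debug, Deserialize)]
struct Playbook {
    asr: EngineConfig,
    llm: EngineConfig,
    tts: EngineConfig,
    greeting: Option<String>,
    interruption: Option<String>,
}

/// Split a playbook file into (frontmatter config, markdown system prompt).
fn parse_playbook(text: &str) -> Option<(Playbook, String)> {
    let rest = text.strip_prefix("---")?;
    let (yaml, body) = rest.split_once("\n---")?;
    let config: Playbook = serde_yaml::from_str(yaml).ok()?;
    Some((config, body.trim().to_string()))
}

fn main() {
    let text = std::fs::read_to_string("playbook.md").expect("read playbook");
    if let Some((config, prompt)) = parse_playbook(&text) {
        println!("llm = {} / {:?}", config.llm.provider, config.llm.model);
        println!("system prompt:\n{prompt}");
    }
}
```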
The project also keeps its biggest technical differentiator: it is the only Voice-Agent SDK with built-in SIP support, so it can connect directly to a SIP gateway and bridge the AI to the telephone network.
LLM streaming output is implemented as well, so the agent can start speaking while it is still thinking.
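The basic idea behind "speaking while thinking" is to flush streamed LLM tokens to TTS at sentence boundaries instead of waiting for the full reply. A minimal sketch with a mocked token stream follows; the real pipeline in active-call also has to handle interruption, tool calls, and so on:

```rust
/// Flush buffered text to TTS as soon as a sentence boundary appears,
/// so synthesis can start while the LLM is still generating.
fn speak_streaming<I: IntoIterator<Item = String>>(tokens: I, tts: impl Fn(&str)) {
    const BOUNDARY: &[char] = &['。', '!', '?', '.', '!', '?'];
    let mut buf = String::new();
    for token in tokens {
        buf.push_str(&token);
        // Naive boundary check; a real pipeline would also consider
        // abbreviations, minimum chunk length, pauses, etc.
        if buf.ends_with(BOUNDARY) && !buf.trim().is_empty() {
            tts(buf.trim());
            buf.clear();
        }
    }
    if !buf.trim().is_empty() {
        tts(buf.trim()); // whatever is left when the stream ends
    }
}

fn main() {
    // Mocked LLM token stream; in practice this comes from the LLM's streaming API.
    let tokens = ["您好,", "我是您的", "AI助理。", "请问有什么", "可以帮您?"].map(String::from);
    speak_streaming(tokens, |sentence| println!("TTS <- {sentence}"));
}
```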
Compared with frameworks like Pipecat or LiveKit, active-call is focused squarely on voice calls, and its core strength is performance: a 2-core/4 GB machine comfortably handles 200 concurrent calls and keeps running stably for a month or more.
Of course the most basic WebSocket API is still there, so a call can be controlled entirely over WebSocket. It is a fairly low-level interface, and also the one developers tend to like best.
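For completeness, here is a rough sketch of driving a call over WebSocket with `tokio-tungstenite`. The endpoint path and the JSON command below are placeholders, not the actual active-call protocol; see the repo's WebSocket API documentation for the real message schema:

```rust
use futures_util::{SinkExt, StreamExt};
use tokio_tungstenite::{connect_async, tungstenite::Message};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Placeholder URL: the real endpoint/port depends on your active-call deployment.
    let (mut ws, _resp) = connect_async("ws://127.0.0.1:8080/call/ws").await?;

    // Placeholder command shape; the actual commands are documented in the repo.
    let invite = serde_json::json!({
        "command": "invite",
        "option": { "caller": "web-client", "callee": "agent" }
    });
    ws.send(Message::Text(invite.to_string().into())).await?;

    // Print server events as they arrive.
    while let Some(msg) = ws.next().await {
        match msg? {
            Message::Text(text) => println!("event: {text}"),
            Message::Close(_) => break,
            _ => {}
        }
    }
    Ok(())
}
```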
Looking at the call-performance numbers, chatting with the AI is now quite comfortable (end-to-end latency is generally within 800 ms).