Voice Platform
Moss + LiveKit: Real-Time Search for Voice Agents
LiveKit provides real-time audio and video infrastructure for voice agents. Moss integrates using a context injection pattern: on every user turn, Moss is queried automatically and results are injected into the chat context before the LLM generates a response. No tool-calling overhead, no dead air.
Benefits
Why Use Moss with LiveKit
Context injection pattern: Moss is queried automatically on every user turn; no LLM tool-calling step is needed
Sub-10ms retrieval eliminates dead air: results are injected before the LLM starts generating
Works with LiveKit Agents framework using the on_user_turn_completed hook
Combine with Deepgram STT, OpenAI LLM, and any TTS provider in the LiveKit ecosystem
Pre-load your index at agent startup with load_index() for fastest queries
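The injected context is simply the retrieved passages joined into a single system message. A minimal, dependency-free sketch of that formatting step (the `Doc` type and the character cap are illustrative, not part of the Moss SDK):

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str

def build_context(docs: list[Doc], max_chars: int = 1500) -> str:
    """Join retrieved passages into one block, capped so the injected
    system message stays small enough for a low-latency voice turn."""
    joined = "\n".join(d.text for d in docs)
    return joined[:max_chars]

docs = [
    Doc("Returns are accepted within 30 days."),
    Doc("Standard shipping takes 3-5 business days."),
]
print(build_context(docs))
```

Capping the injected text keeps the prompt short, which matters for voice: a smaller prompt means the LLM starts speaking sooner.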
Integration
Quick Start
from moss import MossClient, QueryOptions
from livekit.agents import Agent, AgentSession, ChatContext, ChatMessage

class MossVoiceAgent(Agent):
    def __init__(self, moss_client: MossClient):
        super().__init__(instructions="You are a helpful assistant.")
        self.moss = moss_client

    async def on_user_turn_completed(
        self, turn_ctx: ChatContext, new_message: ChatMessage
    ) -> None:
        # Automatic search on every user turn
        results = await self.moss.query(
            "knowledge-base", new_message.text_content,
            QueryOptions(top_k=5, alpha=0.8)
        )
        # Inject context before LLM generates
        if results.docs:
            context = "\n".join([d.text for d in results.docs])
            turn_ctx.add_message(role="system", content=context)
        await super().on_user_turn_completed(turn_ctx, new_message)

Setup
Get Started in 3 Steps
Install the SDKs
Run pip install moss livekit-agents livekit-plugins-openai livekit-plugins-deepgram to set up your environment.
Create and load your index
Use client.create_index() to index your knowledge base (FAQs, product docs, etc.), then call load_index() at agent startup for sub-10ms queries.
Extend Agent with context injection
Override on_user_turn_completed() to query Moss and inject results into the chat context. The LLM always has relevant context without needing to decide when to search.
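The hook's control flow can be exercised end to end without LiveKit or Moss installed. In this sketch, stub classes stand in for the real SDKs, so the names and signatures are simplified assumptions, not the actual APIs:

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class StubDoc:
    text: str

class StubMoss:
    """Stand-in for MossClient: returns canned passages for any query."""
    async def query(self, index: str, text: str) -> list[StubDoc]:
        return [StubDoc("Plan A costs $10/mo."), StubDoc("Plan B costs $25/mo.")]

@dataclass
class StubChatContext:
    """Stand-in for livekit.agents.ChatContext."""
    messages: list = field(default_factory=list)

    def add_message(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

async def on_user_turn_completed(moss, ctx, user_text: str) -> None:
    # 1. Search runs on every turn; the LLM never decides whether to call a tool.
    docs = await moss.query("knowledge-base", user_text)
    # 2. Results land in the context before generation starts, so no dead air.
    if docs:
        ctx.add_message("system", "\n".join(d.text for d in docs))
    ctx.add_message("user", user_text)

ctx = StubChatContext()
asyncio.run(on_user_turn_completed(StubMoss(), ctx, "How much is Plan B?"))
print([m["role"] for m in ctx.messages])  # → ['system', 'user']
```

The key property to notice: by the time the user message is appended, the retrieved context is already in place, so generation begins with the knowledge it needs.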