pranavjoshi001 commented Dec 12, 2025

Changelog Entry

TBD

Description

This PR introduces the foundation for Speech-to-Speech (S2S) functionality in Web Chat by adding core infrastructure components. While the implementation is currently non-operational (no-op), it establishes the necessary hooks, providers, and utilities required for the upcoming MMRT (Multi-Modal Real-Time) and ABS (Azure Bot Service) integration changes.

Design

The Speech-to-Speech feature is built on three main components:

  1. Voice Activities Hook - A React hook (useVoiceActivities.ts) that filters and provides voice-specific activities from the Redux store
  2. SpeechToSpeech Provider - A React context provider (SpeechToSpeechComposer.tsx) that manages:
    • Audio recording via useRecorder.ts hook using Web Audio API with AudioWorklet
    • Audio playback via useAudioPlayer.ts hook with proper queueing and timing
    • Speech state management (idle, listening, processing, bot_speaking)
    • Integration with DirectLine for sending audio chunks and handling voice live events
  3. useSpeechToSpeech Hook - A React hook (useSpeechToSpeech.ts) that exposes the mic state, a mic toggle handler, and the speech state to any consumer UI. The provider handles the rest: it intercepts incoming voice chunks, plays them, and sends recorded voice chunks to the socket.
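
The four speech states listed above could be modelled as a small pure transition function. This is a minimal sketch and not the provider's actual logic: only the four state names come from this description, while the event names (`MIC_ON`, `USER_SPEECH_ENDED`, and so on) and the transition rules are assumptions made for illustration.

```typescript
// The four state names come from the PR description.
type SpeechState = 'idle' | 'listening' | 'processing' | 'bot_speaking';

// Hypothetical events; the real provider may react to different triggers.
type SpeechEvent =
  | 'MIC_ON'
  | 'MIC_OFF'
  | 'USER_SPEECH_ENDED'
  | 'BOT_AUDIO_STARTED'
  | 'BOT_AUDIO_ENDED';

// Assumed transition rules: any event that does not apply leaves the state unchanged.
function nextSpeechState(state: SpeechState, event: SpeechEvent): SpeechState {
  switch (event) {
    case 'MIC_ON':
      return state === 'idle' ? 'listening' : state;
    case 'MIC_OFF':
      return 'idle';
    case 'USER_SPEECH_ENDED':
      return state === 'listening' ? 'processing' : state;
    case 'BOT_AUDIO_STARTED':
      // Bot audio is ignored while the mic is off.
      return state === 'idle' ? state : 'bot_speaking';
    case 'BOT_AUDIO_ENDED':
      return state === 'bot_speaking' ? 'listening' : state;
    default:
      return state;
  }
}

// Walk through one assumed conversational turn and record each state reached.
const turn: SpeechEvent[] = [
  'MIC_ON',
  'USER_SPEECH_ENDED',
  'BOT_AUDIO_STARTED',
  'BOT_AUDIO_ENDED',
  'MIC_OFF'
];
const visited: SpeechState[] = [];
let current: SpeechState = 'idle';

for (const event of turn) {
  current = nextSpeechState(current, event);
  visited.push(current);
}
```

Keeping the transitions in a pure function like this would let the state logic be unit tested without a microphone or an audio context, which may matter while the provider itself is still a no-op.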

The provider is designed to work with the existing Web Chat architecture, consuming activities from the Redux store and posting audio data through the postActivity hook.
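
To illustrate what "posting audio data" might involve, the sketch below converts Web Audio float samples to 16-bit PCM and wraps them in an activity-like object. The payload shape, the event name `hypothetical.voice.chunk`, and the 16000 Hz sample rate are all assumptions for illustration; this description does not specify the wire format, and a real implementation would likely encode the samples (for example as base64) before sending.

```typescript
// Convert Web Audio float samples (nominally in [-1, 1]) to 16-bit signed PCM.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);

  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot overflow the 16-bit range.
    const clamped = Math.max(-1, Math.min(1, samples[i]));

    // Negative values scale to -32768, positive values to 32767.
    pcm[i] = Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
  }

  return pcm;
}

// Hypothetical payload shape, NOT the actual protocol used by this PR.
interface VoiceChunkActivity {
  type: 'event';
  name: string;
  value: { sampleRate: number; samples: number[] };
}

// Build the object that a provider could hand to a postActivity-style function.
function toVoiceChunkActivity(samples: Float32Array, sampleRate: number): VoiceChunkActivity {
  return {
    type: 'event',
    name: 'hypothetical.voice.chunk', // assumed name for illustration only
    value: { sampleRate, samples: Array.from(floatTo16BitPCM(samples)) }
  };
}

// Example chunk: silence, full-scale positive, full-scale negative, and two out-of-range values.
const chunk = toVoiceChunkActivity(new Float32Array([0, 1, -1, 2, -3]), 16000);
```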

Specific Changes

New Files Added:

  • isVoiceActivity.ts - Type guard for voice activities
  • useVoiceActivities.ts - Hook to select voice activities from store
  • SpeechToSpeechComposer.tsx - Main S2S provider component
  • useSpeechToSpeech.ts - Hook to consume S2S context
  • useAudioPlayer.ts - Audio playback logic
  • useRecorder.ts - Audio recording logic
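
As a rough picture of how the first two files could fit together, here is a sketch of a type guard plus a filter that mirrors what a selector hook would do, without React or Redux. The `ActivityLike` shape, the `voice.` name prefix, and the `selectVoiceActivities` helper are hypothetical; the actual criteria used by isVoiceActivity.ts are not stated in this description.

```typescript
// Simplified stand-in for a Web Chat activity; the real type has many more fields.
interface ActivityLike {
  type: string;
  name?: string;
  value?: unknown;
}

interface VoiceActivityLike extends ActivityLike {
  type: 'event';
  name: string;
}

// Assumed naming convention for voice events, for illustration only.
const HYPOTHETICAL_VOICE_PREFIX = 'voice.';

// Type guard: narrows an activity to the voice-specific shape.
function isVoiceActivity(activity: ActivityLike): activity is VoiceActivityLike {
  return (
    activity.type === 'event' &&
    typeof activity.name === 'string' &&
    activity.name.startsWith(HYPOTHETICAL_VOICE_PREFIX)
  );
}

// Hypothetical helper: the filtering a useVoiceActivities-style hook would apply to store state.
function selectVoiceActivities(activities: readonly ActivityLike[]): VoiceActivityLike[] {
  return activities.filter(isVoiceActivity);
}

// Example transcript mixing regular and voice activities.
const transcript: ActivityLike[] = [
  { type: 'message' },
  { type: 'event', name: 'voice.chunk', value: [1, 2, 3] },
  { type: 'event', name: 'typing.indicator' },
  { type: 'event', name: 'voice.end' }
];

const voiceOnly = selectVoiceActivities(transcript);
```

Because the guard is a type predicate, `filter` returns an array already narrowed to the voice shape, so downstream code can read `name` without extra checks.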

Current State:

  • Not yet exposed in public API exports
  • Not yet integrated into main Composer wrapper
  • Provides foundation for voice live event protocol implementation
  • Test coverage added.
  • I have added tests and executed them locally
  • I have updated CHANGELOG.md
  • I have updated documentation

Review Checklist

This section is for contributors to review your work.

  • Accessibility reviewed (tab order, content readability, alt text, color contrast)
  • Browser and platform compatibilities reviewed
  • CSS styles reviewed (minimal rules, no z-index)
  • Documents reviewed (docs, samples, live demo)
  • Internationalization reviewed (strings, unit formatting)
  • package.json and package-lock.json reviewed
  • Security reviewed (no data URIs, check for nonce leak)
  • Tests reviewed (coverage, legitimacy)

pranavjoshi001 changed the title from "Feature/core s2s composer" to "Core speech to speech composer implementation (no-op code)" on Dec 12, 2025