Decisions

  • Pending: STT engine choice (Groq Whisper, OpenAI Whisper, or alternatives)
  • Pending: TTS engine choice (ElevenLabs, OpenAI TTS, or alternatives)

User Tasks


Summary

Voice interface for Opus: speak to it and hear its responses.

Problem / Motivation

Typing is not always convenient. Voice interaction enables hands-free use, especially on a phone (FR-022) and when listening to briefings.

Proposed Solution

Integrate STT and TTS engines:

  • STT: Groq Whisper (fast, free tier), OpenAI Whisper, or local Whisper
  • TTS: ElevenLabs (high quality voices), OpenAI TTS, or alternatives

Consider privacy for voice data: cloud engines send recorded audio to a third party, whereas local Whisper keeps it on the device.
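
While the engine decisions are still pending, one option is to hide each engine behind a small interface so the choice can change later without touching the rest of the flow. The following is a minimal sketch under that assumption, not the planned implementation: `SpeechToText`, `TextToSpeech`, `VoicePipeline`, and the `ask_opus` callback are hypothetical names, and the stand-in engines call no real vendor API.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class SpeechToText(Protocol):
    """Hypothetical interface for any STT engine (Groq Whisper, OpenAI Whisper, local)."""

    def transcribe(self, audio: bytes) -> str: ...


class TextToSpeech(Protocol):
    """Hypothetical interface for any TTS engine (ElevenLabs, OpenAI TTS, others)."""

    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    stt: SpeechToText
    tts: TextToSpeech
    ask_opus: Callable[[str], str]  # assumed: takes a text command, returns a text reply

    def handle(self, audio: bytes) -> tuple[str, bytes]:
        """Transcribe audio, forward the text to Opus, and voice the reply."""
        command = self.stt.transcribe(audio).strip()
        if not command:
            raise ValueError("empty transcription; nothing to send")
        reply = self.ask_opus(command)
        return reply, self.tts.synthesize(reply)


# Stand-in engines so the sketch runs without network access.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = VoicePipeline(EchoSTT(), BytesTTS(), ask_opus=lambda cmd: f"ack: {cmd}")
reply, audio_out = pipeline.handle(b"read my briefing")
```

Because `handle` returns both the text reply and the synthesized audio, the same pipeline could serve REQ-1 (text response) and the Phase 2 voice responses.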


Open Questions

No open questions.


Phase Overview

| Phase | Description | Status |
|---|---|---|
| Phase 1 | Speech to Text | |
| Phase 2 | Text to Speech | |

Phase 1: Speech to Text

Goal: Enable voice input that gets transcribed and sent to Opus.

| File / Feature | Details | Owner | Status |
|---|---|---|---|
| Voice input | Voice input via phone or desktop mic | opus | |
| Transcription | Transcribe and send to Opus as text command | opus | |
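
If more than one STT engine ends up being supported, the transcription step could try them in order and use the first usable result. This is a sketch under that assumption only: the `Transcriber` type, `transcribe_with_fallback`, and both stand-in engines are hypothetical, and a real integration would catch narrower exceptions than `Exception`.

```python
from typing import Callable, Sequence

# Hypothetical type: an engine is any callable that turns audio bytes into text.
Transcriber = Callable[[bytes], str]


def transcribe_with_fallback(
    audio: bytes, engines: Sequence[tuple[str, Transcriber]]
) -> tuple[str, str]:
    """Try each (name, engine) in order; return (engine_name, text) from the first success."""
    errors: list[str] = []
    for name, engine in engines:
        try:
            text = engine(audio).strip()
        except Exception as exc:  # broad on purpose in this sketch
            errors.append(f"{name}: {exc}")
            continue
        if text:
            return name, text
        errors.append(f"{name}: empty transcription")
    raise RuntimeError("all STT engines failed: " + "; ".join(errors))


# Stand-ins for real engines, for illustration only.
def flaky_cloud(audio: bytes) -> str:
    raise TimeoutError("simulated network timeout")


def local_stub(audio: bytes) -> str:
    return " add milk to the shopping list "


used, command = transcribe_with_fallback(
    b"\x00\x01", [("cloud", flaky_cloud), ("local", local_stub)]
)
```

Returning the engine name alongside the text makes it easy to log which engine handled each command while the engine choice is being evaluated.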

Phase 2: Text to Speech

Goal: Enable Opus to read responses and briefings aloud.

| File / Feature | Details | Owner | Status |
|---|---|---|---|
| Read briefings aloud | Read daily briefings as audio | opus | |
| Voice responses | Voice responses to queries | opus | |
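
A daily briefing may be longer than a TTS engine accepts in one request, so it could be split at sentence boundaries before synthesis. The sketch below illustrates that idea; `split_for_tts` is a hypothetical helper, and the `max_chars` limit is an assumed placeholder, since no engine limit is specified here.

```python
import re


def split_for_tts(text: str, max_chars: int = 200) -> list[str]:
    """Split text into chunks of at most max_chars, breaking at sentence ends where possible."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-wrap a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


briefing = "Good morning. You have three meetings today. The first starts at nine."
chunks = split_for_tts(briefing, max_chars=45)
```

Each chunk can then be synthesized and played in order, which also lets playback start before the whole briefing has been converted.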

Prerequisites / Gap Analysis

Requirements

| Requirement | Description |
|---|---|
| REQ-1 | Can give Opus a voice command and receive a text response |
| REQ-2 | Can listen to the daily briefing as audio |

Current State

| Component | Status | Details |
|---|---|---|
| Phone access | | FR-022 not yet started |
| STT integration | | No integration exists |
| TTS integration | | No integration exists |

Gap (What’s missing?)

| Gap | Effort | Blocker? |
|---|---|---|
| Phone access (FR-022) | High | Yes |
| STT engine integration | Med | No |
| TTS engine integration | Med | No |

Test

Manual tests

| Test | Expected | Actual | Last |
|---|---|---|---|
| Voice command input | Transcribed text sent to Opus | pending | - |
| Briefing audio playback | Daily briefing read aloud | pending | - |

AI-verified tests

| Scenario | Expected behavior | Verification method |
|---|---|---|

E2E tests

| Scenario | Assertion |
|---|---|

Integration tests

| Component | Coverage |
|---|---|

Unit tests

| Component | Tests | Coverage |
|---|---|---|

History

| Date | Event | Details |
|---|---|---|
| 2026-02-26 | Created | Created from brain dump |
| 2026-02-27 | Renumbered | From FR-019 to FR-024 |
| 2026-02-28 | Rewritten | Aligned to feature-request template |

References

  • FR-022 (Phone Access) — depends on phone access for mobile voice use
  • ClaudeClaw Rebuild Prompt — Groq Whisper STT + ElevenLabs TTS pipeline, OGA→OGG workaround
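
The OGA→OGG workaround is referenced above without detail. One plausible reading, which is an assumption here, is that voice notes arrive with a `.oga` extension that an STT endpoint does not accept; since `.oga` and `.ogg` both denote the Ogg container, copying the file under an `.ogg` name would be enough. The helper `as_ogg` below is hypothetical and only sketches that reading.

```python
import shutil
import tempfile
from pathlib import Path


def as_ogg(path: Path) -> Path:
    """Return a path ending in .ogg, copying the file if it currently ends in .oga."""
    if path.suffix.lower() != ".oga":
        return path
    target = path.with_suffix(".ogg")
    shutil.copyfile(path, target)  # same bytes, different extension
    return target


with tempfile.TemporaryDirectory() as tmp:
    voice_note = Path(tmp) / "note.oga"
    voice_note.write_bytes(b"OggS")  # placeholder bytes, not a valid audio file
    converted = as_ogg(voice_note)
    converted_name = converted.name
    same_bytes = converted.read_bytes() == voice_note.read_bytes()
```

No audio is re-encoded in this sketch; if the actual workaround involves transcoding, a different approach would be needed.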