WebRTCPipeline.ts — 1,582 lines. Call lifecycle, media, audio routing, stats, participants — all in one class.
useSessionConnection.ts — 1,384 lines. Orchestration hook mixed with backend-specific logic.
No central call state. 10 transport states, set from 13+ places, no validation.
2 independent layers, each with its own state machine.
RTC Layer — RTCService (RTCService.ts)
Low-level WebRTC: RTCPeerConnection, SDP, ICE, SFU signaling, negotiation queue. Owns RTCState; the Call layer reads it but never writes it.
Currently: WebRTCSession.ts (995 lines) + PeerConnectionManager.ts (398 lines).
Important: the negotiation queue lives here. All SDP operations (publish, subscribe, renegotiate) are serialized through this queue. The Call layer never touches SDP directly.
Call Layer — CallService (CallService.ts)
Business logic: CallState, participants, media control. Sits on top of the RTC layer.
Currently: WebRTCPipeline.ts (1,582 lines); target ~650-700 lines.
Media Manager (MediaManager.ts)
getUserMedia, camera/mic presets, quality, stop tracks, silent placeholder.
Currently: MediaCaptureService.ts (277 lines). Already clean; the code stays as is and only the file is renamed.
Types (types.ts)
All types in one top-level file: src/callapp/types.ts
Key SFU difference from the P2P examples: The RTC-FRONTEND examples are peer-to-peer (direct offer/answer between users). Our system uses an SFU (Cloudflare Calls) — publish and subscribe are separate SDP negotiations on the same PeerConnection. The SFU requires SDP surgery (strip inactive m-sections, patch extmap IDs, strip orphaned RTX). All of this stays in the RTC layer. Call layer never sees SDP.
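The serialization described for the negotiation queue could look roughly like the sketch below. It is a minimal illustration, not the existing implementation: the class name NegotiationQueue and the enqueue method are hypothetical, and the real queue in WebRTCSession.ts may carry extra concerns (cancellation, timeouts) that are not shown.

```typescript
// Hypothetical sketch: a promise-chain queue that runs one SDP operation at a time.
type SdpTask<T> = () => Promise<T>;

class NegotiationQueue {
  // Tail of the chain. Every new task waits for everything queued before it.
  private tail: Promise<void> = Promise.resolve();

  enqueue<T>(task: SdpTask<T>): Promise<T> {
    // The task starts only after the previous task has settled.
    const result = this.tail.then(task);
    // Swallow the outcome on the chain itself so that one failed negotiation
    // does not block the operations queued behind it.
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    // The caller still sees the task's own result or error.
    return result;
  }
}
```

With this shape, publish, subscribe and renegotiate would each wrap their SDP round trip in enqueue(), so two offers can never be in flight on the same PeerConnection at once.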
Two independent state machines:
RTC layer state — WebRTC connection:
type RTCState =
| { status: 'new' }
| { status: 'connecting' }
| { status: 'connected' }
| { status: 'disconnected' }
| { status: 'failed'; reason: string }
| { status: 'closed' }
Call layer state — call lifecycle:
type CallState =
| { status: 'idle' }
| { status: 'connecting'; sessionID: string }
| { status: 'joining'; sessionID: string }
| { status: 'active'; sessionID: string }
| { status: 'reconnecting'; sessionID: string }
| { status: 'error'; message: string; previousStatus?: CallStatus }
Errors can happen at both layers independently:
- RTC layer: 'failed' with reason (ICE failure, DTLS error)
- Call layer: 'error' with message (publish failed, subscribe failed, timeout)
type MediaState = {
  isCameraEnabled: boolean
  isMicEnabled: boolean
  isGuestMuted: boolean
  localPreviewStream: MediaStream | null
}
type CallParticipant = {
  id: string
  name: string
  isLocal: boolean
  hasVideo: boolean
  hasAudio: boolean
  isEmpty: boolean
}
Transitions (Call layer):
idle --> connecting --> joining --> active
^            |             |          |
|        disconnect   fail/cancel disconnect
|            |             |          |
+------------+-------------+----------+
ERROR can happen from ANY state (connecting, joining, active, reconnecting).
Any state --> error.
error --> reconnecting --> connecting (recovery success)
error --> idle (user gives up / disconnect)
reconnecting --> error (recovery failed)
IMPORTANT: disconnect and error must not race.
If the system enters 'error' state, disconnect() must be safe to call
without corrupting an in-progress reconnect. The CallStateMachine must
reject disconnect() if status is 'reconnecting' — reconnect owns the
teardown/rebuild cycle. Only after reconnect completes (success or fail)
can disconnect() run.
Current bug in WebRTCPipeline: disconnect() nulls session/peer/callsClient
unconditionally. If ICE failure fires onIceRestartExhausted while reconnect()
is in progress, the reconnect sees null references and crashes silently.
The CallStateMachine fixes this by gating transitions.
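One way to honor "only after reconnect completes can disconnect() run" is to make teardown wait for the in-flight reconnect. The sketch below is an assumption about how that could be done, not the planned implementation: ReconnectGate and its two methods are hypothetical names, and the real fix may simply reject the call through CallStateMachine instead of deferring it.

```typescript
// Hypothetical sketch: teardown never runs while a reconnect is in progress.
class ReconnectGate {
  private inFlight: Promise<void> | null = null;

  // Runs the reconnect work. Concurrent calls share the same attempt.
  reconnect(work: () => Promise<void>): Promise<void> {
    if (this.inFlight) return this.inFlight;
    const attempt = work();
    this.inFlight = attempt;
    // Release the gate once the attempt settles, whether it succeeded or failed.
    const release = (): void => {
      if (this.inFlight === attempt) this.inFlight = null;
    };
    attempt.then(release, release);
    return attempt;
  }

  // Waits for any in-flight reconnect, then runs the teardown.
  async disconnect(teardown: () => Promise<void>): Promise<void> {
    const pending = this.inFlight;
    if (pending) {
      // The outcome of the reconnect does not matter here, only that it is done.
      await pending.catch(() => undefined);
    }
    await teardown();
  }
}
```

Under this sketch, the nulling of session/peer/callsClient would live inside the teardown callback, so it cannot run between two steps of a reconnect.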
Participants are separate from CallState:
active + just me (remote not joined yet) = waiting for other side
active + 2 people = active call
active + 1 person (remote was present, then left) = other side dropped
error + any = connection lost
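The table above can be read as a pure function of call status and participant count. The following sketch is illustrative only: deriveCallView, CallView and the remoteEverJoined flag are hypothetical. The flag is an assumption added because "waiting for other side" and "other side dropped" both have a single participant and can only be told apart by history.

```typescript
type CallStatus = 'idle' | 'connecting' | 'joining' | 'active' | 'reconnecting' | 'error';

// Hypothetical UI-facing labels derived from CallState plus participants.
type CallView =
  | 'not-in-call'
  | 'in-progress'
  | 'waiting-for-other-side'
  | 'active-call'
  | 'other-side-dropped'
  | 'connection-lost';

function deriveCallView(
  status: CallStatus,
  participantCount: number,
  remoteEverJoined: boolean, // assumption: tracked by the caller, not part of CallState
): CallView {
  switch (status) {
    case 'error':
      // error + any participants = connection lost
      return 'connection-lost';
    case 'active':
      if (participantCount >= 2) return 'active-call';
      return remoteEverJoined ? 'other-side-dropped' : 'waiting-for-other-side';
    case 'idle':
      return 'not-in-call';
    case 'connecting':
    case 'joining':
    case 'reconnecting':
      return 'in-progress';
  }
}
```

Keeping this derivation outside CallState means a participant joining or leaving never requires a state machine transition.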
RTCService (RTC layer):
class RTCService {
readonly rtcState: Accessor<RTCState>
// Connection
createSession(params: { sessionId: string; guestId: string }): Promise<void>
closeSession(): Promise<void>
restartIce(): Promise<void>
// Publish / Subscribe (SDP negotiation queue)
publishTracks(tracks: MediaStreamTrack[]): Promise<void>
subscribeTracks(params?: SubscribeParams): Promise<void>
clearPublisherSubscription(publisherGuestId: string): void
// Track management
addTrack(track: MediaStreamTrack): Promise<RTCRtpSender>
replaceTrack(sender: RTCRtpSender, track: MediaStreamTrack | null): Promise<void>
removeTrack(sender: RTCRtpSender): Promise<void>
applySendingParams(sender: RTCRtpSender, params: RTCRtpSendParameters): void
// Callbacks (set by CallService)
onRemoteStream: ((stream: MediaStream | null) => void) | null
onRemoteVideoActive: (() => void) | null
onRemoteVideoStale: (() => void) | null
onNetworkStats: ((stats: NetworkStats | null) => void) | null
onIceRestartExhausted: (() => void) | null
}
CallService (Call layer, sits on top of RTCService):
class CallService {
readonly callState: Accessor<CallState>
readonly participants: Accessor<CallParticipant[]>
readonly stats: Accessor<CallStats | null>
readonly isCameraEnabled: Accessor<boolean>
readonly isMicEnabled: Accessor<boolean>
readonly isGuestMuted: Accessor<boolean>
readonly localPreviewStream: Accessor<MediaStream | null>
constructor(rtc: RTCService)
// Connection (used by the orchestration hook)
connect(params: ConnectParams): Promise<void>
disconnect(): Promise<void>
reconnect(): Promise<void>
// Publish / Subscribe (called by the hook after connect)
publish(): Promise<void>
subscribe(params?: SubscribeParams): Promise<void>
clearPublisherSubscription(publisherGuestId: string): void
// Media Control
enableCamera(stream?: MediaStream): Promise<void>
disableCamera(): Promise<void>
enableMic(stream?: MediaStream): Promise<void>
disableMic(): Promise<void>
setCameraQuality(quality: string, fps?: number): Promise<void>
// Audio Output
setSpeakerEnabled(enabled: boolean): Promise<void>
setSpeakerDevice(deviceId: string | null): Promise<void>
setCombinedAudioPlayback(enabled: boolean): Promise<void>
setCombinedAudioLocalGain(gain: number): Promise<void>
setCombinedAudioRemoteGain(gain: number): Promise<void>
// Guest Control
setGuestMuted(muted: boolean): Promise<void>
}
Why connect/publish/subscribe stay separate (not collapsed into joinCall): The hook (useCall) sequences these with app-level logic between each step: backend API calls, stale-flow detection, conditional camera/mic, publish-failure recovery with SFU re-join. This orchestration depends on sessionStore, localMedia, and backend API — concerns that don't belong in CallService.
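A rough sketch of how the hook might sequence the three steps while discarding a superseded join attempt is shown below. Everything in it is an assumption for illustration: CallSteps and createJoinRunner are hypothetical names, the flow counter is only one possible way to do stale-flow detection, and the backend API calls and publish-failure recovery that the real hook performs between steps are omitted.

```typescript
// Hypothetical subset of CallService used by the hook.
interface CallSteps {
  connect(): Promise<void>;
  publish(): Promise<void>;
  subscribe(): Promise<void>;
}

type JoinOutcome = 'joined' | 'stale';

function createJoinRunner(call: CallSteps): () => Promise<JoinOutcome> {
  // Incremented on every join attempt. An attempt is stale once a newer one has started.
  let currentFlow = 0;

  return async function join(): Promise<JoinOutcome> {
    const flow = ++currentFlow;
    const isStale = (): boolean => flow !== currentFlow;

    await call.connect();
    if (isStale()) return 'stale'; // app-level checks would sit between steps like this

    await call.publish();
    if (isStale()) return 'stale';

    await call.subscribe();
    return isStale() ? 'stale' : 'joined';
  };
}
```

If the three steps were collapsed into a single joinCall inside CallService, these checkpoints would have to move into the service together with their sessionStore and backend dependencies.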
JoinCfSfuSession — on join — register guest with Cloudflare SFU
PublishSessionTracks — on publish — send local audio/video SDP + track list to SFU
SubscribeSessionTracks — on subscribe — receive remote tracks from publisher
RenegotiateSessionAnswer — mid-call — SDP renegotiation (camera toggle, track add/remove)
CloseSessionTracks — on leave — close tracks server-side
ListSessionMembers — on subscribe — discover who else is in the session
Heartbeat — every 10s — keep session alive on backend
LeaveSession — on disconnect — notify backend to clean up participant
Ping — every 50s — pre-call health check
GetParticipantPaths — on join — discover participant routes
Current files:
src/realtime/pipelines/webrtc/
  WebRTCPipeline.ts             1,582 lines — SPLITS
  pipelineHelpers.ts            101 lines — keep
  core/
    WebRTCSession.ts            995 lines — becomes RTCService
    PeerConnectionManager.ts    398 lines — merges into RTCService
    MediaCaptureService.ts      277 lines — keep (media manager)
    networkStatsBuilder.ts      238 lines — keep
    staleVideoDetector.ts       109 lines — keep
    sdpUtils.ts                 — keep
    silentPlaceholder.ts        — keep
  signaling/
    CloudflareCallsClient.ts    — keep
    SFUAdapter.ts               — keep
src/features/session/hooks/
  useSessionConnection.ts       1,384 lines — simplify
After refactoring — flat structure under src/callapp/:
src/callapp/
  types.ts                    ~80 lines        ALL types, top level
  CallService.ts              ~650-700 lines
  RTCService.ts               ~800 lines       (WebRTCSession + PeerConnectionManager)
  CallStateMachine.ts         ~100 lines
  CombinedAudioPlayback.ts    ~290 lines       extracted from WebRTCPipeline
  ParticipantManager.ts       ~120 lines       extracted from WebRTCPipeline
  MediaManager.ts             ~277 lines       renamed from MediaCaptureService
  networkStatsBuilder.ts      ~238 lines       moved as-is
  staleVideoDetector.ts       ~109 lines       moved as-is
  sdpUtils.ts                                  moved as-is
  silentPlaceholder.ts                         moved as-is
  signaling/
    CloudflareCallsClient.ts                   moved as-is
    SFUAdapter.ts                              moved as-is
src/features/session/
  hooks/
    useSessionConnection.ts   ~800-900 lines   SIMPLIFIED
  utils/
    callRouting.ts            ~100 lines       extracted from useSessionConnection
What moves out of WebRTCPipeline.ts into CombinedAudioPlayback.ts (~290 lines):
Fields that move: playbackAudioContext, playbackDestinationNode, playbackRemoteSourceNode, playbackLocalSourceNode, playbackRemoteGainNode, playbackLocalGainNode, playbackRemoteAnalyserNode, playbackLocalAnalyserNode, playbackRemoteAnalyserData, playbackLocalAnalyserData, playbackStream, combinedPlaybackDebugTimer, combinedPlaybackSetupRetryTimer, combinedPlaybackSetupRetryCount, combinedAudioDebugEnabled, combinedMicOnlyDebugEnabled, observedCombinedMicTrackId, detachCombinedMicTrackListeners
Methods that move: buildCombinedPlaybackStream, teardownCombinedPlayback, refreshPlaybackOutput, resumePlaybackContext, scheduleCombinedPlaybackSetupRetry, clearCombinedPlaybackSetupRetry, handleMicTrackAvailability, clearCombinedMicTrackListeners, enforceMicFeedbackSafeguards, startCombinedPlaybackDebugMonitor, stopCombinedPlaybackDebugMonitor, logCombinedPlaybackState, applySpeakerSinkId
Interface — receives current state as arguments (streams change over time):
class CombinedAudioPlayback {
// Called whenever streams or speaker state change.
// CallService passes CURRENT values each time.
refresh(context: {
remoteStream: MediaStream | null
micStream: MediaStream | undefined
speakerEnabled: boolean
}): Promise<void>
setCombinedEnabled(enabled: boolean): void
setLocalGain(gain: number): void
setRemoteGain(gain: number): void
setSpeakerDevice(deviceId: string | null): void
teardown(): void
}
This class owns the remoteVideoElement and the WebAudio graph. CallService calls refresh() wherever it currently calls refreshPlaybackOutput().
What moves out into ParticipantManager.ts (~120 lines):
Fields that move: remoteVideoReceiving, activePublisherSessionId, prevRemoteVideoElementCurrentTime, prevRemoteRenderedFrames
Methods that move: emitParticipants, sampleRemoteRenderDiagnostics, applyGuestMuteState
Interface — receives current state as arguments:
class ParticipantManager {
readonly participants: Accessor<CallParticipant[]>
update(context: {
remoteStream: MediaStream | null
remoteVideoElement: HTMLVideoElement | null
isGuestMuted: boolean
}): void
setVideoReceiving(active: boolean): void
setActivePublisherId(id: string): void
sampleRenderDiagnostics(): { ... }
clear(): void
}
What STAYS in CallService (not extracted):
- onRemoteStreamChanged — coordinates both CombinedAudioPlayback and ParticipantManager
- clearRemoteParticipant — calls audioPlayback.teardown() + participants.clear()
These stay because they coordinate across the two extracted classes.
What moves out into CallStateMachine.ts (~100 lines):
Replaces the 13+ scattered setState() calls with validated transitions. Invalid transitions are rejected and logged, which prevents impossible states.
class CallStateMachine {
readonly state: Accessor<CallState>
transition(to: CallState, reason: string): boolean // validates, rejects invalid
reset(): void // force to idle (full teardown only)
}
Key rules:
- Error can happen from ANY state — always allowed.
- 'reconnecting' blocks 'idle' transition — reconnect owns the teardown/rebuild cycle. disconnect() must wait for reconnect to finish.
- Error during reconnect → status goes to 'error', then caller decides: retry (→ reconnecting) or give up (→ idle).
This fixes a current bug: disconnect() nulls session/peer unconditionally. If ICE failure fires while reconnect() is in progress, reconnect sees null references. The state machine prevents this by rejecting the disconnect transition while reconnecting.
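The key rules can be expressed as a transition table. The sketch below is one possible reading of them, under stated assumptions: Accessor is a local stand-in with the same shape as the SolidJS type, the state is held in a plain field rather than a signal so the snippet runs on its own, the injected log callback is hypothetical, and error is allowed from the four states listed earlier (connecting, joining, active, reconnecting).

```typescript
type CallStatus = 'idle' | 'connecting' | 'joining' | 'active' | 'reconnecting' | 'error';

type CallState =
  | { status: 'idle' }
  | { status: 'connecting'; sessionID: string }
  | { status: 'joining'; sessionID: string }
  | { status: 'active'; sessionID: string }
  | { status: 'reconnecting'; sessionID: string }
  | { status: 'error'; message: string; previousStatus?: CallStatus };

type Accessor<T> = () => T; // stand-in for the SolidJS Accessor type

// Allowed targets per current status. Note that 'reconnecting' has no 'idle' entry:
// reconnect owns the teardown/rebuild cycle, so disconnect is rejected there.
const ALLOWED: Record<CallStatus, readonly CallStatus[]> = {
  idle: ['connecting'],
  connecting: ['joining', 'idle', 'error'],
  joining: ['active', 'idle', 'error'],
  active: ['idle', 'error'],
  reconnecting: ['connecting', 'error'],
  error: ['reconnecting', 'idle'],
};

class CallStateMachine {
  private current: CallState = { status: 'idle' };
  readonly state: Accessor<CallState> = () => this.current;

  // The log callback is a hypothetical hook for the "rejected and logged" rule.
  constructor(private readonly log: (message: string) => void = () => {}) {}

  transition(to: CallState, reason: string): boolean {
    const from = this.current.status;
    if (!ALLOWED[from].includes(to.status)) {
      this.log(`rejected ${from} -> ${to.status} (${reason})`);
      return false;
    }
    this.current = to;
    return true;
  }

  // Force to idle. Intended for full teardown only.
  reset(): void {
    this.current = { status: 'idle' };
  }
}
```

A caller such as disconnect() would check the boolean result and skip nulling its references when the transition to idle is rejected.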
The existing 10 transport-level RealtimeState values are replaced. Two new state machines handle both layers independently:
RTC layer mapping (RTCState ← ICE/DTLS events):
iceConnectionState "checking" --> RTCState { status: 'connecting' }
iceConnectionState "connected" --> RTCState { status: 'connected' }
iceConnectionState "completed" --> RTCState { status: 'connected' }
iceConnectionState "disconnected" --> RTCState { status: 'disconnected' }
iceConnectionState "failed" --> RTCState { status: 'failed', reason }
iceConnectionState "closed" --> RTCState { status: 'closed' }
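The RTC layer mapping above is small enough to write as a single function. The sketch below is illustrative: toRTCState and the local IceConnectionState union are hypothetical names (the union mirrors the standard RTCIceConnectionState values so the snippet does not depend on DOM typings), the default reason string is a placeholder, and mapping "new" to { status: 'new' } is an assumption since the table does not list it.

```typescript
// Mirrors the standard RTCIceConnectionState values.
type IceConnectionState =
  | 'new'
  | 'checking'
  | 'connected'
  | 'completed'
  | 'disconnected'
  | 'failed'
  | 'closed';

type RTCState =
  | { status: 'new' }
  | { status: 'connecting' }
  | { status: 'connected' }
  | { status: 'disconnected' }
  | { status: 'failed'; reason: string }
  | { status: 'closed' };

function toRTCState(ice: IceConnectionState, reason = 'ICE connection failed'): RTCState {
  switch (ice) {
    case 'new':
      return { status: 'new' }; // assumption: not listed in the mapping table
    case 'checking':
      return { status: 'connecting' };
    case 'connected':
    case 'completed':
      return { status: 'connected' };
    case 'disconnected':
      return { status: 'disconnected' };
    case 'failed':
      // RTCState requires a reason for the failed variant.
      return { status: 'failed', reason };
    case 'closed':
      return { status: 'closed' };
  }
}
```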
Call layer mapping (CallState ← CallService methods):
connect() called --> CallState { status: 'connecting' }
publish() called --> CallState { status: 'joining' }
subscribe() success --> CallState { status: 'active' }
reconnect() called --> CallState { status: 'reconnecting' }
disconnect() called --> CallState { status: 'idle' } (rejected while 'reconnecting')
any failure --> CallState { status: 'error' }
What stays in CallService.ts (~650-700 lines):
- CallState machine (delegates to CallStateMachine)
- connect / disconnect / reconnect lifecycle
- publish / subscribe coordination
- enableCamera / disableCamera / enableMic / disableMic
- onRemoteStreamChanged (coordinates audio + participants)
- clearRemoteParticipant (coordinates audio + participants)
- publishIfNeeded (re-publish after mid-call media changes)
- Stats callback wiring (receives from RTCService, enriches, emits)
- Local preview stream management
- Stale video re-subscribe handler
- Camera quality and sending params
Note on stale video: When StaleVideoDetector fires onRemoteVideoStale, CallService clears the subscription cache and re-subscribes. This is a production-critical self-healing path that stays in CallService.
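The self-healing path could be sketched as below. This is a hedged illustration rather than the current code: Resubscribable and createStaleVideoHandler are hypothetical names, getPublisherId stands in for however CallService tracks the active publisher, and coalescing repeated stale signals into one re-subscribe is an assumption that the source does not state.

```typescript
// Hypothetical subset of CallService needed by the stale-video handler.
interface Resubscribable {
  clearPublisherSubscription(publisherGuestId: string): void;
  subscribe(): Promise<void>;
}

function createStaleVideoHandler(
  call: Resubscribable,
  getPublisherId: () => string | null,
): () => Promise<void> {
  let inFlight: Promise<void> | null = null;

  return function onRemoteVideoStale(): Promise<void> {
    // Assumption: repeated stale signals during a re-subscribe share one attempt.
    if (inFlight) return inFlight;

    const publisherId = getPublisherId();
    if (publisherId === null) return Promise.resolve(); // nobody to re-subscribe to

    const attempt = (async (): Promise<void> => {
      // Drop the cached subscription first so subscribe() negotiates from scratch.
      call.clearPublisherSubscription(publisherId);
      await call.subscribe();
    })();

    inFlight = attempt;
    const release = (): void => {
      if (inFlight === attempt) inFlight = null;
    };
    attempt.then(release, release);
    return attempt;
  };
}
```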
What moves out of useSessionConnection.ts into callRouting.ts (~100 lines):
Routing utilities extracted from useSessionConnection. The join flow stays in the hook — it depends on sessionStore, API calls, and stale-flow detection that belong at the app layer.
SolidJS note: Extracted classes create signals in their constructor (createSignal). This is safe — the codebase already does this in sessionStore.ts. Rule: extracted classes MUST NOT call createEffect or createMemo internally. They are "passive": they expose signals but never subscribe to them. All reactive subscriptions stay in hooks or components.
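A "passive" class in this sense might look like the sketch below. To keep the snippet runnable on its own, createSignal here is a simplified local stand-in for the SolidJS function (in the codebase it would be imported from solid-js, and the real setter also accepts an updater function). PassiveParticipantStore is a hypothetical name used only to show the pattern.

```typescript
type Accessor<T> = () => T;
type Setter<T> = (value: T) => void;

// Simplified stand-in for SolidJS createSignal, without dependency tracking.
function createSignal<T>(initial: T): [Accessor<T>, Setter<T>] {
  let value = initial;
  return [
    () => value,
    (next) => {
      value = next;
    },
  ];
}

type CallParticipant = {
  id: string
  name: string
  isLocal: boolean
  hasVideo: boolean
  hasAudio: boolean
  isEmpty: boolean
};

class PassiveParticipantStore {
  // Exposed read-only accessor. Hooks and components subscribe to it, the class never does.
  readonly participants: Accessor<CallParticipant[]>;
  private readonly setParticipants: Setter<CallParticipant[]>;

  constructor() {
    // Creating a signal here is allowed. createEffect and createMemo are not.
    const [participants, setParticipants] = createSignal<CallParticipant[]>([]);
    this.participants = participants;
    this.setParticipants = setParticipants;
  }

  replace(next: CallParticipant[]): void {
    this.setParticipants(next);
  }

  clear(): void {
    this.setParticipants([]);
  }
}
```

Because the class only writes to its signal, it needs no reactive owner and can be constructed and disposed outside any component lifecycle.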
All features below already work today. These are regression tests to run after each extraction, not implementation steps.
After Extraction 1 (CombinedAudioPlayback):
- Combined audio playback toggle on/off
- Local gain and remote gain adjustment
- Speaker device switching
- Mic monitoring (hear yourself through speaker)
After Extraction 2 (ParticipantManager):
- Guest mute/unmute
- Participant list updates when remote joins/leaves
- Video element creation for remote stream
After Extraction 3 (RTCService + CallStateMachine):
- RTCState tracks ICE events correctly
- CallState transitions: idle -> connecting -> joining -> active
- Disconnect: any state except reconnecting -> idle
- Error: ICE failure -> error -> reconnecting -> connecting
- Invalid transitions are logged and rejected
- RTCState and CallState update independently
After Extraction 4 (Restructure to src/callapp/): Steps 1-10 full regression:
- Pre-call connection (host + guest both browsers + mobile)
- Video-only call (low quality)
- Audio-only call
- Audio + Video call
- Empty join (no media, placeholder only)
- Empty join + upgrade (add camera, then mic mid-call)
- Start with media + downgrade (remove camera, then mic)
- Connect/disconnect stress (rapid join/leave, no leaked state)
- Quality presets, stats polling, combined audio
- Bug fixes from above (disconnect vs. reconnect race)
2 independent layers, each with its own state:
RTCService — RTCState (new/connecting/connected/disconnected/failed/closed)
CallService — CallState (idle/connecting/joining/active/reconnecting/error)
WebRTCPipeline.ts (1,582 lines) splits into: CallService.ts (~650-700) + CombinedAudioPlayback.ts (~290) + ParticipantManager.ts (~120) + CallStateMachine.ts (~100).
WebRTCSession.ts + PeerConnectionManager.ts merge into RTCService.ts (~800 lines).
useSessionConnection.ts (1,384 lines) simplifies to ~800-900 lines.
All files move to src/callapp/. Types in src/callapp/types.ts.
Every extraction ships independently. Regression tests after each step.