WebRTCPipeline.ts: 1,582 lines. Call lifecycle, media, audio routing, stats, and participants, all in one class.
useSessionConnection.ts: 1,384 lines. Orchestration hook mixed with backend-specific logic.
No central call state: 10 transport states, set from 13+ places, with no validation.
Three layers, inspired by the RTC-FRONTEND pattern and adapted for our SFU (not P2P).
Layer 1 — Call Service (call.service.ts) Business logic. CallState, participants, media control, coordinates layers below. Currently: WebRTCPipeline.ts (1,582 lines) — target ~650-700 lines.
Layer 2 — WebRTC Service (webrtc.service.ts) Low-level WebRTC. RTCPeerConnection, SDP, ICE, SFU signaling, negotiation queue. Currently: WebRTCSession.ts + PeerConnectionManager.ts — already separate, keep. Important: the negotiation queue lives here. All SDP operations (publish, subscribe, renegotiate) are serialized through this queue. Call Service never touches SDP directly.
Layer 3 — Media Manager (media-manager.ts) getUserMedia, camera/mic presets, quality, stop tracks, silent placeholder. Currently: MediaCaptureService.ts (277 lines) — already clean, keep as is.
Types (call.types.ts) All call types in one file.
Key SFU difference from the P2P examples: The RTC-FRONTEND examples are peer-to-peer (direct offer/answer between users). Our system uses an SFU (Cloudflare Calls) — publish and subscribe are separate SDP negotiations on the same PeerConnection. The SFU requires SDP surgery (strip inactive m-sections, patch extmap IDs, strip orphaned RTX). All of this stays in Layer 2 (WebRTC Service). Call Service never sees SDP.
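The serialization that the Layer 2 negotiation queue provides can be sketched as a promise chain. This is a minimal illustration only: NegotiationQueue, enqueue, and fakeSdpOp are hypothetical names, and the real queue in WebRTCSession.ts may work differently.

```typescript
// Hypothetical sketch of a serialized negotiation queue (not the real implementation).
type Task<T> = () => Promise<T>

class NegotiationQueue {
  // Tail of the chain. It never rejects, so the queue never gets stuck.
  private tail: Promise<unknown> = Promise.resolve()

  enqueue<T>(task: Task<T>): Promise<T> {
    // The task starts only after every previously queued task has settled.
    const run = this.tail.then(() => task())
    // Swallow failures on the chain itself; the caller still sees them via `run`.
    this.tail = run.catch(() => undefined)
    return run
  }
}

// Usage: three SDP operations requested at once still run one at a time.
const queue = new NegotiationQueue()
const order: string[] = []

async function fakeSdpOp(name: string): Promise<string> {
  order.push(`${name}:start`)
  await Promise.resolve() // stands in for an SDP round-trip to the SFU
  order.push(`${name}:end`)
  return name
}

const done = Promise.all([
  queue.enqueue(() => fakeSdpOp('publish')),
  queue.enqueue(() => fakeSdpOp('subscribe')),
  queue.enqueue(() => fakeSdpOp('renegotiate')),
])
```

Because each operation waits for the previous one to settle, a publish and a subscribe can never interleave their offer/answer steps on the shared PeerConnection.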
type CallState =
  | { status: 'idle' }
  | { status: 'connected'; hostID: string }
  | { status: 'joining'; sessionID: string }
  | { status: 'call'; sessionID: string }
  | { status: 'reconnecting'; sessionID: string }
  | { status: 'error'; message: string }
6 states. The joining state covers the publish/subscribe phase where the user
sees a progress indicator. The reconnecting state is needed because recovery
takes time and the UI must show it.
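Because CallState is a discriminated union, a consumer can switch on status and let the compiler check that every state is handled. The sketch below assumes a hypothetical helper named describeCallState; the label strings are illustrative only.

```typescript
type CallState =
  | { status: 'idle' }
  | { status: 'connected'; hostID: string }
  | { status: 'joining'; sessionID: string }
  | { status: 'call'; sessionID: string }
  | { status: 'reconnecting'; sessionID: string }
  | { status: 'error'; message: string }

// Hypothetical helper: turns a state into a short UI label.
function describeCallState(state: CallState): string {
  switch (state.status) {
    case 'idle':
      return 'Not connected'
    case 'connected':
      return `Connected to host ${state.hostID}`
    case 'joining':
      return `Joining session ${state.sessionID}...`
    case 'call':
      return `In call ${state.sessionID}`
    case 'reconnecting':
      return `Reconnecting to ${state.sessionID}...`
    case 'error':
      return `Error: ${state.message}`
    default: {
      // If a seventh state is added, this assignment stops compiling.
      const unreachable: never = state
      return unreachable
    }
  }
}
```

Each branch only sees the fields that exist for that status, so reading sessionID in the 'idle' branch would be a compile error.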
type MediaState = {
  isCameraEnabled: boolean
  isMicEnabled: boolean
  isGuestMuted: boolean
  localPreviewStream: MediaStream | null
}
type StreamState = {
  remoteStream: MediaStream | null
  localCameraStream: MediaStream | null
  localMicStream: MediaStream | null
}
type CallParticipant = {
  id: string
  name: string
  isLocal: boolean
  hasVideo: boolean
  hasAudio: boolean
}
Transitions:
idle --> connected --> joining --> call
^          |             |          |
|    disconnect    fail/cancel  disconnect
|          |             |          |
+----------+-------------+----------+
ERROR can happen from ANY state (connected, joining, call, reconnecting).
Any state --> error.
error --> reconnecting --> connected (recovery success)
error --> idle (user gives up / disconnect)
reconnecting --> error (recovery failed)
IMPORTANT: disconnect and error must not race.
If the system enters 'error' state, disconnect() must be safe to call
without corrupting an in-progress reconnect. The CallStateMachine must
reject disconnect() if status is 'reconnecting' — reconnect owns the
teardown/rebuild cycle. Only after reconnect completes (success or fail)
can disconnect() run.
Current bug in WebRTCPipeline: disconnect() nulls session/peer/callsClient
unconditionally. If ICE failure fires onIceRestartExhausted while reconnect()
is in progress, the reconnect sees null references and crashes silently.
The CallStateMachine fixes this by gating transitions.
Participants are separate from CallState:
connected + just me = waiting for other side
call + 2 people = active call
call + 1 person = other side dropped
error + any = connection lost
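The combinations listed above can be read as a small pure function over the call status and the participant count. This is a sketch under assumptions: describeSituation is a hypothetical name, and combinations the list does not mention simply fall through to the raw status.

```typescript
type CallStatus = 'idle' | 'connected' | 'joining' | 'call' | 'reconnecting' | 'error'

// Hypothetical helper combining the two independent signals.
function describeSituation(status: CallStatus, participantCount: number): string {
  if (status === 'error') return 'connection lost' // error + any
  if (status === 'connected' && participantCount <= 1) return 'waiting for other side'
  if (status === 'call') {
    return participantCount >= 2 ? 'active call' : 'other side dropped'
  }
  // Combinations not covered by the list above: report the raw status.
  return status
}
```

Keeping participants out of CallState is what makes "call + 1 person" expressible without adding a seventh state.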
class CallService {
  // State (read-only signals)
  readonly callState: Accessor<CallState>
  readonly participants: Accessor<CallParticipant[]>
  readonly mediaState: Accessor<MediaState>
  readonly stats: Accessor<CallStats | null>
  // Connection (used by the orchestration hook)
  connect(params: ConnectParams): Promise<void>
  disconnect(): Promise<void>
  reconnect(): Promise<void>
  // Publish / Subscribe (called by the hook after connect)
  publish(): Promise<void>
  subscribe(params?: SubscribeParams): Promise<void>
  clearPublisherSubscription(publisherGuestId: string): void
  // Media Control
  enableCamera(stream?: MediaStream): Promise<void>
  disableCamera(): Promise<void>
  enableMic(stream?: MediaStream): Promise<void>
  disableMic(): Promise<void>
  setCameraQuality(quality: string, fps?: number): Promise<void>
  // Audio Output
  setSpeakerEnabled(enabled: boolean): Promise<void>
  setSpeakerDevice(deviceId: string | null): Promise<void>
  setCombinedAudioPlayback(enabled: boolean): Promise<void>
  setCombinedAudioLocalGain(gain: number): Promise<void>
  setCombinedAudioRemoteGain(gain: number): Promise<void>
  // Guest Control
  setGuestMuted(muted: boolean): Promise<void>
}
Why connect/publish/subscribe stay separate (not collapsed into joinCall): The hook (useSessionConnection) sequences these with app-level logic between each step: backend API calls, stale-flow detection, conditional camera/mic, publish-failure recovery with SFU re-join. This orchestration depends on sessionStore, localMedia, and backend API — concerns that don't belong in the Call Service. The Call Service is a cleaner version of the current WebRTCPipeline — same methods, but organized.
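To illustrate why the steps stay separate, here is a rough sketch of a hook-side join flow with a stale-flow check between steps. Everything in it is an assumption for illustration: CallServiceLike, runJoinFlow, the flow-token approach, and the shape of the connect parameters are hypothetical and not taken from useSessionConnection.ts.

```typescript
// Minimal slice of the CallService surface used by this sketch.
interface CallServiceLike {
  connect(params: { sessionID: string }): Promise<void> // param shape is assumed
  publish(): Promise<void>
  subscribe(): Promise<void>
}

// Hypothetical stale-flow token: each new join attempt invalidates older ones.
let currentFlowId = 0

async function runJoinFlow(
  service: CallServiceLike,
  sessionID: string,
  wantsMedia: boolean,
): Promise<'joined' | 'stale'> {
  const flowId = ++currentFlowId
  const isStale = (): boolean => flowId !== currentFlowId

  await service.connect({ sessionID })
  if (isStale()) return 'stale'

  // App-level work (backend API calls, camera/mic decisions) would sit here.
  if (wantsMedia) {
    await service.publish()
    if (isStale()) return 'stale'
  }

  await service.subscribe()
  return isStale() ? 'stale' : 'joined'
}
```

If the three calls were collapsed into a single joinCall(), the hook would lose the points between steps where it checks for staleness and decides whether to publish at all.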
Call                       When            Purpose
JoinCfSfuSession           on join         register guest with Cloudflare SFU
PublishSessionTracks       on publish      send local audio/video SDP + track list to SFU
SubscribeSessionTracks     on subscribe    receive remote tracks from publisher
RenegotiateSessionAnswer   mid-call        SDP renegotiation (camera toggle, track add/remove)
CloseSessionTracks         on leave        close tracks server-side
ListSessionMembers         on subscribe    discover who else is in the session
Heartbeat                  every 10s       keep session alive on backend
LeaveSession               on disconnect   notify backend to clean up participant
Ping                       every 50s       pre-call health check
GetParticipantPaths        on join         discover participant routes
Current files:
src/realtime/pipelines/webrtc/
WebRTCPipeline.ts 1,582 lines — SPLITS INTO 4 FILES
pipelineHelpers.ts 101 lines — keep
core/
WebRTCSession.ts 995 lines — keep (Layer 2, owns negotiation queue)
PeerConnectionManager.ts 398 lines — keep (Layer 2)
MediaCaptureService.ts 277 lines — keep (Layer 3)
networkStatsBuilder.ts 238 lines — keep
staleVideoDetector.ts 109 lines — keep
sdpUtils.ts — keep
silentPlaceholder.ts — keep
signaling/
CloudflareCallsClient.ts — keep
SFUAdapter.ts — keep
src/features/session/hooks/
useSessionConnection.ts 1,384 lines — simplify
After refactoring — new files:
src/realtime/pipelines/webrtc/
CallService.ts ~650-700 lines RENAMED from WebRTCPipeline.ts
core/
CallStateMachine.ts ~100 lines NEW
CombinedAudioPlayback.ts ~290 lines NEW — extracted from WebRTCPipeline
RemoteParticipantManager.ts ~120 lines NEW — extracted from WebRTCPipeline
types/
call.types.ts ~50 lines NEW
src/features/session/
hooks/
useSessionConnection.ts ~800-900 lines SIMPLIFIED
utils/
moqRouting.ts ~100 lines NEW — extracted from useSessionConnection
What moves out of WebRTCPipeline.ts into CombinedAudioPlayback.ts (~290 lines):
Fields that move: playbackAudioContext, playbackDestinationNode, playbackRemoteSourceNode, playbackLocalSourceNode, playbackRemoteGainNode, playbackLocalGainNode, playbackRemoteAnalyserNode, playbackLocalAnalyserNode, playbackRemoteAnalyserData, playbackLocalAnalyserData, playbackStream, combinedPlaybackDebugTimer, combinedPlaybackSetupRetryTimer, combinedPlaybackSetupRetryCount, combinedAudioDebugEnabled, combinedMicOnlyDebugEnabled, observedCombinedMicTrackId, detachCombinedMicTrackListeners
Methods that move: buildCombinedPlaybackStream, teardownCombinedPlayback, refreshPlaybackOutput, resumePlaybackContext, scheduleCombinedPlaybackSetupRetry, clearCombinedPlaybackSetupRetry, handleMicTrackAvailability, clearCombinedMicTrackListeners, enforceMicFeedbackSafeguards, startCombinedPlaybackDebugMonitor, stopCombinedPlaybackDebugMonitor, logCombinedPlaybackState, applySpeakerSinkId
Interface — receives current state as arguments (streams change over time):
class CombinedAudioPlayback {
  // Called whenever streams or speaker state change.
  // CallService passes CURRENT values each time.
  refresh(context: {
    remoteStream: MediaStream | null
    micStream: MediaStream | undefined
    speakerEnabled: boolean
  }): Promise<void>
  setCombinedEnabled(enabled: boolean): void
  setLocalGain(gain: number): void
  setRemoteGain(gain: number): void
  setSpeakerDevice(deviceId: string | null): void
  teardown(): void
}
This class owns the remoteVideoElement and the WebAudio graph. CallService calls refresh() wherever it currently calls refreshPlaybackOutput().
What moves out into RemoteParticipantManager.ts (~120 lines):
Fields that move: remoteVideoReceiving, activePublisherSessionId, prevRemoteVideoElementCurrentTime, prevRemoteRenderedFrames
Methods that move: emitParticipants, sampleRemoteRenderDiagnostics, applyGuestMuteState
Interface — receives current state as arguments:
class RemoteParticipantManager {
  readonly participants: Accessor<CallParticipant[]>
  update(context: {
    remoteStream: MediaStream | null
    remoteVideoElement: HTMLVideoElement | null
    isGuestMuted: boolean
  }): void
  setVideoReceiving(active: boolean): void
  setActivePublisherId(id: string): void
  sampleRenderDiagnostics(): { ... }
  clear(): void
}
What STAYS in CallService (not extracted):
- onRemoteStreamChanged: coordinates both CombinedAudioPlayback and RemoteParticipantManager
- clearRemoteParticipant: calls audioPlayback.teardown() + participants.clear()
These stay because they coordinate across the two extracted classes.
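The coordination role of these two methods might look roughly like the sketch below. It is an assumption-laden illustration: CallServiceCoordination, AudioPlaybackLike, and ParticipantsLike are hypothetical names, TStream stands in for MediaStream so the sketch runs outside a browser, and remoteVideoElement is left out of update() for brevity.

```typescript
// Hypothetical minimal views of the two extracted classes.
interface AudioPlaybackLike<TStream> {
  refresh(context: {
    remoteStream: TStream | null
    micStream: TStream | undefined
    speakerEnabled: boolean
  }): Promise<void>
  teardown(): void
}

interface ParticipantsLike<TStream> {
  update(context: { remoteStream: TStream | null; isGuestMuted: boolean }): void
  clear(): void
}

// Hypothetical slice of CallService showing only the two coordinating methods.
class CallServiceCoordination<TStream> {
  private readonly audio: AudioPlaybackLike<TStream>
  private readonly participants: ParticipantsLike<TStream>
  private micStream: TStream | undefined = undefined
  private speakerEnabled = true
  private isGuestMuted = false

  constructor(audio: AudioPlaybackLike<TStream>, participants: ParticipantsLike<TStream>) {
    this.audio = audio
    this.participants = participants
  }

  // Pushes the CURRENT values into both collaborators on every change.
  async onRemoteStreamChanged(remoteStream: TStream | null): Promise<void> {
    this.participants.update({ remoteStream, isGuestMuted: this.isGuestMuted })
    await this.audio.refresh({
      remoteStream,
      micStream: this.micStream,
      speakerEnabled: this.speakerEnabled,
    })
  }

  clearRemoteParticipant(): void {
    this.audio.teardown()
    this.participants.clear()
  }
}
```

Neither extracted class needs a reference to the other; CallService is the only place that knows about both.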
What moves out into CallStateMachine.ts (~100 lines):
Replaces the 13+ scattered setState() calls with validated transitions. Invalid transitions are rejected and logged, which prevents impossible states.
class CallStateMachine {
  readonly state: Accessor<CallState>
  transition(to: CallState, reason: string): boolean // validates, rejects invalid
  reset(): void // force to idle (full teardown only)
}
Key rules:
- Error can happen from ANY state — always allowed.
- 'reconnecting' blocks 'idle' transition — reconnect owns the teardown/rebuild cycle. disconnect() must wait for reconnect to finish.
- Error during reconnect → status goes to 'error', then caller decides: retry (→ reconnecting) or give up (→ idle).
This fixes a current bug: disconnect() nulls session/peer unconditionally. If ICE failure fires while reconnect() is in progress, reconnect sees null references. The state machine prevents this by rejecting the disconnect transition while reconnecting.
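A minimal sketch of how the gating could be implemented, assuming a static table of allowed transitions. The table below is one reading of the transition diagram and key rules, not a confirmed specification, and CallStateMachineSketch and ALLOWED are hypothetical names. A plain getter stands in for the SolidJS accessor to keep the sketch dependency-free.

```typescript
type CallState =
  | { status: 'idle' }
  | { status: 'connected'; hostID: string }
  | { status: 'joining'; sessionID: string }
  | { status: 'call'; sessionID: string }
  | { status: 'reconnecting'; sessionID: string }
  | { status: 'error'; message: string }

type Status = CallState['status']

// Allowed non-error targets per source state (assumed from the diagram).
// 'error' is handled separately because it is allowed from any state.
const ALLOWED: Record<Status, readonly Status[]> = {
  idle: ['connected'],
  connected: ['joining', 'idle'],
  joining: ['call', 'idle'],
  call: ['idle'],
  reconnecting: ['connected'], // 'idle' deliberately absent: reconnect owns teardown
  error: ['reconnecting', 'idle'],
}

class CallStateMachineSketch {
  private current: CallState = { status: 'idle' }

  // In the codebase this would be a signal accessor; a getter is used here.
  readonly state = (): CallState => this.current

  transition(to: CallState, reason: string): boolean {
    const from = this.current.status
    const ok = to.status === 'error' || ALLOWED[from].includes(to.status)
    if (!ok) {
      console.warn(`rejected transition ${from} -> ${to.status} (${reason})`)
      return false
    }
    this.current = to
    return true
  }

  // Force to idle, for full teardown only.
  reset(): void {
    this.current = { status: 'idle' }
  }
}
```

With this table, a disconnect() that arrives while status is 'reconnecting' gets false back from transition() and can skip nulling session/peer until the reconnect settles.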
The existing 10 transport-level RealtimeState values stay as INTERNAL state in CallService for backward compatibility. CallStateMachine is a new, higher-level signal derived from transport events.
Mapping:
Transport state "connecting" --> CallState { status: 'connected' }
Transport state "connected" --> CallState { status: 'connected' }
Transport state "publishing" --> CallState { status: 'joining' }
Transport state "published" --> CallState { status: 'joining' }
Transport state "subscribing" --> CallState { status: 'joining' }
Transport state "subscribed" --> CallState { status: 'call' }
Transport state "reconnecting" --> CallState { status: 'reconnecting' }
Transport state "error" --> CallState { status: 'error' }
Transport state "disconnected" --> CallState { status: 'idle' }
Transport state "idle" --> CallState { status: 'idle' }
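The mapping above is small enough to express as a lookup table. In this sketch TransportState and toCallStatus are hypothetical names (the document calls the existing type RealtimeState), and only the status field is derived; hostID, sessionID, and message would have to come from the surrounding context.

```typescript
// The ten transport-level values listed in the mapping.
type TransportState =
  | 'idle'
  | 'connecting'
  | 'connected'
  | 'publishing'
  | 'published'
  | 'subscribing'
  | 'subscribed'
  | 'reconnecting'
  | 'error'
  | 'disconnected'

type CallStatus = 'idle' | 'connected' | 'joining' | 'call' | 'reconnecting' | 'error'

// Record<> forces an entry for every transport state at compile time.
const TRANSPORT_TO_CALL_STATUS: Record<TransportState, CallStatus> = {
  connecting: 'connected',
  connected: 'connected',
  publishing: 'joining',
  published: 'joining',
  subscribing: 'joining',
  subscribed: 'call',
  reconnecting: 'reconnecting',
  error: 'error',
  disconnected: 'idle',
  idle: 'idle',
}

function toCallStatus(transport: TransportState): CallStatus {
  return TRANSPORT_TO_CALL_STATUS[transport]
}
```

If an eleventh transport state is ever added, the Record type makes the missing mapping a compile error instead of a silent gap.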
What stays in CallService.ts (~650-700 lines):
- CallState machine (delegates to CallStateMachine)
- connect / disconnect / reconnect lifecycle
- publish / subscribe coordination
- enableCamera / disableCamera / enableMic / disableMic
- onRemoteStreamChanged (coordinates audio + participants)
- clearRemoteParticipant (coordinates audio + participants)
- publishIfNeeded (re-publish after mid-call media changes)
- Stats callback wiring (receives from WebRTCSession, enriches, emits)
- Local preview stream management
- Stale video re-subscribe handler
- Camera quality and sending params
Note on stale video: When StaleVideoDetector fires onRemoteVideoStale, CallService clears the subscription cache and re-subscribes. This is a production-critical self-healing path that stays in CallService.
What moves out of useSessionConnection.ts into moqRouting.ts (~100 lines):
parseMoqUrl, normalizeNamespace, derivePrefix, buildMoqRoutingConfig, buildSubscribeNamespacesFromParticipantPaths, resolveNamespaceFromPath
What simplifies in useSessionConnection.ts:
Transport-state pattern matching replaced with callState reads. Backend-specific branches in runJoinCall() extracted to focused functions. The join flow stays in the hook — it depends on sessionStore, API calls, and stale-flow detection that belong at the app layer.
SolidJS note: Extracted classes create signals in their constructor (createSignal). This is safe; the codebase already does this in sessionStore.ts. Rule: extracted classes MUST NOT call createEffect or createMemo internally. They are "passive": they expose signals but do not subscribe to them. All reactive subscriptions stay in hooks or components.
All features below already work today. These are regression tests to run after each extraction, not implementation steps.
After Extraction 1 (CombinedAudioPlayback):
- Combined audio playback toggle on/off
- Local gain and remote gain adjustment
- Speaker device switching
- Mic monitoring (hear yourself through speaker)
After Extraction 2 (RemoteParticipantManager):
- Guest mute/unmute
- Participant list updates when remote joins/leaves
- Video element creation for remote stream
After Extraction 3 (CallStateMachine):
- Full join flow: idle -> connected -> joining -> call
- Disconnect: any state except reconnecting -> idle (rejected while reconnecting)
- Error: ICE failure -> error -> reconnecting -> connected
- Invalid transitions are logged and rejected
After Extraction 4 (Rename + Restructure): Steps 1-10 full regression:
- Pre-call connection (host + guest both browsers + mobile)
- Video-only call (low quality)
- Audio-only call
- Audio + Video call
- Empty join (no media, placeholder only)
- Empty join + upgrade (add camera, then mic mid-call)
- Start with media + downgrade (remove camera, then mic)
- Connect/disconnect stress (rapid join/leave, no leaked state)
- Quality presets, stats polling, combined audio
- Bug fixes from above
WebRTCPipeline.ts (1,582 lines) splits into: CallService.ts (~650-700) + CombinedAudioPlayback.ts (~290) + RemoteParticipantManager.ts (~120) + CallStateMachine.ts (~100).
useSessionConnection.ts (1,384 lines) simplifies to ~800-900 lines.
New types file: call.types.ts (~50 lines). New utility: moqRouting.ts (~100 lines).
6 call states (idle/connected/joining/call/reconnecting/error) replace 10 unvalidated transport states for external consumers. Transport states stay internally for backward compatibility.
Every extraction ships independently. Regression tests after each step.