Zero to Production: Building a Scalable Media Suite in a SaaS Platform
We increased user-generated content by 30%. Not through marketing, pricing changes, or UX tweaks—but by building an in-app media suite that removed friction from content creation.
Users previously had to:
- Record video/audio externally
- Edit in separate tools
- Export and upload
- Hope the format worked
We compressed this into: click record, edit inline, done. Here’s how we built it and the lessons learned along the way.
The Problem
Our e-learning platform lets users create courses, but media creation was a bottleneck:
User workflow before:
- Record screencast with Loom ($)
- Record voiceover with Audacity
- Edit in iMovie or similar
- Upload to our platform
- Re-record if mistakes found after upload
Pain points:
- Friction: 5 different tools
- Cost: External tools required subscriptions
- Quality: Inconsistent exports, format issues
- Time: 30+ minutes per 5-minute video
Result: Only 22% of users included rich media in courses.
The Vision
Build an all-in-one media suite directly in the product:
- Video recording (screen + camera)
- Audio recording and editing
- Image capture and annotation
- Basic editing (trim, crop, effects)
- Format optimization for web delivery
Goal: Increase media adoption to 40%+ of users.
Architecture
High-Level System Design
┌─────────────────────────────────────────────────┐
│                Client (React App)               │
│  ┌───────────────┐    ┌──────────────────────┐  │
│  │ MediaRecorder │    │      Editor UI       │  │
│  │ - Video       │    │ - Trim/Crop          │  │
│  │ - Audio       │    │ - Filters            │  │
│  │ - Screen      │    │ - Preview            │  │
│  └───────┬───────┘    └──────────┬───────────┘  │
└──────────┼───────────────────────┼──────────────┘
           │                       │
           ▼                       ▼
┌───────────────────────────────────────┐
│       Upload Service (Node.js)        │
│  - Chunked upload handling            │
│  - Resumable uploads                  │
│  - Format validation                  │
└───────────────────┬───────────────────┘
                    │
                    ▼
┌───────────────────────────────────────┐
│          Processing Pipeline          │
│  ┌─────────────────────────────────┐  │
│  │     FFmpeg Lambda Workers       │  │
│  │  - Transcoding                  │  │
│  │  - Thumbnail generation         │  │
│  │  - Quality variants             │  │
│  └────────────────┬────────────────┘  │
└───────────────────┼───────────────────┘
                    │
                    ▼
┌───────────────────────────────────────┐
│       Storage (S3 + CloudFront)       │
│  - Original files                     │
│  - Optimized variants                 │
│  - Thumbnails                         │
│  - Signed URLs (see ADR-034)          │
└───────────────────────────────────────┘
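Playback goes through CloudFront signed URLs (see ADR-034). As a rough sketch of that last box, signing a playback URL with the aws-sdk looks something like the following; the domain, key-pair environment variables, and one-hour expiry are placeholders rather than our exact setup:
import AWS from 'aws-sdk';

// Key pair provisioned per ADR-034; the IDs and domain below are placeholders
const signer = new AWS.CloudFront.Signer(
  process.env.CF_KEY_PAIR_ID!,
  process.env.CF_PRIVATE_KEY!
);

export function getPlaybackUrl(mediaKey: string): string {
  return signer.getSignedUrl({
    url: `https://media.example.com/${mediaKey}`,
    expires: Math.floor(Date.now() / 1000) + 60 * 60, // valid for one hour
  });
}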
Implementation: Phase by Phase
Phase 1: Audio Recording (MVP)
Goal: Ship something fast to validate demand
Tech stack:
- MediaStream Recording API (browser native)
- Web Audio API for visualization
- React for UI
Implementation:
import { useState, useRef, useCallback } from 'react';
interface UseAudioRecorderResult {
isRecording: boolean;
isPaused: boolean;
recordingTime: number;
audioBlob: Blob | null;
startRecording: () => Promise<void>;
pauseRecording: () => void;
resumeRecording: () => void;
  stopRecording: () => void;
  resetRecording: () => void;
}
function useAudioRecorder(): UseAudioRecorderResult {
const [isRecording, setIsRecording] = useState(false);
const [isPaused, setIsPaused] = useState(false);
const [recordingTime, setRecordingTime] = useState(0);
const [audioBlob, setAudioBlob] = useState<Blob | null>(null);
const mediaRecorderRef = useRef<MediaRecorder | null>(null);
const chunksRef = useRef<Blob[]>([]);
const streamRef = useRef<MediaStream | null>(null);
const timerRef = useRef<number | null>(null);
const startRecording = useCallback(async () => {
try {
// Request microphone access
const stream = await navigator.mediaDevices.getUserMedia({
audio: {
echoCancellation: true,
noiseSuppression: true,
sampleRate: 44100,
},
});
streamRef.current = stream;
// Create MediaRecorder
const mediaRecorder = new MediaRecorder(stream, {
mimeType: 'audio/webm;codecs=opus',
});
mediaRecorderRef.current = mediaRecorder;
chunksRef.current = [];
// Collect data chunks
mediaRecorder.ondataavailable = (event) => {
if (event.data.size > 0) {
chunksRef.current.push(event.data);
}
};
// Handle recording completion
mediaRecorder.onstop = () => {
const blob = new Blob(chunksRef.current, { type: 'audio/webm' });
setAudioBlob(blob);
setIsRecording(false);
// Stop all tracks
streamRef.current?.getTracks().forEach(track => track.stop());
};
// Start recording
mediaRecorder.start(100); // Collect data every 100ms
setIsRecording(true);
setRecordingTime(0);
// Start timer
timerRef.current = window.setInterval(() => {
setRecordingTime(prev => prev + 1);
}, 1000);
} catch (error) {
console.error('Failed to start recording:', error);
alert('Could not access microphone. Please check permissions.');
}
}, []);
const pauseRecording = useCallback(() => {
if (mediaRecorderRef.current?.state === 'recording') {
mediaRecorderRef.current.pause();
setIsPaused(true);
if (timerRef.current) clearInterval(timerRef.current);
}
}, []);
const resumeRecording = useCallback(() => {
if (mediaRecorderRef.current?.state === 'paused') {
mediaRecorderRef.current.resume();
setIsPaused(false);
timerRef.current = window.setInterval(() => {
setRecordingTime(prev => prev + 1);
}, 1000);
}
}, []);
const stopRecording = useCallback(() => {
if (mediaRecorderRef.current) {
mediaRecorderRef.current.stop();
if (timerRef.current) clearInterval(timerRef.current);
}
  }, []);
  // Clear the last recording so the user can record again
  const resetRecording = useCallback(() => {
    setAudioBlob(null);
    setRecordingTime(0);
    setIsPaused(false);
  }, []);
  return {
    isRecording,
    isPaused,
    recordingTime,
    audioBlob,
    startRecording,
    pauseRecording,
    resumeRecording,
    stopRecording,
    resetRecording,
  };
}
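// Helpers for the component below: the Props shape and a mm:ss timer formatter
// (both are small illustrative implementations)
interface Props {
  onRecordingComplete: (mediaId: string) => void;
}

function formatTime(totalSeconds: number): string {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}:${seconds.toString().padStart(2, '0')}`;
}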
// React component
function AudioRecorder({ onRecordingComplete }: Props) {
const {
isRecording,
isPaused,
recordingTime,
audioBlob,
startRecording,
pauseRecording,
resumeRecording,
    stopRecording,
    resetRecording,
  } = useAudioRecorder();
const handleComplete = async () => {
if (audioBlob) {
// Upload to server
const formData = new FormData();
formData.append('audio', audioBlob, 'recording.webm');
const response = await fetch('/api/media/upload', {
method: 'POST',
body: formData,
});
const { mediaId } = await response.json();
onRecordingComplete(mediaId);
}
};
return (
<div>
<div>
{formatTime(recordingTime)}
</div>
{!isRecording && !audioBlob && (
<button onClick={startRecording}>Start Recording</button>
)}
{isRecording && (
<>
{isPaused ? (
<button onClick={resumeRecording}>Resume</button>
) : (
<button onClick={pauseRecording}>Pause</button>
)}
<button onClick={stopRecording}>Stop</button>
</>
)}
{audioBlob && (
<>
<audio src={URL.createObjectURL(audioBlob)} controls />
<button onClick={handleComplete}>Use Recording</button>
          <button onClick={resetRecording}>Re-record</button>
</>
)}
</div>
);
}
Result: Shipped in 2 weeks, 15% adoption in month 1.
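One piece the hook above doesn't show is the waveform meter, which came from the Web Audio API's AnalyserNode fed by the same microphone stream. A minimal sketch of that metering piece (the canvas drawing and the exact hook integration here are illustrative):
function attachLevelMeter(stream: MediaStream, canvas: HTMLCanvasElement) {
  const audioContext = new AudioContext();
  const source = audioContext.createMediaStreamSource(stream);
  const analyser = audioContext.createAnalyser();
  analyser.fftSize = 256;
  source.connect(analyser);

  const data = new Uint8Array(analyser.frequencyBinCount);
  const ctx = canvas.getContext('2d')!;

  const draw = () => {
    analyser.getByteFrequencyData(data);
    // Collapse the bins into a single 0..1 level and draw a horizontal bar
    const level = data.reduce((sum, v) => sum + v, 0) / (data.length * 255);
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    ctx.fillStyle = '#4caf50';
    ctx.fillRect(0, 0, canvas.width * level, canvas.height);
    requestAnimationFrame(draw);
  };
  draw();

  // Caller closes the AudioContext when recording stops
  return () => audioContext.close();
}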
Phase 2: Video Recording
Added complexity:
- Screen capture + webcam
- Picture-in-picture
- Higher file sizes (chunked uploads)
async function startVideoRecording() {
// Get screen capture
const screenStream = await navigator.mediaDevices.getDisplayMedia({
video: {
cursor: 'always',
displaySurface: 'monitor',
},
audio: true,
});
// Get webcam (optional)
const cameraStream = await navigator.mediaDevices.getUserMedia({
video: {
width: { ideal: 320 },
height: { ideal: 240 },
},
audio: false, // Use screen audio only
});
// Combine streams
const combinedStream = new MediaStream([
...screenStream.getVideoTracks(),
...screenStream.getAudioTracks(),
]);
  // Add the camera track. Note: most browsers' MediaRecorder only encodes the
  // first video track, so true picture-in-picture needs both streams composited
  // onto a <canvas> with canvas.captureStream() recorded instead.
  const videoTrack = cameraStream.getVideoTracks()[0];
  if (videoTrack) {
    combinedStream.addTrack(videoTrack);
  }
// Create recorder
const recorder = new MediaRecorder(combinedStream, {
mimeType: 'video/webm;codecs=vp9,opus',
videoBitsPerSecond: 2500000, // 2.5 Mbps
});
// Rest of recording logic...
}
Phase 3: Chunked Upload for Large Files
Videos can be 100MB+. Chunked uploads with resume capability:
class ChunkedUploader {
private chunkSize = 5 * 1024 * 1024; // 5MB chunks
async upload(file: Blob, onProgress?: (percent: number) => void) {
// Initialize upload session
    const { uploadId } = await this.initializeUpload(file);
const totalChunks = Math.ceil(file.size / this.chunkSize);
const uploadedChunks: string[] = [];
for (let i = 0; i < totalChunks; i++) {
const start = i * this.chunkSize;
const end = Math.min(start + this.chunkSize, file.size);
const chunk = file.slice(start, end);
const etag = await this.uploadChunk(uploadId, i + 1, chunk);
uploadedChunks.push(etag);
onProgress?.(((i + 1) / totalChunks) * 100);
}
// Complete multipart upload
return this.completeUpload(uploadId, uploadedChunks);
}
private async initializeUpload(file: Blob) {
const response = await fetch('/api/media/init-upload', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
fileName: 'recording.webm',
fileSize: file.size,
mimeType: file.type,
}),
});
return response.json();
}
private async uploadChunk(
uploadId: string,
partNumber: number,
chunk: Blob
): Promise<string> {
// Get pre-signed URL for this chunk
const { uploadUrl } = await fetch(
`/api/media/upload-part?uploadId=${uploadId}&partNumber=${partNumber}`
).then(r => r.json());
// Upload chunk
const response = await fetch(uploadUrl, {
method: 'PUT',
body: chunk,
});
    // Return the ETag needed to complete the upload (the bucket's CORS config
    // must expose the ETag header for the browser to read it)
    return response.headers.get('ETag')!;
}
private async completeUpload(uploadId: string, etags: string[]) {
const response = await fetch('/api/media/complete-upload', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
uploadId,
parts: etags.map((etag, index) => ({
ETag: etag,
PartNumber: index + 1,
})),
}),
});
return response.json();
}
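  // Resume support (sketch): ask the server which parts were already uploaded
  // for an interrupted session and only upload the missing ones. The
  // /api/media/list-parts endpoint here is illustrative, not the real API.
  async resume(file: Blob, uploadId: string, onProgress?: (percent: number) => void) {
    const response = await fetch(`/api/media/list-parts?uploadId=${uploadId}`);
    const { parts } = await response.json(); // [{ PartNumber, ETag }, ...]
    const done = new Map<number, string>(
      parts.map((p: { PartNumber: number; ETag: string }) => [p.PartNumber, p.ETag] as [number, string])
    );

    const totalChunks = Math.ceil(file.size / this.chunkSize);
    const etags: string[] = [];

    for (let i = 0; i < totalChunks; i++) {
      const partNumber = i + 1;
      if (done.has(partNumber)) {
        etags.push(done.get(partNumber)!);
      } else {
        const start = i * this.chunkSize;
        const chunk = file.slice(start, Math.min(start + this.chunkSize, file.size));
        etags.push(await this.uploadChunk(uploadId, partNumber, chunk));
      }
      onProgress?.(((i + 1) / totalChunks) * 100);
    }

    return this.completeUpload(uploadId, etags);
  }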
}
Phase 4: Server-Side Processing
Raw WebM files aren’t web-optimized. Processing pipeline using AWS Lambda + FFmpeg:
// Lambda function triggered by S3 upload
// (fluent-ffmpeg shells out to an ffmpeg binary, which must be available in the
// Lambda environment, e.g. via a Lambda layer)
import ffmpeg from 'fluent-ffmpeg';
import AWS from 'aws-sdk';
import { S3Event } from 'aws-lambda';

const s3 = new AWS.S3();

export async function processVideo(event: S3Event) {
  const bucket = event.Records[0].s3.bucket.name;
  // S3 event keys are URL-encoded
  const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));
// Download from S3
const inputStream = s3.getObject({ Bucket: bucket, Key: key }).createReadStream();
// Process with FFmpeg
  return new Promise<void>((resolve, reject) => {
ffmpeg(inputStream)
// Generate multiple quality levels
.outputOptions([
'-c:v libx264', // H.264 codec
'-preset fast', // Encoding speed
'-crf 23', // Quality (lower = better)
'-c:a aac', // Audio codec
'-b:a 128k', // Audio bitrate
'-movflags +faststart', // Web optimization
])
      .on('end', async () => {
        // Upload the optimized output back to S3 (uploadToS3 helper not shown here)
        await uploadToS3('/tmp/output.mp4', 'optimized');
        // Generate thumbnail
        await generateThumbnail(bucket, key);
        // Update database
        await markProcessingComplete(key);
        resolve();
      })
.on('error', reject)
.save(`/tmp/output.mp4`);
});
}
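// generateThumbnail (referenced above): a sketch using fluent-ffmpeg's
// screenshots() helper; the temp paths and thumbnail key naming are illustrative
async function generateThumbnail(bucket: string, key: string) {
  const fs = await import('fs');
  const localPath = '/tmp/thumbnail-source';
  const { Body } = await s3.getObject({ Bucket: bucket, Key: key }).promise();
  fs.writeFileSync(localPath, Body as Buffer);

  await new Promise<void>((resolve, reject) => {
    ffmpeg(localPath)
      .screenshots({
        count: 1,
        timemarks: ['10%'], // grab a frame 10% into the video
        folder: '/tmp',
        filename: 'thumbnail.png',
        size: '640x360',
      })
      .on('end', () => resolve())
      .on('error', reject);
  });

  await s3.putObject({
    Bucket: bucket,
    Key: key.replace(/\.[^.]+$/, '-thumb.png'),
    Body: fs.createReadStream('/tmp/thumbnail.png'),
    ContentType: 'image/png',
  }).promise();
}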
// Generate variants for adaptive streaming
async function generateVariants(inputPath: string) {
const variants = [
{ resolution: '1920x1080', bitrate: '5000k', suffix: '1080p' },
{ resolution: '1280x720', bitrate: '2500k', suffix: '720p' },
{ resolution: '854x480', bitrate: '1000k', suffix: '480p' },
];
for (const variant of variants) {
await new Promise((resolve, reject) => {
ffmpeg(inputPath)
.size(variant.resolution)
.videoBitrate(variant.bitrate)
.on('end', resolve)
.on('error', reject)
.save(`/tmp/output-${variant.suffix}.mp4`);
});
// Upload variant to S3
await uploadToS3(`/tmp/output-${variant.suffix}.mp4`, variant.suffix);
}
}
Phase 5: Inline Editing
Basic editing features without leaving the browser:
// Trim video/audio
interface TrimmerProps {
mediaUrl: string;
duration: number;
  onTrim: (trimmedUrl: string) => void;
}
function MediaTrimmer({ mediaUrl, duration, onTrim }: TrimmerProps) {
const [startTime, setStartTime] = useState(0);
const [endTime, setEndTime] = useState(duration);
const videoRef = useRef<HTMLVideoElement>(null);
const handleTrim = async () => {
// Send trim request to server
const response = await fetch('/api/media/trim', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
mediaUrl,
startTime,
endTime,
}),
});
const { trimmedUrl } = await response.json();
onTrim(trimmedUrl);
};
return (
<div>
<video ref={videoRef} src={mediaUrl} controls />
<RangeSlider
min={0}
max={duration}
values={[startTime, endTime]}
onChange={([start, end]) => {
setStartTime(start);
setEndTime(end);
videoRef.current!.currentTime = start;
}}
/>
<button onClick={handleTrim}>Trim</button>
</div>
);
}
// Server-side trim using FFmpeg
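// Setup and helper sketches for the trim route below; the express bootstrapping,
// bucket name, key scheme, and CDN domain are illustrative, and uploadToS3 here
// is the one-argument variant the trim route uses
import express from 'express';
import ffmpeg from 'fluent-ffmpeg';
import { S3 } from 'aws-sdk';
import fs from 'fs';

const app = express();
app.use(express.json());

const s3 = new S3();
const MEDIA_BUCKET = 'media-bucket'; // placeholder

// Download an S3 object to /tmp and return the local path
async function downloadFromS3(mediaUrl: string): Promise<string> {
  const key = new URL(mediaUrl).pathname.slice(1);
  const localPath = `/tmp/${Date.now()}-${key.split('/').pop()}`;
  const { Body } = await s3.getObject({ Bucket: MEDIA_BUCKET, Key: key }).promise();
  fs.writeFileSync(localPath, Body as Buffer);
  return localPath;
}

// Upload a local file and return the URL it will be served from
async function uploadToS3(localPath: string): Promise<string> {
  const key = `processed/${localPath.split('/').pop()}`;
  await s3.putObject({
    Bucket: MEDIA_BUCKET,
    Key: key,
    Body: fs.createReadStream(localPath),
    ContentType: 'video/mp4',
  }).promise();
  return `https://cdn.example.com/${key}`;
}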
app.post('/api/media/trim', async (req, res) => {
const { mediaUrl, startTime, endTime } = req.body;
const inputPath = await downloadFromS3(mediaUrl);
const outputPath = `/tmp/trimmed-${Date.now()}.mp4`;
await new Promise((resolve, reject) => {
ffmpeg(inputPath)
.setStartTime(startTime)
.setDuration(endTime - startTime)
.on('end', resolve)
.on('error', reject)
.save(outputPath);
});
const trimmedUrl = await uploadToS3(outputPath);
res.json({ trimmedUrl });
});
Challenges and Solutions
Challenge 1: Browser Compatibility
Problem: MediaRecorder API support varies
Solution: Feature detection + fallbacks
function isMediaRecorderSupported(): boolean {
return !!(
navigator.mediaDevices &&
navigator.mediaDevices.getUserMedia &&
window.MediaRecorder
);
}
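// Codec support also varies across browsers, so rather than hard-coding a MIME
// type we pick the first supported one (the candidate list below is illustrative)
function pickRecordingMimeType(): string | undefined {
  const candidates = [
    'video/webm;codecs=vp9,opus',
    'video/webm;codecs=vp8,opus',
    'video/webm',
    'video/mp4',
  ];
  return candidates.find((type) => MediaRecorder.isTypeSupported(type));
}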
function BrowserCheck() {
const isSupported = isMediaRecorderSupported();
if (!isSupported) {
return (
<Alert severity="warning">
Your browser doesn't support recording.
Please use Chrome, Firefox, or Edge.
</Alert>
);
}
return <MediaRecorderComponent />;
}
Challenge 2: Large File Uploads
Problem: 500MB video uploads timing out
Solution: Chunked uploads with S3 multipart
// AWS S3 multipart upload
import { S3 } from 'aws-sdk';
class S3MultipartUploader {
private s3 = new S3();
async uploadLargeFile(file: File) {
// 1. Initiate multipart upload
const { UploadId } = await this.s3.createMultipartUpload({
Bucket: 'media-bucket',
Key: `uploads/${file.name}`,
ContentType: file.type,
}).promise();
// 2. Upload parts
const partSize = 10 * 1024 * 1024; // 10MB
const parts = [];
for (let i = 0; i < file.size; i += partSize) {
const chunk = file.slice(i, i + partSize);
const partNumber = Math.floor(i / partSize) + 1;
const { ETag } = await this.s3.uploadPart({
Bucket: 'media-bucket',
Key: `uploads/${file.name}`,
PartNumber: partNumber,
UploadId: UploadId!,
Body: chunk,
}).promise();
parts.push({ ETag, PartNumber: partNumber });
}
// 3. Complete upload
await this.s3.completeMultipartUpload({
Bucket: 'media-bucket',
Key: `uploads/${file.name}`,
UploadId: UploadId!,
MultipartUpload: { Parts: parts },
}).promise();
}
}
Challenge 3: Processing Costs
Problem: FFmpeg Lambda timeouts for long videos
Solution: Step Functions orchestration + progress tracking
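The state machine below handles the orchestration. For progress tracking, each worker writes its current step to a small status table that the client polls; a sketch of that write (the table name and attributes are illustrative, not our actual schema):
import { DynamoDB } from 'aws-sdk';

const statusTable = new DynamoDB.DocumentClient();

// Called by each Lambda in the pipeline as it starts and finishes its step
async function reportProgress(mediaId: string, step: string, percent: number) {
  await statusTable.update({
    TableName: 'media-processing-status',
    Key: { mediaId },
    UpdateExpression: 'SET currentStep = :step, progressPercent = :pct, updatedAt = :now',
    ExpressionAttributeValues: {
      ':step': step,
      ':pct': percent,
      ':now': new Date().toISOString(),
    },
  }).promise();
}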
// Step Function definition
{
"StartAt": "ProcessVideo",
"States": {
"ProcessVideo": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:xxx:function:process-video",
"Next": "GenerateThumbnails",
"Catch": [{
"ErrorEquals": ["States.ALL"],
"Next": "NotifyFailure"
}]
},
"GenerateThumbnails": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:xxx:function:generate-thumbnails",
"Next": "NotifySuccess"
},
"NotifySuccess": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:xxx:function:notify-user",
"End": true
},
"NotifyFailure": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:xxx:function:notify-failure",
"End": true
}
}
}
Results
After 6 months in production:
- Adoption: 35% of users now include media (up from 22%)
- Content volume: 30% increase in user-generated content
- User satisfaction: 4.5/5 rating for media suite
- Time savings: Average 20 minutes saved per media creation
- Cost: $0.15 per video processed (sustainable)
User feedback:
- “Game changer for my course creation”
- “Finally don’t need Loom subscription”
- “Editing is basic but covers 90% of needs”
Lessons Learned
- Ship Fast, Iterate: Audio-only MVP validated demand before building complex video features
- Browser APIs Are Powerful: MediaRecorder, MediaStream, and the Web Audio API handle surprisingly complex use cases
- Chunked Uploads Are Essential: For files >25MB, don't even try without chunking
- Processing Is Expensive: AWS Lambda costs add up. Consider batch processing for non-urgent work
- Simple Editing Wins: Users wanted trim/crop, not iMovie. Resist feature creep.
What’s Next
Future enhancements:
- Real-time collaboration: Multiple users editing same media
- AI features: Auto-generated captions, scene detection, smart trimming
- Advanced editing: Transitions, text overlays, multi-track audio
- Mobile apps: Native recording on iOS/Android
Conclusion
Building an in-app media suite significantly increased user engagement by removing friction from content creation. The key was focusing on the user workflow, not building a competitor to professional editing tools.
Core principles:
- Start with MVP (audio recording)
- Use browser APIs where possible
- Chunked uploads for reliability
- Server-side processing for optimization
- Simple editing features that users actually need
The 30% increase in content creation validated that reducing friction has outsized impact on user behavior.
Building media features in your SaaS product? I’d love to discuss architecture approaches and lessons learned. Connect on LinkedIn.