Serverless at Scale: AWS Lambda, DynamoDB, and Cognito in Production

Serverless isn’t just for hobby projects. We built a production SaaS application serving thousands of users entirely on AWS serverless services: Lambda for compute, DynamoDB for data, and Cognito for auth.

Results: 99.95% uptime, auto-scaling to handle 10x traffic spikes, and infrastructure costs of just $247/month for 50,000 monthly active users.

The Stack

  • AWS Lambda: Serverless compute (Node.js/TypeScript)
  • API Gateway: REST API endpoints
  • DynamoDB: NoSQL database
  • Cognito: User authentication and authorization
  • S3 + CloudFront: Static assets
  • EventBridge: Event-driven workflows

Architecture

[Client (React)]
      ↓
[CloudFront + S3]
      ↓
[API Gateway]
      ↓
[Lambda Authorizer (Cognito)]
      ↓
┌─────────────────────────────────┐
│      Lambda Functions           │
│  ┌──────────┬──────────────┐   │
│  │  Users   │   Content    │   │
│  ├──────────┼──────────────┤   │
│  │ Billing  │  Analytics   │   │
│  └──────────┴──────────────┘   │
└────────┬────────────────────────┘
         │
    ┌────┴─────┐
    ↓          ↓
[DynamoDB]  [EventBridge]
                ↓
         [Background Jobs]

Key Implementation Patterns

1. API Gateway + Lambda

// Lambda function handler
import { APIGatewayProxyHandler } from 'aws-lambda';
import { DynamoDB } from 'aws-sdk';
 
const dynamodb = new DynamoDB.DocumentClient();
 
export const handler: APIGatewayProxyHandler = async (event) => {
  // `claims` is absent when the request did not pass through the Cognito authorizer
  const userId = event.requestContext.authorizer?.claims?.sub;
 
  if (!userId) {
    return {
      statusCode: 401,
      body: JSON.stringify({ error: 'Unauthorized' }),
    };
  }
 
  try {
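    // Fetch the user's record by primary key
    const result = await dynamodb.get({
      TableName: 'Users',
      Key: { userId },
    }).promise();

    // `Item` is undefined when no record matches the key
    if (!result.Item) {
      return {
        statusCode: 404,
        body: JSON.stringify({ error: 'User not found' }),
      };
    }
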
    return {
      statusCode: 200,
      headers: {
        'Content-Type': 'application/json',
        'Access-Control-Allow-Origin': '*',
      },
      body: JSON.stringify(result.Item),
    };
  } catch (error) {
    console.error('Error:', error);
    return {
      statusCode: 500,
      body: JSON.stringify({ error: 'Internal server error' }),
    };
  }
};
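
The handler above builds the same response shape in every branch, and its error branches return no CORS header, so a browser client on another origin may not be able to read the error body. A small helper could centralize this. The sketch below is one way to do it; `jsonResponse`, `ApiResponse`, and `DEFAULT_HEADERS` are hypothetical names, and the wildcard origin is simply carried over from the handler.

```typescript
// Hypothetical helper: names are illustrative and not part of the original code.
interface ApiResponse {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

const DEFAULT_HEADERS: Record<string, string> = {
  'Content-Type': 'application/json',
  'Access-Control-Allow-Origin': '*', // same wildcard origin as the handler above
};

// Build an API Gateway proxy response with the same headers on every path.
function jsonResponse(statusCode: number, payload: unknown): ApiResponse {
  return {
    statusCode,
    headers: { ...DEFAULT_HEADERS },
    // JSON.stringify(undefined) yields undefined, so fall back to null
    body: JSON.stringify(payload ?? null),
  };
}

// Possible usage inside a handler:
//   if (!userId) return jsonResponse(401, { error: 'Unauthorized' });
//   return jsonResponse(200, result.Item);
```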

2. Cognito Authentication

// Cognito User Pool sign-up and sign-in (no Identity Pool calls are made here)
import { CognitoIdentityServiceProvider } from 'aws-sdk';
 
const cognito = new CognitoIdentityServiceProvider();
 
// Sign up
export async function signUp(email: string, password: string) {
  const params = {
    ClientId: process.env.COGNITO_CLIENT_ID!,
    Username: email,
    Password: password,
    UserAttributes: [
      { Name: 'email', Value: email },
    ],
  };
 
  return cognito.signUp(params).promise();
}
 
// Confirm sign up
export async function confirmSignUp(email: string, code: string) {
  return cognito.confirmSignUp({
    ClientId: process.env.COGNITO_CLIENT_ID!,
    Username: email,
    ConfirmationCode: code,
  }).promise();
}
 
// Sign in (the app client must have the USER_PASSWORD_AUTH flow enabled)
export async function signIn(email: string, password: string) {
  return cognito.initiateAuth({
    AuthFlow: 'USER_PASSWORD_AUTH',
    ClientId: process.env.COGNITO_CLIENT_ID!,
    AuthParameters: {
      USERNAME: email,
      PASSWORD: password,
    },
  }).promise();
}

3. DynamoDB Single-Table Design

// Single table design for multiple entities
interface BaseEntity {
  PK: string;  // Partition key
  SK: string;  // Sort key
  GSI1PK?: string;  // Global secondary index
  GSI1SK?: string;
}
 
interface UserEntity extends BaseEntity {
  PK: `USER#${string}`;
  SK: 'PROFILE';
  email: string;
  name: string;
  createdAt: string;
}
 
interface ProjectEntity extends BaseEntity {
  PK: `USER#${string}`;
  SK: `PROJECT#${string}`;
  GSI1PK: `PROJECT#${string}`;
  GSI1SK: `USER#${string}`;
  title: string;
  description: string;
}
 
// Query patterns
async function getUserProjects(userId: string) {
  return dynamodb.query({
    TableName: 'AppTable',
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
    ExpressionAttributeValues: {
      ':pk': `USER#${userId}`,
      ':sk': 'PROJECT#',
    },
  }).promise();
}
 
async function getProject(projectId: string) {
  return dynamodb.query({
    TableName: 'AppTable',
    IndexName: 'GSI1',
    KeyConditionExpression: 'GSI1PK = :pk',
    ExpressionAttributeValues: {
      ':pk': `PROJECT#${projectId}`,
    },
  }).promise();
}
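
With single-table design, a typo in a key prefix silently produces an item that no query will find. One way to reduce that risk is to build every key through small helper functions. This is a minimal sketch: `userKey`, `projectKey`, `ProjectItem`, and `buildProjectItem` are hypothetical names, and the prefixes are the ones used by the entity types above.

```typescript
// Hypothetical key helpers: the prefixes mirror the entity interfaces above.
type UserKey = `USER#${string}`;
type ProjectKey = `PROJECT#${string}`;

const userKey = (userId: string): UserKey => `USER#${userId}`;
const projectKey = (projectId: string): ProjectKey => `PROJECT#${projectId}`;

interface ProjectItem {
  PK: UserKey;        // groups the project under its owning user
  SK: ProjectKey;     // lets begins_with(SK, 'PROJECT#') list a user's projects
  GSI1PK: ProjectKey; // inverted keys so a project can be found by its id alone
  GSI1SK: UserKey;
  title: string;
  description: string;
}

// Assemble a project item with both the table keys and the GSI1 keys filled in.
function buildProjectItem(
  userId: string,
  projectId: string,
  title: string,
  description: string,
): ProjectItem {
  return {
    PK: userKey(userId),
    SK: projectKey(projectId),
    GSI1PK: projectKey(projectId),
    GSI1SK: userKey(userId),
    title,
    description,
  };
}
```

The same helpers can then be reused in the query functions, so the write path and the read path cannot drift apart.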

4. Event-Driven Architecture

// EventBridge integration
import { EventBridge } from 'aws-sdk';
 
const eventBridge = new EventBridge();
 
// Emit event
export async function emitProjectCreated(project: Project) {
  await eventBridge.putEvents({
    Entries: [{
      Source: 'app.projects',
      DetailType: 'ProjectCreated',
      Detail: JSON.stringify(project),
    }],
  }).promise();
}
 
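// Lambda function listening to events
// (Project, sendEmail, and initializeAnalytics are assumed to be defined elsewhere)
import { EventBridgeHandler } from 'aws-lambda';

export const projectCreatedHandler: EventBridgeHandler<'ProjectCreated', Project, void> = async (event) => {
  // EventBridge delivers `detail` as an already-parsed object, so no JSON.parse is needed
  const project = event.detail;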
 
  // Send notification
  await sendEmail(project.owner, `Project ${project.title} created!`);
 
  // Initialize analytics
  await initializeAnalytics(project.id);
 
  // Other background tasks...
};
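
One detail that is easy to trip over with EventBridge: the producer passes `Detail` as a JSON string, while the Lambda target receives `detail` as an object that has already been parsed. The sketch below models both sides with plain types so the difference is visible. `ProjectSummary`, `toProjectCreatedEntry`, and `readProjectCreated` are hypothetical names, and the envelope type covers only the fields used here.

```typescript
// Hypothetical project shape, for illustration only.
interface ProjectSummary {
  id: string;
  title: string;
  owner: string;
}

// One PutEvents entry as the producer sends it: Detail is a JSON string.
interface PutEventsEntry {
  Source: string;
  DetailType: string;
  Detail: string;
}

// Subset of the envelope a Lambda target receives: detail is an object.
interface ProjectCreatedEvent {
  source: string;
  'detail-type': string;
  detail: ProjectSummary;
}

// Producer side: serialize the payload into the entry.
function toProjectCreatedEntry(project: ProjectSummary): PutEventsEntry {
  return {
    Source: 'app.projects',
    DetailType: 'ProjectCreated',
    Detail: JSON.stringify(project),
  };
}

// Consumer side: check the detail type and read the payload directly.
function readProjectCreated(envelope: ProjectCreatedEvent): ProjectSummary {
  if (envelope['detail-type'] !== 'ProjectCreated') {
    throw new Error(`Unexpected detail-type: ${envelope['detail-type']}`);
  }
  return envelope.detail;
}
```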

Production Lessons

1. Cold Starts Are Real

Problem: The first request after an idle period takes 2-3 seconds

Solutions:

  • Use provisioned concurrency for critical endpoints
  • Keep functions warm with scheduled pings (EventBridge rules, formerly CloudWatch Events)
  • Minimize dependencies (tree-shaking)

# serverless.yml
functions:
  api:
    handler: handler.main
    provisionedConcurrency: 5  # Keep 5 instances warm
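
To judge whether these measures help, it is useful to know which invocations landed on a fresh execution environment. Module-scope code runs once per environment, so a module-level flag can mark the first invocation. This is a rough sketch with hypothetical names (`describeInvocation`, `InvocationInfo`). With provisioned concurrency the environment is initialized ahead of time, so a large `msSinceInit` on a first invocation suggests the instance was pre-warmed rather than cold.

```typescript
// Module scope runs once per Lambda execution environment.
const initializedAt = Date.now();
let firstInvocation = true;

interface InvocationInfo {
  firstInvocation: boolean; // true only for the first call in this environment
  msSinceInit: number;      // time elapsed since the module was loaded
}

// Call at the top of a handler and log the result alongside request metrics.
function describeInvocation(now: number = Date.now()): InvocationInfo {
  const info: InvocationInfo = {
    firstInvocation,
    msSinceInit: now - initializedAt,
  };
  firstInvocation = false;
  return info;
}
```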

2. DynamoDB Capacity Planning

Problem: Throttling during traffic spikes

Solutions:

  • Use on-demand billing for unpredictable workloads
  • Implement exponential backoff
  • Monitor consumed capacity units

// Exponential backoff for DynamoDB
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function queryWithRetry(params: any, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await dynamodb.query(params).promise();
    } catch (error: any) {
      // Retry only on throttling, and only while attempts remain
      if (error.code === 'ProvisionedThroughputExceededException' && i < maxRetries - 1) {
        await sleep(Math.pow(2, i) * 100);  // 100ms, then 200ms, before the final attempt
        continue;
      }
      throw error;
    }
  }
}
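
A fixed backoff schedule makes concurrent callers retry in lockstep, which can prolong a throttling episode. A common variation adds random jitter to each delay. The sketch below is one way to do that for any async operation. `withRetry`, `backoffCapMs`, `isRetryable`, and the list of retryable codes are assumptions for illustration, and the check on `error.code` follows the aws-sdk v2 error shape used in this post. Note that the SDK also retries throttled requests on its own, so this layer sits on top of that.

```typescript
// Assumed set of throttling codes worth retrying; adjust to your workload.
const RETRYABLE_CODES = new Set<string>([
  'ProvisionedThroughputExceededException',
  'ThrottlingException',
]);

const pause = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Upper bound for the delay after failed attempt `attempt` (0-based):
// 100ms, 200ms, 400ms, ... capped at maxMs.
function backoffCapMs(attempt: number, baseMs = 100, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// aws-sdk v2 exposes the service error code on `error.code`.
function isRetryable(error: unknown): boolean {
  const code = (error as { code?: unknown } | null | undefined)?.code;
  return typeof code === 'string' && RETRYABLE_CODES.has(code);
}

// Run `operation`, retrying throttling errors with full jitter:
// each delay is a random value between 0 and the exponential cap.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  random: () => number = Math.random,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const outOfAttempts = attempt >= maxAttempts - 1;
      if (outOfAttempts || !isRetryable(error)) throw error;
      await pause(random() * backoffCapMs(attempt));
    }
  }
}

// Possible usage:
//   const result = await withRetry(() => dynamodb.query(params).promise());
```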

3. Lambda Timeouts

Problem: Long-running tasks hit Lambda's 15-minute maximum timeout

Solution: Use Step Functions for orchestration

# Step Functions state machine
StartAt: ProcessUpload
States:
  ProcessUpload:
    Type: Task
    Resource: !GetAtt ProcessUploadFunction.Arn
    Next: GenerateThumbnail
 
  GenerateThumbnail:
    Type: Task
    Resource: !GetAtt GenerateThumbnailFunction.Arn
    Next: NotifyUser
 
  NotifyUser:
    Type: Task
    Resource: !GetAtt NotifyUserFunction.Arn
    End: true

Cost Optimization

Monthly costs for 50,000 MAU:

  • Lambda: $45 (5M requests, 512MB, 1s avg duration)
  • DynamoDB: $85 (on-demand, 10GB storage, 50M reads, 10M writes)
  • API Gateway: $35 (10M requests)
  • Cognito: $28 (50K MAU)
  • CloudFront + S3: $25 (100GB bandwidth, 50GB storage)
  • Data transfer: $20
  • CloudWatch: $9

Total: $247/month, or about $0.005 per user
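
The per-user figure follows directly from the line items above. As a quick check of the arithmetic (the variable names are illustrative):

```typescript
// Monthly line items in USD, copied from the list above.
const lineItems: Array<[string, number]> = [
  ['Lambda', 45],
  ['DynamoDB', 85],
  ['API Gateway', 35],
  ['Cognito', 28],
  ['CloudFront + S3', 25],
  ['Data transfer', 20],
  ['CloudWatch', 9],
];

const monthlyActiveUsers = 50000;

// Sum the costs, then divide by the number of monthly active users.
const totalUsd = lineItems.reduce((sum, [, cost]) => sum + cost, 0);
const costPerUserUsd = totalUsd / monthlyActiveUsers;

console.log(totalUsd);                  // 247
console.log(costPerUserUsd.toFixed(3)); // "0.005"
```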

Compare to traditional EC2:

  • 2x t3.medium instances: ~$120/month
  • RDS t3.small: ~$55/month
  • Load balancer: ~$30/month
  • Total: $205/month (but doesn’t auto-scale)

Results

  • 99.95% uptime over 12 months
  • Auto-scaling: Handled 10x Black Friday traffic with zero config changes
  • Response time: p50: 120ms, p99: 450ms
  • Developer velocity: Deploy in 2 minutes with no downtime
  • Cost efficiency: $0.005 per monthly active user

Key Takeaways

  1. Serverless works at scale: Don’t let myths hold you back
  2. Cold starts are manageable: Provisioned concurrency for critical paths
  3. DynamoDB single-table design: Requires planning but worth it
  4. Event-driven architecture: Decouple services for flexibility
  5. Monitor everything: CloudWatch alarms are your friend

Building serverless applications? I’d love to discuss patterns and pitfalls. Connect on LinkedIn.