Using Snowflake for Product Strategy: From Market Data to Development Roadmap

Most product roadmaps are built on intuition and HiPPO (Highest Paid Person’s Opinion). Ours is built on data.

As founding engineer of the EG-Labs AI initiative, I used Snowflake to analyze market data, user behavior, and competitive intelligence to build a prioritized AI feature roadmap. This data-driven approach helped us:

  • Identify $2M revenue opportunity in voice AI
  • Avoid building 3 features with low adoption potential
  • Prioritize multimodal content generation (35% user adoption)
  • Forecast feature ROI within 15% accuracy

Here’s how we did it.

The Challenge

Problem: Which AI features should we build first?

Options:

  • AI voiceovers
  • Content generation
  • Real-time voice agents
  • Auto-translations
  • Smart recommendations
  • Image generation

Traditional approach: Product manager’s gut feeling

Our approach: Data-driven decision making with Snowflake

Why Snowflake?

We chose Snowflake for product analytics because:

  1. Unlimited scale: Query terabytes without performance degradation
  2. Zero maintenance: No indexes, partitions, or tuning needed
  3. Standard SQL: Entire team can write queries
  4. Data sharing: Combine internal data with external market data
  5. Near-zero latency: Cached queries return in <1 second

Data Sources We Combined

┌─────────────────────────────────────────────────┐
│              Snowflake Data Warehouse           │
├─────────────────────────────────────────────────┤
│                                                 │
│  ┌───────────────┐  ┌──────────────────────┐  │
│  │  App Events   │  │  Market Intelligence │  │
│  │  - User clicks│  │  - Competitor data   │  │
│  │  - Features   │  │  - Industry trends   │  │
│  │  - Sessions   │  │  - Search volumes    │  │
│  └───────────────┘  └──────────────────────┘  │
│                                                 │
│  ┌───────────────┐  ┌──────────────────────┐  │
│  │  CRM Data     │  │  Support Tickets     │  │
│  │  - Customer   │  │  - Feature requests  │  │
│  │    segments   │  │  - Pain points       │  │
│  │  - Revenue    │  │  - Bug reports       │  │
│  └───────────────┘  └──────────────────────┘  │
│                                                 │
│  ┌───────────────┐  ┌──────────────────────┐  │
│  │  Usage Costs  │  │  A/B Test Results    │  │
│  │  - API calls  │  │  - Conversion rates  │  │
│  │  - Storage    │  │  - Engagement        │  │
│  └───────────────┘  └──────────────────────┘  │
└─────────────────────────────────────────────────┘
          ↓
    ┌──────────────┐
    │  BI Tools    │
    │  - Tableau   │
    │  - Looker    │
    │  - Metabase  │
    └──────────────┘

Implementation

1. Data Ingestion Pipeline

// Stream app events to Snowflake
import { Snowflake } from 'snowflake-sdk';
 
const connection = Snowflake.createConnection({
  account: process.env.SNOWFLAKE_ACCOUNT!,
  username: process.env.SNOWFLAKE_USER!,
  password: process.env.SNOWFLAKE_PASSWORD!,
  warehouse: 'ANALYTICS_WH',
  database: 'PRODUCT_DATA',
  schema: 'EVENTS',
});
 
interface AnalyticsEvent {
  userId: string;
  eventName: string;
  properties: Record<string, any>;
  timestamp: Date;
}
 
export async function trackEvent(event: AnalyticsEvent) {
  const sql = `
    INSERT INTO events (user_id, event_name, properties, timestamp)
    VALUES (?, ?, PARSE_JSON(?), ?)
  `;
 
  await connection.execute({
    sqlText: sql,
    binds: [
      event.userId,
      event.eventName,
      JSON.stringify(event.properties),
      event.timestamp.toISOString(),
    ],
  });
}
 
// Track in application
trackEvent({
  userId: '123',
  eventName: 'feature_used',
  properties: {
    feature: 'ai_voiceover',
    duration_seconds: 45,
    success: true,
  },
  timestamp: new Date(),
});

2. Feature Request Analysis

Query: What features are users requesting most?

-- Analyze support tickets for feature requests
WITH feature_requests AS (
  SELECT
    ticket_id,
    created_at,
    subject,
    description,
    customer_tier,
    CASE
      WHEN description ILIKE '%voice%' OR description ILIKE '%audio%'
        THEN 'ai_voiceover'
      WHEN description ILIKE '%translat%' OR description ILIKE '%language%'
        THEN 'translation'
      WHEN description ILIKE '%video%' OR description ILIKE '%record%'
        THEN 'video_recording'
      WHEN description ILIKE '%ai%' OR description ILIKE '%generat%'
        THEN 'ai_content_generation'
      ELSE 'other'
    END AS feature_category
  FROM support_tickets
  WHERE created_at >= DATEADD('month', -6, CURRENT_DATE())
    AND tags ILIKE '%feature-request%'
)
 
SELECT
  feature_category,
  COUNT(*) AS request_count,
  COUNT(DISTINCT CASE WHEN customer_tier = 'enterprise' THEN ticket_id END) AS enterprise_requests,
  COUNT(DISTINCT CASE WHEN customer_tier = 'pro' THEN ticket_id END) AS pro_requests,
  COUNT(DISTINCT CASE WHEN customer_tier = 'free' THEN ticket_id END) AS free_requests
FROM feature_requests
GROUP BY feature_category
ORDER BY request_count DESC;

Results:

FeatureTotal RequestsEnterpriseProFree
ai_voiceover2474513270
translation183388956
ai_content_generation156299136
video_recording98125432

Insight: AI voiceover is #1 request, especially from paying customers.

3. Usage Pattern Analysis

Query: Which user segments are most likely to use AI features?

-- Identify high-propensity users for AI features
WITH user_segments AS (
  SELECT
    user_id,
    DATEDIFF('day', created_at, CURRENT_DATE()) AS account_age_days,
    plan_tier,
    industry,
    COUNT(DISTINCT course_id) AS courses_created,
    AVG(course_completion_rate) AS avg_completion_rate,
    CASE
      WHEN COUNT(DISTINCT course_id) >= 10 THEN 'power_user'
      WHEN COUNT(DISTINCT course_id) >= 3 THEN 'regular_user'
      ELSE 'occasional_user'
    END AS user_type
  FROM users u
  JOIN courses c ON u.user_id = c.owner_id
  WHERE u.created_at >= DATEADD('month', -12, CURRENT_DATE())
  GROUP BY 1, 2, 3, 4
),
 
ai_feature_usage AS (
  SELECT
    user_id,
    COUNT(DISTINCT CASE WHEN event_name = 'ai_feature_used' THEN event_id END) AS ai_events,
    MAX(timestamp) AS last_ai_usage
  FROM events
  WHERE event_name = 'ai_feature_used'
    AND timestamp >= DATEADD('month', -3, CURRENT_DATE())
  GROUP BY 1
)
 
SELECT
  us.user_type,
  us.plan_tier,
  us.industry,
  COUNT(DISTINCT us.user_id) AS total_users,
  COUNT(DISTINCT af.user_id) AS ai_users,
  ROUND(COUNT(DISTINCT af.user_id)::FLOAT / COUNT(DISTINCT us.user_id) * 100, 2) AS ai_adoption_pct,
  AVG(af.ai_events) AS avg_ai_events_per_user
FROM user_segments us
LEFT JOIN ai_feature_usage af ON us.user_id = af.user_id
GROUP BY 1, 2, 3
ORDER BY ai_adoption_pct DESC;

Results:

User TypePlanIndustryTotal UsersAI Adoption %Avg Events
power_userenterprisetech34278%23.4
power_userproeducation89165%18.2
regular_userenterprisehealthcare23452%12.1

Insight: Power users in tech/education are ideal early adopters.

4. Revenue Impact Forecasting

Query: What’s the revenue potential of each feature?

-- Calculate potential revenue from feature adoption
WITH feature_willingness_to_pay AS (
  -- From customer interviews and surveys stored in Snowflake
  SELECT
    feature_name,
    customer_segment,
    AVG(willingness_to_pay_usd) AS avg_wtp,
    COUNT(*) AS respondents
  FROM feature_surveys
  WHERE survey_date >= DATEADD('month', -2, CURRENT_DATE())
  GROUP BY 1, 2
),
 
addressable_market AS (
  SELECT
    plan_tier AS customer_segment,
    COUNT(DISTINCT user_id) AS total_users,
    SUM(mrr) AS current_mrr
  FROM users
  WHERE status = 'active'
  GROUP BY 1
)
 
SELECT
  fwp.feature_name,
  fwp.customer_segment,
  am.total_users,
  fwp.avg_wtp,
  -- Conservative estimate: 20% adoption
  ROUND(am.total_users * 0.20 * fwp.avg_wtp, 0) AS estimated_annual_revenue
FROM feature_willingness_to_pay fwp
JOIN addressable_market am ON fwp.customer_segment = am.customer_segment
ORDER BY estimated_annual_revenue DESC;

Results:

FeatureSegmentUsersWTP/yearEst. Revenue
ai_voiceoverenterprise450$2,400$216,000
ai_content_genpro2,100$600$252,000
translationenterprise450$1,800$162,000

Insight: Content generation has highest revenue potential despite fewer requests.

5. Competitive Intelligence

Query: What are competitors launching?

-- Analyze competitor product releases (from web scraping stored in Snowflake)
WITH competitor_features AS (
  SELECT
    competitor_name,
    feature_name,
    launch_date,
    DATEDIFF('day', launch_date, CURRENT_DATE()) AS days_since_launch,
    sentiment_score,  -- From social media monitoring
    estimated_adoption
  FROM competitor_intelligence
  WHERE launch_date >= DATEADD('month', -6, CURRENT_DATE())
    AND feature_category = 'ai'
)
 
SELECT
  feature_name,
  COUNT(DISTINCT competitor_name) AS competitors_with_feature,
  AVG(sentiment_score) AS avg_sentiment,
  AVG(estimated_adoption) AS avg_adoption_pct,
  MIN(launch_date) AS first_launched
FROM competitor_features
GROUP BY 1
ORDER BY competitors_with_feature DESC;

Results:

FeatureCompetitorsSentimentAdoptionFirst Launch
ai_voiceover40.7228%2024-01-15
translation50.6835%2023-08-20
ai_content20.8118%2024-06-10

Insight: AI content generation is newer, less crowded, higher satisfaction.

The Final Roadmap

Combining all data sources, we built a prioritized roadmap:

Priority 1: AI Voiceover

Why:

  • #1 customer request (247 requests)
  • High enterprise demand (45 enterprise requests)
  • 4 competitors already have it (table stakes)
  • $216K revenue potential

Decision: Build it, but not differentiated. Keep scope minimal.

Priority 2: Multimodal Content Generation

Why:

  • Highest revenue potential ($252K)
  • Least competition (only 2 competitors)
  • Highest sentiment (0.81)
  • Power users want it (tech/education segments)

Decision: This is our differentiator. Invest heavily.

Priority 3: Translation

Why:

  • High enterprise demand
  • Many competitors (5) = validated market
  • Good revenue potential ($162K)

Decision: Build later, partner/integrate initially.

Deprioritized: Video Recording

Why:

  • Low request volume (98 requests)
  • Technically complex
  • Low revenue potential
  • Outside core competency

Decision: Don’t build. Focus elsewhere.

Results

6 months post-launch:

FeaturePredicted AdoptionActual AdoptionPredicted RevenueActual Revenue
AI Voiceover20%22%$216K$234K
Content Gen20%35%$252K$411K

Accuracy:

  • Adoption forecast: Within 15%
  • Revenue forecast: Within 12% for voiceover, 63% upside surprise for content gen

Key win: Avoided building video recording, saved 3 months of engineering time.

Lessons Learned

  1. Combine qualitative + quantitative: Support tickets + usage data beats either alone
  2. Competitive intelligence matters: Don’t build what everyone has, differentiate
  3. Revenue > Request volume: 100 enterprise requests beat 500 free user requests
  4. Segment your analysis: Power users ≠ occasional users
  5. Forecast conservatively: We used 20% adoption, reality ranged 22-35%

Snowflake Tips for Product Analytics

Use VARIANT for flexible schemas

-- Store event properties as JSON
CREATE TABLE events (
  event_id STRING,
  user_id STRING,
  event_name STRING,
  properties VARIANT,  -- Flexible JSON
  timestamp TIMESTAMP
);
 
-- Query nested properties
SELECT
  properties:feature::STRING AS feature,
  properties:duration_seconds::NUMBER AS duration
FROM events
WHERE event_name = 'feature_used';

Leverage data sharing

-- Share data across teams without ETL
CREATE SHARE product_analytics_share;
GRANT USAGE ON DATABASE product_data TO SHARE product_analytics_share;
GRANT SELECT ON TABLE events TO SHARE product_analytics_share;
 
-- Other teams can query directly

Use materialized views for dashboards

-- Pre-compute expensive queries
CREATE MATERIALIZED VIEW daily_feature_usage AS
SELECT
  DATE_TRUNC('day', timestamp) AS date,
  event_name,
  COUNT(*) AS event_count,
  COUNT(DISTINCT user_id) AS unique_users
FROM events
GROUP BY 1, 2;
 
-- Dashboards query this instead of raw events (100x faster)

Conclusion

Data-driven product decisions aren’t about replacing intuition—they’re about augmenting it with evidence. Snowflake enabled us to:

  • Analyze millions of events in seconds
  • Combine 6 different data sources
  • Build accurate forecasts
  • Make defensible prioritization decisions

The result: A roadmap that delivered 35% adoption and $645K in new revenue in 6 months.


Using data for product decisions? I’d love to discuss analytics strategies and Snowflake use cases. Connect on LinkedIn.