Analyzing Results

Understand experiment outcomes and make data-driven decisions.

Overview

Running an experiment is only half the work. Analyzing results correctly ensures you draw valid conclusions and make improvements that actually help.

Accessing Experiment Data

  1. Navigate to Analytics in your dashboard
  2. Use the Experiment filter to select your test
  3. View metrics broken down by variant

Key Metrics to Compare

Completion Rate

The percentage of users who finish the Story.

Metric | Formula | Meaning
Completion Rate | Completions / Views | How compelling is your content?

Interpreting results:

  • Higher is better
  • 70-85% is typical for onboarding
  • < 50% suggests problems

Drop-off Rate

The percentage of users who abandon before completing.

Metric | Formula | Meaning
Dismissal Rate | Dismissals / Views | Are users bailing?

Interpreting results:

  • Lower is better
  • Compare per-screen to find problem areas
  • High drop-off on one screen = that screen needs work

Engagement Rate

How much users interact with your content.

Metric | Formula | Meaning
Interaction Rate | Interactions / Views | Are users engaged?

Interpreting results:

  • Higher is generally better
  • Context matters: more interactions on a selection screen are good; more "back" button clicks might indicate confusion
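
If you work from exported event counts, the three rates above are simple ratios. The sketch below is only illustrative; the counts and variable names are assumptions, not the dashboard's export schema.

```python
# Illustrative calculation of the rates above from raw event counts.
# The numbers below are made-up example data, not a real export.
views = 1000
completions = 780
dismissals = 160
interactions = 420

completion_rate = completions / views      # Completions / Views
dismissal_rate = dismissals / views        # Dismissals / Views
interaction_rate = interactions / views    # Interactions / Views

print(f"Completion rate:  {completion_rate:.1%}")   # 78.0%
print(f"Dismissal rate:   {dismissal_rate:.1%}")    # 16.0%
print(f"Interaction rate: {interaction_rate:.1%}")  # 42.0%
```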

Screen-Level Metrics

Beyond totals, examine each screen:

Screen | Views | Exits | Time Spent
Screen 1 | 1000 | 50 | 8.2s
Screen 2 | 950 | 100 | 12.1s
Screen 3 | 850 | 75 | 6.5s

Look for:

  • High exits: Screen is problematic
  • Long time: Either engaging or confusing
  • Short time: Either skipped or simple
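
One way to spot the problem screen is to compute an exit rate and step-to-step retention for each screen. A minimal sketch using the example numbers from the table above (illustrative data, not a real export):

```python
# Per-screen exit rate and step-to-step retention for the example above.
screens = [
    {"name": "Screen 1", "views": 1000, "exits": 50},
    {"name": "Screen 2", "views": 950,  "exits": 100},
    {"name": "Screen 3", "views": 850,  "exits": 75},
]

previous_views = None
for screen in screens:
    exit_rate = screen["exits"] / screen["views"]
    # Retention: share of the previous screen's viewers who reached this one.
    retention = screen["views"] / previous_views if previous_views else 1.0
    print(f'{screen["name"]}: exit rate {exit_rate:.1%}, retention {retention:.1%}')
    previous_views = screen["views"]

# Screen 2's 10.5% exit rate stands out, so that screen needs work.
```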

Statistical Significance

Not all differences are meaningful. A variant showing 72% completion vs 70% might be random variation, not a real improvement.

What Significance Means

Significant result: The difference is unlikely to be due to chance. You can confidently say one variant performs better.

Not significant: The difference could be random. Either more data is needed, or the variants are essentially equal.

Factors Affecting Significance

Factor | Effect
Sample size | More users = more confidence
Effect size | Bigger difference = faster significance
Variance | Consistent results = clearer signal
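
For reference, one common way these factors combine into a significance check is a two-proportion z-test. The sketch below is a generic statistical illustration, not necessarily the method the dashboard uses.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two completion rates.

    conv_a / conv_b: completions in each variant
    n_a / n_b:       views (users) in each variant
    Returns the p-value; values below 0.05 are conventionally 'significant'.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# 72% vs 70% completion with 1,000 users per variant: not significant.
print(two_proportion_z_test(720, 1000, 700, 1000))  # ~0.32
```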

Waiting for Significance

Resist the urge to conclude early. Common timeline:

Traffic | Time to Significance
100 users/day | 2-4 weeks
1,000 users/day | 3-7 days
10,000 users/day | 1-2 days

The dashboard indicates when results reach significance.
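
The timelines above follow from how many users each variant needs. A common rule of thumb (Lehr's formula, roughly 80% power at a 5% significance level) is sketched below; treat it as a planning estimate, not the dashboard's calculation.

```python
import math

def users_needed_per_variant(baseline_rate, minimum_detectable_lift):
    """Rough sample size per variant via Lehr's rule of thumb:
    n ≈ 16 * p * (1 - p) / delta^2  (about 80% power, alpha = 0.05).
    """
    p = baseline_rate
    delta = minimum_detectable_lift
    return math.ceil(16 * p * (1 - p) / delta ** 2)

# Detecting a 5-point lift from a 70% baseline completion rate:
n = users_needed_per_variant(0.70, 0.05)
print(n)                        # 1344 users per variant
print(math.ceil(2 * n / 1000))  # ~3 days at 1,000 users/day split 50/50
```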

Making Decisions

Clear Winner

When one variant significantly outperforms:

  1. End the experiment
  2. Promote the winner to evergreen
  3. Archive the losing variant
  4. Document what you learned

No Clear Winner

When variants perform similarly:

  1. Prefer the simpler variant (less is more)
  2. Consider secondary metrics as a tiebreaker
  3. Or run the experiment longer with more traffic

Variant Underperforming

When the experiment is clearly worse:

  1. Pause or end the experiment to protect user experience
  2. Analyze why it failed
  3. Apply learnings to future tests

Note: You cannot edit a live variant. To test a modified version, end the experiment and launch a new one.

Common Analysis Mistakes

Peeking too early: Checking results daily and stopping as soon as one variant is ahead leads to false positives. Wait for significance.

Ignoring segments: Overall metrics might hide important patterns. A variant might excel on iOS but fail on Android.

Over-indexing on one metric: Completion rate improved, but did error rate also increase? Consider the full picture.

Attributing causation incorrectly: Correlation isn't causation. Seasonal effects, marketing campaigns, and app updates can all affect results.

Documenting Experiments

Keep a record of experiments:

Field | Example
Hypothesis | "Shorter onboarding increases completion"
Variants | 3-screen vs 5-screen
Traffic | 50/50 split
Duration | Jan 5-19, 2026
Result | 3-screen: 78% vs 5-screen: 71% (significant)
Decision | Promote 3-screen to evergreen
Learnings | Users prefer brevity over detail

This history informs future experiments and prevents re-testing the same ideas.
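
If you keep this log outside the dashboard (for example, in version control), a lightweight structured record is enough. The sketch below mirrors the fields in the table above; the record shape and file name are only suggestions.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class ExperimentRecord:
    """One row of the experiment log, mirroring the fields above."""
    hypothesis: str
    variants: str
    traffic: str
    duration: str
    result: str
    decision: str
    learnings: str

record = ExperimentRecord(
    hypothesis="Shorter onboarding increases completion",
    variants="3-screen vs 5-screen",
    traffic="50/50 split",
    duration="Jan 5-19, 2026",
    result="3-screen: 78% vs 5-screen: 71% (significant)",
    decision="Promote 3-screen to evergreen",
    learnings="Users prefer brevity over detail",
)

# Append one JSON line per experiment so the history stays easy to search.
with open("experiment_log.jsonl", "a") as log:
    log.write(json.dumps(asdict(record)) + "\n")
```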