Framework

Not all features should go live immediately. Many should be A/B tested first.

Prioritization for experiments differs from feature prioritization:

| Dimension | Feature Prioritization | Experiment Prioritization |
|---|---|---|
| Metric | Revenue/engagement impact | Learning value |
| Question | What has the biggest impact? | What's most uncertain? |
| Goal | Ship the best features | Learn the most |

A high-uncertainty, medium-impact feature should be tested. A high-certainty feature should be shipped.

Actionable Steps

1. Categorize Each Idea

  • High confidence → Ship directly
  • Medium confidence → A/B test first
  • Low confidence → Prototype/learn first

2. Prioritize Tests by Expected Learning Value

Test the idea that, if wrong, would most change your strategy.

3. Run Tests Quickly, Get Signal Fast

The faster you learn, the more experiments you can run.

Key Takeaways

  • Testing priorities differ from shipping priorities. Test uncertain ideas, ship confident ideas.
  • Maximize learning velocity. Faster tests = more learning in same time frame.
  • Test the idea that, if wrong, would most change your strategy. Learning value > potential upside.

The Experiment Prioritization Problem

You have 30 experiment ideas. Your team can run 5 this quarter. Which 5?

Common mistakes:

  1. "Test the feature with highest upside" (ignores certainty)

    • If you're 80% confident it works, testing is low-value
    • You should ship directly
  2. "Test the feature with highest downside" (too conservative)

    • Never test because there's always downside risk
    • Paralysis by analysis
  3. "Test random ideas" (no prioritization framework)

    • 15 small experiments, learn nothing
    • Should have run 5 focused experiments

Best approach: Prioritize by learning value per dollar/time spent.


The ICE-V Framework for Experiment Prioritization

ICE = Impact × Confidence × Ease
V = Value of Information (what do we learn if we're wrong?)

Step 1: Score Each Experiment on ICE

Impact (1-5 scale):

  • How much will this improve the metric if it works?
  • 1 = +0.5% (minimal)
  • 5 = +20%+ (transformational)

Example: "Add social sharing button"

  • Potential impact on engagement: +3% (small but positive)
  • Impact score: 3

Confidence (1-5 scale):

  • How confident are you this will work?
  • 1 = Total speculation
  • 5 = We're 90%+ confident it works

Example: "Add social sharing button"

  • Similar products have it, users asked for it
  • Confidence score: 4

Ease (1-5 scale):

  • How easy/fast is this to test?
  • 1 = Months of work
  • 5 = Hours of work

Example: "Add social sharing button"

  • Design: 4 hours
  • Engineering: 8 hours
  • Testing infrastructure: Already set up
  • Ease score: 5

ICE Score = 3 × 4 × 5 = 60
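The multiplication above is simple enough to sketch as a helper. This is a minimal illustration; the function name and the range validation are mine, not part of the framework itself:

```python
def ice_score(impact: int, confidence: int, ease: int) -> int:
    """Multiply the three 1-5 ratings into a single ICE score (range 1-125)."""
    for name, value in (("impact", impact), ("confidence", confidence), ("ease", ease)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return impact * confidence * ease

# "Add social sharing button" from the running example: 3 x 4 x 5
print(ice_score(3, 4, 5))  # → 60
```

Keeping the three ratings separate (rather than scoring ICE directly) makes disagreements visible: two PMs who land on the same ICE score may disagree sharply on confidence, and that disagreement is itself a signal to test.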

Step 2: Score Value of Information (V)

Value of information = magnitude of strategy change if you learn you were wrong

Question: "If this test fails, how much does it change your product strategy?"

Examples:

Test: "Add social sharing button"

  • If it fails: "OK, sharing isn't important to users. No strategy change."
  • V score: 1 (low learning value)
  • Recommendation: Don't test. Your confidence is high (4/5). Ship it.

Test: "Change pricing from per-user to per-company"

  • If it fails: "We misunderstood buyer behavior. Strategy is wrong."
  • If it works: "This unlocks enterprise market."
  • V score: 5 (high learning value)
  • Recommendation: Test this, even if effort is high.

Test: "Build mobile app vs. responsive web"

  • If mobile wins: "We build native, shift resources"
  • If web wins: "We go web-only, avoid native maintenance"
  • V score: 5 (high learning value)
  • Recommendation: Test this despite high effort.

Step 3: Prioritization Matrix

Plot each experiment on:

  • X-axis: ICE Score (left = low, right = high)
  • Y-axis: Value of Information (bottom = low, top = high)
| Quadrant | Action | Example |
|---|---|---|
| High ICE, High V | DO FIRST ✓ | Pricing model change, core flow optimization |
| High ICE, Low V | SHIP DIRECTLY (skip test) | Add social sharing, copy tweaks |
| Low ICE, High V | DO AFTER (risky but learn a lot) | New product category, novel interaction |
| Low ICE, Low V | DEPRIORITIZE | Minor UI tweaks, niche feature requests |
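The quadrant logic can be sketched in a few lines of Python. The cutoffs used here (ICE ≥ 40 counts as "high", V ≥ 3 counts as "high") are illustrative assumptions, not part of the framework; tune them to your own backlog's score distribution:

```python
from dataclasses import dataclass

# Illustrative thresholds (assumptions, not canon): adjust per backlog.
HIGH_ICE = 40
HIGH_V = 3

@dataclass
class Experiment:
    name: str
    impact: int      # 1-5
    confidence: int  # 1-5
    ease: int        # 1-5
    v: int           # value of information, 1-5

    @property
    def ice(self) -> int:
        return self.impact * self.confidence * self.ease

    @property
    def quadrant(self) -> str:
        if self.ice >= HIGH_ICE and self.v >= HIGH_V:
            return "DO FIRST"
        if self.ice >= HIGH_ICE:
            return "SHIP DIRECTLY"
        if self.v >= HIGH_V:
            return "DO AFTER"
        return "DEPRIORITIZE"

backlog = [
    Experiment("Pricing model change", 5, 2, 1, 5),
    Experiment("Onboarding checklist", 3, 4, 5, 2),
    Experiment("Dark mode", 1, 5, 4, 1),
]

# Rank by learning value first, then by ICE as a tiebreaker
for exp in sorted(backlog, key=lambda e: (e.v, e.ice), reverse=True):
    print(f"{exp.name}: ICE={exp.ice}, V={exp.v} -> {exp.quadrant}")
```

Sorting by (V, ICE) encodes the core idea of the framework: learning value outranks raw ICE score when deciding what to test first.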

Real-World Case Study: Experiment Prioritization

Company: Mid-Market SaaS (50K users)

Q2 Experiment Backlog: 30 ideas

| Experiment | Impact | Conf | Ease | ICE | V | Quadrant | Priority |
|---|---|---|---|---|---|---|---|
| Pricing model (per-user → per-org) | 5 | 2 | 1 | 10 | 5 | Low ICE, High V | DO 2ND |
| Onboarding checklist | 3 | 4 | 5 | 60 | 2 | High ICE, Low V | SHIP |
| Dark mode | 1 | 5 | 4 | 20 | 1 | Low ICE, Low V | SKIP |
| New export formats | 2 | 3 | 3 | 18 | 1 | Low ICE, Low V | SKIP |
| AI-powered suggestions | 4 | 1 | 2 | 8 | 5 | Low ICE, High V | DO 1ST* |
| Help sidebar position | 2 | 4 | 5 | 40 | 1 | High ICE, Low V | SHIP |
| Slack integration | 3 | 3 | 4 | 36 | 3 | Low ICE, High V | DO 3RD |
| New dashboard layout | 2 | 3 | 4 | 24 | 2 | Low ICE, Low V | SKIP |

*Note: AI-powered suggestions has a low ICE score because the tech is unproven. But V = 5: if it works, it changes the strategy. Moved to DO 1ST.

Prioritization Decision:

The five decisions for the quarter, in order (three experiments to run, two features to ship directly):

  1. AI-powered suggestions (High V learning, risky)
  2. Pricing model change (High V learning, need more confidence first)
  3. Slack integration (Moderate learning, medium effort)
  4. Onboarding checklist (Ship directly, don't test—high confidence)
  5. Help sidebar (Ship directly, don't test—high confidence)

Results (Q2):

| Experiment | Outcome | Learning |
|---|---|---|
| AI suggestions | Works! +15% engagement | Strategy insight: AI is a value driver. Invest here. |
| Pricing model | Fails. Only 15% adoption | Learning: Users want per-user flexibility. New pricing model needed. |
| Slack integration | Works. +20% retention | Developers love workflows. Expand the integration ecosystem. |
| Onboarding | +8% onboarding completion | ✓ Shipped, worked as expected |
| Help sidebar | +3% support ticket reduction | ✓ Shipped, minor win |

Strategic decisions based on learning:

  • Allocate 30% Q3 engineering to AI features (was 5%)
  • Redesign pricing to keep per-user flexibility (was going to change)
  • Expand Slack integration (prioritize over other integrations)

Without experiment prioritization: Would have tested all 30 ideas, learned less, and taken longer to act on insights.


Anti-Patterns in Experiment Prioritization

Anti-Pattern 1: "Test everything, no matter how confident"

The problem: You test "Add social sharing button" (90% confident it works). Result: 4 weeks testing confirms the obvious.

The fix: If confidence is roughly 80% or higher, ship directly. Use testing for uncertain ideas only.

Anti-Pattern 2: "Run too many experiments in parallel"

The problem: Running 15 small experiments simultaneously. Result: No statistical power. No clear winners. Wasted effort.

The fix: Run 3-5 focused experiments. Get clear signal. Ship winning ones. Then iterate.

Anti-Pattern 3: "Optimize for speed, not learning"

The problem: "We need results fast, so let's test surface-level changes (button color, copy)." Result: Fast answers, no strategic insight.

The fix: Balance speed with learning value. Spend 6 weeks testing pricing model (high V) vs. 1 week testing button color (low V).


The Economics of Experiment Prioritization

Scenario: a $100K testing budget (20 people for 5 weeks)

Bad prioritization:

  • 20 small experiments (low learning value each)
  • Cost: $100K total
  • Learning: Fractional
  • ROI: Low

Good prioritization:

  • 5 focused experiments (high learning value each)
  • Cost: $100K total
  • Learning: High (strategic insights)
  • ROI: Identifies 2 features worth $1M+ in revenue potential

Math: If one experiment uncovers a $2M+ feature opportunity, the ROI on the $100K budget is 20:1.


PMSynapse Connection

The gap: Most PMs run experiments but don't tie them to strategy or prioritization. PMSynapse connects experiments to outcomes. For each proposed experiment: "What hypothesis are we testing? What strategy changes if we learn it's wrong? Does this have high learning value?" By forcing this thinking, PMs shift from "run lots of tests" to "run the right tests."


Key Takeaways (Expanded)

  • Don't test high-confidence ideas. If you're 80%+ confident, ship directly. Use testing for uncertain bets.

  • Prioritize tests by learning value, not upside potential. A $1M opportunity you're only 20% confident in is a better test candidate than a $100K opportunity you're 90% confident in: the uncertain bet is where a test teaches you the most.

  • Run 3-5 focused experiments, not 20 small ones. You'll learn more in the same time frame.

  • Balance speed with learning. Don't optimize purely for fast results at the expense of strategic insight.

  • Track experiment outcomes and use them to inform strategy. If pricing model test fails, that's a strategic insight worth millions.

Experiment Prioritization: Which A/B Tests to Run First

Article Type

SPOKE Article — Links back to pillar: /product-prioritization-frameworks-guide

Target Word Count

2,500–3,500 words

Writing Guidance

Cover: the ICE framework for experiments, learning-based prioritization, and how to estimate the value of information from each experiment. Soft-pitch: PMSynapse supports hypothesis-driven product development and helps track experiment outcomes.

Required Structure

1. The Hook (Empathy & Pain)

Open with an extremely relatable, specific scenario from PM life that connects to this topic. Use one of the PRD personas (Priya the Junior PM, Marcus the Mid-Level PM, Anika the VP of Product, or Raj the Freelance PM) where appropriate.

2. The Trap (Why Standard Advice Fails)

Explain why generic advice or common frameworks don't address the real complexity of this problem. Be specific about what breaks down in practice.

3. The Mental Model Shift

Introduce a new framework, perspective, or reframe that changes how the reader thinks about this topic. This should be genuinely insightful, not recycled advice.

4. Actionable Steps (3-5)

Provide concrete actions the reader can take tomorrow morning. Each step should be specific enough to execute without further research.

5. The Prodinja Angle (Soft-Pitch)

Conclude with how PMSynapse's autonomous PM Shadow capability connects to this topic. Keep it natural — no hard sell.

6. Key Takeaways

3-5 bullet points summarizing the article's core insights.

Internal Linking Requirements

  • Link to parent pillar: /blog/product-prioritization-frameworks-guide
  • Link to 3-5 related spoke articles within the same pillar cluster
  • Link to at least 1 article from a different pillar cluster for cross-pollination

SEO Checklist

  • Primary keyword appears in H1, first paragraph, and at least 2 H2s
  • Meta title under 60 characters
  • Meta description under 155 characters and includes primary keyword
  • At least 3 external citations/references
  • All images have descriptive alt text
  • Table or framework visual included