Week 06 · Days 26-30

Growth Frameworks & Experimentation

Learn how growth PMs design experiments and think about A/B testing.

Portfolio deliverable · An experiment design doc (hypothesis + A/B test plan)

DAY 26

Growth loops vs funnels

Lesson

Lesson: Growth loops

A funnel is linear: users enter at the top, some fraction converts at the bottom, and the model ends there. A growth loop is circular: the output of one cycle becomes the input for the next, so gains compound over time. Common loop types are referral loops (existing users invite new users, e.g. Dropbox's extra storage for referrals), content loops (user-generated content gets indexed or shared and draws in new users, e.g. Pinterest, TikTok), and paid loops (revenue from users funds ads that acquire more users, which is viable only when lifetime value (LTV) exceeds customer acquisition cost (CAC) by a comfortable margin). The key question for a growth PM: where does this loop leak, and what's the smallest change that increases the 'conversion rate' of one step in the loop?
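To make the compounding and the 'leaks' concrete, here is a minimal sketch of a referral loop in Python. The three step rates (share of users who invite, invites per inviter, invite acceptance rate) and all the numbers are illustrative assumptions, not data from any real product.

```python
def loop_factor(invite_rate, invites_per_inviter, accept_rate):
    """New users produced per existing new user in one trip around the loop."""
    return invite_rate * invites_per_inviter * accept_rate


def simulate_referral_loop(initial_users, invite_rate, invites_per_inviter,
                           accept_rate, cycles):
    """Return the number of new users produced in each cycle, starting with the seed cohort."""
    new_users = float(initial_users)
    history = [new_users]
    for _ in range(cycles):
        inviters = new_users * invite_rate            # step 1: who sends invites
        invites_sent = inviters * invites_per_inviter  # step 2: how many invites go out
        new_users = invites_sent * accept_rate         # step 3: who accepts and signs up
        history.append(new_users)
    return history


# Hypothetical baseline: 20% of new users invite, 3 invites each, 25% accepted.
baseline = simulate_referral_loop(1000, 0.20, 3, 0.25, cycles=5)

# Same loop after a hypothetical fix to the leakiest step (acceptance 25% -> 30%).
improved = simulate_referral_loop(1000, 0.20, 3, 0.30, cycles=5)

print("baseline factor:", loop_factor(0.20, 3, 0.25), "total:", round(sum(baseline)))
print("improved factor:", loop_factor(0.20, 3, 0.30), "total:", round(sum(improved)))
```

With a loop factor below 1, each seeded cohort fades out after a few cycles, so the loop amplifies other acquisition channels rather than sustaining growth alone. Under the assumed baseline rates, 1,000 seeded users eventually yield about 1,176 in total (1,000 / (1 - 0.15)). Changing one step's rate and rerunning shows how much that step is worth.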

Task

Task: Diagram your practice app's growth loop

Draw (in Figma or on paper) the growth loop you believe drives your practice app's user acquisition.

DAY 27

Forming hypotheses

Lesson

Lesson: Hypothesis-driven thinking

The hypothesis format is: 'If we [change], then [metric] will [increase/decrease], because [underlying reason], measured by [specific metric/method].' It forces three things: a testable change (not a vague intention), a causal mechanism (so you learn something even if you're wrong), and a way to measure success that is agreed before you build. Weak hypothesis: 'Adding social proof will help.' Strong: 'If we show "12,000 people completed this today" on the signup screen, signup completion rate will increase by at least 5% (relative to control) because social proof reduces hesitation for new users, measured by completion rate in the signup funnel over 2 weeks.'
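One way to hold yourself to the template is to treat each bracket as a required field. The sketch below is only an illustration: the `Hypothesis` class and its field names are hypothetical, chosen to mirror the template.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    change: str        # the specific change you will make
    metric: str        # the metric you expect to move
    direction: str     # "increase" or "decrease", ideally with a size
    reason: str        # the causal mechanism
    measurement: str   # how and over what period you will measure

    def missing_parts(self):
        """Names of template fields that were left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def render(self):
        """Fill the template with this hypothesis's fields."""
        return (f"If we {self.change}, then {self.metric} will {self.direction}, "
                f"because {self.reason}, measured by {self.measurement}.")


weak = Hypothesis("add social proof", "", "", "", "")
strong = Hypothesis(
    change='show "12,000 people completed this today" on the signup screen',
    metric="signup completion rate",
    direction="increase by at least 5%",
    reason="social proof reduces hesitation for new users",
    measurement="completion rate in the signup funnel over 2 weeks",
)

print(weak.missing_parts())
print(strong.render())
```

The weak hypothesis fails the check because it names a change but no metric, direction, reason, or measurement, which is exactly what makes it untestable.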

Task

Task: Write 3 growth hypotheses

Using your growth loop from Day 26, write 3 hypotheses for changes that could strengthen the weakest part of the loop.

DAY 28

A/B testing fundamentals

Lesson

Lesson: A/B testing basics

In an A/B test, users are randomly split between control (the current experience) and variant (your change). A result is called statistically significant (commonly at p < 0.05) when a difference at least as large as the one observed would be unlikely if the change had no real effect; it is evidence that the difference is not just random noise, not proof. You also need enough sample size to detect the effect you expect: small effects on low-traffic features may take weeks to reach significance. Guardrail metrics are things that shouldn't get worse even if your primary metric improves (e.g. a change that boosts signups but tanks retention is a bad trade). The two most common mistakes are peeking at results early and stopping as soon as the test 'looks significant' (this inflates false positives), and running too many variants at once, which splits your traffic into smaller samples and slows everything down.
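For a rough sense of 'enough sample size', the sketch below uses the standard normal-approximation formula for comparing two conversion rates. The baseline rate, lift, and daily traffic are made-up inputs, and the defaults (two-sided alpha of 0.05, 80% power) are common conventions rather than requirements; treat the result as a planning estimate, not a replacement for your experimentation platform's calculator.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate users needed in EACH arm to detect the given relative lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired chance of detecting a real effect
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


def days_to_run(n_per_arm, daily_eligible_users):
    """Days needed for a two-arm test, assuming a 50/50 split of eligible traffic."""
    return math.ceil(2 * n_per_arm / daily_eligible_users)


# Hypothetical inputs: 40% baseline completion rate, hoping for a 5% relative lift.
n = sample_size_per_arm(baseline_rate=0.40, relative_lift=0.05)
print(n, "users per arm;", days_to_run(n, daily_eligible_users=2000), "days at 2,000 users/day")
```

Under these assumptions the answer is on the order of 9,500 users per arm. Halving the expected lift roughly quadruples the requirement, which is why small effects on low-traffic features take so long to test.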

Task

Task: Design an A/B test

Pick your top hypothesis from Day 27. Design the A/B test: control vs variant, primary metric, guardrail metrics, and rough sample size needed.

DAY 29

Reading experiment results

Lesson

Lesson: Interpreting results & making ship decisions

A results readout is rarely a clean 'it worked' or 'it didn't.' Common scenarios: (1) Primary metric significant and positive, guardrails flat → ship it. (2) Primary metric positive but a guardrail metric (e.g. retention, revenue per user) is negative → this is the hard case; you weigh the size of each effect and decide whether the guardrail regression is acceptable or a dealbreaker. (3) Not statistically significant → don't conclude 'it doesn't work,' conclude 'we don't have enough evidence yet'; you may need more time or traffic, or the effect may genuinely be too small to matter. Document your reasoning, not just the decision. That's what builds trust with stakeholders over time.
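The three scenarios can be sketched as a two-proportion z-test plus a simple decision rule. This is a simplified illustration with made-up counts: it assumes every metric is a conversion rate where higher is better, and the `recommend` function is a hypothetical helper, not a standard procedure. A guardrail with no significant drop is treated as 'flat', which is weaker than proving there is no regression.

```python
import math
from statistics import NormalDist


def two_proportion_test(conv_a, n_a, conv_b, n_b):
    """Return (difference in rate, two-sided p-value) for variant B vs control A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, p_value


def recommend(primary, guardrails, alpha=0.05):
    """primary and each guardrail value are (difference, p_value) tuples."""
    diff, p_value = primary
    if p_value >= alpha:
        return "inconclusive: not enough evidence yet"
    if diff <= 0:
        return "do not ship: primary metric moved the wrong way"
    regressed = [name for name, (g_diff, g_p) in guardrails.items()
                 if g_p < alpha and g_diff < 0]
    if regressed:
        return "judgment call: guardrail regression in " + ", ".join(regressed)
    return "ship"


# Made-up readout: signup completion (primary) and day-7 retention (guardrail).
primary = two_proportion_test(conv_a=4000, n_a=10000, conv_b=4250, n_b=10000)
retention = two_proportion_test(conv_a=3000, n_a=10000, conv_b=2990, n_b=10000)
print(recommend(primary, {"day-7 retention": retention}))
```

The 'judgment call' branch deliberately returns no verdict: weighing a primary win against a guardrail loss is the part of the readout you have to reason through and write down.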

Task

Task: Write a mock results readout

Write a fictional results readout for your A/B test from Day 28 (made-up numbers), including a ship/no-ship recommendation with reasoning.

DAY 30

Synthesize: Experiment design doc

Deliverable

Deliverable: Experiment Design Doc

Combine hypothesis, growth loop diagram, A/B test design, and mock readout into a single experiment design doc.

Advanced Challenge

Advanced Challenge: Design a prompt/model experiment

Classic A/B testing assumes a deterministic change: every user in a variant sees the same thing. For AI features, you're often testing prompts, model versions, or retrieval strategies, where outputs vary even within a 'variant.' Design an experiment for an AI feature (e.g. two different system prompts for a conversational assistant). Define what you'd hold in a fixed evaluation set (so you can compare quality offline before any user sees it), what you'd A/B test live (user-facing metrics once you've cleared an offline quality bar), and how 'guardrails' differ, e.g. a guardrail might be 'refusal rate doesn't increase' or 'response length stays within X tokens.' This two-stage approach (offline eval → online A/B) is a common way for AI teams to ship model and prompt changes safely.
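A minimal sketch of the offline stage might look like the following. Everything here is a placeholder: `generate`, `score`, and `is_refusal` stand in for your own model call, quality grader, and refusal detector, and whitespace-separated words are used as a crude proxy for tokens. Each prompt is sampled several times because outputs vary within a variant.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    mean_quality: float
    refusal_rate: float
    mean_length_tokens: float


def evaluate(generate, eval_set, score, is_refusal, samples_per_prompt=3):
    """Run one variant over a fixed eval set, sampling each prompt several times."""
    scores, refusals, lengths = [], [], []
    for prompt in eval_set:
        for _ in range(samples_per_prompt):
            output = generate(prompt)
            scores.append(score(prompt, output))
            refusals.append(1 if is_refusal(output) else 0)
            lengths.append(len(output.split()))  # rough token proxy (assumption)
    n = len(scores)
    return EvalResult(sum(scores) / n, sum(refusals) / n, sum(lengths) / n)


def clears_offline_bar(candidate, baseline, max_length_tokens, min_quality_gain=0.0):
    """Gate for the online A/B test: quality must not drop, guardrails must hold."""
    return (candidate.mean_quality >= baseline.mean_quality + min_quality_gain
            and candidate.refusal_rate <= baseline.refusal_rate
            and candidate.mean_length_tokens <= max_length_tokens)
```

Only a candidate that clears this gate would go to a live A/B test, where the primary metric becomes user-facing (e.g. task completion) and the same guardrails are monitored on real traffic.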