What To Do When Your AB Tests Won't Reach Statistical Significance
A practical system for shipping fast when you can't wait for perfect data
Everywhere you look, growth playbooks preach the same thing: experiment, AB test, iterate. It sounds perfect on paper: a scientific method to maximise distribution. But here’s what most of them don’t tell you: if you’re at an early-stage startup, you might not have the traffic to run statistically significant tests in every part of your funnel. So how do you make smart product decisions with imperfect data?
Table of contents
Who actually benefits from AB testing?
The traffic trap: when the math doesn’t work for early-stage startups
Gather feedback without AB testing
How to prioritise what to build
Wrapping up
Who actually benefits from AB testing?
Let’s get real about who AB testing is actually built for:
Companies past product-market fit with solid traffic who are optimising flows
Established products testing new ideas on existing audiences (fake door tests work great here)
Growth marketing teams running campaigns to thousands of leads
Consumer apps with hundreds or thousands of daily visitors hitting key funnel steps
Teams with the right tech stack (easier now with tools like Amplitude, Split.io, LaunchDarkly, PostHog, so not a huge blocker)
Companies with time to let tests run for weeks or even a month
If you don’t fit into the above? AB testing probably isn’t your best friend right now. And that’s okay. If you’re visual, you can also use this checklist to find out if you should run an AB test.
The traffic trap: when the math doesn’t work
Even if you’ve got thousands of visitors, that might not be enough.
Let’s look at a realistic early-stage scenario:
Baseline conversion rate: 20%
Confidence level: 80% (acceptable for fast iteration)
Expected lift: 20% relative, so from 20% to 24% (you should be swinging for at least this much)
Number of variants: 2
You’d need around 600 samples per variant, or about 1,200 total visitors in your test.
If you want your test to wrap in two weeks, that’s roughly 85-90 visitors per day hitting your experiment.
And if you’re aiming for smaller improvements (like 10%), those numbers balloon: halving the lift roughly quadruples the sample you need, to a couple of thousand per variant at the same settings, and past 5,000 once you ask for 95% confidence. But at early stage, you shouldn’t be optimising for 10% lifts anyway. You should be looking for step-change improvements that are obvious enough you don’t need perfect statistical rigour to spot them.
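If you want to sanity-check these numbers yourself, here is a minimal sketch of the standard normal-approximation formula for comparing two conversion rates. It makes assumptions the scenario above doesn't spell out: a one-sided test, 80% statistical power, and a relative lift. The function names are mine, and your calculator of choice may use slightly different settings.

```python
from math import ceil
from statistics import NormalDist


def samples_per_variant(baseline, relative_lift, confidence=0.80, power=0.80):
    """Approximate visitors needed per variant (one-sided two-proportion test)."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)          # e.g. 20% -> 24%
    z_alpha = NormalDist().inv_cdf(confidence)   # assumed one-sided threshold
    z_beta = NormalDist().inv_cdf(power)         # assumed 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def visitors_per_day(per_variant, variants=2, days=14):
    """Daily traffic needed for the test to finish within `days`."""
    return ceil(per_variant * variants / days)


n = samples_per_variant(baseline=0.20, relative_lift=0.20)
print(n, visitors_per_day(n))  # roughly 600 per variant, high 80s per day
```

Because the requirement scales with the inverse square of the absolute difference, halving the lift roughly quadruples the sample size under these same assumptions, and stricter confidence pushes it higher still.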
Plus, there’s another catch: you might have decent traffic on your landing page, but what about deeper in your funnel? Testing your checkout flow or final activation step? Those pages often see a fraction of your top-of-funnel traffic. And those are exactly the places where experiments would have the biggest impact.
Don’t have that kind of traffic yet? Don’t worry. There are other ways.
Gather feedback without AB testing: the Always-On Feedback Engine
Here’s how I’d approach collecting data if I were building an early-stage product today (which, hey, I’ve been doing with Make My Invoice).
You can use the Always-On Feedback Engine: a system that continuously gathers user insights without eating up all your time.
Automate your qualitative feedback loops
Founder intuition and user feedback are your secret weapons at this stage, and most of the gathering can be automated.
Here are my favourite quick wins:
Automated founder email survey: Send an automated email to people who signed up but didn’t use your product. Ask them why. Keep it short, personal, from the founder. People respond to that.
Scheduled weekly user interviews: Block off time every week to talk to users. Make it automatic. With Zapier, you can easily create email automations, such as:
Create a Zap that automatically sends an email to all new signups from the last 7 days inviting them to a 30-minute customer interview.
Trigger:
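- Schedule weekly, e.g. every Monday at 09:00 Europe/London (a daily run over a 7-day window would email the same signups up to seven times).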
Steps:
1. Find new signups from the last 7 days in [YOUR APP — e.g. Google Sheets, Airtable, HubSpot].
- Fields: Name, Email, Created At
- Filter: Created At within last 7 days
2. Loop through each signup and send an email via [Gmail / Outlook / SendGrid]:
- Subject: “Quick 30-min chat? Help us improve”
- Body: [Your email body]
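If you'd rather script this than use Zapier, the same logic can be sketched in plain Python. This is not Zapier's API, just an illustration of the filter and loop steps: the field names mirror the Zap (Name, Email, Created At), while the sample signups and the email body are made up. It only drafts the messages; sending them through your email provider is left out.

```python
from datetime import datetime, timedelta, timezone


def recent_signups(signups, now, days=7):
    """Keep signups created within the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [s for s in signups if cutoff <= s["created_at"] <= now]


def draft_invites(signups, body_template):
    """Build one email draft per signup (nothing is sent here)."""
    return [
        {
            "to": s["email"],
            "subject": "Quick 30-min chat? Help us improve",
            "body": body_template.format(name=s["name"]),
        }
        for s in signups
    ]


# Hypothetical data standing in for your Sheets / Airtable / HubSpot export.
now = datetime(2025, 1, 13, 9, 0, tzinfo=timezone.utc)
signups = [
    {"name": "Ada", "email": "ada@example.com", "created_at": now - timedelta(days=2)},
    {"name": "Lin", "email": "lin@example.com", "created_at": now - timedelta(days=10)},
]
body = "Hi {name}, could you spare 30 minutes to tell me how signing up went?"
invites = draft_invites(recent_signups(signups, now), body)
print([invite["to"] for invite in invites])
```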
Set up Cal.com (or Calendly), block off your Friday mornings, and you’re done. Offer a small thank-you: Amazon vouchers, product credits, whatever makes sense.
Use this time to dig into:
How they found you
What attracted them in the first place
Why they used it (or didn’t)
What they love, hate, and would change with a magic wand
Feedback on features you’re considering
Trust me, automating this and doing a little each week is easier than running a huge batch of interviews every once in a while.
Use AI to replicate your audience
I went to a product talk earlier this year where the speaker asked: Do we really have free will, or will AI eventually predict every choice we make?
This idea still blows my mind 🤯.
There are tools now that can help you simulate user feedback, and the results are surprisingly good.
Tools like Ask Rally or Figma plugins aim to replicate how your audience thinks. For small teams, the time and cost savings could be massive. Worth exploring.
A recent study by Ask Rally’s CEO tested 12 leading AI models against real human behaviour across five market research studies. The findings were fascinating: AI personas achieved 70-80% accuracy in replicating human choices, and you could get 80% of the insights at just 1% of the cost of traditional focus groups. The most human-sounding model? Gemini 2.0 Flash, at less than a penny per 100 responses.
The catch: models excel at logical decisions but struggle with emotional, high-stakes choices (like education marketing or complex B2B decisions). So this works best for straightforward product testing, not nuanced positioning or messaging.
Or just use your LLM of choice. Try something like this:
“Pretend you’re a first-time [Company Name] user trying to find [XYZ]. I’ll be the moderator. Answer honestly and naturally, describing your experience step by step with details. Here’s the URL: [link]”
Will it replace real user interviews? No. But it’s surprisingly good at surfacing obvious UX/UI wins based on established best practices. And sometimes, that’s exactly what you need.
A word of caution: if you can’t get 10 user interviews, AI simulation can help fill the gaps, but use it to complement real conversations, not replace them. Nothing beats talking to actual humans. And with the Always-On Feedback Engine running, you’ll have a steady stream of insights to feed into your prioritisation decisions, so pain points will surface more easily.
How to prioritise what to build: the Activation-First Framework
Now you’ve got feedback from user interviews, surveys, and maybe some AI simulation. Great! But how do you actually decide what to build first when you have contradictory signals and limited resources?
There are several frameworks for prioritising a list of ideas. I’m personally a big fan of the impact x effort matrix and of ICE and RICE scores. But for specific problems, I like a weighted-scoring method.
So here’s my Activation-First Framework for stacking ideas with imperfect data:
The Activation-First Framework
1. Impact on activation (weighted 40%)
Does this get users to their a-ha moment faster?
Will this increase the % of signups who actually use the product?
2. Tech ease (weighted 30%)
Can we ship this in days, not weeks?
Prefer quick wins early
3. User feedback intensity (weighted 20%)
How badly do people want this?
Are they actively asking for it, or is it just “nice to have”?
4. Founder conviction (weighted 10%)
Your intuition matters, but weight it last
Use it as a tiebreaker, not the main criterion
The magic is in the weighting. Early-stage is all about activation.
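To make the weighting concrete, here is a small sketch of how the score could be computed. The 40/30/20/10 weights come from the framework above; the 1-to-5 rating scale, the key names, and the two example ideas are my own assumptions for illustration.

```python
# Weights from the Activation-First Framework (they sum to 1.0).
WEIGHTS = {
    "activation_impact": 0.40,
    "tech_ease": 0.30,
    "feedback_intensity": 0.20,
    "founder_conviction": 0.10,
}


def activation_first_score(ratings):
    """Weighted sum of ratings, each assumed to be on a 1-5 scale."""
    return sum(weight * ratings[name] for name, weight in WEIGHTS.items())


# Hypothetical backlog with made-up ratings.
ideas = {
    "Onboarding checklist": {
        "activation_impact": 5, "tech_ease": 4,
        "feedback_intensity": 3, "founder_conviction": 3,
    },
    "Dark mode": {
        "activation_impact": 1, "tech_ease": 3,
        "feedback_intensity": 4, "founder_conviction": 5,
    },
}

ranked = sorted(ideas, key=lambda name: activation_first_score(ideas[name]), reverse=True)
for name in ranked:
    print(f"{name}: {activation_first_score(ideas[name]):.1f}")
```

Notice that the onboarding checklist wins even though dark mode scores higher on feedback and conviction. That is the weighting doing its job.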
In practice: ship fast, watch your North Star
You can still ship experiments. You just can’t afford to wait weeks for statistical significance.
If you’re pre-PMF or seeing early signs of it, your obsession should be activation. Get people to the a-ha moment. You can’t scale a leaky bucket.
So focus on:
Guiding users to your a-ha moment as fast as possible
Removing friction from onboarding (or adding good friction where it helps)
Making sure people consistently experience your core value
Rapid iteration: ship, learn, repeat
Keep your North Star Metric front and centre. If it’s moving up, you’re winning. A few practical tips to stay sane:
Ship one thing at a time when possible. If your metric moves (up or down), you’ll have a better sense of what caused it.
Keep a simple changelog. Note what you shipped and when. When your North Star shifts, you can look back and spot patterns.
Be ready to roll back. If you ship something and your metric drops, don’t be precious about it. Revert and learn.
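The changelog habit can be as light as a list of dates. As a rough sketch, assuming you can export your North Star as one number per day, you could compare the week before each ship date with the week after. The data below is invented, and a before/after average is only a prompt for a closer look, not proof of cause.

```python
from datetime import date, timedelta
from statistics import mean


def before_after(metric_by_day, shipped_on, window=7):
    """Average metric in the `window` days before vs. from the ship date onwards."""
    before = [metric_by_day[d]
              for d in (shipped_on - timedelta(days=i) for i in range(1, window + 1))
              if d in metric_by_day]
    after = [metric_by_day[d]
             for d in (shipped_on + timedelta(days=i) for i in range(window))
             if d in metric_by_day]
    if not before or not after:
        return None  # not enough data around this ship date
    return mean(before), mean(after)


# Hypothetical daily activation rate: 30% for a week, then 36% for a week.
start = date(2025, 1, 1)
metric_by_day = {start + timedelta(days=i): (0.30 if i < 7 else 0.36) for i in range(14)}

changelog = [{"shipped_on": date(2025, 1, 8), "change": "Shortened signup form"}]

for entry in changelog:
    result = before_after(metric_by_day, entry["shipped_on"])
    if result:
        before, after = result
        print(f'{entry["change"]}: {before:.2f} -> {after:.2f}')
```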
Wrapping up
Early-stage startups operate in a different reality. Sometimes you don’t have the luxury of perfect data or endless traffic (if you have it, make good use of it!). But you do have speed, flexibility, and the ability to talk directly to your users.
So lean into that. Ship fast. Listen hard. Watch your North Star. And remember: imperfect data beats no action every time.
And I’d love to know: have you used any other method that helped you get clarity without perfect data?
Hey, thank you for getting this far!
If you’re in this messy stage between PMF and scale and need someone who can move fast with incomplete data, let’s talk.
You can find me on LinkedIn; my DMs are always open.
Bea



