Chapter 5: Behavioral Economics & Smoke Testing
Measuring revealed preferences through landing pages and "fake doors".
Actions Speak Louder Than Words
When someone says "I'd buy that," they're giving you an opinion. When they actually try to buy it, they're giving you evidence.
Economists call this the difference between Stated Preference (what people say) and Revealed Preference (what people do). For validation, only revealed preference counts. This distinction, first articulated by economist Paul Samuelson in 1938, remains one of the most important concepts in customer validation. Samuelson's insight was that we can learn more about someone's preferences from observing their behavior than from asking them questions -- because behavior requires trade-offs, while opinions are free.
The gap between stated and revealed preference is well documented. In a classic study published in the Journal of Marketing Research, only 45% of consumers who stated they would purchase an organic product actually did so when given the opportunity -- less than half. For novel products (like those most startups build), the gap is even wider. The implication for founders is clear: design your experiments to observe behavior, not collect opinions.
Stated Preference
"I would definitely use an app that does X."
Reality: People are polite. They want to be helpful. They have no skin in the game. This tells you almost nothing. The person saying this would say the same thing about dozens of other hypothetical products -- it costs them nothing to be supportive.
Revealed Preference
Person clicks "Pre-order for $99" and enters credit card info.
Reality: They had to overcome friction. They risked something. This is real signal. The act of entering a credit card number involves a psychological commitment that transforms casual interest into genuine intent.
The Currency of Validation
The only valid currency is Skin in the Game: time, money, or reputation. If someone isn't willing to invest at least one of these, their "interest" is meaningless.
Nassim Nicholas Taleb popularized this concept in his book of the same name. The principle is simple: people's opinions become reliable only when they bear consequences for being wrong. A friend who says "I'd invest in your startup" means nothing until they write a check. A user who says "I love this feature" means nothing until they use it repeatedly. Design your experiments to require skin in the game, and you'll separate real demand from polite encouragement.
The Commitment Spectrum
Not all skin-in-the-game commitments are equal. Daniel Kahneman's work on loss aversion shows that people weight potential losses roughly twice as heavily as equivalent gains. This means financial commitments are especially powerful validation signals -- the customer is risking money they could lose. Understanding the commitment spectrum helps you choose the right level of friction for each experiment.
| Commitment Type | Example | Evidence Quality | Best For |
|---|---|---|---|
| Verbal | "I'd probably use that" | Very Low | Nothing -- avoid relying on this |
| Time | Attends a demo, fills out a survey, joins a call | Low-Medium | Early-stage problem validation |
| Reputation | Introduces you to colleagues, shares your landing page | Medium | Network-effect products, referral-driven models |
| Financial | Pre-order, deposit, letter of intent, subscription signup | High | Demand validation, willingness-to-pay testing |
The most powerful experiments combine multiple commitment types. When a customer pre-orders your product AND recommends it to three colleagues, you're seeing financial commitment plus reputational commitment -- an exceptionally strong signal. The LeanPivot Assumption Mapper helps you identify which assumptions need which level of commitment evidence to be considered validated.
Smoke Tests: The "Fake Door" Method
A smoke test offers a product that doesn't fully exist yet to see if anyone tries to buy it. It's the ultimate test of desirability. The term comes from hardware engineering, where a "smoke test" is the first test of a new circuit: plug it in and see if smoke comes out. In software validation, a smoke test plugs your value proposition into the market and sees if demand comes out.
The beauty of smoke tests is their speed and cost. While building a minimum viable product might take weeks or months, a smoke test can be set up in a weekend for under $500. If the test fails, you've saved yourself months of wasted development. If it succeeds, you have real behavioral data to inform what you build next.
Landing Page Test
Create a simple page describing your value prop. Drive traffic via ads. Measure clicks on "Sign Up" or "Buy Now." This is the simplest and most common smoke test -- and often the most informative.
Cost: $50-500 in ads
Timeline: 1-2 weeks
Best for: Testing value proposition and messaging at scale
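To make the economics concrete, here's a minimal sketch (Python, with hypothetical numbers) of the two figures a landing page test produces: conversion rate and cost per signup. Substitute your own ad spend and results.

```python
# Landing page smoke test arithmetic. All numbers are hypothetical
# placeholders -- plug in your own ad spend and results.

ad_spend = 200.00   # total ad spend in dollars
visitors = 1_250    # unique landing page visitors from the ads
signups = 38        # clicks on "Pre-order" or "Join waitlist"

conversion_rate = signups / visitors   # 0.0304
cost_per_signup = ad_spend / signups   # ~5.26

print(f"Conversion rate: {conversion_rate:.1%}")   # 3.0%
print(f"Cost per signup: ${cost_per_signup:.2f}")  # $5.26
```

Cost per signup matters as much as conversion rate: it tells you whether acquiring a customer through this channel could ever be profitable at your price point.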
Concierge MVP
Manually perform the service your software will automate. Zappos started by manually buying shoes from stores and shipping them. Food on the Table started by personally grocery shopping for customers.
Cost: Your time
Timeline: Ongoing
Best for: Validating the entire customer experience and willingness to pay
Wizard of Oz
Users interact with a frontend that looks finished, but humans do the work behind the scenes. From the customer's perspective, the product works -- they just don't know there's no automation behind the curtain.
Cost: Basic UI + manual work
Timeline: 2-4 weeks
Best for: Testing products where the experience matters as much as the outcome
Smoke Test Case Studies
These aren't hypothetical -- they're real examples of founders who used smoke tests to validate (or invalidate) demand before investing in development:
Dropbox: The Explainer Video
Drew Houston couldn't easily build a working demo of file syncing, so he made a 3-minute video showing the product concept. He posted it to Hacker News and measured waitlist signups. The waitlist went from 5,000 to 75,000 overnight.
What it validated: Massive demand for seamless file syncing. The product didn't exist yet, but the revealed preference (signing up for a waitlist after watching a video) proved people wanted it badly enough to take action.
Food on the Table: Concierge to Scale
Manuel Rosso's meal planning service started with him personally visiting a customer's home every week, asking about dietary preferences, checking grocery store sales, and creating a custom meal plan. For one customer. He didn't write a line of code until he'd manually served a dozen customers and understood exactly what they valued.
What it validated: Willingness to pay, the exact workflow customers needed, and which parts of the service could be automated without reducing value.
Rent the Runway: The Trunk Show
Before building an online dress rental platform, Jennifer Hyman and Jennifer Fleiss rented designer dresses and set up a pop-up on a college campus. Students could try on dresses and rent them on the spot. The conversion rate was astounding -- proving the concept before a single line of code was written.
What it validated: Women would rent rather than buy designer dresses, removing the core desirability risk.
The Hustle: Email Before Everything
Sam Parr validated demand for a tech news newsletter by simply starting one -- writing it himself, sending it to friends, and measuring forwarding rates and organic signups. No app. No platform. Just an email list and compelling content. It eventually grew to 1.5 million subscribers before being acquired by HubSpot.
What it validated: Audience demand for a specific content format and editorial voice, proven by organic growth without paid acquisition.
Ethical Fake Doors
When running fake door tests, treat users with respect. If they click "Buy," tell them you're in beta or taking early signups -- never charge their card without delivering. Offer a discount for launch as thanks for their interest.
Transparency builds trust. A well-handled fake door test can actually create your earliest brand advocates -- people who feel special for being early. The landing page message might read: "We're building [product name] and aren't quite ready yet. Join our founding members list for 30% off at launch and direct access to our product team." This is honest, respectful, and still generates a strong demand signal.
Designing Your Experiment
Every experiment needs a clear hypothesis and success criteria -- defined before you run it. This is non-negotiable. If you define success after seeing the results, you'll unconsciously adjust the threshold to match what happened -- a form of confirmation bias called "moving the goalposts." Write your hypothesis and success criteria down, share them with a co-founder or advisor, and don't change them once the experiment starts.
The Experiment Template
We believe that [target audience]
will [take this action]
for [this reason/value prop].
We'll know we're right if [metric] reaches [threshold] within [timeframe].
Example: "We believe that freelance designers will pre-order our cash flow tool ($29/month) because they're frustrated with unpredictable income. We'll know we're right if 50 people sign up in 2 weeks from $200 in Facebook ads."
Notice the specificity. The hypothesis names the audience (freelance designers), the action (pre-order), the value proposition (frustrated with unpredictable income), the metric (signups), the threshold (50), and the timeframe (2 weeks). There's no ambiguity about what success looks like. If you get 49 signups, that's a miss -- not "close enough." If you get 51, that's a hit. This level of precision is what separates rigorous validation from wishful thinking.
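If you want to enforce that discipline in code, here's a minimal sketch of the template as a frozen data structure, using the freelance-designer example above. The field names are illustrative, not any particular tool's schema.

```python
# The experiment template as a data structure. Freezing the dataclass
# means the success criteria can't be quietly edited mid-experiment.
# Field names are illustrative, not any particular tool's schema.
from dataclasses import dataclass

@dataclass(frozen=True)
class Hypothesis:
    audience: str
    action: str
    value_prop: str
    metric: str
    threshold: int
    timeframe_days: int

    def evaluate(self, observed: int) -> str:
        # Binary verdict against the pre-registered threshold:
        # 49 signups is a miss, 51 is a hit -- no "close enough".
        return "validated" if observed >= self.threshold else "invalidated"

h = Hypothesis(
    audience="freelance designers",
    action="pre-order our cash flow tool ($29/month)",
    value_prop="frustrated with unpredictable income",
    metric="pre-order signups",
    threshold=50,
    timeframe_days=14,
)
print(h.evaluate(49))  # invalidated
print(h.evaluate(51))  # validated
```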
Setting the Right Threshold
How do you decide what threshold counts as "success"? Here are some guidelines (a sketch applying them follows the list):
- Landing page conversion: A cold traffic conversion rate of 2-5% is typical for a compelling value proposition. Below 1% suggests weak messaging or the wrong audience. Above 5% is a strong product-market fit signal.
- Email signup to engagement: If 40%+ of email signups open your first email, your audience is genuinely interested. Below 20% suggests they signed up impulsively.
- Waitlist to pre-order conversion: If 10-15% of waitlist members convert to paid when you launch, you have strong demand. Below 5% suggests the "interest" was superficial.
- Ad click-through rate: A CTR above 2% on Facebook/Instagram suggests strong message-market fit. Below 0.5% suggests your messaging doesn't resonate with the audience you're targeting.
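As a sketch of how those bands translate into a verdict, here's the landing page benchmark expressed as a function. The bands come from the list above; the in-between 1-2% "borderline" label is our own addition to cover the gap the guidelines leave open.

```python
# Classify a cold-traffic landing page conversion rate against the
# benchmark bands above. The 1-2% "borderline" band is our own label
# for the gap between "weak" (<1%) and "typical" (2-5%).

def classify_landing_conversion(rate: float) -> str:
    if rate < 0.01:
        return "below 1%: weak messaging or wrong audience"
    if rate < 0.02:
        return "1-2%: borderline -- iterate on copy or targeting"
    if rate <= 0.05:
        return "2-5%: typical for a compelling value proposition"
    return "above 5%: strong product-market fit signal"

print(classify_landing_conversion(0.008))  # below 1%: weak messaging or wrong audience
print(classify_landing_conversion(0.034))  # 2-5%: typical for a compelling value proposition
print(classify_landing_conversion(0.07))   # above 5%: strong product-market fit signal
```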
The Anatomy of a Well-Designed Smoke Test
David Bland and Alex Osterwalder, in Testing Business Ideas, outline 44 different experiment types. But regardless of which type you choose, every well-designed smoke test shares the same structural elements. Missing any one of these increases the risk of ambiguous or misleading results:
Before the Test
- Specific hypothesis -- using the template above, no vague "let's see what happens"
- Target audience -- exactly who you're reaching (demographics, psychographics, channels)
- Kill criteria -- the threshold below which you pivot or kill, defined before launch
- Gray zone -- the range between clear success and clear failure, with prescribed follow-up actions
- Timeline -- how long the experiment runs, with no extensions allowed
After the Test
- Raw data -- all metrics collected, not just the ones that support your preferred conclusion
- Segmented analysis -- break results down by audience segment, channel, and time period
- Confounding factors -- anything that might have influenced results (holidays, competitor launches, viral sharing)
- Decision -- persevere, pivot, or kill, based strictly on pre-defined criteria
- Lean Vault entry -- full documentation for institutional memory (a sketch of such a record follows this list)
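One way to keep all of these elements together is a single record per experiment. Here's an illustrative sketch of what such an entry might look like as a data structure; the schema is our own invention, not the actual Lean Vault format.

```python
# Illustrative Lean Vault entry: before-launch criteria locked in one
# place, after-test results appended to the same record. The schema is
# our own sketch, not the actual Lean Vault format.
from dataclasses import dataclass, field

@dataclass
class LeanVaultEntry:
    # Defined BEFORE launch
    hypothesis: str
    target_audience: str
    kill_threshold: int      # below this: pivot or kill
    success_threshold: int   # at or above this: persevere
    timeline_days: int
    # Filled in AFTER the experiment ends
    raw_results: dict = field(default_factory=dict)
    segments: dict = field(default_factory=dict)
    confounders: list = field(default_factory=list)
    decision: str = ""

    def decide(self, observed: int) -> str:
        # Strictly apply the pre-defined criteria; the gray zone gets
        # its prescribed follow-up action, not an improvised call.
        if observed >= self.success_threshold:
            self.decision = "persevere"
        elif observed < self.kill_threshold:
            self.decision = "pivot or kill"
        else:
            self.decision = "gray zone: run pre-defined follow-up"
        return self.decision
```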
The CTA Strength Ladder
The strength of your evidence depends on the friction of your call-to-action. This is a principle from behavioral economics: the more effort a behavior requires, the more strongly it indicates genuine motivation. A frictionless action (clicking a link) reveals almost nothing about intent. A high-friction action (entering credit card details) reveals a great deal.
| CTA | Friction | Signal Strength |
|---|---|---|
| "Learn More" | Very Low | Weak |
| "Sign up for updates" | Low | Moderate |
| "Join waitlist" (with email) | Medium | Good |
| "Pre-order for $X" (credit card) | High | Strong |
Pro Tip: Maximize Friction
Counter-intuitively, you want more friction in your test. A "Learn More" click means almost nothing. A credit card entry (even if you don't charge) is strong evidence of real intent.
Think of it like a filtration system: each layer of friction filters out the mildly curious and leaves you with the genuinely motivated. Yes, your conversion numbers will be smaller. But smaller numbers of genuinely interested people are infinitely more valuable than large numbers of casually curious ones. You'd rather know that 20 people were willing to enter their credit card than that 2,000 people clicked "Learn More."
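Here's the filtration effect in numbers. The pass-through rates at each layer are hypothetical; the point is the shape of the funnel, not the exact figures.

```python
# A friction funnel with hypothetical pass-through rates. Each layer
# filters out the mildly curious and keeps the genuinely motivated.

visitors = 2_000
funnel = [
    ("clicked 'Learn More'",        0.40),  # low friction, weak signal
    ("joined waitlist with email",  0.15),  # medium friction
    ("entered credit card details", 0.17),  # high friction, strong signal
]

remaining = visitors
for step, pass_rate in funnel:
    remaining = int(remaining * pass_rate)
    print(f"{step}: {remaining}")

# clicked 'Learn More': 800
# joined waitlist with email: 120
# entered credit card details: 20
```

Those 20 credit card entries are the signal; the 800 "Learn More" clicks are mostly noise.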
Running Multiple Experiments in Parallel
One of the most common mistakes is running experiments sequentially when they could run in parallel. If you have three assumptions to test and each experiment takes two weeks, sequential testing takes six weeks. Running them in parallel takes two weeks. At the validation stage, speed matters enormously because your runway is finite.
The key constraint is that parallel experiments should test independent assumptions. If Experiment B depends on the outcome of Experiment A, they must run sequentially. But if you're testing a desirability assumption, a pricing assumption, and a channel assumption simultaneously, all three can run in parallel because they don't affect each other's results.
The Parallel Experiment Framework
Use this decision tree to determine whether two experiments can run in parallel:
- Do the experiments test different assumptions? If yes, proceed. If they test the same assumption from different angles, they can still run in parallel as triangulation.
- Does one experiment's result change the other's design? If yes, run them sequentially. If no, run them in parallel.
- Do they compete for the same audience? If you're running two Facebook ad experiments targeting the same segment, they may contaminate each other's results. Use different channels or segments to keep them independent.
- Can you track results independently? Use separate landing pages, distinct UTM parameters, and isolated conversion funnels (see the sketch after this list). If you can't tell which experiment produced which result, you can't run them in parallel.
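As a sketch of that last point, here's one way to stamp each parallel experiment with its own UTM-tagged URL so results never blur together. The experiment names, channels, and URLs are illustrative placeholders.

```python
# One landing page and one UTM-tagged URL per experiment, so every
# conversion traces back to exactly one experiment. All names and
# URLs below are illustrative placeholders.
from urllib.parse import urlencode

BASE = "https://example.com"

experiments = {
    # experiment name          (channel,      landing page slug)
    "desirability-smoke-test": ("facebook",   "/waitlist"),
    "pricing-29-vs-49":        ("linkedin",   "/pre-order"),
    "channel-newsletter-ads":  ("newsletter", "/founding-members"),
}

for name, (channel, slug) in experiments.items():
    params = urlencode({
        "utm_source": channel,
        "utm_medium": "paid",
        "utm_campaign": name,  # ties every visit back to one experiment
    })
    print(f"{BASE}{slug}?{params}")
```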
What You Walk Away With
- Revealed Preference Data: Evidence of what people actually do, not just say -- the gold standard of validation evidence.
- Conversion Metrics: Real numbers from your smoke test that you can compare against industry benchmarks.
- Validated (or Invalidated) Demand: Proof that people will take action for your solution, backed by behavioral data rather than opinions.
- Experiment Artifacts: Landing pages, ad copy, and results for your Lean Vault -- reusable assets for future experiments and investor conversations.
- Experiment Design Skills: The ability to write clear hypotheses, set appropriate thresholds, and design experiments that produce unambiguous results -- a skill that pays dividends throughout your entire startup journey.
Design Your Smoke Test
Create landing page tests, concierge MVPs, and fake door experiments with our Market Signal Tester. Includes hypothesis templates, threshold recommendations, and CTA optimization guidance.
Works Cited & Recommended Reading
Lean Startup & Innovation Accounting
- Navigating the 2026 AI-Native Enterprise Stack. LeanPivot.ai
- Validated Learning Techniques. LeanPivot.ai
- How to Make "Pivot or Persevere" Decisions. Kromatic
- Lean Methodology - Innovation Accounting Guide. SixSigma.us
- Running Lean, Second Edition. BEL Initiative
Assumption Mapping & Testing
- Invest in Winning Ideas with Assumption Mapping. Miro
- Testing Business Ideas: Book Summary. Strategyzer
- Innovation Tools – The Assumption Mapper. Nico Eggert
- Business Testing: Is your Hypothesis Really Validated? Strategyzer
- An Introduction to Assumptions Mapping. Mural
- Assumption Mapping Techniques. Medium
Customer Interviews & The Mom Test
- Book Summary: The Mom Test by Rob Fitzpatrick. Medium
- The Mom Test for Better Customer Interviews. Looppanel
- The Mom Test by Rob Fitzpatrick [Actionable Summary]. Durmonski.com
- How to Evaluate Customer Validation in Early Stages. Golden Egg Check
Jobs-to-Be-Done Framework
- Jobs to be Done 101: Your Interviewing Style Primer. Dscout
- How To Get Results From Jobs-to-be-Done Interviews. Jobs-to-be-Done
- A Script to Kickstart JTBD Interviews. JTBD.info
Product-Market Fit & Surveys
- Sean Ellis Product Market Fit Survey Template. Zonka Feedback
- How to Use the Product/Market Fit Survey. Lean B2B
- Product Market-Fit Questions: Tips and Examples. Qualaroo
- Product/Market Fit Survey by Sean Ellis. PMF Survey
Pricing Validation Methods
- Willingness to Pay: What It Is and How to Find It. Baremetrics
- Pricing Products - Van Westendorp Model. First Principles
- How To Price Your Product: Van Westendorp Guide. Forbes
- Gabor Granger vs Van Westendorp Models. Drive Research
Smoke Tests & Fake Door Testing
- Smoke Tests in Market Research - Complete Guide. Horizon
- Fake Door Testing - How it Works, Benefits & Risks. Chameleon.io
- High Hurdle Product Experiment. Learning Loop
- Fake Door Testing: Measuring User Interest. UXtweak
Conversion Benchmarks & Metrics
- Landing Page Statistics 2025: 97+ Stats. Marketing LTB
- Understanding Landing Page Conversion Rates 2025. Nudge
- What Is A Good Waitlist Conversion Rate? ScaleMath
- Average Ad Click Through Rates (CTRs). Smart Insights
Decision Making & Kill Criteria
- From Test Results to Business Decisions. M Accelerator
- Kill Criteria for Product Managers. Medium
- When to Kill Your Venture - Session Recap. Bundl
This playbook synthesizes research from Lean Startup methodology, Jobs-to-Be-Done theory, behavioral economics, and validation frameworks. Some book links may be affiliate links.