Chapter 5: Behavioral Economics & Smoke Testing
Measuring revealed preferences through landing pages and "fake doors".
Actions Speak Louder Than Words
When someone says "I'd buy that," they're giving you an opinion. When they actually try to buy it, they're giving you evidence.
Economists call this the difference between Stated Preference (what people say) and Revealed Preference (what people do). For validation, only revealed preference counts. This distinction, first articulated by economist Paul Samuelson in 1938, remains one of the most important concepts in customer validation. Samuelson's insight was that we can learn more about someone's preferences from observing their behavior than from asking them questions -- because behavior requires trade-offs, while opinions are free.
The gap between stated and revealed preference is well documented. In a classic study published in the Journal of Marketing Research, only 45% of consumers who stated they would purchase an organic product actually did so when given the opportunity -- less than half. For novel products (like those most startups build), the gap is even wider. The implication for founders is clear: design your experiments to observe behavior, not collect opinions.
Stated Preference
"I would definitely use an app that does X."
Reality: People are polite. They want to be helpful. They have no skin in the game. This tells you almost nothing. The person saying this would say the same thing about dozens of other hypothetical products -- it costs them nothing to be supportive.
Revealed Preference
Person clicks "Pre-order for $99" and enters credit card info.
Reality: They had to overcome friction. They risked something. This is real signal. The act of entering a credit card number involves a psychological commitment that transforms casual interest into genuine intent.
The Currency of Validation
The only valid currency is Skin in the Game: time, money, or reputation. If someone isn't willing to invest at least one of these, their "interest" is meaningless.
Nassim Nicholas Taleb popularized this concept in his book of the same name. The principle is simple: people's opinions become reliable only when they bear consequences for being wrong. A friend who says "I'd invest in your startup" means nothing until they write a check. A user who says "I love this feature" means nothing until they use it repeatedly. Design your experiments to require skin in the game, and you'll separate real demand from polite encouragement.
The Commitment Spectrum
Not all skin-in-the-game commitments are equal. Daniel Kahneman's work on loss aversion shows that people weight potential losses roughly twice as heavily as equivalent gains. This means financial commitments are especially powerful validation signals -- the customer is risking money they could lose. Understanding the commitment spectrum helps you choose the right level of friction for each experiment.
| Commitment Type | Example | Evidence Quality | Best For |
|---|---|---|---|
| Verbal | "I'd probably use that" | Very Low | Nothing -- avoid relying on this |
| Time | Attends a demo, fills out a survey, joins a call | Low-Medium | Early-stage problem validation |
| Reputation | Introduces you to colleagues, shares your landing page | Medium | Network-effect products, referral-driven models |
| Financial | Pre-order, deposit, letter of intent, subscription signup | High | Demand validation, willingness-to-pay testing |
The most powerful experiments combine multiple commitment types. When a customer pre-orders your product AND recommends it to three colleagues, you're seeing financial commitment plus reputational commitment -- an exceptionally strong signal. The LeanPivot Assumption Mapper helps you identify which assumptions need which level of commitment evidence to be considered validated.
Smoke Tests: The "Fake Door" Method
A smoke test offers a product that doesn't fully exist yet to see if anyone tries to buy it. It's the ultimate test of desirability. The term comes from hardware engineering, where a "smoke test" is the first test of a new circuit: plug it in and see if smoke comes out. In software validation, a smoke test plugs your value proposition into the market and sees if demand comes out.
The beauty of smoke tests is their speed and cost. While building a minimum viable product might take weeks or months, a smoke test can be set up in a weekend for under $500. If the test fails, you've saved yourself months of wasted development. If it succeeds, you have real behavioral data to inform what you build next.
Landing Page Test
Create a simple page describing your value prop. Drive traffic via ads. Measure clicks on "Sign Up" or "Buy Now." This is the simplest and most common smoke test -- and often the most informative.
Cost: $50-500 in ads
Timeline: 1-2 weeks
Best for: Testing value proposition and messaging at scale
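To make the economics concrete, here's a minimal sketch (Python, with hypothetical numbers) of the two figures a landing page test produces: conversion rate and cost per signup. Substitute your own ad spend and results.

```python
# Landing page smoke test arithmetic. All numbers are hypothetical
# placeholders -- plug in your own ad spend and results.

ad_spend = 200.00   # total ad spend in dollars
visitors = 1_250    # unique landing page visitors from the ads
signups = 38        # clicks on "Pre-order" or "Join waitlist"

conversion_rate = signups / visitors   # 0.0304
cost_per_signup = ad_spend / signups   # ~5.26

print(f"Conversion rate: {conversion_rate:.1%}")   # 3.0%
print(f"Cost per signup: ${cost_per_signup:.2f}")  # $5.26
```

Cost per signup matters as much as conversion rate: it tells you whether acquiring a customer through this channel could ever be profitable at your price point.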
Concierge MVP
Manually perform the service your software will automate. Zappos started by manually buying shoes from stores and shipping them. Food on the Table started by personally grocery shopping for customers.
Cost: Your time
Timeline: Ongoing
Best for: Validating the entire customer experience and willingness to pay
Wizard of Oz
Users interact with a frontend that looks finished, but humans do the work behind the scenes. From the customer's perspective, the product works -- they just don't know there's no automation behind the curtain.
Cost: Basic UI + manual work
Timeline: 2-4 weeks
Best for: Testing products where the experience matters as much as the outcome
Smoke Test Case Studies
These aren't hypothetical -- they're real examples of founders who used smoke tests to validate (or invalidate) demand before investing in development:
Dropbox: The Explainer Video
Drew Houston couldn't easily build a working demo of file syncing, so he made a 3-minute video showing the product concept. He posted it to Hacker News and measured waitlist signups. The waitlist went from 5,000 to 75,000 overnight.
What it validated: Massive demand for seamless file syncing. The product didn't exist yet, but the revealed preference (signing up for a waitlist after watching a video) proved people wanted it badly enough to take action.
Food on the Table: Concierge to Scale
Manuel Rosso's meal planning service started with him personally visiting a customer's home every week, asking about dietary preferences, checking grocery store sales, and creating a custom meal plan. For one customer. He didn't write a line of code until he'd manually served a dozen customers and understood exactly what they valued.
What it validated: Willingness to pay, the exact workflow customers needed, and which parts of the service could be automated without reducing value.
Rent the Runway: The Trunk Show
Before building an online dress rental platform, Jennifer Hyman and Jennifer Fleiss rented designer dresses and set up a pop-up on a college campus. Students could try on dresses and rent them on the spot. The conversion rate was astounding -- proving the concept before a single line of code was written.
What it validated: Women would rent rather than buy designer dresses, removing the core desirability risk.
The Hustle: Email Before Everything
Sam Parr validated demand for a tech news newsletter by simply starting one -- writing it himself, sending it to friends, and measuring forwarding rates and organic signups. No app. No platform. Just an email list and compelling content. It eventually grew to 1.5 million subscribers before being acquired by HubSpot.
What it validated: Audience demand for a specific content format and editorial voice, proven by organic growth without paid acquisition.
Ethical Fake Doors
When running fake door tests, treat users with respect. If they click "Buy," tell them you're in beta or taking early signups -- never charge their card without delivering. Offer a discount for launch as thanks for their interest.
Transparency builds trust. A well-handled fake door test can actually create your earliest brand advocates -- people who feel special for being early. The landing page message might read: "We're building [product name] and aren't quite ready yet. Join our founding members list for 30% off at launch and direct access to our product team." This is honest, respectful, and still generates a strong demand signal.
Designing Your Experiment
Every experiment needs a clear hypothesis and success criteria -- defined before you run it. This is non-negotiable. If you define success after seeing the results, you'll unconsciously adjust the threshold to match what happened -- a form of confirmation bias called "moving the goalposts." Write your hypothesis and success criteria down, share them with a co-founder or advisor, and don't change them once the experiment starts.
The Experiment Template
We believe that [target audience]
will [take this action]
for [this reason/value prop].
We'll know we're right if [metric] reaches [threshold] within [timeframe].
Example: "We believe that freelance designers will pre-order our cash flow tool ($29/month) because they're frustrated with unpredictable income. We'll know we're right if 50 people sign up in 2 weeks from $200 in Facebook ads."
Notice the specificity. The hypothesis names the audience (freelance designers), the action (pre-order), the value proposition (frustrated with unpredictable income), the metric (signups), the threshold (50), and the timeframe (2 weeks). There's no ambiguity about what success looks like. If you get 49 signups, that's a miss -- not "close enough." If you get 51, that's a hit. This level of precision is what separates rigorous validation from wishful thinking.
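If you want to enforce that discipline in code, here's a minimal sketch of the template as a frozen data structure, using the freelance-designer example above. The field names are illustrative, not any particular tool's schema.

```python
# The experiment template as a data structure. Freezing the dataclass
# means the success criteria can't be quietly edited mid-experiment.
# Field names are illustrative, not any particular tool's schema.
from dataclasses import dataclass

@dataclass(frozen=True)
class Hypothesis:
    audience: str
    action: str
    value_prop: str
    metric: str
    threshold: int
    timeframe_days: int

    def evaluate(self, observed: int) -> str:
        # Binary verdict against the pre-registered threshold:
        # 49 signups is a miss, 51 is a hit -- no "close enough".
        return "validated" if observed >= self.threshold else "invalidated"

h = Hypothesis(
    audience="freelance designers",
    action="pre-order our cash flow tool ($29/month)",
    value_prop="frustrated with unpredictable income",
    metric="pre-order signups",
    threshold=50,
    timeframe_days=14,
)
print(h.evaluate(49))  # invalidated
print(h.evaluate(51))  # validated
```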
Setting the Right Threshold
How do you decide what threshold counts as "success"? Here are some guidelines (a sketch applying them follows the list):
- Landing page conversion: A cold traffic conversion rate of 2-5% is typical for a compelling value proposition. Below 1% suggests weak messaging or the wrong audience. Above 5% is a strong product-market fit signal.
- Email signup to engagement: If 40%+ of email signups open your first email, your audience is genuinely interested. Below 20% suggests they signed up impulsively.
- Waitlist to pre-order conversion: If 10-15% of waitlist members convert to paid when you launch, you have strong demand. Below 5% suggests the "interest" was superficial.
- Ad click-through rate: A CTR above 2% on Facebook/Instagram suggests strong message-market fit. Below 0.5% suggests your messaging doesn't resonate with the audience you're targeting.
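As a sketch of how those bands translate into a verdict, here's the landing page benchmark expressed as a function. The bands come from the list above; the in-between 1-2% "borderline" label is our own addition to cover the gap the guidelines leave open.

```python
# Classify a cold-traffic landing page conversion rate against the
# benchmark bands above. The 1-2% "borderline" band is our own label
# for the gap between "weak" (<1%) and "typical" (2-5%).

def classify_landing_conversion(rate: float) -> str:
    if rate < 0.01:
        return "below 1%: weak messaging or wrong audience"
    if rate < 0.02:
        return "1-2%: borderline -- iterate on copy or targeting"
    if rate <= 0.05:
        return "2-5%: typical for a compelling value proposition"
    return "above 5%: strong product-market fit signal"

print(classify_landing_conversion(0.008))  # below 1%: weak messaging or wrong audience
print(classify_landing_conversion(0.034))  # 2-5%: typical for a compelling value proposition
print(classify_landing_conversion(0.07))   # above 5%: strong product-market fit signal
```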
The Anatomy of a Well-Designed Smoke Test
David Bland and Alex Osterwalder, in Testing Business Ideas, outline 44 different experiment types. But regardless of which type you choose, every well-designed smoke test shares the same structural elements. Missing any one of these increases the risk of ambiguous or misleading results:
Before the Test
- Specific hypothesis -- using the template above, no vague "let's see what happens"
- Target audience -- exactly who you're reaching (demographics, psychographics, channels)
- Kill criteria -- the threshold below which you pivot or kill, defined before launch
- Gray zone -- the range between clear success and clear failure, with prescribed follow-up actions
- Timeline -- how long the experiment runs, with no extensions allowed
After the Test
- Raw data -- all metrics collected, not just the ones that support your preferred conclusion
- Segmented analysis -- break results down by audience segment, channel, and time period
- Confounding factors -- anything that might have influenced results (holidays, competitor launches, viral sharing)
- Decision -- persevere, pivot, or kill, based strictly on pre-defined criteria
- Lean Vault entry -- full documentation for institutional memory (a sketch of such a record follows this list)
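One way to keep all of these elements together is a single record per experiment. Here's an illustrative sketch of what such an entry might look like as a data structure; the schema is our own invention, not the actual Lean Vault format.

```python
# Illustrative Lean Vault entry: before-launch criteria locked in one
# place, after-test results appended to the same record. The schema is
# our own sketch, not the actual Lean Vault format.
from dataclasses import dataclass, field

@dataclass
class LeanVaultEntry:
    # Defined BEFORE launch
    hypothesis: str
    target_audience: str
    kill_threshold: int      # below this: pivot or kill
    success_threshold: int   # at or above this: persevere
    timeline_days: int
    # Filled in AFTER the experiment ends
    raw_results: dict = field(default_factory=dict)
    segments: dict = field(default_factory=dict)
    confounders: list = field(default_factory=list)
    decision: str = ""

    def decide(self, observed: int) -> str:
        # Strictly apply the pre-defined criteria; the gray zone gets
        # its prescribed follow-up action, not an improvised call.
        if observed >= self.success_threshold:
            self.decision = "persevere"
        elif observed < self.kill_threshold:
            self.decision = "pivot or kill"
        else:
            self.decision = "gray zone: run pre-defined follow-up"
        return self.decision
```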
The CTA Strength Ladder
The strength of your evidence depends on the friction of your call-to-action. This is a principle from behavioral economics: the more effort a behavior requires, the more strongly it indicates genuine motivation. A frictionless action (clicking a link) reveals almost nothing about intent. A high-friction action (entering credit card details) reveals a great deal.
| CTA | Friction | Signal Strength |
|---|---|---|
| "Learn More" | Very Low | Weak |
| "Sign up for updates" | Low | Moderate |
| "Join waitlist" (with email) | Medium | Good |
| "Pre-order for $X" (credit card) | High | Strong |
Pro Tip: Maximize Friction
Counter-intuitively, you want more friction in your test. A "Learn More" click means almost nothing. A credit card entry (even if you don't charge) is strong evidence of real intent.
Think of it like a filtration system: each layer of friction filters out the mildly curious and leaves you with the genuinely motivated. Yes, your conversion numbers will be smaller. But smaller numbers of genuinely interested people are infinitely more valuable than large numbers of casually curious ones. You'd rather know that 20 people were willing to enter their credit card than that 2,000 people clicked "Learn More."
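Here's the filtration effect in numbers. The pass-through rates at each layer are hypothetical; the point is the shape of the funnel, not the exact figures.

```python
# A friction funnel with hypothetical pass-through rates. Each layer
# filters out the mildly curious and keeps the genuinely motivated.

visitors = 2_000
funnel = [
    ("clicked 'Learn More'",        0.40),  # low friction, weak signal
    ("joined waitlist with email",  0.15),  # medium friction
    ("entered credit card details", 0.17),  # high friction, strong signal
]

remaining = visitors
for step, pass_rate in funnel:
    remaining = int(remaining * pass_rate)
    print(f"{step}: {remaining}")

# clicked 'Learn More': 800
# joined waitlist with email: 120
# entered credit card details: 20
```

Those 20 credit card entries are the signal; the 800 "Learn More" clicks are mostly noise.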
Running Multiple Experiments in Parallel
One of the most common mistakes is running experiments sequentially when they could run in parallel. If you have three assumptions to test and each experiment takes two weeks, sequential testing takes six weeks. Running them in parallel takes two weeks. At the validation stage, speed matters enormously because your runway is finite.
The key constraint is that parallel experiments should test independent assumptions. If Experiment B depends on the outcome of Experiment A, they must run sequentially. But if you're testing a desirability assumption, a pricing assumption, and a channel assumption simultaneously, all three can run in parallel because they don't affect each other's results.
The Parallel Experiment Framework
Use this decision tree to determine whether two experiments can run in parallel:
- Do the experiments test different assumptions? If yes, proceed. If they test the same assumption from different angles, they can still run in parallel as triangulation.
- Does one experiment's result change the other's design? If yes, run them sequentially. If no, run them in parallel.
- Do they compete for the same audience? If you're running two Facebook ad experiments targeting the same segment, they may contaminate each other's results. Use different channels or segments to keep them independent.
- Can you track results independently? Use separate landing pages, distinct UTM parameters, and isolated conversion funnels (see the sketch after this list). If you can't tell which experiment produced which result, you can't run them in parallel.
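As a sketch of that last point, here's one way to stamp each parallel experiment with its own UTM-tagged URL so results never blur together. The experiment names, channels, and URLs are illustrative placeholders.

```python
# One landing page and one UTM-tagged URL per experiment, so every
# conversion traces back to exactly one experiment. All names and
# URLs below are illustrative placeholders.
from urllib.parse import urlencode

BASE = "https://example.com"

experiments = {
    # experiment name          (channel,      landing page slug)
    "desirability-smoke-test": ("facebook",   "/waitlist"),
    "pricing-29-vs-49":        ("linkedin",   "/pre-order"),
    "channel-newsletter-ads":  ("newsletter", "/founding-members"),
}

for name, (channel, slug) in experiments.items():
    params = urlencode({
        "utm_source": channel,
        "utm_medium": "paid",
        "utm_campaign": name,  # ties every visit back to one experiment
    })
    print(f"{BASE}{slug}?{params}")
```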
What You Walk Away With
- Revealed Preference Data: Evidence of what people actually do, not just say -- the gold standard of validation evidence.
- Conversion Metrics: Real numbers from your smoke test that you can compare against industry benchmarks.
- Validated (or Invalidated) Demand: Proof that people will take action for your solution, backed by behavioral data rather than opinions.
- Experiment Artifacts: Landing pages, ad copy, and results for your Lean Vault -- reusable assets for future experiments and investor conversations.
- Experiment Design Skills: The ability to write clear hypotheses, set appropriate thresholds, and design experiments that produce unambiguous results -- a skill that pays dividends throughout your entire startup journey.
Design Your Smoke Test
Create landing page tests, concierge MVPs, and fake door experiments with our Market Signal Tester. Includes hypothesis templates, threshold recommendations, and CTA optimization guidance.
Works Cited & Recommended Reading
Lean Startup & Innovation Accounting
- Navigating the 2026 AI-Native Enterprise Stack. LeanPivot.ai
- Validated Learning Techniques. LeanPivot.ai
- How to Make "Pivot or Persevere" Decisions. Kromatic
- Lean Methodology - Innovation Accounting Guide. SixSigma.us
- Running Lean, Second Edition. BEL Initiative
Assumption Mapping & Testing
- Invest in Winning Ideas with Assumption Mapping. Miro
- Testing Business Ideas: Book Summary. Strategyzer
- Innovation Tools – The Assumption Mapper. Nico Eggert
- Business Testing: Is your Hypothesis Really Validated? Strategyzer
- An Introduction to Assumptions Mapping. Mural
- Assumption Mapping Techniques. Medium
Customer Interviews & The Mom Test
- Book Summary: The Mom Test by Rob Fitzpatrick. Medium
- The Mom Test for Better Customer Interviews. Looppanel
- The Mom Test by Rob Fitzpatrick [Actionable Summary]. Durmonski.com
- How to Evaluate Customer Validation in Early Stages. Golden Egg Check
Jobs-to-Be-Done Framework
- Jobs to be Done 101: Your Interviewing Style Primer. Dscout
- How To Get Results From Jobs-to-be-Done Interviews. Jobs-to-be-Done
- A Script to Kickstart JTBD Interviews. JTBD.info
Product-Market Fit & Surveys
- Sean Ellis Product Market Fit Survey Template. Zonka Feedback
- How to Use the Product/Market Fit Survey. Lean B2B
- Product Market-Fit Questions: Tips and Examples. Qualaroo
- Product/Market Fit Survey by Sean Ellis. PMF Survey
Pricing Validation Methods
- Willingness to Pay: What It Is and How to Find It. Baremetrics
- Pricing Products - Van Westendorp Model. First Principles
- How To Price Your Product: Van Westendorp Guide. Forbes
- Gabor Granger vs Van Westendorp Models. Drive Research
Smoke Tests & Fake Door Testing
- Smoke Tests in Market Research - Complete Guide. Horizon
- Fake Door Testing - How it Works, Benefits & Risks. Chameleon.io
- High Hurdle Product Experiment. Learning Loop
- Fake Door Testing: Measuring User Interest. UXtweak
Conversion Benchmarks & Metrics
- Landing Page Statistics 2025: 97+ Stats. Marketing LTB
- Understanding Landing Page Conversion Rates 2025. Nudge
- What Is A Good Waitlist Conversion Rate? ScaleMath
- Average Ad Click Through Rates (CTRs). Smart Insights
Decision Making & Kill Criteria
- From Test Results to Business Decisions. M Accelerator
- Kill Criteria for Product Managers. Medium
- When to Kill Your Venture - Session Recap. Bundl
This playbook synthesizes research from Lean Startup methodology, Jobs-to-Be-Done theory, behavioral economics, and validation frameworks. Some book links may be affiliate links.