Chapter 3 of 12

The Deployment Runbook: Minute-by-Minute Execution

Creating granular, time-bound scripts for launch day to reduce cognitive load and mitigate risk.

What You'll Learn: How to build runbooks with clear tasks, owners, and rollbacks, plus playbooks for different launch types, dependency management, validation protocols, and automation strategies that reduce human error during the most critical hours of your product's life.

The runbook is the brain of the launch. It replaces vague plans with exact, owned tasks. Decisions are made in advance -- you just execute. A well-crafted runbook transforms launch day from an improvised performance into a rehearsed production. Every team member knows precisely what they need to do, when they need to do it, who depends on them, and what to do if something goes wrong.

The runbook has a close parallel in spaceflight launch operations such as NASA's mission control procedures, where every action during a launch sequence is scripted, rehearsed, and assigned to a specific console operator. SpaceX automates the vast majority of its launch sequence, yet human operators still monitor the countdown and can halt it. Your product launch deserves the same rigor. The stakes may not be lives, but they are livelihoods: your team's, your investors', and your customers'.

The fundamental insight behind runbook-driven launches is that cognitive capacity is a finite resource. Under stress, humans make poor decisions. The runbook pre-loads decisions during calm, rational planning sessions so that the stressed, sleep-deprived version of your team simply follows instructions rather than inventing solutions on the fly. As Atul Gawande documents in The Checklist Manifesto, procedural checklists reduce error rates even among experts such as surgical teams and airline pilots, and the same logic applies to software deployment engineers.

Runbook Architecture

A robust runbook is a structured database of actions, not a casual document. It must include precise fields that eliminate ambiguity and create accountability:

  • Task ID: Unique identifier (e.g., OPS-01) for tracking dependencies and referencing in War Room communications. IDs should follow a naming convention: PRE (pre-launch), TECH (technical), GO (go-live), MKT (marketing), SOC (social), OPS (operations), SUP (support).
  • Action: Specific command (e.g., "Merge PR #405 to main branch and trigger CI/CD pipeline"). Vague instructions like "Check system" are forbidden. If a task cannot be described in a single imperative sentence, it needs to be decomposed into smaller tasks.
  • Owner: Named individual, not a team. "DevOps" is not an owner; "Sarah Chen, DevOps Lead" is an owner. Single thread of accountability means one person is responsible for confirming completion, even if multiple people contribute.
  • Duration: Allotted time to identify delays immediately. If a 15-minute task is still running at 25 minutes, that is a signal to escalate. Include both expected duration and maximum acceptable duration.
  • Rollback Plan: Specific "Undo" steps for every technical action. "Revert" is not a rollback plan. "Run kubectl rollout undo deployment/app-server --to-revision=42" is a rollback plan. Include the exact commands, the expected outcome, and the verification step.
  • Prerequisites: What must be true before this task can begin. "TECH-01 complete" or "Database snapshot verified" or "QA sign-off received." These prevent tasks from starting prematurely.
  • Verification: How you confirm the task completed successfully. A deployment is not "done" when the command finishes -- it is "done" when the health check returns 200 OK on all endpoints.
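
The fields above map naturally onto a small record type. The following is a minimal sketch in Python, not a prescribed schema: the class name, field names, and the `validate` helper are hypothetical, and the prefix list simply mirrors the naming convention plus the SOC prefix used in the countdown table.

```python
from dataclasses import dataclass, field

# Prefixes from the naming convention; SOC also appears in the countdown table.
ALLOWED_PREFIXES = {"PRE", "TECH", "GO", "MKT", "SOC", "OPS", "SUP"}


@dataclass
class RunbookTask:
    task_id: str            # e.g. "TECH-03"
    action: str             # one imperative sentence
    owner: str              # a named individual, not a team
    expected_minutes: int   # expected duration
    max_minutes: int        # maximum acceptable duration
    rollback: str           # exact undo steps, or "N/A"
    verification: str       # how completion is confirmed
    prerequisites: list[str] = field(default_factory=list)

    def should_escalate(self, elapsed_minutes: float) -> bool:
        """True once the task has reached its maximum acceptable duration."""
        return elapsed_minutes >= self.max_minutes


def validate(task: RunbookTask) -> list[str]:
    """Return a list of problems; an empty list means the entry looks usable."""
    problems = []
    prefix, _, number = task.task_id.partition("-")
    if prefix not in ALLOWED_PREFIXES or not number.isdigit():
        problems.append(f"{task.task_id}: ID does not follow the naming convention")
    if task.max_minutes < task.expected_minutes:
        problems.append(f"{task.task_id}: maximum duration is below expected duration")
    if not task.rollback.strip():
        problems.append(f"{task.task_id}: rollback is empty (write N/A if none applies)")
    return problems
```

Storing tasks this way makes it possible to check the whole runbook mechanically before the dry run, for example by calling `validate` on every entry and refusing to proceed while any problems remain.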

The T-Minus Countdown Sequence

The countdown sequence choreographs every action from pre-launch preparation through the first hour of live operation. The timeline below represents a typical SaaS product launch. Adjust the timing for your specific context, but preserve the structure: pre-launch preparation, technical deployment, go-live activation, and marketing amplification should always follow this order.

| Time | ID | Task Description | Owner | Rollback |
|---|---|---|---|---|
| T-24H | PRE-01 | Final "Go/No-Go" Decision Meeting | Launch Captain | N/A |
| T-12H | PRE-02 | Code freeze enforced. Branch protection enabled on main. | Eng Lead | Lift freeze + delay launch |
| T-4H | PRE-03 | Notify on-call teams. Confirm all shift assignments. | Ops Lead | N/A |
| T-2H | OPS-01 | Establish "War Room" (Slack/Zoom). Pin escalation matrix. | Ops Lead | N/A |
| T-60m | TECH-01 | Take production database snapshot. Verify restore procedure. | Backend Lead | N/A (safety net) |
| T-45m | TECH-02 | Execute DB Migration Scripts. Verify schema changes. | Backend Lead | Restore DB Snapshot (TECH-01) |
| T-30m | TECH-03 | Deploy application to production (Blue-Green swap). | DevOps | Route traffic back to Blue environment |
| T-15m | TECH-04 | Smoke Test in Production (Internal IPs Only) | QA Lead | Revert deployment + migration |
| T-5m | TECH-05 | Final status check in War Room. All owners report Green/Red. | Launch Captain | Abort launch |
| T-00m | GO-01 | Flip Feature Flags to "ON" (100% traffic) | Product Mgr | Flip Flags "OFF" |
| T+02m | MKT-01 | Publish Website & Pricing Page updates. | Web Team | Revert CMS to previous version |
| T+05m | OPS-02 | First health check: error rates, latency, DB connections. | DevOps | Scale Down / Revert |
| T+10m | MKT-02 | Send launch email to waitlist. Verify delivery metrics. | Marketing | Pause email send job |
| T+15m | SOC-01 | Publish social media posts. Post to Product Hunt. | Founder | N/A |
| T+30m | OPS-03 | Second health check + first SitRep to stakeholders. | Launch Captain | N/A |
| T+60m | OPS-04 | Full status report. Confirm marketing campaigns running. | Launch Captain | N/A |

Notice the deliberate sequencing: technical deployment completes and is verified before any marketing activity begins. This is non-negotiable. If you send a launch email to 50,000 people before confirming that the signup flow works, you are converting marketing investment into support tickets rather than customers. The two-minute gap between GO-01 and MKT-01 is intentional: it gives the team a short buffer to confirm that real user traffic is flowing through the system before the website changes go live. The launch email (MKT-02) waits until after the first health check (OPS-02).
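
The countdown labels can be turned into wall-clock times and checked for ordering mistakes. This is a rough sketch only: the function names are hypothetical, the label format is inferred from the table (`T-24H`, `T-45m`, `T+05m`), and the check treats any MKT or SOC task scheduled at or before the last TECH task as a sequencing error.

```python
import re
from datetime import datetime, timedelta

_MINUTES_PER_UNIT = {"H": 60, "m": 1}


def offset_minutes(label: str) -> int:
    """Convert a label such as 'T-24H' or 'T+05m' to signed minutes from go-live."""
    match = re.fullmatch(r"T([+-])(\d+)([Hm])", label)
    if match is None:
        raise ValueError(f"unrecognised offset: {label!r}")
    sign, amount, unit = match.groups()
    minutes = int(amount) * _MINUTES_PER_UNIT[unit]
    return -minutes if sign == "-" else minutes


def wall_clock(go_live: datetime, label: str) -> datetime:
    """Translate a countdown label into an absolute time."""
    return go_live + timedelta(minutes=offset_minutes(label))


def marketing_scheduled_too_early(schedule: list[tuple[str, str]]) -> list[str]:
    """Return IDs of MKT/SOC tasks scheduled at or before the last TECH task."""
    last_tech = max(
        offset_minutes(label) for label, task_id in schedule if task_id.startswith("TECH")
    )
    return [
        task_id
        for label, task_id in schedule
        if task_id.startswith(("MKT", "SOC")) and offset_minutes(label) <= last_tech
    ]
```

For a 9:00 go-live, `wall_clock(go_live, "T-45m")` gives 8:15, which is the time to print next to TECH-02 in the version of the runbook the team actually reads on launch day.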

Automated vs. Manual Runbooks

Use scripts where possible to cut human error. But keep manual checkpoints for key business decisions. The distinction is important: automation eliminates the risk of a human forgetting a step or typing a command incorrectly. Manual checkpoints preserve the ability to exercise judgment when unexpected situations arise. The goal is not full automation -- it is thoughtful automation of the predictable, with human oversight of the unpredictable.

Automate These

Tasks where human error is the primary risk and where the action is deterministic:

  • Database migrations and rollback scripts (use tools like Flyway, Alembic, or Django migrations)
  • Infrastructure scaling (Kubernetes HPA, AWS Auto Scaling Groups, Terraform apply)
  • Health checks and smoke tests (scripted HTTP requests to critical endpoints)
  • Feature flag toggles via API (LaunchDarkly, Flagsmith, Unleash)
  • CDN cache invalidation (CloudFront, Fastly, Cloudflare API calls)
  • Status page updates via API (Statuspage, Instatus webhooks)
  • Monitoring dashboard provisioning (Grafana, Datadog dashboard-as-code)
  • SSL certificate verification scripts

Keep Manual

Tasks where judgment, context, or business authority is required:

  • Go/No-Go decision checkpoints (requires human assessment of aggregate risk)
  • Marketing campaign activation triggers (timing depends on real-time conditions)
  • Customer communication sends (tone and timing require judgment)
  • Rollback decisions (requires weighing business impact vs. technical risk)
  • Incident escalation calls (severity assessment needs human context)
  • Press embargo lifts (coordination with journalists requires flexibility)
  • Pricing page publication (final approval from business leadership)
  • Partner notification sends (relationship management requires personal touch)

Automation Best Practice: The Wrapper Script

Create a single deployment script that orchestrates the automated steps. The script should pause at manual checkpoints, display the current status, and wait for explicit human confirmation before proceeding. This pattern -- sometimes called "human-in-the-loop automation" -- gives you the reliability of automation with the judgment of human oversight.

Example: your deployment script runs the database migration, verifies schema changes, deploys the application, runs the health check, and then pauses with the message: "All automated checks passed. Type CONFIRM to proceed to feature flag activation, or ABORT to roll back." This approach reduces the number of human decisions from dozens to a handful, concentrating human attention on the decisions that matter most.
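
A wrapper of this kind can be sketched in a few dozen lines. The version below is an illustration under stated assumptions, not a production tool: `Step`, `run_with_checkpoint`, and the callables passed in are hypothetical stand-ins for your real migration, deploy, and health-check commands, and each step is assumed to return True only after its own verification has passed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    task_id: str
    run: Callable[[], bool]                         # True means verified success
    rollback: Optional[Callable[[], None]] = None   # None means nothing to undo


PROMPT = (
    "All automated checks passed. Type CONFIRM to proceed to "
    "feature flag activation, or ABORT to roll back: "
)


def _roll_back(steps: list[Step], log: Callable[[str], None]) -> None:
    # Undo in reverse order so later changes are reverted before earlier ones.
    for step in reversed(steps):
        if step.rollback is not None:
            log(f"[{step.task_id}] rolling back")
            step.rollback()


def run_with_checkpoint(steps, activate_flags, ask=input, log=print) -> str:
    """Run automated steps, then pause for an explicit human decision."""
    attempted: list[Step] = []
    for step in steps:
        attempted.append(step)  # a failed step may be half-applied, so undo it too
        log(f"[{step.task_id}] running")
        if not step.run():
            log(f"[{step.task_id}] failed")
            _roll_back(attempted, log)
            return "failed"
    while True:
        answer = ask(PROMPT).strip().upper()
        if answer == "CONFIRM":
            activate_flags()
            return "launched"
        if answer == "ABORT":
            _roll_back(attempted, log)
            return "aborted"
        log("Please type CONFIRM or ABORT.")
```

Passing `ask` and `log` as parameters is a deliberate choice in this sketch: it lets the dry run exercise the same script with scripted answers instead of a live operator.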

Runbook Validation: The Dry Run

A runbook is only as good as its last rehearsal. Run a full dry run before launch. The dry run serves three purposes: it validates that the runbook's instructions are accurate and complete, it trains the team on the execution flow, and it reveals timing gaps that would cause delays on launch day. Teams that skip the dry run almost always discover gaps during the real launch -- and those gaps create panic, delays, and errors.

The rehearsal concept is borrowed from theater and the military, both of which understood centuries ago that complex coordinated performances cannot succeed without practice. Your launch is a performance -- multiple people executing coordinated actions under time pressure with an audience watching. No theater company would open a show without a dress rehearsal. No military unit would execute an operation without a walkthrough. Your launch deserves the same discipline.

The Dress Rehearsal

Execute the entire runbook in a staging environment 48-72 hours before launch. This timing is critical: early enough to fix issues discovered during the rehearsal, but late enough that the staging environment closely mirrors what production will look like on launch day.

  • Same Time: Run at the same time of day as the actual launch. If your launch is at 9 AM EST, your rehearsal is at 9 AM EST. This validates that all team members are available and that no time-zone confusion exists.
  • Same Channels: Use the actual War Room Slack channel (mark messages as "[DRY RUN]" to avoid confusion). This validates that notification routing, channel permissions, and tool integrations work correctly.
  • Same People: All owners must participate -- no proxies, no substitutes. If someone sends a proxy, you have just identified a bus factor risk. What happens if that person is unavailable on launch day?
  • Same Pressure: Inject fake "incidents" to test response protocols. Have someone announce: "[DRY RUN] ALERT: Error rate spiked to 5% on the /api/signup endpoint." Watch how the team responds. Does the escalation matrix work? Does the right person take ownership?
  • Document Gaps: Every "uh, what do I do now?" moment becomes a runbook update. Every question asked is evidence of an ambiguous instruction. Every delay reveals an underestimated duration. Update the runbook within 24 hours of the dry run.
  • Time Every Step: Compare actual durations to estimated durations. If your 15-minute database migration takes 40 minutes in staging, it will take at least that long in production (likely longer, due to larger data volumes). Adjust the timeline accordingly.
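
Comparing rehearsal timings against estimates is simple enough to script. The helper below is a hypothetical sketch: it assumes you recorded actual minutes per task ID during the dry run, and it raises each estimate to at least the observed time, following the rule that production will take at least as long as staging.

```python
def timing_overruns(estimates: dict[str, float], actuals: dict[str, float]):
    """Return (task_id, estimated, actual, ratio) for overruns, worst ratio first."""
    overruns = []
    for task_id, estimated in estimates.items():
        actual = actuals.get(task_id)
        if actual is not None and actual > estimated:
            overruns.append((task_id, estimated, actual, actual / estimated))
    overruns.sort(key=lambda row: row[3], reverse=True)
    return overruns


def revised_estimates(estimates: dict[str, float], actuals: dict[str, float]):
    """Never plan for less time than the rehearsal actually took."""
    return {
        task_id: max(estimated, actuals.get(task_id, estimated))
        for task_id, estimated in estimates.items()
    }
```

With an estimate of 15 minutes and a rehearsal time of 40 minutes for the migration, the overrun report lists that task first and the revised estimate becomes 40 minutes.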

After the dry run, hold a 30-minute debrief. Ask three questions: What surprised us? What took longer than expected? What is still unclear? The answers become immediate runbook revisions. A runbook that has survived a dry run is fundamentally different from one that has not -- it has been tested against reality and found sufficient.

Dependency Management: The Critical Path

Some tasks can't start until others finish. Map them to avoid delays. Dependency management is the art of identifying the longest chain of sequential tasks -- the critical path -- and ensuring that nothing on that path is blocked, delayed, or under-resourced. Any delay on the critical path delays the entire launch. Delays on non-critical paths can be absorbed if properly managed.

The critical path concept comes from project management methods developed in the 1950s: the Critical Path Method (CPM) and PERT (Program Evaluation and Review Technique), the latter developed by the U.S. Navy to manage the Polaris missile program. It remains one of the most powerful tools for managing complex, time-sensitive operations. For a product launch, the critical path typically runs through the technical deployment sequence: database migration, application deployment, smoke test, feature flag activation. Any delay in this chain cascades into marketing and communication delays.

| Task ID | Task | Depends On | Blocks | On Critical Path? |
|---|---|---|---|---|
| PRE-01 | Go/No-Go Decision | All readiness reports | TECH-01 (DB Snapshot) | Yes |
| TECH-01 | Take DB Snapshot | PRE-01 (Go Decision) | TECH-02 (Migration) | Yes |
| TECH-02 | Run DB Migrations | TECH-01 | TECH-03 (Deploy) | Yes |
| TECH-03 | Deploy Application | TECH-02 | TECH-04 (Smoke Test) | Yes |
| TECH-04 | Production Smoke Test | TECH-03 | GO-01 (Feature Flags) | Yes |
| GO-01 | Flip Feature Flags | TECH-04 | MKT-01 (Website) | Yes |
| MKT-01 | Publish Website Update | GO-01 | MKT-02 (Email Blast) | Yes |
| MKT-02 | Send Launch Email | MKT-01 | SOC-01 (Social Posts) | No (can run in parallel) |
| SUP-01 | Support Team Login | PRE-01 | None (parallel track) | No |
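
Once the dependencies are written down as data, two questions become easy to answer during the launch: which tasks may start now, and which tasks slip if one is delayed. The sketch below encodes the "Depends On" column of the table above; the function names are hypothetical, and SOC-01 is included only because the table lists it as blocked by MKT-02.

```python
DEPENDS_ON = {
    "PRE-01": [],
    "TECH-01": ["PRE-01"],
    "TECH-02": ["TECH-01"],
    "TECH-03": ["TECH-02"],
    "TECH-04": ["TECH-03"],
    "GO-01": ["TECH-04"],
    "MKT-01": ["GO-01"],
    "MKT-02": ["MKT-01"],
    "SOC-01": ["MKT-02"],
    "SUP-01": ["PRE-01"],
}


def ready_tasks(depends_on: dict[str, list[str]], done: set[str]) -> list[str]:
    """Tasks not yet done whose prerequisites are all complete."""
    return sorted(
        task
        for task, deps in depends_on.items()
        if task not in done and set(deps) <= done
    )


def downstream(depends_on: dict[str, list[str]], delayed: str) -> set[str]:
    """Every task that directly or indirectly waits on the delayed task."""
    affected: set[str] = set()
    frontier = [delayed]
    while frontier:
        current = frontier.pop()
        for task, deps in depends_on.items():
            if current in deps and task not in affected:
                affected.add(task)
                frontier.append(task)
    return affected
```

After the Go decision, `ready_tasks(DEPENDS_ON, {"PRE-01"})` returns the snapshot task and the support login, which matches the parallel track in the table. A delay in TECH-02 affects everything from TECH-03 through the social posts but leaves SUP-01 untouched.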

Dependency Anti-Patterns

These patterns have derailed launches. Audit your runbook for each one:

  • Circular Dependencies: Task A waits for B, B waits for A. This sounds impossible, but it happens when teams define implicit dependencies. Example: "Deploy app" waits for "CDN cache warmed," but CDN can only warm after the app is deployed. Audit your runbook by drawing a directed graph of dependencies. If you find a cycle, decompose the tasks further.
  • Hidden Dependencies: "Oh, we need Legal sign-off first." All approvals must be explicit in the runbook, with a specific person, a specific deadline, and a specific escalation path if the approval is not received on time. Legal sign-offs obtained on Day -1 do not count as "done" -- they must be confirmed on launch day.
  • Human Bottlenecks: One person owns 5 sequential tasks on the critical path. If they step away for 10 minutes, the entire launch stalls. Parallelize where possible, and always assign a backup owner who has the access and knowledge to execute the task.
  • External Dependencies: Your launch depends on a third-party action (e.g., "Apple approves our app update"). External dependencies should be resolved well before launch day, or the launch plan must account for the possibility of delay. Never put an external dependency on the critical path of a time-sensitive launch.
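
Two of these anti-patterns can be audited automatically. The sketch below uses Python's standard `graphlib` module to detect a circular dependency, and a simple count to flag owners who hold too many critical-path tasks. The function names and the `limit` threshold are assumptions for illustration; choose a threshold that suits your team.

```python
from collections import Counter
from graphlib import CycleError, TopologicalSorter


def find_cycle(depends_on: dict[str, list[str]]):
    """Return one dependency cycle as a list of task names, or None if there is none."""
    try:
        TopologicalSorter(depends_on).prepare()
    except CycleError as exc:
        return exc.args[1]  # graphlib reports the cycle with first and last node equal
    return None


def bottleneck_owners(owners: dict[str, str], critical_path: list[str], limit: int = 2):
    """Owners responsible for more than `limit` tasks on the critical path."""
    counts = Counter(owners[task_id] for task_id in critical_path)
    return {owner: count for owner, count in counts.items() if count > limit}
```

Running `find_cycle` on the CDN example, where "deploy-app" waits on "warm-cdn" and "warm-cdn" waits on "deploy-app", returns the offending loop so the tasks can be decomposed further.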

Launch-Specific Playbooks

Different launches need different runbooks. A Product Hunt launch, a soft launch to beta users, and an enterprise release to key accounts are fundamentally different operations requiring different timing, communication, and escalation strategies. One size does not fit all. Customize your runbook for the specific launch type, but keep the underlying structure consistent.

The Product Hunt Protocol

Product Hunt Mechanics

Product Hunt launches are unique because success is determined within a 24-hour window. The platform resets daily at midnight Pacific Time, and your position is determined by upvotes, comments, and engagement within that window. This creates an intense, time-boxed execution challenge that requires specific preparation.

Timing Strategy:

  • Launch at 12:01 AM PT to maximize your 24-hour visibility window
  • Tuesday-Thursday launches tend to draw the most traffic, but also the most competition; weigh visibility against how crowded the day is
  • Avoid holidays, major tech conferences, and Apple/Google event days
  • Check the upcoming launches calendar -- avoid days with well-funded competitors
  • Coordinate with a "Hunter" (established PH community member) to post your product

First Hour Tactics:

  • Maker posts first comment immediately with context on why you built this
  • Pre-notify your network (personal messages, not mass emails) to visit and engage
  • Respond to every comment within 15 minutes -- engagement drives algorithm ranking
  • Have a dedicated person monitoring and responding for the full 24 hours
  • Prepare a "teaser" GIF or video that auto-plays in the PH feed

Post-PH Action: Product Hunt traffic converts differently than organic traffic. Expect a high bounce rate (70-80%) and low activation rate from PH visitors. Track this cohort separately in your analytics. The real value of a successful PH launch is often the backlinks, press inquiries, and investor attention it generates -- not the direct signups.

The Soft Launch Protocol

Soft Launch Mechanics

A "soft launch" limits exposure to a controlled audience before the public announcement. It is the equivalent of a restaurant's "friends and family" night before the grand opening. The purpose is to stress-test operations with real users while limiting blast radius if things go wrong. Soft launches are particularly valuable for first-time launchers, complex products, or products with significant infrastructure risk.

Execution Details:

  • Invite-Only Access: Beta users via unique email invitation codes
  • No Marketing: Zero paid ads, no press, no social amplification
  • Instrumented: Extra logging, session replay (FullStory/Hotjar), and verbose error tracking
  • Direct Communication: Slack channel or Discord server for immediate feedback
  • Cohort Size: 50-200 users is ideal -- enough for pattern detection, small enough for personal outreach

Success/Kill Criteria:

  • Duration: 1-2 weeks before hard launch (3-4 weeks for complex products)
  • Exit Criteria: Error rate below 5%, activation rate above 30%, NPS above 20
  • Kill Criteria: More than 20% negative feedback, critical bugs discovered, data integrity issues
  • Graduation: When exit criteria are met for 3 consecutive days, schedule hard launch
  • Feedback Loop: Daily synthesis of user feedback shared with engineering team
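
The exit and graduation rules above are precise enough to encode directly. This is a minimal sketch with hypothetical names; it interprets "3 consecutive days" as the three most recent days of the soft launch, which is an assumption rather than something the criteria spell out.

```python
from dataclasses import dataclass


@dataclass
class DailyMetrics:
    error_rate: float        # fraction of requests, so 0.05 means 5%
    activation_rate: float   # fraction of invited users, so 0.30 means 30%
    nps: float               # Net Promoter Score


def meets_exit_criteria(day: DailyMetrics) -> bool:
    """Error rate below 5%, activation rate above 30%, NPS above 20."""
    return day.error_rate < 0.05 and day.activation_rate > 0.30 and day.nps > 20


def ready_to_graduate(history: list[DailyMetrics], required_days: int = 3) -> bool:
    """True when the most recent `required_days` days all meet the exit criteria."""
    recent = history[-required_days:]
    return len(recent) == required_days and all(meets_exit_criteria(d) for d in recent)
```

A single bad day resets the count in this interpretation, which errs on the side of delaying the hard launch.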

The Enterprise Launch Protocol

Enterprise Release Mechanics

Enterprise launches require white-glove coordination with key accounts. Unlike consumer launches where all users see the change simultaneously, enterprise releases are typically staggered by account tier to limit risk and provide premium support to the highest-value customers. The rollout should proceed from lowest-risk accounts to highest-value accounts, giving your team time to identify and resolve issues before they affect your most important customers.

Pre-Launch Coordination:

  • T-4 Weeks: Notify account managers and customer success managers of upcoming changes
  • T-2 Weeks: Send detailed release notes to enterprise contacts, including breaking changes
  • T-1 Week: Conduct customer-specific training sessions for Tier 1 accounts
  • T-2 Days: Final confirmation from key account CSMs that customers are prepared
  • T-Day: Staggered rollout by account tier with per-account monitoring

Rollout Sequence:

  • Phase 1: Internal accounts and sandbox environments (Day 1)
  • Phase 2: Tier 3 accounts -- smallest, lowest revenue risk (Day 2-3)
  • Phase 3: Tier 2 accounts -- mid-market customers (Day 4-5)
  • Phase 4: Tier 1 accounts -- enterprise, highest value (Day 7+)
  • Per-Account Rollback: Feature flags by org_id allow instant per-customer rollback
  • Executive Notifications: Personalized emails from founder to Tier 1 contacts
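
The phased rollout and per-account rollback can be expressed as a single flag check. The following is a hypothetical sketch rather than any particular feature-flag vendor's API: the tier labels, `PHASE_START_DAY`, and `feature_enabled` are illustrative names, and the start days are taken from the first day of each phase listed above.

```python
# First rollout day on which each account tier receives the release.
PHASE_START_DAY = {"internal": 1, "tier3": 2, "tier2": 4, "tier1": 7}


def feature_enabled(org_id: str, tier: str, rollout_day: int,
                    rolled_back: frozenset = frozenset()) -> bool:
    """Decide whether an organization should see the new release today."""
    if org_id in rolled_back:
        return False  # per-account rollback overrides the phase schedule
    return rollout_day >= PHASE_START_DAY[tier]
```

Keeping the rollback set separate from the schedule means one customer can be reverted instantly without pausing the rollout for everyone else in the same tier.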

Common Runbook Failures

Learn from others' mistakes. The following patterns have caused launch failures across thousands of product releases. Each one is preventable with discipline and process. Review this list during your dry run and actively check for each anti-pattern in your own runbook.

The "It's Obvious" Trap

Missing steps because "everyone knows that." New team members don't. Temporary contractors don't. The engineer who is filling in because the primary owner is sick doesn't. Document every click, every command, every verification step. If you can't hand the runbook to someone who has never seen your system and have them execute it successfully, the runbook is incomplete.

The Stale Runbook

Last updated 6 months ago. APIs changed, people left, infrastructure was restructured. A stale runbook is worse than no runbook because it creates false confidence. The team believes they have a plan, but the plan references systems that no longer exist or commands that no longer work. Review runbooks quarterly at minimum, and always after infrastructure changes.

The Solo Owner

Only one person can execute critical steps. What if they're sick? What if their internet goes down? What if they're unreachable at 2 AM? Every critical path task needs a backup owner who has the necessary access permissions and knowledge and who has practiced the procedure at least once. Document the backup owner in the runbook alongside the primary.

No Rollback Defined

"We'll figure it out if something breaks." You won't. Under the stress of a failing launch, with stakeholders asking for updates and users complaining on social media, your team will not calmly and methodically design a rollback procedure. Panic prevents clear thinking. Pre-written rollback procedures are the safety net that lets you take risks confidently.

Missing Time Buffers

"Every task will take exactly as long as planned." It won't. Network latency varies. Database migrations encounter unexpected data. CDN propagation takes longer than expected. Add a 20-30% buffer to every time estimate, and add explicit "buffer checkpoints" after sequences of critical tasks. If you finish early, use the buffer for additional verification. If you finish late, the buffer prevents cascade delays.
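
Applying the buffer is simple arithmetic, and scripting it keeps the timeline honest when estimates change. The sketch below is illustrative only: the function names are hypothetical, the default of 25% is one value within the 20-30% range, and results are rounded up to whole minutes.

```python
import math


def buffered(minutes: int, buffer_percent: int = 25) -> int:
    """Estimate plus buffer, rounded up to a whole minute."""
    return math.ceil(minutes * (100 + buffer_percent) / 100)


def latest_start_offsets(durations: list[tuple[str, int]], buffer_percent: int = 25):
    """Work backwards from go-live: minutes before T-0 at which each task must start.

    `durations` lists sequential tasks in execution order as (task_id, minutes).
    """
    offsets = {}
    remaining = 0
    for task_id, minutes in reversed(durations):
        remaining += buffered(minutes, buffer_percent)
        offsets[task_id] = remaining
    return offsets
```

For three sequential tasks estimated at 15, 15, and 10 minutes, a 25% buffer turns them into 19, 19, and 13 minutes, so the first task must start 51 minutes before go-live rather than 40.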

No Communication Plan

Technical steps are meticulously defined, but no one told Sales, Support, or the executive team when to expect updates or how to report issues. Cross-functional visibility is mandatory. Your runbook should include communication checkpoints: "At T+30m, Launch Captain posts first SitRep to #all-company." Every stakeholder should know exactly when they will receive updates.

The Runbook as a Living Document

Your runbook improves with every launch. After each deployment, conduct a runbook review alongside the general retrospective. Ask: Which steps were missing? Which durations were inaccurate? Which rollback procedures were untested? Which dependencies were undocumented? Feed these learnings back into the template. Over time, your runbook evolves from a generic checklist into a battle-tested operational manual that reflects the specific realities of your product, your team, and your infrastructure.

The best engineering teams treat their runbooks as code: version-controlled, peer-reviewed, and tested before deployment. Store your runbook in your version control system (not a Google Doc that someone might accidentally edit). Tag versions by release. Require a review before modifying the production runbook. This discipline ensures that the document you rely on during the most critical hours of your product's life is as reliable as the code you are deploying.

This playbook synthesizes methodologies from DevOps, Site Reliability Engineering (SRE), Incident Command System (ICS), and modern product management practices.