Software Requirements: Progressive Markdown Documents (PMD)

Summary #

The most effective way to capture software requirements for LLM consumption is Progressive Markdown Documents (PMD) - a tiered approach combining YAML frontmatter for machine-readable metadata, markdown for human-readable content, and a structure that evolves from lightweight to detailed as understanding deepens[1][2].

This approach synthesizes best practices from Shape Up's "pitches"[3], Architecture Decision Records (ADRs)[4], the llms.txt standard[5], and Gherkin's Given-When-Then acceptance criteria[6]. The key insight is that requirements should start as problem statements (not solutions), use consistent terminology, and progressively add detail only when needed.

Markdown is the optimal format for LLMs because it is lightweight (leaving more of the context window for content), preserves semantic structure through headers and lists, and is handled well by all major LLMs[7]. YAML frontmatter adds machine-parseable metadata that lets an LLM assess a document's relevance and status without reading the full text[8].
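
To make the "assess relevance and status without reading the full document" idea concrete, here is a minimal sketch that reads only the frontmatter of a requirement file. The helper name `read_frontmatter` is hypothetical, and the parser is an assumption that handles only flat `key: value` lines; a real project would more likely use a proper YAML library.

```python
def read_frontmatter(text: str) -> dict[str, str]:
    """Return top-level scalar fields from a leading YAML frontmatter block.

    Deliberately minimal: only flat `key: value` lines are read; lists and
    nested values are skipped. A real project would use a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter: stop before the body
            break
        if line.startswith((" ", "-", "#")) or ":" not in line:
            continue  # list item, nested value, comment, or blank line
        key, _, value = line.partition(":")
        value = value.split(" #", 1)[0].strip()  # drop a trailing comment
        if value:
            fields[key.strip()] = value
    return fields


doc = """---
id: REQ-042
title: Users Cannot Reset Passwords
status: exploring
appetite: small  # small = days
tags:
  - authentication
---

# Users Cannot Reset Passwords
"""

meta = read_frontmatter(doc)
print(meta["id"], meta["status"], meta["appetite"])
```

Because the function stops at the closing `---`, the body of the document is never scanned, which is the property the frontmatter is meant to provide.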

Philosophy & Mental Model #

Core Principles #

- Problem-first, not solution-first: Requirements describe what problem to solve and why it matters, not how to solve it. The LLM (or developer) determines the how[3].
- Progressive disclosure: Start with a one-paragraph summary. Add sections only when clarity requires it. Never write detail you don't yet need.
- Living documents: Requirements are updated as understanding evolves. Version control tracks changes. Old assumptions are struck through, not deleted, so the LLM understands what changed and why.
- Consistent vocabulary: Use the same term for the same concept everywhere. LLMs build mental models from terminology, so inconsistent terms invite ambiguity and hallucinated details[5].
- Testable outcomes: Every requirement should eventually have acceptance criteria that can be verified. Gherkin's Given-When-Then format makes requirements executable[6].

The Three Maturity Levels #

| Level | Name | When to Use | Content |
|-------|------|-------------|---------|
| 1 | Problem | Initial capture | Problem statement, context, appetite |
| 2 | Shaped | Ready to explore | + Solution sketch, boundaries, risks |
| 3 | Specified | Ready to build | + Acceptance criteria, edge cases, examples |

An LLM can work with Level 1 documents to explore possibilities. Level 2 documents are sufficient for implementation planning. Level 3 documents enable confident implementation with clear success criteria.
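
One way to read the three levels is as a property of which sections a document contains. The sketch below infers a level from the headings; the function names and the exact heading-to-level mapping are assumptions based on the section names used in the examples in this article, not a fixed rule of the approach.

```python
import re


def headings(markdown: str) -> set[str]:
    """Collect the text of level-2 and level-3 markdown headings."""
    return {
        m.group(1).strip()
        for m in re.finditer(r"^#{2,3} (.+)$", markdown, re.MULTILINE)
    }


def maturity_level(markdown: str) -> int:
    """Guess the maturity level (0 means no Problem section was found)."""
    found = headings(markdown)
    if "Acceptance Criteria" in found:
        return 3  # Specified: ready to build
    if any(h.startswith("Solution") for h in found):
        return 2  # Shaped: has a solution sketch
    if "Problem" in found:
        return 1  # Problem: initial capture
    return 0


shaped = """# Self-Service Password Reset

## Problem
Users cannot reset passwords.

## Solution Sketch
Email-based reset flow.
"""

print(maturity_level(shaped))  # 2
```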

Setup #

Directory Structure #

project-root/
├── requirements/
│   ├── _index.md           # Overview and navigation
│   ├── active/             # Currently being worked on
│   │   └── user-auth.md
│   ├── shaped/             # Ready for betting/prioritization
│   │   └── dark-mode.md
│   ├── exploring/          # Problem understood, solution unclear
│   │   └── perf-issues.md
│   └── completed/          # Shipped (archive)
│       └── signup-flow.md
└── ...
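
Since the folder a file lives in and its `status` field carry the same information, they can drift apart. The following is a small sketch of a consistency check, assuming that the status value is expected to equal the folder name (the article's examples show `exploring`, `shaped`, and `active`; treating `completed` the same way is an assumption). The function name and the `REQ-007` id are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# First line that looks like "status: <value>". Assumes the frontmatter
# comes first, so the first match is the frontmatter field.
STATUS_LINE = re.compile(r"^status:\s*(\S+)", re.MULTILINE)


def status_mismatch(path: str, text: str) -> Optional[str]:
    """Return a message if the status field disagrees with the folder name."""
    folder = PurePosixPath(path).parent.name
    match = STATUS_LINE.search(text)
    status = match.group(1) if match else None
    if status != folder:
        return f"{path}: status is {status!r} but file lives in {folder}/"
    return None


ok = status_mismatch(
    "requirements/shaped/dark-mode.md",
    "---\nid: REQ-007\nstatus: shaped\n---\n",
)
bad = status_mismatch(
    "requirements/active/user-auth.md",
    "---\nid: REQ-042\nstatus: exploring\n---\n",
)
print(ok)
print(bad)
```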

Index File Template #

Create requirements/_index.md:

# Requirements

## Active
- [User Authentication](active/user-auth.md) - Implementing OAuth flow

## Shaped (Ready to Build)
- [Dark Mode](shaped/dark-mode.md) - Theme system for accessibility

## Exploring
- [Performance Issues](exploring/perf-issues.md) - Dashboard load times

## Completed
- [Signup Flow](completed/signup-flow.md) - Shipped 2024-03-15
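
An index in this shape can also be generated rather than maintained by hand. The sketch below renders the same layout from a list of entries; the `Entry` type and `render_index` function are hypothetical names, and how the entries are collected (for example, from each file's frontmatter) is left out.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    path: str    # relative to requirements/
    status: str  # active | shaped | exploring | completed
    note: str    # short description shown after the link


# (status value, heading text) in the order used by the template above.
SECTIONS = [
    ("active", "Active"),
    ("shaped", "Shaped (Ready to Build)"),
    ("exploring", "Exploring"),
    ("completed", "Completed"),
]


def render_index(entries: list[Entry]) -> str:
    """Render the index as markdown, one section per status."""
    lines = ["# Requirements"]
    for status, heading in SECTIONS:
        lines += ["", f"## {heading}"]
        for entry in entries:
            if entry.status == status:
                lines.append(f"- [{entry.title}]({entry.path}) - {entry.note}")
    return "\n".join(lines) + "\n"


entries = [
    Entry("User Authentication", "active/user-auth.md", "active",
          "Implementing OAuth flow"),
    Entry("Dark Mode", "shaped/dark-mode.md", "shaped",
          "Theme system for accessibility"),
]
index_md = render_index(entries)
print(index_md)
```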

Core Usage Patterns #

Pattern 1: Level 1 - Problem Document #

Use this when you first identify a need. Keep it to one page or less.

---
id: REQ-042
title: Users Cannot Reset Passwords
status: exploring
priority: high
created: 2025-11-28
updated: 2025-11-28
author: wschenk
stakeholders:
  - support-team
  - security-team
appetite: small  # small = days, medium = 1-2 weeks, large = 6 weeks
tags:
  - authentication
  - user-experience
---

# Users Cannot Reset Passwords

## Problem

Users who forget their passwords have no self-service way to regain access.
Support is handling 50+ password reset tickets per week, taking ~15 minutes each.

## Context

- Current auth: username/password stored in PostgreSQL
- No email service currently integrated
- Users have verified email addresses on file
- Security requirement: reset links must expire within 1 hour

## Jobs to Be Done

When I **forget my password**,
I want to **regain access to my account**,
so that I **can continue using the product without waiting for support**.

## Open Questions

- [ ] Should we support SMS as an alternative to email?
- [ ] Do we need rate limiting on reset requests?
- [ ] Should password reset invalidate existing sessions?

Pattern 2: Level 2 - Shaped Document #

Evolve the document when you understand the solution shape. Add boundaries and risks.

---
id: REQ-042
title: Self-Service Password Reset
status: shaped
priority: high
created: 2025-11-28
updated: 2025-11-29
author: wschenk
stakeholders:
  - support-team
  - security-team
appetite: small
tags:
  - authentication
  - user-experience
depends_on: []
blocks: []
---

# Self-Service Password Reset

## Problem

Users who forget their passwords have no self-service way to regain access.
Support is handling 50+ password reset tickets per week, taking ~15 minutes each.

## Solution Sketch

### Flow
1. User clicks "Forgot Password" on login page
2. User enters email address
3. System sends email with secure, time-limited reset link
4. User clicks link, enters new password
5. System updates password, invalidates link

### Boundaries (What's IN)
- Email-based reset only
- Single-use, 1-hour expiry tokens
- Password strength validation (same as signup)
- Success/failure notifications

### Boundaries (What's OUT)
- SMS reset (future enhancement)
- Security questions
- Admin override flow (separate requirement)

## Risks & Rabbit Holes

| Risk | Mitigation |
|------|------------|
| Email deliverability | Use established service (SendGrid/SES) |
| Token brute-forcing | Use 256-bit tokens, rate limit attempts |
| Enumeration attacks | Same response whether email exists or not |

## Open Questions

- [x] ~~Should we support SMS?~~ **Decision: No, email only for v1**
- [x] ~~Rate limiting?~~ **Decision: Yes, 3 requests per email per hour**
- [ ] Invalidate existing sessions? (Need security team input)
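
The checkbox convention in the Open Questions section is easy to read mechanically. As a rough sketch (the function name is hypothetical), unresolved questions can be listed like this, for example to see what still needs input before a document moves on:

```python
import re

# Matches markdown task-list items: "- [ ] text" (open) or "- [x] text" (done).
CHECKBOX = re.compile(r"^- \[( |x)\] (.+)$", re.MULTILINE)


def open_questions(markdown: str) -> list[str]:
    """Return the text of every unchecked task-list item."""
    return [text for mark, text in CHECKBOX.findall(markdown) if mark == " "]


section = """## Open Questions

- [x] ~~Should we support SMS?~~ **Decision: No, email only for v1**
- [x] ~~Rate limiting?~~ **Decision: Yes, 3 requests per email per hour**
- [ ] Invalidate existing sessions? (Need security team input)
"""

print(open_questions(section))
```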

Pattern 3: Level 3 - Specified Document #

Add acceptance criteria when ready to build. Use Gherkin format for testability.

---
id: REQ-042
title: Self-Service Password Reset
status: active
priority: high
created: 2025-11-28
updated: 2025-11-30
author: wschenk
stakeholders:
  - support-team
  - security-team
appetite: small
tags:
  - authentication
  - user-experience
depends_on: []
blocks:
  - REQ-051  # MFA implementation depends on this
---

# Self-Service Password Reset

## Problem

Users who forget their passwords have no self-service way to regain access.
Support is handling 50+ password reset tickets per week, taking ~15 minutes each.

## Solution

[Previous solution sketch content...]

## Acceptance Criteria

### Happy Path

```gherkin
Feature: Password Reset

  Scenario: Successful password reset
    Given I am a registered user with email "user@example.com"
    And I am on the login page
    When I click "Forgot Password"
    And I enter "user@example.com"
    And I click "Send Reset Link"
    Then I should see "Check your email for reset instructions"
    And an email should be sent to "user@example.com"

    When I click the reset link in the email
    Then I should see the "Create New Password" form

    When I enter a valid new password
    And I click "Reset Password"
    Then I should see "Password updated successfully"
    And I should be redirected to login
    And I should be able to login with my new password
```

### Edge Cases

```gherkin
  Scenario: Reset link expired
    Given I have a password reset link older than 1 hour
    When I click the reset link
    Then I should see "This link has expired"
    And I should see a link to request a new reset

  Scenario: Email not found (security)
    Given no user exists with email "unknown@example.com"
    When I request a password reset for "unknown@example.com"
    Then I should see "Check your email for reset instructions"
    # Same message as success - prevents enumeration
    And no email should be sent

  Scenario: Rate limiting
    Given I have requested 3 password resets in the last hour
    When I request another password reset
    Then I should see "Too many requests. Please try again later."
```

## Examples

### Reset Email Content

Subject: Reset your password

Hi,

Someone requested a password reset for your account. If this was you,
click the link below to create a new password:

https://app.example.com/reset?token=abc123...

This link expires in 1 hour.

If you didn't request this, you can safely ignore this email.

### Token Format

Base64URL encoded:
- 32 bytes random (256-bit)
- Stored as SHA-256 hash in database (never store raw token)

## Technical Notes

- Use crypto.randomBytes(32) for token generation
- Store SHA256(token) in password_reset_tokens table
- Include user_id, created_at, used_at columns
- Clean up expired tokens via scheduled job


Pattern 4: Referencing Requirements in LLM Prompts #

When working with an LLM, reference requirements by ID and include the full document in context:

```markdown
I'm implementing REQ-042 (Self-Service Password Reset). Here's the full
requirement:

[paste full requirement document]

Please implement the password reset token generation and validation logic.
Follow the security constraints in the document (256-bit tokens, SHA-256
storage, 1-hour expiry).
```
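
For a sense of what such a prompt is asking for, here is a minimal sketch of token generation and validation under the stated constraints (256-bit tokens, SHA-256 storage, 1-hour expiry, single use). The requirement's technical notes name Node's `crypto.randomBytes(32)`; this sketch swaps in Python's `secrets.token_bytes(32)` as an equivalent. The function names and the way `created_at` and `used_at` are passed in are assumptions, and database access is omitted.

```python
import base64
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta, timezone
from typing import Optional

TOKEN_TTL = timedelta(hours=1)  # reset links expire after 1 hour


def hash_token(token: str) -> str:
    """SHA-256 hex digest; this is the only form that should be stored."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def new_reset_token() -> tuple[str, str]:
    """Return (raw_token, token_hash). Email the raw token, store the hash."""
    raw = secrets.token_bytes(32)  # 32 random bytes = 256 bits
    token = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return token, hash_token(token)


def is_valid(token: str, stored_hash: str, created_at: datetime,
             used_at: Optional[datetime], now: datetime) -> bool:
    """Check single use, expiry, and hash match (constant-time compare)."""
    if used_at is not None:
        return False  # tokens are single-use
    if now - created_at > TOKEN_TTL:
        return False  # older than 1 hour
    return hmac.compare_digest(hash_token(token), stored_hash)


token, stored = new_reset_token()
created = datetime(2025, 11, 30, 12, 0, tzinfo=timezone.utc)
print(is_valid(token, stored, created, None, created + timedelta(minutes=59)))
print(is_valid(token, stored, created, None, created + timedelta(minutes=61)))
```

Comparing hashes with `hmac.compare_digest` avoids leaking information through timing, and passing `now` explicitly keeps the expiry rule easy to test against the "Reset link expired" scenario.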

Pattern 5: Lightweight Alternative for Quick Captures #

For rapid ideation, use a simplified format that can be expanded later:

---
id: REQ-099
title: Quick Search
status: exploring
created: 2025-11-28
appetite: medium
---

# Quick Search

**Problem**: Finding content is too slow. Users click through 5+ pages.

**Job**: When I'm looking for something specific, I want to find it instantly.

**Initial thoughts**:
- Cmd+K modal like Spotlight
- Search titles, content, tags
- Recent items first

**Not sure about**:
- Full-text search vs fuzzy matching?
- Search as you type vs press enter?

Anti-Patterns & Pitfalls #

Don't: Write Solutions Disguised as Requirements #

# Bad: Solution-first
## Requirement
Build a React modal with a search input that queries Elasticsearch
and displays results in a virtualized list.

Why it's wrong: This prescribes technology choices that should emerge from implementation. It prevents the LLM from suggesting potentially better alternatives[3].

Instead: Describe the Problem and Constraints #

# Good: Problem-first
## Problem
Users can't find content quickly. Average time to find a document: 3 minutes.

## Constraints
- Must work offline (some users have poor connectivity)
- Results should appear in under 200ms
- Must search 50,000+ documents

Don't: Use Inconsistent Terminology #

# Bad: Multiple terms for same concept
- The customer enters their credentials
- User authentication validates the client
- Once logged in, the person can access...

Why it's wrong: LLMs build semantic models from your terminology. Mixed terms create ambiguity and lead to inconsistent implementations[5].

Instead: Define Terms and Use Them Consistently #

# Good: Consistent vocabulary
## Glossary
- **User**: A person with an account in the system
- **Authentication**: The process of verifying a user's identity

## Requirement
The user enters credentials. Authentication validates the user.
Once authenticated, the user can access...
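
Terminology drift of the kind shown in the bad example can be caught with a simple check. The sketch below flags non-canonical variants of a glossary term; the `PREFERRED` mapping and the function name are illustrative assumptions, and the variants are the ones from the bad example.

```python
import re

# Canonical glossary term -> variants that should be replaced by it.
PREFERRED = {"user": ["customer", "client", "person"]}


def find_term_drift(
    text: str, preferred: dict[str, list[str]]
) -> list[tuple[int, str, str]]:
    """Return (line number, variant found, canonical term) for each hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for canonical, variants in preferred.items():
            for variant in variants:
                # Whole-word, case-insensitive match.
                if re.search(rf"\b{re.escape(variant)}\b", line, re.IGNORECASE):
                    hits.append((lineno, variant, canonical))
    return hits


bad = """- The customer enters their credentials
- User authentication validates the client
- Once logged in, the person can access..."""

for lineno, variant, canonical in find_term_drift(bad, PREFERRED):
    print(f"line {lineno}: use '{canonical}' instead of '{variant}'")
```

A word list like this will produce false positives when a variant is a legitimate domain term, so it is better suited to a review aid than a hard gate.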

Don't: Over-specify Upfront #

# Bad: Premature detail
---
status: exploring
---

## API Specification
POST /api/auth/reset
Content-Type: application/json
{
  "email": "string",
  "captcha_token": "string"
}

Response 200:
{
  "success": true,
  "message": "string"
}
...
[50 more lines of API spec for a problem we haven't validated]

Why it's wrong: Detailed specs for unvalidated problems waste effort and create false precision. The spec may be completely wrong once you understand the real need[9].

Instead: Match Detail to Maturity #

# Good: Progressive detail
---
status: exploring
---

## Problem
Users need to reset passwords.

## Open Questions
- [ ] How many resets per week currently?
- [ ] What security constraints exist?

---
# Later, when status: shaped
## Solution Sketch
Email-based flow with secure tokens...

---
# Later, when status: active
## Acceptance Criteria
[Detailed Gherkin scenarios]
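
Matching detail to maturity can be checked mechanically as well. This sketch flags sections that appear before the status at which they are expected; the `EARLIEST_STATUS` mapping is an assumption drawn from the examples in this article, as is treating `completed` as a fourth status, and a team would adjust both.

```python
import re

STATUS_ORDER = ["exploring", "shaped", "active", "completed"]

# Section heading -> earliest status at which it is expected to appear.
EARLIEST_STATUS = {
    "Solution Sketch": "shaped",
    "Risks & Rabbit Holes": "shaped",
    "Acceptance Criteria": "active",
    "API Specification": "active",
}


def premature_sections(status: str, markdown: str) -> list[str]:
    """Return level-2 headings that are too detailed for the given status."""
    rank = STATUS_ORDER.index(status)
    flagged = []
    for heading in re.findall(r"^## (.+)$", markdown, re.MULTILINE):
        earliest = EARLIEST_STATUS.get(heading.strip())
        if earliest is not None and STATUS_ORDER.index(earliest) > rank:
            flagged.append(heading.strip())
    return flagged


draft = """## Problem
Users need to reset passwords.

## API Specification
POST /api/auth/reset
"""

print(premature_sections("exploring", draft))
print(premature_sections("active", draft))
```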

Don't: Leave Stale Information #

# Bad: Outdated content without indication
## Solution
We'll use Magic Links for authentication.

## Acceptance Criteria
Given user clicks magic link...
[But the team decided to use passwords instead last week]

Why it's wrong: LLMs treat all content as current truth. Stale requirements cause implementations that don't match current decisions.

Instead: Strike Through Changes, Add Context #

# Good: Clear change history
## Solution
~~We'll use Magic Links for authentication.~~
**Update 2025-11-25**: Switched to password-based auth due to email
deliverability concerns in target markets. See discussion in #auth-decisions.

Current approach: Traditional password with email reset flow.
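
The "strike through, then add context" habit can be spot-checked. The sketch below flags struck-through text that has no `**Update ...**` or `**Decision ...**` note on the same or the following line; that proximity rule and the function name are assumptions, and the second struck-through sentence in the sample is a made-up example.

```python
import re

STRUCK = re.compile(r"~~.+?~~")                   # ~~struck-through text~~
NOTE = re.compile(r"\*\*(Update|Decision)\b")     # **Update ...** / **Decision ...**


def unexplained_strikethroughs(markdown: str) -> list[int]:
    """Return 1-based line numbers of struck text lacking a nearby note."""
    lines = markdown.splitlines()
    flagged = []
    for i, line in enumerate(lines):
        if not STRUCK.search(line):
            continue
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if not (NOTE.search(line) or NOTE.search(following)):
            flagged.append(i + 1)
    return flagged


sample = """## Solution
~~We'll use Magic Links for authentication.~~
**Update 2025-11-25**: Switched to password-based auth.

~~Tokens expire after 24 hours.~~
Current approach: Traditional password with email reset flow.
"""

print(unexplained_strikethroughs(sample))  # [5]
```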

Caveats #

- Not a replacement for conversation: These documents capture decisions, but the valuable work happens in discussions. Don't let documentation become a substitute for talking to users and stakeholders.
- Requires discipline: Progressive disclosure only works if you resist the urge to add detail prematurely. This is culturally harder than it sounds.
- No tooling enforcement: Unlike specialized requirements tools (Jira, Linear, Notion databases), plain markdown has no validation or workflow automation. You're responsible for keeping status fields accurate.
- May not satisfy compliance: Regulated industries (healthcare, finance, defense) often require specific requirements formats with traceability matrices. This lightweight approach may need augmentation for audit purposes.
- LLM context limits: Very long requirement documents may exceed context windows. Consider splitting large features into multiple linked documents, or creating summary views for LLM consumption.
- Version control essential: Without git (or similar), you lose the change history that makes "strike through but keep" meaningful. This approach assumes you're using version control.
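
As one possible take on the "summary views" idea from the context-limits caveat, the sketch below keeps the frontmatter, the title, and a chosen set of level-2 sections, and drops everything else. The function name and the default of keeping only the Problem section are assumptions, and the sketch does not account for `## ` lines inside fenced code blocks.

```python
def summary_view(markdown: str, keep: tuple[str, ...] = ("Problem",)) -> str:
    """Keep frontmatter, title, and the named level-2 sections only."""
    out = []
    keeping = True  # everything before the first "## " heading is kept
    for line in markdown.splitlines():
        if line.startswith("## "):
            keeping = line[3:].strip() in keep
        if keeping:
            out.append(line)
    return "\n".join(out).rstrip() + "\n"


full = """---
id: REQ-042
status: active
---

# Self-Service Password Reset

## Problem
Users cannot reset passwords.

## Solution
Email-based flow.

## Acceptance Criteria
Given a registered user...
"""

print(summary_view(full))
```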

References #

[1] Why Markdown is the best format for LLMs - Explains why markdown's lightweight structure outperforms JSON/XML for LLM processing

[2] How to optimize your docs for LLMs - Best practices for LLM-friendly documentation structure

[3] Shape Up: Introduction - Basecamp's methodology for problem-first, progressively detailed requirements

[4] Architecture Decision Records - The ADR pattern for capturing decisions with context and rationale

[5] The llms.txt Standard - Specification for LLM-friendly content with consistent structure

[6] Gherkin Reference - Official Gherkin syntax documentation for Given-When-Then acceptance criteria

[7] Converting Content into LLM Friendly Markdown - Guide on optimizing markdown for Claude and other LLMs

[8] Using YAML Frontmatter - GitHub Docs - Standard approach to metadata in markdown files

[9] Agile Documentation: Methodology & Best Practices - Principles for lightweight, evolving documentation

[10] Write better LLM prompts for AI code generation - Best practices for structuring requirements for LLM implementation

[11] MADR - Markdown Architectural Decision Records - Template repository for structured markdown decision records

[12] Jobs to be Done Template - Framework for capturing user needs in structured format

last updated: 2025-12-01