HTML to PDF: Paged.js with cdp (Chrome DevTools Protocol)

· combray's blog


Summary #

Paged.js is an open-source library (1.1k GitHub stars, MIT license) that polyfills CSS Paged Media and Generated Content specifications in the browser[1]. It transforms standard HTML/CSS into paginated, print-ready documents by fragmenting content into discrete pages with proper headers, footers, page numbers, and multi-column layouts.

When combined with the cdp CLI tool for headless Chrome automation, Paged.js provides a complete solution for generating professional PDFs from HTML. The key challenge is making sure Paged.js has finished rendering before the print command runs. The approach used here is to wait for .pagedjs_page elements to appear in the DOM[2].

The library is maintained by Julien Taquet, Fred Chasen, and Gijs de Heij, with active development on GitLab[1]. At the time of writing, version 0.4.3 is the current stable release, available via npm and the unpkg CDN.

Philosophy & Mental Model #

Paged.js works by intercepting your HTML document after it loads and transforming it into a series of "page boxes" that simulate printed pages. Think of it as a print layout engine that runs in the browser.

Core concepts:

  1. Chunker: Fragments your document content into page-sized pieces, handling overflow and pagination
  2. Polisher: Converts @page CSS rules into classes that can be applied in the browser
  3. Handlers: Extensible hooks that let you customize the rendering process

The rendering flow:

HTML loads → Paged.js intercepts → Content chunked into pages →
@page CSS applied → DOM transformed → Ready for print/PDF

After rendering, your original <body> content is replaced with a structure like:

<div class="pagedjs_pages">
  <div class="pagedjs_page">
    <div class="pagedjs_margin-top-center">Header content</div>
    <div class="pagedjs_page_content">Page 1 content</div>
    <div class="pagedjs_margin-bottom-center">Footer content</div>
  </div>
  <!-- More pages... -->
</div>

Setup #

1. Create Your HTML Document #

Include the Paged.js polyfill from unpkg CDN:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>My Document</title>

  <!-- Paged.js Polyfill - loads and auto-runs -->
  <script src="https://unpkg.com/pagedjs/dist/paged.polyfill.js"></script>

  <style>
    /* Your @page rules and styles go here */
  </style>
</head>
<body>
  <!-- Your content -->
</body>
</html>

2. Basic @page CSS Setup #

@page {
  size: letter;              /* or: A4, legal, 8.5in 11in */
  margin: 1in 0.75in;        /* top/bottom left/right */

  /* Header */
  @top-center {
    content: "Document Title";
    font-size: 10pt;
    color: #666;
  }

  /* Footer with page numbers */
  @bottom-center {
    content: "Page " counter(page) " of " counter(pages);
    font-size: 10pt;
  }
}

3. PDF Generation Workflow with cdp #

# 1. Start headless Chrome
./cdp start

# 2. Navigate to your HTML file
./cdp navigate "file:///path/to/document.html"

# 3. Wait for Paged.js to finish rendering
#    Poll until .pagedjs_page elements exist
./cdp eval "document.querySelectorAll('.pagedjs_page').length"
# Repeat until value > 0

# 4. Generate PDF
./cdp print output.pdf --paper letter

# 5. Stop browser
./cdp stop
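
Step 3 is a manual retry loop. As a rough sketch of the same idea in JavaScript (for example inside a Node wrapper script), the helper below pulls the count out of the text that cdp eval prints and retries a bounded number of times. The "value":N output shape is an assumption inferred from the grep patterns used in the scripts in this post, and parsePageCount, waitForPages, and the injected evalOnce and pause callbacks are hypothetical names, not part of cdp or Paged.js.

```javascript
// Extract the integer that follows "value": in cdp eval output.
// Assumption: the output contains a fragment such as "value":3.
// Anything that does not match is treated as zero pages.
function parsePageCount(output) {
  const match = /"value"\s*:\s*(\d+)/.exec(String(output));
  return match ? Number(match[1]) : 0;
}

// Call evalOnce() until it reports at least one page or attempts run out.
// evalOnce: () => string, e.g. a wrapper that shells out to `./cdp eval ...`.
// pause:    () => void, called between attempts (e.g. a 500 ms sleep).
function waitForPages(evalOnce, { maxAttempts = 60, pause = () => {} } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const pages = parsePageCount(evalOnce());
    if (pages > 0) return { pages, attempts: attempt };
    if (attempt < maxAttempts) pause();
  }
  // Timed out: the caller should not print in this case.
  return { pages: 0, attempts: maxAttempts };
}
```

Passing the command runner in as a callback keeps the retry logic independent of how cdp is invoked, so the loop can be exercised with canned output before it is wired to a real browser.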

4. Automated Script Example #

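#!/bin/bash
# generate-pdf.sh - Generate PDF from Paged.js HTML
# Usage: ./generate-pdf.sh /absolute/path/to/document.html output.pdf

HTML_FILE="$1"    # Must be an absolute path, since it is appended to file://
OUTPUT_PDF="$2"

if [ -z "$HTML_FILE" ] || [ -z "$OUTPUT_PDF" ]; then
  echo "Usage: $0 /absolute/path/to/document.html output.pdf" >&2
  exit 1
fi

./cdp start

./cdp navigate "file://$HTML_FILE"

# Wait for Paged.js rendering (poll every 500ms, max 30 seconds)
PAGES=0
for i in {1..60}; do
  PAGES=$(./cdp eval "document.querySelectorAll('.pagedjs_page').length" 2>/dev/null | grep -o '"value":[0-9]*' | grep -o '[0-9]*')
  # Default to 0 so an empty result does not break the numeric test
  if [ "${PAGES:-0}" -gt 0 ]; then
    echo "Found $PAGES rendered page(s)"
    break
  fi
  sleep 0.5
done

# Give up instead of printing an empty PDF if nothing rendered in time
if [ "${PAGES:-0}" -eq 0 ]; then
  echo "Timed out waiting for Paged.js" >&2
  ./cdp stop
  exit 1
fi

./cdp print "$OUTPUT_PDF" --paper letter
./cdp stop

echo "PDF saved to $OUTPUT_PDF"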

Core Usage Patterns #

Pattern 1: Page Size and Margins #

/* US Letter size with 1-inch margins */
@page {
  size: letter;
  margin: 1in;
}

/* A4 landscape */
@page {
  size: A4 landscape;
  margin: 20mm 25mm;
}

/* Custom size */
@page {
  size: 6in 9in;  /* width height */
  margin: 0.5in 0.75in 0.75in 0.75in; /* top right bottom left */
}
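
To see how much room each of these rules leaves for content, the arithmetic can be sketched as follows. This is an illustrative calculation only: it expands the CSS margin shorthand (one to four values) and subtracts the result from the page size. The function names expandMargin and contentArea are made up for this example, and the units are whatever the inputs use (inches or millimetres here).

```javascript
// Expand a CSS margin shorthand (1 to 4 values) into top/right/bottom/left.
//   1 value:  all sides
//   2 values: top/bottom, left/right
//   3 values: top, left/right, bottom
//   4 values: top, right, bottom, left
function expandMargin(values) {
  const [a, b = a, c = a, d = b] = values;
  return { top: a, right: b, bottom: c, left: d };
}

// Width and height left for content once the margins are subtracted.
function contentArea([width, height], marginValues) {
  const m = expandMargin(marginValues);
  return {
    width: width - m.left - m.right,
    height: height - m.top - m.bottom,
  };
}

// US Letter (8.5in x 11in) with margin: 1in
console.log(contentArea([8.5, 11], [1]));                  // { width: 6.5, height: 9 }
// A4 landscape (297mm x 210mm) with margin: 20mm 25mm
console.log(contentArea([297, 210], [20, 25]));            // { width: 247, height: 170 }
// Custom 6in x 9in with margin: 0.5in 0.75in 0.75in 0.75in
console.log(contentArea([6, 9], [0.5, 0.75, 0.75, 0.75])); // { width: 4.5, height: 7.75 }
```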

Pattern 2: Headers and Footers with Margin Boxes #

The 16 margin boxes available:

@top-left-corner    @top-left    @top-center    @top-right    @top-right-corner
@left-top           +-----------------------------------------+  @right-top
@left-middle        |                                         |  @right-middle
@left-bottom        |            Page Content Area            |  @right-bottom
@bottom-left-corner @bottom-left @bottom-center @bottom-right @bottom-right-corner

@page {
  size: letter;
  margin: 1in 0.75in;

  /* Centered header */
  @top-center {
    content: "My Document Title";
    font-size: 10pt;
    color: #666;
  }

  /* Page number in footer */
  @bottom-center {
    content: "Page " counter(page) " of " counter(pages);
    font-size: 9pt;
  }

  /* Left footer - date or author */
  @bottom-left {
    content: "December 2025";
    font-size: 9pt;
    color: #999;
  }
}

Pattern 3: First Page Different (No Header) #

/* Default for all pages */
@page {
  @top-center {
    content: "Document Title";
  }
  @bottom-center {
    content: "Page " counter(page);
  }
}

/* First page - suppress header */
@page :first {
  @top-center {
    content: none;
  }
}

Pattern 4: Multi-Column Layout #

/* Two-column content area */
.two-column {
  column-count: 2;
  column-gap: 0.5in;
  column-rule: 1px solid #ddd;  /* Optional separator line */
}

/* Three columns */
.three-column {
  column-count: 3;
  column-gap: 0.3in;
}

/* Prevent elements from breaking across columns */
.two-column p,
.two-column figure,
.two-column blockquote {
  break-inside: avoid;
}

Pattern 5: Page Break Control #

/* Force page break before chapters */
.chapter {
  break-before: page;
}

/* Force page break after title page */
.title-page {
  break-after: page;
}

/* Prevent page break inside figures/cards */
figure,
.card,
.keep-together {
  break-inside: avoid;
}

/* Keep heading with following content */
h2, h3 {
  break-after: avoid;
}

Pattern 6: Running Headers (Dynamic Content from Document) #

/* In HTML: <h2 class="chapter-title">Chapter 1: Introduction</h2> */

/* Make chapter titles "running" */
.chapter-title {
  position: running(chapterTitle);
}

/* Use in page header */
@page {
  @top-left {
    content: element(chapterTitle);
  }
}

Pattern 7: Named String Headers (Text Only) #

/* Capture text from h1 elements */
h1 {
  string-set: doctitle content();
}

/* Use captured text in header */
@page {
  @top-center {
    content: string(doctitle);
  }
}
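
Which heading ends up in the header when a page has several h1 elements, or none? With no second argument, string() uses the first keyword from the CSS Generated Content for Paged Media draft: the first value assigned on the page, or the value carried over from earlier pages when the page assigns none. The sketch below simulates that rule on plain arrays. It models the specification text rather than Paged.js internals, so edge cases in the polyfill may differ, and headersForPages is a hypothetical name.

```javascript
// pages: one entry per page, each an array of the h1 texts on that page.
// Returns the text string(doctitle) would resolve to on each page under the
// default `first` policy.
function headersForPages(pages) {
  let carried = ''; // value in effect when a page starts (the "entry value")
  return pages.map((headings) => {
    // Use the first assignment on the page, else whatever was carried in.
    const header = headings.length > 0 ? headings[0] : carried;
    // The last assignment on the page is what later pages inherit.
    if (headings.length > 0) carried = headings[headings.length - 1];
    return header;
  });
}

// Page 2 has no h1, so it repeats "Intro". Page 3 has two, so its header
// shows "Methods" while page 4 inherits "Results".
console.log(headersForPages([['Intro'], [], ['Methods', 'Results'], []]));
```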

Pattern 8: Left/Right Page Differences #

/* Left (even) pages */
@page :left {
  margin-left: 1.25in;
  margin-right: 0.75in;

  @bottom-left {
    content: counter(page);
  }
}

/* Right (odd) pages */
@page :right {
  margin-left: 0.75in;
  margin-right: 1.25in;

  @bottom-right {
    content: counter(page);
  }
}

Pattern 9: Detecting Render Completion in JavaScript #

<!-- Place this before the paged.polyfill.js script tag: the polyfill reads PagedConfig when it loads -->
<script>
window.PagedConfig = {
  auto: true,
  after: function(flow) {
    console.log('Rendering complete');
    console.log('Total pages:', flow.total);
    window.pagedRenderingComplete = true;
    window.totalPages = flow.total;
  }
};
</script>

Then in cdp:

# Wait for the flag
./cdp eval "window.pagedRenderingComplete === true"

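Alternative: Poll for page elements. Paged.js adds pages to the DOM as it lays them out, so a nonzero count shows that pagination has started, not necessarily that it has finished. Prefer the flag above when the whole document must be complete:

# Simpler - needs no PagedConfig, but can fire before the last page is laid out
./cdp eval "document.querySelectorAll('.pagedjs_page').length > 0"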
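
When polling the page count, a single nonzero reading can arrive while later pages are still being laid out. One possible hedge, sketched below, is to treat rendering as finished only after the count has stayed the same for several polls in a row. This is a heuristic and not something Paged.js provides: makeStabilityCheck and stableSamples are hypothetical names, and a very slow page could still look settled for a moment, so the after callback remains the more direct signal.

```javascript
// Returns a function to call with each polled page count. That function
// returns true once the count is positive and has been identical for
// `stableSamples` consecutive polls.
function makeStabilityCheck(stableSamples = 3) {
  let last = -1;  // previous sample
  let streak = 0; // how many polls in a row have shown the same positive count
  return function isStable(count) {
    if (count > 0 && count === last) {
      streak += 1;
    } else {
      // A changed count restarts the streak; zero pages never counts.
      streak = count > 0 ? 1 : 0;
    }
    last = count;
    return streak >= stableSamples;
  };
}

// Counts observed on successive polls: still growing, then settled at 5.
const isStable = makeStabilityCheck(3);
for (const count of [0, 2, 5, 5, 5]) {
  console.log(count, isStable(count)); // true only on the final sample
}
```

With a 500ms poll interval and three samples, this adds roughly one second after the last page appears before printing starts.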

Anti-Patterns & Pitfalls #

Don't: Print Before Paged.js Finishes #

# BAD - PDF will be empty or incomplete
./cdp navigate "file:///doc.html"
./cdp print output.pdf  # Too soon!

Why it's wrong: Paged.js runs asynchronously after page load. The print command fires before pagination completes.

Instead: Wait for Rendering #

# GOOD - Wait for pages to render
./cdp navigate "file:///doc.html"
sleep 3  # Simple but unreliable

# BETTER - Poll for completion
for i in {1..60}; do
  PAGES=$(./cdp eval "document.querySelectorAll('.pagedjs_page').length")
  [[ "$PAGES" =~ \"value\":([0-9]+) ]] && [ "${BASH_REMATCH[1]}" -gt 0 ] && break
  sleep 0.5
done
./cdp print output.pdf

Don't: Use Fixed Timeouts #

# BAD - Arbitrary wait that may be too short or wasteful
sleep 10
./cdp print output.pdf

Why it's wrong: Large documents need more time; small documents waste time waiting.

Instead: Use Condition-Based Waiting #

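# GOOD - Wait only as long as needed, but give up after 30 seconds
for i in {1..60}; do
  # Plain grep -o is used instead of grep -oP, which is GNU-only
  COUNT=$(./cdp eval "document.querySelectorAll('.pagedjs_page').length" | grep -o '"value":[0-9]*' | grep -o '[0-9]*')
  # Default to 0 so an empty result does not break the numeric test
  [ "${COUNT:-0}" -gt 0 ] && break
  sleep 0.5
done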

Don't: Expect @page CSS to Work Without Paged.js #

/* This WON'T work in browsers without Paged.js */
@page {
  @top-center {
    content: "Header";
  }
}

Why it's wrong: Browsers don't natively support margin boxes in @page rules. That's what Paged.js polyfills.

Instead: Always Include the Polyfill #

<!-- Required for @page margin boxes -->
<script src="https://unpkg.com/pagedjs/dist/paged.polyfill.js"></script>

Don't: Use Chrome's Header/Footer Options with Paged.js #

# BAD - Conflicts with Paged.js headers/footers
./cdp print output.pdf --header "<span>Header</span>" --footer "<span>Footer</span>"

Why it's wrong: Chrome's built-in header/footer options will overlay on top of Paged.js margin boxes.

Instead: Let Paged.js Handle Headers/Footers #

# GOOD - Use default print settings, Paged.js CSS handles headers
./cdp print output.pdf --paper letter

Don't: Break Inside Critical Elements #

/* BAD - Figures may split awkwardly across pages */
figure {
  /* no break-inside rule */
}

Why it's wrong: Content can break mid-figure, mid-table, or mid-card.

Instead: Prevent Breaking Inside #

/* GOOD - Keep figures together */
figure,
table,
.card,
blockquote {
  break-inside: avoid;
}

Caveats #

Complete Example HTML #

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Professional Document</title>

  <!-- Define PagedConfig before loading the polyfill, which reads it when it loads -->
  <script>
    window.PagedConfig = {
      after: (flow) => {
        window.pagedRenderingComplete = true;
        console.log('Rendered ' + flow.total + ' pages');
      }
    };
  </script>
  <script src="https://unpkg.com/pagedjs/dist/paged.polyfill.js"></script>

  <style>
    @page {
      size: letter;
      margin: 1in 0.75in;

      @top-center {
        content: "Document Title";
        font-size: 10pt;
        color: #666;
      }

      @bottom-center {
        content: "Page " counter(page) " of " counter(pages);
        font-size: 10pt;
      }
    }

    @page :first {
      @top-center { content: none; }
    }

    body {
      font-family: Georgia, serif;
      font-size: 12pt;
      line-height: 1.6;
    }

    .title-page {
      text-align: center;
      padding-top: 3in;
      break-after: page;
    }

    .chapter {
      break-before: page;
    }

    .two-column {
      column-count: 2;
      column-gap: 0.5in;
      column-rule: 1px solid #ddd;
    }

    p { break-inside: avoid; }
    h2 { break-after: avoid; }
  </style>
</head>
<body>
  <div class="title-page">
    <h1>Document Title</h1>
    <p>Subtitle or Author</p>
  </div>

  <div class="chapter">
    <h2>Chapter 1</h2>
    <div class="two-column">
      <p>Your content here...</p>
    </div>
  </div>
</body>
</html>

References #

[1] Paged.js GitHub Repository - Main repository with installation instructions and API documentation

[2] Paged.js Chrome Headless Issue #183 - Discussion of rendering timing issues with headless Chrome

[3] Paged.js Handlers and Hooks Documentation - Reference for lifecycle hooks (beforePreview, afterPageLayout, etc.)

[4] CSS Paged Media Module - MDN - Reference for the W3C specification that Paged.js polyfills

[5] Mastering Paged.js - Doppio.sh - Practical tips for PDF generation

[6] CSS Multi-column Layout - MDN - Reference for column-count, column-gap, and fragmentation

[7] Paged Media Organization - Overview of margin boxes and running elements
