Summary #
The Chrome DevTools Protocol (CDP) is the low-level protocol that powers Chrome DevTools and serves as the foundation for libraries like Puppeteer and Playwright[1]. It communicates over WebSocket, allowing direct control of Chromium-based browsers through JSON-RPC messages[2]. While higher-level libraries provide convenience, using CDP directly gives you complete control with zero abstraction overhead.
For your use case (general automation, single browser, Chrome-only), CDP direct is a viable choice. It's particularly interesting because it can be driven from bash scripts using tools like websocat[3], making it accessible without any Node.js dependencies. The protocol is well-documented and stable for common operations like navigation, screenshots, and PDF generation[4].
Key metrics: CDP is the official protocol maintained by the Chrome DevTools team at Google. It's used internally by Puppeteer (90.3k GitHub stars, 6M+ weekly npm downloads) and Playwright (77k+ GitHub stars, 22M+ weekly npm downloads)[5][6].
Philosophy & Mental Model #
CDP operates on a domains and commands model[1]:
- Domains: Logical groupings of functionality (Page, DOM, Runtime, Network, etc.)
- Commands: Methods you call that return results (synchronous request/response)
- Events: Notifications pushed from the browser (asynchronous)
The mental model is simple: you're sending JSON messages over a WebSocket and receiving JSON responses. Every command has:
id: A unique integer you choose (responses echo this back)method: The domain and command (e.g.,Page.navigate)params: Optional parameters object
Browser <--WebSocket--> Your Script
JSON-RPC messages
Key abstractions:
- Browser: The top-level Chrome process, has its own WebSocket endpoint
- Targets: Things you can debug (pages, workers, service workers)
- Sessions: Connections to specific targets (each page gets a session)
Setup #
Option 1: Bash with websocat (Zero Dependencies) #
Install websocat (WebSocket client for command line):
1# macOS
2brew install websocat
3
4# Linux (download binary)
5wget https://github.com/vi/websocat/releases/download/v1.13.0/websocat.x86_64-unknown-linux-musl
6chmod +x websocat.x86_64-unknown-linux-musl
7sudo mv websocat.x86_64-unknown-linux-musl /usr/local/bin/websocat
8
9# Or with cargo
10cargo install websocat
You also need jq for JSON parsing:
1# macOS
2brew install jq
3
4# Linux
5apt install jq
Option 2: Node.js/TypeScript #
No special libraries needed—just the built-in ws package:
1pnpm add ws
2pnpm add -D @types/ws
Launch Chrome with Remote Debugging #
1# macOS
2"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
3 --remote-debugging-port=9222 \
4 --user-data-dir=/tmp/chrome-debug
5
6# Linux
7google-chrome --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug
8
9# Headless mode
10google-chrome --headless --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug
11
12# New headless (Chrome 112+) - recommended
13google-chrome --headless=new --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-debug
Core Usage Patterns #
Pattern 1: Bash - Get WebSocket URL and Navigate #
1#!/bin/bash
2set -e
3
4# Start Chrome headless (background)
5google-chrome --headless=new --remote-debugging-port=9222 --user-data-dir=/tmp/chrome-cdp &
6CHROME_PID=$!
7sleep 2 # Wait for Chrome to start
8
9# Get the WebSocket URL for a new page
10PAGE_INFO=$(curl -s http://127.0.0.1:9222/json/new)
11WS_URL=$(echo "$PAGE_INFO" | jq -r '.webSocketDebuggerUrl')
12echo "WebSocket URL: $WS_URL"
13
14# Navigate to a page
15echo '{"id":1,"method":"Page.navigate","params":{"url":"https://example.com"}}' | \
16 websocat -n1 "$WS_URL"
17
18# Wait for page to load
19sleep 2
20
21# Clean up
22kill $CHROME_PID
Pattern 2: Bash - Take a Screenshot #
1#!/bin/bash
2set -e
3
4WS_URL="$1" # Pass WebSocket URL as argument
5
6# Capture screenshot (returns base64)
7RESPONSE=$(echo '{"id":1,"method":"Page.captureScreenshot","params":{"format":"png"}}' | \
8 websocat -n1 "$WS_URL")
9
10# Extract base64 data and decode to file
11echo "$RESPONSE" | jq -r '.result.data' | base64 -d > screenshot.png
12
13echo "Screenshot saved to screenshot.png"
Pattern 3: Bash - Get Page HTML #
1#!/bin/bash
2set -e
3
4WS_URL="$1"
5
6# Get the document root node
7DOC_RESPONSE=$(echo '{"id":1,"method":"DOM.getDocument","params":{"depth":-1}}' | \
8 websocat -n1 "$WS_URL")
9
10ROOT_NODE_ID=$(echo "$DOC_RESPONSE" | jq -r '.result.root.nodeId')
11
12# Get outer HTML of the entire document
13HTML_RESPONSE=$(echo "{\"id\":2,\"method\":\"DOM.getOuterHTML\",\"params\":{\"nodeId\":$ROOT_NODE_ID}}" | \
14 websocat -n1 "$WS_URL")
15
16echo "$HTML_RESPONSE" | jq -r '.result.outerHTML'
Pattern 4: Bash - Run JavaScript (evaluate) #
1#!/bin/bash
2set -e
3
4WS_URL="$1"
5JS_EXPRESSION="$2"
6
7# Enable Runtime domain first (required for evaluate)
8echo '{"id":1,"method":"Runtime.enable"}' | websocat -n1 "$WS_URL" > /dev/null
9
10# Run JavaScript expression
11RESPONSE=$(echo "{\"id\":2,\"method\":\"Runtime.evaluate\",\"params\":{\"expression\":\"$JS_EXPRESSION\",\"returnByValue\":true}}" | \
12 websocat -n1 "$WS_URL")
13
14echo "$RESPONSE" | jq '.result.result.value'
Usage:
1./evaluate.sh "$WS_URL" "document.title"
2./evaluate.sh "$WS_URL" "document.querySelectorAll('a').length"
3./evaluate.sh "$WS_URL" "Array.from(document.querySelectorAll('h1')).map(h => h.textContent)"
Pattern 5: Bash - Generate PDF #
1#!/bin/bash
2set -e
3
4WS_URL="$1"
5OUTPUT="${2:-output.pdf}"
6
7# Generate PDF (returns base64)
8RESPONSE=$(echo '{"id":1,"method":"Page.printToPDF","params":{"printBackground":true,"format":"A4"}}' | \
9 websocat -n1 "$WS_URL")
10
11# Decode and save
12echo "$RESPONSE" | jq -r '.result.data' | base64 -d > "$OUTPUT"
13
14echo "PDF saved to $OUTPUT"
Pattern 6: Bash - Query Selectors #
1#!/bin/bash
2set -e
3
4WS_URL="$1"
5SELECTOR="$2"
6
7# Get document first
8DOC=$(echo '{"id":1,"method":"DOM.getDocument"}' | websocat -n1 "$WS_URL")
9ROOT_ID=$(echo "$DOC" | jq -r '.result.root.nodeId')
10
11# Query selector
12QUERY=$(echo "{\"id\":2,\"method\":\"DOM.querySelector\",\"params\":{\"nodeId\":$ROOT_ID,\"selector\":\"$SELECTOR\"}}" | \
13 websocat -n1 "$WS_URL")
14
15NODE_ID=$(echo "$QUERY" | jq -r '.result.nodeId')
16
17if [ "$NODE_ID" = "0" ] || [ "$NODE_ID" = "null" ]; then
18 echo "Element not found"
19 exit 1
20fi
21
22# Get the HTML of the matched element
23HTML=$(echo "{\"id\":3,\"method\":\"DOM.getOuterHTML\",\"params\":{\"nodeId\":$NODE_ID}}" | \
24 websocat -n1 "$WS_URL")
25
26echo "$HTML" | jq -r '.result.outerHTML'
Pattern 7: TypeScript - Full Example #
1import WebSocket from 'ws';
2
3interface CDPResponse {
4 id: number;
5 result?: Record<string, unknown>;
6 error?: { code: number; message: string };
7}
8
9class CDPClient {
10 private ws: WebSocket;
11 private messageId = 0;
12 private pending = new Map<number, {
13 resolve: (value: CDPResponse) => void;
14 reject: (error: Error) => void;
15 }>();
16
17 private constructor(ws: WebSocket) {
18 this.ws = ws;
19 this.ws.on('message', (data) => {
20 const msg = JSON.parse(data.toString()) as CDPResponse;
21 if (msg.id !== undefined) {
22 const handler = this.pending.get(msg.id);
23 if (handler) {
24 this.pending.delete(msg.id);
25 if (msg.error) {
26 handler.reject(new Error(msg.error.message));
27 } else {
28 handler.resolve(msg);
29 }
30 }
31 }
32 });
33 }
34
35 static async connect(wsUrl: string): Promise<CDPClient> {
36 const ws = new WebSocket(wsUrl);
37 await new Promise<void>((resolve, reject) => {
38 ws.once('open', resolve);
39 ws.once('error', reject);
40 });
41 return new CDPClient(ws);
42 }
43
44 async send(method: string, params: Record<string, unknown> = {}): Promise<CDPResponse> {
45 const id = ++this.messageId;
46 return new Promise((resolve, reject) => {
47 this.pending.set(id, { resolve, reject });
48 this.ws.send(JSON.stringify({ id, method, params }));
49 });
50 }
51
52 close() {
53 this.ws.close();
54 }
55}
56
57// Usage
58async function main() {
59 // Get WebSocket URL from Chrome's JSON endpoint
60 const response = await fetch('http://127.0.0.1:9222/json/new');
61 const { webSocketDebuggerUrl } = await response.json();
62
63 const client = await CDPClient.connect(webSocketDebuggerUrl);
64
65 // Navigate
66 await client.send('Page.navigate', { url: 'https://example.com' });
67 await new Promise(r => setTimeout(r, 2000)); // Wait for load
68
69 // Take screenshot
70 const screenshot = await client.send('Page.captureScreenshot', { format: 'png' });
71 const imageData = Buffer.from(screenshot.result!.data as string, 'base64');
72 await Bun.write('screenshot.png', imageData); // or fs.writeFileSync
73
74 // Get HTML via evaluate (simpler than DOM methods)
75 const html = await client.send('Runtime.evaluate', {
76 expression: 'document.documentElement.outerHTML',
77 returnByValue: true
78 });
79 console.log('HTML length:', (html.result!.result as any).value.length);
80
81 // Run any JavaScript
82 const title = await client.send('Runtime.evaluate', {
83 expression: 'document.title',
84 returnByValue: true
85 });
86 console.log('Title:', (title.result!.result as any).value);
87
88 // Generate PDF
89 const pdf = await client.send('Page.printToPDF', {
90 printBackground: true,
91 format: 'A4'
92 });
93 const pdfData = Buffer.from(pdf.result!.data as string, 'base64');
94 await Bun.write('page.pdf', pdfData);
95
96 client.close();
97}
98
99main().catch(console.error);
Pattern 8: Complete Bash Automation Script #
1#!/bin/bash
2# chrome-automate.sh - Complete browser automation in bash
3set -e
4
5CHROME_PORT=9222
6CHROME_PID=""
7WS_URL=""
8
9# Start Chrome
10start_chrome() {
11 local headless="${1:-true}"
12 local url="${2:-about:blank}"
13
14 local headless_flag=""
15 if [ "$headless" = "true" ]; then
16 headless_flag="--headless=new"
17 fi
18
19 google-chrome $headless_flag \
20 --remote-debugging-port=$CHROME_PORT \
21 --user-data-dir=/tmp/chrome-cdp-$$ \
22 --no-first-run \
23 --no-default-browser-check \
24 "$url" &
25 CHROME_PID=$!
26
27 # Wait for CDP to be ready
28 for i in {1..30}; do
29 if curl -s "http://127.0.0.1:$CHROME_PORT/json/version" > /dev/null 2>&1; then
30 break
31 fi
32 sleep 0.1
33 done
34}
35
36# Get WebSocket URL for first page
37get_page_ws() {
38 curl -s "http://127.0.0.1:$CHROME_PORT/json/list" | jq -r '.[0].webSocketDebuggerUrl'
39}
40
41# Create new page and get its WebSocket URL
42new_page() {
43 curl -s "http://127.0.0.1:$CHROME_PORT/json/new" | jq -r '.webSocketDebuggerUrl'
44}
45
46# Send CDP command
47cdp() {
48 local ws_url="$1"
49 local method="$2"
50 local params="${3:-{}}"
51
52 local id=$RANDOM
53 echo "{\"id\":$id,\"method\":\"$method\",\"params\":$params}" | websocat -n1 "$ws_url"
54}
55
56# Navigate to URL
57navigate() {
58 local ws_url="$1"
59 local url="$2"
60 cdp "$ws_url" "Page.navigate" "{\"url\":\"$url\"}"
61}
62
63# Wait for page load (simple version)
64wait_load() {
65 sleep "${1:-2}"
66}
67
68# Take screenshot
69screenshot() {
70 local ws_url="$1"
71 local output="${2:-screenshot.png}"
72 local format="${output##*.}"
73
74 cdp "$ws_url" "Page.captureScreenshot" "{\"format\":\"$format\"}" | \
75 jq -r '.result.data' | base64 -d > "$output"
76}
77
78# Generate PDF
79pdf() {
80 local ws_url="$1"
81 local output="${2:-page.pdf}"
82
83 cdp "$ws_url" "Page.printToPDF" '{"printBackground":true}' | \
84 jq -r '.result.data' | base64 -d > "$output"
85}
86
87# Get page HTML
88get_html() {
89 local ws_url="$1"
90 cdp "$ws_url" "Runtime.evaluate" '{"expression":"document.documentElement.outerHTML","returnByValue":true}' | \
91 jq -r '.result.result.value'
92}
93
94# Run JavaScript
95evaluate() {
96 local ws_url="$1"
97 local expression="$2"
98 cdp "$ws_url" "Runtime.evaluate" "{\"expression\":$(echo "$expression" | jq -Rs .),\"returnByValue\":true}" | \
99 jq -r '.result.result.value'
100}
101
102# Query selector and get text
103query_text() {
104 local ws_url="$1"
105 local selector="$2"
106 evaluate "$ws_url" "document.querySelector('$selector')?.textContent"
107}
108
109# Close browser
110close_chrome() {
111 if [ -n "$CHROME_PID" ]; then
112 kill "$CHROME_PID" 2>/dev/null || true
113 fi
114}
115
116# Cleanup on exit
117trap close_chrome EXIT
118
119# ============ EXAMPLE USAGE ============
120if [ "${BASH_SOURCE[0]}" = "$0" ]; then
121 echo "Starting Chrome..."
122 start_chrome true
123
124 WS_URL=$(get_page_ws)
125 echo "WebSocket: $WS_URL"
126
127 echo "Navigating to example.com..."
128 navigate "$WS_URL" "https://example.com"
129 wait_load 2
130
131 echo "Page title: $(evaluate "$WS_URL" 'document.title')"
132
133 echo "Taking screenshot..."
134 screenshot "$WS_URL" "example.png"
135
136 echo "Generating PDF..."
137 pdf "$WS_URL" "example.pdf"
138
139 echo "Getting HTML..."
140 get_html "$WS_URL" > example.html
141
142 echo "Done! Created: example.png, example.pdf, example.html"
143fi
Anti-Patterns & Pitfalls #
Don't: Forget to Wait for Navigation #
1# BAD - screenshot might capture blank page
2navigate "$WS_URL" "https://example.com"
3screenshot "$WS_URL" "bad.png"
Why it's wrong: Page.navigate returns immediately when navigation starts, not when the page is loaded.
Instead: Wait for Load Event or Use Timeout #
1# GOOD - wait for page to load
2navigate "$WS_URL" "https://example.com"
3sleep 2 # Simple but effective
4screenshot "$WS_URL" "good.png"
5
6# BETTER - wait for specific condition via evaluate
7while [ "$(evaluate "$WS_URL" 'document.readyState')" != "complete" ]; do
8 sleep 0.1
9done
Don't: Use DOM Methods for Simple HTML Extraction #
1# BAD - complex chain of commands
2cdp "$WS_URL" "DOM.getDocument" '{"depth":-1}'
3# Then extract nodeId, then getOuterHTML...
Why it's wrong: DOM methods require multiple round trips and node ID management.
Instead: Use Runtime.evaluate #
1# GOOD - single command
2evaluate "$WS_URL" "document.documentElement.outerHTML"
3evaluate "$WS_URL" "document.querySelector('h1').textContent"
Don't: Hardcode Message IDs in Production #
1# BAD - will break with concurrent requests
2echo '{"id":1,"method":"Page.navigate"...}'
3echo '{"id":1,"method":"Page.captureScreenshot"...}' # Same ID!
Why it's wrong: If you send concurrent requests with the same ID, you can't match responses.
Instead: Generate Unique IDs #
1# GOOD
2cdp() {
3 local id=$RANDOM # Or use incrementing counter
4 echo "{\"id\":$id,\"method\":\"$method\"...}"
5}
Don't: Forget Error Handling #
1# BAD - ignores errors
2RESPONSE=$(cdp "$WS_URL" "Page.navigate" '{"url":"invalid"}')
3echo "Success!"
Instead: Check for Errors #
1# GOOD
2RESPONSE=$(cdp "$WS_URL" "Page.navigate" '{"url":"https://example.com"}')
3ERROR=$(echo "$RESPONSE" | jq -r '.error.message // empty')
4if [ -n "$ERROR" ]; then
5 echo "Error: $ERROR" >&2
6 exit 1
7fi
Don't: Leave Chrome Processes Running #
1# BAD - orphaned Chrome process
2start_chrome
3navigate "$WS_URL" "https://example.com"
4# Script ends without cleanup
Instead: Use Trap for Cleanup #
1# GOOD
2trap close_chrome EXIT
3start_chrome
4# ... do work ...
5# Chrome is automatically killed on exit
Caveats #
-
No Auto-Wait: Unlike Puppeteer/Playwright, CDP doesn't wait for elements. You must implement waiting logic yourself (polling
Runtime.evaluateor listening for events). -
Protocol Stability: The "tot" (tip-of-tree) protocol can change. For stability, reference a specific Chrome version's protocol or stick to well-established commands like
Page.navigate,Page.captureScreenshot, andRuntime.evaluate[4]. -
Headless Mode: Chrome's new headless mode (
--headless=new, Chrome 112+) behaves identically to headed Chrome. The old headless (--headlessor--headless=old) is a separate implementation with some differences[7]. -
Session Management: For complex scenarios with multiple tabs, you need to manage sessions explicitly. Each
Target.attachToTargetcreates a session with its own message ID namespace[2]. -
Event Handling in Bash: The bash approach works well for sequential scripts but handling async events (like network requests) is awkward. For event-driven automation, use Node.js/TypeScript.
-
PDF Generation:
Page.printToPDFonly works in headless mode. In headed mode, it will fail[4]. -
Base64 Size: Screenshots and PDFs are returned as base64, which is ~33% larger than binary. For large pages, responses can be several MB of JSON.
References #
[1] Chrome DevTools Protocol - Official protocol documentation
[2] Getting Started With Chrome DevTools Protocol - Excellent tutorial by Puppeteer maintainer
[3] Chrome DevTools Remote Control Using Linux Bash - Bash + websocat examples
[4] CDP Page Domain - Navigate, screenshot, PDF commands
[5] Puppeteer npm - npm package stats
[6] Playwright npm trends - npm download stats
[7] Chrome Headless Mode - Official headless documentation
[8] websocat GitHub - WebSocket CLI tool
[9] CDP Runtime Domain - JavaScript evaluation
[10] CDP DOM Domain - DOM manipulation commands