[{"data":1,"prerenderedAt":3623},["ShallowReactive",2],{"/blog/agent-browser-vs-puppeteer-and-playwright":3,"related-/blog/agent-browser-vs-puppeteer-and-playwright":1254},{"id":4,"title":5,"authorId":6,"body":7,"category":1210,"created":1211,"description":1212,"extension":1213,"faqs":1214,"featurePriority":1230,"head":1231,"landingPath":1231,"meta":1232,"navigation":751,"ogImage":1231,"path":1246,"robots":1231,"schemaOrg":1231,"seo":1247,"sitemap":1248,"stem":1249,"tags":1250,"__hash__":1253},"blog/blog/1038.agent-browser-vs-puppeteer-and-playwright.md","Agent Browser vs Puppeteer & Playwright","salome-koshadze",{"type":8,"value":9,"toc":1196},"minimark",[10,22,25,54,72,75,84,87,92,95,113,127,130,135,139,151,154,195,273,276,298,310,314,317,321,325,336,339,342,559,562,566,578,587,607,616,835,838,842,852,869,871,930,934,937,940,944,948,951,1079,1091,1094,1098,1102,1105,1109,1121,1124,1128,1132,1135,1179,1182,1186,1192],[11,12,13,14,18,19],"p",{},"Choosing between ",[15,16,17],"strong",{},"agent-browser, Puppeteer, and Playwright"," comes down to one question: ",[15,20,21],{},"do you need deterministic browser automation, or an AI agent that must interpret and adapt to live web pages?",[11,23,24],{},"If you want the fastest decision:",[26,27,28,38,46],"ul",{},[29,30,31,34,35],"li",{},[15,32,33],{},"Best overall for most teams:"," ",[15,36,37],{},"Playwright",[29,39,40,34,43],{},[15,41,42],{},"Best for Chromium-only scripting:",[15,44,45],{},"Puppeteer",[29,47,48,34,51],{},[15,49,50],{},"Best for LLM-driven agent workflows:",[15,52,53],{},"Vercel agent-browser",[11,55,56,57,59,60,62,63,65,66,71],{},"Use ",[15,58,37],{}," for testing, cross-browser automation, and repeatable multi-step workflows. Use ",[15,61,45],{}," for lighter Chromium automation with direct CDP-oriented control. Use ",[15,64,53],{}," when compact page state, persistent sessions, and ref-based actions matter more than classic test-runner ergonomics. For a deeper two-tool breakdown, see ",[67,68,70],"a",{"href":69},"/blog/playwright-vs-puppeteer-which-is-better-for-ai-agent-control","Playwright vs Puppeteer for AI Agents",".",[11,73,74],{},"At a glance, the three tools split along two axes - how much low-level control you get vs. how easy it is to set up and operate:",[76,77],"nuxt-picture",{":width":78,"alt":79,"format":80,"loading":81,"src":82,"provider":83},"900","Quadrant chart positioning Puppeteer and Playwright in the developer zone and agent-browser in the business-user zone","webp","eager","/blog/agent-browser-vs-puppeteer-and-playwright/1.svg","none",[11,85,86],{},"The quadrant makes the tradeoff concrete: Puppeteer and Playwright give you deeper control at the cost of more setup, while agent-browser trades some of that control for a managed, LLM-friendly workflow. The rest of this guide works through that split across browser support, architecture, developer ergonomics, LLM-readiness, and real-world tradeoffs, with a small token-count benchmark to make the page-state discussion concrete.",[88,89,91],"h2",{"id":90},"what-changed-in-2026","What changed in 2026",[11,93,94],{},"The comparison is more interesting in 2026 than it was a year ago because the tools are converging in some areas and diverging in others:",[26,96,97,102,107],{},[29,98,99,101],{},[15,100,37],{}," is still the strongest default for engineering teams because it combines cross-browser coverage, mature debugging, and increasingly useful structured page-state features.",[29,103,104,106],{},[15,105,45],{}," remains relevant because many automation workloads are still Chromium-first, and direct CDP-oriented control is often enough.",[29,108,109,112],{},[15,110,111],{},"agent-browser"," matters because more teams are building agent loops instead of fixed scripts, which changes the value of compact snapshots, persistent sessions, and ref-based actions.",[11,114,115,116,119,120,123,124,71],{},"The decision now comes down to what you're optimising for: ",[15,117,118],{},"test reliability",", ",[15,121,122],{},"Chromium scripting simplicity",", or ",[15,125,126],{},"LLM-driven adaptability",[11,128,129],{},"Zooming out, each of these tools reflects a distinct era in how the web has been automated - from raw Chromium scripting to cross-browser testing to the LLM-driven agents showing up now:",[76,131],{":width":78,"alt":132,"format":80,"loading":133,"src":134,"provider":83},"Timeline of web automation evolution: Puppeteer (2017), Playwright (2020), LLM surge (2022), anti-bot peak (2024), agent-browser (2026)","lazy","/blog/agent-browser-vs-puppeteer-and-playwright/6.svg",[88,136,138],{"id":137},"token-benchmark-same-page-three-output-formats","Token benchmark: same page, three output formats",[11,140,141,142,146,147,150],{},"To ground the token-efficiency discussion in something reproducible, I measured the representative contact-form example used later in this article with the ",[143,144,145],"code",{},"cl100k_base"," tokenizer family commonly used for GPT-4-class workflows. This is a ",[15,148,149],{},"small serialization benchmark",": it measures how much page state each format produces on one page. Navigation speed, task success rate, and anti-bot outcomes are out of scope.",[11,152,153],{},"The benchmark uses each tool's documented output style:",[26,155,156,170,182],{},[29,157,158,160,161,71],{},[15,159,45],{},": ",[67,162,166,169],{"href":163,"rel":164},"https://pptr.dev/api/puppeteer.page.content",[165],"nofollow",[143,167,168],{},"page.content()"," returns the full HTML contents of the page, including the DOCTYPE",[29,171,172,160,174,71],{},[15,173,37],{},[67,175,178,181],{"href":176,"rel":177},"https://playwright.dev/docs/api/class-locator#locator-aria-snapshot",[165],[143,179,180],{},"locator.ariaSnapshot()"," returns a YAML ARIA snapshot, and AI mode can include element refs",[29,183,184,186,187,71],{},[15,185,53],{},": the official repository documents ",[67,188,191,194],{"href":189,"rel":190},"https://github.com/vercel-labs/agent-browser",[165],[143,192,193],{},"snapshot"," as an accessibility tree with refs and a daemon-backed workflow",[196,197,198,219],"table",{},[199,200,201],"thead",{},[202,203,204,209,213,216],"tr",{},[205,206,208],"th",{"align":207},"left","Output format",[205,210,212],{"align":211},"right","Measured characters",[205,214,215],{"align":211},"Measured tokens",[205,217,218],{"align":207},"What it shows",[220,221,222,239,256],"tbody",{},[202,223,224,230,233,236],{},[225,226,227,228],"td",{"align":207},"Puppeteer ",[143,229,168],{},[225,231,232],{"align":211},"927",[225,234,235],{"align":211},"242",[225,237,238],{"align":207},"Richest raw output, but also the heaviest because it serializes page markup rather than an agent-oriented view",[202,240,241,247,250,253],{},[225,242,243,244],{"align":207},"Playwright ",[143,245,246],{},"ariaSnapshot()",[225,248,249],{"align":211},"184",[225,251,252],{"align":211},"50",[225,254,255],{"align":207},"Much smaller structured page state; good fit when you want an accessibility-first snapshot inside a broader automation stack",[202,257,258,264,267,270],{},[225,259,260,261],{"align":207},"agent-browser ",[143,262,263],{},"snapshot -i",[225,265,266],{"align":211},"149",[225,268,269],{"align":211},"49",[225,271,272],{"align":207},"Similar compactness on this simple page, with direct refs for follow-up actions",[11,274,275],{},"What this benchmark supports:",[26,277,278,285,295],{},[29,279,280,281,284],{},"On this controlled example, ",[15,282,283],{},"raw HTML was about 4.8x heavier than Playwright's snapshot and about 4.9x heavier than agent-browser's snapshot"," by token count.",[29,286,287,290,291,294],{},[15,288,289],{},"Playwright and agent-browser were nearly identical on this small form",", so the more important difference here is workflow shape: Playwright gives you structured page state inside a testing/automation library, while agent-browser pairs compact snapshots with direct ",[143,292,293],{},"@ref"," actions and persistent sessions.",[29,296,297],{},"The gap can widen or narrow on real sites depending on hidden DOM size, framework markup, iframes, and whether you snapshot the whole page or only interactive elements.",[11,299,300,301,305,306,71],{},"Treat this as a single data point: on this page, raw HTML was the heaviest format to send to an LLM, and structured accessibility snapshots were far more compact. Real sites will vary. For the broader theory behind compact page-state representations, see ",[67,302,304],{"href":303},"/blog/snapshots-provide-llms-with-website-state","Snapshots Provide LLMs With Website State"," and ",[67,307,309],{"href":308},"/blog/dom-downsampling-for-llm-based-web-agents","DOM Downsampling for LLM-Based Web Agents",[88,311,313],{"id":312},"the-3-tools","The 3 tools",[11,315,316],{},"The three tools solve related problems, but they are optimized for different workflow shapes.",[318,319,45],"h3",{"id":320},"puppeteer",[76,322],{":width":78,"alt":323,"format":80,"loading":133,"src":324},"Puppeteer GitHub repository","/blog/agent-browser-vs-puppeteer-and-playwright/puppeteer-github.png",[11,326,327,328,330,331,71],{},"Google launched ",[15,329,45],{}," in 2017 as a Node.js library designed for controlling Chrome and Chromium browsers. It operates through the Chrome DevTools Protocol (CDP), which gives developers low-level access to browser behavior. In practice it remains mostly associated with Chromium automation, though Firefox support also exists. Its API is exclusive to JavaScript and TypeScript development environments, as reflected in the ",[67,332,335],{"href":333,"rel":334},"https://pptr.dev/",[165],"official Puppeteer docs",[11,337,338],{},"The tool has a sizable community, with roughly 94,000 GitHub stars as of April 2026. Puppeteer is widely used for web scraping, PDF generation, screenshots, and repeatable Chromium-first automation routines. Its main strengths are maturity and direct browser control, especially when you do not need cross-browser coverage.",[11,340,341],{},"Example:",[343,344,349],"pre",{"className":345,"code":346,"language":347,"meta":348,"style":348},"language-javascript shiki shiki-themes catppuccin-latte night-owl","const puppeteer = require(\"puppeteer\");\n(async () => {\n  const browser = await puppeteer.launch();\n  const page = await browser.newPage();\n  await page.goto(\"https://example.com\");\n  const html = await page.content();\n  console.log(html);\n  await browser.close();\n})();\n","javascript","",[143,350,351,392,411,439,462,488,511,532,548],{"__ignoreMap":348},[352,353,356,360,364,368,372,376,380,383,385,388],"span",{"class":354,"line":355},"line",1,[352,357,359],{"class":358},"s76yb","const",[352,361,363],{"class":362},"scsc5"," puppeteer",[352,365,367],{"class":366},"s-_ek"," =",[352,369,371],{"class":370},"sNstc"," require",[352,373,375],{"class":374},"s2kId","(",[352,377,379],{"class":378},"sbuKk","\"",[352,381,320],{"class":382},"sfrMT",[352,384,379],{"class":378},[352,386,387],{"class":374},")",[352,389,391],{"class":390},"scGhl",";\n",[352,393,395,397,401,405,408],{"class":354,"line":394},2,[352,396,375],{"class":374},[352,398,400],{"class":399},"srhcd","async",[352,402,404],{"class":403},"sMtgK"," ()",[352,406,407],{"class":358}," =>",[352,409,410],{"class":390}," {\n",[352,412,414,417,420,422,425,428,431,434,437],{"class":354,"line":413},3,[352,415,416],{"class":358},"  const",[352,418,419],{"class":362}," browser",[352,421,367],{"class":366},[352,423,424],{"class":399}," await",[352,426,363],{"class":427},"sP4PM",[352,429,71],{"class":430},"s5FwJ",[352,432,433],{"class":370},"launch",[352,435,436],{"class":374},"()",[352,438,391],{"class":390},[352,440,442,444,447,449,451,453,455,458,460],{"class":354,"line":441},4,[352,443,416],{"class":358},[352,445,446],{"class":362}," page",[352,448,367],{"class":366},[352,450,424],{"class":399},[352,452,419],{"class":427},[352,454,71],{"class":430},[352,456,457],{"class":370},"newPage",[352,459,436],{"class":374},[352,461,391],{"class":390},[352,463,465,468,470,472,475,477,479,482,484,486],{"class":354,"line":464},5,[352,466,467],{"class":399},"  await",[352,469,446],{"class":427},[352,471,71],{"class":430},[352,473,474],{"class":370},"goto",[352,476,375],{"class":374},[352,478,379],{"class":378},[352,480,481],{"class":382},"https://example.com",[352,483,379],{"class":378},[352,485,387],{"class":374},[352,487,391],{"class":390},[352,489,491,493,496,498,500,502,504,507,509],{"class":354,"line":490},6,[352,492,416],{"class":358},[352,494,495],{"class":362}," html",[352,497,367],{"class":366},[352,499,424],{"class":399},[352,501,446],{"class":427},[352,503,71],{"class":430},[352,505,506],{"class":370},"content",[352,508,436],{"class":374},[352,510,391],{"class":390},[352,512,514,517,519,522,524,528,530],{"class":354,"line":513},7,[352,515,516],{"class":427},"  console",[352,518,71],{"class":430},[352,520,521],{"class":370},"log",[352,523,375],{"class":374},[352,525,527],{"class":526},"soAP-","html",[352,529,387],{"class":374},[352,531,391],{"class":390},[352,533,535,537,539,541,544,546],{"class":354,"line":534},8,[352,536,467],{"class":399},[352,538,419],{"class":427},[352,540,71],{"class":430},[352,542,543],{"class":370},"close",[352,545,436],{"class":374},[352,547,391],{"class":390},[352,549,551,554,557],{"class":354,"line":550},9,[352,552,553],{"class":390},"}",[352,555,556],{"class":374},")()",[352,558,391],{"class":390},[318,560,37],{"id":561},"playwright",[76,563],{":width":78,"alt":564,"format":80,"loading":133,"src":565},"Playwright GitHub repository","/blog/agent-browser-vs-puppeteer-and-playwright/playwright-github.png",[11,567,568,569,571,572,577],{},"Microsoft introduced ",[15,570,37],{}," in 2020 to provide a unified API for browser automation and testing across multiple browser engines. Its ",[67,573,576],{"href":574,"rel":575},"https://playwright.dev/",[165],"official docs"," support Chromium, Firefox, and WebKit, allowing developers to run similar workflows across the major browser families. Playwright also extends beyond JavaScript and TypeScript to Python, Java, and .NET.",[579,580],"article-cheatsheet-card",{"description":581,"href":582,"image":583,"imageAlt":584,"label":585,"title":586},"Quick reference for Playwright primitives, locators, auto-waiting, tracing, and browser contexts.","/playwright-cheat-sheet","/misc/playwright-cheatsheet.png","Playwright Cheat Sheet preview","Cheat Sheet","Playwright Cheat Sheet",[11,588,589,590,119,595,600,601,606],{},"Playwright has grown quickly, with about 85,000 GitHub stars as of April 2026. Its major capabilities include ",[67,591,594],{"href":592,"rel":593},"https://playwright.dev/docs/actionability",[165],"auto-waiting and actionability checks",[67,596,599],{"href":597,"rel":598},"https://playwright.dev/docs/browser-contexts",[165],"browser contexts"," for session isolation, ",[67,602,605],{"href":603,"rel":604},"https://playwright.dev/docs/trace-viewer",[165],"tracing",", a built-in test runner, and extensive device emulation. That combination makes it a common default for end-to-end testing, complex browser workflows, and cross-browser automation.",[11,608,609,610,612,613,615],{},"For AI-oriented workflows, ",[143,611,246],{}," is usually a better fit than ",[143,614,168],{}," because it returns a compact accessibility tree instead of raw HTML.",[343,617,619],{"className":345,"code":618,"language":347,"meta":348,"style":348},"const { chromium } = require(\"playwright\");\n(async () => {\n  const browser = await chromium.launch();\n  const page = await browser.newPage();\n  await page.goto(\"https://example.com\", { waitUntil: \"networkidle\" });\n\n  const snapshot = await page.locator(\"body\").ariaSnapshot();\n  console.log(snapshot);\n\n  await browser.close();\n})();\n",[143,620,621,651,663,683,703,747,753,791,807,811,826],{"__ignoreMap":348},[352,622,623,625,629,632,635,637,639,641,643,645,647,649],{"class":354,"line":355},[352,624,359],{"class":358},[352,626,628],{"class":627},"sgNGR"," {",[352,630,631],{"class":362}," chromium",[352,633,634],{"class":627}," }",[352,636,367],{"class":366},[352,638,371],{"class":370},[352,640,375],{"class":374},[352,642,379],{"class":378},[352,644,561],{"class":382},[352,646,379],{"class":378},[352,648,387],{"class":374},[352,650,391],{"class":390},[352,652,653,655,657,659,661],{"class":354,"line":394},[352,654,375],{"class":374},[352,656,400],{"class":399},[352,658,404],{"class":403},[352,660,407],{"class":358},[352,662,410],{"class":390},[352,664,665,667,669,671,673,675,677,679,681],{"class":354,"line":413},[352,666,416],{"class":358},[352,668,419],{"class":362},[352,670,367],{"class":366},[352,672,424],{"class":399},[352,674,631],{"class":427},[352,676,71],{"class":430},[352,678,433],{"class":370},[352,680,436],{"class":374},[352,682,391],{"class":390},[352,684,685,687,689,691,693,695,697,699,701],{"class":354,"line":441},[352,686,416],{"class":358},[352,688,446],{"class":362},[352,690,367],{"class":366},[352,692,424],{"class":399},[352,694,419],{"class":427},[352,696,71],{"class":430},[352,698,457],{"class":370},[352,700,436],{"class":374},[352,702,391],{"class":390},[352,704,705,707,709,711,713,715,717,719,721,724,726,729,733,736,739,741,743,745],{"class":354,"line":464},[352,706,467],{"class":399},[352,708,446],{"class":427},[352,710,71],{"class":430},[352,712,474],{"class":370},[352,714,375],{"class":374},[352,716,379],{"class":378},[352,718,481],{"class":382},[352,720,379],{"class":378},[352,722,723],{"class":390},",",[352,725,628],{"class":390},[352,727,728],{"class":374}," waitUntil",[352,730,732],{"class":731},"sVS64",":",[352,734,735],{"class":378}," \"",[352,737,738],{"class":382},"networkidle",[352,740,379],{"class":378},[352,742,634],{"class":390},[352,744,387],{"class":374},[352,746,391],{"class":390},[352,748,749],{"class":354,"line":490},[352,750,752],{"emptyLinePlaceholder":751},true,"\n",[352,754,755,757,760,762,764,766,768,771,773,775,778,780,782,784,787,789],{"class":354,"line":513},[352,756,416],{"class":358},[352,758,759],{"class":362}," snapshot",[352,761,367],{"class":366},[352,763,424],{"class":399},[352,765,446],{"class":427},[352,767,71],{"class":430},[352,769,770],{"class":370},"locator",[352,772,375],{"class":374},[352,774,379],{"class":378},[352,776,777],{"class":382},"body",[352,779,379],{"class":378},[352,781,387],{"class":374},[352,783,71],{"class":430},[352,785,786],{"class":370},"ariaSnapshot",[352,788,436],{"class":374},[352,790,391],{"class":390},[352,792,793,795,797,799,801,803,805],{"class":354,"line":534},[352,794,516],{"class":427},[352,796,71],{"class":430},[352,798,521],{"class":370},[352,800,375],{"class":374},[352,802,193],{"class":526},[352,804,387],{"class":374},[352,806,391],{"class":390},[352,808,809],{"class":354,"line":550},[352,810,752],{"emptyLinePlaceholder":751},[352,812,814,816,818,820,822,824],{"class":354,"line":813},10,[352,815,467],{"class":399},[352,817,419],{"class":427},[352,819,71],{"class":430},[352,821,543],{"class":370},[352,823,436],{"class":374},[352,825,391],{"class":390},[352,827,829,831,833],{"class":354,"line":828},11,[352,830,553],{"class":390},[352,832,556],{"class":374},[352,834,391],{"class":390},[318,836,53],{"id":837},"vercel-agent-browser",[76,839],{":width":78,"alt":840,"format":80,"loading":133,"src":841},"Vercel agent-browser GitHub repository","/blog/agent-browser-vs-puppeteer-and-playwright/agent-browser-github.png",[11,843,844,846,847,851],{},[15,845,53],{}," is a Rust-based command-line interface (CLI) and daemon designed for AI-agent workflows. It is under active development, with 0.25.x releases available as of April 2026. The ",[67,848,850],{"href":189,"rel":849},[165],"official repository"," describes a client-daemon architecture that communicates via CDP and does not require Node.js for the daemon, while still being able to use existing browser installations.",[11,853,854,855,860,861,864,865,868],{},"The tool's main innovation lies in its ",[67,856,858],{"href":189,"rel":857},[165],[143,859,193],{}," capability. This command generates an accessibility tree and assigns ",[143,862,863],{},"@refs"," (for example ",[143,866,867],{},"@e1",") to interactive elements, which makes follow-up actions easier for an LLM to map back to the page. Optional annotated screenshots provide visual context. It's newer than Puppeteer or Playwright, and positioned squarely around LLM-driven browser workflows.",[11,870,341],{},[343,872,876],{"className":873,"code":874,"language":875,"meta":348,"style":348},"language-bash shiki shiki-themes catppuccin-latte night-owl","# Terminal/LLM shell commands\nagent-browser open https://example.com\nagent-browser snapshot -i   # Returns refs for interactive elements\nagent-browser click @e2     # Uses a ref from the latest snapshot\nagent-browser screenshot result.png\n","bash",[143,877,878,884,894,907,920],{"__ignoreMap":348},[352,879,880],{"class":354,"line":355},[352,881,883],{"class":882},"sDmS1","# Terminal/LLM shell commands\n",[352,885,886,888,891],{"class":354,"line":394},[352,887,111],{"class":370},[352,889,890],{"class":382}," open",[352,892,893],{"class":382}," https://example.com\n",[352,895,896,898,900,904],{"class":354,"line":413},[352,897,111],{"class":370},[352,899,759],{"class":382},[352,901,903],{"class":902},"sPg8w"," -i",[352,905,906],{"class":882},"   # Returns refs for interactive elements\n",[352,908,909,911,914,917],{"class":354,"line":441},[352,910,111],{"class":370},[352,912,913],{"class":382}," click",[352,915,916],{"class":382}," @e2",[352,918,919],{"class":882},"     # Uses a ref from the latest snapshot\n",[352,921,922,924,927],{"class":354,"line":464},[352,923,111],{"class":370},[352,925,926],{"class":382}," screenshot",[352,928,929],{"class":382}," result.png\n",[318,931,933],{"id":932},"adoption-in-2026","Adoption in 2026",[11,935,936],{},"As of 2026, Puppeteer and Playwright remain widely used in scripted automation and testing, while Vercel agent-browser is drawing growing attention in AI-agent workflows. The deeper split is product shape: Puppeteer and Playwright target deterministic automation first, while agent-browser is built around agent-readable page state and command-driven interaction.",[11,938,939],{},"Side-by-side, the three tools profile very differently - a mature JS-only Chromium library, a cross-browser testing default, and a young CLI built for LLM loops:",[76,941],{":width":78,"alt":942,"format":80,"loading":133,"src":943,"provider":83},"Side-by-side cards comparing Puppeteer, Playwright, and agent-browser by launch year, popularity, language, and primary use","/blog/agent-browser-vs-puppeteer-and-playwright/2.svg",[88,945,947],{"id":946},"comparison-table","Comparison table",[11,949,950],{},"Here are the core differences side by side.",[196,952,953,966],{},[199,954,955],{},[202,956,957,960,962,964],{},[205,958,959],{"align":207},"Best for",[205,961,45],{"align":207},[205,963,37],{"align":207},[205,965,53],{"align":207},[220,967,968,983,999,1015,1031,1047,1063],{},[202,969,970,975,978,981],{},[225,971,972],{"align":207},[15,973,974],{},"Browser support",[225,976,977],{"align":207},"Mostly Chromium",[225,979,980],{"align":207},"Chromium, Firefox, WebKit",[225,982,977],{"align":207},[202,984,985,990,993,996],{},[225,986,987],{"align":207},[15,988,989],{},"API style",[225,991,992],{"align":207},"JS/TS library",[225,994,995],{"align":207},"Multi-language library",[225,997,998],{"align":207},"CLI + daemon",[202,1000,1001,1006,1009,1012],{},[225,1002,1003],{"align":207},[15,1004,1005],{},"Workflow style",[225,1007,1008],{"align":207},"Scripted steps",[225,1010,1011],{"align":207},"Scripted steps + auto-wait",[225,1013,1014],{"align":207},"Agent loop + refs",[202,1016,1017,1022,1025,1028],{},[225,1018,1019],{"align":207},[15,1020,1021],{},"LLM readiness",[225,1023,1024],{"align":207},"Low",[225,1026,1027],{"align":207},"Medium",[225,1029,1030],{"align":207},"High",[202,1032,1033,1038,1041,1044],{},[225,1034,1035],{"align":207},[15,1036,1037],{},"Session handling",[225,1039,1040],{"align":207},"Manual",[225,1042,1043],{"align":207},"Browser contexts",[225,1045,1046],{"align":207},"Persistent daemon",[202,1048,1049,1054,1057,1060],{},[225,1050,1051],{"align":207},[15,1052,1053],{},"Best use case",[225,1055,1056],{"align":207},"Simple Chromium tasks",[225,1058,1059],{"align":207},"Default choice for most teams",[225,1061,1062],{"align":207},"Open-ended AI agent workflows",[202,1064,1065,1070,1073,1076],{},[225,1066,1067],{"align":207},[15,1068,1069],{},"Main tradeoff",[225,1071,1072],{"align":207},"Limited browser coverage",[225,1074,1075],{"align":207},"More tooling than Puppeteer",[225,1077,1078],{"align":207},"Newer and less mature",[11,1080,1081,1082,1084,1085,1087,1088,1090],{},"In practice, the pattern is simple: ",[15,1083,37],{}," is the safest default, ",[15,1086,45],{}," is the leaner Chromium-first option, and ",[15,1089,111],{}," is strongest when an LLM needs compact page state and ref-based actions.",[11,1092,1093],{},"A related tradeoff is how well each tool hides from anti-bot systems out of the box, and how much stealth work you're expected to bolt on yourself:",[76,1095],{":width":78,"alt":1096,"format":80,"loading":133,"src":1097,"provider":83},"Anti-bot detection risk spectrum: agent-browser low risk with built-in stealth, Playwright medium risk via plugin-based stealth, Puppeteer high risk with manual plugins and a detectable CDP signature","/blog/agent-browser-vs-puppeteer-and-playwright/5.svg",[88,1099,1101],{"id":1100},"when-to-choose-each-tool","When to choose each tool",[11,1103,1104],{},"The right choice depends on the shape of the work. Each tool is sharpest at a distinctly different cluster of use cases - precise Chrome scripting, reliable cross-browser testing, or AI-native web interaction:",[76,1106],{":width":78,"alt":1107,"format":80,"loading":133,"src":1108,"provider":83},"Use-case decision guide listing best-fit workloads per tool: Puppeteer for Chrome scripting and PDFs, Playwright for E2E testing and multi-language CI, agent-browser for LLM agents, agentic RPA, and low-token browsing","/blog/agent-browser-vs-puppeteer-and-playwright/3.svg",[11,1110,1111,1112,1114,1115,1117,1118,1120],{},"Start with the workflow shape: choose ",[15,1113,37],{}," for deterministic, test-like workflows; choose ",[15,1116,45],{}," for Chromium-only scripted control; choose ",[15,1119,53],{}," when the next step depends on model interpretation of the current page.",[11,1122,1123],{},"If it helps to see the same decision as a flowchart, the two questions that matter most are whether you're writing scripts at all, and - if so - whether you need multi-browser coverage:",[76,1125],{":width":78,"alt":1126,"format":80,"loading":133,"src":1127,"provider":83},"Decision flowchart: will developers write scripts? If yes, Playwright for multi-browser or Puppeteer for Chromium-only; if no, agent-browser","/blog/agent-browser-vs-puppeteer-and-playwright/4.svg",[88,1129,1131],{"id":1130},"sources","Sources",[11,1133,1134],{},"The comparison in this article is based primarily on official product documentation and repository materials reviewed in April 2026, with secondary context from industry comparison articles and community discussions.",[26,1136,1137,1166],{},[29,1138,1139,1142,1143],{},[15,1140,1141],{},"Official sources reviewed",":\n",[26,1144,1145,1151,1157],{},[29,1146,1147,1148],{},"Playwright documentation: ",[143,1149,1150],{},"playwright.dev",[29,1152,1153,1154],{},"Puppeteer documentation: ",[143,1155,1156],{},"pptr.dev",[29,1158,1159,1160,305,1163],{},"Vercel agent-browser repository and docs: ",[143,1161,1162],{},"github.com/vercel-labs/agent-browser",[143,1164,1165],{},"agent-browser.dev",[29,1167,1168,1142,1171],{},[15,1169,1170],{},"Secondary sources",[26,1172,1173,1176],{},[29,1174,1175],{},"Comparison articles from browser automation vendors and tooling blogs",[29,1177,1178],{},"Community discussions about automation reliability, scraping maintenance, and agent workflows",[11,1180,1181],{},"Because these tools are evolving quickly, especially in the AI-agent category, it is worth checking the latest official docs before making architecture decisions based on version-specific features.",[88,1183,1185],{"id":1184},"conclusion","Conclusion",[11,1187,1188,1189,71],{},"Playwright remains the strongest default for most engineering teams because it combines reliability, cross-browser automation, and mature tooling. Puppeteer still fits focused Chromium scripting, while Vercel agent-browser shines when the browser is part of an ",[15,1190,1191],{},"LLM-driven agent loop",[1193,1194,1195],"style",{},"html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .scsc5, html code.shiki .scsc5{--shiki-default:#4C4F69;--shiki-default-font-style:inherit;--shiki-dark:#82AAFF;--shiki-dark-font-style:italic}html pre.shiki code .s-_ek, html code.shiki .s-_ek{--shiki-default:#179299;--shiki-dark:#C792EA}html pre.shiki code .sNstc, html code.shiki .sNstc{--shiki-default:#1E66F5;--shiki-default-font-style:italic;--shiki-dark:#82AAFF;--shiki-dark-font-style:italic}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .sfrMT, html code.shiki .sfrMT{--shiki-default:#40A02B;--shiki-dark:#ECC48D}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .srhcd, html code.shiki .srhcd{--shiki-default:#8839EF;--shiki-default-font-style:inherit;--shiki-dark:#C792EA;--shiki-dark-font-style:italic}html pre.shiki code .sMtgK, html code.shiki .sMtgK{--shiki-default:#7C7F93;--shiki-dark:#D9F5DD}html pre.shiki code .sP4PM, html code.shiki .sP4PM{--shiki-default:#4C4F69;--shiki-default-font-style:inherit;--shiki-dark:#7FDBCA;--shiki-dark-font-style:italic}html pre.shiki code .s5FwJ, html code.shiki .s5FwJ{--shiki-default:#179299;--shiki-default-font-style:inherit;--shiki-dark:#C792EA;--shiki-dark-font-style:italic}html pre.shiki code .soAP-, html code.shiki .soAP-{--shiki-default:#4C4F69;--shiki-dark:#D7DBE0}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .sgNGR, html code.shiki .sgNGR{--shiki-default:#7C7F93;--shiki-dark:#C792EA}html pre.shiki code .sVS64, html code.shiki .sVS64{--shiki-default:#179299;--shiki-dark:#D6DEEB}html pre.shiki code .sDmS1, html code.shiki .sDmS1{--shiki-default:#7C7F93;--shiki-default-font-style:italic;--shiki-dark:#637777;--shiki-dark-font-style:italic}html pre.shiki code .sPg8w, html code.shiki .sPg8w{--shiki-default:#40A02B;--shiki-dark:#82AAFF}",{"title":348,"searchDepth":394,"depth":394,"links":1197},[1198,1199,1200,1206,1207,1208,1209],{"id":90,"depth":394,"text":91},{"id":137,"depth":394,"text":138},{"id":312,"depth":394,"text":313,"children":1201},[1202,1203,1204,1205],{"id":320,"depth":413,"text":45},{"id":561,"depth":413,"text":37},{"id":837,"depth":413,"text":53},{"id":932,"depth":413,"text":933},{"id":946,"depth":394,"text":947},{"id":1100,"depth":394,"text":1101},{"id":1130,"depth":394,"text":1131},{"id":1184,"depth":394,"text":1185},"ai-agents","2026-04-10","Compare agent-browser, Puppeteer, and Playwright for AI agents and web automation in 2026. See browser support, token efficiency, performance tradeoffs, and when to use each tool.","md",[1215,1218,1221,1224,1227],{"question":1216,"answer":1217},"What is an agent browser?","An agent browser is a browser environment built for AI-agent workflows. Instead of relying mainly on raw DOM access and hand-written selectors, it exposes structured page state, persistent sessions, and action tools that map more cleanly to how language models observe and act. Puppeteer and Playwright can be adapted for similar use, but purpose-built agent browsers focus more directly on multi-step, model-driven interaction.",{"question":1219,"answer":1220},"Can Puppeteer or Playwright be used for AI agents?","Yes. Both Puppeteer and Playwright can be used for AI agents. Playwright is often the better fit because its accessibility snapshot features can produce a more compact page representation than raw HTML, and its browser contexts make session isolation easier. Puppeteer remains useful when you need direct Chromium control. The tradeoff is that both were designed primarily for deterministic automation, so agent workflows usually need extra orchestration.",{"question":1222,"answer":1223},"How does Vercel agent-browser differ from Playwright for AI agents?","Vercel agent-browser differs from Playwright by focusing more directly on LLM-driven workflows. Its snapshot command returns an accessibility tree with `@ref` labels, and its daemon keeps browser state alive between commands. Playwright can support similar workflows through `ariaSnapshot()` and broader automation APIs, but it remains more general-purpose and test-oriented.",{"question":1225,"answer":1226},"Which tool uses the fewest tokens when feeding page state to an LLM?","For LLM input, raw HTML from Puppeteer is usually the heaviest format. In the benchmark used in this article, Playwright `ariaSnapshot()` and agent-browser `snapshot -i` were both far smaller than Puppeteer `page.content()`, and agent-browser was only slightly smaller than Playwright on that sample page. Exact token usage still depends on the page, the snapshot scope, and the model tokenizer.",{"question":1228,"answer":1229},"When should I use Playwright instead of an agent browser?","Use Playwright instead of an agent browser when you need deterministic, reproducible automation. It is the stronger choice for end-to-end testing, cross-browser support, tracing, CI pipelines, and workflows that can be written as stable scripted steps. An agent browser becomes more useful when the next action depends on interpreting the current page rather than following a fixed selector chain.",0,null,{"shortTitle":1233,"relatedLinks":1234},"Agent Browser vs Puppeteer vs Playwright",[1235,1238,1242],{"text":1236,"href":69,"description":1237},"Playwright vs. Puppeteer for AI Agents","A detailed comparison of Playwright and Puppeteer for building AI browser agents.",{"text":1239,"href":1240,"description":1241},"Top 5 MCP Servers for AI Agent Browser Automation","/blog/the-top-5-best-mcp-servers-for-ai-agent-browser-automation","Compare the five best MCP servers for connecting AI agents to live browsers.",{"text":1243,"href":1244,"description":1245},"Playwright vs. Selenium in 2026","/blog/playwright-vs-selenium-which-automation-tool-is-right-for-you-in-2026","An in-depth look at how Playwright and Selenium compare for modern web automation.","/blog/agent-browser-vs-puppeteer-and-playwright",{"title":5,"description":1212},{"loc":1246},"blog/1038.agent-browser-vs-puppeteer-and-playwright",[1251,1210,320,561,111,1252],"browser-automation","web-agents","R9XqDy3ik4zqIg5_QnYLU_c2lmburQljn3TB3kp59Uw",[1255,2877],{"id":1256,"title":309,"authorId":1257,"body":1258,"category":1210,"created":2855,"description":2856,"extension":1213,"faqs":1231,"featurePriority":1231,"head":1231,"landingPath":1231,"meta":2857,"navigation":751,"ogImage":1231,"path":308,"robots":1231,"schemaOrg":1231,"seo":2868,"sitemap":2869,"stem":2870,"tags":2871,"__hash__":2876},"blog/blog/1012.dom-downsampling-for-llm-based-web-agents.md","thassilo-schiepanski",{"type":8,"value":1259,"toc":2840},[1260,1264,1286,1290,1296,1300,1316,1320,1326,1330,1348,1374,1377,1381,1384,1395,1401,1432,1436,1456,1468,1473,1488,1502,1505,1509,1529,1533,1541,1553,1557,1560,1935,1941,1948,2112,2119,2210,2217,2289,2298,2304,2313,2317,2323,2333,2345,2568,2586,2608,2614,2657,2661,2673,2682,2687,2692,2695,2699,2705,2710,2747,2751,2757,2761,2771,2775,2778,2837],[76,1261],{":width":78,"alt":1262,"format":80,"loading":133,"src":1263},"Downsampling visualised for digital images and HTML","/blog/dom-downsampling-for-web-agents/1.png",[11,1265,1266,119,1271,119,1276,1281,1282,1285],{},[67,1267,1270],{"href":1268,"rel":1269},"https://operator.chatgpt.com",[165],"Operator (OpenAI)",[67,1272,1275],{"href":1273,"rel":1274},"https://www.director.ai",[165],"Director (Browserbase)",[67,1277,1280],{"href":1278,"rel":1279},"https://browser-use.com",[165],"Browser Use"," – we are currently witnessing the rise of ",[15,1283,1284],{},"web AI agents",". The first iteration of serviceable web agents was enabled by frontier LLMs, which act as instantaneous domain model backends. The domain, hereby, corresponds to the landscape of web application UIs.",[88,1287,1289],{"id":1288},"what-is-a-snapshot","What is a Snapshot?",[11,1291,1292,1293,1295],{},"Web agents provide an LLM with a task, and serialised runtime state of a currently browsed web application (e.g., a screenshot). The LLM is ought to suggest relevant actions to perform in the web application. Serialisation of such runtime state is referred to as a ",[15,1294,193],{},". And the snapshot technique primarily decides the quality of LLM interaction suggestions.",[318,1297,1299],{"id":1298},"gui-snapshots","GUI Snapshots",[11,1301,1302,1303,1306,1307,1311,1312,1315],{},"Screenshots – for consistency reasons referred to as ",[15,1304,1305],{},"GUI snapshots"," – resemble how humans visually perceive web application UIs. LLM APIs subsidise the use of image input through upstream compression. Compresssion, however, irreversibly affects image dimensions, which takes away pixel precision; no way to suggest interactions like ",[1308,1309,1310],"em",{},"“click at 100, 735”",". As a workaround, early web agents used ",[1308,1313,1314],{},"grounded"," GUI snapshots. Grounding describes adding visual cues to the GUI, such as bounding boxes with numerical identifiers. Grounding lets the LLM refer to specific parts of the page by identifier, so the agent can trace back interaction targets.",[76,1317],{":width":78,"alt":1318,"format":80,"loading":133,"src":1319},"Grounded GUI snapshot as implemented by Browser Use","/blog/dom-downsampling-for-web-agents/2.png",[11,1321,1322],{},[1323,1324,1325],"small",{},"Grounded GUI snapshot as implemented by Browser Use.",[318,1327,1329],{"id":1328},"dom-snapshots","DOM Snapshots",[11,1331,1332,1333,1343,1344,1347],{},"LLMs arguably are much better at understanding code than images. Research supports they excel at describing and classifying HTML, and also navigating an inherent UI",[1334,1335,1336],"sup",{},[67,1337,1342],{"href":1338,"ariaDescribedBy":1339,"dataFootnoteRef":348,"id":1341},"#user-content-fn-1",[1340],"footnote-label","user-content-fnref-1","1",". The DOM (document object model) – a web browser's runtime state model of a web application – translates back to HTML. For this reason, ",[15,1345,1346],{},"DOM snapshots"," offer a compelling alternative to GUI snapshots. DOM snapshots offer a handful of key advantages:",[1349,1350,1351,1354,1357,1360,1363],"ol",{},[29,1352,1353],{},"DOM snapshots connect with LLM code (HTML) interpretation abilities.",[29,1355,1356],{},"DOM snapshots can be compiled from deep clones, hidden from supervision (unlike GUI grounding).",[29,1358,1359],{},"DOM snapshots render text input that on average consume less bandwidth than screnshots.",[29,1361,1362],{},"DOM snapshots allow for exact programmatic targeting of elements (e.g., via CSS selectors).",[29,1364,1365,1366,1369,1370,1373],{},"DOM snapshots are available with the ",[143,1367,1368],{},"DOMContentLoaded"," event (whereas the GUI completes initial rendering with ",[143,1371,1372],{},"load",").",[11,1375,1376],{},"Yet, DOM snapshots have a major problem: potentially exhaustive model context. Whereas GUI snapshot commonly cost four figures of tokens, a raw DOM snapshot can cost into hundreds of thousands of tokens. To connect with LLM code interpretation abilities, however, developers have used element extraction techniques – picking only (likely) important elements from the DOM. Element extraction flattens the DOM tree, which disregards hierarchy as a potential UI feature (how do elements relate to each other?).",[88,1378,1380],{"id":1379},"dom-downsampling-a-novel-approach","DOM Downsampling: A Novel Approach",[11,1382,1383],{},"To enable DOM snapshots for use with web agents, it requires client-side pre-processing – similar to how LLM vision APIs process image input. Downsampling is a fundamental signal processing technique that reduces data that scales out of time or space constraints under the assumption that the majority of relevant features is retained. Picture JPEG compression as an example: put simply, a JPEG image stores only an average colour for patches of pixels. The bigger the patches, the smaller the file. Although some detail is lost, key image features – colours, edges, objects – keep being recognisable – up to a large patch size.",[11,1385,1386,1387,1390,1391,1394],{},"We transfer the concept of ",[15,1388,1389],{},"downsampling"," to ",[15,1392,1393],{},"DOMs",". Particularly, since such an approach retains HTML characteristics that might be valuable for an LLM backend. We define UI features as concepts that, to a substantial degree, facilitate LLM suggestions on how to act in the UI in order to solve related web-based tasks.",[88,1396,1398],{"id":1397},"d2snap",[1308,1399,1400],{},"D2Snap",[11,1402,1403,1404,1412,1420,1428,1429,1431],{},"We recently proposed ",[67,1405,1408],{"href":1406,"rel":1407},"https://arxiv.org/abs/2508.04412",[165],[15,1409,1410],{},[1308,1411,1400],{},[1334,1413,1414],{},[67,1415,1419],{"href":1416,"ariaDescribedBy":1417,"dataFootnoteRef":348,"id":1418},"#user-content-fn-2",[1340],"user-content-fnref-2","2",[1334,1421,1422],{},[67,1423,1427],{"href":1424,"ariaDescribedBy":1425,"dataFootnoteRef":348,"id":1426},"#user-content-fn-3",[1340],"user-content-fnref-3","3"," – a first-of-its-kind downsampling algorithm for DOMs. Herein, we'll briefly explain how the ",[1308,1430,1400],{}," algorithm works, and how it can be utilised to build efficient and performant web agents.",[318,1433,1435],{"id":1434},"how-it-works","How it works",[11,1437,1438,1439,1441,1442,119,1445,1448,1449,1452,1453,1373],{},"There are basically three redundant types of DOM nodes, and HTML concepts: elements, text, and attributes. We defined and empirically adjusted three node-specific procedures. ",[1308,1440,1400],{}," downsamples at a variable ratio, configured through procedure-specific parameters  ",[143,1443,1444],{},"k",[143,1446,1447],{},"l",", and ",[143,1450,1451],{},"m"," (",[143,1454,1455],{},"∈ [0, 1]",[1457,1458,1459],"blockquote",{},[11,1460,1461,1462,1467],{},"We used ",[67,1463,1466],{"href":1464,"rel":1465},"https://openai.com/index/hello-gpt-4o/",[165],"GPT-4o"," to create a downsampling ground truth dataset by having it classify HTML elements and scoring semantics regarding relevance for understanding the inherent UI – a UI feature degree.",[1469,1470,1472],"h4",{"id":1471},"procedure-elements","Procedure: Elements",[11,1474,1475,1477,1478,305,1481,1484,1485,1487],{},[1308,1476,1400],{}," downsamples (simplifies) elements by merging container elements like ",[143,1479,1480],{},"section",[143,1482,1483],{},"div"," together. A parameter ",[143,1486,1444],{}," controls the merge ratio depending on the total DOM tree height. For competing concepts, such as element name, the ground truth determines which element's characterisitics to keep – comparing UI feature scores.",[11,1489,1490,1491,119,1493,1495,1496,1501],{},"Elements in content elements (",[143,1492,11],{},[143,1494,1457],{},", ...) are translated to a more comprehensive ",[67,1497,1500],{"href":1498,"rel":1499},"https://www.markdownguide.org/basic-syntax/",[165],"Markdown"," representation.",[11,1503,1504],{},"Interactive elements, definite interaction target candidates, are kept as is.",[1469,1506,1508],{"id":1507},"procedure-text","Procedure: Text",[11,1510,1511,1513,1514,1517,1525,1526,1528],{},[1308,1512,1400],{}," downsamples text by dropping a fraction. Natural units of text are space-separated words, or punctuation-separated sentences. We reuse the ",[1308,1515,1516],{},"TextRank",[1334,1518,1519],{},[67,1520,1524],{"href":1521,"ariaDescribedBy":1522,"dataFootnoteRef":348,"id":1523},"#user-content-fn-4",[1340],"user-content-fnref-4","4"," algorithm to rank sentences in text nodes. The lowest-ranking fraction of sentences, denoted by parameter ",[143,1527,1447],{},", is dropped.",[1469,1530,1532],{"id":1531},"procedure-attributes","Procedure: Attributes",[11,1534,1535,1537,1538,1540],{},[1308,1536,1400],{}," downsamples attributes by dropping those with a name that, according to ground truth, holds a UI feature degree below a threshold. Parameter ",[143,1539,1451],{}," denotes this threshold.",[1457,1542,1543],{},[11,1544,1545,1546,1552],{},"Check out the ",[67,1547,1549,1551],{"href":1406,"rel":1548},[165],[1308,1550,1400],{}," paper"," to learn about the algorithm in-depth.",[318,1554,1556],{"id":1555},"example-of-a-downsampled-dom","Example of a Downsampled DOM",[11,1558,1559],{},"Consider a partial DOM state, serialised as HTML:",[343,1561,1564],{"className":1562,"code":1563,"language":527,"meta":348,"style":348},"language-html shiki shiki-themes catppuccin-latte night-owl","\u003Csection class=\"container\" tabindex=\"3\" required=\"true\" type=\"example\">\n  \u003Cdiv class=\"mx-auto\" data-topic=\"products\" required=\"false\">\n    \u003Ch1>Our Pizza\u003C/h1>\n    \u003Cdiv>\n      \u003Cdiv class=\"shadow-lg\">\n        \u003Ch2>Margherita\u003C/h2>\n        \u003Cp>\n          A simple classic: mozzarela, tomatoes and basil.\n          An everyday choice!\n        \u003C/p>\n        \u003Cbutton type=\"button\">Add\u003C/button>\n      \u003C/div>\n      \u003Cdiv class=\"shadow-lg\">\n        \u003Ch2>Capricciosa\u003C/h2>\n        \u003Cp>\n          A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n          A true favourite!\n          \u003C/p>\n        \u003Cbutton type=\"button\">Add\u003C/button>\n      \u003C/div>\n    \u003C/div>\n  \u003C/div>\n\u003C/section>\n",[143,1565,1566,1627,1670,1691,1699,1719,1737,1745,1750,1755,1764,1792,1802,1821,1839,1848,1854,1860,1870,1897,1906,1916,1926],{"__ignoreMap":348},[352,1567,1568,1572,1575,1579,1582,1584,1587,1589,1592,1594,1596,1598,1600,1603,1605,1607,1610,1612,1615,1617,1619,1622,1624],{"class":354,"line":355},[352,1569,1571],{"class":1570},"s9rnR","\u003C",[352,1573,1480],{"class":1574},"sY2RG",[352,1576,1578],{"class":1577},"swkLt"," class",[352,1580,1581],{"class":1570},"=",[352,1583,379],{"class":378},[352,1585,1586],{"class":382},"container",[352,1588,379],{"class":378},[352,1590,1591],{"class":1577}," tabindex",[352,1593,1581],{"class":1570},[352,1595,379],{"class":378},[352,1597,1427],{"class":382},[352,1599,379],{"class":378},[352,1601,1602],{"class":1577}," required",[352,1604,1581],{"class":1570},[352,1606,379],{"class":378},[352,1608,1609],{"class":382},"true",[352,1611,379],{"class":378},[352,1613,1614],{"class":1577}," type",[352,1616,1581],{"class":1570},[352,1618,379],{"class":378},[352,1620,1621],{"class":382},"example",[352,1623,379],{"class":378},[352,1625,1626],{"class":1570},">\n",[352,1628,1629,1632,1634,1636,1638,1640,1643,1645,1648,1650,1652,1655,1657,1659,1661,1663,1666,1668],{"class":354,"line":394},[352,1630,1631],{"class":1570},"  \u003C",[352,1633,1483],{"class":1574},[352,1635,1578],{"class":1577},[352,1637,1581],{"class":1570},[352,1639,379],{"class":378},[352,1641,1642],{"class":382},"mx-auto",[352,1644,379],{"class":378},[352,1646,1647],{"class":1577}," data-topic",[352,1649,1581],{"class":1570},[352,1651,379],{"class":378},[352,1653,1654],{"class":382},"products",[352,1656,379],{"class":378},[352,1658,1602],{"class":1577},[352,1660,1581],{"class":1570},[352,1662,379],{"class":378},[352,1664,1665],{"class":382},"false",[352,1667,379],{"class":378},[352,1669,1626],{"class":1570},[352,1671,1672,1675,1678,1681,1684,1687,1689],{"class":354,"line":413},[352,1673,1674],{"class":1570},"    \u003C",[352,1676,1677],{"class":1574},"h1",[352,1679,1680],{"class":1570},">",[352,1682,1683],{"class":374},"Our Pizza",[352,1685,1686],{"class":1570},"\u003C/",[352,1688,1677],{"class":1574},[352,1690,1626],{"class":1570},[352,1692,1693,1695,1697],{"class":354,"line":441},[352,1694,1674],{"class":1570},[352,1696,1483],{"class":1574},[352,1698,1626],{"class":1570},[352,1700,1701,1704,1706,1708,1710,1712,1715,1717],{"class":354,"line":464},[352,1702,1703],{"class":1570},"      \u003C",[352,1705,1483],{"class":1574},[352,1707,1578],{"class":1577},[352,1709,1581],{"class":1570},[352,1711,379],{"class":378},[352,1713,1714],{"class":382},"shadow-lg",[352,1716,379],{"class":378},[352,1718,1626],{"class":1570},[352,1720,1721,1724,1726,1728,1731,1733,1735],{"class":354,"line":490},[352,1722,1723],{"class":1570},"        \u003C",[352,1725,88],{"class":1574},[352,1727,1680],{"class":1570},[352,1729,1730],{"class":374},"Margherita",[352,1732,1686],{"class":1570},[352,1734,88],{"class":1574},[352,1736,1626],{"class":1570},[352,1738,1739,1741,1743],{"class":354,"line":513},[352,1740,1723],{"class":1570},[352,1742,11],{"class":1574},[352,1744,1626],{"class":1570},[352,1746,1747],{"class":354,"line":534},[352,1748,1749],{"class":374},"          A simple classic: mozzarela, tomatoes and basil.\n",[352,1751,1752],{"class":354,"line":550},[352,1753,1754],{"class":374},"          An everyday choice!\n",[352,1756,1757,1760,1762],{"class":354,"line":813},[352,1758,1759],{"class":1570},"        \u003C/",[352,1761,11],{"class":1574},[352,1763,1626],{"class":1570},[352,1765,1766,1768,1771,1773,1775,1777,1779,1781,1783,1786,1788,1790],{"class":354,"line":828},[352,1767,1723],{"class":1570},[352,1769,1770],{"class":1574},"button",[352,1772,1614],{"class":1577},[352,1774,1581],{"class":1570},[352,1776,379],{"class":378},[352,1778,1770],{"class":382},[352,1780,379],{"class":378},[352,1782,1680],{"class":1570},[352,1784,1785],{"class":374},"Add",[352,1787,1686],{"class":1570},[352,1789,1770],{"class":1574},[352,1791,1626],{"class":1570},[352,1793,1795,1798,1800],{"class":354,"line":1794},12,[352,1796,1797],{"class":1570},"      \u003C/",[352,1799,1483],{"class":1574},[352,1801,1626],{"class":1570},[352,1803,1805,1807,1809,1811,1813,1815,1817,1819],{"class":354,"line":1804},13,[352,1806,1703],{"class":1570},[352,1808,1483],{"class":1574},[352,1810,1578],{"class":1577},[352,1812,1581],{"class":1570},[352,1814,379],{"class":378},[352,1816,1714],{"class":382},[352,1818,379],{"class":378},[352,1820,1626],{"class":1570},[352,1822,1824,1826,1828,1830,1833,1835,1837],{"class":354,"line":1823},14,[352,1825,1723],{"class":1570},[352,1827,88],{"class":1574},[352,1829,1680],{"class":1570},[352,1831,1832],{"class":374},"Capricciosa",[352,1834,1686],{"class":1570},[352,1836,88],{"class":1574},[352,1838,1626],{"class":1570},[352,1840,1842,1844,1846],{"class":354,"line":1841},15,[352,1843,1723],{"class":1570},[352,1845,11],{"class":1574},[352,1847,1626],{"class":1570},[352,1849,1851],{"class":354,"line":1850},16,[352,1852,1853],{"class":374},"          A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[352,1855,1857],{"class":354,"line":1856},17,[352,1858,1859],{"class":374},"          A true favourite!\n",[352,1861,1863,1866,1868],{"class":354,"line":1862},18,[352,1864,1865],{"class":1570},"          \u003C/",[352,1867,11],{"class":1574},[352,1869,1626],{"class":1570},[352,1871,1873,1875,1877,1879,1881,1883,1885,1887,1889,1891,1893,1895],{"class":354,"line":1872},19,[352,1874,1723],{"class":1570},[352,1876,1770],{"class":1574},[352,1878,1614],{"class":1577},[352,1880,1581],{"class":1570},[352,1882,379],{"class":378},[352,1884,1770],{"class":382},[352,1886,379],{"class":378},[352,1888,1680],{"class":1570},[352,1890,1785],{"class":374},[352,1892,1686],{"class":1570},[352,1894,1770],{"class":1574},[352,1896,1626],{"class":1570},[352,1898,1900,1902,1904],{"class":354,"line":1899},20,[352,1901,1797],{"class":1570},[352,1903,1483],{"class":1574},[352,1905,1626],{"class":1570},[352,1907,1909,1912,1914],{"class":354,"line":1908},21,[352,1910,1911],{"class":1570},"    \u003C/",[352,1913,1483],{"class":1574},[352,1915,1626],{"class":1570},[352,1917,1919,1922,1924],{"class":354,"line":1918},22,[352,1920,1921],{"class":1570},"  \u003C/",[352,1923,1483],{"class":1574},[352,1925,1626],{"class":1570},[352,1927,1929,1931,1933],{"class":354,"line":1928},23,[352,1930,1686],{"class":1570},[352,1932,1480],{"class":1574},[352,1934,1626],{"class":1570},[11,1936,1937,1938,1940],{},"Here are some ",[1308,1939,1400],{}," downsampling results, which are based on different parametric configurations. A percentage denotes the reduced size.",[1469,1942,1944,1947],{"id":1943},"k3-l3-m3-55",[143,1945,1946],{},"k=.3, l=.3, m=.3"," (55%)",[343,1949,1951],{"className":1562,"code":1950,"language":527,"meta":348,"style":348},"\u003Csection tabindex=\"3\" type=\"example\" class=\"container\" required=\"true\">\n  # Our Pizza\n  \u003Cdiv class=\"shadow-lg\">\n    ## Margherita\n    A simple classic: mozzarela, tomatoes, and basil.\n    \u003Cbutton type=\"button\">Add\u003C/button>\n    ## Capricciosa\n    A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n    \u003Cbutton type=\"button\">Add\u003C/button>\n  \u003C/div>\n\u003C/section>\n",[143,1952,1953,2001,2006,2024,2029,2034,2060,2065,2070,2096,2104],{"__ignoreMap":348},[352,1954,1955,1957,1959,1961,1963,1965,1967,1969,1971,1973,1975,1977,1979,1981,1983,1985,1987,1989,1991,1993,1995,1997,1999],{"class":354,"line":355},[352,1956,1571],{"class":1570},[352,1958,1480],{"class":1574},[352,1960,1591],{"class":1577},[352,1962,1581],{"class":1570},[352,1964,379],{"class":378},[352,1966,1427],{"class":382},[352,1968,379],{"class":378},[352,1970,1614],{"class":1577},[352,1972,1581],{"class":1570},[352,1974,379],{"class":378},[352,1976,1621],{"class":382},[352,1978,379],{"class":378},[352,1980,1578],{"class":1577},[352,1982,1581],{"class":1570},[352,1984,379],{"class":378},[352,1986,1586],{"class":382},[352,1988,379],{"class":378},[352,1990,1602],{"class":1577},[352,1992,1581],{"class":1570},[352,1994,379],{"class":378},[352,1996,1609],{"class":382},[352,1998,379],{"class":378},[352,2000,1626],{"class":1570},[352,2002,2003],{"class":354,"line":394},[352,2004,2005],{"class":374},"  # Our Pizza\n",[352,2007,2008,2010,2012,2014,2016,2018,2020,2022],{"class":354,"line":413},[352,2009,1631],{"class":1570},[352,2011,1483],{"class":1574},[352,2013,1578],{"class":1577},[352,2015,1581],{"class":1570},[352,2017,379],{"class":378},[352,2019,1714],{"class":382},[352,2021,379],{"class":378},[352,2023,1626],{"class":1570},[352,2025,2026],{"class":354,"line":441},[352,2027,2028],{"class":374},"    ## Margherita\n",[352,2030,2031],{"class":354,"line":464},[352,2032,2033],{"class":374},"    A simple classic: mozzarela, tomatoes, and basil.\n",[352,2035,2036,2038,2040,2042,2044,2046,2048,2050,2052,2054,2056,2058],{"class":354,"line":490},[352,2037,1674],{"class":1570},[352,2039,1770],{"class":1574},[352,2041,1614],{"class":1577},[352,2043,1581],{"class":1570},[352,2045,379],{"class":378},[352,2047,1770],{"class":382},[352,2049,379],{"class":378},[352,2051,1680],{"class":1570},[352,2053,1785],{"class":374},[352,2055,1686],{"class":1570},[352,2057,1770],{"class":1574},[352,2059,1626],{"class":1570},[352,2061,2062],{"class":354,"line":513},[352,2063,2064],{"class":374},"    ## Capricciosa\n",[352,2066,2067],{"class":354,"line":534},[352,2068,2069],{"class":374},"    A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[352,2071,2072,2074,2076,2078,2080,2082,2084,2086,2088,2090,2092,2094],{"class":354,"line":550},[352,2073,1674],{"class":1570},[352,2075,1770],{"class":1574},[352,2077,1614],{"class":1577},[352,2079,1581],{"class":1570},[352,2081,379],{"class":378},[352,2083,1770],{"class":382},[352,2085,379],{"class":378},[352,2087,1680],{"class":1570},[352,2089,1785],{"class":374},[352,2091,1686],{"class":1570},[352,2093,1770],{"class":1574},[352,2095,1626],{"class":1570},[352,2097,2098,2100,2102],{"class":354,"line":813},[352,2099,1921],{"class":1570},[352,2101,1483],{"class":1574},[352,2103,1626],{"class":1570},[352,2105,2106,2108,2110],{"class":354,"line":828},[352,2107,1686],{"class":1570},[352,2109,1480],{"class":1574},[352,2111,1626],{"class":1570},[1469,2113,2115,2118],{"id":2114},"k4-l6-m8-27",[143,2116,2117],{},"k=.4, l=.6, m=.8"," (27%)",[343,2120,2122],{"className":1562,"code":2121,"language":527,"meta":348,"style":348},"\u003Csection>\n  # Our Pizza\n  \u003Cdiv>\n    ## Margherita\n    A simple classic:\n    \u003Cbutton>Add\u003C/button>\n    ## Capricciosa\n    A rich taste:\n    \u003Cbutton>Add\u003C/button>\n  \u003C/div>\n\u003C/section>\n",[143,2123,2124,2132,2136,2144,2148,2153,2169,2173,2178,2194,2202],{"__ignoreMap":348},[352,2125,2126,2128,2130],{"class":354,"line":355},[352,2127,1571],{"class":1570},[352,2129,1480],{"class":1574},[352,2131,1626],{"class":1570},[352,2133,2134],{"class":354,"line":394},[352,2135,2005],{"class":374},[352,2137,2138,2140,2142],{"class":354,"line":413},[352,2139,1631],{"class":1570},[352,2141,1483],{"class":1574},[352,2143,1626],{"class":1570},[352,2145,2146],{"class":354,"line":441},[352,2147,2028],{"class":374},[352,2149,2150],{"class":354,"line":464},[352,2151,2152],{"class":374},"    A simple classic:\n",[352,2154,2155,2157,2159,2161,2163,2165,2167],{"class":354,"line":490},[352,2156,1674],{"class":1570},[352,2158,1770],{"class":1574},[352,2160,1680],{"class":1570},[352,2162,1785],{"class":374},[352,2164,1686],{"class":1570},[352,2166,1770],{"class":1574},[352,2168,1626],{"class":1570},[352,2170,2171],{"class":354,"line":513},[352,2172,2064],{"class":374},[352,2174,2175],{"class":354,"line":534},[352,2176,2177],{"class":374},"    A rich taste:\n",[352,2179,2180,2182,2184,2186,2188,2190,2192],{"class":354,"line":550},[352,2181,1674],{"class":1570},[352,2183,1770],{"class":1574},[352,2185,1680],{"class":1570},[352,2187,1785],{"class":374},[352,2189,1686],{"class":1570},[352,2191,1770],{"class":1574},[352,2193,1626],{"class":1570},[352,2195,2196,2198,2200],{"class":354,"line":813},[352,2197,1921],{"class":1570},[352,2199,1483],{"class":1574},[352,2201,1626],{"class":1570},[352,2203,2204,2206,2208],{"class":354,"line":828},[352,2205,1686],{"class":1570},[352,2207,1480],{"class":1574},[352,2209,1626],{"class":1570},[1469,2211,2213,2216],{"id":2212},"k-l0-m-35",[143,2214,2215],{},"k→∞, l=0, ∀m"," (35%)",[343,2218,2220],{"className":1562,"code":2219,"language":527,"meta":348,"style":348},"# Our Pizza\n## Margherita\nA simple classic: mozzarela, tomatoes, and basil.\nAn everyday choice!\n\u003Cbutton>Add\u003C/button>\n## Capricciosa\nA rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\nA true favourite!\n\u003Cbutton>Add\u003C/button>\n",[143,2221,2222,2227,2232,2237,2242,2258,2263,2268,2273],{"__ignoreMap":348},[352,2223,2224],{"class":354,"line":355},[352,2225,2226],{"class":374},"# Our Pizza\n",[352,2228,2229],{"class":354,"line":394},[352,2230,2231],{"class":374},"## Margherita\n",[352,2233,2234],{"class":354,"line":413},[352,2235,2236],{"class":374},"A simple classic: mozzarela, tomatoes, and basil.\n",[352,2238,2239],{"class":354,"line":441},[352,2240,2241],{"class":374},"An everyday choice!\n",[352,2243,2244,2246,2248,2250,2252,2254,2256],{"class":354,"line":464},[352,2245,1571],{"class":1570},[352,2247,1770],{"class":1574},[352,2249,1680],{"class":1570},[352,2251,1785],{"class":374},[352,2253,1686],{"class":1570},[352,2255,1770],{"class":1574},[352,2257,1626],{"class":1570},[352,2259,2260],{"class":354,"line":490},[352,2261,2262],{"class":374},"## Capricciosa\n",[352,2264,2265],{"class":354,"line":513},[352,2266,2267],{"class":374},"A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.\n",[352,2269,2270],{"class":354,"line":534},[352,2271,2272],{"class":374},"A true favourite!\n",[352,2274,2275,2277,2279,2281,2283,2285,2287],{"class":354,"line":550},[352,2276,1571],{"class":1570},[352,2278,1770],{"class":1574},[352,2280,1680],{"class":1570},[352,2282,1785],{"class":374},[352,2284,1686],{"class":1570},[352,2286,1770],{"class":1574},[352,2288,1626],{"class":1570},[11,2290,2291,2292,2294,2295,2297],{},"Asymptotic ",[143,2293,1444],{}," (kind of 'infinite' ",[143,2296,1444],{},") completely flattens the DOM, that is, leads to a full content linearisation similar to reader views as present in most browsers. Notably, it preserves all interactive elements like buttons – which are essential for a web agent.",[318,2299,2301],{"id":2300},"adaptived2snap",[1308,2302,2303],{},"AdaptiveD2Snap",[11,2305,2306,2307,2309,2310,2312],{},"Fixed parameters might not be ideal for arbitrary DOMs – sourced from a landscape of web applications. We created ",[1308,2308,2303],{}," – a wrapper for ",[1308,2311,1400],{}," that infers suitable parameters from a given DOM in order to hit a certain token budget.",[318,2314,2316],{"id":2315},"implementation-integration","Implementation & Integration",[11,2318,2319,2320,2322],{},"Picture an LLM-based weg agent that is premised on DOM snapshots. Implementing ",[1308,2321,1400],{}," is simple: Deep clone the DOM, and feed it to the algorithm. Now, take the snapshot; this is, serialise the resulting DOM. Done.",[1457,2324,2325],{},[11,2326,2327,2328,2332],{},"Read our ",[67,2329,2331],{"href":2330},"/blog/a-gentle-introduction-to-ai-agents-for-the-web","gentle introduction to AI agents for the web"," to get started with high-level web agent concepts.",[11,2334,2335,2336,2338,2339,2344],{},"The open source ",[1308,2337,1400],{}," API, provided as a ",[67,2340,2343],{"href":2341,"rel":2342},"https://github.com/webfuse-com/D2Snap",[165],"package on GitHub"," provides the following signature:",[343,2346,2350],{"className":2347,"code":2348,"language":2349,"meta":348,"style":348},"language-ts shiki shiki-themes catppuccin-latte night-owl","type DOM = Document | Element | string;\ntype Options = {\n  assignUniqueIDs?: boolean; // false\n  debug?: boolean;           // true\n};\n\nD2Snap.d2Snap(\n  dom: DOM,\n  k: number, l: number, m: number,\n  options?: Options\n): Promise\u003Cstring>\n\nD2Snap.adaptiveD2Snap(\n  dom: DOM,\n  maxTokens: number = 4096,\n  maxIterations: number = 5,\n  options?: Options\n): Promise\u003Cstring>\n\n","ts",[143,2351,2352,2381,2392,2410,2424,2429,2433,2445,2457,2474,2484,2500,2504,2515,2523,2536,2548,2556],{"__ignoreMap":348},[352,2353,2354,2357,2361,2363,2367,2370,2373,2375,2379],{"class":354,"line":355},[352,2355,2356],{"class":358},"type",[352,2358,2360],{"class":2359},"sXbZB"," DOM ",[352,2362,1581],{"class":366},[352,2364,2366],{"class":2365},"s-DR7"," Document",[352,2368,2369],{"class":1570}," |",[352,2371,2372],{"class":2365}," Element",[352,2374,2369],{"class":1570},[352,2376,2378],{"class":2377},"scrte"," string",[352,2380,391],{"class":390},[352,2382,2383,2385,2388,2390],{"class":354,"line":394},[352,2384,2356],{"class":358},[352,2386,2387],{"class":2359}," Options ",[352,2389,1581],{"class":366},[352,2391,410],{"class":390},[352,2393,2394,2398,2401,2404,2407],{"class":354,"line":413},[352,2395,2397],{"class":2396},"swl0y","  assignUniqueIDs",[352,2399,2400],{"class":1570},"?:",[352,2402,2403],{"class":2377}," boolean",[352,2405,2406],{"class":390},";",[352,2408,2409],{"class":882}," // false\n",[352,2411,2412,2415,2417,2419,2421],{"class":354,"line":441},[352,2413,2414],{"class":2396},"  debug",[352,2416,2400],{"class":1570},[352,2418,2403],{"class":2377},[352,2420,2406],{"class":390},[352,2422,2423],{"class":882},"           // true\n",[352,2425,2426],{"class":354,"line":464},[352,2427,2428],{"class":390},"};\n",[352,2430,2431],{"class":354,"line":490},[352,2432,752],{"emptyLinePlaceholder":751},[352,2434,2435,2437,2439,2442],{"class":354,"line":513},[352,2436,1400],{"class":374},[352,2438,71],{"class":430},[352,2440,2441],{"class":370},"d2Snap",[352,2443,2444],{"class":374},"(\n",[352,2446,2447,2450,2454],{"class":354,"line":534},[352,2448,2449],{"class":374},"  dom: ",[352,2451,2453],{"class":2452},"sqxXB","DOM",[352,2455,2456],{"class":390},",\n",[352,2458,2459,2462,2464,2467,2469,2472],{"class":354,"line":550},[352,2460,2461],{"class":374},"  k: number",[352,2463,723],{"class":390},[352,2465,2466],{"class":374}," l: number",[352,2468,723],{"class":390},[352,2470,2471],{"class":374}," m: number",[352,2473,2456],{"class":390},[352,2475,2476,2479,2481],{"class":354,"line":813},[352,2477,2478],{"class":374},"  options",[352,2480,2400],{"class":366},[352,2482,2483],{"class":374}," Options\n",[352,2485,2486,2489,2493,2495,2498],{"class":354,"line":828},[352,2487,2488],{"class":374},"): ",[352,2490,2492],{"class":2491},"s8Irk","Promise",[352,2494,1571],{"class":366},[352,2496,2497],{"class":374},"string",[352,2499,1626],{"class":366},[352,2501,2502],{"class":354,"line":1794},[352,2503,752],{"emptyLinePlaceholder":751},[352,2505,2506,2508,2510,2513],{"class":354,"line":1804},[352,2507,1400],{"class":374},[352,2509,71],{"class":430},[352,2511,2512],{"class":370},"adaptiveD2Snap",[352,2514,2444],{"class":374},[352,2516,2517,2519,2521],{"class":354,"line":1823},[352,2518,2449],{"class":374},[352,2520,2453],{"class":2452},[352,2522,2456],{"class":390},[352,2524,2525,2528,2530,2534],{"class":354,"line":1841},[352,2526,2527],{"class":374},"  maxTokens: number ",[352,2529,1581],{"class":366},[352,2531,2533],{"class":2532},"sZ_Zo"," 4096",[352,2535,2456],{"class":390},[352,2537,2538,2541,2543,2546],{"class":354,"line":1850},[352,2539,2540],{"class":374},"  maxIterations: number ",[352,2542,1581],{"class":366},[352,2544,2545],{"class":2532}," 5",[352,2547,2456],{"class":390},[352,2549,2550,2552,2554],{"class":354,"line":1856},[352,2551,2478],{"class":374},[352,2553,2400],{"class":366},[352,2555,2483],{"class":374},[352,2557,2558,2560,2562,2564,2566],{"class":354,"line":1862},[352,2559,2488],{"class":374},[352,2561,2492],{"class":2491},[352,2563,1571],{"class":366},[352,2565,2497],{"class":374},[352,2567,1626],{"class":366},[11,2569,2570,2571,2573,2574,2579,2580,2585],{},"Moreover, ",[1308,2572,1400],{}," it is available on the ",[67,2575,2578],{"href":2576,"rel":2577},"https://dev.webfuse.com/automation-api",[165],"Webfuse Automation API",". ",[67,2581,2584],{"href":2582,"rel":2583},"https://www.webfuse.com",[165],"Webfuse"," essentially is a proxy to seamlessly serve any existing web application with custom augmentations, such as a web agent widget.",[343,2587,2591],{"className":2588,"code":2589,"language":2590,"meta":348,"style":348},"language-js shiki shiki-themes catppuccin-latte night-owl","const domSnapshot = await browser.webfuseSession\n    .automation\n    .take_dom_snapshot({ modifier: 'downsample' })\n","js",[143,2592,2593,2598,2603],{"__ignoreMap":348},[352,2594,2595],{"class":354,"line":355},[352,2596,2597],{},"const domSnapshot = await browser.webfuseSession\n",[352,2599,2600],{"class":354,"line":394},[352,2601,2602],{},"    .automation\n",[352,2604,2605],{"class":354,"line":413},[352,2606,2607],{},"    .take_dom_snapshot({ modifier: 'downsample' })\n",[11,2609,2610,2611,2613],{},"Need precise control over the underlying ",[1308,2612,1400],{}," invocation? Configure it exactly how you want:",[343,2615,2617],{"className":2588,"code":2616,"language":2590,"meta":348,"style":348},"const domSnapshot = await browser.webfuseSession\n    .automation\n    .take_dom_snapshot({\n        modifier: {\n            name: 'D2Snap',\n            params: { hierarchyRatio: 0.6, textRatio: 0.2, attributeRatio: 0.8 }\n        }\n    })\n",[143,2618,2619,2623,2627,2632,2637,2642,2647,2652],{"__ignoreMap":348},[352,2620,2621],{"class":354,"line":355},[352,2622,2597],{},[352,2624,2625],{"class":354,"line":394},[352,2626,2602],{},[352,2628,2629],{"class":354,"line":413},[352,2630,2631],{},"    .take_dom_snapshot({\n",[352,2633,2634],{"class":354,"line":441},[352,2635,2636],{},"        modifier: {\n",[352,2638,2639],{"class":354,"line":464},[352,2640,2641],{},"            name: 'D2Snap',\n",[352,2643,2644],{"class":354,"line":490},[352,2645,2646],{},"            params: { hierarchyRatio: 0.6, textRatio: 0.2, attributeRatio: 0.8 }\n",[352,2648,2649],{"class":354,"line":513},[352,2650,2651],{},"        }\n",[352,2653,2654],{"class":354,"line":534},[352,2655,2656],{},"    })\n",[318,2658,2660],{"id":2659},"performance-evaluation","Performance Evaluation",[11,2662,2663,2664,2666,2667,2669,2670,2672],{},"Now for the moment of truth: How does ",[1308,2665,1400],{}," stack up against the industry standard? We evaluated ",[1308,2668,1400],{}," in comparison to a grounded GUI snapshot baseline close to those used by ",[1308,2671,1280],{}," – coloured bounding boxes around visible interactive elements.",[11,2674,2675,2676,2681],{},"To evaluate snapshots isolated from specific agent logic, we crafted a dataset that spans all UI states that occur while solving a related task. We sampled our dataset from the existing ",[67,2677,2680],{"href":2678,"rel":2679},"https://github.com/OSU-NLP-Group/Online-Mind2Web",[165],"Online-Mind2Web"," dataset.",[76,2683],{":width":2684,"alt":2685,"format":80,"loading":133,"src":2686},"800","Exemplary solution UI state trajectory of a defined web-based task","/blog/dom-downsampling-for-web-agents/3.png",[11,2688,2689],{},[1323,2690,2691],{},"Exemplary solution UI state trajectory for the task: “View the pricing plan for 'Business'. Specifically, we have 100 users. We need a 1PB storage quota and a 50 TB transfer quota.”",[11,2693,2694],{},"These are our key findings...",[1469,2696,2698],{"id":2697},"substantial-success-rates","Substantial Success Rates",[11,2700,2701,2702,2704],{},"The results exceeded our expectations. Not only did ",[1308,2703,1400],{}," meet the baseline's performance – our best configuration outperformed it by a significant margin. Full linearisation matches performance, and estimated model input token size order of the baseline.",[76,2706],{":width":2707,"alt":2708,"format":80,"loading":133,"src":2709},"550","Success rate per web agent snapshot subject evaluated across the dataset","/blog/dom-downsampling-for-web-agents/4.png",[1323,2711,2712,2713,2720,2721,2723,2724,2727,2728,123,2731,2734,2735,2738,2739,2742,2743,2746],{},"\n  Success rate per web agent snapshot subject evaluated across the dataset.\n  Labels: ",[143,2714,2715,2716],{},"GUI",[2717,2718,2719],"sub",{}," gr.",": Baseline, ",[143,2722,2453],{},": Raw DOM (cut-off at ~8K tokens), ",[143,2725,2726],{},"k( l m)",": Parameter values; e.g., ",[143,2729,2730],{},".9 .3 .6",[143,2732,2733],{},".4"," if equal). ",[143,2736,2737],{},"∞",": Linearisation,  ",[143,2740,2741],{},"8192 / 32768",": via token-limited (resp.) ",[2744,2745,2303],"i",{},".\n",[1469,2748,2750],{"id":2749},"containable-token-and-byte-size","Containable Token and Byte Size",[11,2752,2753,2754,2756],{},"Even light downsampling delivers dramatic size reductions. Most ",[1308,2755,1400],{}," configurations average just one token order above the baseline – a massive improvement over raw DOM snapshots. Better yet, most DOMs from the dataset could actually be downsampled to the baseline order. And while image data balloons in file size, our text-based approach stays lean and efficient.",[76,2758],{":width":2684,"alt":2759,"format":80,"loading":133,"src":2760},"Comparison of mean input size across and per subject","/blog/dom-downsampling-for-web-agents/5.png",[1323,2762,2763,2764,2767,2768,2770],{},"\n  Left: Comparison of mean input size (tokens vs bytes) across and per subject.",[2765,2766],"br",{},"\n  Right: Estimated input token size across the dataset created by a single ",[2744,2769,1400],{}," evaluation subject.\n",[1469,2772,2774],{"id":2773},"hierarchy-actually-matters","Hierarchy Actually Matters",[11,2776,2777],{},"Which UI feature matters most for LLM web agent backend performance? We alternated parameter configurations to find out. Interestingly, hierarchy reveals itself as the strongest of the three assessed features. Element extraction throws away hierarchy, which suggests that downsampling is a superior technique.",[1480,2779,2782,2787],{"className":2780,"dataFootnotes":348},[2781],"footnotes",[88,2783,2786],{"className":2784,"id":1340},[2785],"sr-only","Footnotes",[1349,2788,2789,2803,2814,2825],{},[29,2790,2792,34,2796],{"id":2791},"user-content-fn-1",[67,2793,2794],{"href":2794,"rel":2795},"https://arxiv.org/abs/2210.03945",[165],[67,2797,2802],{"href":2798,"ariaLabel":2799,"className":2800,"dataFootnoteBackref":348},"#user-content-fnref-1","Back to reference 1",[2801],"data-footnote-backref","↩",[29,2804,2806,34,2809],{"id":2805},"user-content-fn-2",[67,2807,1406],{"href":1406,"rel":2808},[165],[67,2810,2802],{"href":2811,"ariaLabel":2812,"className":2813,"dataFootnoteBackref":348},"#user-content-fnref-2","Back to reference 2",[2801],[29,2815,2817,34,2820],{"id":2816},"user-content-fn-3",[67,2818,2341],{"href":2341,"rel":2819},[165],[67,2821,2802],{"href":2822,"ariaLabel":2823,"className":2824,"dataFootnoteBackref":348},"#user-content-fnref-3","Back to reference 3",[2801],[29,2826,2828,34,2832],{"id":2827},"user-content-fn-4",[67,2829,2830],{"href":2830,"rel":2831},"https://aclanthology.org/W04-3252",[165],[67,2833,2802],{"href":2834,"ariaLabel":2835,"className":2836,"dataFootnoteBackref":348},"#user-content-fnref-4","Back to reference 4",[2801],[1193,2838,2839],{},"html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .s9rnR, html code.shiki .s9rnR{--shiki-default:#179299;--shiki-dark:#7FDBCA}html pre.shiki code .sY2RG, html code.shiki .sY2RG{--shiki-default:#1E66F5;--shiki-dark:#CAECE6}html pre.shiki code .swkLt, html code.shiki .swkLt{--shiki-default:#DF8E1D;--shiki-default-font-style:inherit;--shiki-dark:#C5E478;--shiki-dark-font-style:italic}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .sfrMT, html code.shiki .sfrMT{--shiki-default:#40A02B;--shiki-dark:#ECC48D}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .sXbZB, html code.shiki .sXbZB{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .s-_ek, html code.shiki .s-_ek{--shiki-default:#179299;--shiki-dark:#C792EA}html pre.shiki code .s-DR7, html code.shiki .s-DR7{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#FFCB8B;--shiki-dark-font-style:inherit}html pre.shiki code .scrte, html code.shiki .scrte{--shiki-default:#8839EF;--shiki-dark:#C5E478}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .swl0y, html code.shiki .swl0y{--shiki-default:#4C4F69;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .sDmS1, html code.shiki .sDmS1{--shiki-default:#7C7F93;--shiki-default-font-style:italic;--shiki-dark:#637777;--shiki-dark-font-style:italic}html pre.shiki code .s5FwJ, html code.shiki .s5FwJ{--shiki-default:#179299;--shiki-default-font-style:inherit;--shiki-dark:#C792EA;--shiki-dark-font-style:italic}html pre.shiki code .sNstc, html code.shiki .sNstc{--shiki-default:#1E66F5;--shiki-default-font-style:italic;--shiki-dark:#82AAFF;--shiki-dark-font-style:italic}html pre.shiki code .sqxXB, html code.shiki .sqxXB{--shiki-default:#4C4F69;--shiki-dark:#82AAFF}html pre.shiki code .s8Irk, html code.shiki .s8Irk{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#C5E478;--shiki-dark-font-style:inherit}html pre.shiki code .sZ_Zo, html code.shiki .sZ_Zo{--shiki-default:#FE640B;--shiki-dark:#F78C6C}",{"title":348,"searchDepth":394,"depth":394,"links":2841},[2842,2846,2847,2854],{"id":1288,"depth":394,"text":1289,"children":2843},[2844,2845],{"id":1298,"depth":413,"text":1299},{"id":1328,"depth":413,"text":1329},{"id":1379,"depth":394,"text":1380},{"id":1397,"depth":394,"text":1400,"children":2848},[2849,2850,2851,2852,2853],{"id":1434,"depth":413,"text":1435},{"id":1555,"depth":413,"text":1556},{"id":2300,"depth":413,"text":2303},{"id":2315,"depth":413,"text":2316},{"id":2659,"depth":413,"text":2660},{"id":1340,"depth":394,"text":2786},"2025-08-18","We propose D2Snap – a first-of-its-kind downsampling algorithm for DOMs. D2Snap can be used as a pre-processing technique for DOM snapshots to optimise web agency context quality and token costs.",{"homepage":751,"relatedLinks":2858},[2859,2862,2865],{"text":2860,"href":303,"description":2861},"What is a Website Snapshot?","Learn what a website snapshot is and how to utilise it for web agents",{"text":2863,"href":2330,"description":2864},"What is a Web Agent?","Learn the basics of web agents",{"text":2578,"href":2866,"external":751,"description":2867},"https://dev.webfuse.com/automation-api#take_dom_snapshot","Check out the Webfuse Automation API",{"title":309,"description":2856},{"loc":308},"blog/1012.dom-downsampling-for-llm-based-web-agents",[1210,2872,2873,2874,1252,2875],"browser-agents","llms","llm-context","web-automation","bGJtg_9k7O95O2CJswaRFj4ONGhX4hGr_8aL5dhDZms",{"id":2878,"title":2879,"authorId":1257,"body":2880,"category":1210,"created":3607,"description":3608,"extension":1213,"faqs":1231,"featurePriority":394,"head":1231,"landingPath":1231,"meta":3609,"navigation":751,"ogImage":1231,"path":2330,"robots":1231,"schemaOrg":1231,"seo":3618,"sitemap":3619,"stem":3620,"tags":3621,"__hash__":3622},"blog/blog/1011.a-gentle-introduction-to-ai-agents-for-the-web.md","A Gentle Introduction to AI Agents for the Web",{"type":8,"value":2881,"toc":3588},[2882,2896,2899,2906,2912,2916,2919,2934,2938,2948,2952,2956,2969,2973,2977,2980,2985,2989,2998,3002,3013,3018,3022,3040,3044,3050,3151,3154,3387,3403,3407,3410,3415,3419,3422,3426,3444,3469,3476,3480,3518,3521,3532,3536,3539,3567,3571,3579,3585],[11,2883,2884,2885,119,2889,1448,2892,2895],{},"In no time, AI became a natural part of modern web interfaces. AI agents for the web enjoy a recent hype, sparked by the means of ",[67,2886,1270],{"href":2887,"rel":2888},"https://openai.com/index/introducing-operator/",[165],[67,2890,1275],{"href":1273,"rel":2891},[165],[67,2893,1280],{"href":1278,"rel":2894},[165],". By now, it is within reach to automate arbitrary web-based tasks, such as booking the cheapest flight from Berlin to Amsterdam.",[88,2897,2863],{"id":2898},"what-is-a-web-agent",[11,2900,2901,2902,2905],{},"For starters, let us break down the term ",[15,2903,2904],{},"web AI agent",": An agent is an entity that autonomously acts on behalf of another entity. An artificially intelligent agent is an application that acts on behalf of a human. In contrast to non-AI computer agents, it solves complex tasks with at least human-grade effectiveness and efficiency. For a human-centric web, web agents have deliberately been designed to browse the web in a human fashion – through UIs rather than APIs.",[76,2907],{":width":2908,"alt":2909,"format":2910,"loading":133,"src":2911},"610","High-level agent description comparing human and computer agents","svg","/blog/a-gentle-introduction-to-ai-agents-for-the-web/1.svg",[318,2913,2915],{"id":2914},"the-role-of-frontier-llms","The Role of Frontier LLMs",[11,2917,2918],{},"Web agents have been a vague desire for a long time. AI agents used to rely on complete models of a problem domain in order to allow (heuristic) search through problem states. Such models would comprise the problem world (e.g., a chessboard), actors (pawns, rooks, etc.), possible actions per actor (rook moves straight), and constraints (i.a., max one piece per field). A heterogeneous space of web application UIs describes the problem domain of a web agent: how to understand a web page, and how to interact with it to solve the declared task?",[11,2920,2921,2922,2929,2930,2933],{},"Frontier LLMs disrupted the AI agent world: explicit problem domain models beyond feasibility can now be replaced by an LLM. The LLM thereby acts as an instantaneous domain model backend that can be consulted with twofold context: serialised problem state, such as a chess position code (",[1308,2923,2924,2925,2928],{},"“",[352,2926,2927],{},"..."," e4 e5 2. Nc3 f5”","), and the respective task (",[1308,2931,2932],{},"“What is the best move for white?”","). For web agents, problem state corresponds to the currently browsed web application's runtime state, for instance, a screenshot.",[318,2935,2937],{"id":2936},"generalist-web-agents","Generalist Web Agents",[11,2939,2940,2941,1448,2944,2947],{},"Generalist web agents are supposed to solve arbitrary tasks through a web browser. Web-based tasks can be as diverse as ",[1308,2942,2943],{},"“Find a picture of a cat.”",[1308,2945,2946],{},"“Book the cheapest flight from Berlin to Amsterdam tomorrow afternoon (business class, window seat).”"," In reality, generalist agents still fail uncommon or too precise tasks. While they have been critically acclaimed, they mainly act as early proofs-of-concept. Tasks that are indeed solvable with a generalist agent promise great results with an according specialist agent.",[76,2949],{":width":78,"alt":2950,"format":80,"loading":133,"src":2951},"Screenshot of a generalist web agent UI (Director)","/blog/a-gentle-introduction-to-ai-agents-for-the-web/2.png",[318,2953,2955],{"id":2954},"specialist-web-agents","Specialist Web Agents",[11,2957,2958,2959,2962,2963,2968],{},"Other than generalist agents, specialist web agents are constrained to a certain task and application domain. Specialist agents bear the major share of commercial value. Most prominently, modal chat agents that provide users with on-page help. Picture a little floating widget that can be chatted to via text or voice input. In most cases, in fact, the term ",[1308,2960,2961],{},"web (AI) agent"," refers to chat agents. Chat agents – text or voice – can be implemented on top of virtually any existing website. Frontier LLMs provide a lot of commonsense out-of-the-box. A ",[67,2964,2967],{"href":2965,"rel":2966},"https://docs.claude.com/en/docs/build-with-claude/prompt-engineering/system-prompts",[165],"system prompt"," can, moreover, be leveraged to drive specialist agent quality for the respective problem domain.",[76,2970],{":width":78,"alt":2971,"format":80,"loading":133,"src":2972},"Screenshots of two modal specialist web agent UIs augmenting an underlying website's UI","/blog/a-gentle-introduction-to-ai-agents-for-the-web/3.png",[88,2974,2976],{"id":2975},"how-does-a-web-agent-work","How Does a Web Agent Work?",[11,2978,2979],{},"LLM-based web agents are premised on a more or less uniform architecture. The agent application embodies a mediator between a web browser (environment), and the LLM backend (model).",[76,2981],{":width":2982,"alt":2983,"format":2910,"loading":133,"src":2984},"480","High-level web agent architecture component view","/blog/a-gentle-introduction-to-ai-agents-for-the-web/4.svg",[318,2986,2988],{"id":2987},"the-agent-lifecycle","The Agent Lifecycle",[11,2990,2991,2992,2997],{},"To reduce a user's cognitive load, solving a web-based task is usually chunked into a sequence of UI states. Consider looking for rental apartments on ",[67,2993,2996],{"href":2994,"rel":2995},"https://www.redfin.com",[165],"redfin.com",": In the first step, you specify a location. Only subsequently are you provided with a grid of available apartments for that location.",[76,2999],{":width":78,"alt":3000,"format":80,"loading":133,"src":3001},"Example of separated UI states in a rental home search application","/blog/a-gentle-introduction-to-ai-agents-for-the-web/5.png",[11,3003,3004,3005,3012],{},"Web agent logic is iterative; not least for a sequential web interaction model, but also for a conversational agent interaction model. Browsing the web, human and computer agents represent users alike. That said, Norman's well-known ",[67,3006,3009],{"href":3007,"rel":3008},"https://mitpress.mit.edu/9780262640374/the-design-of-everyday-things/",[165],[1308,3010,3011],{},"Seven Stages of Action",", which hierarchically model the human cognition cycle, transfer to the web agent lifecycle. For each UI state in a web browser (environment) and web-based task (action intention); decide where to click, type, etc. (action planning), and perform those clicks, etc. (action execution). Afterwards, perceive, interpret, and evaluate the results of those actions in the web browser (state). As long as there is a mismatch between the evaluated state and the declared goal state, repeat that cycle. Potentially prompt the user with more required information.",[76,3014],{":width":3015,"alt":3016,"format":2910,"loading":133,"src":3017},"580","Donald 'Norman's Seven Stages of Action' model of the human cognition cycle that transfers to non-human agents","/blog/a-gentle-introduction-to-ai-agents-for-the-web/6.svg",[318,3019,3021],{"id":3020},"web-context-for-llms","Web Context for LLMs",[11,3023,3024,3025,3027,3028,3031,3032,3035,3036,3039],{},"The gap from an agent towards the environment, according to ",[1308,3026,3011],{},", is known as the ",[1308,3029,3030],{},"gulf of execution",". In real-world scenarios, how to act in the environment in respect to a planned sequence of actions might be difficult (e.g., how to actually open the trunk of a new car?). Arguably, web agents face a novel ",[1308,3033,3034],{},"gulf of intention"," towards the action planning stage: how to serialise a currently browsed web page's runtime state for LLMs? ",[1308,3037,3038],{},"Snapshot"," is a more comprehensive term to describe the serialisation of a web page's current runtime state. Screenshots, for instance, represent a type of snapshot that closely resembles how humans perceive a web page at a given point in time. But are they as accessible to LLMs?",[318,3041,3043],{"id":3042},"agentic-ui-interaction","Agentic UI Interaction",[11,3045,3046,3047,3049],{},"With a qualified set of well-defined actuation methods, web agents are able to close the ",[1308,3048,3030],{}," quite well. HTML element types strongly afford a certain action (e.g., click a button, type to a field). Below is how an actuation schema to present the LLM backend with could look like:",[343,3051,3053],{"className":2347,"code":3052,"language":2349,"meta":348,"style":348},"interface ActuationSchema = {\n    thought: string;\n    action: \"click\"\n        | \"scroll\"\n        | \"type\";\n    cssSelector: string;\n    data?: string;\n}[];\n",[143,3054,3055,3069,3080,3096,3108,3120,3131,3142],{"__ignoreMap":348},[352,3056,3057,3060,3063,3066],{"class":354,"line":355},[352,3058,3059],{"class":358},"interface",[352,3061,3062],{"class":2359}," ActuationSchema",[352,3064,3065],{"class":374}," = ",[352,3067,3068],{"class":390},"{\n",[352,3070,3071,3074,3076,3078],{"class":354,"line":394},[352,3072,3073],{"class":374},"    thought",[352,3075,732],{"class":1570},[352,3077,2378],{"class":2377},[352,3079,391],{"class":390},[352,3081,3082,3085,3087,3089,3093],{"class":354,"line":413},[352,3083,3084],{"class":374},"    action",[352,3086,732],{"class":1570},[352,3088,735],{"class":378},[352,3090,3092],{"class":3091},"sgAC-","click",[352,3094,3095],{"class":378},"\"\n",[352,3097,3098,3101,3103,3106],{"class":354,"line":441},[352,3099,3100],{"class":1570},"        |",[352,3102,735],{"class":378},[352,3104,3105],{"class":3091},"scroll",[352,3107,3095],{"class":378},[352,3109,3110,3112,3114,3116,3118],{"class":354,"line":464},[352,3111,3100],{"class":1570},[352,3113,735],{"class":378},[352,3115,2356],{"class":3091},[352,3117,379],{"class":378},[352,3119,391],{"class":390},[352,3121,3122,3125,3127,3129],{"class":354,"line":490},[352,3123,3124],{"class":374},"    cssSelector",[352,3126,732],{"class":1570},[352,3128,2378],{"class":2377},[352,3130,391],{"class":390},[352,3132,3133,3136,3138,3140],{"class":354,"line":513},[352,3134,3135],{"class":374},"    data",[352,3137,2400],{"class":1570},[352,3139,2378],{"class":2377},[352,3141,391],{"class":390},[352,3143,3144,3146,3149],{"class":354,"line":534},[352,3145,553],{"class":390},[352,3147,3148],{"class":374},"[]",[352,3150,391],{"class":390},[11,3152,3153],{},"And a suggested actions response could, in turn, look as follows:",[343,3155,3159],{"className":3156,"code":3157,"language":3158,"meta":348,"style":348},"language-json shiki shiki-themes catppuccin-latte night-owl","[\n    {\n        \"thought\": \"Scroll newsletter cta into view\",\n        \"action\": \"scroll\",\n        \"cssSelector\": \"section#newsletter\"\n    },\n    {\n        \"thought\": \"Type email address to newsletter cta\",\n        \"action\": \"type\",\n        \"cssSelector\": \"section#newsletter > input\",\n        \"data\": \"user@example.org\"\n    },\n    {\n        \"thought\": \"Submit newsletter sign up\",\n        \"action\": \"click\",\n        \"cssSelector\": \"section#newsletter > button\"\n    }\n]\n","json",[143,3160,3161,3166,3171,3195,3214,3232,3237,3241,3260,3278,3297,3315,3319,3323,3342,3360,3377,3382],{"__ignoreMap":348},[352,3162,3163],{"class":354,"line":355},[352,3164,3165],{"class":390},"[\n",[352,3167,3168],{"class":354,"line":394},[352,3169,3170],{"class":390},"    {\n",[352,3172,3173,3177,3181,3183,3185,3187,3191,3193],{"class":354,"line":413},[352,3174,3176],{"class":3175},"srFR9","        \"",[352,3178,3180],{"class":3179},"s30W1","thought",[352,3182,379],{"class":3175},[352,3184,732],{"class":390},[352,3186,735],{"class":378},[352,3188,3190],{"class":3189},"sCC8C","Scroll newsletter cta into view",[352,3192,379],{"class":378},[352,3194,2456],{"class":390},[352,3196,3197,3199,3202,3204,3206,3208,3210,3212],{"class":354,"line":441},[352,3198,3176],{"class":3175},[352,3200,3201],{"class":3179},"action",[352,3203,379],{"class":3175},[352,3205,732],{"class":390},[352,3207,735],{"class":378},[352,3209,3105],{"class":3189},[352,3211,379],{"class":378},[352,3213,2456],{"class":390},[352,3215,3216,3218,3221,3223,3225,3227,3230],{"class":354,"line":464},[352,3217,3176],{"class":3175},[352,3219,3220],{"class":3179},"cssSelector",[352,3222,379],{"class":3175},[352,3224,732],{"class":390},[352,3226,735],{"class":378},[352,3228,3229],{"class":3189},"section#newsletter",[352,3231,3095],{"class":378},[352,3233,3234],{"class":354,"line":490},[352,3235,3236],{"class":390},"    },\n",[352,3238,3239],{"class":354,"line":513},[352,3240,3170],{"class":390},[352,3242,3243,3245,3247,3249,3251,3253,3256,3258],{"class":354,"line":534},[352,3244,3176],{"class":3175},[352,3246,3180],{"class":3179},[352,3248,379],{"class":3175},[352,3250,732],{"class":390},[352,3252,735],{"class":378},[352,3254,3255],{"class":3189},"Type email address to newsletter cta",[352,3257,379],{"class":378},[352,3259,2456],{"class":390},[352,3261,3262,3264,3266,3268,3270,3272,3274,3276],{"class":354,"line":550},[352,3263,3176],{"class":3175},[352,3265,3201],{"class":3179},[352,3267,379],{"class":3175},[352,3269,732],{"class":390},[352,3271,735],{"class":378},[352,3273,2356],{"class":3189},[352,3275,379],{"class":378},[352,3277,2456],{"class":390},[352,3279,3280,3282,3284,3286,3288,3290,3293,3295],{"class":354,"line":813},[352,3281,3176],{"class":3175},[352,3283,3220],{"class":3179},[352,3285,379],{"class":3175},[352,3287,732],{"class":390},[352,3289,735],{"class":378},[352,3291,3292],{"class":3189},"section#newsletter > input",[352,3294,379],{"class":378},[352,3296,2456],{"class":390},[352,3298,3299,3301,3304,3306,3308,3310,3313],{"class":354,"line":828},[352,3300,3176],{"class":3175},[352,3302,3303],{"class":3179},"data",[352,3305,379],{"class":3175},[352,3307,732],{"class":390},[352,3309,735],{"class":378},[352,3311,3312],{"class":3189},"user@example.org",[352,3314,3095],{"class":378},[352,3316,3317],{"class":354,"line":1794},[352,3318,3236],{"class":390},[352,3320,3321],{"class":354,"line":1804},[352,3322,3170],{"class":390},[352,3324,3325,3327,3329,3331,3333,3335,3338,3340],{"class":354,"line":1823},[352,3326,3176],{"class":3175},[352,3328,3180],{"class":3179},[352,3330,379],{"class":3175},[352,3332,732],{"class":390},[352,3334,735],{"class":378},[352,3336,3337],{"class":3189},"Submit newsletter sign up",[352,3339,379],{"class":378},[352,3341,2456],{"class":390},[352,3343,3344,3346,3348,3350,3352,3354,3356,3358],{"class":354,"line":1841},[352,3345,3176],{"class":3175},[352,3347,3201],{"class":3179},[352,3349,379],{"class":3175},[352,3351,732],{"class":390},[352,3353,735],{"class":378},[352,3355,3092],{"class":3189},[352,3357,379],{"class":378},[352,3359,2456],{"class":390},[352,3361,3362,3364,3366,3368,3370,3372,3375],{"class":354,"line":1850},[352,3363,3176],{"class":3175},[352,3365,3220],{"class":3179},[352,3367,379],{"class":3175},[352,3369,732],{"class":390},[352,3371,735],{"class":378},[352,3373,3374],{"class":3189},"section#newsletter > button",[352,3376,3095],{"class":378},[352,3378,3379],{"class":354,"line":1856},[352,3380,3381],{"class":390},"    }\n",[352,3383,3384],{"class":354,"line":1862},[352,3385,3386],{"class":390},"]\n",[1457,3388,3389],{},[11,3390,3391,3396,3397,3402],{},[67,3392,3395],{"href":3393,"rel":3394},"https://platform.openai.com/docs/guides/function-calling",[165],"Function Calling"," and the ",[67,3398,3401],{"href":3399,"rel":3400},"https://modelcontextprotocol.io",[165],"Model Context Protocol"," represent two ends to outsource an explicit actuation model – server- and client-side, respectively.",[318,3404,3406],{"id":3405},"agentic-ui-augmentation","Agentic UI Augmentation",[11,3408,3409],{},"An agent represents yet another feature to integrate with an application and its UI. Discoverability and availability, however, are among the most fundamental requirements of a web agent. Evidently, when a user experiences UI/UX friction, at least the agent should be interactive. That said, a scrolling modal web agent UI has been the go-to approach, that is, a little floating widget on top of the underlying application's UI. It comes with a major advantage: the agent application can be decoupled from the underlying, self-contained application.",[76,3411],{":width":3412,"alt":3413,"format":2910,"loading":133,"src":3414},"360","Depiction of a web agent application augmenting an underlying application in an isolated layer","/blog/a-gentle-introduction-to-ai-agents-for-the-web/7.svg",[88,3416,3418],{"id":3417},"how-to-build-a-web-agent","How to Build a Web Agent?",[11,3420,3421],{},"Believe it or not: enhancing an existing web application with a purposeful agent is a lower-hanging fruit. The evolving agent ecosystem provides you with a spectrum of solutions: instantly use a pre-compiled agent, tweak a templated agent, or develop an agent from scratch. Either way, LLMs and web browsers exist for reuse, boiling down agent development to LLM context engineering, and UI augmentation.",[318,3423,3425],{"id":3424},"develop-a-web-agent","Develop a Web Agent",[11,3427,3428,3429,3432,3433,1448,3438,3443],{},"Opting for a ",[15,3430,3431],{},"pre-compiled agent"," does not necessarily involve any actual development step. Instead, pre-compiled agents allow for high-level configuration through an agent-as-a-service provider's interface. Popular agent-as-a-service providers are, i.a., ",[67,3434,3437],{"href":3435,"rel":3436},"https://elevenlabs.io/conversational-ai",[165],"ElevenLabs",[67,3439,3442],{"href":3440,"rel":3441},"https://www.intercom.com/drlp/ai-agent",[165],"Intercom",". Serviced agents hide LLM communication and potentially interaction with a web browser behind the configuration interface.",[11,3445,3446,3447,3450,3451,3456,3457,3462,3463,3468],{},"Using a ",[15,3448,3449],{},"templated agent"," resembles the agent-as-a-service approach on a lower level. Openly sourced from a ",[67,3452,3455],{"href":3453,"rel":3454},"https://github.com/webfuse-com/agent-extension-blueprint",[165],"code repository",", templated agents allow for any kind of development tweaks. Favourably, agent templates shortcut integration with ",[67,3458,3461],{"href":3459,"rel":3460},"https://openai.com/api/",[165],"LLM APIs"," and web ",[67,3464,3467],{"href":3465,"rel":3466},"https://developer.mozilla.org/en-US/docs/Web/API",[165],"browser APIs",". Using a templated agent usually represents the preferable, best-of-both-worlds approach; common- and best-practice code snippets are available from the beginning, but everything can be customised as desired.",[11,3470,3471,3472,3475],{},"Of course, developing an ",[15,3473,3474],{},"agent from scratch"," is always an option. It is preferable whenever agent requirements deviate to a large extent from what exists in the service or template landscape.",[318,3477,3479],{"id":3478},"deploy-a-web-agent","Deploy a Web Agent",[11,3481,3482,3483,305,3488,3493,3494,3499,3500,3505,3506,3511,3512,3517],{},"When web agent code lives side-by-side with the augmented application's code, agent deployment is covered by a generic pipeline. Something like: ",[67,3484,3487],{"href":3485,"rel":3486},"https://eslint.org",[165],"linting",[67,3489,3492],{"href":3490,"rel":3491},"https://prettier.io",[165],"formatting"," agent code, ",[67,3495,3498],{"href":3496,"rel":3497},"https://esbuild.github.io",[165],"transpiling and bundling"," agent modules, ",[67,3501,3504],{"href":3502,"rel":3503},"https://www.cypress.io",[165],"testing"," agent, ",[67,3507,3510],{"href":3508,"rel":3509},"https://pages.cloudflare.com",[165],"hosting"," agent bundle, and ",[67,3513,3516],{"href":3514,"rel":3515},"https://docs.github.com/en/actions/get-started/continuous-integration",[165],"tiggering"," post deployment events. In that case, an agent represents a modular feature component in the application, no different than, for instance, a sign-up component.",[11,3519,3520],{},"Web agent source code right inside the application codebase comes at a cost:",[26,3522,3523,3526,3529],{},[29,3524,3525],{},"Agent developers can manipulate the source code of the underlying application.",[29,3527,3528],{},"Agent functionality could introduce side effects on the underlying application.",[29,3530,3531],{},"Agent changes require deployment of the entire application.",[318,3533,3535],{"id":3534},"best-practices-of-agentic-ux","Best Practices of Agentic UX",[11,3537,3538],{},"When designing user experiences for agent-enhanced applications, there are a few things to consider:",[26,3540,3541,3542,3541,3551,3541,3559],{},"\n    ",[29,3543,3544,3545,3544,3548,3550],{},"\n        ",[15,3546,3547],{},"Stream input and output to reduce latency",[2765,3549],{},"\n        LLMs (re-)introduce noticeable communication round-trip time. To reduce wait time for the human user, stream chunks of data whenever they are available.\n    ",[29,3552,3544,3553,3544,3556,3558],{},[15,3554,3555],{},"Provide fine-grained feedback to bridge high-latency",[2765,3557],{},"\n        Human attention is sensitive to several seconds of [system response time](https://www.nngroup.com/articles/response-times-3-important-limits/). Periodically provide agent _thoughts_ as feedback to perceptibly break down round-trip time.\n    ",[29,3560,3544,3561,3544,3564,3566],{},[15,3562,3563],{},"Always prompt the human user for consent to perform critical actions",[2765,3565],{},"\n        Some actions in a web application lead to irreversible or significant changes of state. Never have the agent perform such actions on behalf of the user without explicitly asking for the permission.\n    ",[318,3568,3570],{"id":3569},"non-invasive-web-agents-with-webfuse","Non-Invasive Web Agents with Webfuse",[11,3572,3573,3578],{},[67,3574,3576],{"href":2582,"rel":3575},[165],[15,3577,2584],{}," is a configurable web proxy that lets you augment any web application. As pictured, web agents represent highly self-contained applications. Moreover, web agents and underlying applications communicate at runtime in the client. This does, in fact, render opportunities to bridge the above-mentioned drawbacks with Webfuse: Develop web agents with a sandbox extension methodology, and deploy them through the low-latency proxy layer. On demand, seamlessly serve users with your agent-enhanced website. Benefit from information hiding, safe code, and fewer deployments.",[3580,3581],"article-signup-cta",{":demoAction":3582,"heading":3583,"subtitle":3584},"{\"text\":\"Read more\",\"showIcon\":false,\"href\":\"https://www.webfuse.com/blog/category/ai-agents\"}","Deploy Web Agents with Webfuse","Develop or deploy web agents in minutes; serve agent-enhanced websites through an isolated application layer.",[1193,3586,3587],{},"html pre.shiki code .s76yb, html code.shiki .s76yb{--shiki-default:#8839EF;--shiki-dark:#C792EA}html pre.shiki code .sXbZB, html code.shiki .sXbZB{--shiki-default:#DF8E1D;--shiki-default-font-style:italic;--shiki-dark:#D6DEEB;--shiki-dark-font-style:inherit}html pre.shiki code .s2kId, html code.shiki .s2kId{--shiki-default:#4C4F69;--shiki-dark:#D6DEEB}html pre.shiki code .scGhl, html code.shiki .scGhl{--shiki-default:#7C7F93;--shiki-dark:#D6DEEB}html pre.shiki code .s9rnR, html code.shiki .s9rnR{--shiki-default:#179299;--shiki-dark:#7FDBCA}html pre.shiki code .scrte, html code.shiki .scrte{--shiki-default:#8839EF;--shiki-dark:#C5E478}html pre.shiki code .sbuKk, html code.shiki .sbuKk{--shiki-default:#40A02B;--shiki-dark:#D9F5DD}html pre.shiki code .sgAC-, html code.shiki .sgAC-{--shiki-default:#40A02B;--shiki-default-font-style:italic;--shiki-dark:#ECC48D;--shiki-dark-font-style:inherit}html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html pre.shiki code .srFR9, html code.shiki .srFR9{--shiki-default:#7C7F93;--shiki-dark:#7FDBCA}html pre.shiki code .s30W1, html code.shiki .s30W1{--shiki-default:#1E66F5;--shiki-dark:#7FDBCA}html pre.shiki code .sCC8C, html code.shiki .sCC8C{--shiki-default:#40A02B;--shiki-dark:#C789D6}",{"title":348,"searchDepth":394,"depth":394,"links":3589},[3590,3595,3601],{"id":2898,"depth":394,"text":2863,"children":3591},[3592,3593,3594],{"id":2914,"depth":413,"text":2915},{"id":2936,"depth":413,"text":2937},{"id":2954,"depth":413,"text":2955},{"id":2975,"depth":394,"text":2976,"children":3596},[3597,3598,3599,3600],{"id":2987,"depth":413,"text":2988},{"id":3020,"depth":413,"text":3021},{"id":3042,"depth":413,"text":3043},{"id":3405,"depth":413,"text":3406},{"id":3417,"depth":394,"text":3418,"children":3602},[3603,3604,3605,3606],{"id":3424,"depth":413,"text":3425},{"id":3478,"depth":413,"text":3479},{"id":3534,"depth":413,"text":3535},{"id":3569,"depth":413,"text":3570},"2025-06-15","LLMs only recently enabled serviceable web agents: autonomous systems that browse web on behalf of a human. Get started with fundamental methodology, key design challenges, and technological opportunities.",{"homepage":751,"relatedLinks":3610},[3611,3612,3616],{"text":2860,"href":303,"description":2861},{"text":3613,"href":3614,"description":3615},"Develop an AI Agent for Any Website with Webfuse","/blog/develop-an-ai-agent-for-any-website-with-webfuse","Learn how to develop and deploy a web agent for any website with Webfuse",{"text":2578,"href":3617,"external":751,"description":2867},"https://dev.webfuse.com/automation-api/",{"title":2879,"description":3608},{"loc":2330},"blog/1011.a-gentle-introduction-to-ai-agents-for-the-web",[1210,2872,2873,1252,2875],"Ky-gggxmZkldeN3wb7OvPpBxNaP72MwefaxFypvbUzY",1777376332612]