
Prior to v1.18.1, Puppeteer required at least Node v6.4.0. Starting from v3.0.0, Puppeteer relies on Node 10.18.1+. All examples below use async/await, which is only supported in Node v7.6.0 or greater.

Puppeteer will be familiar to people using other browser testing frameworks. You create an instance of Browser, open pages, and then manipulate them with Puppeteer's API.

Example: navigating to a page and saving a screenshot as example.png:

const puppeteer = require('puppeteer');
const browser = await puppeteer. …

You can also use Puppeteer with Firefox Nightly (experimental support). See Puppeteer.launch() for more information. See this article for a description of the differences between Chromium and Chrome. This article describes some differences for Linux users. Puppeteer creates its own browser user profile, which it cleans up on every run.

In this article, we'll see how easy it is to perform web scraping (web automation) with the somewhat non-traditional method of using a headless browser.

What Is a Headless Browser and Why Is It Needed?

The last few years have seen the web evolve from simplistic websites built with bare HTML and CSS. Now there are much more interactive web apps with beautiful UIs, often built with frameworks such as Angular or React. In other words, JavaScript now rules the web, including almost everything you interact with on websites.

For our purposes, JavaScript is a client-side language. The server returns JavaScript files or scripts injected into an HTML response, and the browser processes them. This is a problem if we are doing web scraping or web automation because, more often than not, the content that we'd like to see or scrape is rendered by JavaScript code and is not accessible from the raw HTML response that the server delivers.

As we mentioned above, browsers do know how to process JavaScript and render beautiful web pages. Now, what if we could leverage this functionality for our scraping needs and had a way to control browsers programmatically? That's exactly where headless browser automation steps in!

Headless? Excuse me? Yes, this just means there's no graphical user interface (GUI). Instead of interacting with visual elements the way you normally would (for example, with a mouse or touch device), you automate use cases with a command-line interface (CLI).

There are many web scraping tools that can be used for headless browsing, like Zombie.js or headless Firefox using Selenium. But today we'll be exploring headless Chrome via Puppeteer, as it's a relatively new player, released at the start of 2018.
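To make the Puppeteer workflow concrete (create a Browser, open a page, then work with it through Puppeteer's API), here is a minimal sketch rather than a definitive implementation. The function name scrapeRenderedText and its parameters are hypothetical, chosen for illustration. The URL, selector, and file name in the usage comment are assumptions as well. The Puppeteer module is passed in as an argument so the function itself does not depend on how Puppeteer is installed.

```javascript
// Sketch: load a page in headless Chrome and read text that was rendered by JavaScript.
// `puppeteerLib` is a hypothetical parameter name; in a real script you would
// pass the module returned by require('puppeteer').
async function scrapeRenderedText(puppeteerLib, url, selector, screenshotPath) {
  // launch() starts the browser; Puppeteer runs it headless by default.
  const browser = await puppeteerLib.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);

    // Wait until the element exists in the rendered DOM, since it may be
    // created by client-side JavaScript after the raw HTML has arrived.
    await page.waitForSelector(selector);

    // $eval runs the callback inside the page against the first matching element.
    const text = await page.$eval(selector, (el) => el.textContent.trim());

    // Optionally save a screenshot of the page.
    if (screenshotPath) {
      await page.screenshot({ path: screenshotPath });
    }
    return text;
  } finally {
    // Always close the browser, even if navigation or scraping failed.
    await browser.close();
  }
}

// Example usage (assumes `npm install puppeteer` and network access;
// the URL, selector, and file name are placeholders):
// const puppeteer = require('puppeteer');
// scrapeRenderedText(puppeteer, 'https://example.com', 'h1', 'example.png')
//   .then((text) => console.log(text));
```

Closing the browser in a finally block is a design choice of this sketch: it keeps a failed navigation from leaving a stray browser process behind.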
