How to Set Up a Browser Inside Docker for Puppeteer Using Node 20 Slim Image
Web scraping and automated testing are common tasks in modern web development. Puppeteer, a Node.js library, provides a high-level API for controlling Chrome or Chromium. Getting it to run in a Docker container can be tricky, though, because slim base images lack the shared libraries the browser needs. This post walks through setting up a browser inside Docker for Puppeteer using the slim variant of the Node 20 image. The complete Dockerfile comes first, followed by a step-by-step breakdown.
# Use the slim variant of the Node 20 image
FROM node:20-slim
WORKDIR /app
# Install dependencies for Puppeteer
# The slim image is Debian-based, so we use apt-get
RUN apt-get update && apt-get install -y \
wget \
curl \
git \
libx11-xcb1 \
libxcb1 \
libxcomposite1 \
libxcursor1 \
libxdamage1 \
libxext6 \
libxi6 \
libxtst6 \
libnss3 \
libcups2 \
libxss1 \
libxrandr2 \
libasound2 \
libpangocairo-1.0-0 \
libatk1.0-0 \
libatk-bridge2.0-0 \
libgtk-3-0 \
libgbm1 \
tzdata \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# Set timezone if needed
ENV TZ=UTC
RUN ln -fs /usr/share/zoneinfo/$TZ /etc/localtime && dpkg-reconfigure -f noninteractive tzdata
# Install gnupg and other basic utilities
RUN apt-get update && apt-get install -y wget gnupg2 apt-transport-https curl && apt-get clean
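# Add Google Chrome's public key to a dedicated keyring (apt-key is deprecated)
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg
# Add Google Chrome to the list of repositories, trusting only that keyring
RUN sh -c 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google.list'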
# Update apt and install Chrome along with other dependencies
RUN apt-get update && apt-get install -y \
git \
libx11-xcb1 \
libxcb1 \
libxcomposite1 \
libxcursor1 \
libxdamage1 \
libxext6 \
libxi6 \
libxtst6 \
libnss3 \
libcups2 \
libxss1 \
libxrandr2 \
libasound2 \
libpangocairo-1.0-0 \
libatk1.0-0 \
libatk-bridge2.0-0 \
libgtk-3-0 \
libgbm1 \
tzdata \
google-chrome-stable \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# Environment variables to help Puppeteer
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true \
PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome-stable \
CHROME_BIN=/usr/bin/google-chrome-stable
# Copy package.json and package-lock.json (if available)
COPY package*.json ./
# Install dependencies
RUN npm install
# Copy the rest of the application
COPY . .
# Build the application if necessary
# RUN npm run build # Uncomment and add your logic for building your nodejs server
# Start the application
# CMD ["npm", "run", "start:dev"] # Uncomment and add your logic for starting your nodejs server
Dockerfile Breakdown
- Base Image: We start with node:20-slim as our base image. This is a lightweight version of the Node image, ideal for a small footprint.
FROM node:20-slim
WORKDIR /app
- Install Dependencies for Puppeteer: The slim image is Debian-based, so we use apt-get to install the necessary packages. Apart from wget, curl, and git, these are shared libraries that Chrome needs at runtime; without them the browser fails to start.
RUN apt-get update && apt-get install -y \
wget \
curl \
git \
libx11-xcb1 \
libxcb1 \
libxcomposite1 \
libxcursor1 \
libxdamage1 \
libxext6 \
libxi6 \
libxtst6 \
libnss3 \
libcups2 \
libxss1 \
libxrandr2 \
libasound2 \
libpangocairo-1.0-0 \
libatk1.0-0 \
libatk-bridge2.0-0 \
libgtk-3-0 \
libgbm1 \
tzdata \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
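If one of these libraries is missing, Chrome typically exits at launch with an error about a shared object it cannot load. One way to find the gap is to run ldd against the browser binary from inside the finished container. The following is a minimal sketch, assuming ldd is available in the image; the helper names findMissingLibraries and checkBinary are hypothetical and not part of Puppeteer.

```javascript
// check-libs.js: list shared libraries that a binary cannot resolve.
const { execFileSync } = require('node:child_process');

// Extract library names from ldd output lines such as
// "\tlibnss3.so => not found".
function findMissingLibraries(lddOutput) {
  return lddOutput
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.endsWith('=> not found'))
    .map((line) => line.split('=>')[0].trim());
}

// Run ldd against the given binary and return the unresolved libraries.
// ldd only understands real executables, not wrapper scripts, so the
// caller has to pass the path of the actual browser binary.
function checkBinary(binaryPath) {
  const output = execFileSync('ldd', [binaryPath], { encoding: 'utf8' });
  return findMissingLibraries(output);
}

module.exports = { findMissingLibraries, checkBinary };
```

Calling checkBinary('/opt/google/chrome/chrome') inside the container should return an empty array when every dependency resolves. That path is an assumption about where the Chrome package places its real binary; adjust it if your install differs.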
- Timezone Configuration: Some applications depend on the container's timezone, for example when formatting dates, so we set it explicitly. The tzdata package provides the zone files that /etc/localtime links to.
ENV TZ=UTC
RUN ln -fs /usr/share/zoneinfo/$TZ /etc/localtime && dpkg-reconfigure -f noninteractive tzdata
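To confirm that the setting reaches your application, you can print what the Node.js process actually sees. This is a small sketch; timezoneReport is a hypothetical helper name, and it only reads values without changing anything.

```javascript
// Report the timezone the Node.js process is running with.
function timezoneReport(env = process.env) {
  return {
    // Value of the TZ variable, or null when it is unset.
    configured: env.TZ ?? null,
    // IANA zone name resolved by the JavaScript runtime.
    resolved: Intl.DateTimeFormat().resolvedOptions().timeZone,
    // Offset from UTC in minutes; 0 when the zone is UTC.
    offsetMinutes: new Date().getTimezoneOffset(),
  };
}

console.log(timezoneReport());
```

With the Dockerfile above, the configured value should be UTC and the offset 0 when this runs inside the container.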
- Install Basic Utilities: We install wget, gnupg2, and curl for downloading and verifying files.
RUN apt-get update && apt-get install -y wget gnupg2 apt-transport-https curl && apt-get clean
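- Setting Up Google Chrome: Puppeteer drives the system Chrome here rather than its own browser download, so we add Google Chrome's signing key and apt repository and install the stable release. The key goes into a dedicated keyring because apt-key is deprecated. The repository entry is limited to amd64, so the image must be built for that architecture. The package list repeats the libraries from the earlier step; apt skips anything that is already installed.
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg
RUN sh -c 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google.list'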
RUN apt-get update && apt-get install -y \
git \
libx11-xcb1 \
libxcb1 \
libxcomposite1 \
libxcursor1 \
libxdamage1 \
libxext6 \
libxi6 \
libxtst6 \
libnss3 \
libcups2 \
libxss1 \
libxrandr2 \
libasound2 \
libpangocairo-1.0-0 \
libatk1.0-0 \
libatk-bridge2.0-0 \
libgtk-3-0 \
libgbm1 \
tzdata \
google-chrome-stable \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
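After this step it can be useful to check which Chrome release ended up in the image, since the stable channel moves with every rebuild. A rough sketch, assuming the binary prints a line like "Google Chrome 120.0.6099.109" for --version; parseChromeVersion and installedChromeVersion are hypothetical helper names.

```javascript
// chrome-version.js: read the version of the installed Chrome.
const { execFileSync } = require('node:child_process');

// Pull the dotted version number out of the --version output.
// Returns null when no version-like token is present.
function parseChromeVersion(versionOutput) {
  const match = versionOutput.match(/(\d+(?:\.\d+)+)/);
  return match ? match[1] : null;
}

// Ask the browser binary for its version string and parse it.
function installedChromeVersion(binary = 'google-chrome-stable') {
  const output = execFileSync(binary, ['--version'], { encoding: 'utf8' });
  return parseChromeVersion(output);
}

module.exports = { parseChromeVersion, installedChromeVersion };
```

Logging installedChromeVersion() at startup makes it easier to tell whether a behavior change came from your code or from a new browser release.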
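- Puppeteer Environment Variables: We set environment variables so that Puppeteer skips its own browser download and launches the system Chrome at /usr/bin/google-chrome-stable instead. Recent Puppeteer releases document the skip setting as PUPPETEER_SKIP_DOWNLOAD, so check which name your installed version reads. CHROME_BIN is a convention that some other tools, such as Karma's Chrome launcher, use to locate the browser.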
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome-stable
ENV CHROME_BIN=/usr/bin/google-chrome-stable
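On the application side, these variables can be turned into launch options. The sketch below is one possible way to do it, not the only one; buildLaunchOptions and fetchTitle are hypothetical names. It assumes the container runs as root, which is the case for the Dockerfile above because it has no USER instruction, and Chrome refuses to start as root unless its sandbox is disabled.

```javascript
// launch.js: start the system Chrome configured in the Dockerfile.

// Build Puppeteer launch options from environment variables.
function buildLaunchOptions(env = process.env) {
  const options = {
    headless: true,
    // Assumption: the process runs as root, so the sandbox must be off.
    // Drop these flags if you switch the image to a non-root user.
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
  };
  // Only set executablePath when the variable is present, so Puppeteer
  // can fall back to its own default otherwise.
  if (env.PUPPETEER_EXECUTABLE_PATH) {
    options.executablePath = env.PUPPETEER_EXECUTABLE_PATH;
  }
  return options;
}

// Open a page and return its title. puppeteer is required lazily so the
// helper above can be used without the package being loaded.
async function fetchTitle(url) {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch(buildLaunchOptions());
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    return await page.title();
  } finally {
    // Always close the browser so no Chrome process is left behind.
    await browser.close();
  }
}

module.exports = { buildLaunchOptions, fetchTitle };
```

A quick smoke test inside the container is to call fetchTitle('https://example.com') and log the result. Disabling the sandbox weakens isolation, so it is best reserved for content you trust.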
- Node.js Dependencies: Copy the package.json and package-lock.json files and install dependencies.
COPY package*.json ./
RUN npm install
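- Application Setup: Copy the application source code and build it if necessary. In the full Dockerfile the build step is commented out; enable it only if your package.json defines a build script.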
COPY . .
RUN npm run build
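- Starting the Application: Finally, we set the command that starts the application. This line is also commented out in the full Dockerfile; replace start:dev with whichever npm script starts your server.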
CMD ["npm", "run", "start:dev"]