Streamlit project to test Selenium running on Streamlit Cloud

deepakskepit/Streamlit-Selenium

 
 

Streamlit Selenium Test

Streamlit project to test Selenium running in Streamlit Cloud runtime.

  • Local Windows 10 machine works
  • Local Docker container works
  • Streamlit Cloud runtime works (see the example app linked from the repository)

Issues 🐛

  • None

ToDo ☑️

  • improve example
  • try also undetected_chromedriver package
  • try also seleniumbase package

Problem 🤔

The suggestion for this repo came from a post on the Streamlit Community Forum.

https://discuss.streamlit.io/t/issue-with-selenium-on-a-streamlit-app/11563

Installing and using a Selenium-based web scraper in container-based environments is not straightforward. On a local computer it usually works much more smoothly, because a browser is already installed and can be controlled by the associated webdriver. In container-based environments, however, headless operation is mandatory because no UI is available.

This repository therefore provides a small example that gets Selenium working on:

  • Local Windows 10 machine
  • Local Docker container that mimics the Streamlit Cloud runtime
  • Streamlit Cloud runtime

Pitfalls 🚩

  • To use Selenium (even headless in a container), two components must always be installed on your machine:
    • A web browser and its associated webdriver.
  • The version of the headless web browser and its associated webdriver must match.
  • If you are using Selenium in a Docker container or on Streamlit Cloud, the --headless option is mandatory, because there is no graphical user interface available.
  • There are three web browser/webdriver combinations for Selenium:
    1. chrome & chromedriver
    2. chromium & chromedriver
    3. firefox & geckodriver
  • Unfortunately in the default Debian Bullseye apt package repositories, not all of these packages are available. If we want an installation from the default repositories, only chromium & chromedriver is left.
  • To make this repository cross-platform, the Windows 10 chromedriver must be stored in the root folder of this repository or added to the PATH. Be aware that the version of this chromedriver must match the version of your installed Chrome browser.
  • The chromedriver has many options that can be set. It may be necessary to tweak these options on different platforms to make headless operation work smoothly.
  • Deployment to Streamlit Cloud has sometimes failed in the past. Neither a concrete cause nor an informative error message could be identified. Currently, deployment seems to be stable.
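
The headless and option-tweaking pitfalls above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the code used in this repository: the helper names headless_chromium_args and make_driver are hypothetical, Selenium 4 is assumed, and the extra switches (--disable-gpu, --no-sandbox, --disable-dev-shm-usage) are a commonly used starting point for containers that may need adjusting per platform.

```python
import shutil


def headless_chromium_args(in_container: bool = True) -> list[str]:
    """Return command-line switches for running Chrome/Chromium without a display."""
    args = ["--headless", "--disable-gpu"]
    if in_container:
        # Assumption: these two switches are often needed when the browser
        # runs as root in a container with a small /dev/shm.
        args += ["--no-sandbox", "--disable-dev-shm-usage"]
    return args


def make_driver(in_container: bool = True):
    """Create a headless Chrome/Chromium WebDriver (assumes Selenium 4)."""
    # Imported lazily so the helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    for arg in headless_chromium_args(in_container):
        options.add_argument(arg)

    # Prefer a chromedriver found on PATH; otherwise let Selenium locate one.
    driver_path = shutil.which("chromedriver")
    service = Service(executable_path=driver_path) if driver_path else Service()
    return webdriver.Chrome(service=service, options=options)
```

On a local Windows machine, make_driver(in_container=False) would skip the container-specific switches while still running headless.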

Development Setup 🛠️

In the Streamlit Cloud runtime, neither chrome, chromedriver nor geckodriver are available in the default apt package sources.

The Streamlit Cloud runtime seems to be very similar to the official docker image python:3.XX-slim on Docker Hub, which is based on Debian Bookworm.

In this repository a Dockerfile is provided that mimics the Streamlit Cloud runtime. It can be used for local testing.

A packages.txt is provided with the following minimal content:

chromium
chromium-driver

A requirements.txt is provided with the following minimal content:

streamlit
selenium

Docker 🐋

Docker Hub

Docker Images that come close to the actual Streamlit Cloud runtime:

Docker Container local

The provided Dockerfile tries to mimic the Streamlit Cloud runtime.

Build local custom Docker Image from Dockerfile

docker build --progress=plain --tag selenium:latest .

Run custom Docker Container

docker run -ti -p 8501:8501 --rm selenium:latest
docker run -ti -p 8501:8501 --rm selenium:latest /bin/bash
docker run -ti -p 8501:8501 -v $(pwd):/app --rm selenium:latest  # linux
docker run -ti -p 8501:8501 -v ${pwd}:/app --rm selenium:latest  # powershell
docker run -ti -p 8501:8501 -v %cd%:/app --rm selenium:latest  # cmd.exe

Selenium

https://selenium-python.readthedocs.io/getting-started.html

pip install selenium
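
Once Selenium is installed, a first smoke test could look like the sketch below. It is only an illustration: the helper names managed, fetch_title and demo are hypothetical, the URL https://example.com is a placeholder, and Selenium 4 with a matching browser and driver is assumed. The context manager makes sure the browser process is always shut down, which matters in long-running apps.

```python
from contextlib import contextmanager


@contextmanager
def managed(driver):
    """Yield the driver and always quit it, so no browser process is left behind."""
    try:
        yield driver
    finally:
        driver.quit()


def fetch_title(driver, url: str) -> str:
    """Load a page and return the text of its <title> element."""
    driver.get(url)
    return driver.title


def demo(url: str = "https://example.com") -> str:
    """Fetch a page title with headless Chrome (placeholder URL, assumes Selenium 4)."""
    # Imported lazily so the helpers above can be reused and tested without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    with managed(webdriver.Chrome(options=options)) as driver:
        return fetch_title(driver, url)
```

Calling demo() needs a working browser and webdriver; the two helpers accept any object with get, title and quit, so they can be exercised with a stand-in driver first.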

Chromium

Required packages to install

apt install chromium
apt install chromium-driver
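
Because the browser and driver versions must match, it can help to compare them after installation. The following is a rough sketch, assuming both binaries are on the PATH under the names chromium and chromedriver and that each prints a four-part version number when called with --version. The function names are hypothetical.

```python
import re
import shutil
import subprocess


def major_version(version_output: str):
    """Extract the major version from text such as 'Chromium 118.0.5993.70 ...'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    return int(match.group(1)) if match else None


def installed_major(binary: str):
    """Run '<binary> --version' and return its major version, or None if unavailable."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return major_version(result.stdout)


def versions_match(browser: str = "chromium", driver: str = "chromedriver") -> bool:
    """Return True only if both binaries are found and share a major version."""
    browser_major = installed_major(browser)
    driver_major = installed_major(driver)
    return browser_major is not None and browser_major == driver_major
```

Comparing only the major version is a simplification; it catches the most common mismatch without being a strict compatibility check.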

Chromium Options

https://peter.sh/experiments/chromium-command-line-switches/


undetected_chromedriver

Another option to try


Status ✔️

Last changed: 2023-10-24
