
In today’s digital era, finding the best deals online can save you both time and money. However, manually checking prices across multiple e-commerce websites is tedious. What if you could automate the process?
In this article, we’ll create a web scraper using Selenium in Python to extract and compare prices from popular e-commerce sites like Amazon and eBay. Whether you’re a beginner or an experienced programmer, this tutorial will help you create a powerful tool to simplify your shopping experience.
Why Use Selenium for Web Scraping?
Before we start writing our code, let’s understand why Selenium is a strong choice for this project compared to other tools like BeautifulSoup, requests, or Scrapy.
Dynamic Content Handling
Many modern websites, including Amazon and eBay, use JavaScript to load content dynamically. Tools like BeautifulSoup and requests can only fetch static HTML content, making them ineffective for such sites.
Selenium, on the other hand, can interact with JavaScript-rendered content, making it ideal for scraping dynamic websites.
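To make this concrete, here is a toy illustration (not real Amazon markup) of why static fetching falls short: the product list below only exists after a browser runs the page’s JavaScript, so a plain HTTP fetch sees an empty container.

```python
import re

# A toy page whose product list is filled in by JavaScript at runtime.
html = """
<div id="products"></div>
<script>
  document.getElementById('products').innerHTML = '<span>Wireless Headphones</span>';
</script>
"""

# A static fetch (requests + an HTML parser) sees only the empty container:
static_content = re.search(r'<div id="products">(.*?)</div>', html).group(1)
print(repr(static_content))  # empty string - the data simply isn't in the raw HTML
```

A Selenium-driven browser, by contrast, executes the script first, so the rendered DOM contains the product name.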
Browser Automation
Selenium mimics real user behavior by controlling a web browser. This reduces the chances of being detected as a bot compared to sending direct HTTP requests.
Flexibility
Selenium allows you to interact with web pages just like a human would—clicking buttons, filling out forms, and scrolling. This makes it useful for complex scraping tasks.
Debugging
With Selenium, you can visually see what the browser is doing, making it easier to debug your scraper.
Setting Up Selenium and ChromeDriver
To use Selenium, you need two things:
Selenium Python Package: Install it using the following command:
pip install selenium
ChromeDriver: A tool that allows Selenium to control Google Chrome.
For more details, see the official Selenium documentation.
Finding and Installing ChromeDriver
Check Your Chrome Version
Open Google Chrome and go to Settings > About Chrome. Note the version number (e.g., 133.0.6943.98).
Download ChromeDriver
Visit the ChromeDriver download page (for Chrome 115 and later, builds are listed on the Chrome for Testing availability dashboard) and download the version that matches your Chrome browser.
Install ChromeDriver
1. Extract the downloaded file.
2. Move the chromedriver executable to a directory in your system’s PATH (e.g., /usr/local/bin on Linux or C:\Windows on Windows).
3. Make it executable (Linux/macOS):
sudo chmod +x /usr/local/bin/chromedriver
4. Verify the Installation
Run the following command to ensure ChromeDriver is installed correctly:
chromedriver --version
Building the Web Scraper
Now that everything is set up, let’s build the scraper step by step.
Step 1: Import Required Libraries
We’ll use Selenium for web scraping and the pandas library for organizing the data.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import pandas as pd
import time
import random
Step 2: Configure Selenium
Set up Selenium to run in headless mode (no browser GUI) and use a random User-Agent to avoid bot detection.
def setup_driver():
    # Pool of desktop User-Agent strings to randomize between runs
    user_agents = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
    ]
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("--disable-blink-features=AutomationControlled")
    chrome_options.add_argument(f"user-agent={random.choice(user_agents)}")
    chrome_options.add_argument("--headless")  # Run without a browser GUI
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    # Path to ChromeDriver
    service = Service("/usr/local/bin/chromedriver")
    driver = webdriver.Chrome(service=service, options=chrome_options)
    return driver
Step 3: Scrape Amazon Data
This function navigates to Amazon’s search results page, extracts product names and prices, and stores them in a Python list.
def scrape_amazon(driver, product_name):
    url = f"https://www.amazon.com/s?k={product_name.replace(' ', '+')}"
    driver.get(url)
    time.sleep(random.uniform(2, 5))  # Random delay to mimic human browsing
    products = []
    items = driver.find_elements(By.CSS_SELECTOR, "div.s-result-item")
    for item in items:
        try:
            name = item.find_element(By.CSS_SELECTOR, "span.a-text-normal").text
            price = item.find_element(By.CSS_SELECTOR, "span.a-price span.a-offscreen").get_attribute("textContent")
            products.append({"name": name, "price": price, "source": "Amazon"})
        except Exception:
            continue  # Skip results without a name or price
    return products
Step 4: Scrape eBay Data
Similarly, this function scrapes eBay’s search results page.
def scrape_ebay(driver, product_name):
    url = f"https://www.ebay.com/sch/i.html?_nkw={product_name.replace(' ', '+')}"
    driver.get(url)
    time.sleep(random.uniform(2, 5))  # Random delay to mimic human browsing
    products = []
    items = driver.find_elements(By.CSS_SELECTOR, "div.s-item__info")
    for item in items:
        try:
            name = item.find_element(By.CSS_SELECTOR, "h3.s-item__title").text
            price = item.find_element(By.CSS_SELECTOR, "span.s-item__price").text
            products.append({"name": name, "price": price, "source": "eBay"})
        except Exception:
            continue  # Skip results without a name or price
    return products
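One caveat worth knowing: `product_name.replace(' ', '+')` only handles spaces. If a query contains characters like `&` or `#`, the URL will be malformed. The standard library’s `urllib.parse.quote_plus` encodes everything safely; a small helper (the `build_search_url` name is my own, not from the tutorial) could look like this:

```python
from urllib.parse import quote_plus

def build_search_url(base, param, query):
    """Build a search URL with the query safely percent-encoded."""
    return f"{base}?{param}={quote_plus(query)}"

# Spaces become '+', and special characters are percent-encoded:
print(build_search_url("https://www.ebay.com/sch/i.html", "_nkw", "wireless headphones"))
print(build_search_url("https://www.amazon.com/s", "k", "cables & adapters"))
```

You could swap this helper into both scrape functions in place of the `replace` call.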
Step 5: Compare Prices
Now combine the results from Amazon and eBay, clean the data, and sort by price.
def compare_prices(product_name):
    driver = setup_driver()
    amazon_products = scrape_amazon(driver, product_name)
    ebay_products = scrape_ebay(driver, product_name)
    driver.quit()  # Close the browser
    # Combine results into a single DataFrame
    all_products = amazon_products + ebay_products
    df = pd.DataFrame(all_products)
    # Strip currency symbols and commas, then sort by price
    df["price"] = df["price"].replace(r"[\$,]", "", regex=True).astype(float)
    df = df.sort_values(by="price")
    return df
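Note that the simple regex above assumes every price is a single dollar amount. eBay listings often show ranges like “$45.99 to $59.99”, which would break `astype(float)`. A more defensive parser (a sketch; `parse_price` is a hypothetical helper, not part of the original script) could extract the first amount and skip anything unparseable:

```python
import re

def parse_price(raw):
    """Return the first dollar amount found in a raw price string, or None.

    Handles strings like "$49.99", "$1,299.00", and eBay ranges such as
    "$45.99 to $59.99" (the lower bound is kept).
    """
    match = re.search(r"\$?([\d,]+\.\d{2})", raw)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))

print(parse_price("$45.99 to $59.99"))  # keeps the lower bound
print(parse_price("Free shipping"))     # unparseable -> None
```

In `compare_prices`, you could apply this with `df["price"].map(parse_price)` and then drop rows where the result is `None`.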
Step 6: Run the Program
Finally, prompt the user for a product name and display the results.
if __name__ == "__main__":
    product_name = input("Enter the product name to search: ")
    result = compare_prices(product_name)
    print(result)
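As a small optional extension (not part of the script above), you could also persist each comparison to CSV so you can track prices over time. A minimal sketch, assuming the DataFrame returned by compare_prices; the `save_results` name is my own:

```python
import pandas as pd

def save_results(df, path="results.csv"):
    """Persist the sorted comparison table to CSV (without the index column)."""
    df.to_csv(path, index=False)
    return path
```

Calling `save_results(result)` after `compare_prices` keeps a record of each search.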
Example Output
For a search query like “wireless headphones”, the output might look like this:
                       name  price  source
2  Sony Wireless Headphones  45.99    eBay
1   JBL Wireless Headphones  49.99  Amazon
0  Bose Wireless Headphones  99.99  Amazon
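Since the same product can appear on both sites, you might also want to collapse the table to the cheapest listing per product. A sketch using pandas, with hypothetical sample rows standing in for real scrape results:

```python
import pandas as pd

rows = [
    {"name": "Sony Wireless Headphones", "price": 45.99, "source": "eBay"},
    {"name": "Sony Wireless Headphones", "price": 52.00, "source": "Amazon"},
    {"name": "JBL Wireless Headphones", "price": 49.99, "source": "Amazon"},
]
df = pd.DataFrame(rows)

# Sort cheapest-first, then keep only the first (cheapest) row per product name
cheapest = df.sort_values("price").drop_duplicates("name").reset_index(drop=True)
print(cheapest)
```

This relies on product names matching exactly across sites; in practice you may need fuzzier matching.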
Why This Project Matters
This project is more than just a fun coding exercise—it’s a practical tool that can save you time and money while teaching you valuable skills. Here’s why it matters:
- Comparing prices manually across multiple e-commerce platforms is time-consuming. This web scraper automates the process and gives you instant access to the best deals.
- Web scraping is a highly sought-after skill in fields like data science, market research, and automation. By building this project, you’ll gain hands-on experience with Selenium, Python, and data manipulation.
- Many modern websites use JavaScript to load content dynamically. This project teaches you how to handle such sites, which is a crucial skill for scraping real-world data.
- From handling anti-scraping mechanisms to cleaning and organizing data, this project challenges you to think critically and solve problems creatively.
- Once you’ve built the basic scraper, you can extend it to include more websites, add features like email alerts for price drops, or even integrate it into a larger application.
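For instance, the price-drop alert idea boils down to one small piece of logic: compare the newly scraped price against the last stored one and fire an alert when the drop crosses a threshold. A sketch (the `check_price_drop` name and the 5% default are my own choices):

```python
def check_price_drop(previous, current, threshold_pct=5.0):
    """Return True if the price fell by at least threshold_pct percent."""
    if previous <= 0:
        return False  # Guard against bad stored data
    drop = (previous - current) / previous * 100
    return drop >= threshold_pct
```

You could run the scraper on a schedule, store each day’s prices, and email yourself whenever this returns True.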
Whether you’re a beginner looking to learn Python or an experienced programmer exploring web scraping, this project offers something for everyone.
Summary
In this article, we built a web scraper using Selenium in Python to compare prices from e-commerce websites like Amazon and eBay. We discussed why Selenium is important for scraping dynamic websites that use JavaScript, unlike tools like BeautifulSoup or requests.
The tutorial walked through setting up Selenium and ChromeDriver, writing Python code to scrape and clean data, and implementing anti-scraping measures like random delays and randomized User-Agent headers.
This project is not only a practical tool for finding the best deals but also a great way to learn web scraping, automation, and data analysis.
If you found this tutorial helpful, share it with fellow data enthusiasts, and let’s build a community of skilled scrapers. Until next time, happy scraping!