Build a Web Scraper to Compare E-Commerce Prices Using Python

In today’s digital era, finding the best deals online can save you both time and money. However, manually checking prices across multiple e-commerce websites is tedious. What if you could automate the process?

In this article, we’ll create a web scraper using Selenium in Python to extract and compare prices from popular e-commerce sites like Amazon and eBay. Whether you’re a beginner or an experienced programmer, this tutorial will help you create a powerful tool to simplify your shopping experience.

Read also: Web Scraping for Job Listings: Automate Job Search with Python

Why Use Selenium for Web Scraping?

Before we start writing our code, let’s understand why Selenium is a better fit for this project than tools like BeautifulSoup, requests, or Scrapy.

Dynamic Content Handling

Many modern websites, including Amazon and eBay, use JavaScript to load content dynamically. requests only fetches the initial static HTML, and BeautifulSoup only parses what it is given, so JavaScript-rendered content never shows up with those tools alone.

Selenium, on the other hand, can interact with JavaScript-rendered content, making it ideal for scraping dynamic websites.

Browser Automation

Selenium mimics real user behavior by controlling a web browser. This reduces the chances of being detected as a bot compared to sending direct HTTP requests.

Flexibility

Selenium allows you to interact with web pages just like a human would—clicking buttons, filling out forms, and scrolling. This makes it useful for complex scraping tasks.
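
For example, here is a minimal interaction sketch (the URL and the form field name "q" are placeholders for illustration, not part of this project):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get("https://www.example.com")

# Type into a search field and submit, just as a user would
search_box = driver.find_element(By.NAME, "q")  # hypothetical field name
search_box.send_keys("wireless headphones", Keys.RETURN)

# Scroll to the bottom of the page to trigger lazy-loaded content
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

driver.quit()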

Debugging

With Selenium, you can visually see what the browser is doing, making it easier to debug your scraper.
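
For instance, running without the headless flag opens a visible browser window, and you can capture screenshots at key points to see exactly what the scraper sees:

from selenium import webdriver

driver = webdriver.Chrome()  # no --headless option, so the window is visible
driver.get("https://www.example.com")
driver.save_screenshot("debug.png")  # snapshot of the current page
driver.quit()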

Setting Up Selenium and ChromeDriver

To use Selenium, you need two things:

Selenium Python Package: Install it using the following command:

pip install selenium

ChromeDriver: A tool that allows Selenium to control Google Chrome.

The official Selenium documentation is available at https://www.selenium.dev/documentation/.
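
Note: if you’re on Selenium 4.6 or newer, the bundled Selenium Manager can download a matching driver for you automatically, so the manual ChromeDriver installation below is optional:

from selenium import webdriver

# With Selenium 4.6+, Selenium Manager fetches a matching ChromeDriver
# on first use; no explicit driver path is required.
driver = webdriver.Chrome()
driver.quit()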

Finding and Installing ChromeDriver

Check Your Chrome Version

Open Google Chrome and go to Settings > About Chrome. Note the version number (e.g., 133.0.6943.98).

Download ChromeDriver

Visit the ChromeDriver download page and download the version that matches your Chrome browser. For Chrome 115 and newer, drivers are published on the Chrome for Testing availability dashboard (https://googlechromelabs.github.io/chrome-for-testing/).

Install ChromeDriver

1. Extract the downloaded file.

2. Move the chromedriver executable to a directory in your system’s PATH (e.g., /usr/local/bin on Linux or C:\Windows on Windows).

3. Make it executable (Linux/macOS):

sudo chmod +x /usr/local/bin/chromedriver

4. Verify the Installation

Run the following command to ensure ChromeDriver is installed correctly:

chromedriver --version
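
You can also verify the whole chain from Python with a quick smoke test:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# If this prints a page title, Selenium and ChromeDriver are wired up
# correctly. Adjust the path if your chromedriver lives elsewhere.
service = Service("/usr/local/bin/chromedriver")
driver = webdriver.Chrome(service=service)
driver.get("https://www.python.org")
print(driver.title)
driver.quit()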

Building the Web Scraper

Now that everything is set up, let’s build the scraper step by step.

Step 1: Import Required Libraries

We’ll use Selenium for browser automation, pandas for organizing the data, and time and random for polite, randomized delays.

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
import pandas as pd
import time
import random

Step 2: Configure Selenium

Set up Selenium to run in headless mode (no browser GUI) and pick a random User-Agent from a small pool to reduce the chance of bot detection.

# A small pool of User-Agent strings to rotate between runs
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36",
]

def setup_driver():
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("--disable-blink-features=AutomationControlled")
    chrome_options.add_argument(f"user-agent={random.choice(user_agents)}")
    chrome_options.add_argument("--headless")  # Run in headless mode (no GUI)
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")

    # Path to ChromeDriver; adjust if yours lives elsewhere
    service = Service("/usr/local/bin/chromedriver")
    driver = webdriver.Chrome(service=service, options=chrome_options)
    return driver

Step 3: Scrape Amazon Data

This function navigates to Amazon’s search results page, extracts product names and prices, and stores them in a Python list.

def scrape_amazon(driver, product_name):
    url = f"https://www.amazon.com/s?k={product_name.replace(' ', '+')}"
    driver.get(url)
    time.sleep(random.uniform(2, 5))  # Random delay to look less bot-like

    products = []
    # These selectors reflect Amazon's markup at the time of writing and
    # may need updating if the site changes
    items = driver.find_elements(By.CSS_SELECTOR, "div.s-result-item")
    for item in items:
        try:
            name = item.find_element(By.CSS_SELECTOR, "span.a-text-normal").text
            price = item.find_element(By.CSS_SELECTOR, "span.a-price span.a-offscreen").get_attribute("textContent")
            products.append({"name": name, "price": price, "source": "Amazon"})
        except NoSuchElementException:
            # Skip entries without a name or price (ads, separators, etc.)
            continue
    return products
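
A fixed time.sleep works, but explicit waits are usually more reliable because they pause only until the content actually appears. Here is a sketch that could replace the sleep in scrape_amazon, reusing the same result selector (which, like all selectors here, may change as Amazon updates its markup):

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def wait_for_results(driver, timeout=10):
    # Block until at least one search result is present, or raise
    # TimeoutException after `timeout` seconds
    return WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.s-result-item"))
    )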

Step 4: Scrape eBay Data

Similarly, this function scrapes eBay’s search results page.

def scrape_ebay(driver, product_name):
    url = f"https://www.ebay.com/sch/i.html?_nkw={product_name.replace(' ', '+')}"
    driver.get(url)
    time.sleep(random.uniform(2, 5))  # Random delay

    products = []
    items = driver.find_elements(By.CSS_SELECTOR, "div.s-item__info")
    for item in items:
        try:
            name = item.find_element(By.CSS_SELECTOR, "h3.s-item__title").text
            price = item.find_element(By.CSS_SELECTOR, "span.s-item__price").text
            products.append({"name": name, "price": price, "source": "eBay"})
        except NoSuchElementException:
            # Skip listings missing a title or price
            continue
    return products

Step 5: Compare Prices

Now combine the results from Amazon and eBay, clean the data, and sort by price.

def compare_prices(product_name):
    driver = setup_driver()
    amazon_products = scrape_amazon(driver, product_name)
    ebay_products = scrape_ebay(driver, product_name)
    driver.quit()  # Close the browser

    # Combine results into a single DataFrame
    all_products = amazon_products + ebay_products
    df = pd.DataFrame(all_products)

    # Strip currency symbols and thousands separators, then convert to
    # numbers; entries that still don't parse (e.g. eBay price ranges
    # like "$45.99 to $59.99") are dropped
    df["price"] = pd.to_numeric(df["price"].replace(r"[\$,]", "", regex=True),
                                errors="coerce")
    df = df.dropna(subset=["price"]).sort_values(by="price")

    return df

Step 6: Run the Program

Finally, prompt the user for a product name and display the results.

if __name__ == "__main__":
    product_name = input("Enter the product name to search: ")
    result = compare_prices(product_name)
    print(result)

Example Output

For a search query like “wireless headphones”, the output might look like this:

                       name  price  source
2  Sony Wireless Headphones  45.99    eBay
1   JBL Wireless Headphones  49.99  Amazon
0  Bose Wireless Headphones  99.99  Amazon

Why This Project Matters

This project is more than just a fun coding exercise—it’s a practical tool that can save you time and money while teaching you valuable skills. Here’s why it matters:

  • Comparing prices manually across multiple e-commerce platforms is time-consuming. This web scraper automates the process and gives you instant access to the best deals.
  • Web scraping is a highly sought-after skill in fields like data science, market research, and automation. By building this project, you’ll gain hands-on experience with Selenium, Python, and data manipulation.
  • Many modern websites use JavaScript to load content dynamically. This project teaches you how to handle such sites, which is a crucial skill for scraping real-world data.
  • From handling anti-scraping mechanisms to cleaning and organizing data, this project challenges you to think critically and solve problems creatively.
  • Once you’ve built the basic scraper, you can extend it to include more websites, add features like email alerts for price drops, or even integrate it into a larger application (see the sketch after this list).
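
As a taste of that last point, here is a minimal price-drop sketch built on the compare_prices function from above (the CSV file name is an arbitrary choice):

import os
import pandas as pd

def check_price_drops(product_name, history_file="price_history.csv"):
    # Compare today's prices against the last saved run and report drops
    current = compare_prices(product_name)

    if os.path.exists(history_file):
        previous = pd.read_csv(history_file)
        merged = current.merge(previous, on=["name", "source"],
                               suffixes=("_now", "_before"))
        drops = merged[merged["price_now"] < merged["price_before"]]
        if not drops.empty:
            print("Price drops found:")
            print(drops[["name", "source", "price_before", "price_now"]])

    current.to_csv(history_file, index=False)  # save for the next run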

Whether you’re a beginner just looking to learn Python or an experienced programmer exploring web scraping, this project offers something for everyone.

Summary

In this article, we built a web scraper using Selenium in Python to compare prices from e-commerce websites like Amazon and eBay. We discussed why Selenium is important for scraping dynamic websites that use JavaScript, unlike tools like BeautifulSoup or requests.

The tutorial walked through setting up Selenium and ChromeDriver, writing Python code to scrape and clean data, and implementing anti-scraping measures like random delays and rotating user-agent headers.

This project is not only a practical tool for finding the best deals but also a great way to learn web scraping, automation, and data analysis.

If you found this tutorial helpful, share it with fellow data enthusiasts, and let’s build a community of skilled scrapers. Until next time!

Subhankar Rakshit

Hey there! I’m Subhankar Rakshit, the brains behind PySeek. I’m a Post Graduate in Computer Science. PySeek is where I channel my love for Python programming and share it with the world through engaging and informative blogs.