Learn Multi-Threading in Python: Why & Where to Use

Introduction

In the realm of programming, performance is a key factor that often determines the success or failure of an application. One way to enhance performance is through parallelism, the ability to execute multiple tasks simultaneously.

Python, a versatile and popular programming language, offers multi-threading as a powerful technique to run tasks concurrently. In this article, we will explore the concept of multi-threading in Python, its benefits, and how to effectively utilize it in your programs.

Process and Threads

Before diving into multi-threading, it's important to know what processes and threads are. Both are essential operating-system concepts related to concurrent execution. Here's an explanation of each:

Processes

A process can be defined as an instance of a running program. It represents a complete execution environment, including the program code, memory, and system resources. Each process operates independently and is isolated from other processes, meaning they cannot directly access each other's memory. Processes are managed by the operating system and have their own process ID (PID).

Key characteristics of processes include:

  1. Memory Isolation: Each process has its own memory space, ensuring that one process cannot directly access or modify the memory of another process. Communication between processes typically requires inter-process communication (IPC) mechanisms.
  2. Resource Allocation: Processes have their own allocated system resources, such as file descriptors, network connections, and open files. These resources are not shared with other processes unless explicitly shared through mechanisms like file sharing or IPC.
  3. Independent Execution: Processes can execute independently of each other, allowing for parallelism and concurrent execution. Each process has its own program counter, stack, and set of registers.
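
To make the PID and isolation points concrete, here is a minimal sketch that launches a second Python interpreter as a child process and compares process IDs. It assumes `sys.executable` points at a working interpreter; the helper name `child_pid` is made up for this illustration.

```python
import os
import subprocess
import sys

def child_pid():
    """Start a separate Python process and return the PID it reports."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())

parent = os.getpid()
child = child_pid()

# The child is a different process, so the operating system gives it its own PID.
print(f"parent PID: {parent}, child PID: {child}")
```

The child can only report back through its standard output here, which is itself a simple form of inter-process communication.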

Threads

A thread, also known as a lightweight process, can be considered as a smaller unit of execution within a process. Threads share the same memory space and system resources as their parent process, enabling them to access and modify the same memory locations. Multiple threads within a process can work concurrently, performing different tasks simultaneously.

Key characteristics of threads include:

  1. Shared Memory: Threads within a process share the same memory space, meaning they can directly read from and write to the same variables and data structures. This allows for efficient communication and data sharing between threads within the same process.
  2. Lightweight: Threads are lightweight compared to processes, as they utilize the resources allocated to their parent process. Creating a thread requires less overhead compared to creating a new process, making thread creation and context switching faster.
  3. Scheduling: Threads are scheduled for execution by the operating system's thread scheduler. Depending on the scheduling algorithm and system resources, threads may be executed in a pre-emptive or cooperative manner.
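
The shared-memory point can be sketched in a few lines: several threads append to the same list object and all report the same process ID. The names `worker`, `shared`, and `pids` are illustrative only.

```python
import os
import threading

def worker(label, shared, pids):
    # Every thread receives a reference to the same list and set objects;
    # nothing is copied and no IPC mechanism is involved.
    shared.append(label)
    pids.add(os.getpid())

shared = []
pids = set()

threads = [
    threading.Thread(target=worker, args=(f"thread-{i}", shared, pids))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared))  # ['thread-0', 'thread-1', 'thread-2']
print(len(pids))       # 1, because all threads live in the same process
```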

Threads vs Processes

The main differences between processes and threads can be summarized as follows:

  1. Memory: Processes have separate memory spaces, while threads share the same memory space as their parent process.
  2. Resource Management: Processes have their own allocated resources, while threads share the resources of their parent process.
  3. Creation and Context Switching: Creating a new process is more resource-intensive than creating a new thread. Context switching between threads within a process is faster compared to context switching between processes.
  4. Communication: Inter-process communication (IPC) mechanisms are required for communication between processes, whereas threads can communicate through shared memory directly.
  5. Concurrency: Threads within a process can work concurrently, allowing for parallel execution and increased efficiency. Processes, on the other hand, are generally more isolated and independent from each other.

Both processes and threads have their own use cases and advantages, and the choice between them depends on the specific requirements of a given application or scenario.

Understanding Multi-Threading

At its core, multi-threading is a form of concurrent programming where multiple threads of execution coexist within a single process. A thread is a lightweight unit of execution that operates independently, allowing different parts of a program to run concurrently.

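By utilizing multi-threading, you can divide a task into smaller subtasks that run concurrently. In standard CPython builds, the global interpreter lock (GIL) allows only one thread to execute Python bytecode at a time, so the significant performance improvements come mainly from I/O-bound work (network calls, disk access, waiting on user input) rather than from computationally intensive pure-Python code.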

Benefits of Multi-threading

  • Concurrent Execution: Multi-threading lets several tasks make progress at the same time. Because of CPython's global interpreter lock, pure-Python code does not run on several cores at once, so the speed-up is most visible when threads spend their time waiting on I/O or inside libraries that release the lock. Computationally intensive work is usually better served by the `multiprocessing` module.
  • Improved Responsiveness: By using threads, you can ensure that long-running tasks don't block the main execution thread, thus preventing the application from becoming unresponsive. For instance, while waiting for a response from a remote server, the main thread can continue executing other operations concurrently.
  • Resource Sharing: Threads can easily share data and resources within the same process, simplifying communication and coordination. This allows for efficient inter-thread collaboration and synchronization.
  • Simplified Design: Multi-threading can simplify the design of certain types of applications, such as those involving concurrent I/O operations. Instead of dealing with complex event-driven programming or asynchronous approaches, threads can handle multiple I/O tasks simultaneously, leading to cleaner and more maintainable code.
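
As a rough illustration of the I/O case, the sketch below uses `time.sleep` as a stand-in for a blocking network or disk call and compares running four such calls one after another with running them in four threads. The function name `fake_io` and the chosen delays are assumptions for the demo, and exact timings will vary by machine.

```python
import threading
import time

def fake_io(delay):
    # time.sleep stands in for a blocking network or disk call.
    time.sleep(delay)

DELAY = 0.2
N_TASKS = 4

# Sequential: each call waits for the previous one to finish.
start = time.perf_counter()
for _ in range(N_TASKS):
    fake_io(DELAY)
sequential = time.perf_counter() - start

# Threaded: the waits overlap, so the total is close to a single delay.
start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(DELAY,)) for _ in range(N_TASKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```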

Python's Threading Module

Python's built-in threading module provides an intuitive interface for creating and managing threads. Let's explore some programming examples to demonstrate the practical use of multi-threading in Python.

Example 1: Thread Creation and Execution

import threading

def task():
    print("Executing task...")

# Create a thread
thread = threading.Thread(target=task)

# Start the thread
thread.start()

# Wait for the thread to finish
thread.join()

print("Thread execution complete.")

In this example, a new thread is created using the `Thread` class from the `threading` module. The `target` argument specifies the function to be executed by the thread.

The thread is started with the `start()` method, and the main thread waits for it to finish using `join()`. Finally, a message is printed to indicate the completion of thread execution.
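
A thread's target usually needs input and produces output. The following sketch, with the hypothetical function `square`, passes arguments through `args` and collects results in a shared dictionary, since `start()` and `join()` do not hand back the target's return value.

```python
import threading

def square(n, results):
    # Store the result in a shared dict; the Thread object itself
    # does not expose the function's return value.
    results[n] = n * n

results = {}
threads = [
    threading.Thread(target=square, args=(n, results), name=f"worker-{n}")
    for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

Each thread writes to a different key, so no lock is needed in this particular case.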

Example 2: Thread Synchronization

import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    with lock:
        for _ in range(100000):
            counter += 1

def decrement():
    global counter
    with lock:
        for _ in range(100000):
            counter -= 1

# Create threads
thread1 = threading.Thread(target=increment)
thread2 = threading.Thread(target=decrement)

# Start threads
thread1.start()
thread2.start()

# Wait for threads to finish
thread1.join()
thread2.join()

print("Counter value:", counter)

This example demonstrates thread synchronization using a lock to ensure that the shared `counter` variable is accessed atomically. Without synchronization, simultaneous access to `counter` could lead to race conditions and unpredictable results.

The `with lock` statement provides mutual exclusion, allowing only one thread to execute the critical section at a time.
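
Note that Example 2 holds the lock around each entire loop, so the two threads effectively run one after the other. A finer-grained variant, sketched here with a hypothetical `Counter` class, takes the lock only around each individual update so the threads can interleave while the final value stays correct.

```python
import threading

class Counter:
    """A counter whose updates are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add(self, amount):
        # Hold the lock only for the read-modify-write step.
        with self._lock:
            self.value += amount

def repeat_add(counter, amount, times):
    for _ in range(times):
        counter.add(amount)

counter = Counter()
threads = [
    threading.Thread(target=repeat_add, args=(counter, 1, 50_000)),
    threading.Thread(target=repeat_add, args=(counter, -1, 50_000)),
    threading.Thread(target=repeat_add, args=(counter, 1, 50_000)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Counter value:", counter.value)  # Counter value: 50000
```

Acquiring the lock on every update costs more than acquiring it once, so which granularity is better depends on how much other work each thread does between updates.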

Example 3: Thread Pooling

import concurrent.futures as f

def task(name):
    print(f"Executing task {name}...")

# Create a thread pool with maximum 3 worker threads
with f.ThreadPoolExecutor(max_workers=3) as executor:
    # Submit tasks to the thread pool
    executor.submit(task, "Task 1")
    executor.submit(task, "Task 2")
    executor.submit(task, "Task 3")

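    # Leaving the with block waits for all submitted tasks to complete
    # (it calls executor.shutdown(wait=True) automatically)

print("All tasks completed.")

In this example, the `ThreadPoolExecutor` class from the `concurrent.futures` module is used to create a thread pool with a maximum of 3 worker threads.

Tasks are submitted to the pool using the `submit()` method. Because the executor is used as a context manager, leaving the `with` block calls `shutdown(wait=True)` automatically, so all tasks are completed before the final message is printed. An explicit `shutdown()` call inside the block is not needed.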
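
`submit()` also returns a `Future` object, which is how a pool task hands its result back. The sketch below, using an illustrative `word_length` function, collects results with `as_completed()` and shows `map()` as a shorter alternative that keeps input order.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def word_length(word):
    return word, len(word)

words = ["thread", "pool", "executor"]

with ThreadPoolExecutor(max_workers=3) as executor:
    # submit() returns a Future; result() blocks until that task has finished.
    futures = [executor.submit(word_length, w) for w in words]
    lengths = dict(f.result() for f in as_completed(futures))

    # map() yields results in the same order as the inputs.
    ordered = list(executor.map(len, words))

print(lengths)
print(ordered)  # [6, 4, 8]
```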

Real-Life Examples

Why incorporate multi-threading into a Python program? Let's grasp the concept by examining a practical programming example.

In this scenario, we will develop a Tkinter window that displays a countdown timer of 5 seconds. The program will simultaneously handle the countdown operation and the display operation on a Tkinter widget (Label).

The key challenge here lies in managing these two tasks within a single process. Assuming you have some familiarity with GUI programming using the Tkinter library (if you're interested, you can explore a dedicated page offering more than 20 Python Tkinter Projects), the code will be straightforward to comprehend.

Let's proceed directly to the main program.

Without Multi-Threading

import time
from tkinter import Tk, Label

# Declaring variables for global use
TimeLabel = None
# Setting the countdown time (5 seconds in this case)
c_time = 5

# Function to create a GUI window
def Window(root):
    global TimeLabel

    window = root
    window.geometry("480x320")
    window.title("Demo App")
    window.resizable(height = False, width=False)

    # A Label widget for showing the remaining time
    TimeLabel = Label(window, text="0:0", bg="black",
                      fg="white", font=("Courier", 22))
    TimeLabel.place(x=205, y=120)

    CountDown()

def CountDown():
    global TimeLabel
    global c_time
    
    while c_time > 0:
        # Print the remaining time
        print(c_time)
        # Display the remaining time
        TimeLabel.config(text=f"0:{c_time}")
        # sleep for 1 second
        time.sleep(1)
        # reduce c_time by 1
        c_time -= 1

# Main function
if __name__ == "__main__":
    root = Tk()
    Window(root)
    root.mainloop()

Output

As observed, the program first prints the remaining time on the terminal and only shows the GUI window after the countdown has finished, which was not the intended behavior. This happens because `CountDown()` blocks the main thread, so `root.mainloop()` cannot start drawing the window until the loop ends. The intention was for printing and displaying to occur at the same time.

This is where multi-threading comes into play. It enables the creation of separate threads within the same process where the main program is executing. Here is the same example, solved using multi-threading.

Using Multi-Threading

import time
from tkinter import Tk, Label
from threading import Thread

# Declaring variables for global use
TimeLabel = None
# Setting the countdown time (5 seconds in this case)
c_time = 5

# Function to create a GUI window
def Window(root):
    global TimeLabel

    window = root
    window.geometry("480x320")
    window.title("Demo App")
    window.resizable(height = False, width=False)

    # A Label widget for showing the remaining time
    TimeLabel = Label(window, text="0:0", bg="black",
                      fg="white", font=("Courier", 22))
    TimeLabel.place(x=205, y=120)

    MultiThreading()

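def MultiThreading():
    # Run the countdown in a background thread so the main thread stays
    # free to run the Tkinter event loop; daemon=True lets the program
    # exit if the window is closed before the countdown ends
    x = Thread(target=CountDown, daemon=True)
    x.start()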

def CountDown():
    global TimeLabel
    global c_time
    
    while c_time > 0:
        # Print the remaining time
        print(c_time)
        # Display the remaining time
        TimeLabel.config(text=f"0:{c_time}")
        # sleep for 1 second
        time.sleep(1)
        # reduce c_time by 1
        c_time -= 1

# Main function
if __name__ == "__main__":
    root = Tk()
    Window(root)
    root.mainloop()

Output

Now the program prints the countdown numbers on the terminal and displays them in the Tkinter window at the same time. That's what we wanted.
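
Updating a widget directly from a worker thread, as `CountDown` does above, often works in practice but is not something Tkinter guarantees in every setup. A commonly used alternative is to let the worker put values on a `queue.Queue` and have the main thread apply them. The sketch below shows only that hand-off, with a plain loop standing in for the Tkinter event loop so it runs without a display. The names `countdown_worker` and `poll_updates` are illustrative, and in a real app the polling would typically be rescheduled with `root.after(...)`.

```python
import queue
import threading
import time

def countdown_worker(start, updates, tick=0.05):
    # Runs in a background thread: it never touches the GUI, only the queue.
    for remaining in range(start, 0, -1):
        updates.put(remaining)
        time.sleep(tick)
    updates.put(None)  # sentinel: countdown finished

def poll_updates(updates, shown):
    # Runs in the main thread: drain whatever has arrived without blocking.
    # Returns False once the sentinel has been seen.
    while True:
        try:
            value = updates.get_nowait()
        except queue.Empty:
            return True
        if value is None:
            return False
        shown.append(f"0:{value}")  # stand-in for TimeLabel.config(text=...)

updates = queue.Queue()
shown = []
worker = threading.Thread(target=countdown_worker, args=(5, updates), daemon=True)
worker.start()

# Stand-in for the GUI event loop.
while poll_updates(updates, shown):
    time.sleep(0.01)

worker.join()
print(shown)  # ['0:5', '0:4', '0:3', '0:2', '0:1']
```

With this arrangement only the main thread ever touches the label, and the worker stays a plain function that is easy to test on its own.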

Exercises

Here are a few exercises to practice multi-threading in Python:

Exercise 1: Downloading Images

Write a program that downloads multiple images from a given list of URLs concurrently using multi-threading. Each thread should download an image and save it to the local disk.

Exercise 2: Parallel Processing

Implement a program that performs parallel processing of a large list of numbers. Create multiple threads to process different segments of the list simultaneously, such as finding prime numbers or calculating square roots.

Exercise 3: Producer-Consumer Problem

Solve the classic producer-consumer problem using multi-threading. Create two threads, one acting as a producer that generates items and adds them to a shared buffer, and another acting as a consumer that consumes items from the buffer.

Exercise 4: Web Scraping

Develop a web scraping application that extracts data from multiple web pages concurrently. Use multi-threading to send simultaneous requests to different URLs and process the retrieved data.

Exercise 5: File Processing

Write a program that reads multiple text files concurrently using multi-threading. Each thread should open and process a separate file, performing tasks like searching for specific words or counting occurrences of certain patterns.
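
As a possible starting point for this exercise, the sketch below counts a word across several files using a thread pool. The sample files are hypothetical and created in a temporary directory only so the example is self-contained; `count_word` is an illustrative helper.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def count_word(path, word):
    # Read one file and count whole-word, case-insensitive matches.
    words = Path(path).read_text(encoding="utf-8").lower().split()
    return words.count(word.lower())

# Hypothetical sample data for the demo.
samples = {
    "a.txt": "threads share memory",
    "b.txt": "Threads and more threads",
    "c.txt": "processes are isolated",
}

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in samples.items():
        path = Path(tmp) / name
        path.write_text(text, encoding="utf-8")
        paths.append(path)

    # One pool task per file; results are read back in submission order.
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(count_word, p, "threads") for p in paths]
        counts = [f.result() for f in futures]

print(dict(zip(samples, counts)))  # {'a.txt': 1, 'b.txt': 2, 'c.txt': 0}
```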

Exercise 6: Parallel Sorting

Design a program that sorts a large list of numbers in parallel. Divide the list into smaller segments and assign different threads to sort each segment concurrently. Merge the sorted segments to obtain the final sorted list.
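
One way to structure this exercise is sketched below: split the list into slices, sort each slice in a pool task, then merge with `heapq.merge`. The function name `parallel_sort` and the default of four segments are assumptions. In standard CPython the threads are unlikely to beat a single `sorted()` call because of the GIL, so treat this as a structural outline rather than an optimization.

```python
import heapq
import random
from concurrent.futures import ThreadPoolExecutor

def parallel_sort(numbers, segments=4):
    # Split the input into roughly equal slices (ceiling division).
    size = max(1, -(-len(numbers) // segments))
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]

    # Sort each slice in its own pool task.
    with ThreadPoolExecutor(max_workers=segments) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))

    # heapq.merge combines already-sorted inputs into one sorted stream.
    return list(heapq.merge(*sorted_chunks))

data = [random.randint(0, 1000) for _ in range(10_000)]
result = parallel_sort(data)
print(result[:5])
```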

These exercises should provide a good starting point for practicing multi-threading in Python. Remember to handle synchronization, shared resources, and thread coordination appropriately for each exercise.

Conclusion

Multi-threading is a powerful technique that enables concurrent execution of tasks within a Python program. By leveraging the `threading` module, you can divide your application into multiple threads that run concurrently, improving responsiveness and, for I/O-bound work, overall performance.

However, it is important to carefully consider the nature of your tasks and whether multi-threading is the most suitable approach for your specific requirements. With a solid understanding of multi-threading and its nuances, you can make effective use of concurrency in Python, optimizing your applications for speed and efficiency.

Subhankar Rakshit

Hey there! I'm Subhankar Rakshit, the brains behind PySeek. I'm a Post Graduate in Computer Science. PySeek is where I channel my love for Python programming and share it with the world through engaging and informative blogs.
