Threading is beneficial for tasks that spend a significant amount of time waiting for input/output (I/O), such as network operations or file I/O. By running these tasks concurrently, your program can make progress on other tasks while waiting for I/O to complete, leading to faster overall execution.
The threading module primarily achieves concurrency, not true parallelism; parallelism will be covered with the multiprocessing module. The example below demonstrates how to create and start threads using the threading module.
import threading
import time
def do_something(seconds):
    print(f"Sleeping for {seconds} second(s)...")
    time.sleep(seconds)
    print("Done sleeping!")
start = time.perf_counter()
# Running the function sequentially
do_something(1)
do_something(1)
finish = time.perf_counter()
print(f"Finished in {round(finish-start, 2)} second(s)") # Approximately 2 seconds
start = time.perf_counter()
# Running the function concurrently using threads
t1 = threading.Thread(target=do_something, args=[1])
t2 = threading.Thread(target=do_something, args=[1])
t1.start()
t2.start()
t1.join() # Wait for t1 to complete
t2.join() # Wait for t2 to complete
finish = time.perf_counter()
print(f"Finished in {round(finish-start, 2)} second(s)") # Approximately 1 second
In this example, the threaded version completes faster because the threads run concurrently. While one thread is sleeping, the other thread can execute.
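The same pattern scales to many threads. As a quick sketch (repeating the do_something definition so the snippet stands on its own, and using nothing beyond the standard library), the threads can be collected in a list, started, and then joined:
import threading
import time
def do_something(seconds):
    print(f"Sleeping for {seconds} second(s)...")
    time.sleep(seconds)
    print("Done sleeping!")
start = time.perf_counter()
# Create ten threads, keeping references so they can be joined later
threads = [threading.Thread(target=do_something, args=[1]) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # Wait for every thread to finish before stopping the timer
finish = time.perf_counter()
print(f"Finished in {round(finish-start, 2)} second(s)")  # Roughly 1 second, not 10
Because each thread spends nearly all of its time sleeping, ten threads still finish in about one second.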
The ThreadPoolExecutor class from the concurrent.futures module provides a more efficient and convenient way to manage threads.
import concurrent.futures
import time
def do_something(seconds):
    print(f"Sleeping for {seconds} second(s)...")
    time.sleep(seconds)
    return f"Done sleeping...{seconds}"
start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor() as executor:
    f1 = executor.submit(do_something, 1)
    f2 = executor.submit(do_something, 1)
    print(f1.result())  # Waits for the result of f1
    print(f2.result())  # Waits for the result of f2
finish = time.perf_counter()
print(f"Finished in {round(finish-start, 2)} second(s)")
# Using as_completed to get results as they become available:
start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(do_something, 1) for _ in range(10)]
    for future in concurrent.futures.as_completed(futures):
        print(future.result())
finish = time.perf_counter()
print(f"Finished in {round(finish-start, 2)} second(s)")
This example demonstrates a practical use case of threading: downloading multiple images concurrently.
import time
import requests
import concurrent.futures
image_urls = [
    "https://images.unsplash.com/photo-1518791841217-8f162f1e1131?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=500&q=60",
    "https://images.unsplash.com/photo-1493663284031-b7e3a1a05820?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=500&q=60",
    "https://images.unsplash.com/photo-1522038992700-c151c4f47122?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=500&q=60",
    # Add more image URLs here
]
def download_image(image_url):
    img_bytes = requests.get(image_url).content
    img_name = image_url.split('/')[-1]
    with open(img_name, 'wb') as img_file:
        img_file.write(img_bytes)
    print(f"{img_name} was downloaded...")
start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor() as executor:
    executor.map(download_image, image_urls)
finish = time.perf_counter()
print(f"Finished downloading {len(image_urls)} images in {round(finish - start, 2)} seconds")
This code downloads multiple images concurrently, significantly reducing the total download time compared to downloading them sequentially.
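For comparison, a sequential version is just a simple loop. This sketch reuses the download_image function and image_urls list from the block above; its runtime is roughly the sum of the individual download times, rather than being close to the time of the slowest single download.
start = time.perf_counter()
# Download the same images one after another
for url in image_urls:
    download_image(url)
finish = time.perf_counter()
print(f"Finished downloading {len(image_urls)} images sequentially in {round(finish - start, 2)} seconds")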