June 19, 2021 08:24 pm GMT

Introduction to threading and multiprocessing: Concurrency & Parallelism in Python

Introduction

Most of us have come across terms like multithreading, parallel processing, multiprocessing, and concurrency, though each of these terms has its own meaning. In a broad sense, we need them because we want to avoid some kind of latency (or create the illusion of doing so) in the execution of regular programs.

To do this, we try to write code that doesn't necessarily run in order (non-sequential), and all of this boils down to two different concepts: concurrency and parallelism.

Concurrency

I came across concurrency when I was trying to download ~5000 images from the web. I had collected image URLs from Flickr, and these images had to be passed on to a team doing annotation (labelling).

This is what a sequential program to download the images looks like:

```python
import requests

def download_image(URL, filename):
    image = requests.get(URL)
    with open(filename, 'wb') as f:
        f.write(image.content)

flickr_URLs = [
    'https://live.staticflickr.com/6022/5941812700_0f634b136e_b.jpg',
    'https://live.staticflickr.com/3379/3492581209_485d2bfafc_b.jpg',
    'https://live.staticflickr.com/7309/27729662905_e896a3f604_b.jpg',
    'https://live.staticflickr.com/8479/8238430093_eb19b654e0_b.jpg',
    'https://live.staticflickr.com/5064/5618934898_659bc060cd_b.jpg',
    'https://live.staticflickr.com/3885/14877957549_ccb7e55494_b.jpg',
    'https://live.staticflickr.com/5473/11720191564_76f3f56f12_b.jpg',
    'https://live.staticflickr.com/2837/13546560344_835fc79871_b.jpg',
    'https://live.staticflickr.com/140/389494506_55bcdc3664_b.jpg',
    'https://live.staticflickr.com/5597/15253681909_0cc91c77d5_b.jpg',
    'https://live.staticflickr.com/1552/24085836504_3d850f03e7_b.jpg',
    'https://live.staticflickr.com/7787/26655921703_ee95e3be8e_b.jpg',
    'https://live.staticflickr.com/423/32290997650_416303457b_b.jpg',
    'https://live.staticflickr.com/4580/37683207504_053315d23f_b.jpg',
    'https://live.staticflickr.com/3225/2454495359_92828d8542_b.jpg',
    'https://live.staticflickr.com/7018/6640810853_22634c6667_b.jpg',
    'https://live.staticflickr.com/7681/17285538029_363c8760ea_b.jpg',
    'https://live.staticflickr.com/7630/16622584999_0654c8d564_b.jpg',
    'https://live.staticflickr.com/6160/6207543047_da2c66c2f6_b.jpg',
    'https://live.staticflickr.com/2921/14251116921_a97d7a46ce_b.jpg'
]

for url in flickr_URLs:
    filename = url.split('/')[-1]
    download_image(url, filename)
```

It does the job, but we spend most of the time waiting for the source URLs to respond. When we scale this program to 5,000 images, this wait time becomes humongous.

The above program sends a request to a URL, waits until the image loads (gets response from server), writes it to disk, and only then sends a new request until the list exhausts.

However, rather than waiting for the first URL to load, shouldn't we send a new request in the meantime? Once we receive some response from a previously sent request, we can write the corresponding image to the disk. By doing this, we are not letting the latency block our main program.

We can achieve this by starting a new thread alongside the main thread, using the built-in Python module threading.

Here's how you create a thread:

```python
thread = threading.Thread(target=download_image, args=[url, filename])
thread.start()
```


So that's one thread. We need to create many threads, so let's loop over the URLs. Here's what the threaded version of this program looks like:

```python
import threading
import requests

def download_image(URL, filename):
    ...

flickr_URLs = [...]

threads = []
for url in flickr_URLs:
    filename = url.split('/')[-1]
    thread = threading.Thread(target=download_image, args=[url, filename])
    thread.start()
    threads.append(thread)

for thread in threads:
    thread.join()
```

We call .join() on a thread to join it back to the main thread, telling Python to wait for that thread to terminate before moving further down the file.
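To see what .join() buys us, here's a minimal sketch (the worker function is a made-up stand-in that just sleeps to simulate an I/O wait). Both threads wait concurrently, so the total time is roughly one delay, not two:

```python
import threading
import time

def worker(name, delay):
    # Simulate an I/O wait (like waiting on a server response)
    time.sleep(delay)
    print(f"{name} done")

start = time.time()
t1 = threading.Thread(target=worker, args=["t1", 0.2])
t2 = threading.Thread(target=worker, args=["t2", 0.2])
t1.start()
t2.start()
# Without these joins, the main thread could reach the code below
# before the workers have finished
t1.join()
t2.join()
elapsed = time.time() - start
# Both threads slept concurrently, so elapsed is ~0.2s rather than ~0.4s
print(f"elapsed: {elapsed:.2f}s")
```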

In this program, we created 20 threads. But how many threads are too many? If we have 5,000 URLs, should we start 5,000 threads?

First of all, you should know that you can start multiple threads, but they won't be running simultaneously. It's just that while one thread is waiting for some I/O operation, another one starts working in the meantime. It might look like the juggler is juggling two balls, but in reality, at any given point he only has one ball in his hand. Source: Library of Juggling

Since your OS continuously switches from thread to thread, deciding which one should run at a given time, managing hundreds of threads will eat up a big chunk of resources.

So a simple workaround is this:

```python
MAX_THREADS = 10

threads = []
for url in flickr_URLs:
    if len(threads) >= MAX_THREADS:
        for thread in threads:
            thread.join()
        threads = []
    filename = url.split('/')[-1]
    thread = threading.Thread(target=download_image, args=[url, filename])
    thread.start()
    threads.append(thread)

for thread in threads:
    thread.join()
```

We're only starting MAX_THREADS threads at once, waiting for them to terminate (by calling .join()), and then starting the next MAX_THREADS threads.

However, there's a modern way of doing this, which we'll see later in this article.

Some implementations of Python, like Jython and IronPython, can run multiple threads simultaneously, but this isn't the case with the default implementation, CPython.

Notice how we didn't need CPU power to speed up this task of downloading images. These are I/O-bound tasks. It's like cooking food, for example: if you're preparing a dish and at some point you need to preheat your microwave oven, you won't be looking for more manpower to speed up your cooking. You're better off utilising the preheating time into, say, chopping some veggies.

Python cutting down veggies while the oven finishes up preheating

That's concurrency. But what if a task is bottlenecked by the CPU rather than networking and I/O? That brings us to parallelism.

Parallelism

Now suppose youre done with cooking, and its time to do the dishes. Can you apply the concept of concurrency here? Pick the knife and start cleaning it, switch over to the bowl you used for pouring milk, start washing it, and then move on to the plates, then back to the knife you were washing some time ago.

At best, it won't make any difference to the execution time of this task. In most cases, doing this will make the process slower, as you spend some time switching back and forth between different utensils (yes, multithreading can also slow down tasks).

If you're looking to expedite this task, you need manpower: a friend who washes the bowl while you cleanse knives and forks. The more friends, the better.

This is what a CPU-heavy task looks like: one where you can use multiple CPU cores to speed it up. Let's try to apply this using Python.

We're trying to find the product of all prime numbers up to a number n using a function productOfPrimes, such that productOfPrimes(10) -> 210 (the product of 2, 3, 5, and 7).

```python
import time

def productOfPrimes(n):
    ALL_PRIMES_UPTO_N = []
    for i in range(2, n):
        PRIME = True
        for j in range(2, int(i/2)+1):
            if i%j==0:
                PRIME = False
                break
        if PRIME:
            ALL_PRIMES_UPTO_N.append(i)
    print(f"{len(ALL_PRIMES_UPTO_N)} PRIME NUMBERS FOUND")
    product = 1
    for prime in ALL_PRIMES_UPTO_N:
        product = product*prime
    return product

init_time = time.time()
LIMITS = [50328, 22756, 39371, 44832]
for lim in LIMITS:
    productOfPrimes(lim)
fin_time = time.time()
print(f"TIME TAKEN: {fin_time-init_time}")
```

Nested loops and lots of division: pretty heavy on the CPU. The execution took ~10 seconds on a 1.1 GHz quad-core Intel i5 processor. However, out of 4 cores, we used just one.

To manage multiple cores and processes, we use the multiprocessing module in Python.

Let's see if multiprocessing improves this result. Syntactically, it's quite similar to how we started threads:

```python
import multiprocessing

def productOfPrimes(n):
    ...

if __name__ == "__main__":
    processes = []
    LIMITS = [50328, 22756, 39371, 44832]
    for lim in LIMITS:
        process = multiprocessing.Process(target=productOfPrimes, args=[lim])
        process.start()
        processes.append(process)
    for process in processes:
        process.join()
```

This does the same thing in a little over 4 seconds: productOfPrimes was simultaneously executed on multiple cores of the CPU.

Now let's talk a bit about numbers. How many processes can a quad-core CPU execute simultaneously? Shouldn't it be only 4? Yes, but that doesn't mean the OS can't hold more than 4 processes in memory. There's a difference between executing processes and just holding them in memory.

Run this bash command to see the number of processes running on your system:

```shell
ps -e | wc -l
```

So if you start 20 processes from Python, it won't throw an error saying you don't have enough cores. The OS will just manage these 20 processes over whatever cores are available.
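You can also check from Python itself how many cores you have to spread those processes over. A quick sketch using only the standard library (note that `os.sched_getaffinity` exists only on some platforms, such as Linux):

```python
import multiprocessing
import os

# Number of CPU cores the machine reports
print(multiprocessing.cpu_count())

# Number of cores *this* process may actually run on
# (can be fewer, e.g. inside a container with a CPU quota)
if hasattr(os, "sched_getaffinity"):
    print(len(os.sched_getaffinity(0)))
```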

Pool of Threads and Processes

We've seen how we can implement the concepts of concurrency and parallelism using the threading and multiprocessing modules. However, there's a sexier, more Pythonic way of doing this using the concurrent.futures module (which ships with Python).

```python
import concurrent.futures

def download_image(URL, filename):
    ...

flickr_URLs = [...]
filenames = [url.split('/')[-1] for url in flickr_URLs]

with concurrent.futures.ThreadPoolExecutor() as executor:
    # map() accepts one iterable per argument of download_image
    results = executor.map(download_image, flickr_URLs, filenames)

for result in results:
    print(result)
```

We can do something similar using ProcessPoolExecutor:

```python
import concurrent.futures

def productOfPrimes(n):
    ...

LIMITS = [...]

if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = executor.map(productOfPrimes, LIMITS)
    for result in results:
        print(result)
```
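Besides map, the executors also offer submit, which returns a Future per task, and concurrent.futures.as_completed, which yields those futures as they finish rather than in submission order. A small sketch with a made-up square function:

```python
import concurrent.futures

def square(n):
    return n * n

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    # submit() schedules one call and immediately returns a Future
    futures = [executor.submit(square, n) for n in range(5)]
    # as_completed() yields futures in completion order, not submission order,
    # so we sort the results for a deterministic printout
    results = sorted(f.result() for f in concurrent.futures.as_completed(futures))

print(results)  # [0, 1, 4, 9, 16]
```

The max_workers argument is also the modern replacement for the manual MAX_THREADS bookkeeping we did earlier: the pool never runs more than that many tasks at once.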


Threads, Processes, OS

Technically, threads run inside a process. When we create 4 threads, they share the same process, and thus the same memory and a lot of other OS-level structures (process control block, address space, etc.). The same is not true for processes: each process has a memory space of its own and runs independently.

On the other hand, processes can run simultaneously, unlike in multithreading, where the OS just keeps switching between threads to hide latency inside the same process.

Conclusion

We've seen how we can implement concurrency and parallelism in Python; the two are fundamentally very different and have use cases of their own. There are more things to talk about, like the problems with threading, the GIL, asynchronicity, etc.

Further Readings and Attributions


Original Link: https://dev.to/adarshpunj/introduction-to-threading-and-multiprocessing-concurrency-parallelism-in-python-48o
