Scaling applications is a necessity in today’s tech world.
An application’s usage can grow rapidly, so applications should be designed to handle an ever-increasing number of requests per second as that happens.
Before talking about scalability, we need to be familiar with a few computer science concepts.
Let’s talk about them!
What is a Process? - A Process is a running instance of a program/application. A program can have multiple instances of itself (Processes) running at the same time.
What is a Thread? - A Thread represents the actual processor instructions that are being executed in the context of a Process. Each Process has at least one thread executing its instructions.
What is Multi-threading? - In Multi-Threading, multiple threads work together, executing instructions within the context of a process in parallel. Threads within a process share state information, memory, and many other attributes.
In a multi-threaded program, the application process can utilize different cores in the processor of the computer’s CPU and run multiple threads in parallel.
This is done to improve computational efficiency and reduce execution time. It forms the basis of vertical scaling, where the next step is to add more and faster processors/execution cores. But this has limitations: processing power cannot be upgraded indefinitely, and it quickly becomes expensive!
To know more about scaling, check out this blog on Horizontal vs Vertical Scaling - https://www.thegeekyminds.com/post/all-about-scaling-a-system-horizontal-scaling-vs-vertical-scaling-system-design
Cost of Horizontal vs. Vertical Scaling
Multi-Threading in Python Applications
Let’s write a function called countdown and execute it
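The original code listing is not reproduced here, so below is a minimal sketch consistent with the description. The loop size and the timing approach are assumptions for illustration:

```python
import time

COUNT = 10_000_000  # loop size is an assumption for illustration

def countdown(n):
    # Pure CPU-bound work: decrement a counter until it reaches zero
    while n > 0:
        n -= 1

start = time.time()
countdown(COUNT)
print(f"Single-threaded: {time.time() - start:.3f} seconds")
```
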
This function takes 1.946 seconds to finish its execution.
Let’s use multi-threading and execute this function again.
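A sketch of the multi-threaded version, splitting the same assumed workload across two threads with the standard threading module:

```python
import threading
import time

COUNT = 10_000_000  # same assumed workload as before

def countdown(n):
    # CPU-bound loop, as in the single-threaded version
    while n > 0:
        n -= 1

# Split the same total work across two threads
t1 = threading.Thread(target=countdown, args=(COUNT // 2,))
t2 = threading.Thread(target=countdown, args=(COUNT // 2,))

start = time.time()
t1.start(); t2.start()
t1.join(); t2.join()
print(f"Two threads: {time.time() - start:.3f} seconds")
```
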
One would expect this to run twice as fast because we have used two threads to execute this function.
Surprisingly, this does not work the way we expect it to!
Do you know why?
The reason for this is the Python GIL (Global Interpreter Lock).
Let’s talk about where it comes from.
CPython is the default and most widely used reference implementation of Python.
It is written in C and Python. The function of CPython is to compile Python code into bytecode before interpreting it. CPython uses the GIL on each CPython interpreter process.
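To see this compilation step in action, the standard library's dis module can disassemble a function into the bytecode instructions the interpreter executes (a small illustrative sketch; the add function is a made-up example):

```python
import dis

def add(a, b):
    return a + b

# Print the bytecode instructions CPython's interpreter will execute
dis.dis(add)
```
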
But What is the GIL?
The GIL, or Global Interpreter Lock, is a mutual exclusion (mutex) lock. It is a way to avoid race conditions.
It ensures that within a Python interpreter process, only one thread may be executing Python bytecode at any given time.
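As a small aside, CPython exposes how often it asks the running thread to release the GIL so another thread gets a turn (a quick sketch using the standard sys module):

```python
import sys

# The "switch interval": how often (in seconds) CPython asks the
# running thread to release the GIL so another thread can run.
# On recent CPython versions this defaults to 0.005 seconds.
print(sys.getswitchinterval())
```
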
What Problem Does This Cause?
Applications can handle two types of operations:
CPU-bound operations are computationally expensive and are limited by the capability of the processor for completion.
For example, complex mathematical operations are CPU-bound.
On the other hand, I/O-bound operations are limited by the time spent waiting for input/output to complete.
For example, database and network operations.
The GIL works against multi-threading in CPython when an application distributes CPU-intensive operations across multiple cores.
However, multi-threading can still be a good fit for I/O-bound operations.
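To illustrate, here is a sketch where time.sleep stands in for real I/O. Because sleeping (like waiting on a socket or disk) releases the GIL, the waits overlap instead of running back to back:

```python
import threading
import time

def fake_io(seconds):
    # time.sleep releases the GIL, just like waiting on a socket or disk
    time.sleep(seconds)

threads = [threading.Thread(target=fake_io, args=(0.5,)) for _ in range(4)]

start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

# Four 0.5-second waits overlap, so the total is close to 0.5s, not 2s
print(f"Finished in {elapsed:.2f} seconds")
```
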
A great visualization of this can be found in a blog post by Dave Beazley, a Python pioneer.
2 CPU-bound threads running on a machine with a single processing core
He talks about an application that runs two CPU-bound threads on a computer with a single processing core.
This works well and both threads execute concurrently.
When the same dual-threaded application is run on a computer with two processing cores, something weird happens because of the GIL.
2 CPU-bound threads running on a machine with two processing cores
In the above, the red regions show the times when a thread on one core cannot run because the thread on the other core is holding the GIL.
Interestingly, the GIL can make multi-threading with I/O operations slow as well.
Multi-threading with one I/O bound and another CPU-bound thread
In the above, note that the I/O-bound thread struggles to acquire the GIL from the CPU-bound thread in order to do its processing.
This was later improved upon, as documented here: https://mail.python.org/pipermail/python-dev/2009-October/093321.html
Why Use GIL Then?
GIL was developed because it offered many advantages:
It was easy to implement while ensuring thread safety and avoiding race conditions
It increased the speed of single-threaded applications
It allowed easy integration of Python with many C libraries that are not thread-safe
Since many Python packages and modules have been developed with the GIL in mind, it is hard to remove the GIL without breaking them.
One workaround is to use multiple processes rather than multiple threads to scale an application (using Python’s multiprocessing package).
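A minimal sketch of this approach, reusing the hypothetical countdown function from earlier with multiprocessing.Pool (the process count and workload are illustrative assumptions):

```python
import time
from multiprocessing import Pool

def countdown(n):
    # CPU-bound loop, as before
    while n > 0:
        n -= 1

if __name__ == "__main__":
    COUNT = 10_000_000  # illustrative workload
    start = time.time()
    # Each worker process has its own interpreter and its own GIL,
    # so the two halves of the work can truly run in parallel
    with Pool(processes=2) as pool:
        pool.map(countdown, [COUNT // 2, COUNT // 2])
    print(f"Two processes: {time.time() - start:.3f} seconds")
```
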
Finally, the ultimate solution could be to work on removing the GIL from CPython (popularly called “Gilectomy”).
This hasn't happened yet because of the strict requirements any GIL removal must satisfy.
You can read about the requirements here: https://wiki.python.org/moin/GlobalInterpreterLock
You can also check out the gilectomy branch of the repository below for more updates on this work. https://github.com/larryhastings/gilectomy
(Note: This post was originally published here.)
To know more about distributed systems, check out this article - https://levelup.gitconnected.com/100-essential-systems-design-concepts-that-every-developer-must-know-part-1-1318c2c402ca