Let’s Go Python

Parallelism in Python
Let’s write a simple application that counts down from 100 million to zero (it could just as well be a program that sends many requests to a server, runs calculations, and so on):
from datetime import datetime

def doSomethingWeird(counter, number):
    print("Thread %s started" % number)
    while counter > 0:
        counter -= 1
    print("Thread %s ends" % number)

if __name__ == '__main__':
    start = datetime.now()
    counter = 100000000
    doSomethingWeird(counter, 1)
    print("Execution took %s" % (datetime.now() - start))
This results in:
Thread 1 started
Thread 1 ends
Execution took 0:00:05.573309
As you can see, running the whole application took around 5.5 seconds.
Now, let’s modify our code to run the same function with 4 threads (a quad-core Intel Core i7 in my case). Each thread should count down from 25 million to zero. If the threads really ran in parallel, the execution time should drop to ~1.4 seconds.
from datetime import datetime
from threading import Thread

def doSomethingWeird(counter, number):
    print("Thread %s started" % number)
    while counter > 0:
        counter -= 1
    print("Thread %s ends" % number)

if __name__ == '__main__':
    start = datetime.now()
    counter = 100000000
    run_x_times = 4
    threads = list()
    for i in range(run_x_times):
        # Integer division, so each thread gets an int share of the work
        threads.append(Thread(target=doSomethingWeird,
                              args=(counter // run_x_times, i + 1)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("Execution took %s" % (datetime.now() - start))
Results I received:
Thread 1 started
Thread 2 started
Thread 3 started
Thread 4 started
Thread 3 ends
Thread 4 ends
Thread 2 ends
Thread 1 ends
Execution took 0:00:08.961999
The threads ran concurrently, but what happened to the execution time? It increased! How is this possible? The Global Interpreter Lock… The threads ran concurrently, but not in parallel.
Concurrency and parallelism in Python is a great topic for another article, so let me just state a few dry facts. In CPython, regardless of the number of threads and processors, only one thread executes Python bytecode at any given time.
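As a quick aside (this sketch is mine, not part of the original benchmark), the standard way to sidestep the GIL in Python is to use processes instead of threads: `multiprocessing.Process` gives each worker its own interpreter, each with its own GIL, so CPU-bound work can actually run in parallel:

```python
from datetime import datetime
from multiprocessing import Process

def doSomethingWeird(counter, number):
    print("Process %s started" % number)
    while counter > 0:
        counter -= 1
    print("Process %s ends" % number)

if __name__ == '__main__':
    start = datetime.now()
    counter = 100000000
    run_x_times = 4
    # Same split as the threaded version, but each worker is a separate OS
    # process with its own interpreter, so the GIL no longer serializes them.
    processes = [
        Process(target=doSomethingWeird, args=(counter // run_x_times, i + 1))
        for i in range(run_x_times)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print("Execution took %s" % (datetime.now() - start))
```

The trade-off is that processes don’t share memory the way threads do; here that doesn’t matter, because each worker counts down its own private counter.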
GIL Go Away!
I looked for other languages that would help me build a fast, multithread application. Scala, Clojure, Erlang, Go… Go? What’s Go?
“Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.”
Go is a programming language initially developed at Google and announced in November 2009. Go(lang) looks like C on steroids, with the goals of simplicity, succinctness, and safety. For more details about the language specification, please check out the official Go website. In the meantime, let’s install Go so I can show you the magic of concurrency.
Get Go
Go is available for Linux, Mac, Windows, and more. Just visit the official website and download the version for your system. If you are lazy or lack time, you can use the Go Playground, which is an awesome place to write and run your first program using only your browser! A simple example:
package main

import "fmt"

func main() {
	var title string
	title = "Mr"
	name := "Dude"
	fmt.Printf("Hello, %s %s", title, name)
}
To run the program, just hit the Run button on the Playground, or type go run main.go locally (if your file is named main.go).
Parallelism in Go
Now that we know the basics of Go, let’s rewrite the same code we wrote in Python. To begin with, here is the same simple program that counts down to 0:
package main

import "fmt"
import "time"

func doSomethingWeird(counter int, number int) {
	fmt.Printf("Routine %d started\n", number)
	for ; counter > 0; counter-- {
	}
	fmt.Printf("Routine %d ends\n", number)
}

func main() {
	start := time.Now()
	counter := 100000000
	doSomethingWeird(counter, 1)
	elapsed := time.Since(start)
	fmt.Printf("Execution took %s\n", elapsed)
}
This results in:
Routine 1 started
Routine 1 ends
Execution took 31.880747ms
It works properly. Let’s not focus on the execution time itself since, in general, compiled programs are faster than interpreted ones.
Now let’s modify our code to run in parallel mode using goroutines. A goroutine is a lightweight thread of execution and, in my opinion, the most common example of Go awesomeness:
package main

import "fmt"
import "time"
import "sync"
import "runtime"

var wg sync.WaitGroup

func doSomethingWeird(counter int, number int) {
	defer wg.Done()
	fmt.Printf("Routine %d started\n", number)
	for ; counter > 0; counter-- {
	}
	fmt.Printf("Routine %d ends\n", number)
}

func main() {
	start := time.Now()
	counter := 100000000
	run_x_times := 4
	// How many logical processors we want to use
	runtime.GOMAXPROCS(run_x_times)
	wg.Add(run_x_times)
	for i := 1; i <= run_x_times; i++ {
		go doSomethingWeird(counter/run_x_times, i)
	}
	wg.Wait() // Wait till all routines end
	elapsed := time.Since(start)
	fmt.Printf("Execution took %s\n", elapsed)
}
And now let’s see the results:
Routine 3 started
Routine 1 started
Routine 2 started
Routine 4 started
Routine 1 ends
Routine 3 ends
Routine 4 ends
Routine 2 ends
Execution took 8.626793ms
Woohoo! Success! Using 4 cores, the time dropped from ~32 ms to ~8 ms. All we did was set GOMAXPROCS and launch each call as a goroutine. The GOMAXPROCS variable limits the number of operating system threads that can execute user-level Go code simultaneously; there is no limit on the number of threads that can be blocked in system calls on behalf of Go code.
Summary
As David Beazley writes in The Unwritten Rules of Python:
1. You do not talk about the GIL.
2. You do NOT talk about the GIL.
3. Don’t even mention the GIL. No seriously.
Now we at least know why!