It’s that time of the year to slow down from all the craziness and take a look at what has been accomplished in the last few months.
Here at Tryolabs, this includes thinking about the Python libraries that have helped the open source community to build amazing systems during the year.
Like in 2015, 2016 and 2017, we’re thrilled to share our hand-picked selection of the best Python libraries (according to our most humble opinion) with you.
If you can think of a library that is not mentioned here but you believe deserves to be on the list, please let us know in the comments section at the end of the post. Here we go!
Along with the Deep Learning boom that we’re experiencing, a huge number of new tools have seen the light of day.
The complexity of our reality brings about one huge problem: the Deep Learning models that achieve the best accuracy are steadily becoming more and more complex, which makes interpretability a critical issue.
SHAP provides an explanation for the output of any Machine Learning model. It uses Shapley values (a solution concept from cooperative game theory, characterized by a collection of desirable properties) to interpret the target model.
It introduces optimizations which allow Shapley values to be used in practice, by calculating those values faster than if a model prediction had to be calculated for every possible combination of features.
The explanation also shows how much each feature contributes to moving the model output away from the base value (the average model output over the training dataset), and whether it pushes the prediction higher or lower.
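To make the idea concrete, here is a minimal sketch of the brute-force calculation that SHAP’s optimizations are designed to avoid. It does not use the shap library: the toy `model`, the `background` rows standing in for a training set, and the helper names `value` and `shapley_values` are all made up for illustration. It evaluates the model for every combination of features, which is only feasible for a handful of them.

```python
from itertools import combinations
from math import factorial


def model(x):
    """Hypothetical toy model: a linear part plus one interaction term."""
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]


def value(subset, x, background):
    """Average model output when only the features in `subset` come from x.

    The remaining features are filled in from each background row in turn.
    """
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(x, background):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a subset of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = value(s | {i}, x, background) - value(s, x, background)
                phi[i] += weight * gain
    return phi


# Assumed stand-in for a training set, and one sample to explain.
background = [[0.0, 0.0, 0.0], [1.0, 2.0, 1.0], [2.0, 1.0, 3.0]]
x = [3.0, 4.0, 2.0]

base_value = value(set(), x, background)  # average output over the background
phi = shapley_values(x, background)
prediction = model(x)

for i, contribution in enumerate(phi):
    direction = "higher" if contribution > 0 else "lower"
    print(f"feature {i}: {contribution:+.3f} (pushes the prediction {direction})")
print(f"base value + contributions = {base_value + sum(phi):.3f}")
print(f"model prediction           = {prediction:.3f}")
```

The two last printed numbers agree: the contributions always add up to the gap between the base value and the prediction. The nested loops are also why this approach does not scale, since the number of subsets doubles with every extra feature.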
As you may know, Decision Trees constitute the foundation of some of the most widely used learning methods. On top of that, Decision Tree graphs are very easy to interpret, which is why being able to visualize the result of a prediction in graphic form is extremely useful.
Until now, visualization packages designed to solve this problem were rather rudimentary and lacked some desirable functionality. For example, existing implementations weren’t able to show how decision nodes split up the feature space.
dtreeviz was born to fill this gap. This Python package for scikit-learn allows one to visualize classification or regression decision trees and to perform model interpretation.
It’s entirely true that scikit-learn has its own visualization option and it works great. However, dtreeviz provides a more intuitive and user-friendly output for Python programmers.
A detailed comparison between dtreeviz and other existing implementations can be found at How to visualize decision trees.
Attempting to make concurrency easy and still get the job done.
Sounds like an impossible deal, right? This library is meant for those who need to write programs that do multiple things at the same time with parallelized I/O. With a focus on usability, it does not run your code in parallel; instead, it switches between tasks while they are waiting on I/O, which gives the effect of doing several things at once.
While it is true that asyncio is very similar and much more mature, Trio aims to make your code simpler.
A great review can be found on Medium and, of course, we recommend checking the exhaustive official documentation.
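The interleaving idea can be sketched with the standard library alone. Note that this example deliberately uses asyncio rather than Trio itself, so it only illustrates the shared concept of running tasks concurrently on one thread; in Trio the tasks would be started from a nursery (`trio.open_nursery`). The `fake_request` function and its sleep times are hypothetical stand-ins for real network calls.

```python
import asyncio
import time


async def fake_request(name, wait, log):
    """Pretend to do network I/O: while this task sleeps, others can run."""
    log.append(f"start {name}")
    await asyncio.sleep(wait)  # stands in for waiting on a socket
    log.append(f"finish {name}")


async def main():
    log = []
    started = time.monotonic()
    # Both tasks run on the same thread; the event loop switches between
    # them whenever one is waiting.
    await asyncio.gather(
        fake_request("slow", 0.2, log),
        fake_request("fast", 0.1, log),
    )
    elapsed = time.monotonic() - started
    return log, elapsed


log, elapsed = asyncio.run(main())
print(log)
print(f"elapsed: {elapsed:.2f}s")
```

The log shows both tasks starting before either one finishes, and the total time is close to the longest wait rather than the sum of the two, even though nothing ran in parallel.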
Disclaimer: this library was not developed by Tryolabs. 😉
Igniting your PyTorch is now cleaner and easier than it’s ever been. This high-level library takes care of the messy coding that is necessary for training neural networks so you can just focus on the data science.
With just a few lines of code, Ignite can provide you with full-fledged training loops and out-of-the-box support for metrics, early stopping, model checkpointing and many more features, without the boilerplate.
This library was inspired by torchnet and the main difference is that it adds a higher level of abstraction (making fewer assumptions about the type of network being trained), providing more flexibility and empowering users.
Now you can finally code like it’s 2018!
Defining a stream processor has never been so easy. Who would have thought that just by adding a decorator to a Python function you would find all the stream processing power you always wanted? Sounds like magic, but no, it’s just Faust.
Earlier this year, Robinhood open-sourced its stream processing library. Based on the ideas of Kafka Streams, it lets you use all the familiar Python structures and libraries, such as NumPy, PyTorch, Pandas, Django and more, when processing a stream.
It takes advantage of Python’s recent performance improvements and integrates with the new asyncio module for high-performance asynchronous I/O. It’s so awesome that it will even let you store the state of the stream in RocksDB, which is an embedded non-relational database.
It’s a great solution for processing all types of streams, such as bytes, Unicode, and serialized structures.
And if you don’t believe us, you may as well believe the hype: as reported by The New Stack, Faust was number three on Hacker News on the day it launched. 😊
Helping to adopt asynchronous frameworks since Python 3.5.
This server implementation supports the Asynchronous Server Gateway Interface (ASGI) specification, which is intended to provide a standard interface between async-capable Python web servers, frameworks, and applications. It’s built on uvloop and httptools, so it is designed for speed. Moreover, it includes support for features like WebSockets and plans to add support for HTTP/2 in the future.
A common server interface makes it possible to build a set of tools that can be used across all asyncio frameworks! Isn’t that great?
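To give a feel for what that standard interface looks like, here is a minimal sketch of an ASGI application: just an async callable that receives a `scope` and two channels, `receive` and `send`. The response text and the small `call_app` driver are our own illustration; the driver fakes a single request so the snippet runs without any server installed.

```python
import asyncio


async def app(scope, receive, send):
    """Minimal ASGI application: answers every HTTP request with plain text."""
    if scope["type"] != "http":
        # Other scope types (such as lifespan) are not handled in this sketch.
        raise NotImplementedError(f"unsupported scope type: {scope['type']}")
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello, ASGI!"})


async def call_app():
    """Hypothetical driver: plays the role of the server for one request."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/", "headers": []}
    await app(scope, receive, send)
    return sent


messages = asyncio.run(call_app())
print(messages)
```

With Uvicorn installed and the file saved as, say, `example.py`, the same `app` can be served for real with `uvicorn example:app`. Because the application only depends on the interface, any other ASGI server could run it too.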
To take a deeper look into this great server implementation, check the complete Uvicorn documentation.
Backend lovers, fasten your seatbelts! This special library allows you to create what you always wanted: a platform-independent GUI using just Python. Translating your Python code into HTML, without you writing a single line of HTML, is Remi’s business.
What makes Remi different from other GUI libraries is that no native code or dependencies are needed; instead, you’ll only need your browser. Remi takes care of everything needed to get your app up and running by starting a web server accessible on your network.
Remi also includes its own graphical editor to make your life easier: it lets you walk the typical road backwards and produce the Python code associated with a specific UI design. It’s ideal for building a quick prototype or a proof-of-concept system.
Learn more about this cross-platform remote GUI on its official site.
Web service framework, written for human beings.
Kenneth Reitz is back with another amazing contribution to the Python open-source community.
Have you ever imagined a mix of Flask, Falcon, and Requests? If you have, then you surely know Responder.
This framework combines Flask-style expressions with the elegance of Python, plus support for JSON and YAML, all in one place. It also uses the client from the Requests library, which is more than well known across the Python community.
Check the official documentation for further reading!
Library dependencies can be somewhat treacherous when defining the stack needed for a project. There are some great tools, like Pipenv, that allow managing isolated environments in an easy and intuitive way. However, dependency resolution is sometimes erratic in those tools.
That’s why Poetry’s main strength is the exhaustive and efficient dependency resolver at its heart, which ensures you always have the proper stack when declaring, managing and installing your project’s dependencies.
Besides managing dependencies for applications and libraries, it can also be used to build and publish distributions.
Furthermore, you can configure the completion scripts for the terminal of your preference.
Working in teams is our thing, and no matter how much effort you put into having everyone share the same code style by complying with the same standards, each programmer has their own particular swing. That’s when Black comes into play.
This code formatter reformats entire files in place, so you can focus on important matters instead of pouring your energy into code styling.
It simplifies the coding process by unifying the style across all projects. It also produces the smallest possible diffs, which increases team productivity during code reviews. In a few words, the style Black enforces can be described as a strict subset of PEP 8.
You can even visit the Black Playground to give it a try! Yet, as a small disclaimer, it’s still in Beta.
Rounding up this brief overview, we can conclude that this year was a blast. The software industry is a roller coaster of emotions with a lot of good developments along the road. It’s growing so fast that it’s really hard to keep track of everything that’s happening at the same time!
If we’ve left out your favorite Python library or if you disagree with any of the above, feel free to comment below.
There’s nothing more to say than to thank everybody in the community for such great work!
This blog post has been written in collaboration with Alan Descoins and Fabián Torres.