Earlier this year, I was invited to give a keynote talk at PyCon APAC, held in Singapore on May 31 – June 2, 2018. It is always an honor to be asked to be a keynote speaker, and this particular conference was taking place in Asia-Pacific, a region I did not know much about, since nearly all our clients are based in the US. Eager to explore something different and learn about a new community, I said yes!
My initial learnings
Singapore is a beautiful place: an extremely safe, modern city that is ideal for tourists. People are friendly, and nearly everyone speaks English, which is one of the official languages. Interestingly, the English spoken in Singapore is quite different from American or British English, with a strong influence from the mix of cultures (Singaporean, Chinese, Indonesian, Indian, etc.) living together there. The quality of life is generally very good. It is expensive mostly for tourists (compared to other countries in the region), but if you live there, you have access to good financing for things like buying a house.
Singapore is (or is becoming) the "Silicon Valley" of Southeast Asia. Many big companies (such as Google) are already present there. Other companies, lesser known to me because they are mostly active in Asia, either have their headquarters in Singapore or are eager to grow their offices there.
Many companies from neighboring countries have difficulty finding enough senior developers for their needs, so they look to Singapore as a place where they can grow their engineering teams with highly qualified developers.
There is plenty of VC money, and the government itself is very supportive. Even school kids can get money for their startup.
I was in charge of the opening keynote on Saturday, the last day of the conference. Given our work in computer vision and the other talks we have given at international conferences, I thought it would be a good idea to demystify how modern Deep Learning for object detection works.
This time, it was particularly challenging. Our other talks took place at data science conferences, with two other tracks running at the same time, whereas here the entire audience was going to be present at the keynote, so the talk needed to address a general audience. This meant I couldn’t even assume that the audience knew what a Neural Network is! It was hard work, but in the end, and with great help from colleagues, I think we arrived at an explanation that showed the intuition, not the math formulas, behind it all.
This is the abstract of the talk:
In recent years, models based on Convolutional Neural Networks (CNNs) have revolutionized the entire field of computer vision. Problems like image classification can now be considered solved, and it is easy to construct implementations with any modern Deep Learning framework using fine tuning with pre-trained weights on datasets such as ImageNet.
In this talk, we will explore how and why these techniques work, getting an understanding of the intuitive aspects of what the networks are actually doing. Moreover, this intuition will enable us to understand how to jump from image classification to the more complex problem of object detection, explaining the workings of the Faster R-CNN algorithm in the process.
We will also speak about an open source Python object detection toolkit based on TensorFlow called Luminoth, going over the motivation behind it and showing how it can be integrated into your application.
Here are the slides of my keynote talk:
On my way to the conference I took a short video in the streets of Singapore, which I used to detect cars and people with Luminoth, the open-source toolkit mentioned in the presentation above. Here is the outcome:
My conference highlights
Jordan Dea-Mattson gave an inspiring talk. He talked about how the cost of building a company has gone down in recent years, and how it is now easier than ever to build things that were impossible before. I also learned about many of the good things Singapore has to offer entrepreneurs. A good talk for a mostly young and energetic audience!
Katharine Jarmul gave an excellent keynote talk about privacy in ML, a subject that has been very relevant recently (the Facebook and Cambridge Analytica scandal, GDPR, etc.). Her point is that privacy in data science today is a privilege, not a right. Interestingly, she had several examples of datasets that have been widely used by the community (like MNIST), and showed that very few people really know about the origin of the data and the actual people who produced it. In the case of MNIST, I was surprised to find out that you can actually trace the origin of the characters to several schools, and that it would be possible to use handwriting analysis to make an educated guess as to who produced particular digits. There were several more interesting (and scary) examples of data de-anonymization, like how you can extract credit card numbers from language models (do you use a predictive keyboard on your phone?). It all boils down to recognizing the importance of data provenance: we should be able to track consent as part of the data and give it an expiration date, as you would for a contract. She also talked about the idea of differential privacy and how it solves part of the problem. Overall, a really interesting talk, and it was also great to chat with Katharine afterwards.
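To make the idea of differential privacy a bit more concrete, here is a minimal sketch of one standard way to achieve it for a counting query: the Laplace mechanism. It is not taken from Katharine's talk; the function names, the `epsilon` parameter name and the toy data are my own assumptions for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution centered at zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Answer "how many records match?" with epsilon-differential privacy.

    Adding or removing one person changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: ages of people in a dataset.
ages = [23, 35, 41, 29, 52, 38, 61, 19]
rng = random.Random(0)  # seeded only to make the example repeatable

# A smaller epsilon means more noise and stronger privacy.
noisy_answer = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
```

The person querying the data only ever sees `noisy_answer`, so the presence or absence of any single individual is hard to infer from it. The trade-off is accuracy: the smaller `epsilon` is, the noisier the answer.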
Artisanal Async Adventures, a talk by Jonas Obrist from HDE Inc. I have a lot of respect for people who have the courage to do a live coding talk, and this one was particularly good. He showed how, in about 100 lines of low-level Python code, you can write a very performant async web server, explaining the rationale behind each decision. The same mechanism is what makes more powerful libraries like Twisted and Tornado work. Neat!
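For readers who have never seen what such low-level code looks like, below is a rough sketch of a single-threaded, non-blocking server built on the standard library `selectors` module. It is not a reconstruction of the code written in the talk; the function names, the fixed response and the `max_requests` parameter are assumptions of mine, and it cuts corners a real server could not (for example, it assumes the small response fits in the socket's send buffer).

```python
import selectors
import socket

RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 6\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"hello\n"
)


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket (port 0 picks a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    sock.setblocking(False)
    return sock


def serve(listener, max_requests=None):
    """Run the event loop; stop after `max_requests` responses if given."""
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ)
    buffers = {}  # bytes received so far, per client connection
    served = 0
    while max_requests is None or served < max_requests:
        # Block until at least one socket is ready (or the timeout expires).
        for key, _events in sel.select(timeout=1.0):
            sock = key.fileobj
            if sock is listener:
                # New client: accept it and start watching it for data.
                conn, _addr = listener.accept()
                conn.setblocking(False)
                buffers[conn] = b""
                sel.register(conn, selectors.EVENT_READ)
                continue
            chunk = sock.recv(4096)
            buffers[sock] += chunk
            if chunk and b"\r\n\r\n" not in buffers[sock]:
                continue  # headers not complete yet, keep waiting
            if chunk:
                # Simplification: a real server would wait for EVENT_WRITE.
                sock.sendall(RESPONSE)
                served += 1
            # Either we answered or the client hung up: clean up.
            sel.unregister(sock)
            sock.close()
            del buffers[sock]
    sel.close()
```

Calling `serve(make_listener(port=8000))` would answer every request with the same plain-text body. The point is that one thread handles many connections by only touching sockets that the operating system reports as ready.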
How to deploy machine learning models to production (frequently and safely), a talk by David Tan from Thoughtworks. He highlighted how hard it is to do Continuous Integration for ML models (to significantly reduce deploy time), and how to integrate unit, integration and metric testing into your pipeline to make sure that you are not breaking things and that the model used in production is actually the best-performing one.
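As an illustration of what a metric test in such a pipeline could look like, here is a small sketch of a "promotion gate" that only lets a candidate model replace the production one if it scores at least as well on a holdout set. This is my own simplified example, not code from David's talk; the function names, the `min_gain` parameter and the toy models are all hypothetical.

```python
def accuracy(predict, examples):
    """Fraction of (features, label) pairs that `predict` gets right."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


def should_promote(candidate, production, holdout, min_gain=0.0):
    """Return True only if the candidate beats production by `min_gain`."""
    return accuracy(candidate, holdout) >= accuracy(production, holdout) + min_gain


# Toy stand-ins for real models: classify a number as "large" (1) or not (0).
def production_model(value):
    return 1 if value > 3 else 0


def candidate_model(value):
    return 1 if value > 5 else 0


# Hypothetical labeled holdout set of (features, label) pairs.
HOLDOUT = [(1, 0), (4, 0), (6, 1), (9, 1)]

if should_promote(candidate_model, production_model, HOLDOUT):
    print("candidate passes the metric gate")
else:
    print("keeping the current production model")
```

In a CI pipeline, a check like this would run next to the ordinary unit and integration tests, and a failing gate would block the deploy.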
Detecting offensive messages using Deep Learning: A micro-service based approach, a talk by Alizishaan Khatri from Pivotus Ventures, showed how modern NLP built from combinations of Deep Learning models can be very powerful, achieving state-of-the-art performance on tasks like detecting offensive messages. A really fun and engaging talk.
Zac Hatfield-Dodds ran a very interesting workshop about the Hypothesis Python library for property-based testing. I had been keeping an eye on this library because it’s a super interesting paradigm shift for testing, but I hadn’t had the chance to play with it. Talking to Zac taught me a lot about the uses, limitations and inner workings of this library.
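If property-based testing is new to you, the core idea fits in a few lines: instead of hand-picking inputs, you state a property that must hold for every input and let the computer generate examples. The sketch below is deliberately hand-rolled with the standard library `random` module and does not use the Hypothesis API; Hypothesis generates inputs far more cleverly and also shrinks failing examples to minimal ones. All names here are my own.

```python
import random


def run_length_encode(text):
    """Compress "aaab" into [("a", 3), ("b", 1)]."""
    encoded = []
    for char in text:
        if encoded and encoded[-1][0] == char:
            encoded[-1] = (char, encoded[-1][1] + 1)
        else:
            encoded.append((char, 1))
    return encoded


def run_length_decode(pairs):
    """Inverse of run_length_encode."""
    return "".join(char * count for char, count in pairs)


def random_text(rng):
    """Generate a short random string over a tiny alphabet."""
    length = rng.randint(0, 20)
    return "".join(rng.choice("ab") for _ in range(length))


def check_property(prop, generate, runs=200, seed=0):
    """Return the first generated example that violates `prop`, else None."""
    rng = random.Random(seed)
    for _ in range(runs):
        example = generate(rng)
        if not prop(example):
            return example
    return None


def roundtrip_holds(text):
    """Property: decoding an encoded string gives back the original."""
    return run_length_decode(run_length_encode(text)) == text


counterexample = check_property(roundtrip_holds, random_text)
```

Here `counterexample` is `None` when no generated string breaks the round-trip property. A buggy encoder would instead hand you a concrete failing input to debug.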
PyCon APAC 2018 was a great experience, and I am very thankful to the organizers for having invited me. I learned many interesting things, found out that Singapore has a vibrant IT & developer scene, and also met many cool and smart people! I am really looking forward to more events like this in the future.