Thu, Jun 13, 2019
As a machine learning engineer building solutions in the computer vision and deep learning fields, I’d always had my eye on the Embedded Vision Summit, a leading computer vision conference held yearly in Silicon Valley. When I found out I had been invited to speak at the 2019 edition of the conference, taking place May 18-21, I was obviously very excited. This was going to be an amazing opportunity to share my experiences in the field, and of course I was all in.
The Embedded Vision Summit is one of the major US events dedicated to computer vision-powered products and solutions. Several hundred engineers, executives, investors, analysts, and marketers gather from all over the world, eager to learn about the technologies and applications that enable computer vision, and about how computer vision impacts the business of technology.
This year's event attracted more than 1,200 attendees, with over 100 speakers covering an amazingly broad range of computer vision topics, organized into four tracks: Fundamentals, Technical Insights, Business Insights, and Enabling Technologies. No industry was left behind: the talks offered insights on every topic imaginable, from autonomous vehicles and aerial imagery to practical security and healthcare applications.
The event is organized by the Embedded Vision Alliance, an industry partnership that brings together more than 90 technology providers and end-product companies from around the world.
I was asked to kick off the Fundamentals track with my talk, which featured key concepts in computer vision aimed at an audience new to the field.
My session was titled An Introduction to Machine Learning and How to Teach Machines to See, and it tackled questions such as (you guessed it): What is machine learning? And how can we teach machines to see?
These questions were followed by an introduction to deep learning, including the history of neural networks, an explanation of how convolutional neural networks function, and how they can be used to solve image classification problems.
Since my target audience was rather diverse (some were new to machine learning or computer vision, while others were looking to get a refresher on certain concepts), I decided to focus on a subject that everyone is aware of and that the internet has been attempting to explain for many years: cats.
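To make the cat example a bit more concrete, here is a minimal sketch, in Keras, of the kind of small convolutional network that can be trained to tell cats from non-cats. This is my own illustrative example (the input size and layer sizes are arbitrary choices), not code from the talk:

```python
import tensorflow as tf
from tensorflow.keras import layers

# A tiny CNN for a binary "cat vs. not cat" classifier.
# Illustrative architecture: the sizes here are arbitrary choices.
model = tf.keras.Sequential([
    layers.Input(shape=(128, 128, 3)),         # 128x128 RGB image
    layers.Conv2D(16, 3, activation="relu"),   # learn local edge/texture filters
    layers.MaxPooling2D(),                     # downsample, keep strongest responses
    layers.Conv2D(32, 3, activation="relu"),   # combine edges into higher-level patterns
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),     # probability that the image shows a cat
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Trained on a labeled set of cat and non-cat images (via model.fit), the convolutional layers learn the relevant visual features on their own, so nobody has to hand-craft a definition of what whiskers look like.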
You can see the slides of the session here.
By the end of my session, I was very pleased that the audience had not only understood the concepts but were also eager to learn more about the topic. They asked some really interesting questions and even approached me afterwards to share their thoughts and specific needs. Since my talk was the first step towards understanding several concepts that would come up in later sessions, I was glad I could prepare the audience for the speakers that followed.
The Summit was one amazing session after another, but for the sake of brevity, I'll highlight just a few that I found really interesting:
Professor Ramesh Raskar, from MIT, opened the first day of sessions with an amazing keynote showcasing his team's most recent advancements in imaging hardware and computer vision algorithm design. Seeing through fog? Looking around corners? It sounds like science fiction, but Prof. Raskar gave an eye-opening presentation on what can be achieved with femto-photography sensors and clever calculations, leaving the room speechless.
Pete Warden, Staff Research Engineer at Google, was the keynote speaker on the second day. He shared his thoughts on what the future holds for embedded vision devices and machine learning solutions in general. In particular, he talked about low-power, low-cost machine learning solutions being deployed on embedded devices, with neural network architectures that only need a few hundred kilobytes to run (yeah, that's not a typo, I said KBs).
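To put that kilobyte figure in perspective, here's a quick back-of-the-envelope calculation (my own illustrative numbers, not Pete's):

```python
# Rough size estimate for an embedded-friendly network
# (illustrative numbers, not from Pete's talk).
params = 250_000       # weights + biases in a small embedded-scale model
bytes_per_param = 1    # int8 weights after quantization (float32 would need 4)
size_kb = params * bytes_per_param / 1024
print(f"~{size_kb:.0f} KB")  # ~244 KB: a few hundred kilobytes, as promised
```

A quarter-million 8-bit weights fit in roughly 244 KB, which is why quantized models can run on devices whose memory is measured in kilobytes rather than gigabytes.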
Professor Ioannis Kakadiaris, from the University of Houston, observed that although we are seeing remarkable results with face recognition systems, there are still some challenges when they are faced (no pun intended) with variations in pose, expression, illumination, and occlusions. He offered insights on how we should evaluate face recognition systems, taking into account all of these challenges.
Bert Moons, from Synopsys, presented 5+ Techniques for Efficient Implementation of Neural Networks, a really important topic when trying to deploy computer vision solutions on embedded devices, but also worth considering when deploying any kind of neural network based solution. Bert did a great job showcasing several methods to reduce memory consumption and compute requirements, such as quantization, network decomposition, weight pruning, weight sharing, and sparsity-based compression. He also demonstrated that careful selection and application of these techniques can drastically reduce resource requirements without hurting a model's results.
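To show what the simplest of these techniques looks like in practice, here is a sketch of post-training weight quantization using the TensorFlow Lite converter. This is my own example, and the choice of framework and model is mine; Bert's talk covered the techniques in framework-agnostic terms:

```python
import tensorflow as tf

# Post-training quantization: store weights as 8-bit integers instead of
# 32-bit floats, shrinking the model to roughly a quarter of its size.
model = tf.keras.applications.MobileNetV2(weights="imagenet")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable weight quantization
tflite_model = converter.convert()

with open("mobilenet_v2_quantized.tflite", "wb") as f:
    f.write(tflite_model)
```

The other techniques Bert covered, such as pruning and weight sharing, typically happen during or after training and can be stacked on top of quantization for further savings.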
Adam Kraft, from Orbital Insight, talked about the challenges they addressed when working with satellite imagery, and the techniques they used. Some of these challenges included fusing together data coming from different types of image sensors, dealing with different ground truth measurement sources, and detecting trends and changes in imagery over time.
If you want to listen to the audio recordings of the talks, you can check them out here on the Embedded Vision Alliance website.
The Summit not only offered amazing sessions; there was also a hall full of booths where more than 60 companies demoed their newest processors, algorithms, software, sensors, development tools, and services. After visiting several of these booths and talking with many people, one thing became clear: self-driving vehicles are gaining a lot of attention, with companies developing dedicated camera sensors and embedded vision systems for object detection and scene segmentation. And this was not a phenomenon isolated to the Summit itself; we even saw 3 or 4 different companies testing their autonomous cars on the streets of San Francisco the day we arrived!
Sometimes, approaching a speaker after their talk, or chatting with someone over lunch or at their booth, is not enough to develop a truly productive conversation: you only have a few minutes, and other people want to do the same. Luckily, the organizers had us covered and invited us to an Executive Networking Reception following the first day's sessions. There, we had the opportunity to speak with industry leaders, researchers, and investors, all while eating fantastic paella and enjoying some drinks.
The Embedded Vision Summit is certainly the place to be if you want to be up-to-date on current products, technologies, companies, and ideas that are reshaping computer vision. It hosts high-quality talks, workshops, and exhibitors, and is full of great networking opportunities involving people who share a passion for computer vision (especially with food and drink available). At Tryolabs, we are very excited to have been a part of this experience and hope to be there again next year.
Want to get more familiar with opportunities and techniques in the field? Read further in our Introductory Guide to Computer Vision.