Launching Luminoth: our open source computer vision toolkit


Tue, Oct 10, 2017

Authors

Martín Alcalá Rubí

Co-Founder & Board Member

After a few months working in stealth mode, we are very proud to launch our Deep Learning initiative: luminoth.ai

Luminoth is an open source toolkit for computer vision. Currently, we support object detection and image classification, but we are aiming for much more. It is built in Python, using Google’s Machine Intelligence framework, TensorFlow; and Sonnet, a very useful library built by DeepMind for building complex neural networks with reusable components.

Why have we created such a thing?

Over the last few years, the Machine Learning (ML) landscape has changed dramatically. The comeback of neural networks in the form of Deep Learning has opened new creative ways to approach many classical ML and Artificial Intelligence (AI) problems.

Out of all the areas vastly improved by Deep Learning techniques, computer vision has been particularly revolutionized by major breakthroughs that have only very recently occurred. In particular, there is a family of neural network architectures that have performed really well: Convolutional Neural Networks (CNNs or ConvNets, in ML lingo).

These networks have a number of properties that make them really well suited for image processing tasks. CNNs have the ability to spatially slide different filters through an image, and use stacks of these filters to recognize patterns of increasing level of abstractness, until they can grasp complex concepts that would otherwise be hard to express.
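To make the idea of "sliding a filter through an image" concrete, here is a minimal NumPy sketch. It is only an illustration: the function name `slide_filter` and the toy image and filter are made up for this example, and this is not how Luminoth or TensorFlow implement convolutions (they use learned filters and highly optimized kernels). As in most Deep Learning frameworks, the operation shown is technically a cross-correlation.

```python
import numpy as np


def slide_filter(image, kernel):
    """Slide `kernel` over every position where it fully fits inside `image`
    and return the map of filter responses (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    response = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the filter with the patch under it, summed up
            patch = image[i:i + kh, j:j + kw]
            response[i, j] = np.sum(patch * kernel)
    return response


# Toy 5x5 "image": two dark columns on the left, three bright columns on the right
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hand-written 1x2 filter that responds to a dark-to-bright vertical edge
kernel = np.array([[-1.0, 1.0]])

response = slide_filter(image, kernel)
print(response)
```

The response is non-zero only in the column where the image goes from dark to bright. A CNN learns many such filters from data and stacks them, so that later layers respond to increasingly abstract patterns.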

Deep neural networks also allow us to be very creative in terms of designing different architectures, by playing around with different layer types, trying out different configurations and experimenting with different hyperparameters. In fact, the state of the art of several of these technologies is changing almost on an everyday basis, due to this kind of fast prototyping.

The downside of these methods is the huge amount of labeled training data needed to get meaningful results, which can sometimes be very expensive to obtain. As Machine Learning authority Andrew Ng states in many of his talks, this effect gets especially aggravated in the case of Deep Learning.

Opportunities for the industry

We believe there is a massive opportunity in connecting these developments to new industry-grade applications (which is our core expertise at Tryolabs). In fact, many of the benefits of incorporating ML-derived technologies into our everyday lives will come from advancements in the field of computer vision.

As a few examples: self-driving cars depend heavily on computer vision to understand their surroundings and make effective decisions. Augmented reality will benefit greatly, since apps will have to identify objects and scenarios. Image-based medical diagnosis, video-based security systems, and satellite and drone imagery analysis are only some of the countless possible applications of these emerging technologies.

The issues companies are facing

Over the last couple of years, we have identified some issues most companies face when incorporating these technologies into their platforms:

Expect an intense learning curve for your team

Even for people with tons of experience (and even with PhDs in the field!), the space has been evolving so quickly that it’s very hard to keep up with the pace of recent advancements and new techniques.
There has been a boom in the number of Deep Learning frameworks: Google’s TensorFlow, Facebook’s PyTorch or Caffe2, Microsoft’s CNTK, Amazon’s MXNet, among many others. Integrating and customizing these frameworks requires deep domain expertise.
While there are higher level abstractions such as Keras, it is not always straightforward to apply them directly to the latest research findings.

SaaS solutions leave important IP outside your platform, and are expensive in high usage scenarios

Many companies have developed their own SaaS API solutions, like Google's Vision API, Microsoft’s Computer Vision API, Amazon’s Rekognition, among others. While those are super simple to integrate into most applications, these APIs usually only work for a pre-defined number of tasks/classes/object types like detecting faces, cars, dogs, cats, etc. That is good enough for a broad range of applications, but it is very common for companies to have needs specific to the type of problem they are trying to solve. What if you are a manufacturing company and want to identify defective pieces in your production chain? Or if you work in health care, and want to detect certain patterns in medical imagery? In these cases, you need something that can be trained with any dataset.
Some companies like Clarifai even let you upload examples and do the training in the cloud. However, sometimes you want to own your data and your model and deploy on your own servers, rather than have an important part of your core business tech hosted outside your platform by a Cloud API provider whose model (or implementation) you have no real access to.
Buy vs. build: if you have a properly labeled dataset and are using computer vision results intensively for your product, the “buy versus build” equation will probably tip quickly towards developing your own algorithms, instead of paying per usage or API consumption.

Creating a functional implementation from a research paper is damn hard

Even if you are an expert in any of the Deep Learning frameworks, coming up with something that actually works is pretty damn hard. Small implementation details have no room in academic papers, and sometimes they can and do make a huge difference in the results. Moreover, many things can be implemented in several ways, and these might in turn also affect your results.

What are Luminoth’s main “selling” points?

State of the art algorithms: currently, we only support image classification and the Faster R-CNN model for object detection, but we are actively working on providing more models and keeping up to date with the research in the field.
Open Source: Luminoth is free and open source. You can download it, customize it for your needs and integrate it into your product or cloud.
Developer friendly: we’ve poured over 7 years of experience working on bridging the gap between academic Machine Learning findings and production ready software to make Luminoth accessible. We strived for an easy to use interface, beautiful code with comments, and unit tests. Of course, there is still a lot of room for improvement, and things will get better as the toolkit gets more mature.
Made with Google's TensorFlow and Sonnet. State of the art, reliable, robust and maintained frameworks.
Customizable: you can train Computer Vision models with your own data. You are not limited to existing datasets such as COCO or ImageNet.
Cloud integration: we strived for a super simple Google Cloud integration, specifically ML Engine. This means distributed training is very straightforward: there is no need to buy those GPUs.
Commercial support: if Machine Learning is not your company’s core skill, or if you want to try out Computer Vision features for your product but don’t have the resources to allocate people on your team for R&D, or you simply want to add a feature without putting your team through a learning curve, you can hire Tryolabs or any company with relevant experience using Luminoth to help you integrate computer vision into your product.

Luminoth “launch” tour

We are in the process of revealing Luminoth and starting to spread the word. This week (October 9th, 2017) we are traveling to London to start Luminoth's international launch.

We’ll be giving a talk at Google Campus London on Thursday 12th at noon, and on Saturday 14th at 11am we will be speaking at ODSC Europe, with that being our official release date. Talks will be given by Javier Rey, Tryolabs’ Lead Researcher, and Alan Descoins, our CTO.

Later in October we will be speaking at IEEE UruCon, and on November 2-4 we will be presenting in Silicon Valley, at ODSC West in San Francisco.

Lastly, we’ll finish the Luminoth launch tour speaking at MvdValley in November. Our talk will be about our experiences and lessons learned while launching international Machine Learning projects such as Luminoth and our spin-off company MonkeyLearn. We will also cover key techniques to sell Machine Learning related technologies internationally, particularly to the US market, focusing on Silicon Valley companies.

So, we have a very intense and fun ride ahead this month launching our new venture!

By the way, why the name?

If you are familiar with the Metroid saga, you will probably remember the Luminoth alien species. Among their qualities, they had special visors that improved their vision to outstanding levels.

The Dark Visor is a Visor upgrade in Metroid Prime 2: Echoes. Designed by the Luminoth during the war, it was used by the Champion of Aether, A-Kul, to penetrate Dark Aether's haze in battle against the Ing.

Closing remarks

We hope you enjoy using Luminoth and that it makes your life easier when integrating Deep Learning based Computer Vision technologies into your company’s products. Feel free to explore our GitHub repository. All feedback and contributions are welcome. If you find this useful, don’t forget to share and give Luminoth a star on GitHub! ⭐️

Update April 2018: We're announcing Luminoth 0.1! This new version brings several very exciting improvements, such as the implementation of the SSD model and pretrained checkpoints. Read more about it in this blog post!
