By Mike Hollinger, Chief Architect, Computer Vision Platforms at IBM
Computer-vision models used for visual inspection are improving rapidly, thanks to tireless work by AI researchers who keep getting better at training models to identify what's in images, just as humans do, but on a much larger scale. As a result, visual inspection is now widely adopted in industrial, retail, infrastructure and even medical settings.
The most obvious use case for visual inspection is spotting flaws and problems. In infrastructure, for example, engineers can train defect-detection models that help inspectors spot issues such as cracks or concrete spalling in structures.
For inspectors in the field, AI-assisted defect detection can make the job vastly easier: an inspector can take a picture on a phone and run a detection model right on the device to highlight likely problems. By capturing image data through fixed cameras or drones, organizations can also monitor structures and prioritize repairs remotely. In these ways, visual inspection can play an important role in mitigating the effects of deferred maintenance and repairs on infrastructure that is already under strain around the world.
In manufacturing, visual inspection is no longer used only to identify flaws on assembly lines; it is also driving the development of more sophisticated smart factories. Where traditional computer-vision techniques relied on perfect lighting and pixel-perfect pictures, and took weeks or months to configure, AI-powered deep-learning models for visual inspection can enable whole new use cases in hours and readily adapt to the variation found in the real world.
These new machine-vision models can help organizations optimize the spacing of their employees, equipment and other assets. They can help track when certain supplies are being used and will soon need to be replenished. And they can make it easier to identify crowding or unsafe conditions. Retailers are using visual inspection to analyze a shopper's cart and determine the bill, and the medical community is increasingly using it to assess the severity of conditions like cataracts and tumors.
One reason for the recent explosion of use cases is that machine-vision models for visual inspection have become vastly easier to build. In the past, building one required both AI expertise and relevant subject-matter expertise. In other words, you not only needed to understand how to build AI models but also needed to know what the model was looking for, whether cracks in a bridge or the anatomy of the human eye. You also needed massive sets of image data with which to train the model: tens of thousands or even millions of images.
Thanks to recent innovations in deep-learning model training, that's no longer the case. Techniques like transfer learning, semi-automated labeling and built-in data augmentation make it possible to train a visual-inspection model with a relatively small number of images, as few as a few dozen. Further automation of the modeling process means that even non-experts in AI can create models of their own, suited to whatever purpose they like, in only a few hours. This democratization of AI will have ramifications in industries ranging from healthcare and infrastructure to sustainability and conservation, as well as in entirely new use cases we haven't even thought of yet.
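Augmentation is one reason small datasets go so much further: each labeled image can be turned into many plausible variants before training. As a minimal, hedged sketch (not any particular product's API), the Python below generates the eight flip-and-rotation orientations of an image represented as a simple nested list of pixel values; real pipelines work on tensors and add crops, noise and color jitter as well.

```python
# Illustrative geometric augmentation: one labeled image becomes up to
# eight training examples. Images are nested lists of pixel values
# (a deliberate simplification for the sketch).

def hflip(img):
    """Mirror an image left-to-right."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate an image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img):
    """Return the 8 orientations: 4 rotations, each with an optional flip.
    (For symmetric images some variants may coincide.)"""
    variants = []
    current = img
    for _ in range(4):
        variants.append(current)
        variants.append(hflip(current))
        current = rot90(current)
    return variants

# A tiny 2x2 "image": a few dozen labeled photos, augmented this way,
# yield hundreds of training examples.
image = [[1, 2],
         [3, 4]]
print(len(augment(image)))  # prints 8
```

Transfer learning compounds the effect: instead of learning to see from scratch, a model reuses features from a network pretrained on millions of generic images, so only a small task-specific layer needs to be fit to the few dozen augmented examples.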
Visual-inspection systems can also analyze image data around the clock. A model could help airports understand exactly when and how planes, passengers and their luggage move through these complex operations, and use that knowledge to dramatically reduce delays. A small shop could use one to keep an eye on shelves and queues, alerting the shopkeeper when goods need restocking or a customer needs help.
When technology gets not only easier to use, but easier to personalize, it becomes applicable in countless new settings that its early adopters may never have anticipated. And that’s where the true potential starts to be unlocked.