Whether the task is identifying people in photos, classifying animals or determining whether a photo contains a hot dog, deep learning has proven particularly skilled at recognition.
It has shown an almost magical capability to identify, segment and classify a wide range of image, video, audio and other unstructured data in ways that conventional programming approaches can't match.
Recognition applications of AI -- one of the seven primary AI application patterns identified by Cognilytica, the research firm that I work for -- have become so prominent across a wide range of application types that they have almost come to define the practical use of AI.
The field of AI research was given a big push forward by the development of deep learning approaches to neural networks, which was itself spurred by the need to solve some of the recognition problems that plagued previous approaches to machine learning.
In 2006, AI researcher Fei-Fei Li devised a way to test and expand capabilities of emerging machine learning tools. Along with others, she developed ImageNet, a large data set of well-labeled images that could be used to both train and test the capabilities of machine learning algorithms. Via crowdsourcing, it used armies of Amazon Mechanical Turk workers to do the job of labeling and tagging each image.
In 2012, researchers made waves when, for the first time, a deep learning approach called a convolutional neural network, originally pioneered by AI researcher Yann LeCun, achieved an accuracy rate of roughly 85% on ImageNet classification. This advancement helped fuel significant adoption and growth of AI in the years that followed. By 2017, 29 teams had greater than 95% accuracy, with the top leaders, a collection of Chinese firms and academic institutions, achieving classification capabilities that beat even human abilities at almost 98% accuracy. Clearly, the machine is winning at recognition.
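ImageNet classification results are commonly reported as top-5 accuracy: a prediction counts as correct if the true label appears among the model's five highest-scoring classes. Assuming the accuracy figures above follow that convention, the metric can be sketched in a few lines of Python. The function name and the toy scores below are illustrative only and are not taken from any benchmark toolkit.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Return the fraction of examples whose true label is among
    the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data (made up): three images scored over six classes.
EXAMPLE_SCORES = [
    [0.10, 0.50, 0.20, 0.05, 0.10, 0.05],  # true class 1 is ranked first
    [0.30, 0.10, 0.25, 0.20, 0.10, 0.05],  # true class 2 is ranked second
    [0.05, 0.10, 0.10, 0.15, 0.20, 0.40],  # true class 0 is ranked last
]
EXAMPLE_LABELS = [1, 2, 0]

if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, top_k_accuracy(EXAMPLE_SCORES, EXAMPLE_LABELS, k=k))
```

Because a top-5 score gives the model five guesses per image, it is always at least as high as the stricter top-1 score on the same data.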
Applications of image and object recognition
Today, enterprises and organizations of all types have access to these amazingly powerful and accurate recognition systems for a wide range of application types. Cloud providers and focused technology vendors have made it trivial to add image recognition to an application or edge device that needs recognition capabilities. Similarly, a new breed of chips and hardware offerings and software toolkits have optimized recognition models to the point that they can be embedded in smartphones, cameras and other devices that don't even need a permanent connection to the internet or a back-end cloud to function. As a result, we're seeing a tremendous expansion in the sorts of applications making use of deep learning-powered recognition.
Marketing and advertising teams at manufacturers are using recognition systems to automatically spot their ads, as well as ones for competing brands. This allows these companies to continuously monitor how their brands are performing in the market. E-commerce and retail companies are using recognition systems to automatically tag and classify products as they are uploaded to their catalog systems. Most notably, recognition systems have enabled the concept of "autonomous retail," best illustrated by Amazon Go, where customers can simply pick items off shelves and ever-watching cameras can note the items selected, which are automatically added to the customer's shopping cart. Customers can simply leave the store and the items are automatically billed to a card on file. Without recognition, these capabilities would not be possible.
The power of deep learning recognition systems has also enabled a revolution in autonomous vehicles. Deep learning-based recognition systems can keep a constant watch on the road, traffic and potential issues that might arise. With the combination of sensor data, real-time mapping and machine learning algorithms that can make instant decisions, autonomous vehicles are rapidly moving from a science-fiction fantasy to reality.
Similarly, recognition systems have revolutionized cameras and surveillance systems. Instead of having people continuously monitor dozens of cameras for potential security issues, recognition-based systems can automatically identify, classify and annotate whenever people, vehicles or even animals are visible in video footage. Security systems are increasingly using the power of recognition to enable semantic search of thousands of hours of footage to find instances of vehicles or people within the recordings.
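To make the search idea concrete, here is a minimal Python sketch of how annotations produced by a recognition model might be indexed and queried. It assumes the model has already emitted one record per detection; the `Detection` fields, camera names and confidence threshold are all hypothetical, and a simple label lookup stands in for the richer semantic search a commercial system would offer.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Detection:
    """One annotation emitted by a (hypothetical) recognition model."""
    camera: str
    timestamp_s: float  # seconds from the start of the recording
    label: str          # for example "person" or "vehicle"
    confidence: float   # model score between 0 and 1


def build_index(detections: Iterable[Detection]) -> Dict[str, List[Detection]]:
    """Group detections by label, ordered by camera and then by time."""
    index: Dict[str, List[Detection]] = defaultdict(list)
    for det in detections:
        index[det.label].append(det)
    for hits in index.values():
        hits.sort(key=lambda d: (d.camera, d.timestamp_s))
    return dict(index)


def search(
    index: Dict[str, List[Detection]],
    label: str,
    min_confidence: float = 0.5,
) -> List[Detection]:
    """Return detections of `label` at or above the confidence threshold."""
    return [d for d in index.get(label, []) if d.confidence >= min_confidence]


# Made-up annotations standing in for real model output.
FOOTAGE = [
    Detection("lobby", 12.0, "person", 0.91),
    Detection("garage", 48.5, "vehicle", 0.88),
    Detection("garage", 7.2, "vehicle", 0.42),
    Detection("lobby", 3.4, "person", 0.77),
    Detection("yard", 120.0, "animal", 0.65),
]
INDEX = build_index(FOOTAGE)

if __name__ == "__main__":
    for hit in search(INDEX, "person"):
        print(hit.camera, hit.timestamp_s, hit.confidence)
```

Indexing once and then filtering by label means an operator can jump straight to the relevant timestamps instead of scrubbing through the recordings.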
Deep learning-based recognition has enabled powerful, real-time facial recognition technology that has a broad range of uses. Already, we're seeing the use of facial recognition as biometric security to enable access to devices and systems. Companies are experimenting with the use of facial recognition as a means to facilitate payments and as a way of tailoring marketing and advertising content.
More controversially, law enforcement and military organizations are using facial recognition systems to identify suspects in video footage or potential targets for military actions. Since facial recognition systems are not guaranteed to be 100% accurate, reliance on these systems for law enforcement or military purposes has many feeling on edge. Reflecting those concerns, San Francisco's board of supervisors recently prohibited the city's police department and other local agencies from using facial recognition technology -- making it the first municipal government in the U.S. to do so. The board also required agencies to get its approval before deploying other types of automated surveillance tools.
Recognition systems also provide an effective means to do gesture recognition, which helps machines understand the hand and body motions of individuals to convey specific meaning such as commands. Specific use cases include video game interaction, retail experiences that immerse shoppers in relevant content, remote control systems that allow surgeons to virtually grasp and move objects, sign language interpretation, and enablement of gesture-based interfaces in vehicles and other locations where touch/swipe/type interfaces are not practical.
While image and facial recognition use cases make most of the news, deep learning-powered recognition systems are not limited to visual data. Deep learning recognition systems are also able to easily classify audio and sound information. Music recognition systems identify and recognize songs, music and instruments, as well as musical patterns. Audio recognition systems can identify whether there are people or animals in a location. Researchers have even used audio recognition systems to identify animals based on noises, songs and sounds.
Companies are also applying audio recognition technology to enforce copyright claims across the internet. Deep learning models can be trained to recognize fragments of songs, movies and other copyrighted materials and notify copyright holders if there is any unauthorized use of their material.
Moving beyond OCR -- deep learning-powered handwriting recognition
While optical character recognition (OCR) technology has been around for decades, it has primarily been limited to digitizing printed material into a form that computers can process. However, deep learning is kicking OCR up a significant notch by adding the power of recognition. Deep learning-based handwriting and text recognition systems give software the ability to understand and interpret handwritten content from documents and digital sources. Besides simply digitizing content, these systems can also extract meaning from these documents by automatically identifying things like names, addresses, personal information and even structures such as tables and form fields.
Document-intensive industries have been transformed through the use of this powerful deep learning-based recognition capability. Banking and finance firms have enabled remote check deposit and automatic document capture through apps with built-in recognition systems. Insurance and mortgage firms are using recognition systems for intelligent document processing and document analysis.
In fact, deep learning recognition systems are proving to be highly adept at classifying and identifying information across a wide range of unstructured data in the organization. According to Cognilytica's research, over 80% of the information in the enterprise is unstructured. This includes email, voice, image, video, text, documents and other forms of information that can't be easily queried or analyzed with traditional approaches. Deep learning, through the recognition of patterns, provides a way to tap this information and gain value by understanding the content.
Industry-specific applications of recognition systems
The recognition pattern of AI is being widely implemented across many industries. In particular, the medical imaging industry is expected to be transformed through the use of AI-enabled recognition. AI is augmenting radiologists by being a second set of eyes, making a first pass on X-ray, CT scan and MRI images to identify areas that are anomalous or indicative of potential medical problems for further evaluation. These AI-based systems scan images and flag problematic regions and are rapidly becoming the norm at hospitals. In fact, according to industry sources, by 2023, 90% of radiology images will be read by AI systems.
Likewise, AI-based recognition is making a dent in the trade of counterfeit goods. AI is helping find counterfeit goods such as purses, watches, sunglasses and other items by matching visual samples of the materials against databases of known counterfeit items. These systems are also helping to detect counterfeit drugs and provide much more thorough inspection of materials than is possible with human inspectors alone.
The insurance industry is also being transformed through recognition systems. Many auto insurance companies are using phone camera images to provide instant loss assessment information after a collision or other damage to a vehicle. These apps are providing real-time car damage assessment and can identify potential cost and other information necessary for claims processing. Likewise, property insurers are providing apps that homeowners can use to complete a photo tour of their commercial or residential property to document items that need to be insured, resulting in a more frictionless, faster process. After natural disasters or other major insurance events, insurance companies are using drones and satellite data backed by recognition systems to collect instant assessments of potential damage claims for individual properties, making handling of insurance claims faster and more accurate.
In many ways, we can see the power of AI through the lens of recognition and deep learning-powered computer vision. As it becomes clearer that these systems can handle recognition tasks previously thought to be beyond the reach of computers, organizations are finding that these capabilities can transform the very nature of how they operate.