Like it or not, we live in a day and age in which technological development is considered a decisive part of modern society.
What is both frustrating and exciting is that technology is changing so rapidly that it seems almost impossible to keep up, let alone predict what will happen next. And one of the fastest-moving, most influential, and arguably most fascinating technological advancements of all is image recognition.
1. What is Image Recognition?
Image recognition is a task within computer vision, and computer vision in turn is a branch of AI.
As mentioned in our article “AI vs. Machine Learning vs. Deep Learning: The simplified distinction”, Artificial Intelligence (also known as AI) is a computer system’s ability to mimic human characteristics and perform tasks that normally require human intelligence.
To build a convincing AI, we need what is known as “Computer Vision”. According to VentureBeat, it is the “ability of computers to acquire, process, and analyze data coming primarily from visual cue but could also include data from similar sources such as heat sensors, ultrasound, and so on”.
In short, computer vision allows machines to “see” things – sometimes, things that humans cannot. For example, Carnegie Mellon University in Pittsburgh (United States) is working on a computer vision application named “Breathe Cam”. Equipped with four cloud-connected cameras, it lets users monitor and document the air pollution they breathe, and even trace it back to its sources. Yes, it “sees” air quality.
However, to do what humans can’t, we must start with what humans can: seeing and labeling objects and creatures. That’s the primary function of image recognition.
TensorFlow, an open-source software library created by the Google Brain team, describes image recognition as the process by which computers break down photos or videos pixel by pixel, recognize patterns of shapes in order to “see” what’s in those images, and categorize them.
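To make the “pixel by pixel” idea more concrete, here is a minimal sketch in plain Python. It is not how TensorFlow works internally; the 5x5 image, the filter values and the function names are all made up for illustration. It treats an image as a grid of brightness numbers, slides a small filter over it to look for one simple pattern (a vertical line), and then assigns a category.

```python
# A tiny 5x5 grayscale "image": 0 is black, 255 is white.
# Made-up data: a bright vertical bar on a dark background.
IMAGE = [
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
]

# Hypothetical 3x3 filter that responds strongly when a bright
# column sits between two darker columns.
VERTICAL_LINE_FILTER = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]


def filter_response(image, kernel, top, left):
    """Sum of pixel * weight over the 3x3 window whose top-left corner is (top, left)."""
    return sum(
        image[top + r][left + c] * kernel[r][c]
        for r in range(3)
        for c in range(3)
    )


def response_map(image, kernel):
    """Slide the filter over every 3x3 window that fits inside the image."""
    rows = len(image) - 2
    cols = len(image[0]) - 2
    return [
        [filter_response(image, kernel, r, c) for c in range(cols)]
        for r in range(rows)
    ]


def categorize(image):
    """Label the image by the strongest filter response found anywhere in it."""
    strongest = max(max(row) for row in response_map(image, VERTICAL_LINE_FILTER))
    return "vertical line" if strongest > 0 else "no vertical line"


print(categorize(IMAGE))
```

A real system learns thousands of such filters from data instead of having one written by hand, but the three steps (pixels, patterns, category) are the same in spirit.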
For example, stock photo websites host millions of pictures and handle billions of searches every day. Typically, a website’s contributors have to add tags and descriptions to every single photo they upload so that it matches users’ search terms. With image recognition installed, as soon as the images are transferred to the server, the machine can automatically recognize who is who and what is what. It can then fill in the description with far more detail than a human typically would, which improves search results and enhances the user experience.
2. How to achieve Image Recognition?
For now, the most promising technique for giving a machine the ability to “see” is deep learning. In short, deep learning is a machine learning approach loosely modeled after the networks of neurons in the human brain, which gives computers the ability to learn on their own. Computers can therefore accurately identify what’s in pictures without being explicitly programmed with hand-coded rules and instructions, but they need a massive amount of data to do it.
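The idea of learning from labeled examples instead of hand-coded rules can be sketched with a single artificial neuron (a perceptron). This is a deliberately tiny stand-in, not a deep network: real deep learning stacks many layers of such units and trains on millions of images. The 2x2 “images”, their labels and all names below are invented for illustration.

```python
# Made-up training data: each 2x2 image is flattened to four brightness
# values (0 = black, 1 = white) and labeled 1 ("bright") or 0 ("dark").
TRAINING_DATA = [
    ([1.0, 1.0, 1.0, 1.0], 1),
    ([0.0, 0.0, 0.0, 0.0], 0),
    ([0.9, 0.8, 1.0, 0.7], 1),
    ([0.1, 0.2, 0.0, 0.3], 0),
    ([0.8, 0.9, 0.7, 1.0], 1),
    ([0.2, 0.1, 0.3, 0.0], 0),
]

LABEL_NAMES = {1: "bright", 0: "dark"}


def predict(weights, bias, pixels):
    """Return 1 if the weighted sum of the pixels plus the bias is positive, else 0."""
    score = sum(w * p for w, p in zip(weights, pixels)) + bias
    return 1 if score > 0 else 0


def train(data, epochs=10, learning_rate=0.1):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for pixels, label in data:
            error = label - predict(weights, bias, pixels)
            weights = [w + learning_rate * error * p for w, p in zip(weights, pixels)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(TRAINING_DATA)

# Two images the model has never seen before.
print(LABEL_NAMES[predict(weights, bias, [0.95, 0.9, 0.85, 0.9])])
print(LABEL_NAMES[predict(weights, bias, [0.05, 0.1, 0.0, 0.1])])
```

Nobody wrote a rule such as “bright means the pixels add up to more than 2”; the weights were adjusted from the examples alone, which is why the amount and quality of data matter so much.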
And massive amounts of data are what the world is working on, with ImageNet and PASCAL as prominent examples. Years in the making, these massive and free-to-anyone databases contain images tagged with keywords about what’s inside the pictures:
- ImageNet: Created by researchers from Princeton University back in 2009, this visual database holds the URLs of over 14 million images collected from search engines and photo-sharing sites like Flickr. During the compilation process, human annotators labeled the collected pictures and categorized them; the widely used ImageNet challenge subset spans approximately 1,000 object classes.
- PASCAL: A collaboration between various EU-based universities, the PASCAL Challenge is something of an underdog compared to the ImageNet database, with only about 20,000 training images categorized into 20 object classes.
As you might have guessed from the drastic difference in the number of classes, PASCAL has a more generic categorization system. ImageNet, by contrast, promotes one feature that is crucial for pushing image recognition technology forward: inter-class variability. Two images that each contain a different kind of the same species or object look different to the machine and are therefore categorized into different classes. For example, while a picture simply belongs to the “dog” category in PASCAL, it can be put into the “corgi dog”, “shepherd dog” or “pug dog” class in ImageNet.
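The difference between the two labeling schemes can be pictured as a simple lookup from fine-grained classes to generic ones. The sketch below is only an illustration: the dog labels come from the example above, while “tabby cat” and the mapping itself are hypothetical and not taken from either dataset’s actual class list.

```python
# Hypothetical mapping from fine-grained (ImageNet-style) labels
# to generic (PASCAL-style) labels.
FINE_TO_GENERIC = {
    "corgi dog": "dog",
    "shepherd dog": "dog",
    "pug dog": "dog",
    "tabby cat": "cat",  # invented extra entry for contrast
}


def to_generic(fine_label):
    """Collapse a fine-grained label into its generic category."""
    return FINE_TO_GENERIC[fine_label]


def count_classes(labels):
    """Number of distinct classes present in a list of labels."""
    return len(set(labels))


fine_labels = ["corgi dog", "pug dog", "shepherd dog", "tabby cat"]
generic_labels = [to_generic(label) for label in fine_labels]

print(count_classes(fine_labels), "fine-grained classes")
print(count_classes(generic_labels), "generic classes")
```

The same four pictures give a model four distinctions to learn under the fine-grained scheme but only two under the generic one, which is why finer labels push recognition technology harder.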
3. Why do you need to invest in Image Recognition now?
It seems like everybody’s doing it, doesn’t it? That’s because they are.
In 2012, Qualcomm Connected Experiences, Inc. first introduced Vuforia, a software platform that uses image recognition to offer a wide set of AR and VR-related features and gives mobile app developers the freedom to extend their vision.
Facebook began helping blind people ‘view’ photos and images in 2016. Using image recognition, the Facebook iOS app generates a description for every photo and reads it out loud to the user.
Fast forward to earlier this year: Google, one of the world’s most noteworthy AI companies, introduced Cloud AutoML, a tool designed to simplify the process of applying AI in business operations. Cloud AutoML is starting out with image recognition, allowing Google’s customers to drag in images and teach their systems to recognize them on Google’s cloud. It’s already being used by corporations such as Disney and Urban Outfitters to make searches on their websites more relevant to customers’ demands.
But it’s not just a league for the big players. According to an analysis by Bloomberg’s Chief Economist McDonough, since mid-2015, the number of corporate earnings calls that mention “AI” or “AI companies” has skyrocketed. In fact, 80% of the companies surveyed report that they have AI applications in production.
Why are billions of dollars being thrown into this technology? Our guess: Potential. Humongous potential.
On its own, image recognition can be a very abstract field. But when put into context, its potential to transform businesses is hard to dispute. Let’s look at several potential image recognition applications in various industries and business processes:
- Healthcare: One of image recognition’s most prominent abilities is assisting in the creation of Augmented Reality (AR), a technology that “superimposes a computer-generated image on a user’s view of the real world”. Give an AI the AR technology and a database containing visual cues of diseases or illnesses, and you have yourself a medical assistant who never forgets. With it, doctors can get real-time, detailed diagnostic suggestions projected onto the patient’s wounds or the medical documents during examinations.
- Education: Image recognition can allow students with learning difficulties and disabilities to obtain the education they need, in a form they can perceive. Apps powered by computer vision can offer text-to-speech and image-to-speech features that help students with impaired vision or dyslexia ‘read’ the content provided.
- Food and Beverage: By employing image recognition, a simple smartphone app can catch visual cues in images uploaded to Instagram and Facebook, analyze them and offer live data. For example, based on the photos, the app could tell you whether a cafe in Singapore is frequented by families and friends, or whether it’s a wild place to party. This way, customers receive customized local suggestions at a glance, while restaurants can effectively reach out to their target audience.
- E-commerce: Imagine a customer seeing something they would like to buy on the street. They have nobody to ask where to get it, so they snap a picture. Then, the customer uploads it to an E-commerce site that is equipped with image recognition technology. The algorithm can ‘see’ the picture, scan through the millions of options available and recommend the one that looks identical, or at least the closest, to what the customer is looking for. This is exactly what Savvycom had in mind when it founded the new AI Lab back in March 2018. Our engineers are currently developing an Artificial Intelligence Visual Search tool to make use of a large E-commerce database of thousands of products and enhance the E-commerce experience.
- Business Process Management: A more advanced image recognition system can also assist with identification during business operations. For example, the machine can provide face-based identification, replacing the traditional ID cards used to determine whether a person is allowed to perform a certain task: accessing document storage, attending meetings, or simply checking in to work. However, we acknowledge that ‘seeing’ and ‘recognizing’ a human face is much more complex than identifying an object, because faces change with emotional expression and makeup. Therefore, Savvycom is aiming to tackle this area as soon as possible with upcoming projects.
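The visual search idea from the E-commerce example can be sketched as a nearest-neighbor lookup: every product image is reduced to a list of numbers (a feature vector), and the customer’s photo is matched to the products whose vectors point in the most similar direction. This is a generic illustration under stated assumptions, not Savvycom’s tool: the catalog, the three-number vectors (imagined here as “redness”, “blueness” and “shoe-likeness”) and the function names are all hypothetical, and in practice the vectors would come from a trained neural network.

```python
import math

# Hypothetical catalog: product name -> made-up feature vector
# [redness, blueness, shoe-likeness].
CATALOG = {
    "red sneaker": [0.9, 0.1, 0.8],
    "blue sneaker": [0.1, 0.9, 0.8],
    "red handbag": [0.9, 0.1, 0.1],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_products(query, catalog, top_k=2):
    """Return the names of the top_k products most similar to the query vector."""
    ranked = sorted(
        catalog,
        key=lambda name: cosine_similarity(query, catalog[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up features for the customer's street photo of a reddish sneaker.
photo_features = [0.8, 0.2, 0.7]
print(closest_products(photo_features, CATALOG))
```

With millions of products, comparing against every item becomes too slow, so production systems usually rely on approximate nearest-neighbor indexes; the ranking principle stays the same.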
4. Are there barriers?
Image recognition is not a new field, but from a bird’s-eye view, it is still in its early stages. And like any growing teenager, it has problems adapting to the real world.
Remember the “80% of organizations report they have AI applications in production” statistic from earlier? Within the same group of companies, around 33% said the biggest barrier to adoption is the perception that AI technology is immature and unproven. 34% have a hard time recruiting talented engineers, and 40% stated that IT infrastructure is impeding progress, which can easily take a toll on a company’s financial strength.
Money is also an issue. Thanks to the growing number of open-source software libraries such as Microsoft CNTK and Accord.NET, machine-learning enthusiasts can conduct research and studies at little to no cost. However, not everything is available, because not everything is known yet. Companies still have a long way to go and a budget to balance in order to realize their product ideas.
There is one solution that addresses many of the issues mentioned above: outsourcing. With a concentrated focus on skill sets and expertise, IT outsourcing companies offer the assurance of higher-end tools and best-practice operations at a predictable management cost. In short, they know what they are doing. That’s their job.
To sum up, image recognition is an early sign of the future of computer vision. No matter how it is approached or which industries it is applied to, image recognition will never be achieved in isolation. It can only be made stronger through access to more pictures, real-time data, time and effort. The businesses that realize this, make the most of these connections and prepare head-on are the ones that are set up for success.