In a recent project, our engineering teams were asked to develop an application that scans codes on product packaging and identifies the product or brand each code belongs to. We had three days to build a proof of concept showing that the client’s loyalty app could recognize products and infer both the brand or product and the promotional code from an image.
The solution required us to create a web application that can capture and process images. Before we dig deeper into the project, I want to share how artificial intelligence and related technologies were used in the solution.
Artificial intelligence is the theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
A few of the subfields or applications of AI are:
- Machine learning
- Neural networks
- Deep learning
- Computer vision
- Natural Language Processing (NLP)
Convolutional neural network
Neural networks can solve a number of AI challenges. In deep learning, a convolutional neural network (CNN) is a class of deep neural networks most commonly applied to analyzing visual imagery. You may also hear a CNN referred to as a shift-invariant or space-invariant artificial neural network.
A general-purpose neural network designed to process and categorize one particular set of images may sacrifice generalizability. A CNN is a better alternative: it reuses the same small filters across the whole image, which keeps the processing of large images manageable.
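To make the "shift invariant" label concrete, here is a minimal sketch in plain Python of the sliding-window operation a convolutional layer performs. The 2x2 kernel and the tiny 4x4 images are made up for illustration and have nothing to do with the client data. Because the same kernel weights are reused at every position, moving a pattern in the input simply moves the response in the feature map, and a global max pooling step then gives the same score wherever the pattern sits.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def global_max(feature_map):
    """Global max pooling: keep only the strongest response in the map."""
    return max(max(row) for row in feature_map)


# Hypothetical kernel that responds most strongly to a bright 2x2 blob.
KERNEL = [[1, 1],
          [1, 1]]

# The same blob placed in the top-left and in the bottom-right corner.
TOP_LEFT = [[1, 1, 0, 0],
            [1, 1, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]
BOTTOM_RIGHT = [[0, 0, 0, 0],
                [0, 0, 0, 0],
                [0, 0, 1, 1],
                [0, 0, 1, 1]]

map_a = conv2d(TOP_LEFT, KERNEL)      # peak in the top-left cell
map_b = conv2d(BOTTOM_RIGHT, KERNEL)  # same peak, shifted to the bottom-right cell

# After pooling, both images produce the same score.
print(global_max(map_a), global_max(map_b))
```

A real CNN learns its kernel values during training and stacks many such layers, but the weight sharing shown here is what keeps the parameter count small for large images.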
Image recognition is the ability of computer programs to identify things such as places, objects, and actions in images. It has become one of the biggest applications of AI.
When humans look at an image or video, we do not have to study it before we can tell what it is. Image recognition is a machine-learning or deep-learning approach designed to resemble the way a human brain functions. Using deep learning, machines are taught to recognize certain things within an image or video.
Since we were operating on a tight deadline, we needed to leverage existing cloud services to build a prototype of a fully functional application.
First, we leveraged Amazon Web Services (AWS), specifically Amazon S3 and AWS Rekognition, for optical character recognition (OCR) to identify promotional codes in an image. We chose AWS Rekognition because it is fairly accurate out of the box.
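As a rough illustration of the OCR step, the sketch below asks Rekognition to read text from an image that is already in S3 and keeps only the full lines it is reasonably sure about. It assumes the Python SDK (boto3) and its `detect_text` call. The function names, the 90% confidence threshold, and the bucket and key in the usage note are our own placeholders, not details from the project.

```python
def detect_lines(rekognition, bucket, key, min_confidence=90.0):
    """Return the text lines Rekognition finds in an image stored in S3.

    `rekognition` is a boto3 Rekognition client (or any object exposing the
    same detect_text method). The threshold is an illustrative assumption.
    """
    response = rekognition.detect_text(
        Image={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return [
        detection["DetectedText"]
        for detection in response["TextDetections"]
        # Rekognition reports both whole lines and single words; keep lines.
        if detection["Type"] == "LINE"
        and detection["Confidence"] >= min_confidence
    ]


def make_client():
    """Create a real client; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so detect_lines can be tried without it

    return boto3.client("rekognition")
```

With credentials in place, a call such as `detect_lines(make_client(), "example-bucket", "uploads/pack.jpg")` would return the detected lines as a list of strings. Picking out the promotional code from those lines is a separate step that depends on the code format.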
As brand recognition was vital to our application, we built our own deep-learning model based on pictures our team took of several client products. Using brands specific to our client gave our image recognition concept realistic material to demonstrate in our presentation.
After training the neural network with enough data and analyzing the results, we chose to create a five-layer-deep neural network.
As we learned earlier, CNNs perform better than other neural networks for image recognition. Our team built a custom image recognition model, a five-layer-deep CNN, using Google’s TensorFlow library in Python, along with a web service in Flask that consumes images and categorizes client products in real time. Both the model and web service were hosted in Microsoft Azure.
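For readers who want a feel for what such a model looks like in code, here is a minimal sketch using TensorFlow's Keras API. The exact architecture is not spelled out above, so everything specific here is an assumption: we count three convolutional layers plus two dense layers as the five layers, use 128x128 RGB inputs already scaled to the 0 to 1 range, and invent the filter counts and label names.

```python
def build_model(num_classes, image_size=(128, 128)):
    """Build a small CNN with five weight layers (3 conv + 2 dense).

    Layer sizes are illustrative guesses, not the project's real values.
    """
    import tensorflow as tf  # imported lazily; only needed to build the model

    layers = tf.keras.layers
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(image_size[0], image_size[1], 3)),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def top_prediction(probabilities, labels):
    """Turn one softmax output vector into a (label, probability) pair."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best], probabilities[best]
```

A web service would typically load the trained model once at startup, run `model.predict` on each incoming image, and pass the resulting probability vector to something like `top_prediction` to produce the response.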
We also built a proxy web service that communicates with both the AWS Rekognition web service and the image recognition web service, so the web app has a single service to consume.
In a very short timeframe, we built a fully functioning application with image-capturing capability and an accuracy rate of 81%. When an image is captured via the mobile app, it is sent to the AWS Lambda Proxy Function as input. The image is then stored in Amazon S3, and the AWS Lambda Proxy Function calls AWS Rekognition for OCR and the image recognition web service hosted in Microsoft Azure for product recognition.
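The request flow just described can be sketched as a single function. This is an outline under our own assumptions, not the project's actual Lambda code: the three collaborators (`store`, `run_ocr`, `classify`) are hypothetical callables standing in for the S3 upload, the Rekognition OCR call, and the HTTP call to the Azure-hosted model, and the shape of the returned dictionary is invented for illustration.

```python
def handle_image(image_bytes, key, store, run_ocr, classify):
    """Proxy flow: store the image, then gather OCR and product results.

    store(key, image_bytes)  -> saves the image (for example, to S3)
    run_ocr(key)             -> list of text strings read from the stored image
    classify(image_bytes)    -> product prediction from the recognition service
    """
    # 1. Persist the captured image so the OCR service can read it.
    store(key, image_bytes)

    # 2. Read promotional-code candidates from the stored image.
    codes = run_ocr(key)

    # 3. Ask the image recognition service which product this is.
    product = classify(image_bytes)

    # 4. Return one combined answer to the app.
    return {"image_key": key, "codes": codes, "product": product}
```

Passing the collaborators in as arguments keeps the flow easy to test with stand-ins. Since the OCR and classification calls do not depend on each other, they could also be issued concurrently to reduce response time.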
Now, we’re looking to further advance and optimize the image recognition application for our client. First, we’re building a custom model for promotional-code detection, which requires a lot of data, research, and images of the font used in the promotional codes so the machine-learning model can better detect that specific font for improved results.
Next, we’ll work to train the model with more data and client products. We’ll also explore machine-learning models to use in pre-processing and product recognition.