In science, artificial intelligence is intelligence demonstrated by machines, as opposed to the natural intelligence demonstrated by humans. Typically, the term “artificial intelligence” is used to describe machines (or computers) that mimic the “cognitive” functions humans associate with the human mind, such as “learning” and “problem solving”.
AI was established as an academic discipline in 1956, but it was only with the information technology boom of the Industry 4.0 era that artificial intelligence really shook the world. By 2018, AI was being applied in many aspects of life and had become an inevitable trend across the economic, scientific, and educational spheres of mankind.
Artificial intelligence is likely to become the most disruptive technology of the next 10 years, thanks to advances in computing power and leaps in the volume, speed, and diversity of data.
Research on artificial intelligence for new products is creating remarkable applications, as machines and devices come ever closer to human capabilities.
In many fields, machines with artificial intelligence can even outperform humans. More than 10 years ago, many people were skeptical of the prediction that “50 years from now humanity will have a computer that recognizes images as well as the human eye”.
But in fact, only 10 years later, image-recognizing computers had appeared. By 2016, many machines had surpassed the human eye at recognizing and analyzing images. Today, artificial intelligence is present in most industries and fields, changing daily life.
Computer vision is considered one of the fields of artificial intelligence (AI), and it powers major modern applications.
1. What is computer vision?
Computer vision is a technology that describes a machine’s ability to receive and analyze visual data and then make decisions based on it. Put simply, it is a field at the intersection of artificial intelligence and computer science that gives machines vision and human-like recognition processing.
This technology already has real applications: at the consumer level, it is used in unmanned aircraft for obstacle avoidance, and the same applies to cars from Tesla and Volvo.
Computer vision refers to the entire process of simulating human vision in a non-biological apparatus. This includes capturing the initial image, detecting and recognizing objects, recognizing the temporal context between scenes, and developing a high-level understanding of what is happening over the relevant time period.
This technology has long been commonplace in science fiction, and as such it is often taken for granted. In practice, a system that provides reliable, accurate, real-time computer vision is a challenging problem that has yet to be fully solved.
As these systems mature, there will be countless applications that rely on computer vision as a key component. Typical examples are self-driving cars, autonomous robots, drones, surgical-assisted intelligent medical imaging devices, and surgical implants that restore human vision.
2. Why is computer vision necessary?
Computer vision allows computers, robots, computer-controlled vehicles, and everything from factory and farm equipment to cars and airplanes to operate more efficiently, and even more safely, through some degree of automation.
Its importance has become more apparent in the digital age. We’ve seen the technology applied in Google Photos, helping users organize and access their image collections without tagging or marking them up. Notably, it keeps working even as the number of images shared every day reaches the billions; for humans, manual handling at that scale is impossible.
A study last year by photo printing service Photoworld showed that it would take a person 10 years just to look through all the pictures shared on Snapchat in a single hour, let alone classify them. And of course, over those 10 years the number of photos would keep growing exponentially. Today’s world is flooded with digital images, and we need these computer technologies to handle it all; it is beyond human capacity.
3. How does computer vision work?
On a certain level, this is all pattern recognition. The way to train a computer to understand real image data is to feed it images, lots of them, possibly thousands or millions, organized and labeled in advance.
As a next step, software developers draw up an algorithm, using various software techniques, that allows the computer to detect patterns in the images and relate them to those labels.
For example, if you feed a computer a million images of penguins, computer vision algorithms will analyze the colors in the images, the shapes, and the distances between parts of each shape. At the end of this process, the computer can apply its experience to new, unlabelled images and identify those that show penguins.
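The penguin example boils down to three steps: collect labeled images, learn a summary of each class, then compare new images against what was learned. The sketch below illustrates the idea with tiny synthetic “images” and a nearest-centroid rule standing in for a real trained model; everything here, from the toy images to the classifier, is illustrative rather than a production pipeline.

```python
import numpy as np

# Toy "images": 4x4 grayscale arrays. Label 1 = bright blob in the center,
# label 0 = mostly dark. In practice the training set would hold thousands
# or millions of real, labeled photographs.
rng = np.random.default_rng(0)

def make_image(label):
    img = rng.random((4, 4)) * 0.2          # dim background noise
    if label == 1:
        img[1:3, 1:3] += 0.8                # bright central "object"
    return img.ravel()

X = np.array([make_image(i % 2) for i in range(200)])
y = np.array([i % 2 for i in range(200)])

# "Training": learn one average pattern (centroid) per label.
centroids = {c: X[y == c].mean(axis=0) for c in (0, 1)}

def classify(img):
    # Assign the label whose learned pattern is closest to the new image.
    return min(centroids, key=lambda c: np.linalg.norm(img - centroids[c]))

# An unlabeled image with a bright center is recognized as class 1.
print(classify(make_image(1)))  # → 1
```

Real systems replace the centroid rule with far more expressive models, but the workflow of labeled examples in, learned patterns out is the same.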
4. Computer vision in the past and current trends
Traditional computer vision systems are combinations of algorithms that work together in an effort to solve the aforementioned tasks. The main goal is to extract features from an image, including side tasks such as edge detection, corner detection, and color-based segmentation. The accuracy of the algorithms used to extract features depends on the design and flexibility of each algorithm.
Examples of traditional feature extraction algorithms are Scale-invariant feature transform (SIFT), Speeded up robust features (SURF) and Binary Robust Independent Elementary Features (BRIEF). Different algorithms perform with varying degrees of success, depending on the type and quality of the image used as input. Ultimately, the accuracy of the entire system depends on the methods used to extract features. Once the features have been extracted, the analysis is performed using traditional Machine Learning methods.
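Edge detection, one of the side tasks mentioned above, is a good example of a hand-designed feature extractor. Below is a minimal Sobel-filter sketch in plain NumPy; the naive convolution loop and the synthetic test image are for illustration only, and libraries normally provide much faster implementations.

```python
import numpy as np

# Hand-designed Sobel kernels: the "feature" (edges) is defined by the
# algorithm designer, which is the hallmark of the traditional approach.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def convolve2d(img, kernel):
    """Naive valid-mode sliding-window filter, written out for clarity."""
    h, w = img.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def edge_magnitude(img):
    gx = convolve2d(img, SOBEL_X)   # horizontal intensity change
    gy = convolve2d(img, SOBEL_Y)   # vertical intensity change
    return np.hypot(gx, gy)

# A vertical step edge: dark left half, bright right half.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
mag = edge_magnitude(img)
# The response peaks along the boundary and is zero in flat regions.
print(mag.max() > 0 and mag[:, 0].max() == 0)  # → True
```

SIFT, SURF, and BRIEF follow the same philosophy at a higher level of sophistication: a human decides what counts as a feature, and the code computes it.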
The main problem with this approach is that the system needs to be told what features to look for in the image. Because the algorithm behaves as its designer defined it, the extracted features are essentially designed by humans. In such implementations, poor performance can be addressed through fine-tuning, such as by adjusting parameters or by modifying the code to change the behavior. However, changes like these must be made manually and are hard-coded for one specific application.
Current trends from Deep Learning
Although there are still significant obstacles on computer vision’s path to “human level” performance, deep learning systems have made significant progress in handling a number of the related sub-tasks. The reason for this success is partly the additional responsibility delegated to the deep learning system itself.
It is reasonable to say that the biggest difference with deep learning systems is that they no longer need to be programmed to look for specific characteristics. Instead of searching for specific features using a carefully programmed algorithm, neural networks inside deep learning systems are trained. For example, if the car in the image is misclassified as a motorcycle, then you don’t tweak the parameters or rewrite the algorithm. Instead, you keep training until the system gets it right.
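The “keep training until the system gets it right” loop can be sketched with a toy model. A single linear layer with a sigmoid stands in for the neural network here; this is a hypothetical minimal setup chosen for brevity, not a real vision architecture, and the data is synthetic.

```python
import numpy as np

# Tiny stand-in for a neural network: one linear layer + sigmoid, trained
# by gradient descent. The point is the workflow, not the architecture:
# when the model is wrong, we do not edit feature-extraction code, we
# keep running training steps until the error drops.
rng = np.random.default_rng(1)
X = rng.random((100, 2))
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)    # the pattern to be learned

w = np.zeros(2)
b = 0.0
for step in range(2000):                       # "keep training until right"
    p = 1 / (1 + np.exp(-(X @ w + b)))         # forward pass
    grad = p - y                               # cross-entropy gradient
    w -= 0.5 * (X.T @ grad) / len(y)           # update weights, not code
    b -= 0.5 * grad.mean()

accuracy = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
print(accuracy > 0.9)  # → True
```

Contrast this with the traditional pipeline: here no one wrote a rule saying what distinguishes the two classes; the weights absorbed it from labeled examples.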
With the increased computing power available to modern deep learning systems, there is steady and noticeable progress towards the point where a computer will be able to recognize and react to everything it sees.
5. Application of computer vision in practice
Defect detection
This is probably the most popular application of computer vision. Until now, detecting faulty components was usually carried out by designated human inspectors, who by their nature could not monitor an entire production process.
With computer vision, we can check for the tiniest flaws, from metal cracks and paint defects to bad prints, down to sizes below 0.05 mm. This processing is many times faster and more reliable than the human eye. The algorithm is designed and trained for each specific application using images of defective and defect-free samples.
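A drastically simplified version of such an inspection step can be sketched as a comparison against a defect-free reference image. The tolerance value and the toy “crack” below are made up for illustration; real systems rely on trained models rather than a fixed pixel threshold.

```python
import numpy as np

# Minimal defect-check sketch: compare each inspected image against a
# known defect-free reference and flag pixels whose deviation exceeds a
# tolerance. Deviations from the "good" appearance are treated as defects.
TOLERANCE = 0.2   # illustrative threshold, not a real calibration value

def find_defects(reference, inspected):
    """Return the (row, col) coordinates of suspect pixels."""
    diff = np.abs(inspected - reference)
    return np.argwhere(diff > TOLERANCE)

reference = np.full((5, 5), 0.5)      # uniform defect-free surface
inspected = reference.copy()
inspected[2, 3] = 0.9                 # a tiny "crack"

print(find_defects(reference, inspected))  # → [[2 3]]
```

In production, lighting variation and part alignment make a naive subtraction unreliable, which is exactly why the trained, application-specific models described above are used instead.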
If you’ve ever used the Google Translate app, you’ve probably discovered the ability to point your smartphone camera at text in any of a large number of languages and translate it almost instantly. The app uses optical character recognition (OCR) algorithms to extract the text, producing an accurate translation that is then rendered as an overlay on top of the actual text.
You’ve probably seen driverless cars on TV; this field relies heavily on computer vision and deep learning. While it’s not yet time to completely replace human drivers, autonomous-vehicle technology has advanced significantly over the past few years.
AI technology analyzes data collected from millions of drivers, learning from driving behavior to automatically find lanes, estimate road curves, detect hazards, and interpret traffic signs and signals.
To assist humans with identifying and organizing information, computer vision tools and deep learning models have been brought into research, which requires large volumes of labeled data. As deep learning algorithms evolve, they largely replace the manual tagging process through an approach known as crowdsourcing: real-time collection and tagging of data created by experts, from which machine learning starts the process of recognizing objects.
Great advances are continuously appearing in the fields of pattern recognition and image processing. It is therefore not surprising that the medical community and healthcare professionals consider medical imaging (a technique for creating visual images of the inside of the body for clinical analysis and medical intervention, as well as for visually representing the function of certain organs or tissues) an essential part of the way they work, leading to better diagnostic tools and significantly increasing the likelihood of more effective action being taken.
Medical image analysis is a big help for predictive and therapeutic analysis. For example, computer vision applied to endoscopic imaging could increase the validity and reliability of data to reduce colorectal cancer-related mortality.
In another example, computer vision technology also provides technical assistance for surgery. 3D imaging models of the skull, as part of brain tumor treatment, offer great potential in advanced neurosurgery preparation. In addition, as deep learning is increasingly used in AI technologies, leveraging it to classify lung nodules has made great progress for the early diagnosis of lung cancer.
Another typical example of applied computer vision in this field is IBM Watson for Oncology – An optimized cancer treatment protocol using advanced artificial intelligence from the United States. This regimen has been applied in more than 230 hospitals and 13 countries, including Vietnam.
The IBM Watson for Oncology system can identify, evaluate, and compare treatment options for each specific patient; in evaluations, its cancer-treatment recommendations agreed with those of the expert diagnostic panel about 83% of the time.
About IBM Watson for Oncology: https://www.ibm.com/products/clinical-decision-support-oncology
Computer vision is being used in stores more and more, especially to help improve the customer experience. Pinterest Lens is a search engine that uses computer vision to detect objects. By using smartphone apps in stores, you can visualize what a product looks like and get other products related to it.
Facial recognition is a well-known computer vision application that can be used in a shopping mall or in a store. Lolli & Pops, a candy store based in the US, is using facial recognition for its loyalty program. “Imagine: you walk into your favorite store and the salesperson greets you by name, and any time you need it, they share with you the latest products you might be interested in.” Technological innovation like this can offer personalized recommendations specific to each customer.
There seems to be no limit to the use cases of computer vision in retail; they also include analysis of store shelves or floors, and even analysis of customer moods: algorithms detect emotions from video images, analyze the smallest facial expressions, process them, and finally interpret the overall emotion.
Ending the checkout queue could be the ultimate goal of technological innovation in stores. Amazon has developed a new model, Amazon Go, that leverages technologies including computer vision, IoT, and AI to detect, track, and analyze customer behavior and actions in-store, automating the payment process and sending customers e-invoices.
When it comes to linking AI technologies to banking, we mostly think of fraud detection. While that is an area of particular focus for cutting-edge technology in the field, computer vision can improve much more. Image recognition applications that use machine learning to classify and extract data for authenticating documents such as ID cards or driver’s licenses can improve the remote customer experience and enhance security.
Drone based fire detection
The wide and varied use of computer vision also extends to the security sector. Drones, or UAVs, can leverage computer vision systems to enhance human detection of wildfires, using infrared (IR) imaging as part of their forest-fire monitoring protocols. Advanced algorithms analyze video characteristics such as motion and brightness to detect fire, extracting targeted features to reveal patterns and to distinguish actual fires from movement that could be misinterpreted as fire.
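A toy version of the brightness-plus-motion cue described above: a region is flagged as possible fire only if it is both hot or bright (high IR intensity) and flickering (changing between consecutive frames). The thresholds and frames here are invented for illustration and bear no relation to real detector calibrations.

```python
import numpy as np

# Illustrative thresholds only: real systems calibrate these per sensor.
BRIGHTNESS_T = 0.8
MOTION_T = 0.1

def fire_mask(prev_frame, frame):
    bright = frame > BRIGHTNESS_T                   # hot/bright regions
    moving = np.abs(frame - prev_frame) > MOTION_T  # flickering regions
    return bright & moving                          # fire needs both cues

prev_frame = np.zeros((4, 4))
frame = np.zeros((4, 4))
frame[1, 1] = 0.95        # bright AND newly appeared: candidate fire
prev_frame[2, 2] = 0.95
frame[2, 2] = 0.95        # bright but static: e.g. a sunlit rooftop

print(np.argwhere(fire_mask(prev_frame, frame)))  # → [[1 1]]
```

Requiring both cues together is what suppresses the false positives the text mentions: a bright but static object, or a moving but cool one, is rejected.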
Drones can also improve the security and efficiency of fire operations by monitoring or studying hazardous areas. Firefighters can run advanced algorithm-driven analytics to check for smoke and fire, thereby assessing risk and making predictions about fire spread.
Facial recognition maps and stores digital identities using deep learning algorithms. This type of biometric identification is comparable to the now-popular voice, iris, and fingerprint recognition technologies.
The concept dates back to 2011 when Google demonstrated that it was possible to create a face detector using only unlabeled images. They designed a system that can learn to detect cat images without having to explain to the system what a cat looks like.
At the time, the neural network ran on 1,000 computers comprising 16,000 cores and was fed 10 million random YouTube videos. Dr. J. Dean, who worked on the project, explained in an interview with the New York Times that they never told the system during training that “this is a cat,” so it essentially invented the concept of a cat on its own.
Nowadays, smartphones can use high-quality cameras for identification. For example, Apple’s iPhone X runs Face ID technology so users can unlock their phone. This facial data is encrypted and stored securely on the device, and it can also be used for other purposes, such as authentication at checkout.
Computer vision is being used in the security field to find criminals, predict emergency crowd movements, and more. As ever more complex and efficient computer vision algorithms are developed, its results keep improving, as do those of human speech recognition, since both topics rest on similar principles. All of this enhances the situational awareness of AI systems and robots.
Alongside the advantages computer vision derives from deep learning techniques and the growing power of machine learning algorithms, concerns continue to arise, as these technologies pose new privacy and ethics issues.