Image Recognition in Artificial Intelligence


Lovely Professional University [Term paper of Artificial Intelligence (CAP- 402)]

Topic:- Image Recognition in Artificial Intelligence.

Submitted TO:

Submitted By:

Mrs. Charu Sharma Dept. CA

Aradhana katoch Class: BCA-MCA Roll No. 07 Reg. No. 3010060007

Synopsis

Course Code: CAP402

Course Instructor: Ms. Navdeep

Course Tutor: _________

Student Roll Number: 07

Section: E3601

Declaration: I declare that this Term paper is my individual work. I have not copied from any other student's work or from any other source except where due acknowledgement is made explicitly in the text, nor has any part been written for me by another person.

Aradhana katoch (Student Signature)

Evaluator's Comment: __________________________________________________________________________________

Marks Obtained: ____________

Out Of: ______________________

Contents
Introduction to the topic
Image Processing
Example
Example of Image recognition
Model
Properties of the model
Motion analysis

Introduction to the topic: Image recognition is the research area that studies the design of systems that recognize patterns in images. It is a long-standing challenge in science. An important application area is image analysis, through which we try to make computers recognize images the way a human mind does.

For example, when we see a dog, we first recognize that it is an animal. This recognition is simple and familiar to everybody in the real-world environment, but in the world of artificial intelligence, recognizing such objects is a remarkable feat: the functionality of the human brain is not matched by any machine or software. Applications include fingerprint identification, face recognition, character recognition, signature recognition, and classification of objects in scientific and research areas such as astronomy, engineering, statistics and medicine, drawing on machine learning and neural networks. The techniques involved include statistical and structural pattern recognition; image analysis; computational models of vision; enhancement, restoration, segmentation, feature extraction, shape and texture analysis; and character and text recognition.

Current research on image recognition for AI: It takes surprisingly few pixels of information to identify the subject of an image. This discovery could lead to great advances in the automated identification of online images and, ultimately, provide a basis for computers to see the way humans do. Researchers have been trying to find out the smallest amount of information, that is, the shortest numerical representation, that can be derived from an image and still provide a useful indication of its content. At present, the only ways to search for images are based on text captions that people have entered by hand for each picture, and many images lack such information. Automatic identification would also provide a way to index pictures people download from digital cameras onto their computers, without having to caption each one by hand. Ultimately, it could lead to true machine vision, which could someday allow robots to make sense of the data coming from their cameras and figure out where they are. The idea is to find very short codes for images, so that if two images have a similar sequence, they are probably similar: composed of roughly the same object, in roughly the same configuration. If one image has been identified with a caption or title, then other images that match its numerical code would likely show the same object, and so the name associated with one picture can be transferred to the others.

Psychologists have proposed that many human-object interaction activities form unique classes of scenes. Recognizing these scenes is important for many social functions, but enabling a computer to do this is a challenging task. Much of artificial intelligence deals with autonomous planning or deliberation for robotic systems navigating through an environment, and a detailed understanding of these environments is required to navigate them. Information about the environment could be provided by a computer vision system, acting as a vision sensor and supplying high-level information about the environment.

Image Processing
There is significant overlap in the range of techniques and applications covered by image processing, image analysis, machine vision and computer vision. This implies that the basic techniques used and developed in these fields are more or less identical, which can be interpreted as there being only one field with different names. On the other hand, research groups, scientific journals, conferences and companies find it necessary to present or market themselves as belonging specifically to one of these fields, and hence various characterizations distinguishing each field from the others have been proposed. The following characterizations appear relevant but should not be taken as universally accepted:

• Image processing and image analysis tend to focus on 2D images and on how to transform one image into another, e.g., by pixel-wise operations such as contrast enhancement, local operations such as edge extraction or noise removal, or geometrical transformations such as rotating the image. This characterization implies that image processing/analysis neither requires assumptions nor produces interpretations about the image content.

• Computer vision tends to focus on the 3D scene projected onto one or several images, e.g., how to reconstruct structure or other information about the 3D scene from one or several images. It often relies on more or less complex assumptions about the scene depicted in an image.

• Machine vision tends to focus on applications, mainly in manufacturing, e.g., vision-based autonomous robots and systems for vision-based inspection or measurement. This implies that image sensor technologies and control theory are often integrated with the processing of image data to control a robot, and that real-time processing is emphasized by means of efficient implementations in hardware and software. It also implies that external conditions such as lighting can be, and often are, more controlled in machine vision than in general computer vision, which can enable the use of different algorithms.

• There is also a field called imaging which primarily focuses on the process of producing images, but sometimes also deals with processing and analysis of images. For example, medical imaging contains much work on the analysis of image data in medical applications.
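The pixel-wise and geometric operations mentioned above can be sketched in a few lines of Python. This is a minimal illustration on a tiny grayscale image represented as nested lists of 0-255 intensities; the image values are invented for the example.

```python
def stretch_contrast(img):
    """Linearly stretch intensities so the darkest pixel maps to 0
    and the brightest to 255 (a simple contrast enhancement)."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [row[:] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]

def rotate90(img):
    """Rotate the image 90 degrees clockwise (a geometrical transformation)."""
    return [list(row) for row in zip(*img[::-1])]

image = [[50, 100],
         [150, 200]]
print(stretch_contrast(image))  # [[0, 85], [170, 255]]
print(rotate90(image))          # [[150, 50], [200, 100]]
```

Note that neither operation needs any assumption about what the image depicts, which is exactly the characterization of image processing given above.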

Typical tasks of image recognition: determining whether or not the image data contains some specific object, feature, or activity. This task can normally be solved robustly and without effort by a human, but is still not satisfactorily solved in computer vision for the general case: arbitrary objects in arbitrary situations. The existing methods can at best solve the problem only for specific objects, such as simple geometric objects (e.g., polyhedra), human faces, printed or hand-written characters, or vehicles, and in specific situations, typically described in terms of well-defined illumination, background, and pose of the object relative to the camera. Different varieties of the recognition problem are described in the literature:

• Object recognition: one or several pre-specified or learned objects or object classes are recognized, usually together with their 2D positions in the image or 3D poses in the scene.

• Identification: an individual instance of an object is recognized. Examples: identification of a specific person's face or fingerprint, or identification of a specific vehicle.

• Detection: the image data is scanned for a specific condition. Examples: detection of possible abnormal cells or tissues in medical images, or detection of a vehicle in an automatic road toll system. Detection based on relatively simple and fast computations is sometimes used to find smaller regions of interesting image data, which can then be further analysed by more computationally demanding techniques to produce a correct interpretation.
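The "scanning for a specific condition" idea behind detection can be illustrated with a naive exact template match. This is a hypothetical toy, not a robust detector; real systems must tolerate noise, scale, and pose changes, as the text explains.

```python
def find_template(img, tmpl):
    """Slide tmpl over img and return (row, col) of every exact match."""
    H, W = len(img), len(img[0])
    h, w = len(tmpl), len(tmpl[0])
    hits = []
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            if all(img[r + i][c + j] == tmpl[i][j]
                   for i in range(h) for j in range(w)):
                hits.append((r, c))
    return hits

# Invented binary "scene" and the 2x2 pattern we are scanning for.
scene = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
pattern = [[1, 1],
           [1, 0]]
print(find_template(scene, pattern))  # [(1, 1)]
```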

Several specialized tasks based on recognition exist, such as:

• Content-based image retrieval: finding all images in a larger set of images which have a specific content. The content can be specified in different ways, for example in terms of similarity relative to a target image (give me all images similar to image X), or in terms of high-level search criteria given as text input (give me all images which contain many houses, are taken during winter, and have no cars in them).

• Pose estimation: estimating the position or orientation of a specific object relative to the camera. An example application for this technique would be assisting a robot arm in retrieving objects from a conveyor belt in an assembly-line situation.

• Optical character recognition (OCR): identifying characters in images of printed or handwritten text, usually with a view to encoding the text in a format more amenable to editing or indexing.

According to the Department of Computer Science and Engineering, Michigan State University: "The Pattern Recognition and Image Processing (PRIP) Lab faculty and students investigate the use of machines to recognize patterns or objects. Methods are developed to sense objects, to discover which of their features distinguish them from others, and to design algorithms which can be used by a machine to do the classification. ... Important applications include face recognition, fingerprint identification, document image analysis, 3D object model construction, robot navigation, and visualization/exploration of 3D volumetric data. Current research problems include biometric authentication, automatic surveillance and tracking, handless HCI, face modeling, digital watermarking and analyzing the structure of online documents. Recent graduates of the lab have worked on handwriting recognition, signature verification, visual learning, and image retrieval."

Example: It takes surprisingly few pixels of information to identify the subject of an image, a team led by an MIT researcher has found. Deriving such a short representation would be an important step toward making it possible to catalog the billions of images on the Internet automatically. If two images have a similar sequence of numbers, they are probably similar: composed of roughly the same object, in roughly the same configuration. If one image has been identified with a caption or title, then other images that match its numerical code would likely show the same object (such as a car, tree, or person), and so the name associated with one picture can be transferred to the others. With very large amounts of images, even relatively simple algorithms are able to perform fairly well in identifying images this way.
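The "short numerical code" idea can be sketched with a crude average-intensity hash. This is a drastic simplification of the research described above, using invented 2x2 "images": each image becomes one bit per pixel, and codes are compared by Hamming distance.

```python
def short_code(img):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in img for p in row]
    mean = sum(flat) / len(flat)
    return ''.join('1' if p > mean else '0' for p in flat)

def hamming(a, b):
    """Number of positions where two codes differ."""
    return sum(x != y for x, y in zip(a, b))

bright_corner   = [[10, 200], [220, 30]]   # invented 2x2 "images"
similar_scene   = [[20, 210], [230, 40]]
different_scene = [[200, 10], [30, 220]]

print(hamming(short_code(bright_corner), short_code(similar_scene)))    # 0
print(hamming(short_code(bright_corner), short_code(different_scene)))  # 4
```

Images whose codes are close are likely to show roughly the same content, which is what allows a caption from one image to be transferred to its near-matches.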

Face recognition (a part of image recognition): Face recognition systems are becoming increasingly popular as a means of extracting biometric information. Face recognition plays a critical role in biometric systems and is attractive for numerous applications, including visual surveillance and security. Because of the general public's acceptance of face images on various documents, face recognition has great potential to become the next-generation biometric technology of choice. Face images are also the only biometric information available in some legacy databases and international terrorist watch-lists, and they can be acquired even without a subject's cooperation.

Example of Image recognition: Detecting objects in cluttered scenes and estimating articulated human body parts are two challenging problems in computer vision. The difficulty is particularly pronounced in activities involving human-object interactions (e.g., playing tennis), where the relevant object tends to be small or only partially visible, and the human body parts are often self-occluded. We observe, however, that objects and human poses can serve as mutual context to each other: recognizing one facilitates the recognition of the other. The model learning task can then be cast as a structure learning problem, in which the structural connectivity between the object, the overall human pose, and the different body parts is estimated through a structure search approach, and the parameters of the model are estimated by a max-margin algorithm, evaluated on a sports data set of six classes of human-object interactions.

Introduction: Using context to aid visual recognition has recently been receiving more and more attention. Psychology experiments show that context plays an important role in recognition in the human visual system, in tasks such as object detection and recognition, scene recognition, action classification, and segmentation. While the idea of using context is clearly a good one, a curious observation is that context information has contributed relatively little to boosting performance in recognition tasks when context-based methods are compared with sliding-window-based methods for object detection.

MODEL: Objects and human poses can serve as mutual context to facilitate the recognition of each other. The human pose is better estimated by seeing the cricket bat, which gives a strong prior on the pose of the human; conversely, the cricket ball is detected by understanding the human pose of throwing the ball. One reason for the relatively small margin achieved by context-based methods is, arguably, the lack of strong context: while it is nice to detect cars in the context of roads, powerful car detectors can nevertheless detect cars with high accuracy whether they are on the road or not. Indeed, for the human visual system, detecting visual abnormality out of context is crucial for survival and social activities. Many important image recognition tasks, however, rely critically on context. One such scenario is the problem of human pose estimation and object detection in human-object interaction (HOI) activities, where the two difficult tasks can benefit greatly from serving as context for each other. The goal is to model the mutual context of objects and human poses in HOI activities so that each can facilitate the recognition of the other. Given a set of training images, the model automatically discovers the relevant poses for each type of HOI activity, and furthermore the connectivity and spatial relationships between the objects and body parts. This task is formulated as a structure learning problem, in which the connectivity is learned by a structure search approach and the model parameters are discriminatively estimated by a novel max-margin approach. By modeling the mutual co-occurrence and spatial relations of objects and human poses, the algorithm significantly improves the performance of both object detection and pose estimation on a dataset of sports images. Some techniques have been proposed to avoid exhaustively searching the image, which makes such algorithms more efficient.

While the most popular detectors are still based on sliding windows, more recent work has tried to integrate context to obtain better performance. However, in most of these works the performance is improved by only a relatively small margin. It is outside the scope of this paper to develop an object detection or pose estimation method that applies to all situations; the focus here is on the role of context in these problems. In most of these works, one type of scene information serves as contextual facilitation for a main recognition problem. For example, ground planes and horizons can help to refine pedestrian detections.

Properties of the model:
Co-occurrence context for the activity class, object, and human pose. Given the presence of a tennis racket, the human pose is more likely to be playing tennis than playing croquet. That is to say, co-occurrence information can be beneficial for coherently modeling the object, the human pose, and the activity class.

Multiple types of human poses for each activity. The model allows each activity (A) to consist of more than one human pose (H). Treating H as a hidden variable, the model automatically discovers the possible poses from training images. This gives more flexibility to deal with situations where the human poses in the same activity are inconsistent.
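The co-occurrence property above can be sketched as a toy conditional lookup. All object names, activity names, and counts below are invented for illustration; the actual model learns far richer spatial and structural relations.

```python
# Toy co-occurrence table: counts of (object, activity) pairs seen together
# in hypothetical training images.
cooccur = {
    ('tennis_racket', 'tennis_serve'): 45,
    ('tennis_racket', 'croquet_shot'): 2,
    ('croquet_mallet', 'croquet_shot'): 40,
    ('croquet_mallet', 'tennis_serve'): 1,
}

def activity_given_object(obj):
    """Most frequently co-occurring activity class for a detected object."""
    matches = {act: n for (o, act), n in cooccur.items() if o == obj}
    return max(matches, key=matches.get)

print(activity_given_object('tennis_racket'))  # 'tennis_serve'
```

Detecting the racket makes the tennis pose far more probable, which is exactly the kind of evidence the mutual-context model exploits.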

Image Recognition Systems: Motion analysis
Several tasks relate to motion estimation, where an image sequence is processed to produce an estimate of the velocity either at each point in the image or in the 3D scene, or even of the camera that produces the images. Examples of such tasks are:

• Egomotion: determining the 3D rigid motion (rotation and translation) of the camera from an image sequence produced by the camera.

• Tracking: following the movements of a (usually) smaller set of interest points or objects (e.g., vehicles or humans) in the image sequence.

• Optical flow: determining, for each point in the image, how that point is moving relative to the image plane, i.e., its apparent motion. This motion is a result both of how the corresponding 3D point is moving in the scene and of how the camera is moving relative to the scene.
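A crude sketch in the spirit of the motion-estimation tasks above: a brute-force search for the single integer translation that best aligns two frames (real optical flow estimates a motion vector per point; the frames here are invented).

```python
def estimate_shift(prev, curr, max_shift=1):
    """Brute-force search for the integer translation (dy, dx) that best
    maps the previous frame onto the current one, scored by mean absolute
    difference over the overlapping region."""
    H, W = len(prev), len(prev[0])
    best, best_err = (0, 0), float('inf')
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err, n = 0, 0
            for y in range(H):
                for x in range(W):
                    y2, x2 = y + dy, x + dx
                    if 0 <= y2 < H and 0 <= x2 < W:
                        err += abs(curr[y2][x2] - prev[y][x])
                        n += 1
            if err / n < best_err:
                best_err, best = err / n, (dy, dx)
    return best

# A bright blob moves one pixel to the right between two invented frames.
prev_frame = [[0, 0, 0],
              [0, 255, 0],
              [0, 0, 0]]
curr_frame = [[0, 0, 0],
              [0, 0, 255],
              [0, 0, 0]]
print(estimate_shift(prev_frame, curr_frame))  # (0, 1)
```

The same exhaustive-matching idea, applied per block instead of per frame, underlies classical block-matching motion estimation.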

Scene reconstruction Given one or (typically) more images of a scene, or a video, scene reconstruction aims at computing a 3D model of the scene. In the simplest case the model can be a set of 3D points. More sophisticated methods produce a complete 3D surface model.

Image restoration
The aim of image restoration is the removal of noise (sensor noise, motion blur, etc.) from images. The simplest approach to noise removal is to apply various types of filters, such as low-pass filters or median filters. More sophisticated methods assume a model of what local image structures look like, a model which distinguishes them from the noise. By first analysing the image data in terms of local image structures, such as lines or edges, and then controlling the filtering based on local information from the analysis step, a better level of noise removal is usually obtained compared to the simpler approaches. An example in this field is inpainting.

Some systems are stand-alone applications which solve a specific measurement or detection problem, while others constitute a sub-system of a larger design which, for example, also contains sub-systems for control of mechanical actuators, planning, information databases, man-machine interfaces, etc. The specific implementation of a computer vision system also depends on whether its functionality is pre-specified or whether some part of it can be learned or modified during operation. There are, however, typical functions which are found in many computer vision systems.
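The median filter mentioned in the restoration discussion can be sketched directly: each interior pixel is replaced by the median of its 3x3 neighbourhood, which suppresses isolated "salt" noise while preserving edges better than simple averaging (the noisy image below is invented).

```python
def median_filter3(img):
    """Replace each interior pixel with the median of its 3x3 neighbourhood;
    border pixels are left unchanged in this minimal sketch."""
    H, W = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            neigh = sorted(img[y + dy][x + dx]
                           for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = neigh[4]  # median of 9 values
    return out

# A single "salt" pixel (255) in an otherwise flat region.
noisy = [[10, 10, 10, 10],
         [10, 255, 12, 10],
         [10, 11, 10, 10],
         [10, 10, 10, 10]]
print(median_filter3(noisy)[1][1])  # 10 — the outlier is removed
```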







Image acquisition: A digital image is produced by one or several image sensors, which, besides various types of light-sensitive cameras, include range sensors, tomography devices, radar, ultra-sonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image sequence. The pixel values typically correspond to light intensity in one or several spectral bands (gray images or colour images), but can also be related to various physical measures, such as depth, absorption or reflectance of sonic or electromagnetic waves, or nuclear magnetic resonance. Pre-processing: Before a computer vision method can be applied to image data in order to extract some specific piece of information, it is usually necessary to process the data in order to assure that it satisfies certain assumptions implied by the method. Examples are o Re-sampling in order to assure that the image coordinate system is correct. o Noise reduction in order to assure that sensor noise does not introduce false information. o Contrast enhancement to assure that relevant information can be detected. o Scale-space representation to enhance image structures at locally appropriate scales. Feature extraction: Image features at various levels of complexity are extracted from the image data. Typical examples of such features are o Lines, edges and ridges. o Localized interest points such as corners, blobs or points. More complex features may be related to texture, shape or motion.
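Edge extraction, one of the feature types listed above, can be sketched as a horizontal intensity difference. This is a crude stand-in for proper gradient operators such as Sobel, using an invented two-row image with one vertical edge.

```python
def edge_strength(img):
    """Absolute horizontal intensity difference between neighbouring pixels:
    a crude edge-feature map, one column narrower than the input."""
    return [[abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
            for row in img]

# A single vertical edge between a dark and a bright region.
row_img = [[0, 0, 255, 255],
           [0, 0, 255, 255]]
print(edge_strength(row_img))  # strong response only at the edge column
```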



Detection/segmentation: At some point in the processing a decision is made about which image points or regions of the image are relevant for further processing. Examples are o Selection of a specific set of interest points o Segmentation of one or multiple image regions which contain a specific object of interest.
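A minimal sketch of the segmentation step: threshold the image into a foreground mask, then select the region of interest as the bounding box of the foreground (the image and threshold are invented; real systems use far more robust segmentation).

```python
def threshold_segment(img, t):
    """Mark every pixel brighter than threshold t as foreground (1)."""
    return [[1 if p > t else 0 for p in row] for row in img]

def bounding_box(mask):
    """Smallest (top, left, bottom, right) box containing all foreground."""
    ys = [y for y, row in enumerate(mask) for v in row if v]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return (min(ys), min(xs), max(ys), max(xs))

frame = [[10, 10, 10, 10],
         [10, 200, 210, 10],
         [10, 190, 220, 10],
         [10, 10, 10, 10]]
mask = threshold_segment(frame, 100)
print(bounding_box(mask))  # (1, 1, 2, 2)
```

The cropped box is exactly the "small set of data" that the high-level processing step below receives.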



High-level processing: At this step the input is typically a small set of data, for example a set of points or an image region which is assumed to contain a specific object. The remaining processing deals with, for example: o o o

Verification that the data satisfy model-based and application specific assumptions. Estimation of application specific parameters, such as object pose or object size. Classifying a detected object into different categories.
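The final classification step can be sketched as a nearest-centroid classifier on a feature vector. The feature dimensions, class names, and centroid values below are all hypothetical.

```python
def nearest_class(feature, centroids):
    """Assign a feature vector to the class whose centroid is closest
    (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda c: sqdist(feature, centroids[c]))

# Hypothetical 2D features: (width, height) in pixels of a detected object.
centroids = {'car': [100.0, 40.0], 'person': [20.0, 180.0]}
print(nearest_class([90.0, 50.0], centroids))  # 'car'
```

Real systems learn such class representatives (or far richer decision boundaries) from labelled training data, but the structure of the decision is the same: compare extracted features against per-class models.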

So, image processing helps AI to identify images and to respond according to what has been identified.
