3D motion detection using neural networks


CHAPTER 1 INTRODUCTION

In video surveillance, video signals from multiple remote locations are displayed on several TV screens which are typically placed together in a control room. In the so-called third generation surveillance systems (3GSS), all the parts of the surveillance system will be digital and, consequently, digital video will be transmitted and processed. Additionally, in 3GSS some 'intelligence' has to be introduced to detect relevant events in the video signals in an automatic way. This allows filtering of the irrelevant time segments of the video sequences and displaying on the TV screen only those segments that require the attention of the surveillance operator. Motion detection is a basic operation in the selection of significant segments of the video signals. Once motion has been detected, other features can be considered to decide whether a video signal has to be presented to the surveillance operator. If the motion detection is performed after the transmission of the video signals from the cameras to the control room, then all the bit streams have to be decompressed first; this can be a very demanding operation, especially if there are many cameras in the surveillance system. For this reason, it is interesting to consider the use of motion detection algorithms operating in the compressed (transform) domain.

In this thesis we present a motion detection algorithm in the compressed domain with a low computational cost. In the following Section, we assume that video is compressed by using motion JPEG (MJPEG), i.e. each frame is individually JPEG compressed.

Motion detection from a moving observer has been a very important technique for computer vision applications. Especially in recent years, vision-based navigation methods for autonomous driving systems and driver support systems have received more and more attention worldwide.


One of its most important tasks is to detect moving obstacles such as cars, bicycles or even pedestrians while the vehicle itself is travelling at high speed. Image differencing, either against a clear background or between adjacent frames, is widely used for motion detection. But when the observer is also moving, which leads to a continuously changing background scene in the perspective projection image, it becomes more difficult to detect the truly moving objects by differencing methods. To deal with this problem, many approaches have been proposed in recent years. Previous work in this area falls mainly into two categories: 1) using the difference between the optical flow vectors of the background and those of the moving objects, and 2) calibrating the background displacement by using the result of the camera's 3D motion analysis.

In the first category, the optical flow is calculated between adjacent frames and the reliability of the flow vectors is estimated. The major flow vector, which represents the motion of the background, can be used to classify and extract the flow vectors of the real moving objects. However, because of its huge calculation cost and the difficulty of determining accurate flow vectors, it is still unavailable for real applications.

Analysing the camera's 3D motion and calibrating the background is the other main method for moving object detection. For on-board camera motion analysis, many motion-detecting algorithms have been proposed which always depend on previous recognition results such as road lane marks and the vanishing point on the horizon. These methods show good performance in accuracy and efficiency because of their detailed analysis of road structure and measured vehicle locomotion. They are, however, computationally expensive and over-dependent on road features such as lane marks, and therefore give unsatisfactory results when the lane marks are covered by other vehicles or do not exist at all.

Compared with these previous works, a new method of moving object detection from an on-board camera is presented in this paper. To deal with the background-change problem, our method uses the camera's 3D motion analysis results to calibrate the background scene. With pure point matching and the introduction of the camera's Focus of Expansion (FOE), our method is able to determine the camera's rotation and translation parameters theoretically by using only three pairs of matching points between adjacent frames, which makes it faster and more efficient for real-time applications.
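The adjacent-frame differencing mentioned above can be sketched as follows. This is a minimal illustration for a static camera, not the method proposed in this thesis; the function name `frame_difference_mask`, the threshold of 25 grey levels and the synthetic frames are all assumptions made for the example.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Return a boolean mask of pixels whose grey level changed by more than threshold."""
    # Cast to a signed type so the subtraction cannot wrap around for uint8 input.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

# Two synthetic 8x8 greyscale frames: a bright 2x2 block moves one pixel to the right.
prev_frame = np.zeros((8, 8), dtype=np.uint8)
curr_frame = np.zeros((8, 8), dtype=np.uint8)
prev_frame[3:5, 2:4] = 200
curr_frame[3:5, 3:5] = 200

mask = frame_difference_mask(prev_frame, curr_frame)
changed_pixels = int(mask.sum())
```

Only the column the block left and the column it entered are flagged; the overlapping column is unchanged. With a moving camera every background pixel would change as well, which is why differencing alone is not sufficient in that setting.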


A neural network, also known as a parallel distributed processing network, is a computing paradigm that is loosely modeled after cortical structures of the brain. It consists of interconnected processing elements called nodes or neurons that work together to produce an output function. The output of a neural network relies on the cooperation of the individual neurons within the network to operate. Processing of information by neural networks is characteristically done in parallel rather than in series (or sequentially) as in earlier binary computers or Von Neumann machines. Since it relies on its member neurons collectively to perform its function, a unique property of a neural network is that it can still perform its overall function even if some of the neurons are not functioning. In other words, it is robust to error or failure. All neural networks take numeric input and produce numeric output. The transfer function of a unit is typically chosen so that it can accept input in any range, and produces output in a strictly limited range (it has a squashing effect).

An artificial neural network (ANN), also called a simulated neural network (SNN) or commonly just neural network (NN), is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network.

There are different topologies of neural networks that may be employed for time series modeling. In our investigation we used radial basis function networks, which have shown considerably better scaling properties, when increasing the number of hidden units, than networks with sigmoid activation functions.

RBF networks were introduced into the neural network literature by Broomhead/Lowe and Poggio/Girosi in the late 1980s. The RBF network model is motivated by the locally tuned response observed in biological neurons, e.g. in the visual or in the auditory system. RBFs have been studied in multivariate approximation theory, particularly in the field of function interpolation. The RBF neural network model is an alternative to the multilayer perceptron, which is perhaps the most often used neural network architecture. A radial basis function (RBF) network, therefore, has a hidden layer of radial units, each actually modeling a Gaussian response surface. Since these functions are nonlinear, it is not actually necessary to have more than one hidden layer to model any shape of function: sufficient radial units will always be enough to model any function.

In a surveillance system, the estimation of motion is of great importance, since it enables various types of operations to be performed on the detected object. When using motion estimation, an assumption is made that the objects in the scene have only translational motion. This assumption holds as long as there is no camera pan, zoom, changes in luminance, or rotational motion (quite an assumption!).
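To make the translational assumption concrete, here is a minimal sketch of exhaustive block matching, one common way of estimating a purely translational shift between two frames. It is offered as an illustration only and is not taken from this thesis; the function name `best_translation`, the block size, the search range and the synthetic frames are assumptions.

```python
import numpy as np

def best_translation(prev_frame, curr_frame, top, left, size=4, search=2):
    """Find where one block of curr_frame came from in prev_frame.

    Every offset (dy, dx) within the search range is tried and the one with
    the smallest sum of absolute differences (SAD) is returned.
    """
    block = curr_frame[top:top + size, left:left + size].astype(np.int32)
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the previous frame.
            if y < 0 or x < 0 or y + size > prev_frame.shape[0] or x + size > prev_frame.shape[1]:
                continue
            candidate = prev_frame[y:y + size, x:x + size].astype(np.int32)
            sad = int(np.abs(block - candidate).sum())
            if best_sad is None or sad < best_sad:
                best, best_sad = (dy, dx), sad
    return best, best_sad

# Synthetic textured frame; the whole scene moves 1 pixel down and 2 pixels right.
rng = np.random.default_rng(0)
prev_frame = rng.integers(0, 256, size=(16, 16))
curr_frame = np.roll(prev_frame, shift=(1, 2), axis=(0, 1))

# The offset points from the current block back to its position in the previous frame.
motion, sad = best_translation(prev_frame, curr_frame, top=6, left=6)
```

The returned offset is the negative of the scene displacement. A pan, zoom, rotation or change in luminance would break the exact match, which is the limitation noted in the paragraph above.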

After the process of estimation, the detected motion has to be extracted. With the obtained boundary, two objects (with background) can then be extracted from two image frames (the current image frame and the previous image frame). Extracting the moving object from its background can be done by the edge enhancement network and the background remover.

At the algorithm level, complexity, regularity and precision are the main factors that directly affect the power consumed in executing an algorithm for motion estimation. Concurrency and modularity are the requirements on algorithms that are intended to execute on low power architectures. This project aims to reduce the power consumption of motion estimation at the algorithm level and the architectural level by using neural network concepts.

1.1 PROBLEM STATEMENT

The goals for this thesis have been the following.

One goal has been to compile an introduction to motion detection algorithms. There exist a number of studies, but a complete reference on real-time motion detection is not as common. We have collected material from journals, papers and conferences and proposed an approach that is well suited to implementing real-time motion detection.

Another goal has been to search for algorithms that can be used to implement the RBF neural network.

A third goal is to evaluate their performance with regard to the motion detected. These properties were chosen because they have the greatest impact on the implementation effort.

A final goal has been to design and implement an algorithm including object extraction. This should be done in a high-level language or MATLAB. The source code should be easy to understand so that it can serve as a reference on the standard for designers who need to implement real-time motion detection.


CHAPTER 2 OVERVIEW OF NEURAL NETWORKS

Neural network theory is sometimes used to refer to a branch of computational science that uses neural networks as models to simulate or analyze complex phenomena and/or study the principles of operation of neural networks analytically. It addresses problems similar to artificial intelligence (AI), except that AI uses traditional computational algorithms to solve problems whereas neural networks use 'networks of agents' (software or hardware entities linked together) as the computational architecture to solve problems. Neural networks are trainable systems that can "learn" to solve complex problems from a set of exemplars and generalize the "acquired knowledge" to solve unforeseen problems, as in stock market and environmental prediction; i.e., they are self-adaptive systems.

Traditionally, the term neural network has been used to refer to a network of biological neurons. In modern usage, the term is often used to refer to artificial neural networks, which are composed of artificial neurons or nodes. Thus the term 'Neural Network' has two distinct connotations:

1. Biological neural networks are made up of real biological neurons that are connected or functionally related in the peripheral nervous system or the central nervous system. In the field of neuroscience, they are often identified as groups of neurons that perform a specific physiological function in laboratory analysis.

2. Artificial neural networks are made up of interconnecting artificial neurons (usually simplified neurons) designed to model (or mimic) some properties of biological neural networks. Artificial neural networks can be used to model the modes of operation of biological neural networks, whereas cognitive models are theoretical models that mimic cognitive brain functions without necessarily using neural networks, while artificial intelligence consists of well-crafted algorithms that solve specific intelligent problems without using a neural network as the computational architecture.

2.1 The brain, neural networks and computers

While it is accepted by most scientists that the brain is a type of computer, it is a computer with a vastly different architecture to the computers that most of us are familiar with. The brain is massively parallel, even more so than advanced multiprocessor computers. This means that simulating the behavior of a brain on traditional computer hardware is necessarily slow and inefficient.

Neural networks, as used in artificial intelligence, have traditionally been viewed as simplified models of neural processing in the brain, even though the relation between this model and the biological architecture of the brain is very much debated. To answer this question, David Marr proposed various levels of analysis which provide us with a plausible answer for the role of neural networks in the understanding of human cognitive functioning. The question of what degree of complexity and what properties individual neural elements should have in order to reproduce something resembling animal intelligence is a subject of current research in theoretical neuroscience.

Historically, computers evolved from the von Neumann architecture, based on sequential processing and execution of explicit instructions. On the other hand, the origins of neural networks are based on efforts to model information processing in biological systems, which may rely largely on parallel processing as well as implicit instructions based on recognition of patterns of 'sensory' input from external sources. In other words, rather than sequential processing and execution, at their very heart, neural networks are complex statistical processors.


2.2 Artificial Neural networks

An artificial neural network (ANN), also called a simulated neural network (SNN) or commonly just neural network (NN), is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network.

In more practical terms, neural networks are non-linear statistical data modeling tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.

2.3 Background

An artificial neural network involves a network of simple processing elements (neurons) which can exhibit complex global behavior, determined by the connections between the processing elements and element parameters. One classical type of artificial neural network is the Hopfield net.

In a neural network model, simple nodes (called variously "neurons", "neurodes", "PEs" ("processing elements") or "units") are connected together to form a network of nodes, hence the term "neural network". While a neural network does not have to be adaptive per se, its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow.

In modern software implementations of artificial neural networks, the approach inspired by biology has more or less been abandoned for a more practical approach based on statistics and signal processing. In some of these systems, neural networks or parts of neural networks (such as artificial neurons) are used as components in larger systems that combine both adaptive and non-adaptive elements.


2.4 Models

Neural network models in artificial intelligence are usually referred to as artificial neural networks (ANN); these are essentially simple mathematical models defining a function $f : X \rightarrow Y$. Each type of ANN model corresponds to a class of such functions.

Fig 1 Artificial Neural Network 


Fig 2 A complex neural network 

2.5 Employing artificial neural networks

Perhaps the greatest advantage of ANNs is their ability to be used as an arbitrary function approximation mechanism which 'learns' from observed data. However, using them is not so straightforward, and a relatively good understanding of the underlying theory is essential.

• Choice of model: This will depend on the data representation and the application. Overly complex models tend to lead to problems with learning.

• Learning algorithm: There are numerous tradeoffs between learning algorithms. Almost any algorithm will work well with the correct hyperparameters for training on a particular fixed dataset. However, selecting and tuning an algorithm for training on unseen data requires a significant amount of experimentation.

• Robustness: If the model, cost function and learning algorithm are selected appropriately, the resulting ANN can be extremely robust.

With the correct implementation, ANNs can be used naturally in online learning and large dataset applications. Their simple implementation and the existence of mostly local dependencies exhibited in the structure allow for fast, parallel implementations in hardware.

2.6 Types of neural networks

2.6.1 Feed forward neural network

The feed forward neural network was the first and arguably simplest type of artificial neural network devised. In this network, the information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any) and to the output nodes. There are no cycles or loops in the network.

2.6.2 Single-layer perceptron

The earliest kind of neural network is a single-layer perceptron network, which consists of a single layer of output nodes; the inputs are fed directly to the outputs via a series of weights. In this way it can be considered the simplest kind of feed-forward network. The sum of the products of the weights and the inputs is calculated in each node, and if the value is above some threshold (typically 0) the neuron fires and takes the activated value (typically 1); otherwise it takes the deactivated value (typically -1). Neurons with this kind of activation function are also called McCulloch-Pitts neurons or threshold neurons.

A perceptron can be created using any values for the activated and deactivated states as long as the threshold value lies between the two. Most perceptrons have outputs of 1 or -1 with a threshold of 0, and there is some evidence that such networks can be trained more quickly than networks created from nodes with different activation and deactivation values.

Perceptrons can be trained by a simple learning algorithm that is usually called the delta rule. It calculates the errors between the calculated output and the sample output data, and uses these to create an adjustment to the weights, thus implementing a form of gradient descent.


Single-unit perceptrons are only capable of learning linearly separable patterns; in 1969, in a famous monograph entitled Perceptrons, Marvin Minsky and Seymour Papert showed that it was impossible for a single-layer perceptron network to learn an XOR function.
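The error-driven weight update and the XOR limitation can both be seen in a few lines. The following is a minimal sketch, assuming a single threshold unit with outputs of 1 or -1, a threshold of 0 and a learning rate of 0.5; the names `train_perceptron` and `predict` are illustrative and do not come from this thesis.

```python
def predict(weights, bias, x1, x2):
    """Threshold unit: fire (+1) if the weighted sum is above 0, else -1."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else -1

def train_perceptron(samples, epochs=20, rate=0.5):
    """Adjust the weights in proportion to the error between target and output."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - predict(weights, bias, x1, x2)
            weights[0] += rate * error * x1
            weights[1] += rate * error * x2
            bias += rate * error
    return weights, bias

# AND is linearly separable, XOR is not.
AND = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
XOR = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]

and_w, and_b = train_perceptron(AND)
xor_w, xor_b = train_perceptron(XOR)

and_correct = all(predict(and_w, and_b, *x) == t for x, t in AND)
xor_correct = all(predict(xor_w, xor_b, *x) == t for x, t in XOR)
```

Training on AND settles on a separating line within a few passes, while no choice of two weights and a bias can classify all four XOR patterns, so `xor_correct` stays false however long the loop runs.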

2.6.3 Multi-layer perceptron

This class of networks consists of multiple layers of computational units, usually interconnected in a feed-forward way. Each neuron in one layer has directed connections to the neurons of the subsequent layer. In many applications the units of these networks apply a sigmoid function as an activation function.

The universal approximation theorem for neural networks states that every continuous function that maps intervals of real numbers to some output interval of real numbers can be approximated arbitrarily closely by a multi-layer perceptron with just one hidden layer. This result holds only for restricted classes of activation functions, e.g. for the sigmoid functions.

Multi-layer networks use a variety of learning techniques, the most popular being back-propagation. Here the output values are compared with the correct answer to compute the value of some predefined error function. By various techniques the error is then fed back through the network. Using this information, the algorithm adjusts the weights of each connection in order to reduce the value of the error function by some small amount. After repeating this process for a sufficiently large number of training cycles, the network will usually converge to some state where the error of the calculations is small. In this case one says that the network has learned a certain target function. To adjust weights properly one applies a general method for non-linear optimization that is called gradient descent. For this, the derivative of the error function with respect to the network weights is calculated, and the weights are then changed such that the error decreases (thus going downhill on the surface of the error function). For this reason back-propagation can only be applied to networks with differentiable activation functions.
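The back-propagation loop described above can be sketched for a very small network. This is an illustrative example only, assuming one hidden layer of four sigmoid units, a squared-error function, full-batch gradient descent and a learning rate of 0.5; none of these choices come from this thesis.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR training set: four input patterns and their target outputs.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 1.0, size=(2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(0.0, 1.0, size=(4, 1))
b2 = np.zeros((1, 1))
rate = 0.5
losses = []

for _ in range(2000):
    # Forward pass through the hidden and output layers.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)
    losses.append(float(np.mean((output - y) ** 2)))

    # Backward pass: the chain rule needs the derivative of the sigmoid,
    # which is why the activation function must be differentiable.
    delta_out = (output - y) * output * (1.0 - output)
    delta_hidden = (delta_out @ W2.T) * hidden * (1.0 - hidden)

    # Gradient descent step (constant factors are folded into the rate).
    W2 -= rate * hidden.T @ delta_out
    b2 -= rate * delta_out.sum(axis=0, keepdims=True)
    W1 -= rate * X.T @ delta_hidden
    b1 -= rate * delta_hidden.sum(axis=0, keepdims=True)
```

The list `losses` records the error before each update, so comparing its first and last entries shows the downhill movement on the error surface. How close the final outputs get to the XOR targets depends on the initial weights and the number of cycles.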


Fig 3 XOR perceptron

A three layer perceptron net capable of calculating XOR. The numbers within the perceptrons represent each perceptron's explicit threshold. The numbers that annotate arrows represent the weight of the inputs. This net assumes that if the threshold is not reached, zero (not -1) is output. Note that the bottom layer of inputs is not always considered a real perceptron layer.
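A network of this kind can be written out directly. The weights and thresholds below are hypothetical values chosen for illustration, since the numbers in the original figure are not available in the text; the convention that a unit outputs zero when its threshold is not reached follows the caption.

```python
def threshold_unit(inputs, weights, threshold):
    """Output 1 if the weighted sum exceeds the threshold, otherwise 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0

def xor_net(x1, x2):
    # Hidden unit 1 behaves like OR, hidden unit 2 like AND (assumed weights).
    h_or = threshold_unit((x1, x2), (1, 1), 0.5)
    h_and = threshold_unit((x1, x2), (1, 1), 1.5)
    # The output fires when OR is active but AND is not.
    return threshold_unit((h_or, h_and), (1, -1), 0.5)

truth_table = [(a, b, xor_net(a, b)) for a in (0, 1) for b in (0, 1)]
```

The hidden layer turns the problem into one that a single threshold unit can separate, which is what a single-layer perceptron alone cannot do.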

2.6.4 Radial basis function (RBF) network

Radial basis functions are powerful techniques for interpolation in multidimensional space. An RBF is a function which has a distance criterion with respect to a centre built into it. Radial basis functions have been applied in the area of neural networks, where they may be used as a replacement for the sigmoidal hidden layer transfer characteristic in multi-layer perceptrons.

2.6.5 Echo State Network

The Echo State Network (ESN) is a recurrent neural network with a sparsely connected random hidden layer. The weights of the output neurons are the only part of the network that can change and be learned. ESNs are good at (re)producing temporal patterns.


2.6.6 Stochastic neural networks

A stochastic neural network differs from a regular neural network in the fact that it introduces random variations into the network. In a probabilistic view of neural networks, such random variations can be viewed as a form of statistical sampling, such as Monte Carlo sampling.

2.6.7 Neuro-fuzzy networks

A neuro-fuzzy network is a fuzzy inference system (FIS) in the body of an artificial neural network. Depending on the FIS type, there are several layers that simulate the processes involved in a fuzzy inference, such as fuzzification, inference, aggregation and defuzzification. Embedding an FIS in a general structure of an ANN has the benefit of using available ANN training methods to find the parameters of a fuzzy system.


CHAPTER 3 RBF NETWORK

3.1 Radial Functions

Radial functions are a special class of function. Their characteristic feature is that their response decreases (or increases) monotonically with distance from a central point. The centre, the distance scale, and the precise shape of the radial function are parameters of the model, all of which are fixed if the model is linear. A typical radial function is the Gaussian which, in the case of a scalar input, is

$$h(x) = \exp\left(-\frac{(x - c)^2}{r^2}\right).$$

Its parameters are its centre c and its radius r. The figure illustrates a Gaussian RBF with centre c = 0 and radius r = 1. A Gaussian RBF monotonically decreases with distance from the centre. In contrast, a multiquadric RBF which, in the case of scalar input, is

$$h(x) = \frac{\sqrt{r^2 + (x - c)^2}}{r}$$

monotonically increases with distance from the centre. Gaussian-like RBFs are local (give a significant response only in a neighbourhood near the centre) and are more commonly used than multiquadric-type RBFs which have a global response.
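The contrast between the two radial functions can be checked numerically. The sketch below assumes the common scalar forms h(x) = exp(-(x - c)^2 / r^2) for the Gaussian and h(x) = sqrt(r^2 + (x - c)^2) / r for the multiquadric; the function names are illustrative.

```python
import math

def gaussian(x, c=0.0, r=1.0):
    """Local response: largest at the centre, decaying with distance."""
    return math.exp(-((x - c) ** 2) / r ** 2)

def multiquadric(x, c=0.0, r=1.0):
    """Global response: smallest at the centre, growing with distance."""
    return math.sqrt(r ** 2 + (x - c) ** 2) / r

distances = [0.0, 0.5, 1.0, 2.0, 4.0]
gaussian_values = [gaussian(d) for d in distances]
multiquadric_values = [multiquadric(d) for d in distances]
```

With c = 0 and r = 1 both functions equal 1 at the centre; the Gaussian values then fall towards zero while the multiquadric values keep increasing.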

3.2 Radial Networks

An RBF is a function which has a distance criterion with respect to a centre built into it. Radial basis functions have been applied in the area of neural networks, where they may be used as a replacement for the sigmoidal hidden layer transfer characteristic in multi-layer perceptrons. RBF networks have two layers of processing: in the first, the input is mapped onto each RBF in the 'hidden' layer. The RBF chosen is usually a Gaussian. In regression problems the output layer is then a linear combination of hidden layer values representing the mean predicted output. The interpretation of this output layer value is the same as a regression model in statistics. In classification problems the output layer is typically a sigmoid function of a linear combination of hidden layer values, representing a posterior probability. Performance in both cases is often improved by shrinkage techniques, known as ridge regression in classical statistics and known to correspond to a prior belief in small parameter values (and therefore smooth output functions) in a Bayesian framework.

RBF networks have the advantage of not suffering from local minima in the same way as multi-layer perceptrons. This is because the only parameters that are adjusted in the learning process are the linear mapping from hidden layer to output layer. Linearity ensures that the error surface is quadratic and therefore has a single easily found minimum. In regression problems this can be found in one matrix operation. In classification problems the fixed non-linearity introduced by the sigmoid output function is most efficiently dealt with using iteratively reweighted least squares.

RBF networks have the disadvantage of requiring good coverage of the input space by radial basis functions. RBF centres are determined with reference to the distribution of the input data, but without reference to the prediction task. As a result, representational resources may be wasted on areas of the input space that are irrelevant to the learning task. A common solution is to associate each data point with its own centre, although this can make the linear system to be solved in the final layer rather large, and requires shrinkage techniques to avoid overfitting.

Associating each input datum with an RBF leads naturally to kernel methods such as Support Vector Machines and Gaussian Processes (the RBF is the kernel function). All three approaches use a non-linear kernel function to project the input data into a space where the learning problem can be solved using a linear model.
Like Gaussian Processes, and unlike SVMs, RBF networks are typically trained in a Maximum Likelihood framework by maximizing the probability (minimizing the error) of the data under the model. SVMs take a different approach to avoiding overfitting by maximizing instead a margin. RBF networks are outperformed in most classification applications by SVMs. In regression applications they can be competitive when the dimensionality of the input space is relatively small.
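The "one matrix operation" for regression and the effect of shrinkage can be sketched as follows. This is a minimal example assuming Gaussian hidden units with one centre per data point and a ridge penalty `lam`; the helper names, the radius of 0.2 and the sine-wave data are assumptions made for illustration.

```python
import numpy as np

def design_matrix(x, centres, radius):
    """Gaussian hidden-layer responses: one row per sample, one column per centre."""
    return np.exp(-((x[:, None] - centres[None, :]) ** 2) / radius ** 2)

def ridge_weights(H, y, lam):
    """Solve (H^T H + lam I) w = H^T y in a single linear solve."""
    return np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)

x = np.linspace(0.0, 1.0, 20)
y = np.sin(2.0 * np.pi * x)
centres = x.copy()  # one centre per data point, as discussed above
H = design_matrix(x, centres, radius=0.2)

w_small = ridge_weights(H, y, lam=1e-3)
w_large = ridge_weights(H, y, lam=1.0)

fit_small = float(np.sum((H @ w_small - y) ** 2))
fit_large = float(np.sum((H @ w_large - y) ** 2))
```

A larger penalty shrinks the weight vector and gives a smoother but less exact fit, which is the trade-off that shrinkage introduces to control overfitting.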

3.3 RBF Architecture

RBF networks typically have three layers: an input layer, a hidden layer with a non-linear RBF activation function, and a linear output layer. The output $\varphi : \mathbb{R}^n \to \mathbb{R}$ of the network is thus

$$\varphi(\mathbf{x}) = \sum_{i=1}^{N} a_i \, \rho(\lVert \mathbf{x} - \mathbf{c}_i \rVert)$$

where $N$ is the number of neurons in the hidden layer, $\mathbf{c}_i$ is the center vector for neuron $i$, and $a_i$ are the weights of the linear output neuron. In the basic form all inputs are connected to each hidden neuron. The norm is typically taken to be the Euclidean distance and the basis function is taken to be Gaussian:

$$\rho(\lVert \mathbf{x} - \mathbf{c}_i \rVert) = \exp\left(-\beta \lVert \mathbf{x} - \mathbf{c}_i \rVert^2\right)$$

The Gaussian basis functions are local in the sense that

$$\lim_{\lVert \mathbf{x} \rVert \to \infty} \rho(\lVert \mathbf{x} - \mathbf{c}_i \rVert) = 0,$$

that is, changing the parameters of one neuron has only a small effect for input values that are far away from the center of that neuron.

RBF networks are universal approximators on a compact subset of $\mathbb{R}^n$. This means that an RBF network with enough hidden neurons can approximate any continuous function with arbitrary precision. The weights $a_i$, the centers $\mathbf{c}_i$, and $\beta$ are determined in a manner that optimizes the fit between $\varphi$ and the data.
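The network output described in this section can be illustrated with a short Python sketch. It is not taken from the report: the function names, the two-neuron network, and the value of beta are assumptions chosen only to show the weighted sum of Gaussian basis functions and their local behaviour.

```python
import math

def gaussian_rbf(x, center, beta):
    """Gaussian basis function exp(-beta * ||x - center||^2)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-beta * sq_dist)

def rbf_output(x, centers, weights, beta):
    """Network output: weighted sum of the hidden-layer activations."""
    return sum(a * gaussian_rbf(x, c, beta) for a, c in zip(weights, centers))

# Hypothetical two-neuron network on 2-D inputs (illustrative values only).
centers = [(0.0, 0.0), (1.0, 1.0)]
weights = [2.0, -1.0]
beta = 1.0

at_center = rbf_output((0.0, 0.0), centers, weights, beta)   # dominated by neuron 1
far_away = rbf_output((50.0, 50.0), centers, weights, beta)  # both neurons ~ 0
```

Evaluating the network far from every center gives an output close to zero, which is the locality property stated above.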


Fig 4 Architecture of a radial basis function network.

3.4 Training

In an RBF network there are three types of parameters that need to be chosen to adapt the network for a particular task: the center vectors $\mathbf{c}_i$, the output weights $w_i$ (written $a_i$ in Section 3.3), and the RBF width parameters $\beta_i$. In sequential training the weights are updated at each time step as data streams in.

For some tasks it makes sense to define an objective function and select the parameter values that minimize its value. The most common objective function is the least squares function

$$K(\mathbf{w}) = \sum_{t} K_t(\mathbf{w})$$

where

$$K_t(\mathbf{w}) = \left[ y(t) - \varphi(\mathbf{x}(t), \mathbf{w}) \right]^2 .$$

Here $\varphi(\mathbf{x}(t), \mathbf{w})$ denotes the network output for the input $\mathbf{x}(t)$ and $y(t)$ the corresponding target value. We have explicitly included the dependence on the weights. Minimization of the least squares objective function by optimal choice of weights optimizes accuracy of fit.

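3.5 Interpolation

RBF networks can be used to interpolate a function $y : \mathbb{R}^n \to \mathbb{R}$ when the values of that function are known on a finite number of points: $y(\mathbf{x}_i) = b_i$, $i = 1, \ldots, N$. Taking the known points $\mathbf{x}_i$ to be the centers of the radial basis functions and evaluating the values of the basis functions at the same points, $g_{ij} = \rho(\lVert \mathbf{x}_j - \mathbf{x}_i \rVert)$, the weights can be solved from the equation

$$\begin{bmatrix} g_{11} & \cdots & g_{1N} \\ \vdots & \ddots & \vdots \\ g_{N1} & \cdots & g_{NN} \end{bmatrix} \begin{bmatrix} w_1 \\ \vdots \\ w_N \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_N \end{bmatrix}$$

It can be shown that the interpolation matrix in the above equation is non-singular if the points $\mathbf{x}_i$ are distinct, and thus the weights $\mathbf{w}$ can be solved by simple linear algebra:

$$\mathbf{w} = \mathbf{G}^{-1} \mathbf{b}$$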
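A minimal Python sketch of strict interpolation follows. It assumes one-dimensional inputs, a Gaussian basis function with beta = 1, and samples of sin(x) as the known values; the helper `solve_linear` is a plain Gaussian elimination written for this illustration, not a routine from the report.

```python
import math

def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * w[k] for k in range(r + 1, n))
        w[r] = (m[r][n] - tail) / m[r][r]
    return w

def rho(r, beta=1.0):
    """Gaussian radial basis function of the distance r (beta is assumed)."""
    return math.exp(-beta * r * r)

# Known points x_i (also used as centers) and known values b_i.
xs = [0.0, 1.0, 2.0, 3.0]
bs = [math.sin(x) for x in xs]

# Interpolation matrix: entry (j, i) is rho(|x_j - x_i|).
G = [[rho(abs(xj - xi)) for xi in xs] for xj in xs]
w = solve_linear(G, bs)

def interpolant(x):
    """Weighted sum of basis functions centered on the known points."""
    return sum(wi * rho(abs(x - xi)) for wi, xi in zip(w, xs))
```

Because the points are distinct, the system has a unique solution and `interpolant` reproduces each known value at its own point.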

3.6 Function approximation

If the purpose is not to perform strict interpolation but instead more general function approximation or classification, the optimization is somewhat more complex because there is no obvious choice for the centers. The training is typically done in two phases: first fixing the widths and centers, and then the weights. This can be justified by considering the different nature of the non-linear hidden neurons versus the linear output neuron.


3.7 Training the basis function centers

Basis function centers can be either randomly sampled among the input instances or found by clustering the samples and choosing the cluster means as the centers. The RBF widths are usually all fixed to the same value, which is proportional to the maximum distance between the chosen centers.

3.8 Pseudoinverse solution for the linear weights

After the centers $\mathbf{c}_i$ have been fixed, the weights that minimize the error at the output are computed with a linear pseudoinverse solution:

$$\mathbf{w} = \mathbf{G}^{+} \mathbf{b},$$

where $\mathbf{b}$ is the vector of target values and the entries of $\mathbf{G}$ are the values of the radial basis functions evaluated at the points $\mathbf{x}_j$: $g_{ji} = \rho(\lVert \mathbf{x}_j - \mathbf{c}_i \rVert)$. The existence of this linear solution means that, unlike multi-layer perceptron (MLP) networks, RBF networks have a unique local minimum (when the centers are fixed).
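The two-phase procedure (fix centers and width, then solve for the linear weights) might be sketched in Python as below. Several choices here are assumptions made for illustration and do not come from the report: centers are picked at even intervals from the inputs as a deterministic stand-in for random sampling, the shared width uses the common heuristic sigma = d_max / sqrt(2M), and the weights are obtained from the ridge-regularized normal equations, which approach the pseudoinverse solution as the ridge term goes to zero when G has full column rank.

```python
import math

def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * w[k] for k in range(r + 1, n))
        w[r] = (m[r][n] - tail) / m[r][r]
    return w

def fit_rbf(xs, targets, n_centers, ridge=1e-6):
    # Phase 1: fix the centers (evenly spaced picks from the inputs) and one width.
    step = (len(xs) - 1) / (n_centers - 1)
    centers = [xs[round(i * step)] for i in range(n_centers)]
    d_max = max(centers) - min(centers)
    beta = n_centers / d_max ** 2  # assumed heuristic: sigma = d_max / sqrt(2 M)
    # Phase 2: least squares for the weights via (G^T G + ridge I) w = G^T b.
    g = [[math.exp(-beta * (x - c) ** 2) for c in centers] for x in xs]
    gtg = [[sum(row[i] * row[j] for row in g) + (ridge if i == j else 0.0)
            for j in range(n_centers)] for i in range(n_centers)]
    gtb = [sum(row[i] * t for row, t in zip(g, targets)) for i in range(n_centers)]
    return centers, beta, solve_linear(gtg, gtb)

def predict(x, centers, beta, weights):
    return sum(w * math.exp(-beta * (x - c) ** 2) for w, c in zip(weights, centers))

# Illustrative data: 40 samples of sin(x) approximated with 8 hidden neurons.
xs = [2 * math.pi * k / 40 for k in range(40)]
targets = [math.sin(x) for x in xs]
centers, beta, weights = fit_rbf(xs, targets, n_centers=8)
sse = sum((predict(x, centers, beta, weights) - t) ** 2 for x, t in zip(xs, targets))
```

The small ridge term plays the role of the shrinkage mentioned earlier and keeps the normal equations well conditioned when neighbouring basis functions overlap strongly.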

3.9 Advantages/Disadvantages

• An RBF network trains faster than an MLP.

• Another advantage that is claimed is that the hidden layer is easier to interpret than the hidden layer in an MLP.

• Although the RBF network is quick to train, once training is finished and it is in use it is slower than an MLP, so where speed is a factor an MLP may be more appropriate.


CHAPTER 4 OVERVIEW OF MOTION DETECTION ALGORITHMS

Given a number of sequential video frames from the same source, the goal is to detect motion in the area observed by that source. When there is no motion, all sequential frames should be identical up to the influence of noise. When motion is present, there is some difference between the frames. Every low-cost system is affected by noise to some degree, so even without motion two sequential frames will not be identical. This is why the system must be smart enough to distinguish between noise and real motion.

When the system is calibrated and stable enough, the character of the noise is that every pixel value may differ slightly from its value in the other frame. As a first approximation it is possible to define a noise threshold per pixel (adaptable to any given state) that specifies by how much the value of a pixel at the same position in two sequential frames may differ while still being treated as the same value. More precisely, if the pixel with coordinates (Xa, Ya) in frame A differs from the pixel with coordinates (Xb, Yb) in frame B by less than the TPP (threshold per pixel) value, we treat them as pixels with equal values. As a formula:

Pixel(Xa, Ya) is equal to Pixel(Xb, Yb) if abs(Pixel(Xa, Ya) − Pixel(Xb, Yb)) < TPP

By adapting the TPP value to the current system state we can make the system robust to noise. After applying this threshold operation to every pixel pair we may assume that the pre-processed pixel values are noise-free; any noise that is not cancelled will be small relative to the rest of the signal. These values must then be post-processed to detect motion, if any. As mentioned above, we have to work with the pixels that differ between two sequential frames to reach a conclusion about motion.


First, to keep the system sensitive enough, the TPP value must not be set too high. This means that, with the sensitivity kept high, any two frames will contain a small number (depending on TPP) of differing pixels, and these few pixels should not be treated as motion. This is the first reason to define a TPF (threshold per frame) value (adaptable to any given state), which specifies the minimum number of pixels that must differ between two sequential frames for the difference to count as motion. The second reason for TPF is to filter out small motion. For instance, by tuning the TPF value we can ignore the motion of small objects (insects, etc.) while still detecting the motion of people. The exact meaning of TPF can be written as a formula:

Let NDPPP be the Number of Different Pre-Processed (by TPP) Pixels. Then there is motion if and only if NDPPP > TPF.
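The two thresholds can be combined in a few lines of Python. This is a sketch under stated assumptions: frames are grayscale images stored as lists of rows, and the function names, frame contents, and threshold values are hypothetical.

```python
def count_different_pixels(frame_a, frame_b, tpp):
    """NDPPP: number of pixel pairs whose absolute difference is at least TPP."""
    return sum(
        1
        for row_a, row_b in zip(frame_a, frame_b)
        for pa, pb in zip(row_a, row_b)
        if abs(pa - pb) >= tpp  # pairs differing by less than TPP count as equal
    )

def motion_detected(frame_a, frame_b, tpp, tpf):
    """Motion is reported only when NDPPP exceeds the per-frame threshold TPF."""
    return count_different_pixels(frame_a, frame_b, tpp) > tpf

# Illustrative 3x4 grayscale frames.
reference = [[10, 10, 10, 10], [10, 10, 10, 10], [10, 10, 10, 10]]
noisy = [[11, 9, 10, 12], [10, 10, 8, 10], [10, 11, 10, 10]]        # sensor noise only
moved = [[10, 10, 10, 10], [10, 200, 200, 10], [10, 200, 200, 10]]  # a bright object appears

noise_count = count_different_pixels(reference, noisy, tpp=5)
noise_is_motion = motion_detected(reference, noisy, tpp=5, tpf=2)
object_is_motion = motion_detected(reference, moved, tpp=5, tpf=2)
```

With these values the noisy frame produces no differing pixels after TPP thresholding, while the four changed pixels of the second frame exceed TPF and are reported as motion.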

Both the TPP and TPF values can be adjusted through the UI to obtain the optimal system sensitivity. The thresholding also has a visual equivalent, which is used as follows. After the pixel pre-processing (by TPP), all static pixels (those that do not indicate motion) are colored, say, black, and all dynamic pixels (those that indicate motion) keep their original color. This produces a motion extraction effect: all static parts of the frame are black, and only the moving parts are displayed normally. This effect can be enabled or disabled through the GUI.

The Camera Manager provides routines for acquiring video frames from CCD cameras. Any process can request a video frame from any video source. The system manages a request queue for each source and executes the requests cyclically.
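The motion extraction effect described above can be sketched as a simple mask in Python. The function name, the sample frames, and the use of 0 as the black background value are assumptions for illustration.

```python
def extract_motion(previous, current, tpp, background=0):
    """Keep pixels of `current` that changed by at least TPP; blank the rest."""
    return [
        [cur if abs(cur - prev) >= tpp else background
         for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]

# Illustrative 2x3 grayscale frames: only the middle column really changes.
previous = [[10, 10, 10], [10, 10, 10]]
current = [[11, 90, 10], [10, 95, 9]]
masked = extract_motion(previous, current, tpp=5)
```

Static pixels, including those that differ only by noise below TPP, are replaced by the background value, so only the moving region remains visible.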


CHAPTER 5 PROPOSED DETECTION SYSTEM

This chapter presents the main software design and implementation issues. It starts by describing the general flow chart of the main program, which was implemented in MATLAB. It then explains each component of the flow chart in some detail. Finally, it shows how the graphical user interface (GUI) was designed.

5.1 Basic Architecture

Fig 5 A basic architecture of surveillance system

The above block diagram shows the surveillance system, which consists of a camera system that monitors a particular area, a video daughter card that converts the video signal to an electrical signal, a network card that connects the system to a network, and the motion detection algorithms (SAD and correlation) together with the RBF network.


5.2 Main Program Flow Chart

The main task of the software was to read the still images recorded from the camera and then process these images to detect motion and take the necessary actions accordingly. Figure 6 below shows the general flow chart of the main program.

[Flow chart: Start → Setup & Initializations → check Flag value. Flag = 1: Image Acquisition → Motion Detection Algorithm → "Is image > threshold?" (Yes: Actions on Motion Detection; No: return to the flag check). Flag = 0: Break & Clear → Data Record → Stop.]

Figure 6 Main Program Flow Diagram


The program starts with a general initialization of the software parameters and the setup of objects. Once the program has started, the flag value, which indicates whether the stop button has been pressed, is checked. If the stop button has not been pressed, the program reads the images and processes them using whichever of the two algorithms the operator selected. If motion is detected, it starts a series of actions and then goes back to read the next images; otherwise it goes directly to reading the next images. Whenever the stop button is pressed, the flag value is set to zero and the program stops, memory is cleared, and the necessary results are recorded. This terminates the program and returns control to the operator to collect the results. The next sections explain each process of the flow chart in Figure 6 in some detail.
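The control flow just described can be outlined in Python, although the report's implementation was written in MATLAB. Everything named here (`run_detection`, `score`, `stop_requested`, and the toy frame values) is hypothetical and only mirrors the structure of the loop.

```python
def run_detection(frame_pairs, score, threshold, stop_requested=lambda: False):
    """Process frame pairs until they run out or a stop is requested."""
    scores, detections = [], []
    for index, (first, second) in enumerate(frame_pairs):
        if stop_requested():           # stop button pressed: break and clear
            break
        value = score(first, second)   # selected motion detection algorithm
        scores.append(value)           # every value is kept for later plotting
        if value > threshold:
            detections.append(index)   # stand-in for the actions on detection
    return scores, detections          # data record

# Toy usage: each "frame" is a single number and the score is their difference.
pairs = [(1, 1), (1, 9), (2, 2)]
scores, detections = run_detection(pairs, lambda a, b: abs(a - b), threshold=3)
```

Passing the scoring function as a parameter reflects the operator's choice between the two algorithms without changing the loop itself.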

5.2.1 Setup and Initializations

[Flow chart: Start → Launch GUI → wait until the Start button is pressed → Read Threshold Value → Read Algorithm Type → Setup Serial Port → Setup Video Object → Stop.]

Figure 7 Setup and Initializations Process


Figure 7 shows the flow chart for the setup and initialization process. This process includes the launch of the graphical user interface (GUI), where the type of motion detection algorithm is selected and the threshold value (the sensitivity of the detection) is initialized. During this stage the serial port and the video object are also set up. This process takes approximately 15 seconds to complete, depending on the specifications of the PC used. For the serial port, it starts by selecting a communication port and reserving the memory addresses for that port; the PC then connects to the device using the communication settings mentioned in the previous chapter. The video object is part of the image acquisition process, but it has to be set up at the start of the program.

5.2.2 Image acquisition

[Flow chart: Start → Read First Frame → Convert to Grayscale → Read Second Frame → Convert to Grayscale → Stop.]

Figure 8 Image Acquisition Process


After the setup stage, image acquisition starts as shown in Figure 8 above. This process reads images from the PC camera and saves them in a format suitable for the motion detection algorithm.

There were three possible options, of which one was implemented. The first option was to use auto-snapshot software that takes images automatically and saves them on a hard disk in JPEG format, with another program then reading these images in the same sequence in which they were saved. It was found that the maximum speed attainable with this software is one frame per second, which limits the speed of detection. In addition, synchronization was required between the image processing and the auto-snapshot software, because the next images need to be available on the hard disk before they can be processed.

The second option was to display live video on the screen and then capture the images from the screen. This is faster than the previous approach, but it again faced a synchronization problem: when the computer monitor goes into power saving mode, black images are produced for as long as the screen is blank.

The third option was to use the Image Acquisition Toolbox provided with MATLAB 6.5.1 or higher. The Image Acquisition Toolbox is a collection of functions that extend the capability of MATLAB. The toolbox supports a wide range of image acquisition operations, including acquiring images through many types of image acquisition devices, such as frame grabbers and USB PC cameras, previewing the live video on the monitor, and reading the image data directly into the MATLAB workspace.

For this project the videoinput function was used to initialize a video object that connects directly to the PC camera. The preview function was then used to display live video on the monitor, and the getsnapshot function was used to read images from the camera and place them in the MATLAB workspace.


The last approach was implemented because it has many advantages over the others. It achieved the fastest capture speed, at a rate of five frames per second, depending on algorithm complexity and PC processor speed. Furthermore, the problem of synchronization was solved because both capturing and processing of images were done using the same software.

All images read were converted into two-dimensional monochrome images, because the equations of the other algorithms in the system were designed for this image format.

5.2.3 Motion Detection Algorithm

A motion detection algorithm was applied to the previously read images. There were two approaches to implementing the motion detection algorithm: the first used two-dimensional cross-correlation, and the second used the sum of absolute differences. These are explained in detail in the following sections.

5.3 Motion Detection Using Sum of Absolute Difference (SAD)

This algorithm is based on image differencing techniques. It is mathematically represented using the following equation:

$$D(t) = \frac{1}{N} \sum \left| I(t_i) - I(t_j) \right|$$

where $N$ is the number of pixels in the image, used as a scaling factor, $I(t_i)$ is the image $I$ at time $i$, $I(t_j)$ is the image $I$ at time $j$, and $D(t)$ is the normalized sum of absolute differences for that time.

In an ideal case when there is no motion,

$$I(t_i) = I(t_j)$$

and $D(t) = 0$. However, noise is always present in images, and a better model of the images in the absence of motion is

$$I(t_i) = I(t_j) + n(p)$$

where $n(p)$ is a noise signal. The value $D(t)$, which represents the normalized sum of absolute differences, can be used as a reference to be compared with a threshold value, as shown in Figure 9 below.
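The normalized sum of absolute differences can be computed as in the Python sketch below. It assumes grayscale frames stored as lists of rows; the function name and the sample pixel values are illustrative and not part of the report's MATLAB code.

```python
def normalized_sad(frame_a, frame_b):
    """D(t): sum of absolute pixel differences divided by the pixel count N."""
    n_pixels = sum(len(row) for row in frame_a)
    total = sum(
        abs(pa - pb)
        for row_a, row_b in zip(frame_a, frame_b)
        for pa, pb in zip(row_a, row_b)
    )
    return total / n_pixels

# Illustrative 2x2 frames.
frame_1 = [[10, 20], [30, 40]]
frame_2 = [[12, 20], [30, 36]]

sad_identical = normalized_sad(frame_1, frame_1)  # ideal no-motion case
sad_changed = normalized_sad(frame_1, frame_2)    # (2 + 0 + 0 + 4) / 4
```

Identical frames give D(t) = 0, and any noise or motion raises the value, which is then compared with the detection threshold.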

The figure also shows a test case that contains a large change in the scene being monitored by the camera, produced by moving the camera. Before the camera was moved the SAD value was around 1.87, and once the camera had been moved the SAD value was around 2.2. If the detection threshold is fixed at a value below 2.2, the system will continuously detect motion after the camera stops moving.

Figure 9 Direct Thresholds for SAD Values


This approach removes the need to continuously re-estimate the threshold value. Choosing a threshold of 1 × 10^-3 detects only the times at which the camera is moved. This results in a robust motion detection algorithm that is not affected by illumination changes and camera movements.

5.3.1 Actions on Motion Detection

Before explaining the series of actions that take place when motion is detected, it is worth mentioning that the calculated variance values, whether above or below the threshold, are stored in an array, which is used later to produce a plot of frame number vs. variance value. This plot helps in comparing the variance values against the threshold in order to choose the optimum threshold value. Whenever the variance value is less than the threshold, the image is dropped and only the variance value is recorded. When the variance value is greater than the threshold, however, a sequence of actions is started, as shown in Figure 10 below.

[Flow chart boxes: Start; Trigger Serial Port; Update Log File (time, date, frame number); Display Image; Convert Image to Frame; Stop.]

Figure 10 Actions on Motion Detection


As the above flow chart shows, a number of activities take place when motion is detected. First, the serial port is triggered by a pulse from the PC; this pulse is used to activate external circuits connected to the PC. A log file is also created and then appended with the time and date of the motion and the number of the frame in which the motion occurred. Another action is to display the detected image on the monitor. Finally, the image in which motion was detected is converted to a movie frame and added to the movie structure.

5.3.2 Break and clear Process

After the motion detection algorithm has been applied to the images, the program checks whether the stop button on the GUI was pressed. If it was, the flag value is changed from one to zero, and the program breaks out of the loop and returns control to the GUI. Next, both the serial port object and the video object are cleared. This process is considered a cleaning stage, in which the devices connected to the PC through those objects are released and the memory space is freed.

5.3.3 Data Record

Finally, when the program is terminated, a data collection process starts in which the variables and arrays that hold the results in memory are stored on the hard disk. This approach was used to separate the real-time image processing from the processing of results, and has the advantage that the data can be recalled whenever required. The variables stored from memory to the hard disk are the variance values and the movie structure that contains all the frames with motion. At this point control is returned to the GUI, where the operator can call up the results that were archived while the system was running. The next section explains the design of the GUI, highlighting the results and callbacks of each button.


[Flow chart: START → IMAGE ACQUISITION → FRAME SEPARATION → DIVIDE QUADRANTS → SUM OF ABSOLUTE DIFFERENCE → "> T" decision → DATA RECORD.]

Fig 11 Flow chart for SAD algorithm

Fig 12 Frame separation

Fig 13 Divide Quadrants


5.3.4 Graphical User Interface Design

The GUI was designed to facilitate interactive system operation. It can be used to set up the program, launch it, stop it, and display results.

[Flow chart: Start → Clear all Previous Work → Variable Initialization & Setup → Launch Program → Call Selected Main Program → Terminate Program → View Results → "Start Again" decision (Yes / No) → Exit → End.]

Figure 14 GUI Flow Chart


During the setup stage the operator is prompted to choose a motion detection algorithm and select the degree of detection sensitivity. Whenever the start/stop toggle button is pressed, the system is launched and the selected program is called to perform the calculations until the start/stop button is pressed again, which terminates the calculation and returns control to the GUI. Results can be viewed as a log file, a movie, and a plot of frame number vs. variance value. Figure 14 illustrates the steps performed using the GUI.

5.4 Motion detection using Correlation Network 

A correlation neural network (CNN), which accounts for the velocity-sensitive responses of neurons, is suitable for analog circuit implementation of motion detection systems and has been successfully implemented in CMOS. The CNN uses local motion detectors to correlate signals sampled at one location in the image with those sampled after a delay at adjacent locations; however, an edge-detection process is required in practical motion detection systems based on CNNs.

The term correlation can also mean the cross-correlation of two functions or electron correlation in molecular systems. In probability theory and statistics, correlation, also called the correlation coefficient, indicates the strength and direction of a linear relationship between two random variables. In general statistical usage, correlation or co-relation refers to the departure of two variables from independence, although correlation does not imply causation. In this broad sense there are several coefficients measuring the degree of correlation, adapted to the nature of the data and used in different situations. The best known is the Pearson product-moment correlation coefficient, which is obtained by dividing the covariance of the two variables by the product of their standard deviations.


5.4.1 Mathematical properties

The correlation $\rho_{X,Y}$ between two random variables $X$ and $Y$ with expected values $\mu_X$ and $\mu_Y$ and standard deviations $\sigma_X$ and $\sigma_Y$ is defined as:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E\left((X - \mu_X)(Y - \mu_Y)\right)}{\sigma_X \sigma_Y}$$

where $E$ is the expected value operator and cov means covariance. Since $\mu_X = E(X)$, $\sigma_X^2 = E(X^2) - E^2(X)$, and likewise for $Y$, we may also write

$$\rho_{X,Y} = \frac{E(XY) - E(X)E(Y)}{\sqrt{E(X^2) - E^2(X)} \, \sqrt{E(Y^2) - E^2(Y)}}$$

The correlation is defined only if both standard deviations are finite and nonzero. It is a corollary of the Cauchy-Schwarz inequality that the correlation cannot exceed 1 in absolute value.

The correlation is 1 in the case of an increasing linear relationship, −1 in the case of a decreasing linear relationship, and some value in between in all other cases, indicating the degree of linear dependence between the variables. The closer the coefficient is to either −1 or 1, the stronger the correlation between the variables.

If the variables are independent then the correlation is 0, but the converse is not true, because the correlation coefficient detects only linear dependencies between two variables. Here is an example: suppose the random variable $X$ is uniformly distributed on the interval from −1 to 1, and $Y = X^2$. Then $Y$ is completely determined by $X$, so that $X$ and $Y$ are dependent, but their correlation is zero; they are uncorrelated. However, in the special case when $X$ and $Y$ are jointly normal, independence is equivalent to uncorrelatedness.

A correlation between two variables is diluted in the presence of measurement error around estimates of one or both variables, in which case disattenuation provides a more accurate coefficient.
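The properties above can be checked numerically with a short Python sketch. It is illustrative only: the function name `pearson` is an assumption, and an evenly spaced grid on the interval from −1 to 1 is used as a stand-in for a uniformly distributed variable.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation: covariance over the product of std deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)

# Evenly spaced values on [-1, 1], symmetric about zero.
xs = [k / 100 for k in range(-100, 101)]

r_square = pearson(xs, [x * x for x in xs])        # Y = X^2: dependent, uncorrelated
r_increasing = pearson(xs, [3 * x + 2 for x in xs])  # increasing linear relationship
r_decreasing = pearson(xs, [-x for x in xs])         # decreasing linear relationship
```

The quadratic case gives a coefficient of essentially zero even though Y is fully determined by X, while the two linear cases give values of 1 and −1 up to rounding.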


5.4.2 Geometric Interpretation of correlation

The correlation coefficient can also be viewed as the cosine of the angle between the two vectors of samples drawn from the two random variables. This method only works with centered data, i.e., data which have been shifted by the sample mean so as to have an average of zero. Some practitioners prefer an uncentered (non-Pearson-compliant) correlation coefficient. See the example below for a comparison.

As an example, suppose five countries are found to have gross national products of 1, 2, 3, 5, and 8 billion dollars, respectively. Suppose these same five countries (in the same order) are found to have 11%, 12%, 13%, 15%, and 18% poverty. Then let x and y be ordered 5-element vectors containing the above data: x = (1, 2, 3, 5, 8) and y = (0.11, 0.12, 0.13, 0.15, 0.18). By the usual procedure for finding the angle between two vectors, the uncentered correlation coefficient is:

$$\cos\theta = \frac{\mathbf{x} \cdot \mathbf{y}}{\lVert \mathbf{x} \rVert \, \lVert \mathbf{y} \rVert} = \frac{2.93}{\sqrt{103} \, \sqrt{0.0983}} \approx 0.9208$$

Note that the above data were deliberately chosen to be perfectly correlated: y = 0.10 + 0.01x. The Pearson correlation coefficient must therefore be exactly one. Centering the data (shifting x by E(x) = 3.8 and y by E(y) = 0.138) yields x = (−2.8, −1.8, −0.8, 1.2, 4.2) and y = (−0.028, −0.018, −0.008, 0.012, 0.042), from which

$$\cos\theta = \frac{0.308}{\sqrt{30.8} \, \sqrt{0.00308}} = 1 = \rho_{xy}$$

as expected.
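The worked example can be reproduced in Python as a check. The helper names `cosine` and `center` are illustrative; the data are the five values used in the text.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two sample vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def center(v):
    """Shift a vector by its sample mean so that it averages to zero."""
    mean = sum(v) / len(v)
    return [a - mean for a in v]

x = [1, 2, 3, 5, 8]
y = [0.11, 0.12, 0.13, 0.15, 0.18]

uncentered = cosine(x, y)                # about 0.9208
centered = cosine(center(x), center(y))  # Pearson coefficient, 1 up to rounding
```

Only the centered version recovers the Pearson coefficient; the uncentered cosine is pulled below one by the non-zero means of the two vectors.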


5.4.3 Interpretation of the size of a correlation

Several authors have offered guidelines for the interpretation of a correlation coefficient. Cohen (1988), for example, has suggested the interpretations for correlations in psychological research shown in Table 1 below. As Cohen himself has observed, however, all such criteria are in some ways arbitrary and should not be observed too strictly, because the interpretation of a correlation coefficient depends on the context and purposes. A correlation of 0.9 may be very low if one is verifying a physical law using high-quality instruments, but may be regarded as very high in the social sciences, where there may be a greater contribution from complicating factors.

Correlation    Negative          Positive
Small          −0.29 to −0.10    0.10 to 0.29
Medium         −0.49 to −0.30    0.30 to 0.49
Large          −1.00 to −0.50    0.50 to 1.00

Table 1

Fig 15 A unit network of the two-dimensional CNN.


[Flow chart: START → IMAGE ACQUISITION → FRAME SEPARATION → DIVIDE QUADRANTS → CORRELATION NETWORK → Decision → DATA RECORD.]

Fig 16 Flow chart for correlation


CHAPTER 6 PROPOSED OBJECT EXTRACTION

Many attempts have been made to extract data from video and film in a form suitable for use by animators and modelers. Such an approach is attractive, since motions and movements of people and animals may be obtained in this way that would be difficult to obtain using mechanical or magnetic motion capture systems. Visual extraction is also appealing since it is non-intrusive and has the potential to capture, from film, the motion and characteristics of people or animals long dead or extinct.

Almost all attempts to perform visual extraction have been based around bespoke computer vision applications which are difficult for non-experts to use or adapt to their own needs. This paper presents a generic approach to extracting data from video. Whilst our approach allows low-level information to be extracted, we show that higher-level functionality is also available. This functionality can be utilized in a manner that requires little knowledge of the underlying techniques and principles. Our approach is to approximate an image using principal component analysis, and then to train a multi-layer perceptron to predict the feature required by the user. This requires the user to hand-label the features of interest in some of the frames of the image sequence. One of the aims of this work is to keep to a minimum the number of frames that need to be labeled by the user. The trained multi-layer perceptron is then used to predict features for images that have never been labeled by the user.

Other attempts to extract useful information from video sequences include the use of edge-detection and contour or edge tracking, template matching and template tracking. All such systems work well in some circumstances, but fail or require adaptation to meet the requirements of new users. For instance, in the case of template tracking, the user needs to be aware of the kinds of features that can be tracked well in an image and also choose a suitable template size. This is not a trivial task for non-specialists.


6.1 Method

The main steps in extraction using our system are detailed below.

The user selects the sequence (or set) of images from which they wish data to be extracted. This may well comprise several shorter clips taken from different parts of a film. These images are pre-processed (using principal components analysis) to reduce each image to a small set of numbers.

The user decides what feature(s) they wish to extract and labels this feature by hand in a fraction of the images, chosen at random. The labeling process may involve clicking on a point to be tracked, labeling a distance or ratio of distances, measuring an angle, making a binary decision (yes/no, near/far etc.) or classifying the feature of interest into one of several classes. Once this ground-truth data is available, a neural network is trained to predict the feature values in images that have not been labeled by the user.

6.2 Feature Extraction

Principal components analysis (also known as eigenvector analysis) has been used extensively in computer vision for image reconstruction, pattern matching and classification.

Given the $i$-th image in a sequence of images, each of which consists of $M$ pixels, we form the vector $\mathbf{x}_i$ by concatenating the pixels of the image in raster scan order and removing the mean image of the sequence. The matrix $X$ is created using the $\mathbf{x}_i$'s as column vectors. Traditionally, the principal modes $\mathbf{q}_i$ are extracted by computing

$$X X^T \mathbf{q}_i = \lambda_i \mathbf{q}_i \qquad (1)$$

where the $\lambda_i$'s are the eigenvalues, a measure of the amount of variance each of the eigenvectors accounts for. Unfortunately, the matrix $X X^T$ is typically too large to manipulate since it is of size $M$ by $M$. Such computation is wasteful anyway since only $N$ principal modes are meaningful, where $N$ is the number of example images. In all our work $N \ll M$. Therefore we compute

$$X^T X \mathbf{u}_i = \lambda_i \mathbf{u}_i \qquad (2)$$

and we can obtain the $\mathbf{q}_i$'s that we actually require using

$$\mathbf{q}_i = X \mathbf{u}_i \qquad (3)$$

In practice only the first P modes are used, P 30 N. The principal mode extracted from a short film clip is shown in Figure 1 and is used later to help an animator to construct a cartoon version of the clip.

It is tempting to think that such modes could be used directly to predict, say, the rotation of the man's shoulders. However, the second mode also encodes information about shoulder movement, and it is only by combining information from many modes that rotation can be reliably predicted.
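The computational shortcut of equations (2) and (3) can be illustrated with a small Python sketch. The three six-pixel "images" are made-up data, and power iteration is used here only as a simple stand-in eigen-solver for the leading mode; neither is taken from the original work.

```python
import math

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def leading_eigenpair(sym, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix."""
    v = [1.0 + 0.1 * i for i in range(len(sym))]  # arbitrary non-zero start
    for _ in range(iterations):
        w = matvec(sym, v)
        norm = math.sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(sym, v)))
    return eigenvalue, v

# Three hypothetical images of M = 6 pixels each (N = 3, so N < M).
images = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 13.0],
    [0.0, 1.0, 1.0, 2.0, 2.0, 3.0],
]
n, m = len(images), len(images[0])
mean = [sum(img[p] for img in images) / n for p in range(m)]
cols = [[img[p] - mean[p] for p in range(m)] for img in images]  # columns x_i of X

# Equation (2): work with the small N-by-N matrix X^T X.
gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
lam, u = leading_eigenpair(gram)

# Equation (3): q = X u, a linear combination of the centered images.
q = [sum(u[i] * cols[i][p] for i in range(n)) for p in range(m)]

# Check equation (1): X X^T q should equal lam * q.
xxt_q = [sum(cols[i][p] * sum(cols[i][r] * q[r] for r in range(m)) for i in range(n))
         for p in range(m)]
residual = max(abs(a - lam * b) for a, b in zip(xxt_q, q))
```

The eigenvector of the small matrix, mapped through X, satisfies the large eigenproblem with the same eigenvalue, so the M-by-M matrix never has to be formed for the decomposition itself.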


