Video surveillance system with object detection and probability scoring based on object class. Number: 7,127,083, from the United States Patent and Trademark Office (PTO)

Title: Video surveillance system with object detection and probability scoring based on object class

Abstract: A video surveillance system uses rule-based reasoning and multiple-hypothesis scoring to detect predefined behaviors based on movement through zone patterns. Trajectory hypothesis spawning allows for trajectory splitting and/or merging and includes local pruning to manage hypothesis growth. Hypotheses are scored based on a number of criteria, illustratively including at least one non-spatial parameter. Connection probabilities computed during the hypothesis spawning process are based on a number of criteria, illustratively including object size. Object detection and probability scoring is illustratively based on object class.

Patent Number: 7,127,083 Issued on 10/24/2006 to Han, et al.


Inventors: Han; Mei (Cupertino, CA), Gong; Yihong (Cupertino, CA), Tao; Hai (Santa Cruz, CA)
Assignee: Vidient Systems, Inc. (Sunnyvale, CA)
Appl. No.: 10/917,225
Filed: August 12, 2004


Current U.S. Class: 382/103 ; 340/541; 382/170; 382/171
Current International Class: G06K 9/00 (20060101)
Field of Search: 382/100,103,169-172 340/541



Primary Examiner: Mehta; Bhavesh M.
Assistant Examiner: Edwards; Patrick
Attorney, Agent or Firm: Slusky; Ronald D.

Parent Case Text



CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. provisional patent application No. 60/520,610 filed Nov. 17, 2003, incorporated herein by reference.
Claims



The invention claimed is:

1. A method for use in a video surveillance system in which is generated at least a first set of trajectories of a first set of objects of a particular object class hypothesized to have been moving through an area under surveillance at a previous point in time, the method comprising identifying objects of said particular class hypothesized to be in said area under surveillance at a current point in time, said particular class of objects being distinguishable from other objects based on the physical appearance of the objects of said particular class, at least one of said objects in said area under surveillance at said current point in time being identified independent of the physical appearance of any objects hypothesized to have been in said area under surveillance at said previous point in time, said identifying including analyzing individual portions of a video image of said area under surveillance to determine if said portions have features that are characteristic of objects in said particular class, extending at least ones of said first set of trajectories to at least ones of said identified objects to develop at least one set of extended trajectories, each of at least ones of said extended trajectories being a respective one of said first set of trajectories extended to at least one of the identified objects, and selecting, as an individual one of said portions, a portion of said video image that is selected independent of whether that portion is in the foreground of said video image.

2. The method of claim 1 wherein said foreground of said video image comprises portions of said image at said current point in time whose content is determined to have changed from the content of the spatially corresponding portion of said video image at a previous point in time.

3. The method of claim 1 wherein said individual one of said portions is selected based on its proximity to a terminating point of at least one of the trajectories of said first set of trajectories.

4. The method of claim 1 wherein said analyzing includes applying said individual portions of said video image to a neural network that has been trained to recognize objects in said particular object class.

5. The method of claim 4 wherein said neural network generates a score in response to each said applied portion of said video image and wherein said each applied portion is identified as being an object in said particular object class if said score is at least as large as a predetermined threshold.

6. The method of claim 1 further comprising selecting, as another individual one of said portions, a portion of the foreground of said video image at said current point in time, said selecting of another individual one of said portions being carried out independent of that portion's proximity to a terminating point of any of said at least first set of trajectories.

7. The method of claim 6 wherein said foreground of said video image comprises portions of said image at said current point in time whose content is determined to have changed from the content of the spatially corresponding portion of said video image at a previous point in time.

8. An electronic surveillance system adapted to carry out the method defined by claim 1.

9. A computer-readable medium on which are stored instructions that are executable by a processor to carry out the method defined by claim 1.
Description



BACKGROUND OF THE INVENTION

Multiple object tracking has been one of the most challenging research topics in computer vision. Indeed, accurate multiple object tracking is the key element of a video surveillance system in which object counting and identification are the basis for determining when security violations within the area under surveillance are occurring.

Among the challenges in achieving accurate tracking in such systems are a number of phenomena. These phenomena include a) false detections, meaning that the system erroneously reports the presence of an object, e.g., a human being, at a particular location within the area under surveillance at a particular time; b) missing data, meaning that the system has failed to detect the presence of an object in the area under surveillance that actually is there; c) occlusions, meaning that an object being tracked has "disappeared" behind another object being tracked or some fixed feature (e.g., column or partition) within the area under surveillance; d) irregular object motions, meaning, for example, that an object that was moving on a smooth trajectory has abruptly stopped or changed direction; and e) changing appearances of the objects being tracked due, for example, to changed lighting conditions and/or the object presenting a different profile to the tracking camera.

Among the problems of determining when security violations have occurred or are occurring is the unavailability of electronic signals that could be profitably used in conjunction with the tracking algorithms. Such signals include, for example, signals generated when a door is opened or when an access device, such as a card reader, has been operated. Certainly an integrated system built "from the ground up" could easily be designed to incorporate such signaling, but it may not be practical or economically justifiable to provide such signals to the tracking system when the latter is added to a facility after the fact.

SUMMARY OF THE INVENTION

The present invention addresses one or more of the above problems, and possibly other problems as well. The invention is particularly useful when implemented in a system that incorporates the inventions that are the subject of the co-pending United States patent applications listed at the end of this specification.

A video surveillance system embodying the principles of the invention computes at least a first set of trajectories of a first set of objects of a particular object class hypothesized to have been moving through an area under surveillance at a previous point in time. The particular class of objects is distinguishable from other objects based on said particular class of objects' physical appearance. Objects of that particular class hypothesized to be in said area under surveillance at a current point in time are identified, with at least one of those objects being identified independent of the physical appearance of any objects hypothesized to have been in said area under surveillance at a previous point in time. At least ones of the trajectories are extended to at least ones of the identified objects to develop at least one set of extended trajectories.

The class of objects is illustratively people.

The identification of the objects may include analyzing individual portions of a video image of said area under surveillance to determine if they have features that are characteristic of objects in said particular class using, for example, a neural network trained to recognize objects of the class in question.

The portions of the image that are analyzed may include the entire image but, for efficiency, may be limited to a) portions that seem to represent objects that are moving--this being the so-called "foreground" of the image, or b) a combination of a) with portions selected based on their proximity to a terminating point of at least one of the trajectories of the first set of trajectories.
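The selection of image portions described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a grayscale image stored as nested lists, divides it into square blocks, treats a block as foreground when its mean pixel change from the previous frame exceeds a threshold (simple frame differencing), and adds blocks lying near trajectory terminating points. The names `foreground_blocks`, `blocks_near_endpoints` and `candidate_blocks`, the block size, the threshold and the radius are all hypothetical.

```python
from typing import List, Set, Tuple

Frame = List[List[int]]   # grayscale image as rows of pixel intensities (assumed format)
Block = Tuple[int, int]   # (row, col) index of a block in the block grid


def foreground_blocks(prev: Frame, curr: Frame, block: int,
                      diff_threshold: float) -> Set[Block]:
    """Blocks whose mean absolute pixel change exceeds diff_threshold."""
    rows, cols = len(curr), len(curr[0])
    result: Set[Block] = set()
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            cells = [(r, c)
                     for r in range(r0, min(r0 + block, rows))
                     for c in range(c0, min(c0 + block, cols))]
            change = sum(abs(curr[r][c] - prev[r][c]) for r, c in cells) / len(cells)
            if change > diff_threshold:
                result.add((r0 // block, c0 // block))
    return result


def blocks_near_endpoints(endpoints: List[Tuple[int, int]], block: int, radius: int,
                          grid_rows: int, grid_cols: int) -> Set[Block]:
    """Blocks within `radius` blocks of any trajectory terminating point.

    Each endpoint is a (row, col) pixel coordinate.
    """
    result: Set[Block] = set()
    for pr, pc in endpoints:
        br, bc = pr // block, pc // block
        for r in range(max(0, br - radius), min(grid_rows, br + radius + 1)):
            for c in range(max(0, bc - radius), min(grid_cols, bc + radius + 1)):
                result.add((r, c))
    return result


def candidate_blocks(prev: Frame, curr: Frame, endpoints: List[Tuple[int, int]],
                     block: int = 2, diff_threshold: float = 10,
                     radius: int = 0) -> Set[Block]:
    """Option b) of the text: foreground blocks plus blocks near track endpoints."""
    grid_rows = -(-len(curr) // block)     # ceiling division
    grid_cols = -(-len(curr[0]) // block)
    return (foreground_blocks(prev, curr, block, diff_threshold)
            | blocks_near_endpoints(endpoints, block, radius, grid_rows, grid_cols))
```

Each returned block would then be passed to whatever classifier decides whether the portion shows an object of the class in question; that step is deliberately left out of the sketch.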

DRAWING

FIG. 1A is a block diagram of an image-based multiple-object tracking system embodying the principles of the invention;

FIG. 1B illustrates the operation of an alert reasoning portion of the system of FIG. 1A;

FIG. 1C is a flow diagram illustrating the operation of the alert reasoning portion of the system in detecting the occurrence of the alert conditions referred to as tailgating and piggy-backing;

FIGS. 2A through 2F depict various patterns of movement that are indicative of alarm conditions of a type that the system of FIG. 1A is able to detect;

FIGS. 3A and 3B depict two possible so-called hypotheses, each representing a particular unique interpretation of object detection data generated by the system of FIG. 1A over a period of time;

FIG. 4 is a generalized picture illustrating the process by which each of the hypotheses generated for a particular video frame can spawn multiple hypotheses and how the total number of hypotheses is kept to manageable levels;

FIG. 5 shows a process carried out by the hypothesis generation portion of the system of FIG. 1A in order to implement so-called local pruning;

FIG. 6 indicates how data developed during the processing carried out in FIG. 5 is used to spawn new hypotheses;

FIG. 7 shows a simplified example of how hypotheses are generated;

FIG. 8 shows the processing carried out within the hypothesis management portion of the system of FIG. 1A; and

FIGS. 9A through 9E graphically depict the process by which the system of FIG. 1A detects the presence of humans in the area under surveillance.

DETAILED DESCRIPTION

The image-based multiple-object tracking system in FIG. 1A is capable of tracking various kinds of objects as they move, or are moved, through an area under surveillance. Such objects may include, for example, people, animals, vehicles or baggage. In the present embodiment, the system is arranged to track the movement of people. Thus in the description that follows terms including "object," "person," "individual," "human" and "human object" are used interchangeably except in instances where the context would clearly indicate otherwise.

The system comprises three basic elements: video camera 102, image processing 103 and alert reasoning 132.

Video camera 102 is preferably a fixed or static camera that monitors and provides images as sequences of video frames. The area under surveillance is illustratively secure area 23 shown in FIG. 2A. A secured access door 21 provides access to authorized individuals from a non-secure area 22 into secure area 23. Individuals within area 22 are able to obtain access into area 23 by swiping an access card at an exterior access card reader 24 located near door 21 in non-secure area 22, thereby unlocking and/or opening door 21. In addition, an individual already within area 23 seeking to leave that area through door 21 does so by swiping his/her access card at an interior access card reader 25 located near door 21 in secure area 23.

Image processing 103 is software that processes the output of camera 102 and generates a so-called "top hypothesis" 130. This is a data structure that indicates the results of the system's analysis of some number of video frames over a previous period of time up to a present moment. The data in top hypothesis 130 represents the system's assessment as to a) the locations of objects most likely to actually presently be in the area under surveillance, and b) the most likely trajectories, or tracks, that the detected objects followed over the aforementioned period of time.

An example of one such hypothesis is shown in FIG. 3A. In this FIG., the nodes (small circles) represent object detections and the lines connecting the nodes represent the movement of objects between frames. This hypothesis is associated with the most recent video frame, identified by the frame number i+4. As seen in the rightmost portion of the FIG., four objects labeled Q, R, S and T were detected in frame i+4 and it has been determined--by having tracked those objects through previous frames, including frames i, i+1, i+2 and i+3--that those objects followed the particular trajectories shown. The frame index i illustratively advances at a rate of 10 frames/second, which provides a good balance between computational efficiency and tracking accuracy. FIG. 3A is described in further detail below.
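As a rough illustration of the kind of data structure described above, the sketch below models a hypothesis as a set of tracks, each an ordered list of per-frame detections. The class and field names (`Detection`, `Track`, `Hypothesis`, `position`, and so on) are hypothetical and chosen only for this example; the description does not specify a layout, and this simplified form does not represent trajectory splitting or merging.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Detection:
    """One node of the hypothesis graph: an object detected in one frame."""
    frame: int                       # frame index, e.g. i, i+1, ...
    position: Tuple[float, float]    # assumed (x, y) image coordinates


@dataclass
class Track:
    """One trajectory: the detections of a single object, ordered by frame."""
    label: str
    detections: List[Detection] = field(default_factory=list)

    def current_position(self) -> Tuple[float, float]:
        """Position of the most recent detection on this track."""
        return self.detections[-1].position


@dataclass
class Hypothesis:
    """One interpretation of the detection data up to a given frame."""
    frame: int
    tracks: List[Track] = field(default_factory=list)

    def current_objects(self) -> List[str]:
        """Labels of the objects whose track reaches the hypothesis frame."""
        return [t.label for t in self.tracks
                if t.detections and t.detections[-1].frame == self.frame]
```

With these definitions, a hypothesis for frame i+4 would hold one `Track` per labeled object, and `current_objects()` would list the objects believed to be present in that frame.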

Referring again to FIG. 1A, top hypothesis 130 is applied to alert reasoning 132. This is software that analyzes top hypothesis 130 with a view toward automatically identifying certain abnormal behaviors, or "alert conditions," that the system is responsible to identify based on predefined rules. It is, of course, possible to set up sensors to detect the opening and closing of a door. However, the image-based system disclosed herein provides a way of confirming if objects, e.g., people, have actually come through the door and, if so, how many and over what period of time.

The system utilizes a number of inventions to a) analyze the video data, b) determine the top hypothesis at any point in time and c) carry out the alert reasoning.

Alert Reasoning

The alert conditions are detected by observing the movement of objects, illustratively people, through predefined areas of the space under surveillance and identifying an alert condition as having occurred when area-related patterns are detected. An area-related pattern means a particular pattern of movement through particular areas, possibly in conjunction with certain other events, such as card swiping and door openings/closings. Thus certain of the alert conditions are identified as having occurred when, in addition to particular movements through the particular areas having occurred, one or more particular events also occur. If door opening/closing or card swiping information is not available, alert reasoning 132 is nonetheless able to identify at least certain alert conditions based on image analysis alone, i.e., by analyzing the top hypothesis.

As noted above, the objects tracked by this system are illustratively human beings and typical alert conditions are those human behaviors known as tailgating and piggy-backing. Tailgating occurs when one person swipes an access control card or uses a key or other access device that unlocks and/or opens a door and then two or more people enter the secure area before the door is returned to the closed and locked position. Thus, in the example of FIG. 2A, tailgating occurs if more than one person enters secure area 23 with only one card swipe having been made. This implies that after the card was swiped and door 21 was opened, two people passed through the door before it closed. Piggy-backing occurs when a person inside the secure area uses an access control card to open the door and let another person in. Thus in the example of FIG. 2A, piggy-backing occurs if a person inside secure area 23 swipes his/her card at reader 25 but instead of passing through door 21 into non-secure area 22 allows a different person, who is then in non-secure area 22, to pass through into secure area 23. An illustrative list of behaviors that the system can detect, in addition to the two just described, appears below.

Alert reasoning module 132 generates an alert code 134 if any of the predefined alert conditions appear to have occurred. In particular, based on the information in the top hypothesis, alert reasoning module 132 is able to analyze the behaviors of the objects--which are characterized by object counts, interactions, motion and timing--and thereby detect abnormal behaviors, particularly at sensitive zones, such as near the door zone or near the card reader(s). An alert code, which can, for example, include an audible alert generated by the computer on which the system software runs, can then be acted upon by an operator by, for example, reviewing a video recording of the area under surveillance to confirm whether tailgating, piggy-backing or some other alert condition actually did occur.

Moreover, since objects can be tracked on a continuous basis, alert reasoning module 132 can also provide traffic reports, including how many objects pass through the door in either direction or loiter at sensitive zones.

The analysis of the top hypothesis for purposes of identifying alert conditions may be more fully understood with reference to FIGS. 2A through 2F, which show the area under surveillance--secure area 23--divided into zones. There are illustratively three zones. Zone 231 is a door zone in the vicinity of door 21. Zone 232 is a swipe zone surrounding zone 231 and includes interior card reader 25. Door zone 231 and swipe zone 232 may overlap to some extent. Zone 233 is an appearing zone in which the images of people being tracked first appear and from which they disappear. The outer boundary of zone 233 is the outer boundary of the video image captured by camera 102.

Dividing the area under surveillance into zones enables the system to identify alert conditions. As noted above, an alert condition is characterized by the occurrence of a combination of particular events. One type of event is the appearance of a person in a given zone, such as the sudden appearance of a person in the door zone 231. Another type of event is the movement of a person in a given direction, such as the movement of the person who appeared in door zone 231 through swipe zone 232 to the appearing zone 233. This set of facts implies that someone has come into secure area 23 through door 21. Another type of event is an interaction, such as if the trajectories of two objects come from the door together and then split later. Another type of event is a behavior, such as when an object being tracked enters the swipe zone. Yet another type of event relates to the manipulation of the environment, such as someone swiping a card through one of card readers 24 and 25. The timing of events is also relevant to alert conditions, such as how long an object stays at the swipe zone and the time difference between two objects going through the door.
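The zone-based events described above can be illustrated with a small sketch that maps each tracked position to a zone and collapses a trajectory into the ordered sequence of zones it passes through. The rectangular zone geometry, the coordinates and the names `ZONES`, `zone_of` and `zone_sequence` are assumptions made for this example only. Because the door zone and swipe zone may overlap, the sketch checks the innermost zone first.

```python
from typing import List, Tuple

Point = Tuple[float, float]

# Hypothetical zone rectangles as (x_min, y_min, x_max, y_max), innermost first.
ZONES = [
    ("door", (4.0, 0.0, 6.0, 1.0)),         # door zone, near the door
    ("swipe", (3.0, 0.0, 7.0, 2.0)),        # swipe zone surrounding the door zone
    ("appearing", (0.0, 0.0, 10.0, 10.0)),  # rest of the camera's field of view
]


def zone_of(point: Point) -> str:
    """Name of the innermost zone containing the point, or 'outside'."""
    x, y = point
    for name, (x0, y0, x1, y1) in ZONES:
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return "outside"


def zone_sequence(track: List[Point]) -> List[str]:
    """Ordered zones visited by a track, with consecutive repeats collapsed."""
    sequence: List[str] = []
    for point in track:
        zone = zone_of(point)
        if not sequence or sequence[-1] != zone:
            sequence.append(zone)
    return sequence
```

Under these assumptions, a track that first appears at the door and walks into the room yields the sequence door, swipe, appearing, which is the movement pattern the text associates with someone entering through the door.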

Certain movement patterns represent normal, non-security-violative activities. For example, FIG. 2B shows a normal entrance trajectory 216 in which a person suddenly appears in door zone 231 and passes through swipe zone 232 and appearing zone 233 without any other trajectory being detected. As long as the inception of this trajectory occurred within a short time interval after a card swipe at card reader 24, the system interprets this movement pattern as a normal entrance by an authorized person. A similar movement pattern in the opposite direction is shown in FIG. 2C, this representing a normal exit trajectory 218.

FIGS. 2D through 2F illustrate patterns of multiple trajectories through particular zones that are regarded as alert conditions. FIG. 2D, in particular, illustrates a tailgating scenario. In this scenario, two trajectories 212 and 214 are observed to diverge from door zone 231 and/or from swipe zone 232. This pattern implies that two individuals came through the door at substantially the same time. This pattern, taken in conjunction with the fact that only a single card had been swiped through exterior card reader 24, would give rise to a strong inference that tailgating had occurred.

FIG. 2E depicts a piggy-backing scenario. Here, a person approaches swipe zone 232 and possibly door zone 231 along trajectory 222. The same individual departs from zones 231/232 along a return trajectory 224 at the same time that another individual appears in door zone 231 and moves away in a different direction along trajectory 225. This area-related pattern, taken in conjunction with the fact that only a single card swiping had occurred--at interior card reader 25--would give rise to a strong inference that the first person had approached door 21 from inside secure area 23 and caused it to become unlocked in order to allow a second person to enter, i.e., piggy-backing had occurred.

FIG. 2F depicts a loitering scenario. Here, a person approaches swipe zone 232 and possibly door zone 231 along trajectory 226. The same individual departs from zones 231/232 along a return trajectory 228. Approaching so close to door 21 without swiping one's card and going through the door is a suspicious behavior, especially if repeated, and especially if the person remains within door zone 231 or swipe zone 232 for periods of time that tend to be associated with suspicious behavior. This type of activity suggests that the individual being tracked is waiting, for example, for a friend to show up in non-secure area 22 so that the friend can be let in using the first person's card. This behavior may be a precursor to a piggy-backing event that is about to occur once the friend arrives at door 21.
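The timing aspect of the loitering scenario can be sketched as a dwell-time check. The example below assumes that each tracked person has already been reduced to one zone label per frame, uses the illustrative rate of 10 frames/second mentioned earlier in the description, and flags a track that stays in the door or swipe zone longer than a chosen limit without a card swipe. The function names, the zone label strings and the `max_dwell_s` limit are hypothetical; the description does not give a specific duration.

```python
from typing import Sequence

FRAME_RATE = 10.0  # frames per second, the illustrative rate given in the description


def dwell_seconds(zone_per_frame: Sequence[str],
                  sensitive: Sequence[str] = ("door", "swipe")) -> float:
    """Longest continuous time, in seconds, spent inside the sensitive zones."""
    longest = run = 0
    for zone in zone_per_frame:
        run = run + 1 if zone in sensitive else 0
        longest = max(longest, run)
    return longest / FRAME_RATE


def is_loitering(zone_per_frame: Sequence[str], swiped: bool,
                 max_dwell_s: float) -> bool:
    """True if the person lingered near the door without swiping a card."""
    return (not swiped) and dwell_seconds(zone_per_frame) > max_dwell_s
```

A real system would presumably also weigh repetition of the behavior, which the text calls out as especially suspicious; this sketch covers only a single visit.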

FIG. 1B illustrates the operation of alert reasoning 132 responsive to top hypothesis 130. By analyzing top hypothesis 130, a restricted zone determination module 146 of alert reasoning 132 can determine whether a particular zone, or sub-zone within some larger overall space, has been entered, such as door zone 231 or swipe zone 232. At the same time, a determination is made at a multiple entries determination module 160 whether there were multiple entries during an event, e.g., when tailgating is the event sought to be discovered. Since the system has the complete information of the objects' number and motion history, it can record activities or traffic information 162 in a database 152 when multiple entries are not detected and no violation is recorded. The system can also include an unattended object module 148, which can determine from top hypothesis 130 whether a non-human object appeared within the area under surveillance and was left there. This could be detected by observing a change in the background information. Such an event may also be recorded in an activity recorder 162 as following the alert rules and occurring with high likelihood, but as not being a violation to be recorded at the violation recorder 150 in the database 152. A user such as a review specialist may query 154 the database 152 and access recorded events through a user interface 156 for viewing at a monitor 158. The violations recorded in the violation recorder 150 would likely have higher priority to security personnel if tailgating, for example, is the main problem sought to be discovered, whereas activities merely recorded in the activity recorder 162 may be reviewed for traffic analysis and other data collection purposes.

FIG. 1C is a flow diagram illustrating the operation of alert reasoning 132 in detecting the occurrence of tailgating or piggy-backing responsive to top hypothesis 130. The number of trajectories in the top hypothesis is N. The track number is started at i=0 at 172. When it is determined at 173 that not all of the top tracks have yet been run through the alert reasoning module, the process proceeds to 174. At 174, it is determined whether the length of the ith track is greater than a minimum length. If it is not, the track is not long enough to be confirmed as a real track, in which case the process moves to 183 to increment to the next track in the list. If the ith track is determined to be greater than the minimum length, it is determined at 175 whether the ith track is a "coming in" track. A "coming in" track is one whose motion direction is from door zone 231 or from non-secure area 22 into secure area 23. If it is not, the process goes to 183 to check the next track if there is one. Otherwise, at 176, it is determined whether a card was swiped. If it was, there is no alert and the process moves to 183. If there was no swipe, then it is determined at 177 whether a person on another track swiped a card. If not, the alert code is designated "unknown" at 178 because, although there was an entry without a swipe, such entry does not fit the tailgating or piggy-backing scenarios, and the alert code is communicated to return alert code processing at 179. If there was a swipe, it is determined at 180 whether a "coming in" time difference is less than a time Td. This parameter is a number that can be determined heuristically and can be, for example, the maximum allowed time difference between when a door opens and closes with one card swipe. If the "coming in" time difference is greater than Td, then piggy-backing is suspected and designated at 181 and the piggy-backing alert code is communicated to return alert code processing at 179.
If the "coming in" time difference is determined to be less than Td, a tailgating alert code is designated at 182 and the tailgating alert code is communicated to return alert code processing at 179. It is likely that tailgating occurred in this situation because someone on another track had just swiped a card and entered, and possibly left the door open for the person on the "coming in" track.
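The decision flow of FIG. 1C can be sketched in Python as follows. This is a minimal illustration and not the patented implementation: the Track fields, the function names, and the way the "coming in" time difference is computed (the smallest gap between this track's entry time and the entry time of any other track that swiped) are assumptions made for the example. The text does not say what happens when the difference equals Td exactly; the sketch treats that case as piggy-backing.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Track:
    # All field names are hypothetical; the text only names the concepts.
    length: int        # number of frames in the trajectory
    coming_in: bool    # True if motion is from door zone / non-secure area into secure area
    swiped: bool       # True if a card swipe is attributed to this track
    enter_time: float  # "coming in" time, in seconds


def alert_code(tracks: List[Track], i: int, min_length: int, td: float) -> Optional[str]:
    """Return an alert code for track i, or None when no alert applies."""
    track = tracks[i]
    if track.length <= min_length:   # step 174: too short to be a real track
        return None
    if not track.coming_in:          # step 175: only "coming in" tracks matter
        return None
    if track.swiped:                 # step 176: legitimate entry, no alert
        return None
    # Step 177: did a person on another track swipe a card?
    swipers = [t for j, t in enumerate(tracks) if j != i and t.swiped]
    if not swipers:
        return "unknown"             # step 178
    # Step 180 (assumed definition of the "coming in" time difference).
    diff = min(abs(track.enter_time - t.enter_time) for t in swipers)
    return "tailgating" if diff < td else "piggy-backing"  # steps 182 / 181


def alert_reasoning(tracks: List[Track], min_length: int, td: float) -> List[Tuple[int, str]]:
    """Loop over all tracks (steps 172, 173, 183) and collect alert codes."""
    alerts = []
    for i in range(len(tracks)):
        code = alert_code(tracks, i, min_length, td)
        if code is not None:
            alerts.append((i, code))
    return alerts
```

For example, with one track that swiped at time 10.0 and a second "coming in" track without a swipe at time 11.0, a Td of 3.0 yields a tailgating alert for the second track, while a Td of 0.5 yields a piggy-backing alert.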

The following table is a list of alert conditions, including tailgating and piggy-backing, that the system may be programmed to detect. It will be seen from this table that, although not shown in FIGS., it is possible to detect certain alert conditions using a camera whose area under surveillance is the non-secure area, e.g., area 22. The table uses the following symbols: A=Person A; B=Person B; L(A)=Location of person A; L(B)=Location of person B; S=Secure Area; N=Non-Secure Area.

Alert Condition | Definition | Scenario | Camera
Entry Tailgating | More than one person enters secure area on single entry card. | L(A) = N, L(B) = N; A cards in; L(A) = S, L(B) = S. | N or S
Reverse Entry Tailgating | One person enters the secure area while another exits on a single exit card. | L(A) = S, L(B) = N; A cards out; L(A) = N, L(B) = S. | N or S
Entry Collusion | One person uses card to allow another person to enter without entering himself. | L(A) = N, L(B) = N; A cards in; L(A) = N, L(B) = S. | N
Entry on Exit Card (Piggybacking) | Person in secure area uses card to allow another person to enter without leaving himself. | L(A) = S, L(B) = N; A cards out; L(A) = S, L(B) = S. | S
Failed Entry/Loitering at Entry | Person in non-secure area tries to use a card to open door and fails to gain entry. | L(A) = N; A unsuccessfully attempts to card in | N
Loitering in Secure Area | Person in secure area goes to door zone but does not go through | L(A) = N; | S

In determining whether a particular one of these scenarios has occurred, the system uses a) trajectory length, trajectory motion over time and trajectory direction derived from the top hypothesis and b) four time measurements. The four time measurements are enter-door-time, leave-door-time, enter-swipe-time and leave-swipe-time. These are, respectively, the points in time when a person is detected as having entered the door zone, left the door zone, entered the swipe zone and left the swipe zone. In this embodiment the system does not have access to electronic signals associated with the door opening/closing or with card swiping. The computer that carries out the invention is illustratively different from, and not in communication with, the computer that validates the card swiping data and unlocks the door. Thus in the present embodiment, the fact that someone may have opened the door or swiped their card is inferred based on their movements. Accordingly, the designations "A cards in" and "A cards out" in the scenarios are not facts that are actually determined but rather are presented in the table as a description of the behavior that is inferred from the tracking/timing data.

As described above relative to FIG. 1C, timing also plays a role in applying at least some of the scenarios shown in the table, in that people must enter and/or leave certain zones within certain time frames relative to each other in order for their movements to be deemed suspicious. Thus in order to decide, based on data from a camera in the secure area, that Entry Tailgating may have occurred, the difference between the enter-door-time for one person and the enter-door-time for another person must be less than Td. That is, people who enter at times that are very far apart are not likely to be guilty of tailgating. If the camera is in the non-secure area, the difference between the leave-door-time for one person and the leave-door-time for another person must be less than Td.
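As a minimal sketch of the Entry Tailgating timing test just described, the four time measurements can be held in a small record and compared against Td. The class name ZoneTimes, the function name, and the use of "S" and "N" strings for the camera location are assumptions made for illustration; only the comparison rule itself comes from the text.

```python
from dataclasses import dataclass


@dataclass
class ZoneTimes:
    """The four time measurements for one tracked person, in seconds."""
    enter_door_time: float
    leave_door_time: float
    enter_swipe_time: float
    leave_swipe_time: float


def may_be_entry_tailgating(a: ZoneTimes, b: ZoneTimes, camera_area: str, td: float) -> bool:
    """Apply the Entry Tailgating timing rule for a camera in area "S" or "N"."""
    if camera_area == "S":
        # Camera in the secure area: compare the times at which each person
        # entered the door zone.
        diff = abs(a.enter_door_time - b.enter_door_time)
    elif camera_area == "N":
        # Camera in the non-secure area: compare the times at which each
        # person left the door zone.
        diff = abs(a.leave_door_time - b.leave_door_time)
    else:
        raise ValueError("camera_area must be 'S' or 'N'")
    return diff < td
```

Timing is only one of the conditions; trajectory length and direction, described above, would still have to be checked separately.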

The timing for Reverse Entry Tailgating requires that one person's door-leave-time is relatively close to another person's enter-door-time.

The timing for Piggybacking is that one person's enter time is close to another person's enter-swipe-time; specifically, the difference between the two is less than Td.

The timing for Failed Entry/Loitering at Entry as well as for Loitering in Secure Area is that a person is seen in the swipe zone for at least a minimum amount of time, combined with the observance of a U-turn type of trajectory, i.e., the person approached the swipe zone, stayed there and then turned around and left the area.

In any of these scenarios in which the behavior attempted to be detected involves observing that a person has entered either the door zone or the swipe zone, the time that the person spends in that zone needs to be greater than some minimum so that the mere fact that someone quickly passes through a zone--say the swipe zone within the secure area--on their way from one part of the secure zone to another will not be treated as a suspicious occurrence.
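The minimum-dwell requirement and the U-turn observation used for the loitering scenarios might be sketched as follows. The rectangular zone, the (time, x, y) sample format, and in particular the U-turn test (approach and departure displacement vectors pointing in opposing directions) are assumptions chosen for illustration; the text does not specify how a U-turn trajectory is recognized.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]       # (time, x, y)
Zone = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def in_zone(sample: Sample, zone: Zone) -> bool:
    _, x, y = sample
    x_min, y_min, x_max, y_max = zone
    return x_min <= x <= x_max and y_min <= y <= y_max


def dwell_time(track: List[Sample], zone: Zone) -> float:
    """Time between the first and last samples observed inside the zone."""
    times = [s[0] for s in track if in_zone(s, zone)]
    return times[-1] - times[0] if times else 0.0


def is_u_turn(track: List[Sample], zone: Zone) -> bool:
    """Assumed test: the person leaves roughly back the way they came."""
    inside = [k for k, s in enumerate(track) if in_zone(s, zone)]
    if not inside or inside[0] == 0 or inside[-1] == len(track) - 1:
        return False  # never entered, or no approach/departure segment seen
    first, last = inside[0], inside[-1]
    approach = (track[first][1] - track[0][1], track[first][2] - track[0][2])
    depart = (track[-1][1] - track[last][1], track[-1][2] - track[last][2])
    # A negative dot product means the two directions oppose each other.
    return approach[0] * depart[0] + approach[1] * depart[1] < 0


def is_loitering(track: List[Sample], zone: Zone, min_dwell: float) -> bool:
    return dwell_time(track, zone) >= min_dwell and is_u_turn(track, zone)
```

A person who merely passes through the zone accumulates little or no dwell time and departs in the same direction as the approach, so neither condition is met.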

Image Processing and Hypothesis Overview

Returning again to FIG. 1A, the basic components of image processing 103, leading to the generation of top hypothesis 130, are shown. In particular, the information in each video frame is digitized 104 and a background subtraction process 106 is performed to separate background from foreground, or current, information. The aforementioned frame rate of 10 frames/second can be achieved by running camera 102 at that rate or, if the camera operates at a higher frame rate, by simply capturing and digitizing only selected frames.

Background information is information that does not change from frame to frame. It therefore principally includes the physical environment captured by the camera. By contrast, the foreground information is information that is transient in nature. Images of people walking through the area under surveillance would thus show up as foreground information. The foreground information is arrived at by subtracting the background information from the image. The result is one or more clusters of foreground pixels referred to as "blobs."
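The separation of foreground from background and the grouping of foreground pixels into blobs can be illustrated with a small pure-Python sketch. The per-pixel difference threshold and the use of 4-connected flood fill to form blobs are assumptions made for the example; the text says only that the background is subtracted and that clusters of foreground pixels result.

```python
from collections import deque
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def foreground_mask(frame: Image, background: Image, threshold: int) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def find_blobs(mask: List[List[bool]]) -> List[List[Tuple[int, int]]]:
    """Group foreground pixels into 4-connected clusters ("blobs")."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            blob = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs
```

Each returned blob is a list of (row, column) pixel coordinates that could then be passed on for human-form detection.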

Each foreground blob 108 is potentially the image of a person. Each blob is applied to a detection process 110 that identifies human forms using a convolutional neural network that has been trained for this task. More particularly, the neural network in this embodiment has been trained to recognize the head and upper body of a human form. The neural network generates a score, or probability, indicative of the probability that the blob in question does in fact represent a human. These probabilities preferably undergo a non-maximum suppression in order to identify a particular pixel that will be used as the "location" of the object. A particular part of the detected person, e.g., the approximate center of the top of the head, is illustratively used as the "location" of the object within the area under surveillance. Further details about the neural network processing are presented hereinbelow.

Other object detection approaches can be used. As but one example, one might scan the entire image on a block-by-block or other basis and apply each block to the neural network in order to identify the location of humans, rather than first separating foreground information from background information and only applying foreground blobs to the neural network. The approach that is actually used in this embodiment, as described above, is advantageous, however, in that it reduces the amount of processing required since the neural network scoring is applied only to portions of the image where the probability of detecting a human is high.

On the other hand, certain human objects that were detected in previous frames may not appear in the current foreground information. For example, if a person stopped moving for a period of time, the image of the person may be relegated to the background. The person will then not be represented by any foreground blob in the current frame. One way of obviating this problem was noted above: simply apply the entire image, piece-by-piece, to detection process 110 rather than applying only things that appear in the foreground. But, again, that approach requires a great deal of additional processing.

The system addresses this issue by supplying detection process 110 with the top hypothesis 130, as shown in FIG. 1A at 134. Based on the trajectories contained in the top hypothesis, it is possible to predict the likely location of objects independent of their appearance in the foreground information. In particular, one would expect to detect human objects at locations in the vicinity of the ending points of the top hypothesis's trajectories. Thus in addition to processing foreground blobs, detection process 110 processes clusters of pixels in those vicinities. Any such cluster that yields a high score from the neural network can be taken as a valid human object detection, even if it does not appear in the foreground. This interaction tightly integrates the object detection and tracking, and makes both of them much more reliable.

The object detection results 112 are refined by optical flow projection 114. The optical flow computations involve brightness patterns in the image that move as the detected objects that are being tracked move. Optical flow is the apparent motion of the brightness pattern. Optical flow projection 114 increases the value of the detection probability (neural network score) associated with an object if, through image analysis, the detected object can, with a high degree of probability, be identified to be the same as an object detected in one or more previous frames. That is, an object detected in a given frame that appears to be a human is all the more likely to actually be a human if that object seems to be the displaced version of a human object previously detected. In this way, locations with higher human detection probabilities are reinforced over time. Further details about optical flow projection can be found, for example, in B.K.P. Horn, Robot Vision, M.I.T. Press 1986.

The output of optical flow projection 114 comprises data 118 about the detected objects, referred to as the "object detection data." This data includes not only the location of each object, but its detection probability, information about its appearance and other useful information used in the course of the image processing as described below.

The data developed up to any particular point in time, e.g., a point in time associated with a particular video frame, will typically be consistent with multiple different scenarios as to a) how many objects of the type being tracked, e.g., people, are in the area under surveillance at that point in time and b) the trajectories that those objects have followed up to that point in time. Hypothesis generation 120 processes the object detection data over time and develops a list of hypotheses for each of successive points in time, e.g., for each video frame. Each hypothesis represents a particular unique interpretation of the object detection data that has been generated over a period of time. Thus each such hypothesis comprises a) a particular number, and the locations, of objects of the type being tracked that, for purposes of that hypothesis, are assumed to be then located in the area under surveillance, and b) a particular assumed set of trajectories, or tracks, that those detected objects have followed.

As indicated at 124, each hypothesis is given a score, referred to herein as a likelihood, that indicates the likelihood that that particular hypothesis is, indeed, the correct one. That is, the value of each hypothesis's likelihood is a quantitative assessment of how likely it is that a) the objects and object locations specified in that hypothesis are the objects and locations of the objects that are actually in the area under surveillance and b) the trajectories specified in that hypothesis are the actual trajectories of the hypothesis's objects.

Hypothesis management 126 then carries out such tasks as rank ordering the hypotheses in accordance with their likelihood values, as well as other tasks described below. The result is an ordered hypothesis list, as indicated at 128. The top hypothesis 130 is the hypothesis whose likelihood value is the greatest. As noted above, the top hypothesis is then used as the input for alert reasoning 132.
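The rank ordering performed by hypothesis management can be sketched as a simple sort on the likelihood value. The Hypothesis record and its field names are hypothetical; the text specifies only that hypotheses are ordered by likelihood and that the one with the greatest likelihood is the top hypothesis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypothesis:
    # Field names are illustrative, not taken from the text.
    name: str
    likelihood: float
    trajectories: list = field(default_factory=list)


def rank_hypotheses(hypotheses: List[Hypothesis]) -> List[Hypothesis]:
    """Return the ordered hypothesis list, most likely first."""
    return sorted(hypotheses, key=lambda h: h.likelihood, reverse=True)


def top_hypothesis(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Return the hypothesis whose likelihood value is the greatest."""
    return rank_hypotheses(hypotheses)[0]
```

The remaining entries of the ordered list are retained rather than discarded, since the list as a whole is what the next frame's processing extends.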

The process then repeats when a subsequent frame is processed. Hypothesis generation 120 uses the new object detection data 118 to extend each hypothesis of the previously generated ordered hypothesis list 128. Since that hypothesis list is the most recent one available at this time, it is referred to herein as the "current hypothesis list." That is, the trajectories in each hypothesis of the current hypothesis list are extended to various ones of the newly detected objects. As previously noted, the object detection data developed for any given frame can almost always support more than one way to correlate the trajectories of a given hypothesis with the newly detected objects. Thus a number of new hypotheses may be generated, or "spawned," from each hypothesis in the current hypothesis list.

It might be thought that what one should do after the hypotheses have been rank-ordered is to just retain the hypothesis that seems most likely--the one with the highest likelihood value--and forget about the rest. However, further image detection data developed in subsequent frames might make it clear that the hypothesis that seemed most likely--the present "top hypothesis"--was in error in one or more particulars and that some other hypothesis was the correct one.

More particularly, there are many uncertainties in carrying out the task of tracking multiple objects in an area under surveillance if single frames are considered in isolation. These uncertainties are created by such phenomena as false detections, missing data, occlusions, irregular object motions and changing appearances. For example, a person being tracked may "disappear" for a period of time. Such disappearance may result from the fact that the person was occluded by another person, or because the person being tracked bent over to tie her shoelaces and thus was not detected as a human form for some number of frames. In addition, the object detection processing may generate a false detection, e.g., reporting that a human form was detected at a particular location when, in fact, there was no person there. Or, the trajectories of individuals may cross one another, creating uncertainty as to which person is following which trajectory after the point of intersection. Or people who were separated may come close together and proceed to walk close to one another, resulting in the detection of only a single person when in fact there are two.

However, by maintaining multiple hypotheses of object trajectories, temporally global and integrated tracking and detection are achieved. That is, ambiguities and uncertainties can be generally resolved when multiple frames are taken into account. Such events are advantageously handled by postponing decisions as to object trajectories--through the mechanism of maintaining multiple hypotheses associated with each frame--until sufficient information is accumulated over time.

An example involving the hypothesis shown in FIG. 3A, which was introduced hereinabove, shows how such contingencies can lead to different hypotheses.

In particular, as previously noted, FIG. 3A depicts a hypothesis associated with a video frame, identified by the frame number i+4. As seen in the rightmost portion of the FIG., four detected objects, represented by respective ones of graphical nodes 302, were detected in frame i+4 and it has been determined--by having tracked those objects through previous frames, including frames i, i+1, i+2 and i+3--that those objects followed the particular trajectories formed by the connections 304 from one frame to the next.

An individual one of connections 304 is an indication that, according to the particular hypothesis in question, the two linked nodes 302 correspond to a same object appearing and being detected in two temporally successive frames. The manner in which this is determined is described at a more opportune point in this description.

To see how the hypothesis represented in FIG. 3A was developed, we turn our attention back to frame i. In particular, this hypothesis had as its progenitor one of the hypotheses in the list of hypotheses 128 that was developed for frame i. That hypothesis included four detected objects A, B, C and D and also included a particular set of trajectories 301 that those objects were hypothesized to have followed up through frame i. The four objects A through D are shown in a straight vertical line only because the FIG. is a combination spatial and temporal representation. Time progresses along the x axis and since those four objects were detected in frame i, they are vertically aligned in the FIG. In actuality, the objects detected in a given frame can appear in any location within the area under surveillance.

The reason that the objects detected in a given frame are given different letter designations from those in other frames is that it is not known to a certainty which objects detected in a given frame are the same as which objects detected in previous frames. Indeed, it is the task of the multiple-hypothesis processing disclosed herein to ultimately figure this out.

Some number of objects 302 were thereafter detected in frame i+1. It may have been, for example, four objects. However, let it be assumed that the object detection data for frame i+1 is such that a reasonable scenario is that one of those four detections was a false detection. That is, although optical flow projection 114 might have provided data relating to four detected objects, one of those may have been questionable, e.g., the value of its associated detection probability was close to the borderline between person and non-person. Rather than make a final decision on this point, the multiple-hypothesis processing entertains the possibility that either the three-object or the four-object scenario might be the correct one. Hypothesis processing associated with frames following frame i+1 can resolve this ambiguity.

It is the three-object scenario that is depicted in FIG. 3A. That is, it is assumed for purposes of the particular hypothesis under consideration that there were only three valid object detections in frame i+1: E, F and G. Moreover, the processing for this has proceeded on the theory that object E detected at frame i+1 is the same as object A detected at frame i. Hence this hypothesis shows those objects as being connected. The scenario of this hypothesis further includes a so-called merge, meaning that both of the objects B and C became object F. This could happen if, for example, object B walked "behind" (relative to camera 102) object C and was thus occluded. The scenario further has object G being the same as object D.

As we will see shortly, the scenario depicted in FIG. 3A and described above is but one of several possible trajectory stories explaining the relationship between objects A through D detected in frame i and objects E through G detected in frame i+1.

Proceeding to frame i+2, the object detection data from optical flow projection 114 has provided as one likely scenario the presence of five objects H through L. In this hypothesis, objects H and J both emerged from object E that was detected in frame i+1. This implies that objects A and E each represented two people walking closely together, who were not distinguishable as being two people until frame i+2. Objects K and L are hypothesized as being the same as objects F and G. Object I is hypothesized as being a newly appearing object that had not followed any of the previously identified trajectories, this being referred to as a trajectory initialization.

Four objects M through P were detected in frame i+3. The hypothesis of FIG. 3A hypothesizes that objects M, O and P detected in frame i+3 are objects I, L and J, respectively, detected in frame i+2. Thus respective ones of the connections 304 extend the trajectories that had ended at objects I, L and J in frame i+2 out to objects M, O and P, respectively, in frame i+3. The scenario represented by this hypothesis does not associate any one of the detected objects M through P with either object H or object K. This can mean either that one or both of the objects H and K a) have actually disappeared from the area under surveillance or b) are actually in the area under surveillance but, for some reason or another, the system failed to detect their presence in frame i+3. These possibilities are not arrived at arbitrarily but, rather, are based on certain computations that make them sufficiently likely that they cannot be ruled out at this point. Moreover, the scenario represented by this hypothesis does not associate object N with any of the objects detected in frame i+2. Rather, the scenario represented by this hypothesis embodies the theory that object N is a newly appearing object that initiates a new trajectory.

In frame i+4, four objects Q through T are detected. The object detection data associated with these objects supports a set of possible outcomes for the various trajectories that have been tracked to this point. The scenario of FIG. 3A is one particular such set of outcomes. In particular, in this hypothesis objects R, S and T are identified as being objects M, O and P detected in frame i+3. The object detection data also supports the possibility that object Q is actually object H, meaning that, for whatever reason, object H was not detected in frame i+3. For example, the person in question may have bent down to tie a shoelace and therefore did not appear to be a human form in frame i+3. The data further supports the possibility that none of the objects Q through T is the same as object N. At this point object N would appear to have been a false detection. That is, the data supports the conclusion that although optical flow projection 114 reported the presence of object N, that object did not actually exist. The data further supports the possibility that none of the objects Q through T is the same as object K. At this point object K would appear to truly have disappeared from the area under surveillance.

All of the foregoing, it should be understood, is only one of numerous interpretations of what actually occurred in the area under surveillance over the frames in question. At each frame, any number of hypotheses can be spawned from each hypothesis being maintained for that frame. In particular, the data that supported the scenario shown in FIG. 3A leading to the hypothesis shown for frame i+4 was also supportive of different scenarios, leading to many other hypotheses for frame i+4.

FIG. 3B shows one such alternative scenario. In particular, the data in frame i+1 supported the possibility that the trajectory of object B merged into object E instead of into object F, leading to a different hypothesis for frame i+1 in which that merger is assumed. Moreover, the data for frame i+2 supported the possibility that object I was a false detection. Thus the depicted chain of hypotheses does not include object I at all. The data for frame i+3 supported the possibility that object M, rather than being the same as object I, was really object H and that objects O and P were actually objects J and L instead of the other way around. The data for frame i+4 also supported the possibility that object Q was a false detection.

FIG. 4 is a more generalized picture illustrating the process by which each of the hypotheses generated for a particular frame can spawn multiple hypotheses and how the total number of hypotheses is kept to manageable levels. It is assumed in this example, that the hypothesis list for a certain ith frame contains only one hypothesis. For example, after a period of time when no human objects were detected, a single human form appears in frame i. The single hypothesis, denominated A, associated with this frame contains that single object and no associated trajectory, since this is the first frame in which the object is detected. Let us assume that in the next frame i+1, two objects are detected. Let us also assume that the object detection data for frame i+1 supports two possible hypotheses, denominated AA and AB. Hypothesis AA associates the originally detected person with one of the two people appearing in frame i+1. Hypothesis AB associates the originally detected person with the other of the two people appearing in frame i+1. Hypothesis AA is at the top of the hypothesis list because, in this example, its associated likelihood is greater than that associated with hypothesis AB.
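The spawning of hypotheses AA and AB from hypothesis A can be sketched as follows. This is an illustration under stated assumptions: the likelihood is modeled as decaying exponentially with the distance between the end of the existing track and the candidate detection, and the list is capped at max_hypotheses entries. Neither the scoring function nor the cap is taken from the text, and the sketch omits scenarios in which the existing track is associated with none of the new detections.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def spawn_hypotheses(parent: Dict, detections: List[Point], max_hypotheses: int) -> List[Dict]:
    """Spawn one child hypothesis per candidate association, most likely first."""
    last = parent["track"][-1]
    children = []
    for det in detections:
        # Assumed scoring: nearer detections are more likely continuations.
        likelihood = parent["likelihood"] * math.exp(-math.dist(last, det))
        children.append({"track": parent["track"] + [det], "likelihood": likelihood})
    # Rank order the children and keep the list to a manageable size.
    children.sort(key=lambda h: h["likelihood"], reverse=True)
    children = children[:max_hypotheses]
    # Name the children in rank order: AA, AB, ... for parent A.
    for k, child in enumerate(children):
        child["name"] = parent["name"] + chr(ord("A") + k)
    return children
```

For a parent hypothesis A holding a single object at (0.0, 0.0) and two detections in the next frame, the child that links the object to the nearer detection is named AA and appears first in the list.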

In frame i+2 some number of objects are again detected. Even if only two objects are detected, the data may support multiple scenarios associating the newly detected objects with those detected in frame i+1. It is possible that neither of the two people detected in frame i+2 is the one detected in frame i. That is, the person detected in frame i may have left the area under surveillance and yet a third person has appeared. Moreover, each of the people detected in frame i+2 might be either of the p