Automating Camera Surveillance




With
practice, you can recognize the video spies in the city of Washington,
DC. To a casual observer, they resemble lampposts. Some of the cameras
have a 360-degree view and magnify by a factor of 10 to 17. Some are
equipped with night vision and can zoom in on a target well enough
to read text on a written page or look into a building. Most are
placed at locations that would not come to mind as primary terrorist
targets: Smithsonian Castle, the U.S. Department of Labor, Dupont
Circle, Union Station, Wisconsin Avenue, the Old Post Office, and
the Banana Republic in Georgetown. 


Though
the targets they view may not stand out as particularly vulnerable
to terrorism, the cameras are placed strategically for the purpose
of monitoring demonstrations and protests. 


One
of the first occasions for their use was a demonstration in April
2000 against the World Bank and International Monetary Fund (IMF).
Supplemental data from the U.S. Park Police, who monitored demonstrations
by helicopter, was sent as a digital feed to the Metropolitan Police
Department. The DC police, the FBI, the Secret Service, and the
DC school system agreed to pool data as needed. Though the police
have stated there are only a dozen cameras, these cameras can link
to about 1,000 other government cameras to make up a network such
as might be found in a NASA or defense command center. 


Similar
systems exist in other cities for the same purpose. During antiwar
protests in Boston at the 2004 Democratic National Convention, the
police informed the media that camera systems would be used to guard
against acts of terrorism. According to ANSWER (Act Now to Stop
War and End Racism), photographs of protesters from past marches
were circulated to bus drivers and other mass transit employees
to train them to recognize “terrorists.” In Manhattan,
a person walking on a street is in view of at least one of 2,400
cameras. 


In
reaction, privacy advocates who view these surveillance measures
with alarm have begun to publish the locations of surveillance devices
in major U.S. cities, to allow others who object to the technology
to chart surveillance-free paths through the streets. 


The
privacy issues surrounding Closed Circuit TV (CCTV) become more
complicated when computer vision technology is applied to surveillance.
A controversy resulted in 2001 when authorities used face recognition
technology and CCTV at the Tampa Bay Super Bowl to search for criminals
and terrorists. The action led to 19 arrests, all of them for petty
crimes, with no record of whether these arrests were legitimate.
The ensuing public furor led several legislators, such as Dick Armey,
to propose laws for protecting privacy and regulating the use of
biometric technology. 


When
the results of video searches are combined with other existing databases,
powerful methods of identification and tracking become possible.
Unique body marks make identification much easier. In Fort Worth,
Texas, police can track gang members with a software package
called GangNet. When an officer types a description of tattoos into the
database, the software can produce pictures of members bearing those tattoos.
Similar searches can be performed on nicknames, vehicle numbers,
telephone numbers, or partial license plate numbers. Salinas, California
received federal funding for a Geographic Information System to
carry out crime tracking of gangs. In Manalapan, Florida—one
of the nation’s wealthiest cities—cameras and computers
have been set up to run background checks on every car and driver
that enters. The system alerts a 911 dispatcher if the car is stolen
or the driver is suspected of a crime. Infrared cameras record each
car’s license number and other cameras photograph the driver. 
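
The kind of database search described above can be illustrated with a
minimal sketch. It is not GangNet's actual interface, which is not
documented here; the records, field names, and function names below are
hypothetical, and the wildcard convention ('?' for one unknown character,
'*' for any run of characters) is an assumption chosen for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical records; the field names are assumptions, not GangNet's schema.
RECORDS = [
    {"id": 1, "plate": "ABC1234", "tattoos": "spider web on left hand"},
    {"id": 2, "plate": "ABD1299", "tattoos": "rose on right forearm"},
    {"id": 3, "plate": "XYZ9876", "tattoos": "spider on neck"},
]


def search_partial_plate(records, pattern):
    """Return records whose plate matches a partially known plate.

    '?' stands for one unknown character and '*' for any run of characters.
    """
    pattern = pattern.upper()
    return [r for r in records if fnmatchcase(r["plate"].upper(), pattern)]


def search_tattoos(records, description):
    """Return records whose tattoo text contains every word of the description."""
    words = description.lower().split()
    return [
        r for r in records
        if all(w in r["tattoos"].lower().split() for w in words)
    ]


print([r["id"] for r in search_partial_plate(RECORDS, "AB?12*")])
print([r["id"] for r in search_tattoos(RECORDS, "spider")])
```

With these sample records, the partial plate "AB?12*" matches records 1
and 2, and the description "spider" matches records 1 and 3. Even this
toy version shows why a fragment of a plate or a single visible mark can
be enough to narrow a search to a few individuals.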


In
2003, Ohio transportation officials began testing the use of unpiloted
aircraft equipped with video, infrared cameras, and other sensors
to monitor traffic jams. The information from aerial monitoring
is intended to help police looking for the best route to an accident
scene, as well as assist traffic planners, emergency workers, truck
companies, and commuters. Some of the planes—drones—are
as small as a model aircraft. The military can use these planes
to send back real-time images of battle to commanders. In November
2002, the CIA used a drone to fire a missile into a car containing
six alleged al-Qaida members. Unpiloted aerial vehicles—UAVs—have
attacked high-priority targets in Afghanistan and Iraq. In December
2002, Senate Armed Services Committee Chair John Warner (R-VA) indicated
interest in using drones for homeland security. In January 2003,
as a cost-saving measure, a U.S. Congressional Research Service
report suggested replacing piloted fighters flying combat air patrols
(CAP) over U.S. cities with UAVs armed with air-to-air missiles.
It is unclear whether the FAA would have authority over the UAVs
in such a program. Furthermore, the UAVs may be too small to be
seen and fly too low to be detected by radar. (The possible use
of UAVs to deliver biological and chemical attacks has been a concern
of the federal government.) Though the UAVs used so far
have been remotely piloted vehicles (RPVs) unequipped with artificial
intelligence processing, efforts to develop robotic autonomous vehicles
are being funded by DARPA (Defense Advanced Research Projects Agency).



Military
and local law enforcement agencies already use video surveillance
to automate threat response. In Broward County, Florida, Port Everglades
selected ObjectVideo VEW software to protect its perimeter. The
software contains a tripwire feature that allows security personnel
to create virtual perimeters on land and water by drawing a box
on a digital view of what the camera is observing. Unknown people
or vehicles crossing the tripwire boundaries signal an alert. 
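
A virtual tripwire of this kind can be sketched in a few lines. This is
not ObjectVideo's implementation; it assumes that some upstream tracker
already supplies one (x, y) position per video frame for each moving
object, and the names Box and tripwire_alerts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Virtual perimeter drawn on the camera view, in pixel coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def tripwire_alerts(box, track):
    """Return the frame indices at which a tracked object crosses the boundary.

    `track` is a list of (x, y) positions, one per frame, assumed to come
    from a separate object tracker. A crossing is any frame in which the
    object's inside/outside status differs from the previous frame.
    """
    alerts = []
    for i in range(1, len(track)):
        was_inside = box.contains(*track[i - 1])
        is_inside = box.contains(*track[i])
        if was_inside != is_inside:
            alerts.append(i)
    return alerts


perimeter = Box(left=100, top=50, right=300, bottom=200)
track = [(20, 120), (80, 120), (140, 120), (220, 120), (320, 120)]
print(tripwire_alerts(perimeter, track))
```

For the sample track, which moves left to right through the box, the
sketch reports crossings at frames 2 (entering) and 4 (leaving). A
deployed system would also have to decide which objects count as
"unknown," a step this example leaves out.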


Archival
video data are vulnerable to the same trends in industry and government
that have led to information being sold as a commodity. Information
about individuals is sold and traded routinely for marketing, charity
solicitations, and political polling. Individuals may find more
difficulty controlling the distribution of archived surveillance
imagery, where the data are more likely to be collected surreptitiously.
The breakdown of privacy in the trade of personal information already
makes it possible for government agencies such as the FBI to bypass
the government ban on collecting information about people who
are not suspects in an investigation by simply accessing personal information
that is already commercially available. 


Some
of the controversies of video surveillance came to public attention
during the Congressional discussion of the proposed Total Information
Awareness (TIA) research programs of the Pentagon. Several research
programs in TIA exploited video pattern recognition. HumanID included
research projects to recognize humans from a distance based on face
and gait along with other biometric tools. Though public outcry
caused the TIA budget to be canceled by Congress in September 2003,
some of the programs continued under other cover, such as Novel
Intelligence from Massive Data (NIMD), Non-Obvious Relationship
Awareness (NORA), Adaptive Concept Understanding from Modeled Enterprise
Networks (ACUMEN), Computer-Assisted Passenger Prescreening System
(CAPPS II), and Multi-state Anti-Terrorism Information Exchange
(MATRIX). 


Among
the pattern matching efforts was a project known as Video Analysis
and Content Exploitation (VACE). The goal of VACE was automatic
content detection and recognition in “video scenes of various
indoor and outdoor activities involving people, meetings, and vehicles,
and TV news broadcasts,” according to the Advanced Research
and Development Activity (ARDA) website. Research goals included
recognition of people, event detection and understanding, video
query, multi-modal video data mining, and object identification
from motion. The VACE solicitations have closed, but as of December
2003, workshops in VACE and other programs were still
planned for 2004. 


Another
military project for widespread video collection, Combat Zones That
See (CTS), was offered as a DARPA Broad Agency Announcement (BAA)
solicitation in May 2003. The goal of CTS is to develop video understanding of multiple
data feeds arriving from many sources to support military operations
in urban terrain. The military is interested in tracking vehicles
moving from one camera location to another. 


The
ability to extract information from video images is so widely sought
in the scientific community that the complete discontinuation of
funding for similar programs seems unimaginable. The National
Geospatial-Intelligence Agency has plans to post solicitations for
geospatial information visualization and to award $2.5 million
in FY05 and FY06. Military applications of pattern recognition and
video data mining continue to be developed by means of Department
of Defense solicitations to contractors in the form of BAA, SBIR,
and STTR awards. These programs include efforts at automating algorithms
for detecting human intentions in subjects appearing in video footage,
for distinguishing decoys from targets, and for synchronizing many
UAVs to carry out simultaneous reconnaissance and attack. It is
possible that in the future, covert or privatized TIA-like programs
may escape the scrutiny of Congress. It also seems certain that
computer vision will find increasing applications in autonomous
vehicles and that efforts will be made to explore the abilities
of robotic devices to carry out automated warfare. 


Though
video surveillance is a passive activity, the data mining of video
records to profile individuals is surreptitiously invasive. Given
the difficulty of detecting video surveillance systems hidden on
the ground or in the air, it would be difficult to enforce restraints
against misuse of such data by either private agents or governments.
Imagine each city street patrolled by cameras, some fixed, some
moving, equipped to detect a variety of crimes ranging from minor
traffic violations to terrorism. A few observation stations pass
their data to human observers and the rest function automatically,
assessing complex threats to law and order with the acuity of trained
professionals. UAVs fly overhead in networked configurations called
swarms, passing information to one another that enables the group
as a whole to respond to emergency situations with greater organization
than an individual platform could achieve. Most of the cameras and
UAVs would be invisible to casual observers. Though the components
of surveillance equipment and software are cheap, the infrastructure
to support exchanges of information across many databases and networks
could be afforded only by large corporations or government institutions.
This implies a potentially asymmetrical situation in which surveillance
becomes a weapon of class warfare. 



The
ultimate realization of a surveillance society would be dictatorial
and intolerant of dissent. Total control would not require a lone
autocrat—a complex network of private agencies could control
the surveillance devices with little or no accountability. Hierarchies
of privilege favoring race or class are built implicitly into existing
surveillance systems—for example, homeless residents of a city
are more often targeted by video surveillance systems than are other
citizens. Practices that are too blatantly oppressive to succeed
in the U.S. or European countries could be exported to dictatorships
abroad, to nations that can be exploited for their resources and
brought under the hegemony of global corporate powers. The distinction
between the benign possibility and its terrifying alternative depends
on who controls the data. If data collection is transparent and
access to the data is egalitarian, the most beneficial potential
of the technology can be realized. If special political groups,
religious factions, or corporate interests attain control over the
use of data—historically, the most probable scenario—those
groups could enforce unprecedented social control. 


Recent
developments in computer vision, robotics, and pattern matching
increase the possibility of drastic social transformations. The
dictatorship of Big Brother had one small limitation of power: it
depended on the obedience and vigilance of subordinates to enforce
control. The application of data mining methods to massive video
data sets enables a sufficiently organized power to outmatch humans
in carrying out surveillance. Though the robot soldier and the robot
police are not yet reality, present technological achievements can
lead to this future possibility. In case these apprehensions seem
too dire, it is worth remembering how easily other invasions of
privacy such as drug testing have come to be accepted generally,
even when they require active awareness by participants. Polls show
that people are often willing to give up some privacy in exchange
for the perception of better security. Fears of terrorism, appeals
to patriotism, economic incentives, and the insidiousness of visual
surveillance prevent many people from questioning misuses of similar
technology, especially when governments and corporations shroud
their research and development in secrecy.



 






Andrew Kalukin’s
signal and image processing research has been published in scientific
journals and used by government agencies and private industry. Photos
are courtesy of www.observingsurveillance.org.