In something that looks straight out of the CBS show “Person of Interest,” the science website Phys.org is reporting on a potentially important breakthrough from researchers at Carnegie Mellon. In research sponsored by the United States Army Research Laboratory, the Carnegie Mellon researchers presented an artificial intelligence system that analyzes real-time video surveillance feeds with specially programmed software and predicts what a person will likely do next. The system can automatically detect what are described as anomalous behaviors and notify officials if it recognizes an action that is not permitted. One example from the paper: cameras at an airport or bus station, with an autonomous system flagging a bag that has been abandoned for more than a few minutes.
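The abandoned-bag example reduces to simple flagging logic once a tracker supplies per-object state. Below is a minimal, hypothetical sketch of that logic; the names (`TrackedObject`, `flag_abandoned`) and the two-minute threshold are illustrative assumptions, not the researchers' actual code.

```python
from dataclasses import dataclass

# Illustrative threshold for "more than a few minutes" (seconds).
ABANDON_THRESHOLD_S = 120

@dataclass
class TrackedObject:
    object_id: int
    stationary_since: float  # timestamp when the object stopped moving
    owner_nearby: bool       # whether a person is still next to it

def flag_abandoned(objects, now):
    """Return IDs of objects left unattended past the threshold."""
    return [
        o.object_id
        for o in objects
        if not o.owner_nearby
        and (now - o.stationary_since) > ABANDON_THRESHOLD_S
    ]

bags = [
    TrackedObject(1, stationary_since=0.0, owner_nearby=True),   # owner sitting next to it
    TrackedObject(2, stationary_since=0.0, owner_nearby=False),  # left behind
]
print(flag_abandoned(bags, now=300.0))  # -> [2]
```

The hard part, of course, is not this rule but the upstream vision pipeline that decides whether an "owner" is still nearby; that is exactly the contextual reasoning the Cognitive Engine is meant to supply.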
The paper presents the complex knowledge infrastructure of a high-level artificial visual intelligence system called the Cognitive Engine. In particular, it describes how the conceptual specifications of basic action types can be driven by hybrid semantic resources. In layman's terms: the context of an action. For example, is a person leaving a bag because he is sitting next to it, or has that person left altogether?
The goal of the research is to create an artificial intelligence system similar to human visual intelligence: a computer system capable of making effective and consistent detections. The researchers noted that humans evolved by learning to adapt and react properly to environmental stimuli, becoming extremely skilled at filtering and generalizing over perceptual data, making decisions, and acting on the basis of acquired information and background knowledge. Computer vision algorithms therefore need to be complemented with higher-level tools of analysis involving knowledge representation and reasoning, often under conditions of uncertainty.
The Cognitive Engine is the core module of the Extended Activity Reasoning system (EAR) in the CMU Mind's Eye architecture. Mind's Eye is the name of the Defense Advanced Research Projects Agency (DARPA) program for building AI systems that can filter surveillance footage to support human (remote) operators and automatically alert them whenever something suspicious is recognized (such as someone leaving a package in a parking lot and running away).
Alessandro Oltramari, a postdoctoral researcher, and Christian Lebiere, both from the Department of Psychology at Carnegie Mellon, suggest that this automated video surveillance approach could find applications in both military and civilian environments.
Below, from The New American:
Several aspects of this “Minority Report” brought to life sound substantially similar to another contest of sorts being sponsored concurrently by DARPA at a secret campus near George Mason University in Virginia.
In a statement announcing the progress of the research, DARPA spokesman Mark Geertsen said the goal of the project was “to invent new approaches to the identification of people, places, things and activities from still or moving defense and open-source imagery.”
In the statement, DARPA described several concepts being worked on by six teams of researchers chosen to live and labor in the “DARPA Innovation House,” outside George Mason University.
While the descriptions of the projects provided by DARPA spokesman Mark Geertsen were brief, greater detail on the technologies was discovered by The New American.
The first of the projects reportedly being cooked up in the DARPA test kitchens is called PetaVision. The DARPA statement describes PetaVision as one of the “Multi-Modal Approaches to Real-Time Video Analysis”: “Biologically-inspired, hierarchical neural networks to detect objects of interest in streaming video by combining texture/color, shape and motion/depth cues.”
While that summary is admittedly vague, a website maintained by the Los Alamos National Laboratory (LANL) provides a bit more information not only on the technology, but why the federal government might find it useful in its quest to place every American under constant surveillance and to identify potential “domestic terrorists.”
We seek to understand and implement the computational principles that enable high-level sensory processing and other forms of cognition in the human brain. To achieve these goals, we are creating synthetic cognition systems that emulate the functional architecture of the primate visual cortex. By using petascale computational resources, combined with our growing knowledge of the structure and function of biological neural systems, we can match, for the first time, the size and functional complexity necessary to reproduce the information processing capabilities of cortical circuits. The arrival of next-generation supercomputers may allow us to close the performance gap between state-of-the-art computer vision approaches and human vision by bringing these systems to the scale of the human brain.
Admittedly, the potential uses for PetaVision are obscured behind the scientific jargon used in its description. However, empowering the federal government with any technology that can simulate the human brain’s ability to see and process information for the purpose of “detect[ing] objects of interest” in streaming video is terrifying.
As the reports on TrapWire have demonstrated, it is very likely that the video feeds from many traffic cameras, stoplight cameras, and similar devices are monitored by agents of the federal government. If those agents' ability to locate and follow a target increases, the target's ability to evade detection logically decreases proportionally.
That is to say, once a person has been identified by the federal government as a potential threat, that person will be unable to seek refuge anywhere as emerging technology such as PetaVision will put every spot on the planet within the field of vision of the all-seeing, never-blinking eye of government.
Another tool being hammered out on the DARPA anvils is called Videovor. While no specific information on a technology with that name was found, a website offering scholarly journals covering the topic of visualization of video information was discovered.
On that website an abstract of an article written by scholars at the University of Wales, Swansea (U.K.) makes immediately apparent the attraction such work has for the domestic spying agencies of the federal government:
Video data, generated by the entertainment industry, security and traffic cameras, video conferencing systems, video emails, and so on, is perhaps most time-consuming to process by human beings. In this paper, we present a novel methodology for “summarizing” video sequences using volume visualization techniques. We outline a system pipeline for capturing videos, extracting features, volume rendering video and feature data, and creating video visualization. We discuss a collection of image comparison metrics, including the linear dependence detector, for constructing “relative” and “absolute” difference volumes that represent the magnitude of variation between video frames. We describe the use of a few volume visualization techniques, including volume scene graphs and spatial transfer functions, for creating video visualization. In particular, we present a stream-based technique for processing and directly rendering video data in real time. With the aid of several examples, we demonstrate the effectiveness of using video visualization to convey meaningful information contained in video sequences.
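The “absolute difference volumes” the abstract mentions are, at their core, stacks of per-pixel differences between successive frames, which measure how much the video changes over time. Here is a minimal sketch of that basic quantity using NumPy, assuming grayscale frames; it is an illustration of the general idea, not the Swansea authors' implementation.

```python
import numpy as np

def absolute_difference_volume(frames):
    """frames: array of shape (T, H, W); returns the (T-1, H, W) volume
    of per-pixel absolute differences between consecutive frames."""
    frames = np.asarray(frames, dtype=np.float64)
    return np.abs(frames[1:] - frames[:-1])

# Toy video: 3 frames of a 2x2 image; one pixel changes after frame 0.
video = np.array([
    [[0, 0], [0, 0]],
    [[0, 9], [0, 0]],
    [[0, 9], [0, 0]],
])
diff = absolute_difference_volume(video)
print(diff.shape)     # (2, 2, 2)
print(diff[0].sum())  # 9.0 -- change between frames 0 and 1
print(diff[1].sum())  # 0.0 -- frames 1 and 2 are identical
```

A summarization system can then render or threshold such a volume to highlight only the moments where something actually moved, which is what makes hours of static footage quick to skim.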
Among the noteworthy revelations in this abstract is the fact that this technology will be used to render “video data in real time” and that the source of that video feed is to be provided by “security and traffic cameras, video conferencing systems, video emails, and so on.”
It is foreseeable that such immensely powerful video summarizing technologies could be very valuable to the National Security Agency (NSA) employees who will soon be monitoring, recording, and storing the electronic communications of every American using the supercomputers housed at the NSA’s sprawling complex near Salt Lake City, Utah.
Reading the description of the next item on DARPA’s list makes it easy to see why the spy apparatus of the federal government would spend millions supporting the work of scientists who can provide powerful new weapons in the war on privacy. The next weapon: geospatial oriented structure extraction.
As hinted at by the DARPA status report, geospatial oriented structure extraction is designed to deliver “automatic construction of a 3D wireframe of an object using as few images as possible from a variety of angles.”
Again, not much to go on, but a search of the Internet provides a little more color. And the source of the additional information may be another piece of evidence of the dangerous liaison growing between the federal government and local law enforcement.
Nlets is a non-profit organization owned and operated by the states that maintains the National Law Enforcement Telecommunications System. This system is an electronic messaging service that facilitates the exchange of information among state and local law enforcement.
An Nlets website under the heading “International Justice and Public Safety” describes a project called “Geospatial Service Oriented Architecture for Public Safety” (GeoSOAPS). GeoSOAPS, the website says, is co-sponsored by the National Institute of Justice and the Department of Homeland Security. Again, this is the sort of collaboration the Constitution could do without.
In a frightening admission against interest, the website proudly boasts that “Nlets and its member community offer the ideal proving ground for this nationally focused project.”
And just who are the members of the Nlets community? According to its website, every state police force in the United States, the Secret Service, the FBI, the DHS, the Federal Aviation Administration, TSA, the State Department, and Interpol, among others. That is a coalition of such immense power, reach, and resources that no one can escape it, neither in the real world nor the virtual world of cyberspace.
From predictive surveillance to PetaVision, once these tools for warrantless domestic surveillance (in direct violation of the Fourth Amendment) are delivered to DARPA, the vast network of federal spies and local and federal law enforcement will be able to instantly share data collected from video feeds captured by traffic and stoplight cameras on thousands of street corners in nearly every town, in every country around the world. An individual could then be arrested for acts the software merely predicts that person might commit.