2.2 Defining Data Overload


An important early step was to translate the cognitive tasks, intelligence analysis
domain, and data overload problem into scientific terms that would allow us to
leverage relevant Cognitive Systems Engineering research bases. This translation
allowed us to determine relatively quickly that:

• The main cognitive task in intelligence analysis is inferential analysis,
which involves determining the best explanation for uncertain, often
contradictory and incomplete data. The inferential analysis task could be
defined as abductive reasoning (Josephson and Josephson, 1994) in the
sense that the analytic product is always contestable because of the
uncertainty, but that certain conclusions and analytic processes could be
argued to be better than others and recognized as such by experts.

• Another cognitive framing of intelligence analysis could be that of a
supervisory controller (the intelligence analyst) monitoring a process
(national technological and human processes/capabilities). The main
difference between traditional supervisory control and intelligence
analysis is that it is difficult to conduct interventions, either to alter the
process (therapeutic interventions) or to obtain additional information
(diagnostic interventions). Another distinction is that, because the data
are mostly in free-form text, it is difficult to set alarms on setpoint
crossings, as can be done with parameter data.

• The intelligence analysis domain is a socio-technical system with many
similarities to other domains studied by cognitive systems engineers. An
analyst monitors a system that is complex and interconnected.
Intelligence analysis is a difficult task that requires significant expertise
and is performed under time pressure and with high consequences for
failure.

Not surprisingly, we found that defining "data overload" was much more
challenging than characterizing the main cognitive tasks and intelligence analysis
domain. Although everyone in the literature agreed that data overload was an
important problem that was difficult to address, definitions of data
overload varied widely. Common to most views of data overload in
supervisory control domains was the notion that excessive amounts of data
increased cognitive burdens for the human operator. Beyond that, however, the
wide variety of design aids touted to "solve data overload" attested to the
variability in definitions of the data overload problem (see Woods, Patterson,
and Roth, 1998, for an extended discussion of different characterizations of the
data overload problem and their associated solutions).

Given this variability in definitions of data overload, we returned to
"first principles" to arrive at a definition of the data overload
problem. In cognitive systems engineering, the fundamental unit of analysis is
the "Cognitive Triad", which includes the demands of the work domain, the
strategies of the practitioners, and the artifacts and other agents that support the
cognitive processes. Because the triad is the unit of analysis, we rejected
definitions that isolated the data from practitioners, domain constraints,
tasks, and artifacts. Instead, we defined
data overload to be a condition where a domain practitioner, supported by
artifacts and other human agents, finds it extremely challenging to focus in on,
assemble, and synthesize the significant subset of data for the problem context
into a coherent assessment of a situation, where the subset of data is a small
portion of a vast data field. The starting point for this definition was recognizing
that large amounts of potentially available data stressed one kind of cognitive
activity: focusing in on the relevant or interesting subset of data for the current
problem context. When operators miss critical cues, prematurely close the
analysis process, or are unable to assemble or integrate relevant data, this
cognitive activity has broken down.

People are a competence model for this cognitive activity because they are the
only known cognitive system that is able to focus in on interesting material in
natural perceptual fields even though what is interesting depends on context
(Woods and Watts, 1997). The ability to orient focal attention to "interesting"
parts of the natural perceptual field is a fundamental competency of human
perceptual systems (Rabbitt, 1984; Wolfe, 1992). Both visual search studies and
reading comprehension studies show that people are highly skilled at directing
attention to aspects of the perceptual field that are of high potential relevance
given the properties of the data field and the expectations and interests of the
observer. Reviewing visual search studies, Woods (1984) commented, "When
observers scan a visual scene or display, they tend to look at 'informative' areas
. . . informativeness, defined as some relation between the viewer and scene, is an
important determinant of eye movement patterns" (p. 231, italics in original).
Similarly, reviewing reading comprehension studies, Bower and Morrow (1990)
wrote, "The principle . . . is that readers direct their attention to places where
significant events are likely to occur. The significant events . . . are usually those
that facilitate or block the goals and plans of the protagonist."

Without this ability, an observer is in the position of a newborn. As William James
put it over a hundred years ago, "The baby assailed by eye, ear, nose, skin and entrails
at once, feels it all as one great blooming, buzzing confusion" (James, 1890, I 488).
The explosion in available data and the limits of current computer-based
displays often leave us in the position of that baby -- seeing a "great blooming,
buzzing confusion."


