3.4 Developing Evaluation Criteria




The findings from the study provide insight into which cognitive demands of
supervisory control under data overload are most prominent in intelligence
analysis. Conducting the study allowed us to target more directly designs
that would be useful to analysts because they reduce vulnerabilities to
generating inaccurate or incomplete analytic products. Had we not conducted
the study, we might have designed systems that appeared innovative in a
demonstration and could be useful, but that would likely have incorporated
infrequently used features, adding unnecessary complexity to the interface
and expense to the design process.

In addition to generating new design concepts that we pursued, the study
allowed us to translate the identified vulnerabilities into specific criteria that
successful responses to the data overload problem in intelligence analysis need
to satisfy. This step therefore also yielded criteria that can be used to
evaluate objectively the usefulness of any design concept intended to combat
data overload in intelligence analysis.

1. Recognition of Unexpected Information. Bring analysts' attention to highly
informative or definitive data and relationships between data, even when the
practitioners do not know to look for that data explicitly. Informative data
includes "high profit" documents, data that indicates an escalation of
activities or a disrupting event, and data that deviates from expectations. A
particularly difficult part of this criterion, which should be designed into
evaluation scenarios, is helping analysts recognize updates that overturn
previous information.

2. Management of Uncertainty. Aid analysts in managing data uncertainty. In
particular, solutions should help analysts identify, track, and revise
judgments about data conflicts and aid in the search for updates on thematic
elements.

3. Broadening. Help analysts to avoid prematurely closing the analysis process.
Solutions should broaden the search for or recognition of pertinent
information, break fixations on single hypotheses, and/or widen the
hypothesis set that is considered to explain the available data.

These evaluation criteria are interesting, in part, because they are so difficult to
address. We realized quickly that they are not amenable to simple,
straightforward adjustments or feature additions to current tools. Meeting these
design criteria will require fundamentally novel design concepts.4


At this point in the project, we took stock of what our methodology had
provided us as a research/design team. Already at the completion of the
study, we could see progress that reinforced our belief in the complementarity
of research and design, and in learning about understanding, usefulness, and
usability in parallel. The study paid off in several ways:

• it contributed to our general understanding of data overload, as evidenced by
helping us in other settings such as NASA Space Station mission control,
• it revealed the world of the analyst effectively and grounded general concepts
to the particulars of the situation the professional analyst faces,
• we were able to identify characteristics of intelligence analysis that were
shared with, or distinct from, other settings in which we had more experience,
• generative design sessions became noticeably more productive; we were able
to eliminate candidate directions and features and reach better consensus
within the team on which concepts to emphasize,
• we were better able to critique and offer suggestions to improve ongoing
projects aimed at solving the data overload problem,
• it generated new practice-centered criteria for evaluating proposed solutions
to data overload,
• it is serving as a basis for interaction and as a stimulus to a more constructive
dialogue among analysts, developers, and others about useful design directions
to pursue,5
• many in the analyst community could take home lessons for their own role or
work. For example, a spin-off project is underway at the agency in which the
Ariane 501 scenario and database will be used as a training vehicle for new
analysts while they wait for their clearance.


4. We would like to note that these criteria, although so easily recognized in hindsight that they
have been challenged as obvious, differ from the criteria that were described to us previously.
For example, prior to conducting the study, the criteria offered to us at various points for
addressing the data overload problem were: 1) to have an analyst be able to read it all, 2) to find the
relevant information that is needed to perform an analysis, 3) to visualize the landscape of the
information space, 4) to have the machine tell an analyst when an important message has been
received, 5) to see an overview of events in an area that have not been monitored for some time,
and 6) to have the machine summarize the important points in each message.
5. It was fascinating to watch a developer and an analyst interact, after one of our presentations,
around the kinds of concrete issues that the study captured.


