Artificial Intelligence and International Relations: An Overview

Philip A. Schrodt

Artificial intelligence techniques have been used to model international behavior for close to twenty years. Alker and his students at MIT were generating papers throughout the 1970s (Alker and Christensen 1972; Alker and Greenberg 1976; Alker, Bennett, and Mefford 1980), and by the early 1980s work at Ohio State was responsible for the first article to apply AI in a mainstream IR journal (Thorson and Sylvan 1982) and the first edited book containing a number of AI articles (Sylvan and Chan 1984). This volume provides the first collection of essays focusing on AI as a technique for modeling international behavior,1 an approach commonly, if controversially, labeled "AI/IR." The essays come from about twenty-five contributors at fifteen institutions across the United States and Canada; the substantive foci range from Japanese energy security policy to Vietnam policy in the Eisenhower administration. The purpose of this overview is twofold. First, it will provide a brief background of relevant developments in AI in order to provide some perspective on the concepts used in AI/IR. Second, it will identify some common themes in the AI/IR literature that use artificial languages for modeling. I will not deal with any of the issues in depth, nor provide extensive bibliographical guidance, as these are ably presented in the chapters themselves (e.g., those by Mefford and Purkitt in this volume; Mallery [1988] also provides an excellent introduction). I will not discuss the natural language processing (NLP) literature—which is covered in the chapters by Mallery, Duffy, and Alker et al.—though I will discuss projects that use text as a

1. Cimbala (1987) and Andriole and Hopple (1988) provide introductions oriented toward U.S. defense concerns, but these are only a narrow part of the field of IR generally.

source of information for development of models rendered in artificial language (e.g., Bennett and Thorson; Boynton; Sylvan, Milliken, and Majeski). This chapter does not purport to provide a definitive description of the AI/IR field; it is simply one person's view of the organization of the field at the moment. In contrast to many other modeling approaches, the AI/IR community is characterized by a healthy level of internal debate. This chapter is the overture, not the symphony. I intend only to draw your attention to themes; the details, in both melody and counterpoint, are found in the chapters that follow.

Artificial Intelligence

The label "artificial intelligence" is, ironically, rejected by a majority of the authors in this volume as a description of their shared endeavor. The preferred label is "computational modeling," which acknowledges the field's intellectual roots in the formal modeling and computer simulation literature within political science, rather than in the AI literature of computer science. As will be noted below, the AI/IR efforts utilize only a tiny subset of AI methods, and in many respects AI/IR overlaps at least as much with cognitive psychology as with computer science.

The AI label poses two additional problems. The most severe is guilt by association with "the AI hype": the inflated claims made for AI by the popular media, science fiction, and consulting firms. The AI hype has been followed by the backlash of the "AI winter," and so AI/IR risks being caught in a counterrevolution just as it is beginning to produce results. The second problem is the controversial word "intelligence." In the AI hype, "intelligence" has usually been associated with superior intelligence such as that exhibited by Star Wars robots (either the George Lucas or Ronald Reagan variety). The most common retort I encounter when presenting AI/IR overviews to unsympathetic audiences is: “You can't model politics using artificial intelligence;

you'd have to use artificial stupidity.”2 As the chapters that follow indicate, that is precisely our shared agenda! “Artificial stupidity” involves limited information processing, heuristics, bounded rationality, group decision processes, the naive use of precedent and memory over logical reasoning, and so forth. These features of human reasoning, amply documented in the historical and psychological literature, are key to AI/IR but largely absent from the optimizing models of the dominant formal paradigm in political science, rational choice (RC). Ironically, the true "artificial" intelligence is utility maximization, not the processes invoked in computational models.

All this being said, one must confront two social facts. First, the term "computational modeling" has not caught on because it is not reinforced by the popular media. Second, AI/IR has borrowed considerably from that part of computer science and the cognitive sciences which calls itself "artificial intelligence," including the widespread use of LISP and Prolog as formal languages; the formalization of rules, cases, and learning; and a great deal of vocabulary. In the spirit of mathematician David Hilbert's definition of geometry as "that which is done by geometers," the AI label will probably stick.

AI in the Early 1980s

The term "artificial intelligence" refers to a very large set of problems and techniques ranging from formal linguistic analysis to robots. Researchers in "AI" may be mathematicians or mechanics, linguists or librarians, psychologists or programmers. Schank (1987:60) notes:

Most practitioners would agree on two main goals in AI. The primary goal is to build an intelligent machine. The second goal is to find out about the nature of intelligence. . . . [However,] when it comes down to it, there is very little agreement about what exactly constitutes intelligence. It follows that little agreement exists in the AI community about exactly what AI is and what it should be.

2. This seemingly original joke seems to occur to almost everyone. . . .

Research in AI has always proceeded in parallel, rather than serially, with dozens of different approaches being tried on any given problem. As such, AI tends to progress through the incremental accumulation of partial solutions to existing problems, rather than through dramatic breakthroughs. Nonetheless, from the standpoint of AI/IR, there were two important changes in AI research in the late 1970s.

First, rule-based "expert systems" were shown to be able to solve messy and nontrivial real-world problems such as medical diagnosis, credit approval, and mechanical repair at the same level of competence as human experts (see, for example, Klahr and Waterman 1986). Expert systems research broke away from the classical emphasis in AI on generic problem solving (e.g., as embodied in chess-playing and theorem-proving programs) toward an emphasis on knowledge representation. Expert systems use simple logical inference on complex sets of knowledge, rather than complex inference on simple sets of knowledge. The commercial success of expert systems led to an increase in new research in AI generally —the influx of funding helped— and spun off a series of additional developments such as memory-based reasoning, scripts, schemas, and other complex knowledge representation structures.

Second, the personal computer, and the exponential increase in the capabilities of computers more generally, brought the capabilities of a 1960s mainframe onto the researcher's desk. The small computers also freed AI researchers from dependence on the slow and idiosyncratic software development designed for centralized mainframes. The "AI style" of programming led to a generation of programmers and programming environments able to construct complicated programs that would have been virtually impossible using older languages and techniques.

All of this activity led to a substantial increase in the number of people doing AI. The American Association for Artificial Intelligence (AAAI) was founded in 1979, had 9,935 members by 1985 and 14,269 by 1986 — growth of 43 percent in a single year. In short, AI in the 1980s was accompanied by a great deal of concrete research activity, in contrast to faddish techniques such as catastrophe theory.
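The shift these systems represent (simple logical inference over a complex knowledge base, rather than complex inference over simple knowledge) can be sketched in a few lines of Python. This is a minimal illustration with invented rules and facts, not code from any system discussed here:

```python
# Minimal forward-chaining inference, the core loop of a rule-based
# expert system: fire any rule whose antecedents are all established,
# add its consequent, and repeat until nothing new can be derived.
# The "knowledge" lives in the rule set; the inference step is trivial.

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if consequent not in facts and antecedents <= facts:
                facts.add(consequent)
                changed = True
    return facts

# Invented toy rules in the spirit of a diagnostic expert system.
RULES = [
    ({"engine cranks", "car will not start"}, "fuel or ignition fault"),
    ({"fuel or ignition fault", "fuel gauge reads empty"}, "out of fuel"),
]

derived = forward_chain(
    {"engine cranks", "car will not start", "fuel gauge reads empty"}, RULES
)
# derived now also contains "fuel or ignition fault" and "out of fuel"
```

A commercial system differs mainly in scale (thousands of rules) and in its control regime, not in the logic of this loop.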

The AI Hype

Perhaps predictably, the increase in AI research was accompanied (and partially fueled) by a great deal of hype in the popular and semiprofessional media. An assortment of popular books on AI were produced by researchers such as Feigenbaum (Feigenbaum and McCorduck 1983; Feigenbaum, McCorduck, and Nii 1988), Minsky (1986), and Schank (Schank and Riesbeck 1981); at times these reached sufficient popularity to be featured by paperback book clubs. Journalists such as McCorduck (1979), Sanger (1985), and Leithauser (1987) provided glowing appraisals of AI; these are only three of the hundreds of books and popular articles appearing in the early to mid-1980s. Concern over the Japanese "Fifth Generation Project" (Feigenbaum and McCorduck 1983) provided impetus for the wildly unrealistic3 "Strategic Computing Initiative" of the Defense Advanced Research Projects Agency (DARPA 1983).

These popular works provided a useful corrective to the outdated and largely philosophical criticisms of Dreyfus (1979) and Weizenbaum (1976) about the supposed limits of AI. By the early 1980s researchers had made substantial progress on problems that by any reasonable definition required "intelligence" and were exhibiting performance comparable to or exceeding that of humans. However, the popularizations were understandably long on concepts and short on code, and their explicit or implicit promises for continued exponential expansion of the capabilities of various systems did not take into account the tendency of technological innovation to follow a logistic curve.4 Because the promises made in these popularizations were

3. For example, DARPA's 1983 timetable calls for the following developments by 1990: vision subsystems with "1 trillion Von-Neumann equivalent instructions per second"; speech subsystems operating at a speed of 500 MIPS that "can carry on conversation and actively help user form a plan [sic]"; and "1,000 word continuous speech recognition." Each of these projected capabilities is 10 to 100 times greater than the actual capabilities available in 1990, despite massive investments by DARPA. For additional discussion, see Waldrop (1984); Bonasso (1988) provides a more sympathetic assessment.

4. A 10,000-rule expert system is unlikely to achieve ten times the performance of a 1,000-rule system. To the contrary, the 1,000-rule system will probably have 80-90 percent of the functionality of the large system; the additional 9,000 rules are devoted almost exclusively to the residual 10-20 percent of the cases.

based largely on laboratory results that had not been scaled up nor widely applied in real-world settings, such promises set up AI for a fall.
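The logistic tendency just mentioned can be illustrated numerically. The sketch below is a toy model of my own (all parameter values invented), contrasting the early, near-exponential regime that fueled the extrapolations with the saturation regime they ignored:

```python
import math

# Logistic growth: capability approaches a ceiling L instead of
# growing exponentially forever.  Parameters are illustrative only.
def logistic(t, L=1.0, k=1.0, t0=0.0):
    return L / (1.0 + math.exp(-k * (t - t0)))

# Early on, each unit of time multiplies capability (near-exponential);
# near the ceiling, the same unit of time adds almost nothing.
early_ratio = logistic(-3) / logistic(-4)   # roughly 2.6x growth per step
late_ratio = logistic(4) / logistic(3)      # roughly 1.03x: saturation
```

Extrapolating from the early ratio, as the popularizations implicitly did, overstates later capability by an order of magnitude or more.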

The AI Winter

The hype of the mid-1980s leveled off by the latter part of that decade, and some segments of the AI community —particularly companies producing specialized hardware— experienced the "AI Winter." However, the decline of AI was more apparent than real, reflecting the short attention span of the popular press as attention turned from AI to global warming, superconducting supercolliders, parallel processing, and cold fusion. Experimental developments, most notably neural networks, continued to attract periodic media attention, but the mainstream of AI assumed a level of glamor somewhere between that of biotechnology and X-ray lasers: yesterday's news, and somewhat suspect at that. Yet ironically, the well-publicized bankruptcies of "AI firms" (see, for example, Pollack 1988) were due to the success rather than the failure of AI. As AI techniques moved out of the laboratories and into offices and factories, commercial demand shifted from specialized "AI workstations" and languages such as LISP to systems implemented on powerful general-purpose microcomputers using standard procedural programming languages such as C or off-the-shelf expert system shells. AI research moved in-house and was diffused into thousands of small applications rather than a few large ones. Overall, the AI field remained very healthy. Although membership in the AAAI declined in 1988 and 1989, dropping to 12,500 members, the 1989 International Joint Conference on Artificial Intelligence, the AI equivalent of the International Political Science Association, was large enough to require the Detroit Convention Center and to fill every convention hotel in downtown Detroit and nearby Windsor, Ontario, despite a $200 conference registration fee.

Beyond the issues of popular perception, it is important to note that the future of AI/IR is largely independent of the successes or failures of AI generally. Whether a chess-playing program will be able to defeat the reigning human grand master or whether simultaneous translation of spoken language is possible will have no effect on most AI/IR research. Even if mainstream AI has some implications for the development of computational models of international behavior, the AI/IR literature is primarily shaped by literatures in psychology, political science, and history rather than computer science. The techniques borrowed from computer science are only tools for implementing those theories.

AI/IR Research: A Framework

This section will attempt to structure the various sets of problems studied in AI/IR. For example, there has been frequent confusion outside the field as to why discourse analysis (represented in this volume by Boynton; Thorson and Bennett; and Sylvan, Milliken, and Majeski) should have anything to do with rule-based models (e.g., Job and Johnson), because the techniques are entirely different. The simple answer is that the AI/IR literature is primarily linked by underlying theories and questions rather than by methodology. Although this is consistent with classical Kuhnian notions of science, it is decidedly uncharacteristic of formal approaches to the study of international behavior such as correlational analysis, arms-race models, and game-theoretic models of war initiation, which are largely linked by technique.

As noted earlier, any effort to find common themes in a field as conceptually rich and disputatious as AI/IR is fraught with the risk of oversimplification. This chapter is simply a survey of the high points. AI/IR developed in an evolutionary fashion; the organization I have presented below is a typology imposed, ex post facto, on an existing literature, rather than an attempt to present a consensus view of where the field is going.

The typology consists of three parts. The first category is the research on patterns of political reasoning, which provides the empirical grounding for models of

organizational decision making. The second category involves the development of static models of organizational decision making, which aim to duplicate the behavior of an organization or system at a specific point in time. This is the largest part of the literature in terms of models that have actually been implemented, and it relies on the expert systems literature in the AI mainstream. The final category contains dynamic models that incorporate precedent, learning, and adaptation, which can show how an organization acquired its behavior as well as what that behavior is. These models are necessarily more elaborate and experimental, though some large-scale implementations exist, notably the JESSE model of Sylvan, Goel, and Chandrasekaran.

Patterns of Political Reasoning

The Psychological Basis. Virtually all work in AI/IR acknowledges an extensive debt to experimental work in cognitive psychology. Although a wide variety of approaches are cited, two literatures stand out.

The first influence is the work of Allen Newell and Herbert Simon on human problem solving (Newell and Simon 1972; Simon 1979, 1982). Simon's early work pointing to the preeminence of satisficing over maximizing behavior is almost universally accepted in AI/IR, as is the Newell-Simon observation that human cognition involves an effectively unlimited (albeit highly fallible) memory but a fairly limited capacity for logical reasoning. These assumptions about human problem solving are exactly opposite those of the rational choice approach, where cognition involves very little memory but optimization is possible. In addition to these general principles, other work by Newell and Simon on specific characteristics of human problem solving is frequently invoked, for example the re-use of partial solutions and the distinction between expert and novice decision making.

The second very large experimental literature is the work of Daniel Kahneman, Paul Slovic, Amos Tversky (KST), and their associates in exploring the numerous departures of actual human decision making from the characteristics predicted by

utility maximization and statistical decision theories (Kahneman, Slovic, and Tversky 1982). This work has emphasized, for example, the importance of problem framing, the use of heuristics, the effects of familiarity and representativeness, and so forth.

Even though the experimental results, general principles, and concepts of these two literatures are used extensively in AI/IR, their theoretical frameworks are not. Newell and his students have developed a general computational paradigm for AI, SOAR (Laird, Rosenbloom, and Newell 1986; also see Waldrop 1988), but to my knowledge it has not been applied in the IR context. "Prospect theory," the term usually applied to the KST work, is also rarely used. These research results are used instead to explicate some of the characteristics of IR decision making, which, because it is organizational and frequently involves unusual circumstances such as the decision to engage in lethal violence, is far removed from the individualistic studies of much of the psychological literature. Some work on group decision-making dynamics has also been used—for example, Pennington and Hastie on decisions by juries (cited in Boynton)—but in general this literature is smaller and less well known in cognitive psychology than the theories and experiments on individuals.

Knowledge Representation. Consistent with the Newell-Simon approach, AI/IR models are heavily information-intensive. However, the theory of data employed in AI/IR is generally closer to that of history or traditional political science than it is to statistical political science. Most of the chapters in this volume use archival text as a point of departure; the remainder use secondary data such as events that were originally derived from text using procedures similar to content analysis. The ordinal and interval-level measures common to correlational studies, numerical simulations, and expected utility models are almost entirely absent.
This, in turn, leads to the issue of knowledge representation, which is a theme permeating almost all of the papers and which accounts for much of their arcane vocabulary and seeming inconsistency. Whereas behavioral political analysis essentially has only three forms of knowledge representation —nominal, ordinal, and interval variables— AI presents a huge variety, ranging from simple if . . . then statements and decision trees to scripts and frames to self-modifying programs and neural networks. This surfeit of data structures is both a blessing and a curse. It provides a much broader range of alternatives than are available in classical statistical or mathematical modeling, and certainly provides a number of formal structures for representing the large amounts of information involved in political decision making; this comes at the expense of a lack of closure and an unfamiliar vocabulary. Part of this problem stems from the fact that knowledge representation concepts have yet to totally jell within their parent discipline of computer science. In this regard AI/IR is quite different from the behavioralist adoption of statistical techniques and the RC adoption of economic techniques: In both cases stable concepts and vocabulary were borrowed from more mature fields.

Structure of Discourse and Argument. For the outsider, perhaps the most confusing aspect of the AI/IR literature is the emphasis on the analysis of political argument and discourse. This type of analysis is found in the articles by Boynton; Sylvan, Milliken, and Majeski; and Bennett and Thorson; more sophisticated tools for dealing with discourse are found in the natural language processing (NLP) articles. At first glance, rummaging through the Congressional Record or using Freedom of Information Act requests to uncover Vietnam-era documents is the antithesis of the formal modeling approach. Archival sources are the stuff of history, not models.

In fact, these analyses are at the core of modeling organizational cognition. As such the archival work is simply specialized research along the lines of the general psychological studies. The political activities modeled in AI/IR are, without exception, the output of organizations. Because the AI/IR approach assumes that organizations are intentional and knowledge-seeking, it is important to know how they reason.
Conveniently, organizations leave a very extensive paper trail of their deliberations. Although archival sources do not contain all of the relevant information required to reconstruct an organization's behavior —organizations engage in deliberations that are not recorded and occasionally purposely conceal or distort the records of their deliberations— such sources are certainly worthy of serious consideration.5
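The gap between the three behavioral variable types and AI's richer structures can be made concrete. Below, the same diplomatic event is coded first as a single interval-level score and then as a frame with typed slots; the event, slot names, and values are all invented for illustration:

```python
# One invented event, two representations.

# Behavioral coding: a single interval-level score on a
# conflict-cooperation scale; everything else is discarded.
event_score = -3.2

# Frame-style coding: typed slots, some filled by sub-structures,
# retaining the information a rule or precedent-matcher might consult.
event_frame = {
    "type": "diplomatic protest",
    "actor": {"name": "State A", "role": "sender"},
    "target": {"name": "State B", "role": "recipient"},
    "issue": "border incident",
    "precedents": ["1986 note verbale", "1987 ambassadorial recall"],
    "escalatory": True,
}

# A rule can branch on any slot, e.g. naive reasoning from precedent:
has_precedent = bool(event_frame["precedents"])
```

The score supports correlational analysis; the frame supports the precedent-based and rule-based reasoning discussed in the chapters that follow.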

Static Modeling: Rule-Based Systems

Rule-based systems (RBS) are currently the most common form of AI/IR model, and even systems that go well beyond rules, such as the JESSE simulation, contain substantial amounts of information in the form of rules. Contemporary RBS are largely based on an expert systems framework, but the "production systems" that dominated much of AI modeling from the late 1950s to the early 1970s are also largely based on rules; early production system models of political behavior include Carbonell (1978) in computer science and Sylvan and Thorson (1982) in IR. In addition to chapters by the authors in this volume, other models of international behavior using the rule-based approach have included Soviet crisis response (Kaw 1989), Chinese foreign policy (Tanaka 1986), the political worldview of Jimmy Carter (Lane 1986), and Chinese policy toward Hong Kong (Katzenstein 1989).

In its simplest form an RBS is just a large set of if . . . then statements. For example, a typical rule from Job and Johnson's UNCLESAM program —a simulation of U.S. policy toward the Dominican Republic— has the form

IF U.S. Posture to the Dominican Republic government > 4
   and Stability Level >= 5
   and Stability Level Change > 0
THEN increment U.S. Use of Force Level by 1

An RBS may have hundreds or thousands of such rules; they may exist independently, as in Job and Johnson or production system models, but more

5. Conceptually, this effort is similar to the "cognitive mapping" methodology pursued in Axelrod (1976); the "operational code" studies (e.g., George 1969; George and McKeown 1985) are other antecedents.

typically are organized into hierarchical trees (for example, Hudson in this volume or Kaw 1989). Typical commercial expert systems used for diagnosis or repair have about 5,000 rules; most AI/IR systems are far simpler. Mefford's chapter in this volume describes in considerable detail RBS developments beyond basic if . . . then formulations; one should also note that the boundaries between the more complicated RBS and other types of models (for example, case-based reasoning and machine learning systems) are extremely fuzzy. Nonetheless, virtually all AI models encode some of their knowledge in the form of rules.6

Despite the near ubiquity of rules in computational models, this approach stands in clear contrast to all existing formal modeling traditions in political science, which, without exception, use algebraic formulations to capture information. These methods encode knowledge by setting up a mathematical statement of a problem and then doing some operations on it (in RC models, optimization; in statistics, estimation; in dynamic models, algebraic solution or numerical simulation). The cascading branching of multiple rules found in RBS is seldom if ever invoked; when branches are present they usually deal only with boundary conditions or bifurcations7 and are simple in structure. Although much of the impetus for the development of RBS in political science came from their success in the expert systems literature, rules are unusually well suited to the study of politics, as much of political behavior is explicitly rule-based through legal and bureaucratic constraints. Laws and regulations are nothing more than rules: These may be vague, and they certainly do not entirely determine behavior, but they constrain behavior considerably. Analyses of the Cuban Missile

6. Neural networks are the primary exception to this characteristic and are a current research focus precisely because they offer an alternative to rule-based formulations.

7. For example, an action A would be taken in the expected utility formulation E(A) = p(100) + (1-p)(-50) if and only if p > 1/3; the dynamic model x_{t+1} = ax_t + b is stable if and only if |a| < 1.
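The UNCLESAM rule quoted earlier translates almost directly into code. The sketch below is my own illustration (the state variables follow the quoted rule; UNCLESAM itself is not written in Python):

```python
# One rule in the style of the quoted UNCLESAM rule: the antecedents
# test the simulation state, the consequent adjusts a policy variable.

def use_of_force_rule(state):
    """IF posture > 4 AND stability >= 5 AND stability change > 0
       THEN increment U.S. Use of Force Level by 1."""
    if (state["posture"] > 4
            and state["stability"] >= 5
            and state["stability_change"] > 0):
        state["use_of_force"] += 1

state = {"posture": 5, "stability": 6, "stability_change": 1, "use_of_force": 0}
use_of_force_rule(state)       # antecedents hold: use_of_force becomes 1

state["stability_change"] = 0
use_of_force_rule(state)       # antecedents fail: state is unchanged
```

A full RBS would hold hundreds of such rules plus a control regime, whether a simple scan or a hierarchical tree, deciding the order in which they fire.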