Draft for ACM SIGPLAN Patterns (Language Trends)

1996

Why GAWK for AI?

Ronald P. Loui

Most people are surprised when I tell them what language we use in our
undergraduate AI programming class. That's understandable. We use
GAWK. GAWK, GNU's version of Aho, Weinberger, and Kernighan's old
pattern scanning language, isn't even viewed as a programming language by
most people. As with PERL and TCL, most prefer to view it as a "scripting
language." It has no objects; it is not functional; it does no built-in
logic programming. Their surprise turns to puzzlement when I confide
that (a) the students are allowed to use any language they want, and
(b) with a single exception, the best work consistently results from
those working in GAWK. (footnote: The exception was a PASCAL
programmer who is now an NSF graduate fellow getting a Ph.D. in
mathematics at Harvard.) Programmers in C, C++, and LISP haven't even
been close (we have not seen work in PROLOG or JAVA).

Why GAWK?

There are some quick answers that have to do with the pragmatics of
undergraduate programming. Then there are more instructive answers that
might be valuable to those who debate programming paradigms or to those
who study the history of AI languages. And there are some deep
philosophical answers that expose the nature of reasoning and symbolic
AI. I think the answers, especially the last ones, can be even more
surprising than the observed effectiveness of GAWK for AI.

First, it must be confessed that PERL programmers can cobble together AI
projects well, too. Most of GAWK's attractiveness is reproduced in
PERL, and the success of PERL foreshadows some of the success of GAWK.
Both are powerful string-processing languages that allow the programmer
to exploit many of the features of a UNIX environment. Both provide
powerful constructions for manipulating a wide variety of data in
reasonably efficient ways. Both are interpreted, which can reduce
development time. Both have short learning curves. The GAWK manual can
be consumed in a single lab session and the language can be mastered by
the next morning by the average student. GAWK's automatic
initialization, implicit coercion, I/O support, and lack of pointers
forgive many of the mistakes that young programmers are likely to make.
Those who have seen C but not mastered it are happy to see that GAWK
retains some of the same sensibilities while adding what must be
regarded as spoonsful of syntactic sugar. Some will argue that
PERL has superior functionality, but for quick AI applications, the
additional functionality is rarely missed. In fact, PERL's terse syntax
is not friendly when regular expressions begin to proliferate and
strings contain fragments of HTML, WWW addresses, or shell commands.
PERL provides new ways of doing things, but not necessarily ways of
doing new things.

In the end, despite minor differences, both PERL and GAWK minimize
programmer time. Neither really provides the programmer the setting in
which to worry about minimizing run-time.

There are further simple answers. Probably the best is the fact that
increasingly, undergraduate AI programming is involving the Web. Oren
Etzioni (University of Washington, Seattle) has for a while been arguing
that the "softbot" is replacing the mechanical engineers' robot as the
most glamorous AI testbed. If the artifact whose behavior needs to be
controlled in an intelligent way is the software agent, then a language
that is well-suited to controlling the software environment is the
appropriate language. That would imply a scripting language. If the
robot is KAREL, then the right language is "turn left; turn right." If
the robot is Netscape, then the right language is something that can
generate "netscape -remote 'openURL(http://cs.wustl.edu/~loui)'" with
elan.

Of course, there are deeper answers. Jon Bentley found two pearls in
GAWK: its regular expressions and its associative arrays. GAWK asks
the programmer to use the file system for data organization and the
operating system for debugging tools and subroutine libraries. There is
no issue of user interface. This forces the programmer to return to the
question of what the program does, not how it looks. There is no time
spent programming a binsort when the data can be shipped to /bin/sort
in no time. (footnote: I am reminded of my IBM colleague Ben Grosof's
advice for Palo Alto: Don't worry about whether it's highway 101 or 280.
Don't worry if you have to head south for an entrance to go north. Just
get on the highway as quickly as possible.)

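As a rough illustration of the two "pearls" and of handing the sorting
to the operating system, here is a minimal sketch, not taken from the
original essay. It is written as a shell function around an awk program;
the function name word_freq, the word-counting task, and the output
format are all assumptions chosen for the example.

```shell
# Hypothetical helper (name, task, and output format are assumptions).
# Reads text on stdin and prints "word count" lines, most frequent first.
word_freq() {
    awk '
    {
        # Regular expression: turn every run of non-letters into a space,
        # so that splitting on blanks yields bare lower-case words.
        line = tolower($0)
        gsub(/[^a-z]+/, " ", line)
        n = split(line, words, " ")
        for (i = 1; i <= n; i++)
            count[words[i]]++        # associative array: word -> occurrences
    }
    END {
        # No sorting code is written here: the pairs are piped to the
        # system sort (by count, descending, then by word).
        cmd = "LC_ALL=C sort -k2,2nr -k1,1"
        for (w in count)
            print w, count[w] | cmd
        close(cmd)
    }'
}
```

A call such as "word_freq < notes.txt" (the file name is only a
placeholder) would list each word with its count. The whole program is a
regular expression, one array, and a pipe to sort.
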
There are some similarities between GAWK and LISP that are illuminating.
Both provided a powerful uniform data structure (the associative array
implemented as a hash table for GAWK and the S-expression, or list of
lists, for LISP). Both were well-supported in their environments (GAWK
being a child of UNIX, and LISP being the heart of LISP machines). Both
have trivial syntax and find their power in the programmer's willingness
to use the simple blocks to build a complex approach.

Deeper still is the nature of AI programming. AI is about
functionality and exploratory programming. It is about bottom-up design
and the building of ambitions as greater behaviors can be demonstrated.
Woe be to the top-down AI programmer who finds that the bottom-level
refinements, "this subroutine parses the sentence," cannot actually be
implemented. Woe be to the programmer who perfects the data structures
for that heapsort when the whole approach to the high-level problem
needs to be rethought, and the code is sent to the junkheap the next day.

AI programming requires high-level thinking. There have always been a few
gifted programmers who can write high-level programs in assembly language.
Most, however, need the ambient abstraction to have a higher floor.

Now for the surprising philosophical answers. First, AI has discovered
that brute-force combinatorics, as an approach to generating intelligent
behavior, does not often provide the solution. Chess, neural nets, and
genetic programming show the limits of brute computation. The
alternative is clever program organization. (footnote: One might add
that the former are the AI approaches that work, but that is easily
dismissed: those are the AI approaches that work in general, precisely
because cleverness is problem-specific.) So AI programmers always want
to maximize the content of their program, not optimize the efficiency
of an approach. They want minds, not insects. Instead of enumerating
large search spaces, they define ways of reducing search, ways of
bringing different knowledge to the task. A language that maximizes
what the programmer can attempt, rather than one that provides tremendous
control over how to attempt it, will be the AI choice in the end.

Second, inference is merely the expansion of notation. No matter whether
the logic that underlies an AI program is fuzzy, probabilistic, deontic,
defeasible, or deductive, the logic merely defines how strings can be
transformed into other strings. A language that provides the best
support for string processing in the end provides the best support for
logic, for the exploration of various logics, and for most forms of
symbolic processing that AI might choose to call "reasoning" instead of
"logic." The implication is that PROLOG, which saves the AI programmer
from having to write a unifier, saves perhaps two dozen lines of GAWK
code at the expense of strongly biasing the logic and representational
expressiveness of any approach.

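To make the "strings into other strings" view concrete, here is a small
sketch, again not from the original essay, of propositional forward
chaining in awk wrapped in a shell function. It is deliberately simpler
than the unifier mentioned above: there are no variables and no
unification. The function name forward_chain and the input format
("fact X" and "rule A B -> C" lines) are assumptions made for this
example.

```shell
# Hypothetical helper (name and input format are assumptions).
# Input lines:  fact NAME            a known proposition
#               rule P1 P2 ... -> C  if all premises hold, conclude C
# Prints "derived C" for each new conclusion, in the order it is found.
forward_chain() {
    awk '
    $1 == "fact" { known[$2] = 1 }
    $1 == "rule" {
        # Fields between "rule" and "->" are premises; the last field
        # is the conclusion.
        nrules++
        head[nrules] = $NF
        body[nrules] = ""
        for (i = 2; i <= NF - 2; i++)
            body[nrules] = body[nrules] " " $i
    }
    END {
        # Keep sweeping over the rules until a sweep adds nothing new.
        do {
            changed = 0
            for (r = 1; r <= nrules; r++) {
                if (head[r] in known)
                    continue
                n = split(body[r], prem, " ")
                ok = 1
                for (i = 1; i <= n; i++)
                    if (!(prem[i] in known))
                        ok = 0
                if (ok) {
                    known[head[r]] = 1
                    print "derived", head[r]
                    changed = 1
                }
            }
        } while (changed)
    }'
}
```

Everything here is field splitting and array membership tests on
strings. Changing the logic (for instance, adding rule priorities for a
defeasible variant) would mean changing how the strings are matched and
rewritten, which is the kind of exploration the paragraph above has in
mind.
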
I view these last two points as news not only to the programming language
community, but also to much of the AI community that has not reflected on
the past decade's lessons.

In the puny language, GAWK, which Aho, Weinberger, and Kernighan thought
not much more important than grep or sed, I find lessons in AI's trends,
AI's history, and the foundations of AI. What I have found not only
surprising but also hopeful is that when I have approached the AI
people who still enjoy programming, some of them are not the least bit
surprised.


R. Loui (loui@ai.wustl.edu) is Associate Professor of Computer Science
at Washington University in St. Louis. He has published in AI Journal,
Computational Intelligence, ACM SIGART, AI Magazine, AI and Law, the ACM
Computing Surveys Symposium on AI, Cognitive Science, Minds and
Machines, Journal of Philosophy, and is on this year's program
committees for AAAI (National AI conference) and KR (Knowledge
Representation and Reasoning).