293x Filetype PDF File size 1.73 MB Source: staff.um.edu.mt
programming
by @z Bentley pearls
LITTLE LANGUAGES
When you say “language,” most programmers think the June column, for instance, Doug McIlroy imple-
of the big ones, like FORTRAN or COBOL or Pascal. mented a program to find the K most common words
In fact, a language is any mechanism to express in- in a document in six lines of the UNIXe SHELL
tent, and the input to many programs can be viewed language.
profitably as statements in a language. This column Languages surround programmers, yet many pro-
is about those “little languages.” grammers don’t exploit linguistic insights. Examin-
Programmers deal with microscopic languages ev- ing programs under a linguistic light can give you a
ery day. Consider printing a floating-point number better understanding of the tools you now use, and
in six characters, including a decimal point and two can teach you design principles for building elegant
subsequent digits. Rather than writing a subroutine interfaces to your future programs. This column will
for the task, a FORTRAN programmer specifies the show how the user interfaces to half a dozen inter-
format ~6.2, and a COBOL programmer defines the esting programs can be viewed as little languages.
picture 999.99. Each of these descriptions is a This column is built around Brian Kernighan’s PIC
statement in a well-defined little language. While language for making line drawings. Its compiler is
the languages are quite different, each is appropriate implemented on the UNIX system, which is particu-
for its problem domain: although a FORTRAN pro- larly supportive (and exploitative) of language pro-
grammer might complain that 999999.99999 cessing; the sidebar on pages 714-715 shows how
is too long when F 12.5 could do the job, the little languages can be implemented in a more prim-
COBOLer can’t even express in FORTRAN such itive computing environment (BASIC on a personal
common financial patterns as $ , $ $ $ ) $ $9.99. computer).
FORTRAN is aimed at scientific computing, COBOL The next section introduces PIC and the following
is designed for business. section compares it to alternative systems. Subse-
In the good old days, real programmers would quent sections discuss little languages that compile
swagger to a key punch and, standing the whole into PIC and little languages used to build PIC.
time, crank out nine cards like:
The PIC Language
//SUMMARY JOB REGION=(lOOK,50K) If you’re talking about compilers, you might want to
// EXEC PGM=SUMMAR depict their behavior with a picture:
//SYSIN DD DSNAME=REP.8601,DISP=OLD,
// UNIT=2314,SPACE=(TRK,(1,1,1)),
// VOLUME=SER=577632 Compiler
//.SYSOUT DD DSNAME=SUM.86Dl,DISP=(,KEEP),
// UNIT=2314,SPACE=(TRK,(l,l,l~~, FIGURE 1. A Simple View of a Compiler
// VOLUME=SER=577632
//SYSABEND DD SYSOUT=A (This diagram is genuine PIC output; we’ll see its
input description shortly.) Or you may desire a little
Today’s young whippersnappers do this simple job more detail about the internal structure:
by typing r---------------~
t Front End BackEnd I
summarize jan.summary
Modern successors to the old “job control” languages 1 I
are not only more convenient to use, they are funda- L---------------J
mentally more powerful than their predecessors. In Compiler
0 1966 ACM OOOI-0782/86/0800-0711 7W UNIX is a trademark of AT&T Bell Laboratories.
August 1986 Volume 29 Number 8 Communications of the ACM
Programming Pearls
This diagram also describes the two tasks that a pro- the string. These devices were used to draw Figure
gram for drawing pictures must perform: a back end 2, which gives a yet more detailed view of a com-
draws the picture while a front end interprets user piler.
commands to decide what picture to draw.
And just how does a user describe a picture?
There are (broadly) three ways to do the job. An
interactive program allows the user to watch the
program as it is drawn, and a subroutine library
adds picture primitives to the constructs in a pro- Analysis
gramming language. We’ll return to these ap- 4
Syntax
proaches in the next section. Analysis
The third approach to describing pictures is the i
topic of this column: a little language. In Kernighan’s Semantic Error
Analysis Handler
PIC language, for instance, Figure 1 is described as i
Code
ellipse “Source” “Code” Generation
arrow 1
Code
box “Compiler” Ontimization
arrow
ellipse “Object” “Code”
The first input line draws an ellipse of default size
and stacks the two strings at its center. The second FIGURE 2. A Detailed View of a Compiler
line draws an arrow in the default direction (moving
right), and the third line draws a box with the text Any particular compiler translates one source lan-
at its center. The implicit motion after each object guage into one object language. How can an organi-
makes it easy to draw the picture and convenient to zation run 5 different languages on 5 different ma-
add new objects to an existing picture. chines? A brute-force approach writes 25 compilers:
Other PIC mechanisms are illustrated in this non-
sense picture (which is a simple version of Figure 2):
5 languages
25 compilers
B2
5 machines u
It was drawn by An intermediate language circumvents much of this
boxht = . 25; boxwid = .25 complexity. For each language there is a front end
down # default direction that translates into the intermediate language, and
Bl : box “Bl” for each machine there is a back end that translates
arrow the intermediate language into the machine’s output
B2: box code.
“B2 ” at B2.w rjust
line right .4 from B2.e
B3: box dashed wid .4 “B3”
line <--> from B3.n to B1.e
The boxht and boxwid variables represent the de-
fault height and width of a box in inches; those 5 machines &
values can also be explicitly set in the definition of a
particular box. Text following the # character is a FIGURE 3. Five Languages for Five Machines
comment (up to the end of the line). Labels such as
B 1, B2, and B3 name objects (LongerName is fine If there are L languages on M machines, the brute-
too); the western point of box B2 is referred to as force approach constructs L x M distinct compilers,
B2. w. A line of the form string at position places a while the intermediate language needs just L front
text string at a given position; r just right-justifies ends and M back ends. (PIC compiles its output into
712 Communications of the ACM August 1986 Volume 29 Number 8
Programming Pearls
a picture-drawing subset of the TROFF typesetting PIG’s programming constructs allow the picture to
language, which in turn produces an intermediate be drawn easily:
language suitable for interpretation on a number of
output devices, from terminal display programs to pi = 3.14159; n = 10; r = .5
laser printers to phototypesetters.) s = 2*pi/n
Figure 3 uses two of PIG’s programming facilities, for i = 1 to n-l do I
variables and for loops: for j = i+l to n do I
line from r*cos(s*i), r*sin(s*i)\
n=5 to r*cos(s*j), r*sin(s*j)
boxht = boxwid = .2
h = .3; w = .35
I: box at w*(n+l)/2,0 (The backslash character \ at the end of a line con-
for i = 1 to n do { tinues it on the next line.)
box with .s at irw, h But handy as such features are, doesn’t parsimony’
line from last box.s to 1.n dictate that variables and for loops properly belong
box with .n at iiw, -h in a full programming language? This concern is ad-
line from last box.n to 1.s dressed by a subroutine library that adds pictures to
1 the primitives supported by a given language. Given
” 1 intermediate language " at 1.w rjust asubroutineline(x1, yl, x2, y~),onecould
" 5 languages " at 2nd box .w rjust easily draw the last picture in Pascal:
” 5 machines " at 3rd box .w rjust
pi := 3.14159; n := 10; r := 0.5;
The picture of the brute-force approach is described S := Z*pi/n;
by a single loop to draw the boxes, followed by two for i := 1 to n-l do
nested loops to make all pairwise interconnections. for j := i+l to n do
The examples in this section should give you an line (r*cos(s*i), r*sin(s*i),
idea of the structure of PIC, but they only hint at its r*cos(s*j), r*sin(s*j) );
complete power. I have not mentioned a number of
PIG’s facilities; such as built-in functions, if state- Unfortunately, to draw this picture
ments, macro processing, file inclusion, and a simple
block structure. Processor
Perspective
In this section we’ll consider several approaches to one must write, compile, and execute a program
picture-drawing programs and compare them to containing subroutine calls like:
Kernighan’s PIC language. Although the particulars
are for pictures, the general lessons apply to design- ellipse(0.3, 0, 0.6, 0.4)
ing user interfaces for many kinds of programs. text(0.3, 0, "Input")
An interactive drawing program allows the user to arrow(0.75, 0, 0.3, 0)
enter a picture with a spatial input device (such as a box(l.2, 0, 0.6, 0.4)
mouse or a drawing pad) and displays the picture as text(l.2, 0, nProcessorn)
it is drawn. Most interactive systems have a menu arrow(1.65, 0, 0.3, 0)
that includes items such as boxes, ellipses, and lines ellipse(2.1, 0, 0.6, 0.4)
of various flavors (vertical, horizontal, dotted, etc.). text(2.1, 0, "Output")
Immediate feedback makes such systems quite com- (And even such simple code may be too hard for
fortable for drawing many simple pictures, but some nonprogrammers who find PIC comfortable,
drawing the following picture on an interactive sys- such as technical typists or software managers.) The
tem would require a steady hand and the patience first two arguments to each routine give the x and y
of Job: coordinates of the center of the object; later argu-
ments give its width and height or a text string.
These routines are rather primitive; more clever
’ Arguments beyond taste suggest that PIG’s for loops may be inappropriate:
their syntax differs from similar loops elsewhere in the UNIX system, and
PIG’s for loom are a few orders of maenitude slower than those in other
languages. P&s may write loops in oyher languages to generate PIC output:
I am a delighted (if morally compromised) user of PIG’s for loops-the quilts
and stereograms in the exercises were easy to generate using that construct.
August 1986 Volume 29 Number 8 Communications of the ACM 713
A Little Language for Surveys Input: An interactive program can administer the
Once a public opinion pollster knows the questions survey from this description and store the results
to ask in a survey, there are a number of data pro- in the database. If an organization uses paper
cessing problems to be faced: questionnaires, the description is used by a
input: Most organizations administer the survey to “pretty-print” program to prepare the master copy
a respondent using a paper questionnaire: the re- and by a data-entry program to describe record
sponses are later keyed into a database. Other or- formats.
ganizations administer questions by computers Validation: From a description like Program 1, a
that record the responses online. program can ensure that all questions are an-
Validation: There are a number of checks for con- swered and that all responses are in a legal range.
sistency and completeness, ranging from global is- We’ll see shortly how another little language can
sues (are all respondents accounted for?) to local be used to check more subtle constraints.
ones (were “Democrat Only” questions adminis- Tabulafion and Output: The description in Program
tered to all and only Democrats?). 1 provides the bulk of the input to the program
Tabulation and Output: Once the questionnaire that produces the final report of a survey. The
database is complete, the responses must be tabu- user also specifies in a simple language the titles
lated and a final report prepared. to appear on the report, which questions should
be cross-tabulated, and headings for the cross-
One approach to these problems is to write a new tabulations.
program from scratch for each task for each survey.
This sidebar sketches how a single little language Just as a FORTRAN description of a program can be
can solve all the problems. compiled and executed on several different kinds of
Program 1 illustrates a little language I once im- computers, one description of a survey can be inter-
plemented in BASIC on a personal computer. Each preted to perform several different tasks.
line that begins with a “Q” describes a question: I have neglected a ton of details that complicate
Question 1, for instance, is stored in column 5 of all survey programs. For instance, even though the
each record, and asks the respondent’s political questions were asked in one order, the user might
party. The next three lines are the three possible want them to appear on the output in a different
responses to the questions; allowing the user to in- order [say, from greatest to least frequency of re-
dent the responses under the question makes the file sponse); we’ll see several other complications
easier to read. shortly. When I first designed the program, I
The single language can serve as input to several sketched half a dozen bells and whistles before I
programs. realized that such was the way of folly: I could
never anticipate all the options a user might desire,
and any program that dealt with all options would
Q1,5 What is your political party? be a rat’s nest of code.
1 Democrat I therefore looked for a general mechanism that
2 Republican could handle the problems, and finally settled on a
3 Other construct I called pseudocolumns. The “real” data was
Q2,6 For whom did you vote in 19841 stored in columns 1 through 250 of the input record.
1 Reagan/Bush As each record is read, the program generates pseu-
2 Mondale/Ferraro docolumns (defined by a little language) that start at
3 Named Other Candidate column 251. Program 3 states that column 5 contains
4 Didn't Vote party information in the order Democrat, Republi-
5 Don't Know
Q3,7 Where is your polling place? can, Other. To print Republicans before Democrats,
1 Named a place one could define column 251 as follows:
2 Did not name a place define 251
Q4,8 In this election, are you I if 5 is 2 X Rep
1 Very interested 2 if 5 is 1 # Dem
2 Somewhat interested 3 otherwise # Other
3 Not interested (As in PIC, the # character introduces a comment.)
The user can now refer to column 251 as any other
PROGRAM 1. A Description of a Survey column:
August 1986 Volume 29 Number 8
no reviews yet
Please Login to review.