Cybersquare

Scripting: Higher-Level Programming for the 21st Century

John K. Ousterhout
Sun Microsystems Laboratories

Increases in computer speed and changes in the
application mix are making scripting languages
more and more important for the applications of the
future. Scripting languages differ from system
programming languages in that they are designed for
“gluing” applications together. They use typeless
approaches to achieve a higher level of programming
and more rapid application development than
system programming languages.

0018-9162/98/$10.00 © 1998 IEEE (Computer, March 1998)

For the past 15 years, a fundamental change has been occurring in the way people write computer programs. The change is a transition from system programming languages such as C or C++ to scripting languages such as Perl or Tcl. Although many people are participating in the change, few realize that the change is occurring and even fewer know why it is happening. This article explains why scripting languages will handle many of the programming tasks in the next century better than system programming languages.

Scripting languages are designed for different tasks than are system programming languages, and this leads to fundamental differences in the languages. System programming languages were designed for building data structures and algorithms from scratch, starting from the most primitive computer elements such as words of memory. In contrast, scripting languages are designed for gluing: They assume the existence of a set of powerful components and are intended primarily for connecting components. System programming languages are strongly typed to help manage complexity, while scripting languages are typeless to simplify connections among components and provide rapid application development.

Scripting languages and system programming languages are complementary, and most major computing platforms since the 1960s have included both kinds of languages. The languages are typically used together in component frameworks, where components are created with system programming languages and glued together with scripting languages. However, several recent trends, such as faster machines, better scripting languages, the increasing importance of graphical user interfaces (GUIs) and component architectures, and the growth of the Internet, have greatly expanded the applicability of scripting languages. These trends will continue over the next decade, with more and more new applications written entirely in scripting languages and system programming languages used primarily for creating components.

SYSTEM PROGRAMMING LANGUAGES

To understand the differences between scripting languages and system programming languages, it is important to understand how system programming languages evolved. System programming languages were introduced as an alternative to assembly languages. In assembly languages, virtually every aspect of the machine is reflected in the program. Each statement represents a single machine instruction and programmers must deal with low-level details such as register allocation and procedure-calling sequences. As a result, it is difficult to write and maintain large programs in assembly languages.

By the late 1950s, higher-level languages such as Lisp, Fortran, and Algol began to appear. In these languages, statements no longer correspond exactly to machine instructions; a compiler translates each statement in the source program into a sequence of binary instructions. Over time a series of system programming languages evolved from Algol, including PL/1, Pascal, C, C++, and Java. System programming languages are less efficient than assembly languages but they allow applications to be developed much more quickly. As a result, system programming languages have almost completely replaced assembly languages for the development of large applications.

Higher-level languages

System programming languages differ from assembly languages in two ways: they are higher level and they are strongly typed. The term “higher level” means that many details are handled automatically, so programmers can write less code to get the same job done. For example:

• Register allocation is handled by the compiler so that programmers need not write code to move information between registers and memory.
• Procedure calling sequences are generated automatically; programmers need not worry about moving arguments to and from the call stack.
• Programmers can use simple keywords such as while and if for control structures; the compiler generates all the detailed instructions to implement the control structures.

On average, each line of code in a system programming language translates to about five machine instructions, compared with one instruction per line in an assembly program. (In an informal analysis of eight C files written by five different people, I found that the ratio ranged from three to seven instructions per line;¹ in a study of numerous languages, Capers Jones found that, for a given task, assembly languages require three to six times as many lines of code as system programming languages.²) Programmers can write roughly the same number of lines of code per year regardless of language,³ so system programming languages allow applications to be written much more quickly than assembly languages.

Typing

The second difference between assembly languages and system programming languages is typing. I use the term typing to refer to the degree to which the meaning of information is specified in advance of its use. In a strongly typed language, the programmer declares how each piece of information will be used, and the language prevents the information from being used in any other way. In a weakly typed language, there are no a priori restrictions on how information can be used; the meaning of information is determined solely by the way it is used, not by any initial promises.

Modern computers are fundamentally typeless. Any word in memory can hold any kind of value, such as an integer, a floating-point number, a pointer, or an instruction. The meaning of a value is determined by how it is used. If the program counter points at a word of memory then it is treated as an instruction; if a word is referenced by an integer add instruction, then it is treated as an integer; and so on. The same word can be used in different ways at different times.

In contrast, today’s system programming languages are strongly typed. For example:

• Each variable in a system programming language must be declared with a particular type such as integer or pointer to string, and it must be used in ways that are appropriate for the type.
• Data and code are segregated; it is difficult if not impossible to create new code on the fly.
• Variables can be collected into structures or objects with well-defined substructure and procedures or methods to manipulate them. An object of one type cannot be used where an object of a different type is expected.

Typing has several advantages. First, it makes large programs more manageable by clarifying how things are used and differentiating among things that must be treated differently. Second, compilers use type information to detect certain kinds of errors, such as an attempt to use a floating-point value as a pointer. Third, typing improves performance by allowing compilers to generate specialized code. For example, if a compiler knows that a variable always holds an integer value, then it can generate integer instructions to manipulate the variable; if the compiler does not know the type of a variable, then it must generate additional instructions to check the variable’s type at runtime.

Figure 1 compares a variety of languages on the basis of their level of programming and strength of typing.

SCRIPTING LANGUAGES

Scripting languages such as Perl,⁴ Python,⁵ Rexx,⁶ Tcl,⁷ Visual Basic, and the Unix shells represent a very different style of programming than do system programming languages. Scripting languages assume that a collection of useful components already exist in other languages. They are intended not for writing applications from scratch but rather for combining components. For example, Tcl and Visual Basic can be used to arrange collections of user interface controls on the screen, and Unix shell scripts are used to assemble filter programs into pipelines. Scripting languages are often used to extend the features of components; however, they are rarely used for complex algorithms and
data structures, which are usually provided by the components. Scripting languages are sometimes referred to as glue languages or system integration languages.

[Figure 1: plot of instructions per statement (vertical axis, 1 to 1,000) against degree of typing (horizontal axis, none to strong). Languages plotted: Assembly, C, C++, Java, Tcl/Perl, and Visual Basic; regions labeled “System programming” and “Scripting.”]

Figure 1. A comparison of various programming languages based on their level (higher-level languages execute more machine instructions for each language statement) and their degree of typing. System programming languages such as C tend to be strongly typed and medium level (five to 10 instructions per statement). Scripting languages such as Tcl tend to be weakly typed and very high level (100 to 1,000 instructions per statement).

Scripting languages are generally typeless

To simplify the task of connecting components, scripting languages tend to be typeless. All things look and behave the same so that they are interchangeable. For example, in Tcl or Visual Basic a variable can hold a string one moment and an integer the next. Code and data are often interchangeable, so that a program can write another program and then execute it on the fly. Scripting languages are often string-oriented, as this provides a uniform representation for many different things.

A typeless language makes it much easier to hook together components. There are no a priori restrictions on how things can be used, and all components and values are represented in a uniform fashion. Thus any component or value can be used in any situation; components designed for one purpose can be used for totally different purposes never foreseen by the designer. For example, in the Unix shells all filter programs read a stream of bytes from an input and write a stream of bytes to an output. Any two programs can be connected by attaching the output of one program to the input of the other. The following shell command stacks three filters together to count the number of lines in the selection that contain the word “scripting”:

    select | grep scripting | wc

The select program reads the text that is currently selected on the display and prints the text on its output; the grep program reads its input and prints on its output the lines containing “scripting”; the wc program counts the number of lines on its input. Each of these programs can be used in numerous other situations to perform different tasks.

The strongly typed nature of system programming languages discourages reuse. It encourages programmers to create a variety of incompatible interfaces, each of which requires objects of specific types. The compiler prevents any other types of objects from being used with the interface, even if that would be useful. So to use a new object with an existing interface, the programmer must write conversion code to translate between the type of the object and the type expected by the interface. This in turn requires recompiling part or all of the application; many applications today are distributed in binary form so this is not possible.

To appreciate the advantages of a typeless language, consider the following Tcl command:

    button .b -text Hello! -font {Times 16} -command {puts hello}

This command creates a new button control that displays a text string in a 16-point Times font and prints a short message when the user clicks on the control. The command mixes six different types of things in a single statement: a command name (button), a button control (.b), property names (-text, -font, and -command), simple strings (Hello! and hello), a font name (Times 16) that includes a typeface name (Times) and a size in points (16), and a Tcl script (puts hello). Tcl represents all of these things uniformly with strings. In this example, the properties can be specified in any order and unspecified properties are given default values; more than 20 properties were left unspecified in the example.

The same example requires seven lines of code in two methods when implemented in Java. With C++ and Microsoft Foundation Classes (MFC), it requires about 25 lines of code in three procedures.¹ Just setting the font requires several lines of code in MFC:

    CFont *fontPtr = new CFont();
    fontPtr->CreateFont(16, 0, 0, 0, 700,
        0, 0, 0, ANSI_CHARSET,
        OUT_DEFAULT_PRECIS,
        CLIP_DEFAULT_PRECIS,
        DEFAULT_QUALITY,
        DEFAULT_PITCH|FF_DONTCARE,
        "Times New Roman");
    buttonPtr->SetFont(fontPtr);

Much of this code is a consequence of the strong typing. To set the font of a button, its SetFont method must be invoked, but this method must be passed a pointer to a CFont object. This in turn requires a new object to be declared and initialized. To initialize the CFont object its CreateFont method must be
invoked, but CreateFont has a rigid interface that requires 14 different arguments to be specified. In Tcl, the essential characteristics of the font (typeface Times, size 16 points) can be used immediately with no declarations or conversions. Furthermore, Tcl allows the button’s behavior to be included directly in the command that creates the button, while C++ and Java require it to be placed in a separately declared method.

(In practice, a trivial example like this would probably be handled with a graphical development environment that hides the complexity of the underlying language. The user enters property values in a form and the development environment outputs the code. However, in more complex situations, such as conditional assignment of property values or interfaces generated programmatically, the developer must write code in the underlying language.)

It might seem that the typeless nature of scripting languages could allow errors to go undetected, but in practice scripting languages are just as safe as system programming languages. For example, an error will occur if the font size specified for the button example above is a noninteger string such as xyz. Scripting languages do their error checking at the last possible moment, when a value is used. Strong typing allows errors to be detected at compile time, so the cost of runtime checks is avoided. However, the price to be paid for efficiency is restrictions on how information can be used; this results in more code and less flexible programs.

Scripting languages are interpreted

Another key difference between scripting languages and system programming languages is that scripting languages are usually interpreted, whereas system programming languages are usually compiled. Interpreted languages provide rapid turnaround during development by eliminating compile times. Interpreters also make applications more flexible by allowing users to program the applications at runtime. For example, many synthesis and analysis tools for integrated circuits include a Tcl interpreter; users of the programs write Tcl scripts to specify their designs and control the operation of the tools. Interpreters also allow powerful effects to be achieved by generating code on the fly. For example, a Tcl-based Web browser can parse a Web page by translating the HTML for the page into a Tcl script using a few regular expression substitutions. It then executes the Tcl script to render the page on the screen.

Scripting languages are less efficient than system programming languages, in part because they use interpreters instead of compilers but also because their basic components are chosen for power and ease of use rather than an efficient mapping onto the underlying hardware. For example, scripting languages often use variable-length strings in situations where a system programming language would use a binary value that fits in a single machine word, and they use hash tables where system programming languages use indexed arrays.

Fortunately, the performance of a scripting language is not usually a major issue. Applications for scripting languages are generally smaller than applications for system programming languages, and the performance of a scripting application tends to be dominated by the performance of the components, which are typically implemented in a system programming language.

Scripting languages are higher level than system programming languages in the sense that a single statement does more work on average. A typical statement in a scripting language executes hundreds or thousands of machine instructions, whereas a typical statement in a system programming language executes about five machine instructions (as Figure 1 illustrates). Part of this difference is because scripting languages use interpreters, but much of the difference is because the primitive operations in scripting languages have greater functionality. For example, in Perl it is about as easy to invoke a regular expression substitution as it is to invoke an integer addition. In Tcl, a variable can have traces associated with it so that setting the variable causes side effects; for example, a trace might be used to keep the variable’s value updated continuously on the screen.

Scripting languages allow rapid development of gluing-oriented applications. Table 1 provides anecdotal support for this claim. It describes several applications that were implemented in a system programming language and then reimplemented in a scripting language or vice versa. In every case, the scripting version required less code and development time than the system programming version; the difference varied from a factor of two to a factor of 60. Scripting languages provided less benefit when they were used for the first implementation; this suggests that any reimplementation benefits substantially from the experiences of the first implementation and that the true difference between scripting and system programming is more like a factor of five to 10, rather than the extreme points of the table. The benefits of scripting also depend on the application. In the last example in Table 1, the GUI part of the application is gluing-oriented but the simulator part is not; this might explain why the application benefited less from scripting than other applications.

The information in the table was provided by various Tcl developers in response to an article posted on the comp.lang.tcl newsgroup.¹

DIFFERENT TOOLS FOR DIFFERENT TASKS

A scripting language is not a replacement for a system programming language or vice versa. Each is