The Implementation of Etude, An Integrated and Interactive Document Production System

Michael Hammer, Richard Ilson, Tim Anderson, Edward Gilbert,
Michael Good, Bahram Niamir, Larry Rosenstein, and Sandor Schoichet

Laboratory for Computer Science
Massachusetts Institute of Technology
Cambridge, Massachusetts, USA

Originally published in Proceedings of the SIGPLAN SIGOA Symposium on Text Manipulation (Portland, Oregon, June 8-10, 1981), SIGPLAN Notices, 16 (6), June 1981, pp. 137-146. Included here with permission. Copyright © 1981 by ACM, Inc.

1. Introduction

Etude is an experimental text processing system that is being developed in order to formulate and evaluate new approaches to the design of user interfaces for office automation tools. The primary design goal for Etude is to provide the user with substantial functionality in the editing and formatting of documents in the context of a system that is easy to learn and use.

Office workers can now have access to the technology that will allow them to produce documents of typeset quality, due to the dramatic increase in quality and decrease in cost of ever more powerful and flexible output devices, such as laser and electrostatic printers. However, the control of such equipment as provided by conventional formatting systems introduces substantial complexity into the document production process and effectively requires that the operator of the system be an amateur typographer. The commands the operator must employ to specify the appearance of a document are low-level and very detailed. Moreover, the operator must engage in a lengthy process consisting of initial specifications, printing on the output device, revision of specifications, additional printing, and so on, until the output has the desired appearance.

Etude seeks to address these problems by providing an environment in which the document is displayed to the operator as he is working on it, and in which the appearance of this document is specified in high-level terms based on the natural structure of the document. Etude is designed to operate on a high-resolution screen which can present to the operator a representation of a document as it would be produced on a typeset quality output device. Formatting information is provided by specifying the type of the document (letter, report, memo, or the like), and by identifying the components of the document that are associated with its particular type (such as return address, salutation, body, and closing for a letter). Etude has drawn from the Scribe system in this regard [8, 9]. The Etude system has the responsibility of translating these descriptors into detailed formatting instructions for the screen and the printing device.

Etude also seeks to support and enhance the entire process by which an operator interacts with a system to produce a document, encompassing both the editing and formatting functions. The design of the user interface plays the dominant role in achieving this end. Etude seeks to lower the anxiety factor typically associated with word processing and other computer-based document production systems by providing a command structure that is accessible and comprehensible to a novice user, together with a variety of user aids. These include on-line menu and help facilities, as well as the ability to “undo” any completed activity; the latter makes the entire document production process less susceptible to user error. Moreover, Etude seeks to avoid the conflict between a system that is easy for a novice to learn and one that is convenient for an experienced user. This is accomplished by providing the user with a diversity of interaction modes, ranging from succinct command codes through specialized function keys to detailed menus and prompts. The user has the choice of which mode to employ, and can readily shift from one to another. All share a consistent underlying framework.

The Etude user interface is summarized in Section 2 below; further details may be found in Hammer et al. [1] and the complete specifications [2]. While Etude does resemble a number of other systems in some of its individual constructs (notably Scribe and Bravo [4]), it also provides a number of innovative features, and is unique in its integration of a wide range of facilities and in its approach to the total document production process. A prototype implementation of the Etude system has been completed, and a second version is now being developed. In attempting to implement the range of Etude facilities with a modicum of efficiency (although performance was not our principal design goal for the prototype), a number of major issues relating to the construction of text processing systems were confronted. The purpose of this paper is to identify these issues and describe the approaches taken to them. Section 3 summarizes the overall architecture of the Etude prototype implementation, and Section 4 describes the way in which Etude represents documents: this representation is fundamental to all that follows. Sections 5 and 6 discuss the ways in which this representation is used to support editing and formatting, and document display on the screen. Section 7 summarizes the approach taken to handling the user interface, especially the facilities that support the user. Section 8 describes how the Etude system interacts with its terminal device.

2. Overview of Etude

Etude is designed to operate on a hardware base that provides a high-resolution bit-map display device. (Etude may also be used with conventional CRT screens, but only a page-size bit-map display will take advantage of all of Etude’s capabilities.) The Etude operator sees the display divided into a number of windows. The center portion of the screen is the text window, in which a full page of formatted text is displayed. This page may include text in a variety of type styles and sizes, may be right-justified, may exhibit proportional spacing, and may be organized in a variety of page layout formats, as determined by the document’s component specifications. The display is intended to represent the appearance of the document as it would be printed by a typeset quality output device; it is constantly maintained to reflect the current status of the document as it is changed in the course of the editing process. The left-margin of the page serves as the format window, in which the components of the document’s structure are indicated. (Etude has several user-selectable pagination and display modes, which determine whether page-breaks are recomputed dynamically or by explicit request, whether components are displayed or suppressed in the format window, and so on.)

Part of the screen is reserved for interaction and status windows; the former displays user commands as they are entered and system responses, as well as error information; the latter shows a variety of contextual information. Help information and menus are displayed on request in special windows that “pop up” on the screen. These windows are placed so as not to obscure the area of current interest to the operator, which in general is defined to be the area around the position of the cursor.

Etude commands are structured like English imperatives; they consist of a verb phrase and up to two noun phrases. A noun phrase consists of a series of modifiers and an object. Etude provides a number of basic objects (character, word, sentence, line, paragraph, column, page, and document) that are widely used in many different contexts, as well as a set of document-specific components. There are four basic modifiers (next, previous, start-of, and end-of); any positive integer may also serve as a modifier. Modifiers and objects may be combined as in English; e.g. start-of paragraph, end-of next 3 sentence(s), previous 10 line(s).
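
As an illustration only, the noun-phrase grammar described above can be sketched in a few lines of Python. This is not Etude's parser (the prototype was written in CLU). The names NounPhrase and parse_noun_phrase are hypothetical, and the sketch assumes that a phrase is typed in full as whitespace-separated tokens and that a plain trailing "s" marks an optional plural. Document-specific components are left out.

```python
from dataclasses import dataclass

# Basic objects and modifiers as listed in the text.
OBJECTS = {"character", "word", "sentence", "line",
           "paragraph", "column", "page", "document"}
MODIFIERS = {"next", "previous", "start-of", "end-of"}


@dataclass
class NounPhrase:
    modifiers: list   # e.g. ["end-of", "next", 3]
    obj: str          # one of OBJECTS


def parse_noun_phrase(text: str) -> NounPhrase:
    """Parse a phrase such as 'end-of next 3 sentences'."""
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("empty noun phrase")
    *mods, obj = tokens          # the object is always the last token
    # Assumption of this sketch: accept an optional plural ("10 lines").
    if obj not in OBJECTS and obj.endswith("s") and obj[:-1] in OBJECTS:
        obj = obj[:-1]
    if obj not in OBJECTS:
        raise ValueError(f"unknown object: {obj!r}")
    parsed = []
    for tok in mods:
        if tok in MODIFIERS:
            parsed.append(tok)
        elif tok.isdecimal() and int(tok) > 0:
            parsed.append(int(tok))   # any positive integer is a modifier
        else:
            raise ValueError(f"unknown modifier: {tok!r}")
    return NounPhrase(parsed, obj)
```

For example, parse_noun_phrase("previous 10 lines") yields the modifiers ["previous", 10] and the object "line". A real command parser would also have to handle abbreviations, function keys, and menus, which this sketch ignores.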

Etude verbs may be divided into five categories as determined by their function:

User aids (undo, help, menu, go ahead, cancel, again)
Cursor movement (go-to, arrow keys)
Region definition (begin, end)
Editing (erase, copy, move, label, back-space, back-word)
Formatting (make, remove, change, split, combine)

The editing verbs are similar to those encountered in conventional text-editing systems. The editing verbs (and some others) are applied to a region of text, which may be identified by a noun phrase or by means of explicit region definition (see below). (The prototype system implements only a basic set of editing verbs; a more complete set will be provided in the next version.)

The formatting verbs are used to manipulate the component structure of a document. Make is used to associate a component with already existing text while change and remove alter or remove a component associated with some text. Split and combine provide additional ways to manipulate the component structure.

To move the cursor, the operator may use either the cursor control keys, or the go-to command followed by a noun phrase. If the noun phrase starts with a non-numeric modifier, the go-to command may be omitted.

Region definition is not done as an independent operation in Etude, but rather in the context of a command such as move, erase, or label. A region is a sequence of text, and may be defined either by using a single noun phrase, or by using begin and end to bracket a series of explicit cursor movements.

The user aids are critical elements in the plan to make Etude truly easy-to-use. As implemented, undo will reverse the effects of the last operation performed. All operations except labeling, anchoring, and region definition are undoable in the prototype version. Although the prototype only allows the most recent operation to be undone, the system architecture supports a completely general reversal facility, and a richer user interface would allow an arbitrary number of operations to be undone. Help is available at any time in the session, and provides information about what the user has been doing and what his options are at the current point. Cancel terminates the specification of an operation, while again repeats the last command. (This latter facility is generally available only for the cursor movement and editing commands.)
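
One common way to obtain a general reversal facility of the kind mentioned above is to record, with each completed operation, the action that reverses it. The Python sketch below illustrates that idea under stated assumptions: it is not Etude's mechanism, the document is simplified to a list of characters, and SessionState, insert_text, and erase_text are hypothetical names. The sketch keeps the full history, whereas the prototype exposes only the most recent operation to undo.

```python
class SessionState:
    """Records each completed operation together with how to reverse it."""

    def __init__(self):
        self._history = []   # list of (description, undo_callable)

    def record(self, description, undo):
        self._history.append((description, undo))

    def undo_last(self):
        """Reverse the most recent operation and return its description."""
        if not self._history:
            return None
        description, undo = self._history.pop()
        undo()
        return description


def insert_text(doc, pos, text, session):
    """Insert text into doc (a list of characters) and record the inverse."""
    doc[pos:pos] = list(text)

    def undo():
        del doc[pos:pos + len(text)]

    session.record(f"insert {text!r}", undo)


def erase_text(doc, start, end, session):
    """Erase doc[start:end] and record the inverse."""
    removed = doc[start:end]
    del doc[start:end]

    def undo():
        doc[start:start] = removed

    session.record(f"erase {len(removed)} characters", undo)
```

Calling undo_last repeatedly walks back through the session one operation at a time; the recorded descriptions could also feed a help facility that reports what the user has been doing.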

We plan to design a keyboard specifically for the Etude system. The prototype uses a conventional ASCII keyboard: the Etude verbs and objects can be directly keyed via control and escape sequences. Any verb or object name can also be explicitly typed in full, or selected from a menu displayed to the operator.

Go ahead is used for the confirmation of operations such as erase; the area affected by the command is highlighted, and execution occurs only when the user hits go ahead. Moreover, when typing in a name during an Etude command (usually that of a component), typing go ahead instructs the system to automatically “complete” the name, from what has been typed. If the string typed in so far is ambiguous (i.e., a prefix of more than one name), Etude informs the user of the ambiguity. The menu command may then be used to display available options (in the form of legitimate names that have the typed string as prefix). Whenever menu is pressed, whether in this situation or in others where a menu can be used, the available options are displayed in a “pop-up” window. The cursor control keys can then be used to select from the menu–the currently selected item is highlighted on the screen, and each cursor control key moves the selection over by one item in the menu. Pressing go ahead finalizes selection from the menu. Alternatively, the user may decide to type in the name of an item listed in the menu and then press go ahead.
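
The name completion behavior described above amounts to prefix matching against the set of legitimate names. The following Python sketch shows one possible reading of it; the function name complete and the status strings are assumptions made for illustration, not part of Etude.

```python
def complete(prefix, names):
    """Classify a typed prefix against the legitimate names.

    Returns (status, matches), where status is 'complete' when exactly one
    name has the prefix, 'ambiguous' when several do (the matches are what
    a menu would list), and 'unknown' when none do.
    """
    matches = sorted(n for n in names if n.startswith(prefix))
    if len(matches) == 1:
        return "complete", matches
    if matches:
        return "ambiguous", matches
    return "unknown", matches
```

With the names "page", "paragraph", and "body", the prefix "b" completes to "body", while "pa" is ambiguous and would lead to a menu offering "page" and "paragraph".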

3. System Architecture

An initial implementation of Etude, which focused on exploring some of the more challenging implementation issues, was constructed in the CLU programming language [6]. This system currently runs on a DECSYSTEM-20 and drives a Nu terminal [12], which has a bit-map display. The Nu terminal runs a virtual terminal interface (VTI) program, which provides operations to display multiple type faces and manipulate rectangular screen areas. In this section, we present a brief description of the software architecture of the entire Etude system.

The implementation of the Etude system is divided into two parts: the user interface and the editor / formatter / display. The user interface is responsible for parsing the keystrokes entered by the user, interpreting them, and then invoking the appropriate internal operations to realize the desired function. Most of the time a function of the editor, which is responsible for making changes to the document, is invoked. If the user’s command does not involve changing the document, the user interface handles the function directly; this is true for the help, menu, and cancel functions. After the appropriate internal operations have been performed, the user interface updates the session state, which is a record of what actions have been performed, and is mainly used for the purpose of implementing help and undo. The formatter, which reformats the regions in the document that have been changed by the just completed command (and that appear on the screen), is then called. Finally, the display system is invoked, which updates the screen image to reflect the current state of the document.

For example, if the user types a text character, such as “i,” when he is not in the midst of issuing a command, the character is to be inserted into the document. The user interface will instruct the editor to insert the character “i” into the document at the position marked by the cursor and will update the session state to indicate that the character was so inserted. The user interface invokes the formatter, which reformats at least the line into which the “i” was inserted, and possibly more (if the line “overflowed” as a result of the insertion). The user interface then asks the display system to update the screen. The display system redisplays at least the line containing the new “i”; again, more lines may be redisplayed if the insertion of the “i” caused changes to other lines in the document. After all these operations are completed, the user interface waits for more keystrokes from the user.

All changes to the document follow this same basic pattern. A more complicated command, such as erase 3 lines, requires additional work from the user interface to parse the command, to invoke more general operations of the editor, and to record the operation in the session state. The operations of the formatter and display system remain essentially the same, although larger regions of text may need to be reformatted and redisplayed.
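
The cycle followed by every change (editor, session state, formatter, display) can be sketched as a single function. The Python below is a deliberately small illustration, not the Etude code: the state dictionary and the name handle_text_character are hypothetical, the document is a plain string, and the standard textwrap module stands in for the formatter by rewrapping the whole text rather than reformatting incrementally.

```python
import textwrap


def handle_text_character(ch, state, trace):
    """One pass through the cycle for a plain text character."""
    # 1. Editor: insert the character at the cursor position.
    pos = state["cursor"]
    state["text"] = state["text"][:pos] + ch + state["text"][pos:]
    state["cursor"] = pos + 1
    trace.append("editor")

    # 2. Session state: record what was done (used by help and undo).
    state["session"].append(("insert", pos, ch))
    trace.append("session")

    # 3. Formatter: rebuild the lines (non-incremental stand-in).
    state["lines"] = textwrap.wrap(state["text"], width=state["width"])
    trace.append("formatter")

    # 4. Display: bring the screen image up to date.
    state["screen"] = list(state["lines"])
    trace.append("display")
```

A more complicated command would differ mainly in steps 1 and 2; steps 3 and 4 keep the same shape, which matches the observation that the formatter and display operations remain essentially the same.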

The user interface handles a help request by examining the session state and constructing a temporary document containing the text of the help information. It allocates an area on the screen for the text of the help information, then calls the formatter and display system on this temporary document, which results in the appearance of the help information on the screen.

4. The Etude Document Representation

In order to provide “real-time” and high-level formatting, the Etude system makes use of a rich representation of the document being processed.

Etude deals with three aspects of a document:

the content, the sequence of characters that form its text;
the internal structure, the organization and classification of the ideas and information contained in the document;
the outward appearance, the arrangement of text that determines the way the document appears, either when printed or displayed.

Each of these aspects is represented in Etude’s document representation.

The content of a document is represented by a simple linear sequence of characters, stored in a list structure called the text chain.

Both the internal structure and outward appearance of a document are modeled by hierarchies. These are distinct hierarchies that cannot be directly related. For example, the body of a letter (part of its internal structure) might be completely contained within a single page (outward appearance); or it might occupy parts of two pages; or it might extend over several pages, completely containing one or more of these pages.

The representations of both the internal structure and outward appearance of a document, described in the following two sections, are woven over the content of the document. In order to maintain the relationship between these structures and the content of the document, markers may be inserted into the text chain. These markers serve to relate the internal structure and outward appearance of the document to each other by delimiting the segments of the text chain associated with components of each structure.

4.1. Representation of Internal Structure

4.1.1. Subdocuments

The total content of a document is not actually stored as a single text chain, but is broken up into a number of disjoint pieces called subdocuments. Subdocuments are document components that have no sequential or containment relationships with one another (although, as will be discussed later, they may have spatial relationships). The content of each subdocument is stored as a single text chain. The major subdocument is typically the main text; in some documents (such as a simple business letter), it is the only subdocument. More complex documents might have several subdocuments: running heads and feet, footnotes, and figure captions would all be represented as separate subdocuments–as would the two versions of the main text in a dual-language book.

4.1.2. Components

The internal structure of each subdocument is modeled by a hierarchical tree structure. The objects in the tree structure are called components. A component has the following information associated with it:

Each component has a single owner component and an array of owned components (children), used to implement the hierarchical tree structure.
Each component is of a specific class, which identifies the kind of internal structure component that it represents. Typical component classes are “return address,” “address,” “body,” and “paragraph.”
Each component identifies a region of its subdocument’s text chain. This relationship is represented by the presence of a begin component marker and an end component marker in the text chain, which are pointed to from the component. The characters contained in any component are thus accessible. The component structure is also accessible from the text chain because each component marker (begin or end) has a pointer to its associated component.
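
The relationship between components and the text chain can be illustrated with a small Python sketch. It is an assumption-laden model rather than the Etude data structure: the text chain is a Python list holding characters and marker objects, and the names Marker, Component, add_component, and text_of are hypothetical.

```python
class Marker:
    """A begin or end component marker stored in the text chain."""

    def __init__(self, kind, component):
        self.kind = kind              # "begin" or "end"
        self.component = component    # pointer back to the component


class Component:
    def __init__(self, cls, owner=None):
        self.cls = cls                # component class, e.g. "body"
        self.owner = owner            # single owner component
        self.children = []            # owned components
        self.begin = Marker("begin", self)
        self.end = Marker("end", self)


def add_component(chain, cls, text, owner=None):
    """Create a component holding text and splice it into the chain."""
    comp = Component(cls, owner)
    piece = [comp.begin, *text, comp.end]
    if owner is None:
        chain.extend(piece)
    else:
        owner.children.append(comp)
        at = chain.index(owner.end)   # just before the owner's end marker
        chain[at:at] = piece
    return comp


def text_of(chain, comp):
    """Return the characters lying between a component's two markers."""
    start, stop = chain.index(comp.begin), chain.index(comp.end)
    return "".join(x for x in chain[start:stop] if isinstance(x, str))
```

Because every marker points back to its component, a scan of the chain from any character position can recover the enclosing components, and because every component points to its markers, the text of any component can be read directly.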

4.2. Representation of External Structure

The external structure, like the internal structure, is represented as a hierarchy built over the content of the document. Unlike the content or the internal structure, however, the outward appearance is not directly specified by the user. Instead, it is constructed automatically by the Etude system.

4.2.1. Pages

Just as the internal structure of the document is divided up into a number of subdocuments, the external structure is divided up into a sequence of pages. A page in our representation has the conventional meaning: the total arrangement of printed markings on one side of a sheet of paper (or on a display screen).

4.2.2. Boxes

The box is the fundamental unit out of which the outward appearance of a document is built. Our use of boxes is modeled after the TEX system [3], although we have modified and extended the concept somewhat. The outward appearance of each page is represented as a hierarchical structure of boxes in a fashion analogous to the internal structure of subdocuments.

A box is a two-dimensional object with rectangular shape. All boxes have a reference point and three associated measurements: a height above the reference point, a depth below it, and a width.

A character is represented by the simplest kind of box, having only these three basic attributes. Its reference point corresponds to the base line of the character. The base line of a character is an imaginary line at the top of the descender of a character with a descender (for example, “g” or “p”), or the bottom of a character if it has no descender (for example, “a” or “b”).

When boxes are joined together to form larger boxes, they are either joined horizontally or vertically. If they are joined horizontally, they are all aligned on their reference points and the resulting box is called a line. Thus, when constructing a line of characters, the base lines of the characters will be aligned properly without further effort.

If boxes are joined vertically, they are also aligned on their reference points, and the resulting box is called a column. The typical column of text is a box constructed of lines. A line’s reference point is normally at the left edge of the line, because the reference point of a line is the reference point of its first character. Thus, lines that are joined together to form a column are aligned on their left edges.
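
The arithmetic of joining boxes can be made concrete with a short Python sketch. The names Box, hjoin, and vjoin are hypothetical. The text does not say where the reference point of a column lies vertically, so the sketch assumes it coincides with the reference point of the first line; that choice is an assumption of the sketch, not a statement about Etude.

```python
from dataclasses import dataclass


@dataclass
class Box:
    height: float   # extent above the reference point
    depth: float    # extent below the reference point
    width: float


def hjoin(boxes):
    """Join boxes horizontally on their reference points, giving a line."""
    return Box(height=max(b.height for b in boxes),
               depth=max(b.depth for b in boxes),
               width=sum(b.width for b in boxes))


def vjoin(lines):
    """Stack boxes vertically with left edges aligned, giving a column.

    Assumption: the column's reference point is that of its first line,
    so everything below the first baseline counts as depth.
    """
    total = sum(b.height + b.depth for b in lines)
    first = lines[0]
    return Box(height=first.height,
               depth=total - first.height,
               width=max(b.width for b in lines))
```

Joining a capital letter Box(10, 0, 6), an "a" Box(7, 0, 5), and a "g" Box(7, 3, 5) with hjoin gives a line of height 10, depth 3, and width 16: the tallest ascender and the deepest descender determine the line's vertical extent, and the base lines coincide without further effort.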

A fourth kind of box, glue, is inserted into a line or column box when extra space is needed between its component boxes. There are a variety of different classes of glue. For example, the two classes of glue normally found in a line are inter-word glue and inter-sentence glue, and the glue normally found in a column is inter-line glue.

Glue has three more attributes in addition to its class: a natural space, a stretch, and a shrink. A glue’s class determines its particular values of natural space, stretch, and shrink. The natural space is the desired or “optimal” size of a particular blank space in a line or column. The stretch and shrink components determine how much the natural space may be expanded or contracted if it is necessary to increase or decrease the size of a blank space in order to, for example, justify a line of text.
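
To show how stretch and shrink can be used to justify a line, the Python sketch below distributes the excess or deficit of space among the glue in proportion to each glue's stretch or shrink. Proportional distribution is the usual boxes-and-glue convention and is assumed here; the text does not state the rule Etude uses. The names Glue and set_glue are hypothetical, and the sketch does not enforce any limit on how far glue may shrink.

```python
from dataclasses import dataclass


@dataclass
class Glue:
    natural: float   # desired size of the blank space
    stretch: float   # how readily it may grow
    shrink: float    # how readily it may contract


def set_glue(glues, box_width, target_width):
    """Return the space given to each glue so the line fills target_width.

    box_width is the total width of the non-glue boxes in the line.
    """
    natural = box_width + sum(g.natural for g in glues)
    excess = target_width - natural
    # Use stretch when the line is too short, shrink when it is too long.
    flex = [g.stretch if excess >= 0 else g.shrink for g in glues]
    total = sum(flex)
    if total == 0:
        return [g.natural for g in glues]   # nothing can flex
    ratio = excess / total
    return [g.natural + ratio * f for g, f in zip(glues, flex)]
```

With two inter-word glues Glue(4, 2, 1) and one inter-sentence glue Glue(6, 4, 1) in a line whose boxes total 80 units, a target width of 96 leaves 2 units to distribute, giving spaces of 4.5, 4.5, and 7.0: the inter-sentence glue absorbs twice as much of the excess because its stretch is twice as large.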

Lines and columns are both constructed boxes, that is, they are made up out of smaller boxes. Like components, constructed boxes have associated with them an owner box and a set of owned boxes which together define the hierarchical tree structure of each page’s outward appearance.

Each line and column contains a region of a particular subdocument’s text chain. This relationship is represented, as it is for components, by begin owap markers and end owap markers inserted into the text chain. (Owaps represent the OutWard APpearance, as components represent the internal structure.) The outward appearance structure is accessible from the text chain because each owap marker has a pointer to its associated line or column.

4.2.3. Layouts and Containers

In order to make up the arrangement of columns that constitutes a page, Etude abandons the “boxes and glue” approach adapted from TEX. Instead, it provides a third type of constructed box, the page box, which allows its component boxes to be located arbitrarily in two-dimensional space.

This approach has been taken because there are problems, such as page makeup, for which horizontal and vertical lists of boxes (lines and columns) are not natural constructs. For example, a page in a complex document might have two or more columns of text, several cut-outs for illustrations, and a running header. The page box allows these diverse components to be correctly and directly positioned on the page. Although a structure with the same appearance could be built out of a complex hierarchy of line and column boxes held together with glue, it would be cumbersome to do so. Such a structure would also be much more difficult to manipulate if it were altered, say by the addition or deletion of an illustration.

Even though page boxes were devised mainly for addressing the problem of page layout, they are just one more type of constructed box, and so can be included anywhere in the hierarchical box structure.

The arrangement of the components of a page box is represented by a layout that locates a set of containers with respect to the upper left corner of the page box. (Typically, this would be the corner of a sheet of paper, or of the screen.) Each container is associated with one of the columns of text that is to appear on the page. A container may define any rectilinear area, and is represented as the union of a set of rectangles. A container may be thought of as providing the size and shape constraints under which the text of its associated column is to be formatted. By setting the width and the horizontal position of each line appropriately as the column is composed, the outline of the column’s text can be made to conform to any shape the container may assume. These component columns may come from any of the subdocuments in a document. For example, one column may come from a “header” subdocument and another from a “bodytext” subdocument. The page box will have a pair of begin and end owap markers in the text chain of each subdocument that has a portion of its text appearing on the page.
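
The way a container constrains the width and horizontal position of each line can be sketched as follows. This Python example makes simplifying assumptions that are not in the text: the rectangles of a container are stacked vertically, coordinates are measured from the upper left corner of the page box with y increasing downward, and a line that straddles two rectangles is confined to the horizontal range common to both. The names Rect and line_slot are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    width: float
    height: float


def line_slot(container, y, line_height):
    """Return (left, width) available to a line occupying the vertical
    band from y to y + line_height, or None if no text fits there.

    container is a list of Rect whose union is the container's area.
    """
    overlapping = [r for r in container
                   if r.top < y + line_height and y < r.top + r.height]
    if not overlapping:
        return None
    left = max(r.left for r in overlapping)
    right = min(r.left + r.width for r in overlapping)
    return (left, right - left) if right > left else None
```

For a container made of Rect(0, 0, 100, 40), Rect(0, 40, 60, 30), and Rect(0, 70, 100, 30), which leaves a cut-out at the right between y = 40 and y = 70, a line at y = 0 receives the full width of 100, while a line at y = 45 is limited to a width of 60. A column builder would call such a function once per line to make the text follow the container's outline.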

In addition to the text containers that have already been implemented, image, line art, and table containers will eventually be made available in the Etude system. Each of these container types will be associated with objects that can have their own unique internal representations, designed to be natural for the manipulations that will be applied to them. All that is required for such a box to be incorporated into the layout of a page box is that it provide the standard small set of box attributes and operations. In this manner, Etude allows a variety of different data structures to be cleanly integrated into a single overall document representation.

5. Editing and Formatting

5.1. Representing Changes to the Document

In order to give the user immediate feedback on the display of any changes he makes to the content or internal structure of his document, Etude performs incremental formatting and incremental redisplay. Incremental formatting is the ability to reformat–i.e., reconstruct the outward appearance of–only those portions of the document that have been changed by an editing operation. Similarly, incremental redisplay is the ability to redisplay only those portions of the document that appear on the screen and that have been changed.

As the user edits, the document maintains indications of the changes that have been made to it; the outward appearance hierarchy is used to keep track of these changes. When a change is made to the document, the lines in the changed section are marked as changed, and the columns containing those lines are also marked as changed. Thus under this scheme, the smallest unit of text that is reformatted and redisplayed is one line. Although this is not ideal–it might be desirable at times to redisplay only a single character–it is adequate for our purposes.

We have only discussed in a vague sense what it means for a section of the document to be “changed.” We have said that such a section, if it appears on the screen, needs to be reformatted and redisplayed. Just marking a section of the document as “changed” is not adequate to fully represent the dynamics of formatting and display. For example, a section of the document that has not been changed may still need to be reformatted. This may happen when a character is deleted from a line: the previous line has not been changed, but it still may need to be reformatted, because deleting a character in a line may allow a word at the beginning of that line to move up to the end of the previous line.

Thus, there are actually two kinds of marking done on the document: unformatted marking and changed marking. Sections of the document that are potentially unformatted as a result of an editing operation are marked unformatted; this is an indication to the text formatter that it must examine that section and reformat it, if necessary. Sections of the document that have actually been altered are marked as changed; this is an indication to the redisplay subsystem that these sections need to be redisplayed on the screen.

If a character is inserted or deleted from the text chain, then the line containing that character is marked as both changed and unformatted. If a component is inserted or deleted from the component hierarchy, then all the lines that have characters contained in the scope of the component are marked as changed and unformatted.

This marking propagates upward through the outward appearance hierarchy. When a line is marked unformatted (changed), the column containing it is also marked unformatted (changed).

The text formatter formats a section of the document by formatting all the lines in that section that are marked unformatted. In doing the formatting, it may unformat and change additional lines of the document, which will then be marked appropriately. When the formatter finishes formatting all the lines in the section, it marks those lines as formatted. The redisplay system, in order to keep the screen up-to-date, would then redisplay all the lines that are marked as changed and then mark them as unchanged.
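
The two kinds of marking and their use by the formatter and the redisplay system can be summarized in a short Python sketch. It is a simplified model, not the Etude code: only lines and columns are represented, the names Line, Column, mark, edit_character, format_column, and redisplay are hypothetical, and the actual line-breaking work is delegated to a caller-supplied reflow function that reports whether a line's contents were altered.

```python
class Column:
    def __init__(self):
        self.lines = []
        self.unformatted = False
        self.changed = False


class Line:
    def __init__(self, column):
        self.column = column
        self.unformatted = False
        self.changed = False
        column.lines.append(self)


def mark(line, *, unformatted=False, changed=False):
    """Mark a line and propagate the marks upward to its column."""
    if unformatted:
        line.unformatted = line.column.unformatted = True
    if changed:
        line.changed = line.column.changed = True


def edit_character(line):
    """A character insertion or deletion marks its line both ways."""
    mark(line, unformatted=True, changed=True)
    # The previous line has not changed, but a word may now fit on it,
    # so the formatter must examine it.
    i = line.column.lines.index(line)
    if i > 0:
        mark(line.column.lines[i - 1], unformatted=True)


def format_column(column, reflow):
    """Examine every unformatted line; reflow(line) returns True if the
    line's contents were altered (reflow may also mark later lines)."""
    for line in column.lines:
        if line.unformatted:
            if reflow(line):
                mark(line, changed=True)
            line.unformatted = False
    column.unformatted = False


def redisplay(column, draw):
    """Redraw only the lines marked changed, then clear the marks."""
    for line in column.lines:
        if line.changed:
            draw(line)
            line.changed = False
    column.changed = False
```

Keeping the two marks separate lets the formatter examine a line without forcing the display to redraw it: in the deletion example above, the previous line is redrawn only if the formatter actually moves a word onto it.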

5.2. Formatter Architecture

The structure of the Etude formatting subsystem is stratified to handle the reconstruction of each layer of the outward appearance hierarchy as independently as possible. Four basic modules implement the formatter:

The displaywright manages the various display and pagination modes, and controls the invocation of the next two modules. (Just as a “shipwright” builds and repairs ships, the “displaywright” builds and repairs the displayed image of the document. Similarly, the pagewright, columnwright and linewright build and repair pages, columns and lines.)
The pagewright is responsible for maintaining and rearranging the layout of individual pages as required by the user’s actions and the current pagination mode. The pagewright works exclusively with containers, that is, with the shape, size, and location of the various text areas on the page.
The columnwright is invoked within the context of a particular container and controls the invocation of the linewright in order to build up a single column of text that conforms to the size and shape constraints imposed by the container. The effect of the pagewright and the columnwright operating together is to “flow” text through the appropriate areas of the page.
The linewright is responsible for composing each individual line of text within the formatting environment defined by the component hierarchy and the line width constraints provided by the columnwright.

5.3. Format Specification

The text formatter of the Etude system composes the outward appearance of a document. Etude’s formatter uses a data base of formats for determining how each component in the internal structure of a document should be formatted. (This is quite similar to the approach taken by the Scribe text formatter.) It derives the formatting information from both the data base, which contains a set of pre-defined formats for each class of component, and the arrangement of components in the internal structure hierarchy.

The data base contains a format specification for each class of component known by Etude. The format specification includes a number of format attributes and a value specification for each attribute. For example, “type face” is an attribute that might have the value specification “italic,” and “right margin” is an attribute that might have the value specification “1.5 inches.”

A component’s format specification need only partially determine the formatting environment for the component. (The formatting environment is a total specification of all the typographic attributes and values for the component.) For example, the “center” component centers the text contained in it. We would want the text to be centered within the margins of the document, whatever they happened to be. The format specification associated with the “center” component would not specify the margins between which the text should be centered; rather, the margins would be derived from the margins defined for the document type. Thus, the desired margins for the centered text would be inherited from previous specifications.
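
The inheritance of format attributes can be illustrated by overlaying partial specifications from the outermost component inward. In the Python sketch below, the FORMATS table, its attribute values, and the function name environment are all hypothetical; only the idea that a "center" component specifies centering but inherits its margins comes from the text.

```python
# Hypothetical data base entries: each class gives a partial specification.
FORMATS = {
    "letter": {"left margin": 1.0, "right margin": 1.5,
               "type face": "roman", "alignment": "justify"},
    "body": {},
    "center": {"alignment": "center"},
    "emphasis": {"type face": "italic"},
}


def environment(path):
    """Compute the full formatting environment for a component.

    path lists the component classes from the outermost component down
    to the component itself; inner specifications override outer ones,
    and unspecified attributes are inherited.
    """
    env = {}
    for cls in path:
        env.update(FORMATS.get(cls, {}))
    return env
```

For the path ["letter", "body", "center"], the resulting environment has the alignment "center" but keeps the letter's margins of 1.0 and 1.5, so the text is centered between whatever margins the document type defines.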

6. Display

6.1. Goals of the Redisplay Mechanism

Like the formatter, the redisplay mechanism operates incrementally: only those parts of the screen that have been changed are redisplayed. Also, parts of the screen that have not been changed but are in the wrong position (e.g., following the deletion of a line) are moved through the use of a screen bit-map operation (referred to as a block move, and implemented by the Virtual Terminal Interface). The use of incremental redisplay not only reduces the amount of information that must be sent to the display, but also reduces the distraction caused by a large part of the display changing frequently.

The redisplay mechanism also supports the screen layouts needed for Etude, which uses a number of system-defined windows on the screen (text, format, interaction, etc.). Etude can also display multi-column documents, and provide “pop-up” help and menu windows.

6.2. Redisplay Approach

The Etude redisplay mechanism is organized around the concept of a column picture, which represents the display image of a column of text. Column pictures are used to represent not only columns of a document, but also system-defined windows. Each column picture contains a pointer to the column of text that it contains and information about each line of the column that is currently on the screen.

Column pictures are organized on the screen using window objects. Each window has a rectangular shape, a position, and a contents. There are two types of windows: basic windows, which contain a column picture, and compound windows, which are composed of zero or more windows. The windows in the system, therefore, are organized into a tree structure, with basic windows as the leaves and a window that corresponds to the total physical screen as the root.

Etude might display only a portion of a column picture. A basic window may be thought of as a rectangular hole in a large piece of paper. This piece of paper is put down on top of each column picture so that the part of interest is visible through the hole. This visible part can then be considered as another image, and can also be placed (along with other such images) in a compound window.

The physical screen is updated by calling a special redisplay procedure. This procedure traces through the window structure until it reaches the basic windows. Then the column picture display routine displays each column picture contained in a basic window. This routine will update its part of the screen to correspond to the current state of the text in the column picture.
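The window tree and its traversal might be sketched as follows in Python. The class names, the (x, y, width, height) rectangle convention, and the screen layout are assumptions made for illustration; the sketch only records which column pictures would be refreshed, in the order the tree is traced.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnPicture:
    """Display image of one column of text (hypothetical stand-in)."""
    name: str

    def display(self, rect, log):
        # A real routine would repaint only the changed lines inside `rect`.
        log.append((self.name, rect))


@dataclass
class BasicWindow:
    rect: tuple               # (x, y, width, height) on the screen
    picture: ColumnPicture


@dataclass
class CompoundWindow:
    rect: tuple
    children: list = field(default_factory=list)   # zero or more windows


def redisplay(window, log):
    """Trace the window tree and refresh the column picture at every leaf."""
    if isinstance(window, BasicWindow):
        window.picture.display(window.rect, log)
    else:
        for child in window.children:
            redisplay(child, log)


# The root corresponds to the whole physical screen.
screen = CompoundWindow((0, 0, 800, 600), [
    BasicWindow((0, 0, 800, 40), ColumnPicture("status")),
    CompoundWindow((0, 40, 800, 480), [
        BasicWindow((0, 40, 400, 480), ColumnPicture("column 1")),
        BasicWindow((400, 40, 400, 480), ColumnPicture("column 2")),
    ]),
    BasicWindow((0, 520, 800, 80), ColumnPicture("interaction")),
])
```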

6.3. Redisplay Implementation

Incremental redisplay, along with block moves, is done on a column picture basis by the column picture display routine. First, the display routine determines what part of the column picture has changed. This is done by examining the changed marks left in the document representation by the editing and formatting operations. The column picture display routine looks at these marks, and resets them when a line is displayed on the screen.

There are two passes in redisplaying a column picture. On the first, the redisplay routine checks to see if a block move can be performed. The routine tries to find the largest number of consecutive lines that have not been changed but are in the wrong position on the screen. These lines are then directly moved into the correct position.

On the second pass, the routine goes through all the lines again, and displays any line that is either changed or in the wrong position. A line is displayed by clearing out the part of the screen it occupies, and displaying it in full. Thus, any change to a line (even the addition of one character to the end) will cause that entire line to be re-displayed.

The information about what lines are on the screen is stored in a table associated with each column picture. This table contains pointers to the line and to the part of the screen it occupies. This information is used on the first pass to determine which lines are out of position. After block moves are done, the table is updated to reflect the results of the moves.
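The two passes can be sketched in Python under some simplifying assumptions: lines are identified by unique labels, the table maps each label to the screen row it occupies, each line is one row high, and at most one block move (the longest eligible run) is performed. The function name and the operation tuples are hypothetical, and clearing of rows below the last line is omitted.

```python
def redisplay_column(desired, table, changed):
    """Return (operations, new_table) for one column picture.

    desired: line labels in the order they should appear (row = index).
    table:   label -> row currently occupied on the screen.
    changed: labels whose contents were changed by editing or formatting.
    """
    # Pass 1: longest run of unchanged lines displaced by one common offset.
    best_len, best_start, best_offset = 0, 0, 0
    i = 0
    while i < len(desired):
        line = desired[i]
        if line in changed or line not in table or table[line] == i:
            i += 1
            continue
        offset = table[line] - i
        j = i
        while (j < len(desired) and desired[j] in table
               and desired[j] not in changed
               and table[desired[j]] - j == offset):
            j += 1
        if j - i > best_len:
            best_len, best_start, best_offset = j - i, i, offset
        i = j

    ops = []
    table = dict(table)
    if best_len:
        dest = range(best_start, best_start + best_len)
        # ("block_move", source row, destination row, number of lines)
        ops.append(("block_move", best_start + best_offset, best_start, best_len))
        # Whatever used to occupy the destination rows has been overwritten.
        table = {ln: row for ln, row in table.items() if row not in dest}
        for row in dest:
            table[desired[row]] = row

    # Pass 2: draw every line that is changed or still out of position.
    for row, line in enumerate(desired):
        if line in changed or table.get(line) != row:
            ops.append(("draw", line, row))
    return ops, {line: row for row, line in enumerate(desired)}


# Line B is deleted from a five-line display and a new line F scrolls in.
ops, new_table = redisplay_column(
    ["A", "C", "D", "E", "F"],
    {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4},
    changed=set(),
)
print(ops)   # [('block_move', 2, 1, 3), ('draw', 'F', 4)]
```

In this example three lines are repositioned by a single block move and only the newly visible line has to be transmitted and drawn.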

7. The User Interface

The Etude user interface is responsible for considerably more than simple command parsing. Given the need to support online help, menus of possibilities, and an undo operation, it has to know what the user is doing, what he has done recently, and what he can do next.

7.1. Document Interface

Although the user interface is constrained to manipulate Etude documents, their representation is too rich for that to happen directly. The interface therefore deals with two abstractions: cursors and regions.

A cursor represents a location in a document. There can be many cursors in a document, but each subdocument has a single main cursor, which the user sees as his location in the subdocument he is editing. The principal operations on cursors are movement, copying, and text insertion and deletion: when the user types next 3 words, the user interface translates this into “move the main cursor over the next three words”; when he types x, the user interface translates this into “insert the character x at the main cursor.”

A region is essentially a pair of cursors. To the user interface, all text objects, whether chapters or words, are seen only as regions: a cursor pointing to the beginning, and a cursor pointing to the end. Thus, erase next 3 words is translated into “acquire a region containing the next three words, and erase it.”

7.2. The Screen

Of the three windows normally displayed on Etude’s screen, the user interface is directly responsible for two: the status window and the interaction window.

The status window displays, in addition to the time of day and system load, the current document type, the name of the current subdocument, and the logical location of the main cursor. The physical location of the cursor in the document is of course displayed directly on the screen, but a single physical location may correspond to many logical locations: the logically distinct positions of “the end of a paragraph” and “the end of the last sentence in a paragraph” are physically indistinguishable. At present, the logical location is shown as a list of all the components containing the cursor, ascending in the hierarchy; e.g.,

paragraph/subsection 2/section 8/article

The interaction window is divided into two sections. The first is used to display error messages and responses to some commands; the second echoes commands as they are typed. All three sections of the user interface’s display are Etude documents, displayed through the normal redisplay system.

7.3. Command Parsing

Etude commands are entered as English imperatives: erase next 3 words, for example. Parsing of a command line is driven by the verb, which has associated with it a description of the number and type of arguments that it takes. Back-word takes no arguments, while move takes two: a region, and a location. The command parser begins by looking the verb up in one of several dictionaries; in Etude, some verb properties depend on the dictionary containing the verb. The lookup returns the verb’s description (if, indeed, the verb is valid), and the parser then interprets each of the arguments in turn.

A location is always parsed by a recursive call on the command parser, with a different set of dictionaries. This provides the user with the full set of cursor-movement commands (arrow keys, go-to, etc.), in addition to help and undo, with which to get to a location, but does not permit the intervening execution of other commands. The recursive call does not terminate until the user enters go ahead as a command, so a sequence of movement commands may be entered. Thus,

move word (to) start-of next chapter down-arrow down-arrow

would move the current word to the third line of the next chapter.
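A rough Python sketch of verb lookup with per-context dictionaries follows. The dictionary contents and the names lookup_verb and parse_location are assumptions made for illustration, and each movement command is reduced to a single token, which is simpler than the real command language. The point shown is that the same lookup routine, given a different set of dictionaries, rejects editing verbs while a location is being specified.

```python
# Each dictionary maps a verb to the types of the arguments it takes.
EDITING_VERBS = {"erase": ("region",), "move": ("region", "location")}
MOVEMENT_VERBS = {"back-word": (), "down-arrow": (), "up-arrow": ()}
UTILITY_VERBS = {"help": (), "undo": ()}

TOP_LEVEL = [EDITING_VERBS, MOVEMENT_VERBS, UTILITY_VERBS]
# While a location is being parsed, editing verbs are not available.
LOCATION_LEVEL = [MOVEMENT_VERBS, UTILITY_VERBS, {"go ahead": ()}]


def lookup_verb(verb, dictionaries):
    """Return the verb's argument description, or None if it is not valid."""
    for dictionary in dictionaries:
        if verb in dictionary:
            return dictionary[verb]
    return None


def parse_location(commands):
    """Accept movement commands until 'go ahead' terminates the location."""
    accepted = []
    for verb in commands:
        if lookup_verb(verb, LOCATION_LEVEL) is None:
            raise ValueError(f"{verb!r} is not allowed while specifying a location")
        if verb == "go ahead":
            return accepted
        accepted.append(verb)
    raise ValueError("location was not terminated by 'go ahead'")


print(lookup_verb("move", TOP_LEVEL))        # ('region', 'location')
print(lookup_verb("move", LOCATION_LEVEL))   # None
```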

A region is parsed in one of two ways, depending on the first character typed in the region definition. The user may define a region simply as two cursors, by typing begin, an arbitrary sequence of movement commands (handled via a recursive call), and end; or he may enter a symbolic region definition, such as next 3 words.

A simple grammar constrains the user to enter reasonable phrases, composed of some modifiers and a single noun: start-of next chapter is accepted, while next previous chapter is not. Similarly, one can specify 3 words, but not 3 documents. The user can edit the definition by using the back-word and back-space keys; beyond that, he can only cancel the command.
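One way such a grammar could be expressed is sketched below in Python. The word lists, the rule that each kind of modifier may appear at most once, and the function name parse_region are assumptions chosen to reproduce the examples in the text; they are not taken from the Etude source.

```python
POSITIONS = {"start-of", "end-of"}
DIRECTIONS = {"next", "previous"}
NOUNS = {"word", "sentence", "paragraph", "chapter", "document"}
UNCOUNTABLE = {"document"}       # nouns that may not take a count


def parse_region(phrase):
    """Parse 'modifiers... noun' into a dictionary, or raise ValueError."""
    tokens = phrase.split()
    if not tokens:
        raise ValueError("empty region definition")
    *modifiers, noun = tokens
    if noun.endswith("s") and noun[:-1] in NOUNS:
        noun = noun[:-1]         # accept a simple plural such as "words"
    if noun not in NOUNS:
        raise ValueError(f"unknown noun {noun!r}")

    region = {"position": None, "direction": None, "count": None, "noun": noun}
    for token in modifiers:
        if token in POSITIONS:
            kind, value = "position", token
        elif token in DIRECTIONS:
            kind, value = "direction", token
        elif token.isdigit():
            kind, value = "count", int(token)
        else:
            raise ValueError(f"unknown modifier {token!r}")
        if region[kind] is not None:
            raise ValueError(f"more than one {kind} modifier")
        region[kind] = value

    if region["count"] is not None and noun in UNCOUNTABLE:
        raise ValueError(f"{noun!r} cannot take a count")
    return region


print(parse_region("start-of next chapter"))
print(parse_region("next 3 words"))
```

With these rules, next previous chapter is rejected because it supplies two direction modifiers, and 3 documents is rejected because the noun cannot be counted.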

One of the possibilities for the noun in a region definition is a component name, such as chapter. A table of these is maintained by the formatter; if the user wishes to enter one, he simply starts to type the name. When he attempts to confirm, if enough of the name has been typed to uniquely specify it, he will succeed; otherwise, the name will be completed as far as possible (or truncated, if the user has gone astray), and he will be informed that the name is still ambiguous.
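The confirmation behavior described here can be sketched as a small Python function. The component names, the status strings, and the function name complete_name are assumptions for illustration only.

```python
import os.path


def complete_name(typed, names):
    """Return (text, status) for an attempt to confirm a component name.

    status is "confirmed" when `typed` identifies exactly one name,
    "ambiguous" when several names still match (text is completed as far
    as possible), and "truncated" when the user has gone astray (text is
    cut back to the longest prefix that matches some name).
    """
    matches = [name for name in names if name.startswith(typed)]
    if typed in matches:
        return typed, "confirmed"
    if len(matches) == 1:
        return matches[0], "confirmed"
    if matches:
        # commonprefix compares the strings character by character.
        return os.path.commonprefix(matches), "ambiguous"
    while typed and not any(name.startswith(typed) for name in names):
        typed = typed[:-1]
    return typed, "truncated"


COMPONENTS = ["chapter", "center", "page", "paragraph", "section"]
print(complete_name("ch", COMPONENTS))    # ('chapter', 'confirmed')
print(complete_name("p", COMPONENTS))     # ('pa', 'ambiguous')
print(complete_name("chx", COMPONENTS))   # ('ch', 'truncated')
```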

The user may type menu at any time during this process. Etude will select the remaining possibilities from the list of component names, and put the cursor into a special window (overwriting part of the document display) in which they are displayed. The arrow keys are redefined to move among the possibilities, so the user can point to his choice. (In later versions, we hope to have some physical pointing device, such as a mouse, which could make this less cumbersome.) He may also continue typing the name, having seen the possibilities; this is in keeping with our philosophy of providing help without getting in the way of the experienced user. The menu display is a multi-column layout, provided by the standard formatting facilities.

The command parser is strongly influenced by the pseudo-English parsers of computer-based games like Zork and Adventure [5], which have found wide acceptance among novice users. The present implementation is lacking in several respects: menus are only available in a few cases, and command-line editing is severely restricted. However, it does provide at least a sample of all the desired facilities, and implements an easy-to-understand and easy-to-remember command language.

7.4. Command Execution

The user interface must keep track of what the user has done and what he is doing in order to implement help and undo. As it processes a command, the parser builds a node tree, where each node represents a command or a command argument. The node for a move command, for example, has two children: one for the region definition, and one for the location moved to. It also contains a cursor pointing to the original location of the region and a copy of the region, so that it can undo the operation. The location node might have many children, one for each cursor movement command entered during the operation. The node tree thus provides a complete description and history of the user’s actions.
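A minimal Python sketch of such a node, and of how the saved information supports undo, is given below. The document is reduced to a list of words and cursors to integer indexes; the Node fields and the functions do_move and undo are hypothetical names, not those of the prototype.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)
    original_position: Optional[int] = None   # where the region used to be
    destination: Optional[int] = None         # where it was moved to
    region_copy: Optional[list] = None        # copy of the moved text


def do_move(words, start, end, destination, movements, history):
    """Move words[start:end]; `destination` indexes the text left behind."""
    region = words[start:end]
    node = Node(
        "move",
        children=[
            Node("region"),
            # One child per cursor movement command used to reach the location.
            Node("location", children=[Node(m) for m in movements]),
        ],
        original_position=start,
        destination=destination,
        region_copy=list(region),
    )
    del words[start:end]
    words[destination:destination] = region
    history.append(node)
    return node


def undo(words, history):
    """Reverse the most recent operation using the data saved in its node."""
    node = history.pop()
    if node.kind != "move":
        raise NotImplementedError(node.kind)
    size = len(node.region_copy)
    del words[node.destination:node.destination + size]
    words[node.original_position:node.original_position] = node.region_copy


text = ["a", "b", "c", "d", "e"]
history = []
do_move(text, 1, 3, 3, ["down-arrow", "down-arrow"], history)
print(text)   # ['a', 'd', 'e', 'b', 'c']
undo(text, history)
print(text)   # ['a', 'b', 'c', 'd', 'e']
```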

Help is also supported by the node structure. Each node type has associated help information expressed in past, present, and future tenses. (Each form is required since help is dependent on the context in which it is invoked.) In the current version, help messages are quite curt. They would correspond to the first level of detail in a general message, but successively more detailed information will be needed to support a query-in-depth facility.

When help is typed, the tree is traced starting at top level. The user is informed of the past few operations he has done and of what he is currently doing. If the current node type is “top level”, then the standard list of available operations is retrieved. Otherwise, the node searches down the most recent branch of the tree to provide information about what has been done within the operation currently being performed, and what remains to be done. This information is assembled into a temporary document and presented to the user.

While history information is useful for undo and help, carrying all this information around in memory is certainly not feasible. A compile-time parameter determines how many operations are to be kept within memory; older operations are then deleted. In the next version, they will be written out to disk, so that they can be retrieved when necessary.
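The effect of bounding the in-memory history can be shown with a bounded queue in Python. The constant MAX_OPERATIONS stands in for the compile-time parameter; its value and the function name record are arbitrary choices for this example.

```python
from collections import deque

MAX_OPERATIONS = 3   # stand-in for the compile-time parameter

# A deque with maxlen silently discards the oldest entry when it is full.
history = deque(maxlen=MAX_OPERATIONS)


def record(operation):
    history.append(operation)


for name in ["insert x", "erase next 3 words", "move word", "back-word"]:
    record(name)

print(list(history))   # ['erase next 3 words', 'move word', 'back-word']
```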

8. The Virtual Terminal Interface

The Etude prototype runs on a DECSYSTEM-20 and has access to the Nu terminal over a low speed (9600 baud) “terminal” line. This necessitates the use of coded communication to initiate tasks that manage the screen. In the prototype, the workstation and its associated driver program are viewed as an advanced terminal with the capability of displaying multiple character sets of different types and faces consisting of variable pitch characters. Because the low speed nature of the communication line prohibited the sharing of responsibility between Etude and the Nu, the Nu was primarily viewed as an output device. However, the Nu in its emulation of a terminal attempted to provide a number of high-level commands that would shield Etude from performing the detailed and tedious operations that would be required if Etude were driving a traditional terminal.

The interface provided by the Nu terminal emulator provides two classes of commands for text applications. In addition to cursor movement and character oriented commands such as changing the mode in which a character is “painted” on the screen, the interface provides for commands geared towards screen management that take advantage of the unique features of the bit-mapped display. The work-horse of this class of commands is the array operation. Every other screen management function is either a special case of the array operation, or can be broken down into a sequence of such operations.

An array operation is an operation on a rectangular array of bits in the display memory. Two arrays (the source and the destination arrays) are specified to the operation. The two arrays are operated upon bit by bit and the result is stored in the destination array. The operation may be one of sixteen possible boolean operations on two arguments. This array operation is similar to the RasterOp function of Newman and Sproull [7] and the BitBlt instruction of the Alto Personal Computer [11]. It has the added feature that the source and destination rectangles need not be of the same size. If the source array is larger than the destination array in a dimension, it will be truncated in that dimension. If the source array is smaller in a dimension, it will be replicated in that dimension. Thus if the source is a half toning pattern such as a 4 X 4 raster, the destination will be filled to that pattern (this is similar to the ADISRegionOp in [10]).
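The array operation can be modeled in Python on a small bit matrix. This sketch makes several assumptions that are not in the text: rectangles are given as (x, y, width, height), the screen is a list of rows of 0/1 values, and the boolean operation is encoded as a 4-bit truth table indexed by the source and destination bits. Replication and truncation both follow from indexing the source modulo its own size while iterating only over the destination.

```python
# Truth-table codes: bit (2 * source + destination) of the code is the result.
CLEAR, INVERT, XOR, COPY, OR = 0b0000, 0b0101, 0b0110, 0b1100, 0b1110


def array_op(screen, src, dst, op):
    """Combine the source rectangle into the destination rectangle in place."""
    sx, sy, sw, sh = src
    dx, dy, dw, dh = dst
    # Snapshot the source first so overlapping rectangles behave predictably.
    source = [[screen[sy + r][sx + c] for c in range(sw)] for r in range(sh)]
    for r in range(dh):
        for c in range(dw):
            s = source[r % sh][c % sw]    # a smaller source is replicated
            d = screen[dy + r][dx + c]
            screen[dy + r][dx + c] = (op >> (2 * s + d)) & 1


# A 4-row by 8-column screen with a 2 x 2 checker pattern in the corner.
screen = [[0] * 8 for _ in range(4)]
screen[0][0] = screen[1][1] = 1

# Fill a 4 x 4 area with the pattern: the source is replicated.
array_op(screen, (0, 0, 2, 2), (4, 0, 4, 4), COPY)
print(screen[0][4:8], screen[1][4:8])   # [1, 0, 1, 0] [0, 1, 0, 1]
```

Under this encoding, clearing an area and reverse video are the same call with the CLEAR and INVERT codes, for which the source bits do not affect the result.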

Most terminal operations can be decomposed into array operations: e.g., clearing the screen or an area of the screen can be achieved by one operation. (If the area is not rectangular, it may be broken down into a sequence of rectangular operations.) Likewise, deleting n lines may be accomplished by two array operations. Writing characters is just another array operation where the source is in the fonts area. “Reverse video” consists of complementing the bits in the destination rectangle. An area on the screen or a group of characters on the screen may be highlighted by specifying the source to be a half-tone pattern.

The array operation makes it unnecessary for Etude to retransmit characters as long as the relative position of characters within the group does not change drastically during the redisplay attempt. Hence redisplay throughput is increased. This is especially significant when the screen contains a large number of characters and the communication line makes retransmission costly. The array operation may also be used as a primitive for other higher level screen management functions; for example, window display and maintenance can be easily translated into sequences of array operations.

9. Summary

We have presented an overview of the Etude document production system and its prototype implementation. The Etude system strives to provide high functionality with low interface complexity by structuring the operator’s command language and supporting the entire user interaction process. In order to provide these capabilities, a new approach to the implementation of a text processing system is needed. The key issue is one of document representation, providing a structure that can both support the rich functionality and provide an acceptable level of performance.

The Etude effort is continuing. A human factors evaluation of the interface has been designed and is about to commence. The system itself is being migrated to run entirely on a “personal” single-user machine, with no mainframe in the background. Etude is being used as the base of a complete integrated office workstation, which will provide such facilities as electronic mail, database management, graphics and image processing, and more. The architecture of the system is being revised and extended, but the implementation concepts presented here will continue to be used.

References

Hammer, Michael et al. Etude: An Integrated and Interactive Document Production System. Proceedings of the 1981 Office Automation Conference, AFIPS, March, 1981.
Ilson, Richard and Michael Good. Etude: An Interactive Editor and Formatter. Memo OAM-029, MIT Lab. for Computer Science, Office Automation Group, March, 1981.
Knuth, Donald E. TEX and METAFONT: New Directions in Typesetting. American Mathematical Society and Digital Press, 1979.
Lampson, Butler W. Bravo Manual. In Alto User’s Handbook, Xerox PARC, 1979.
Lebling, P. David, Marc S. Blank, and Timothy A. Anderson. Zork: A Computerized Fantasy Simulation Game. Computer 12, 4 (April 1979), 51-59.
Liskov, Barbara et al. CLU Reference Manual. Tech. Rep. 225, MIT Lab. for Computer Science, Oct., 1979.
Newman, William M. and R. F. Sproull. Principles of Interactive Computer Graphics. McGraw-Hill, New York, 1979. Second Edition.
Reid, Brian K. A High-Level Approach to Computer Document Formatting. Conference Record of the Seventh Annual ACM Symposium on Principles of Programming Languages, ACM, Jan., 1980, pp. 24-31.
Reid, Brian K. and Janet H. Walker. Scribe Introductory User’s Manual. Third edition, Unilogic, Ltd., 605 Devonshire St., Pittsburgh PA, 15213, 1980.
Sproull, Robert F. Raster Graphics for Interactive Programming Environment. Tech. Rep. CSL-79-6, Xerox PARC, June, 1979.
Thacker, C. P. et al. Alto: A Personal Computer. Tech. Rep. CSL-79-11, Xerox PARC, Aug., 1979.
Ward, Stephen A. and Christopher J. Terman. An Approach to Personal Computing. Digest of Papers, Compcon ’80, IEEE, Feb., 1980, pp. 460-465.

Copyright © 1981 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org.

This is a digitized copy derived from an ACM copyrighted work. ACM did not prepare this copy and does not guarantee that is it an accurate copy of the author’s original work.