Rules of the Screen


The designers of modern user interfaces follow a set of (largely unwritten) rules that have been developed over the last ten years. Uniquely, this article takes a look at both the rules and their basis in psychology.

The elements you see in modern user interfaces (windows, menus, icons, etc) were founded on basic psychological principles – eg, people are better at recognition than recall.

Psychologists spend their time devising deceptively simple experiments using, as their laboratory equipment, people’s heads. Some of their discoveries do not seem earth-shattering (it is, apparently, harder to remember “The man looked at the sign” than “The fat man looked at the sign warning of thin ice.”) Over time, they have built up a model of the mind that is far from the powerful, flexible superbrain we might think humans possess. Rather, it is quirky, fallible, distracted and often downright cantankerous: more Homer Simpson than Sideshow Bob.

One of the earlier findings was that people are considerably better at recognition than recall. Ask me to describe the icon for ‘set foreground colour to match this’ and I’ll have difficulty, but show me a palette of icons and I’ll pick out the pipette. Or, at least, I will if I’ve used it before. This is the fundamental principle behind both icons and menus: having people choose from a list they’ve seen before is far easier than having them type long strings of ill-remembered commands. Information in the world is easier on users than information in the head. Not just easier, but quicker too; measured in milliseconds, recognition beats recall hands down.

Research is ongoing: new techniques and interface ‘widgets’ are introduced with each new operating system release. Some are successful and others die out.

Of course, every application now uses menus and icons, but there is more to it than that. Further research shows that it’s easier to find what you’re looking for when items are sorted into meaningful groups. If that seems obvious, watch someone using menus. They will frequently look in two or more menus before finding what they’re looking for, no matter how many times they’ve seen it before. Are the menus grouped in a way that is meaningful for the user or for the developer? Or for a designer in search of elegant simplicity, who wants one set of menus across different applications and achieves it by making their names as bland as possible?

What the user sees…
The key to a successful interface is determining the predominant user type (beginner, intermediate or expert) and considering what users are thinking when they use an application.

Diversity of users is a key problem for the visual interface designer.
Typically, users are segmented into three capability levels: beginner, intermediate and expert. These levels can be applied to three domains: computing in general, a specific operating system and a specific application. You can be an expert Windows user but a beginner at Excel, for instance; or you could be an expert user of your company’s order entry system but oblivious to everything outside it (there are users who believe the order entry system is the computer). ‘Expert’ here means ‘thoroughly familiar with’, rather than the usual sense of ‘world’s leading authority’.

A common mistake is to focus too much on the beginner, forgetting that the user will progress to an intermediate level fairly swiftly and will then be held up by features designed for beginners.

Beginners get a lot of attention, but in fact there are not many of them around, simply because they learn. For the custom business applications we are concerned with, users are almost entirely intermediates learning to become experts since they use the software nearly every day. If you design your interface for beginners, it will quickly be viewed as tiresome by most of its users. Instead, your application should quietly introduce itself and aim to make users intermediates as soon as possible.

Before looking at which interface features favour the different user types, consider what users are thinking when they use an application. Broadly, a user will have one of three questions:

(1) What can I do with this application?
(2) How do I achieve this specific task?
(3) If I rummage around, what will I find?

(1) and (3) might seem trivial, but they are enormously important in transforming beginners into intermediates and intermediates into experts. That said, you don’t need to build much direct support for them. The crucial question is (2); to see why, we need a model.

When you want to do something, say read a book after dark, you go through four steps (what follows is a simplified version of a model introduced by interface design guru Don Norman).

First, you determine a goal; in our example it might be “make it lighter, so I can read.” You then translate the goal into one or more actions such as “get out of the chair and turn on the light” or “ask someone else to turn on the light.”

When the action has been carried out, you look for feedback; in this case, it gets lighter. Finally, you evaluate whether the goal has been achieved.
Users apply a four-step sequence when trying to carry out a task with an application: goal, action, feedback and evaluation.

The channel most appropriate for completing a task varies with the user type: menus cater well for beginners, keyboard shortcuts are for experts and the still controversial toolbar is aimed at the largest user group, the intermediates.
That might seem like an awful lot of thinking just to turn on a light, but of course for such simple actions it’s all subconscious. The same goal-action-feedback-evaluation sequence applies to users trying to carry out tasks with your application. Typically, two problem areas arise: my goal is to send a mail message, but how do I do it (the ‘gulf of execution’); and I think I just sent one, but how can I be sure (the ‘gulf of evaluation’)? The first of these is our old friend (2) from the discussion above and is by far the most common, and most serious, interface problem.
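
To make the model concrete, here is a minimal sketch in Java Swing, with a stubbed ‘Send’ button standing in for a real mail command: the explicit status message is the feedback that closes the gulf of evaluation.

```java
import javax.swing.*;
import java.awt.BorderLayout;

public class VisibleFeedback {
    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JLabel status = new JLabel("Ready");
            JButton send = new JButton("Send");
            send.addActionListener(e -> {
                // ... hand the message to a (hypothetical) mail back end here ...
                status.setText("Message sent");   // feedback the user can evaluate
            });

            JFrame frame = new JFrame("Feedback");
            frame.add(send, BorderLayout.CENTER);
            frame.add(status, BorderLayout.SOUTH);
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setSize(300, 120);
            frame.setVisible(true);
        });
    }
}
```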

When you want to offer your users a way of completing a task, you have four main channels to choose from: menus, the keyboard, a toolbar or direct manipulation with the mouse. For some tasks, such as resizing a window or typing a letter, the choice is obvious (though IBM’s ageing CUA guidelines still insist that everything should be possible from the keyboard – have you ever tried resizing a window from the keyboard?). For commands, though, the choice is menu, keyboard and/or toolbar. Interface gurus agree that menus cater well for beginners and keyboard shortcuts are for experts, but this leaves our main target group, the intermediates. It was for this group that the toolbar was invented.
However, it doesn’t boil down to building an interface exclusively for one user type; it’s more a question of getting the right mix. Intermediates will quickly get accustomed to toolbars and accompanying tooltips. This frees menus up to teach beginners.

The argument against toolbars runs like this: they waste screen real estate, the icons are too small to be recognisable and, since users can’t figure out how to customise them, they usually feature the wrong set of commands. The argument in favour is more practical: once people start using toolbars, they can’t live without them. Microsoft neatly solved the recognisability problem with ‘tooltips’ – the label that appears if you hold the mouse over a toolbar icon for a couple of seconds. In fact, if you put the most commonly used commands on a toolbar (preferably customisable, though this can be hard work for developers and users alike) as well as giving them keyboard shortcuts, you have freed your menu bar to do what it does best: teach beginners. Faced with a tight budget, custom software developers often drop niceties like toolbars and right mouse-button shortcuts (the fifth channel). This is a mistake.
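
To see how the channels fit together in code, here is a minimal Java Swing sketch (the command, shortcut and wording are illustrative rather than anything the guidelines prescribe): a single Action feeds a menu item for beginners, a toolbar button with a tooltip for intermediates and a keyboard accelerator for experts.

```java
import javax.swing.*;
import java.awt.BorderLayout;
import java.awt.event.ActionEvent;
import java.awt.event.KeyEvent;

public class CommandChannels {
    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            // One command object, three channels.
            Action print = new AbstractAction("Print") {
                @Override public void actionPerformed(ActionEvent e) {
                    System.out.println("Printing with default settings...");
                }
            };
            // Expert channel: keyboard accelerator (shown beside the menu entry).
            print.putValue(Action.ACCELERATOR_KEY,
                    KeyStroke.getKeyStroke(KeyEvent.VK_P, KeyEvent.CTRL_DOWN_MASK));
            // Intermediate channel: tooltip text for the toolbar button.
            print.putValue(Action.SHORT_DESCRIPTION, "Print (Ctrl+P)");

            JMenu file = new JMenu("File");
            file.add(new JMenuItem(print));          // beginner channel: the menu
            JMenuBar menuBar = new JMenuBar();
            menuBar.add(file);

            JToolBar toolBar = new JToolBar();
            toolBar.add(print);                      // becomes a button with the tooltip

            JFrame frame = new JFrame("Command channels");
            frame.setJMenuBar(menuBar);
            frame.add(toolBar, BorderLayout.NORTH);
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setSize(400, 150);
            frame.setVisible(true);
        });
    }
}
```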

There are a number of further aspects to consider when examining command channels: affordance, the way in which an object’s appearance hints at what you can do with it, and pliancy, the way an object highlights itself or produces an entire menu when the pointer moves near it.

Before leaving the important topic of command channels, let’s look at two other aspects, one very old and one new. Direct manipulation (mousing an object) usually involves drag-and-drop. Though drag-and-drop has been around as long as GUIs, users have trouble with it. The problem seems to be with affordance, the way in which an object’s appearance hints at what you can do with it. In the real world, for instance, handles afford pulling. This makes handles on doors that need pushing pretty annoying (the designer probably thought this was a small price to pay to avoid the ugliness of a metal plate for pushing). Icons of files, just like real-world files, do afford ‘physical’ drag-and-drop between folders or into the wastebasket. But chunks of text do not afford dragging to somewhere else in a document, so this should be an optional feature, aimed at experts. The rule is, except where drag-and-drop is clearly an ideal solution (such as rescheduling events in a diary application), avoid making your users do it.
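
As a sketch of how that rule might be honoured in code, assuming a hypothetical per-user ‘expert’ preference, Swing’s built-in text drag support is simply left switched off unless the user opts in.

```java
import javax.swing.*;
import java.util.prefs.Preferences;

public class OptionalTextDrag {
    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            // Hypothetical preference: text drag-and-drop is off unless asked for.
            Preferences prefs = Preferences.userNodeForPackage(OptionalTextDrag.class);
            boolean expertDrag = prefs.getBoolean("textDragAndDrop", false);

            JTextArea editor = new JTextArea(10, 40);
            editor.setDragEnabled(expertDrag);   // Swing's built-in text drag support

            JFrame frame = new JFrame("Optional drag-and-drop");
            frame.add(new JScrollPane(editor));
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```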

Affordance is also behind a much more recent development: pliancy, or having an object highlight itself when the pointer is moved near it. The buttons in Internet Explorer are a good example of this—they change from subtle monotone to 3D colour whenever the pointer is within clicking distance.

Some objects go further than this, producing whole menus without being clicked (see Microsoft Encarta, for instance). This can be somewhat unnerving at first, but offers both ease-of-use to beginners and speed-of-use to everyone.
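
Pliancy is cheap to provide. A minimal Swing sketch, with a plain ‘B’ standing in for a real icon: the button stays flat and quiet until the pointer comes within clicking distance, then lights up.

```java
import javax.swing.*;
import java.awt.BorderLayout;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;

public class PliantButton {
    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JButton bold = new JButton("B");        // "B" stands in for a real icon
            bold.setBorderPainted(false);           // flat and quiet by default...
            bold.setContentAreaFilled(false);
            bold.addMouseListener(new MouseAdapter() {
                @Override public void mouseEntered(MouseEvent e) {
                    bold.setBorderPainted(true);    // ...but visibly clickable when
                    bold.setContentAreaFilled(true);//    the pointer comes near
                }
                @Override public void mouseExited(MouseEvent e) {
                    bold.setBorderPainted(false);
                    bold.setContentAreaFilled(false);
                }
            });

            JToolBar toolBar = new JToolBar();
            toolBar.add(bold);
            JFrame frame = new JFrame("Pliancy");
            frame.add(toolBar, BorderLayout.NORTH);
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setSize(300, 120);
            frame.setVisible(true);
        });
    }
}
```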

For users chasing speed, Apple’s single menu bar is faster than menus mounted on windows. The keyboard, of course, beats both, and a Dvorak layout is faster than QWERTY, but only by 10%.

On the subject of speed, here are a couple of other research findings that will be of interest in situations where every microsecond counts (high volume data entry, for example). First, mouse usage.

Fitts’s Law says that the time it takes to click an object increases with the distance the pointer has to move and decreases with the size of the target. This explains the surprising fact that Apple’s single menu bar is more efficient than Microsoft’s one-per-window paradigm. Apple users need much less accuracy to hit the target: they simply wham the pointer off the top of the screen. In effect, their target is several times larger. (Windows 95 uses the same trick for its Task bar.)
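
In its usual (Shannon) formulation, Fitts’s Law reads:

```latex
% Fitts's Law (Shannon formulation): T is the time to acquire the target,
% D the distance to it, W its width along the line of motion,
% and a, b are constants fitted empirically for the pointing device.
T = a + b \log_2\!\left(1 + \frac{D}{W}\right)
```

Screen-edge targets exploit this: the pointer cannot overshoot the edge, so W is effectively huge and the logarithmic term shrinks towards zero.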

Second, keyboard layout: isn’t it about time we did away with those silly QWERTY keyboards? In fact, an alphabetical layout would be no quicker to learn (you might know the alphabet, but that doesn’t help significantly when the sequence is broken into three rows) and would actually be slower for experts.

The Dvorak layout, based on statistical analysis of key usage, would be fastest, but only 10% better than QWERTY.

The problem with overlapping windows
Overlapping windows were developed during the backlash against character-based terminals. In fact, it is rare that a user needs to view more than one window at once. This fact is exploited by browser-centric applications.

Ever since we have had GUIs, we have had overlapping windows. There is, of late, a growing feeling that this was a wrong turn in the history of visual interfaces. People are most effective when they are in a state of ‘flow’ – ie, focused on a single task without interruption. Having to find a window somewhere behind the front one is an interruption. So is a dialog box or an error message (more of these later).

If you have three sets of papers on your desk, each relating to a different task, you do not put them on top of each other, you arrange them in separate piles and shift your focus from one to the other only when changing tasks. Nor do sequences in movies appear from behind each other and then drop into the background. This just isn’t how your mind works: attention is serial. Tellingly, the most successful application ever, the web browser, uses a serial, single-window idiom.

There are situations where you focus on more than one thing at once: when comparing, say, or copying from one place to another. Which is why overlapping windows work best where they were first applied: file managers like Apple’s Finder.

In those days you could run only one application, so it was difficult to get confused. As PCs became able to run several applications, a slew of interface innovations were thrown at the problem of managing them: Apple’s application menu with Hide and Show, Microsoft’s ALT-TAB and, more recently and most successfully, the Task bar.

For designers of custom applications, the task is somewhat simpler as they can assume that the user concentrates on a single application.

Forcing a user to move between windows and dialogs in the course of completing a single task is at first confusing and then annoying: the activity is surplus to the task.
If you are designing a custom application (other than a small utility), you can make the simplifying assumption that, if a user is working with your application, they are not doing anything else. So use the whole screen. Some gurus go further than this. Alan Cooper (creator of Visual Basic) suggests the following way of thinking about dialog boxes: imagine each window is a different room. If I visited your home and you demanded we move to a different room to shake hands, I would consider that eccentric at first and, pretty soon, downright annoying.

Of course, if we decided to start a new task, such as eating dinner, I would think nothing of moving to the dining room (not that I want to suggest that eating at your place would be a ‘task’).

If your user wants to do something they consider part of the same task, such as change font or view more detail, don’t give them a dialog box; let them do it right there in the window they already have. Reserve dialogs for new goals, such as starting a new search, not new functions. Some applications make you fight through several layers of nested dialogs, which is like finding yourself in the room under the stairs in the cellar just because you asked the time. Still others lure you into a false sense of security with commands like “New Table”, which produce not a table but a dialog asking you about the table you requested.
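
A minimal sketch of staying in the same room, with illustrative sizes: the combo box lives in the main window and changes the text immediately, where a lesser design would detour through a modal Font dialog.

```java
import javax.swing.*;
import java.awt.BorderLayout;

public class InPlaceFontChange {
    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JTextArea editor =
                    new JTextArea("The fat man looked at the sign warning of thin ice.", 10, 40);

            JComboBox<Integer> sizes = new JComboBox<>(new Integer[]{10, 12, 14, 18, 24});
            sizes.setSelectedItem(12);
            sizes.addActionListener(e -> {
                int size = (Integer) sizes.getSelectedItem();
                // The change happens right here in the window: no dialog, instant feedback.
                editor.setFont(editor.getFont().deriveFont((float) size));
            });

            JToolBar toolBar = new JToolBar();
            toolBar.add(new JLabel("Size: "));
            toolBar.add(sizes);

            JFrame frame = new JFrame("Same task, same room");
            frame.add(toolBar, BorderLayout.NORTH);
            frame.add(new JScrollPane(editor), BorderLayout.CENTER);
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```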

There are still a number of common pitfalls designers tend to fall into: boundaries between new tasks, default settings, undo options and poor error messages.

Users will come up with a myriad of task variations that will drive the requirements analyst mad, but in practice they will take the default option 95% of the time. Toolbars have recently started to get this right.
While many dialogs can be replaced with objects in the main window (on a toolbar, for instance), many more aren’t needed at all. Software designers do not distinguish between the occasional and the frequent: if something happens at all, the application has to cope with it. If you want to wind up a requirements analyst, neglect to tell them something and, when they find out about it, say “well it doesn’t happen very often so I didn’t think it was worth mentioning.” Yet this is how users really think, so when they choose Print, asking them stupid questions such as “where?” and “how many copies?” is annoying. They take the defaults 95% of the time. Print buttons on toolbars have recently started getting this right—they just print. If you really want multiple copies, you go and find the menu entry. A famous dialog in an early version of Excel would appear every time you tried to clear a cell and ask what, exactly, you wanted to get rid of—the contents? or maybe just the formatting? Getting just a little cleverer with defaults (remembering what the user told you last time, for instance) can make a big difference.
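
A sketch of the two print paths, assuming a Printable has already been set on the job elsewhere and using a hypothetical preference key to remember the last answer: the toolbar button just prints, while the menu entry still offers the full dialog.

```java
import java.awt.print.PrinterException;
import java.awt.print.PrinterJob;
import java.util.prefs.Preferences;

// Assumes a Printable has already been set on the PrinterJob elsewhere.
public class QuietPrint {
    private static final Preferences PREFS = Preferences.userNodeForPackage(QuietPrint.class);

    // Toolbar path: no questions asked, just the remembered defaults.
    static void printFromToolbar(PrinterJob job) throws PrinterException {
        job.setCopies(PREFS.getInt("copies", 1));   // hypothetical preference key
        job.print();
    }

    // Menu path: the full dialog, and remember the answers for next time.
    static void printFromMenu(PrinterJob job) throws PrinterException {
        if (job.printDialog()) {
            PREFS.putInt("copies", job.getCopies());
            job.print();
        }
    }
}
```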

All too often designers use error messages as a get out for poor design. With a well designed application, a user sees an error message so rarely that, when they do, they really sit up and take notice.

Some confirmation dialogs and error messages are even worse. Users who have made what psychologists call a slip – deleting the wrong thing, say – will simply confirm the slip when asked. What they need is a way to undo, though this is a notoriously difficult thing to implement. A classic confirmation dialog in an early database application (we couldn’t find a screen shot) said “Continuing may corrupt the database”, at which point the user could choose between two buttons, one labelled “Yes” and the other “No.” Kafkaesque messages like this, many of which are simply unnecessary, seem deliberately worded to offend.
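
Toolkits now do much of the work of undo. A minimal Swing sketch, with the Ctrl+Z binding chosen for illustration: every edit to the document is recorded so a slip can simply be taken back, rather than ‘confirmed’ in advance.

```java
import javax.swing.*;
import javax.swing.undo.UndoManager;
import java.awt.event.ActionEvent;
import java.awt.event.KeyEvent;

public class UndoNotConfirm {
    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JTextArea editor = new JTextArea(10, 40);

            // Record every edit so a slip can simply be taken back...
            UndoManager undo = new UndoManager();
            editor.getDocument().addUndoableEditListener(undo);

            // ...via Ctrl+Z, rather than confirmed in advance by a dialog.
            editor.getInputMap().put(
                    KeyStroke.getKeyStroke(KeyEvent.VK_Z, KeyEvent.CTRL_DOWN_MASK), "undo");
            editor.getActionMap().put("undo", new AbstractAction() {
                @Override public void actionPerformed(ActionEvent e) {
                    if (undo.canUndo()) undo.undo();
                }
            });

            JFrame frame = new JFrame("Undo, not confirmation");
            frame.add(new JScrollPane(editor));
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```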

Don Norman’s Six Slips (from The Design of Everyday Things)
Capture errors A frequently performed activity takes over from (captures) the one you intended. For example, driving your car to the supermarket and finding yourself at the office.
Description errors You perform the correct action but on the wrong object due to their similarities. For example, putting your dirty washing into the tumble drier rather than the washing machine.
Data-driven errors The arrival of sensory data triggers an automatic action, and this disrupts an ongoing action sequence. For example, you spill your drink when someone asks you the time.
Associative activation errors Internal events (thoughts) can also trigger automatic actions— eg, you think of something you ought not say, then say it (the classic Freudian slip).
Loss-of-activation errors You are half-way through an action sequence and realise you have no idea why you started. For example, you find yourself walking into the kitchen but have no idea why you are there.
Mode errors A device, say your video recorder, has more than one mode of operation and the same action has different results in different modes. This is probably the most common slip caused by poor visual interface design.

Trying to be helpful

A common problem faced by beginners on their way to intermediate status is information or ‘button’ overload.

A number of UI features have been designed specially for beginners, such as help, wizards and tips. For custom business applications, our objective with beginners is to make them intermediates. Training courses, whilst efficient, are quickly forgotten. What is needed is ‘Just In Time’ training.

In keeping with a person’s natural learning process (a declarative then a procedural knowledge phase), a ‘Just In Time’ training method supports the user’s progression from beginner to intermediate far more effectively.

When faced with a new environment, people perceive its complexity to be higher the more buttons there are to press. This is why the ‘simplest’ telephones have just the digits 0-9, *, # and ‘R’. They are, of course, almost impossible to use. The easiest interface, which would ironically be perceived as horribly complex, would be one button per function. On screen, the designer is able to hide functions until they are needed (depending on the user’s current task, for example) so there is no excuse for overloading buttons with several functions.
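
A minimal sketch of hiding a function until it is needed, with an illustrative ‘Copy selection’ command: the button only comes to life once there is actually a selection to copy.

```java
import javax.swing.*;
import java.awt.BorderLayout;

public class HideUntilNeeded {
    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JTextArea editor = new JTextArea("Select some of this text.", 10, 40);

            JButton copy = new JButton("Copy selection");
            copy.setEnabled(false);                          // nothing to copy yet
            copy.addActionListener(e -> editor.copy());

            // The command becomes live only while a selection exists.
            editor.addCaretListener(e ->
                    copy.setEnabled(editor.getSelectionStart() != editor.getSelectionEnd()));

            JToolBar toolBar = new JToolBar();
            toolBar.add(copy);

            JFrame frame = new JFrame("Hide until needed");
            frame.add(toolBar, BorderLayout.NORTH);
            frame.add(new JScrollPane(editor), BorderLayout.CENTER);
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```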

After overcoming their initial impression, people first pick up declarative knowledge—a real world example would be what a bicycle is and the parts it is made of. Only later does their knowledge become procedural—how to ride a bike ‘without thinking’. Beginners have to plod through each step; experts rely on subconscious (and much faster) procedural knowledge.
Help has also evolved from being a mere reference tool to a more active, task-orientated search function.

Help, which has been around for ever, has only recently become helpful. This is a result of a move away from pure reference help (mainly declarative) and a new focus on how to achieve specific tasks (much easier to make procedural). Some help is now ‘active’ in the sense that it will take actions for you, such as opening a control panel, if asked. Since this makes for a fast payoff, users who previously thought help was for dummies will now make the investment when they are stuck.

Tips (shortcuts suggested by the software, often as a result of watching what you are doing) and tooltips are further examples of the blurring boundary between functionality and help. The trend is towards help that is ‘in your face’ – a trend that is being accelerated by the World Wide Web for two reasons: the search idiom is becoming ingrained (so searching for help is too) and applications are decomposing into applets, so users will need guidance to fit them back together.

Wizards are also classified as ‘active help’ but, although they take the user through the whole task quite simply, they do little to teach him how to complete it himself.

Wizards are another example of active help, though these bring problems. Using a wizard, you quickly complete a task but you learn little about how. If anything, you are left more impressed by the mystery of it all. This makes wizards good for situations where learning is not the objective: infrequent tasks, say, or infrequent users. Over-reliance on wizards has resulted in some lazy design.

The processes by which these rules of the screen are reached are neither laid down nor, for that matter, even chronological. They are, more often than not, found by trial and error, with the outcome confirmed or refuted once the user gets his hands on the application.

In case you think that any of this is obvious, you should bear in mind that grown adults have spent days arguing over such things as whether there should even be a right mouse-button, let alone when to use it. The argument is now irrelevant, of course, since 95% of users have one. This is typical of how user interface design progresses: gurus pontificate, academics research (into technology on which they get an educational discount rather than the technology everyone else is using) and try to prove ideas that are already out of date, developers try their best to follow the written guidelines and a vendor, usually Microsoft or Apple, makes the whole lot irrelevant by introducing an innovation which users, voting with their fingers, make or break overnight. Someone then quietly rewrites the guidelines to fit. This is the context in which designers work. If you have a serious yearning to invent UI rather than apply it well, you should move to the West Coast.