If you take 2001: A Space Odyssey literally, then right about now, somewhere in Urbana, Illinois, an intelligent machine is stumbling through a pathetic version of the song: "Daisy, Daisy, give me your answer, do...." January 12th, 1997, is the birthday of HAL.
Four years later, after a hell of a lot of additional lessons, HAL and five human crew members are on the spaceship Discovery, approaching Jupiter. By that time, HAL has been charged with protecting his passengers and ensuring the successful completion of the secret mission. He even has the capability to complete the mission on his own, should something happen to the crew. "My mission responsibilities range over the entire operation of the ship, so I am constantly occupied," HAL confidently tells a BBC news presenter during a television interview. "I am putting myself to the fullest possible use, which is all, I think, that any conscious entity can ever hope to do."
That's when something goes wrong - terribly wrong - with Discovery's human crew. HAL detects a problem with the AE-35, a piece of equipment used to maintain contact with Earth. But after Dave Bowman goes on a spacewalk and brings the AE-35 back in, neither he nor Frank Poole can find anything wrong with it. So they blame HAL: they conclude that the computer is malfunctioning and decide to shut him off.
Realising that the humans' actions would jeopardise the mission, HAL does his best to defend himself against their treachery: he kills Poole during the next spacewalk, then traps Bowman outside the ship when he foolishly attempts a rescue. As a precautionary measure, HAL also terminates the life functions of the three hibernating crew members.
Outside the spaceship, Bowman argues with HAL over the radio, demanding to be let back in. The computer wisely refuses: "I'm sorry, Dave, I'm afraid I can't do that." That's when the wily Bowman manoeuvres his space pod over to Discovery's emergency airlock, blows the explosive bolts, scrambles inside, seals the door and repressurises the airlock. Finally, Bowman makes his way into the core of HAL's brain and disconnects his higher brain functions, one by one. Today the results of Bowman's actions are well known: he leaves the spaceship to face the alien artefact on his own. Discovery never returns to Earth. The mission ends in failure.
Still swinging clubs
When Arthur C. Clarke and Stanley Kubrick created the film 2001 almost 30 years ago, they subscribed to a kind of scientific realism. Repulsed by the space operas that had come before, they depicted spaceflight as slow and silent. Likewise, Clarke and Kubrick tried to make the HAL 9000 as advanced as they thought a computer could possibly be in the year 2001, while still remaining plausible.
Though Clarke and Kubrick might have got the physics right, their technological time line was woefully inaccurate: we are far behind the film's schedule today. The story depicts a huge space station and space weapons in orbit around Earth, routine commercial spaceflight, and two colonies - one American and one Russian - on the Moon itself. Perhaps this will come to pass in another 30 years, but it seems unlikely. Today, we can't even return to the Moon.
Further, Clarke and Kubrick failed to predict the biggest advance of the past 20 years: miniaturisation and microelectronics. In the film, astronauts on the Moon use a still film camera to take pictures of the alien artefact; today we would use a digital video camera. Aboard Discovery, Bowman and Poole use pen and paper to take notes; there are no laptop computers or PDAs to be found anywhere. Likewise, the control panels of the film's spaceships are filled with switches and buttons; Kubrick and Clarke failed to anticipate the glass cockpits that are becoming popular today.
But what about HAL - a fictional computer that is still far more advanced than any machine today? Is HAL another one of Kubrick's and Clarke's mispredictions? Or were the two simply a few years early? Throughout the film, HAL talks like a person, thinks like a person, plans - badly, it turns out - like a person, and, when he is about to die, begs like a person. Indeed, HAL acts much more like a human being trapped within a silicon box than like one of today's high-end Pentium Pro workstations running Windows 95. It is HAL's ability to learn and his control of the ship's systems, rather than his ability to perform lightning-fast calculations, that make him such a formidable challenge for the humans when they try to disconnect him.
Is this 1960s vision of the future of machine intelligence still realistic today? Yes, to some extent, although not on the timetable Clarke imagined (and still less on the timetable in the finished film, in which the "birth" takes place in 1992).
"Certainly we can do some things as well as HAL," says David Stork, a consulting associate professor of electrical engineering at Stanford University who has made HAL one of his obsessions. "We have a chess program, Deep Blue, that beats all but a dozen people in the world, and it's improving every year. Likewise, we can build big computers. Building a computer with the power necessary for performing HAL's functions is within our grasp." Stork estimates that a network of a few hundred supercomputers could handle the computational requirements easily enough.
Some of the tasks that HAL performs in the movie are commonly done by computers today. HAL guides Discovery to Jupiter; almost 20 years ago, the United States launched two unmanned Voyager space probes - which were completely reliant on on-board computer systems - to Jupiter, Saturn, Uranus, Neptune and beyond.
Likewise, the past 20 years of research and development of artificial intelligence have had tangible benefits. Just look at AT&T, which has been phasing in a speech-recognition system that can understand the words "collect", "calling card", "third-number", "operator", "yes" and "no." That doesn't seem like a hard job - for a person. Try writing the program and you'll soon be coping with hundreds of kinds of accents, high levels of background noise, and different kinds of distortion introduced by different kinds of phone lines and telephone instruments.
Building a system that can recognise just those six words has taken AT&T more than 40 years of research. Now in place, the system is allowing the company to phase out some of its human operators - which means spending less on salaries, wages, benefits, real estate, maintenance, overhead and more. Total estimated savings? About US$150 million a year. It's possible that speech recognition saves AT&T more in a single year than the firm ever spent on the research.
Of course, there's still a big jump in intellect between the AT&T machine that can recognise six words and HAL. And we're still a long way from getting a machine to think and learn in any true sense. That's a problem that has been haunting the AI field for more than 50 years.
This month, MIT Press is publishing Stork's book, HAL's Legacy: 2001's Computer as Dream and Reality, a collection of essays written by some of today's top researchers in the field of computer science. Much of HAL's Legacy plays a kind of guessing game with the film: how close did Clarke and Kubrick come to getting the technology right? For example, Donald Norman, Apple Fellow and vice president for Apple's Advanced Technology Group, takes a punishing look at the lack of good ergonomics in the spaceship Discovery's control panels.
But flip the question around and the game can be even more interesting. Don't ask how close 2001 came to being right. Ask how close today's computers are to realising the promise of HAL. When will we be able to talk to a HAL-like computer and consider it nearly an equal? When will the dream of 2001 become reality? Perhaps the easiest way to answer that question is to take a step-by-step look at what it means to be HAL.
The ultimate chatterbot
Unlike today's computers, the primary way in which HAL communicates with Discovery's crew is through the spoken word. Bowman and Poole speak; HAL listens and understands. How far are we from a computer that can comprehend its master's voice?
Voice recognition is a hard but largely solved problem. For more than five years, two companies in the Boston area - Dragon Systems and Kurzweil Applied Intelligence - have sold programs that let you command a personal computer using your voice. These programs get better every time PCs get faster. Today they can recognise more than 60,000 words and control a wide variety of PC applications, including word processors and spreadsheets. The Dragon and Kurzweil programs are widely used by people who can't type because of a physical disability. Increasingly, they are finding a market among people who simply haven't learned to type or haven't learned to spell.
But the Dragon and Kurzweil systems can be difficult to use. Unlike HAL, which could listen to people speaking in a continuous flow, today's systems require you to pause between words. The programs use the pauses to find where each word begins and ends. The computer then looks up the word in a phonetic dictionary, creating a list of possible matches. An elementary knowledge of grammar helps these programs pick the right word and resolve the difference between homonyms such as "write" and "right".
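The three steps just described (split at the pauses, look up each sound, let grammar choose among the matches) can be sketched in a few lines of Python. This is only a toy illustration, not how the Dragon or Kurzweil products work: the phonetic codes, the tiny dictionary, the "likely next word" table and the use of "|" to mark a pause are all assumptions invented for the example.

```python
# Toy sketch of isolated-word recognition. All data below is hypothetical.

# Assumed phonetic dictionary: a sound code maps to every word that matches it.
PHONETIC_DICTIONARY = {
    "AY": ["I", "eye"],
    "W-IH-L": ["will"],
    "R-AY-T": ["write", "right"],
    "T-ER-N": ["turn"],
    "AH": ["a"],
    "L-EH-T-ER": ["letter"],
}

# Assumed stand-in for "an elementary knowledge of grammar":
# words that plausibly follow a given previous word.
LIKELY_AFTER = {
    "will": {"write", "turn"},
    "turn": {"right"},
}


def segment(utterance, pause="|"):
    """Split the input at pauses; each chunk is assumed to be one word."""
    return [chunk.strip() for chunk in utterance.split(pause) if chunk.strip()]


def recognise(utterance):
    """Return one word per chunk, using the previous word to break ties."""
    words = []
    for sound in segment(utterance):
        # Step 2: the list of possible matches for this sound.
        candidates = PHONETIC_DICTIONARY.get(sound, ["<unknown>"])
        # Step 3: prefer a candidate the grammar table allows after the
        # previous word; otherwise fall back to the first match.
        previous = words[-1] if words else None
        preferred = LIKELY_AFTER.get(previous, set())
        choice = next((c for c in candidates if c in preferred), candidates[0])
        words.append(choice)
    return words


print(recognise("AY | W-IH-L | R-AY-T | AH | L-EH-T-ER"))
print(recognise("T-ER-N | R-AY-T"))
```

The same sound, "R-AY-T", comes out as "write" after "will" and as "right" after "turn". Take away the pause markers and the first step has nothing to split on, which is one way to see why continuous speech is the harder problem.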
Both Janet Baker, president and co-founder of Dragon Systems, and Ray Kurzweil, founder and chief technical officer of Kurzweil Applied Intelligence, claim they have systems in their respective laboratories that don't require the speaker to pause between words. "We demonstrated the first continuous recognition machine a few years ago," says Baker, who maintains that her continuous speech system could handle a vocabulary of 5,000 words. Kurzweil's labs, meanwhile, have built a system that can recognise a thousand different commands used by Microsoft Word. "You could say, 'Go to the second paragraph on the next page and underline every word in the sentence,'" says Kurzweil.
Both Baker and Kurzweil believe that commercially viable continuous voice recognition systems are just around the corner - say, another two or three years off. Already, both of their commercial products allow continuous voice recognition of numbers. You can, for example, give a phone number without pausing between the digits. But neither company would demonstrate its continuous speech system for a reporter. Presumably, they're not quite ready for prime time.
Bottom line: we're close to reaching HAL's level of speech recognition, and progress is picking up. By 2001, we should have it.
Read my lips
HAL can do more than understand spoken words - the computer can also read lips. In one of the film's pivotal scenes, Bowman and Poole retreat to one of Discovery's sealed pods to have a private conversation. HAL watches their lips through the window and realises that the two humans may attempt to disconnect his brain.
Is computerised lip-reading possible? Arthur C. Clarke didn't think so - not by 2001, not ever. "He thought there was just not enough information in the image of the talker," says Stork, who worked with Clarke on HAL's Legacy. Clarke didn't even want the scene in the film. It was inserted only at Kubrick's insistence for dramatic effect.
Thirty years later, the debate over the efficacy of pure lip-reading - even in humans - is still largely undecided. Wade Robison, a professor of philosophy at the Rochester Institute of Technology, where 1,000 of the school's 9,000 undergraduates are profoundly deaf, is sure that lip-reading is possible because human intelligence can master it. Robison remembers one student in particular: "I hadn't a clue she was deaf until one day I happened to be talking one-on-one with her in my office. I finished up a sentence as I turned to answer the phone, and she had to ask me to repeat the sentence. As I turned, I almost jokingly mouthed: 'Can you hear what I am saying now?' She said, 'Yes, but I'm reading your lips.'"
Other researchers disagree that the image of the speaker is enough. "We have tested people who supposedly could get by without anything else besides visible speech," says Dominic Massaro, a professor in the department of psychology at the University of California at Santa Cruz. "Unfortunately, a lot of them don't really get everything that is coming by."
In any event, work in computer lip-reading - or rather speech-reading, since the computer looks at the person's jaw, tongue and teeth, as well as the lips - has been steadily progressing for more than six years. David Stork, one of the principal researchers in the field, explains that it could be a short cut to voice recognition. "Speech-reading promises to help for those utterances where acoustic recognition needs it most," he explains.
But even assisted speech-reading is still in its infancy. Researchers have estimated that it will be more than ten years before commercial speech-recognition systems use video cameras to improve their accuracy.
Bottom line: Clarke was probably right - pure speech-reading is probably unrealistic. But within ten years it's likely that computers will progress to the point where they can get the gist of a conversation by speech-reading.
Speak to me
From the moment HAL utters his first words, it is clear to the moviegoer that the 9000 series is a superior architecture: HAL's voice is decidedly nonmechanical.
For Kubrick, creating HAL's voice was easy. Kubrick simply handed a script containing HAL's words to Douglas Rain, a Shakespearean actor now based in Ontario, Canada, and asked him to read the words into a tape recorder. It took Rain a day and a half. (Three decades later, HAL remains Rain's most memorable role. Perhaps for that reason, the actor refuses to discuss HAL or the film with the press.)
After nearly a century's research by scientists into synthetic speech, Kubrick's technique still dominates the industry. Most games that have "computer voices", for example, actually use digitised human speech that has been electronically processed to make it sound more machine-like. Likewise, most computers that speak over the telephone construct what they are trying to say by pasting together phrases from hours of recorded human speech.
For example, blind people could have this article read to them by DECtalk, a speech synthesizer from Digital Equipment Corporation. More than 10 years old, DECtalk is still one of the best-sounding voice synthesizers on the market. Others could listen to this article on their Macintosh: Apple's System 7.5 comes with a speech synthesizer called MacinTalk; an even better synthesizer, MacinTalk Pro, can be downloaded from the company's Web site.
Listen to HAL's voice and you'll discover why synthesizing speech is such a hard job. Despite being told to read the words in an emotionless monotone, Rain nevertheless crafted minute timing modulations, pitch shifts and amplitude changes into the words as he said them. That's because the actor understood the meaning behind the words, and part of that understanding was encoded into those minor variations. He couldn't help himself.
"As the field has been maturing, we are realising that you can't treat a speech synthesizer the way you treat an old-style line printer," says Kim Silverman, Apple's principal research scientist for speech synthesis. "The way you say some-thing depends on what it means, why you are saying it, and how it relates to what the listener already knows."
As a result, much of the research on speech synthesis today has turned into research on understanding natural languages. Joe Olive, department head of text-to-speech research at Bell Labs, explains it this way: "If you just talk, it is a lot easier than if you have to read aloud something that somebody else wrote. The reason is that when you are talking, you know what you want to say."
Bottom line: today's computers speak pretty well when they operate within narrow parameters of phrases, but sound mechanical when faced with unrestricted English text. Real breakthroughs will require a much better understanding of natural language. Give it five years.
The vision thing
The HAL 9000 comes equipped with a general-purpose video system that follows Poole and Bowman around Discovery. When Poole goes on his spacewalk to replace the AE-35 unit, HAL presumably uses his vision to guide the pod's robotic arm and sever the spacesuit's air hose.
"Vision systems today are getting very good at tracking people," says Eric Grimson, a professor at the MIT Artificial Intelligence Laboratory. Several labs in the United States have built instrumented rooms, which Grimson says have "small, embedded cameras on the walls, ceilings and desktops that can pan, tilt, do motion tracking, keep track of how many people are in the room, deal with them as they walk past each other, and maintain a pretty good knowledge of where the people are."
Likewise, says Grimson, there are now many face-recognition systems both in the laboratory and the marketplace. These systems cannot pick out a terrorist walking around an airport as seen from a security camera, but they can identify someone using a full-frontal image from a database of a few hundred people. Some can even identify a person turned at an angle. "Systems perform in the 90% range on face recognition," says Grimson.
HAL does more than recognise faces: the computer even has aesthetic sensibilities. When HAL finds Bowman sketching, the computer says: "That's a very nice rendering, Dave. I think you've improved a great deal. Can you hold it a bit closer? That's Doctor Hunter, isn't it?"
While artistic appreciation escapes today's computers, another scientist at the MIT AI laboratory, Tomaso Poggio, has developed a program that can identify a specific person within a group photograph and another that can recognise objects and faces from line drawings. That program can even say how close the sketch is to a stored template.
"If you look at individual components - for example, locating human beings in a scene - I think that there are several good programs," says Takeo Kanade, director of The Robotics Institute at Carnegie Mellon University. But none of these systems can do it all. HAL, on the other hand, is a general-purpose intelligence that can understand whatever it sees.
For example, says Kanade, HAL realises that Bowman has ventured outside Discovery without his space helmet. "If you just tell me that particular problem, and tell me what the helmet is, and the colour, I can probably write the program," says Kanade. Detecting any kind of helmet, in any colour, is much more difficult. "We can recognise a particular helmet," says Kanade, "but not 'helmet' in general."
That sort of general-purpose recognition is a far more complex task. It goes beyond image processing and crosses the boundary into common-sense understanding and reasoning about the scene itself - tasks that are beyond today's state of the art.
Bottom line: today, we can build individual vision systems that perform each of the tasks HAL performs in the film 2001. But we can't build a single system that does it all. And we can't build a system that can handle new and unexpected environments and problems. To achieve that level of sophistication, we need something extra.
Ingredient X
The extra something that all of these technologies need to work is natural-language understanding and common sense. Indeed, for many people progress in these fields defines AI today. Consider the famous Turing Test, which postulates that a machine will be truly intelligent if you can communicate with it by teletype and be unable to tell if the machine is a human being or a computer. According to Alan Turing, language skills and common sense are the essence of intelligence.
There's just one problem: language understanding and common sense are two things we don't know how to do.
Of the two, much more work has been carried out on natural-language understanding, or comprehension of language rather than merely the recognition of speech. One of the leaders in this field is Roger Schank, director of the Institute for the Learning Sciences at Northwestern University. In the late 1970s, Schank and his graduate students at Yale University built a computer program called CYRUS, which was programmed to learn everything it could about former US Secretary of State Cyrus Vance by reading the daily newswires. Each time the program read an article about Vance, it would digest the facts of the article and store the information in a conceptual database. You could then ask CYRUS a question in English - say, has your wife ever met the wife of the prime minister of Great Britain? The program was actually asked this question and answered - yes, at a party hosted in Israel.
Since then, Schank has focused on a technique he calls "case-based reasoning". Schank believes that people have a repertoire of stories that they want to tell you. When you ask them a question, it triggers a story. And people use these stories to reason and make decisions about what to do in their lives. In recent years, Schank's institute has built a number of corporate training systems, which are really large databanks filled with stories from dozens or even hundreds of people who work for the organisation. Got a problem? Ask the computer your question; the machine finds the appropriate story and plays it back to you.
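The "ask a question, get a story" loop can be sketched as a simple retrieval problem. What follows is a deliberately crude illustration, not Schank's software: the stories, their tellers and the keyword-overlap scoring are all hypothetical, and real case-based reasoners index their cases far more richly than this.

```python
import re

# Hypothetical story bank; every teller, keyword and story is invented.
STORIES = [
    {
        "teller": "hypothetical sales manager",
        "keywords": {"customer", "angry", "refund", "complaint"},
        "story": "The time we turned a refund dispute into a repeat order.",
    },
    {
        "teller": "hypothetical plant engineer",
        "keywords": {"machine", "broken", "repair", "downtime"},
        "story": "The night the packing line stopped and what we checked first.",
    },
]


def tokenize(text):
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, stories=STORIES):
    """Return the story sharing the most keywords with the question, or None."""
    words = tokenize(question)
    best = max(stories, key=lambda s: len(s["keywords"] & words))
    # No overlap at all means nothing was triggered.
    if not best["keywords"] & words:
        return None
    return best


hit = retrieve("What do I do about an angry customer who wants a refund?")
print(hit["teller"], "->", hit["story"])
print(retrieve("How do I fly a spaceship?"))
```

The sketch only ever plays back one of the stories it was given, or nothing at all. It never composes a new answer.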
The problem with Schank's systems is that using them is like having a conversation with a videodisc player. You get the feeling that no matter what you say, the response was previously recorded - like a trashy daytime television show.
Of course, HAL can clearly do things that Schank's systems can't do: HAL is curious. HAL can learn. HAL can create his own plans. It is doubtful that one of the cases programmed into HAL was a recipe for eliminating the crew.
For nearly two decades, another AI researcher, Doug Lenat, has been working on a different approach to teaching computers to learn and understand. "Almost everything that we would characterise as HAL, almost everything that separates HAL from the typical PC running Windows 95, hinges around this word 'understanding'," says Lenat. "It hinges around the totality of common knowledge and common sense and shared knowledge that we humans as a species possess."
As Lenat sees it, the difference between HAL and your PC isn't a magic program or technique, but a huge "knowledge-base" filled with rules of thumb, or heuristics, about the world. One entry might be: "When you are sleeping, you can't perform actions that require volitional control," says Lenat. Another might be: "Once you are dead, you stay dead."
HAL would need such facts to run the ship and care for the crew. And he'd need them to figure out how to dispose of the humans when they start to jeopardise Discovery's mission.
Today there is only one database of common sense in the world. It's Cyc, the core technology used in the products of Lenat's company, Cycorp, based in Austin, Texas. Lenat and his fellow developers have been working on Cyc for more than 13 years. The knowledge-base now contains more than two million assertions. All of the information is arranged in a complicated ontology.
Right now, says Lenat, Cyc is making progress in natural-language understanding - it can understand commonsensical meanings in written text. Consider these two sentences: "Fred saw the planes flying over Zürich" and "Fred saw the mountains flying over Zürich." Though a conventional parser would say that these sentences are ambiguous, Cyc knows that it is the planes that are doing the flying in the first sentence and Fred who is doing the flying in the second.
Cyc can make these discriminations because the words "planes" and "mountains" are more than just plural nouns: they are complex concepts with many associations. Lenat believes that it's this sort of deep understanding that's necessary for the majority of jobs that HAL does. And Lenat thinks that it is only a small step from a Cyc-like database to true machine intelligence.
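To see why a scrap of world knowledge settles the Zürich sentences, consider a toy version in Python. It is in no way a picture of how Cyc works: the one-column CAN_FLY table and the attach_flying function are hypothetical stand-ins for what is, in Cyc, a large ontology and a reasoning engine.

```python
# Toy commonsense table (hypothetical): can this kind of thing be "flying"?
# A person counts as able to fly in the sense of travelling by aircraft.
CAN_FLY = {
    "planes": True,
    "mountains": False,
    "Fred": True,
}


def attach_flying(subject, seen):
    """For '<subject> saw the <seen> flying over Zürich', pick who is flying.

    A purely syntactic parser leaves two readings open: the thing seen is
    flying, or the subject is. Readings that contradict the table are
    dropped; the thing seen is tried first because it sits next to "flying".
    """
    readings = [seen, subject]
    plausible = [r for r in readings if CAN_FLY.get(r, False)]
    return plausible[0] if plausible else None


print(attach_flying("Fred", "planes"))     # the planes are flying
print(attach_flying("Fred", "mountains"))  # Fred is flying
```

The grammar of the two sentences is identical. Only the table entry for "mountains" changes the answer, which is Lenat's point about knowledge doing the work.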
"Cyc is already self-aware," says Lenat. "If you ask it what it is, it knows that it is a computer. If you ask who we are, it knows that we are users. It knows that it is running on a certain machine at a certain place in a certain time. It knows who is talking to it. It knows that a conversation or a run of an application is happening. It has the same kind of time sense that you and I do."
This is a lot more than simply programming a computer to say "I am a computer." Cyc knows what a computer is, and can use that knowledge to answer questions about itself. Like a person, Cyc can perform a chain of reasoning. But Cyc can't learn by itself. All of the heuristics in the Cyc knowledge-base have been painstakingly entered by Lenat's developers, or "ontologisers", as he calls them.
Lenat's dream has always been to create a computer program that could learn on its own. His PhD thesis was a program called AM - Automatic Mathematician - which was designed to discover mathematical patterns. Over hundreds of runs, AM discovered addition, multiplication, and even prime numbers. But the program always stopped working after a few hours. Why? AM learned by making experimental modifications to itself and keeping mutations that were interesting. Invariably, something important in the program got modified out of existence. This taught Lenat that there had to be more to learning than trial and error.
Lenat started the Cyc project in an attempt to get away from the boring world of abstract maths. Immediately he had a problem: the system couldn't learn about the world in general because there was too much that it didn't know. This is where Lenat got the idea of "priming the pump" by giving Cyc a conceptual understanding of the world. Once that framework was large enough, Lenat reasoned, the computer would be able to start learning on its own - for example, by reading and conversation.
So how much priming does Lenat think is needed? In 1983, Lenat believed that it would take ten years to get Cyc to the point that it could start to learn English on its own, unsupervised. Today, "I'd like to say we will get there by 2001," Lenat says. "We think that we are right at the knee of the curve." Lenat says that if he is right, then by 2001 the Cyc program will start being a "full-fledged creative member of a group that comes up with new discoveries. Surprising discoveries. Way out of boxes."
Bottom line: understanding is the key to AI. More than anything else, it's the one technology that eludes science. With true understanding, all of the other AI systems would fall into place. And without it, none of them will ever achieve their potential. Give it 10 to 30 years.
Bottom bottom line
In the years after the making of 2001, an interesting rumour began to circulate: HAL's name was a play on the computer maker IBM - the letters H, A and L occurring one letter in the alphabet before the initials I, B and M. Arthur C. Clarke vigorously denied the rumour. The name wasn't a play on IBM - it was an acronym, of sorts, standing for the words "heuristic algorithmic."
Back in the 1960s, heuristics and algorithms were seen as two competing ways of solving the AI puzzle. Heuristics were simple rules of thumb that a computer could apply for solving a problem. Algorithms were direct solutions. HAL presumably used both.
Was Clarke fudging? Perhaps more than a little. The real truth is that nobody had a clue how to build an intelligent computer in the 1960s. The same is largely true today.
Looking back, the early advances in artificial intelligence - for example, teaching computers to play tic-tac-toe and chess - were primarily successes in teaching computers what are essentially artificial skills. Humans are taught how to play chess. And if you can teach somebody how to do something intellectual, you can probably write a computer program to do it as well.
The problems that haunt AI today are the tasks we can't program computers to do - largely because we don't know how we do them ourselves. Our lack of understanding about the nature of human consciousness is the reason why there are so few AI researchers working on building it. What does it mean to think? Nobody knows.
"I think the hardware that is necessary for what HAL has is available," says Stanford's David Stork. "It's organisation, software, structure, programming and learning that we don't have right."
That's a lot of stuff. And it's a dramatic ideological reversal from the 1960s, when AI researchers were sure that solutions to the most vexing problems of the mind were just around the corner. Back then, researchers thought the only things they lacked were computers fast enough to run their algorithms and heuristics. Today, surrounded by much more powerful computers, we know that their approaches were fundamentally flawed.
When I started working on this article, I thought that real breakthroughs in AI were just five to ten years away. Today I still think we'll see some breakthroughs in that time, but I doubt they'll culminate in a sentient machine for another 30 years.
Sooner or later, we will build a computer that can think and learn. Then we'll be able to stand back and let it reach for the stars. But whatever we do, we'd better not threaten to turn it off.
Simson Garfinkel is spending six months at the University of Washington in Seattle exploring the future of computers and society.
Talk with author Simson Garfinkel live on Tuesday, January 14th 1997, at 2pm PST at www.wired.com/5.01/hal/.