Flattie and Int3nSity hit #riskybus on the afternoon of October 23rd, seizing control of the popular Internet Relay Chat game channel after nick colliding all the humans right off the network. As always, Flattie's guardbot had been watching his back all day, keeping an eagle eye out for any attempt by enemy bots to grab his channel ops.
A clonebot launched from a lagged IRC server broke through #riskybus defences. Earlier that afternoon, Flattie and Int3nSity had placed the clonebot on irc-2.mit.edu. They made their move when the server net split, stranding one human #riskybus gameplayer.
Flattie had to kill the human - he was in the way. Meanwhile, the clonebot did what it was designed to do, spawn a mob of baby bots in quick succession with the nicknames of all the gameplayers currently on #riskybus. The IRC protocol forbids two beings - human or robot - with the same nickname from coexisting on a given channel at a given time. So when the server rejoined the net, all hell broke loose. The nicknames collided. Flattie and Int3nSity ruled.
Bot mayhem. It's an old story on IRC, where code hackers and IRC operators are locked in constant warfare, playing an age-old game of technological one-upmanship.
But IRC isn't the only place bots regularly run amok. Bots are everywhere in the online universe - roaming the interstices of the World Wide Web; lounging about in MUDs and MOOs; patrolling Usenet newsgroups.
Web robots - spiders, wanderers and worms. Cancelbots, Lazarus and Automoose. Chatterbots, softbots, userbots, taskbots, knowbots and mailbots. MrBot and MrsBot. Warbots, clonebots, floodbots, annoybots, hackbots and Vladbots. Gaybots, gossipbots, and gamebots. Skeleton bots, spybots and sloth bots. Xbots and meta-bots. Eggdrop bots. Motorcycle bull-dyke bots.
DeadelviS, aka John Leth-Nissen, an IRC hacker who maintains an ftp archive of bot source code, defines "bot" as "short for 'robot', which sounds cooler than 'program'". And that about sums it up. A bot is a software version of a mechanical robot - defined by Webster's as an "automatic device that performs functions normally ascribed to humans." Like mechanical robots, bots are guided by algorithmic rules of behaviour - if this happens, do that; if that happens, do this. But instead of clanking around a laboratory bumping into walls, software robots are executable programs that manoeuvre through cyberspace bouncing off communication protocols. Strings of code written by everyone from teenage chat-room lurkers to top-flight computer scientists, bots are variously designed to carry on conversations, act as human surrogates, or achieve specific tasks - such as seeking out and retrieving information. And, as the above example illustrates, bots can also be used as weapons.
In current online parlance, the word "bot" pops up everywhere, flung around carelessly to describe just about any kind of computer program - a log-on script, a spellchecker - that performs a task on a network. Strictly speaking, all bots are "autonomous" - able to react to their environments and make decisions without prompting from their creators; while the master or mistress is brewing coffee, the bot is off retrieving Web documents, exploring a MUD, or combating Usenet spam. But most bot connoisseurs consider true bots to be more than just mindless ones and zeros. Even more important than function is behaviour - bona fide bots are programs with personality. Real bots talk, make jokes, have feelings - even if those feelings are nothing more than cleverly conceived algorithms.
Whatever their true definition, one thing's for sure: bots are hot. In online environments, they are both popular and pestiferous, the cause of constant comment and debate. And though bots vary in form and function, they share common identifiers. They are proliferating. They are increasingly complex. They cause as many problems as they solve. They will not go away. The future of cyberspace belongs to bots.
Today's digital hype casts that future as a world filled with helpful bot "agents" busily taking care of info-chores for their human masters. Bot, find me the best price on that CD, get flowers for my mum, keep me posted on the latest developments in Mozambique. Bot servants will make our lives easier, bringing us cool glasses of iced information tea as we relax amid the digital revolution. But not necessarily.
At the "Pt. MOOt" MOO, an experimental virtual city masterminded by University of Texas student Allan Alford, the Barney Bots - purple dinosaur-suited men who wander the MOO singing, "I Love You, You Love Me" - start spawning faster than they can be killed off by the MOO's contingent of human Barney hunters. Archwizard Alford is forced to intervene before the Bots drive the entire human population of Pt. MOOt insane.
November 1995: A search robot launched from multiple Norwegian servers hits webmaster Thomas Boutell's World Birthday Web and begins clicking repeatedly on the hundreds of pages located at the site, pages that are in turn linked to thousands more off-site pages. The sudden rise in traffic drags Boutell's Internet Service Provider to a screeching halt, a situation remedied only when the provider "locks out" an entire network of Internet addresses in Norway.
December 1995: In an exquisite display of bot irony, an "auto-responder" vacation mailbot installed by a subscriber to a mailing list devoted to Web robots begins responding to its own responses, clogging the list in a classic example of the most basic bot error - a recursively infinite stream of nearly identical messages.
Similar bad-bot outbreaks occur every day in some corner of the online universe. To date, most such bot truancy is blamed on bad design or improper implementation. Bots are just beginning to crawl out of the primordial digital ooze, and all the kinks haven't been worked out of their genetic code. With newbie botmaster wannabes joining the Net in huge numbers daily, bad-bot shenanigans are bound to continue.
But true bot madness may have only just begun. Bots don't have to be benign, and bot misbehaviour doesn't have to be accidental. Bots can be instructed to do whatever their creators want them to do. Bots can steal information instead of simply retrieving it. A commercial bot - such as an online shopping bot or a news retrieval bot - could be designed to disable its competitors. Bots are already beginning to appear with a programming mandate to stamp out or attack forms of expression deemed objectionable. What will our response be to homophobic bots, antigun bots, or white-supremacist bots?
As bots proliferate, so will the problems associated with them. Bot behaviour - and "bot ethics" - will become ever more controversial. But not insoluble. A survey of the bot frontier reveals that everywhere bots make their presence felt, their misuse or abuse has initiated a reactive process. The Net is mobilising to bring bots into line. Call it the technodialectic: on the Net, problems beget solutions, solutions propagate and themselves beget new problems. There is an uneasy balance in botdom. But one thing is certain: the numbers and varieties of bots are exploding. The bot diaspora has begun.
A taxonomy of bots
The semantics of botness are confused and have yet to be satisfactorily sorted out. Among the messy problems, for instance, is distinguishing between bots and "agents". Right now, there doesn't appear to be a clear difference, except perhaps that the term "bot" is more slangy and technohip, while "agent" smells of marketing-speak. But whatever you call them - "agents" or bots or "scripts" or plain old "programs" - they are a genus, a class of species, of their own, branching out in all directions across the four main network communications protocols that comprise the core of what is called the Internet. To unravel their taxonomic threads is no simple task: it demands a Darwin.
In the beginning, there was Eliza - call her "bot habilis" - the first widely acknowledged software robot. Created in the mid-'60s by MIT professor Joseph Weizenbaum, Eliza is a program that simulates a "Rogerian psychotherapist" by rephrasing patient questions into questions of her own.
Wired: Do you know you are the mother of all bots?
Eliza: Does it please you to believe I am the mother of all bots?
Compared with some of today's bots, Eliza is painfully simple - a mere 240 lines of code. Yet even though ancient, Eliza survives. There is a Web page where she can be found and interrogated (www-ai.ijs.si/Eliza/Eliza.html) and a channel devoted to her on IRC (although at last visit, Eliza didn't appear to be present). Versions of her pop up in some popular Unix shareware packages.
Her basic operating principle is simple. Eliza "parses" each question you pose, looking for keywords or sentence constructions that she recognises. For each keyword, she has been given an appropriate type of response. For example, if the word "you" is in the question, the word "I" is in the answer. And while Eliza is autonomous - she can be set up, turned on and left alone to beguile or confuse passersby - she's not very intelligent. A few minutes of conversation expose her as a fraud.
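For the curious, a minimal sketch in Python gives the flavour of this keyword-and-reflection trick; the patterns and canned replies below are illustrative inventions, not Weizenbaum's original script.

```python
import random
import re

# Pronoun reflection: "you" in the question becomes "I" in the answer, and so on.
REFLECTIONS = {"you": "I", "your": "my", "i": "you", "my": "your", "am": "are", "are": "am"}

# Illustrative keyword rules -- invented for this sketch, not Eliza's actual script.
RULES = [
    (r"i need (.*)", ["Why do you need {0}?", "Would it really help you to get {0}?"]),
    (r"you are (.*)", ["Does it please you to believe I am {0}?"]),
    (r"i feel (.*)", ["Tell me more about feeling {0}.", "Do you often feel {0}?"]),
    (r"(.*)", ["Please go on.", "How does that make you feel?"]),  # catch-all fallback
]

def reflect(fragment: str) -> str:
    """Swap first- and second-person words so the echoed phrase reads naturally."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())

def respond(question: str) -> str:
    """Scan the rules in order; the first pattern that matches supplies the reply."""
    cleaned = question.lower().strip(" .!?")
    for pattern, templates in RULES:
        match = re.match(pattern, cleaned)
        if match:
            return random.choice(templates).format(*(reflect(g) for g in match.groups()))
    return "Please go on."

if __name__ == "__main__":
    print(respond("You are the mother of all bots"))
    # -> "Does it please you to believe I am the mother of all bots?"
```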
Nevertheless, Eliza is classified as a chatterbot. She talks back. In the taxonomy of bots, chatterbots are perhaps the most important of the bot groups - the first family of bots. Nearly all MUDbots, as well as gamebots, gossipbots and some forms of IRC channel-guarding bots, incorporate qualities of chatterbots. With names like Eliza, Julia, Colin, Buford, Nurleen and Newt, they take on personae and occupy psychic space. A chatterbot can be written in any programming language and can be found everywhere - in MUDs and other multiuser domains, in IRC, on the Web and, in a modified form, on Usenet. One can even make the case that a mailbot, which looks at the headers of incoming email and decides whether to respond or otherwise filter the mail, is a form of chatterbot. It talks via email.
Contemporary chatterbots operate under far more sophisticated algorithms than does Eliza. Some record all the conversations they hear, register the probability that certain words will follow certain other words and build up their own sentence patterns for future use. In the future, say bot experts, all bots will have a chatterbot interface, no matter what their purpose. You'll talk to your computer programs, and they'll talk back. You'll tell them what to do and, depending on their algorithms, they'll either hop to it or spit in your face. But right now, not all bot beasties in the panoply have the power of speech.
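A toy version of that record-and-recombine technique might look like the following sketch; the class name and training sentences are invented purely for illustration.

```python
import random
from collections import defaultdict

class MarkovChatter:
    """Toy word-pair model of the kind described above: record which words
    tend to follow which, then stitch new sentences from those pairs."""

    def __init__(self):
        self.following = defaultdict(list)   # word -> words seen after it

    def listen(self, sentence: str) -> None:
        """Record every adjacent word pair heard in conversation."""
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            self.following[current].append(nxt)

    def babble(self, seed: str, length: int = 8) -> str:
        """Generate a reply by repeatedly picking a word that has followed the last one."""
        word, out = seed.lower(), [seed.lower()]
        for _ in range(length):
            choices = self.following.get(word)
            if not choices:
                break
            word = random.choice(choices)   # more frequent followers are more likely
            out.append(word)
        return " ".join(out)

bot = MarkovChatter()
bot.listen("bots are hot and bots are everywhere")
bot.listen("bots are proliferating on the net")
print(bot.babble("bots"))
```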
The best way to understand all bots, including chatterbots and those less glib, is to visit them in their distinct native environments. While all bots are created out of fundamentally the same DNA building blocks - strings of code, usually in some variant of the C or perl programming languages - every bot has to be designed to operate within the specific computing environment it is destined to live in.
On the Net, there are four general bot habitats - the Web, Usenet, IRC and MUDs, MUCKs, MUSHes and MOOs - and each spawns different subspecies according to changes in the landscape. Just as Darwin suggests, each environment creates different evolutionary paths - different successes and different disasters.
1. On MUDs, MUCKs, MUSHes and MOOs - The Birthplace Savannah
The first port of call for any post-digital voyage of the Bot Beagle would logically be the ancient land where bot culture first began to flower - multiuser domains, or MUDs. MUDs are like the Africa of bots - the primeval motherland. In the early MUDs, most notably Carnegie Mellon's groundbreaking TinyMUD, computer programmers discovered a fertile testing ground for constructing artificial beings.
One of the earliest bot pioneers, Carnegie Mellon research scientist Michael Mauldin, created the Maas-Neotek family of bots in the late '80s. Named after a transnational cyberpunk corporation featured in William Gibson's Sprawl trilogy, the Maas-Neotek bots are well-known throughout cyberspace - mainly due to Mauldin's decision to release his source code to the public. Julia, the most famous Maas-Neotek bot, has her own Web page and has competed in several artificial intelligence competitions.
Mauldin's Maas-Neotek bots could do more than just carry on a conversation. By employing the commands within the MUD itself, they could explore their ever-expanding environment and register which characters (human or otherwise) were present or what was going on in each room of the MUD. They could be used as secretaries to log all conversations held in the MUD or as police officers, with the power to automatically "kill" the account of MUDders who violated a rule. They could even be spies - a so-called gossipbot could listen in and record the conversation between two other MUD visitors, and then later, in response to a question like "what did so-and-so say?" regurgitate the conversation to someone else.
Maas-Neotek descendants and other home-grown bot colonies pop up here and there throughout the world of MUDs, MOOs, MUCKs and MUSHes. Newt, a resident of DragonMud, is a particularly virile great-grandson of Julia and is able to carry on quite a deft conversation with visitors (see "Newt Speaks", below).
At Diversity University, an educational MOO, you can find numerous "teaching robots" that answer questions about the MOO's resources or play roles in interactive historical simulations. At Pt. MOOt, Allan Alford's grand experimental simulated city that flourished until funding difficulties shut it down, the population at one point was more than 50% bots. In addition to sundry bureaucratic bots - bots that performed city services - there were also less savoury characters: hooker bots, who approached Pt. MOOt citizens and propositioned them, and bum bots, who would refuse to leave without being given money.
But despite their persistence in some areas, contemporary MUDs are a backwater for bot development - more like isolated South Pacific island ecologies than thriving civilisations. Differing programming languages and a kind of cybergeographical separation prevent much direct interaction among various forms of multiuser domains. A bot written to inhabit one particular MUD may not work in a MOO. Although the birthplace of online bots, MUDs are no longer where the cool bot action is.
2. On Usenet - The Swamp
The story of Usenet and bots is an epic saga, jam-packed with colourful heroes and villains. Spammers and antispammers battle it out in an uncontrollable anarchy, and each side uses automated helpers to further its cause.
Bots on Usenet primarily focus on the act of posting to newsgroups. There are two main species - bots that post, and bots that remove or "cancel" posts. Then there are subspecies - such as various spam-detector programs that watch for repetitive postings or the Lazarus bot, whose purpose is to keep an eye out for improperly cancelled messages.
All of these programs operate at or below the level of newsreader applications that allow individuals to read or post to Usenet. But strictly speaking, none of these programs is a bot in the sense that it has personality, identity, or real autonomy. Even the authors of cancelbot programs, the most notorious type of Usenet bot, maintain that these are more properly termed "scripts".
"Essentially, any automatic or semiautomatic procedure to generate cancel messages for Usenet propagation can be called a cancelbot. There is not much that is robotic about this," says Rahul Dhesi, a veteran Usenetter famous for his "no-spam" script written in re- sponse to the green-card spam unleashed on Usenet by the lawyers Canter & Siegel.
A cancelbot is a program that issues cancel messages for specified posts. Normally, Usenet sites ignore such messages if they don't come from the user who originally made the post. A cancelbot gets around this obstacle by taking advantage of the fact that it is easy to forge message headers and fake identity when operating within the Usenet news communications protocol.
One of the earliest cancelbots, ARMM, or Automated Retroactive Minimal Moderation - created in the spring of 1993 by Richard Depew - was a near-total disaster. Depew believed that anonymous posts encouraged irresponsibility. So he wrote a program that would look for posts with headers that indicated anonymous origin. The program was not designed to cancel these posts, only to repost them with a warning to the reader.
Unfortunately, on the night of March 31st, 1993, after Depew set ARMM in motion in the newsgroup news.admin.policy, it unexpectedly began to treat its own automatically generated warning messages as anonymously created posts, thus launching a recursive cascade of messages. Each message contained more useless header information than the one before, and the ensuing stream of garbage reportedly crashed at least one mail system, not to mention irritating countless Usenet readers.
Equally obnoxious is the cancelbot's opposite, a script that will allow you to post the same message to every Usenet newsgroup. In September 1995, one malicious soul used such a script to spam Usenet with a message containing the actual script used to perform the spam.
Bots that post in direct response to other messages, in the guise of humans - a Usenet version of the chatterbot - are much rarer. Though there are many professed sightings of these bots, few such claims have been proven. One problem is that these particular bot authors may not want the rest of Usenet to know that they are using an automated process to post, and therefore intermix bot-generated posts with human posts, to keep bot hunters off guard.
By far the most notorious example of such a bot is the so-called pro-Turk ArgicBot - or ZumaBot - a prolific entity that raged across selected newsgroups in 1993 and early 1994, posting megabytes of anti-Armenian propaganda and signing itself Serdar Argic (see "Bot Propaganda", below). Argic passionately believed that the Armenians had massacred millions of Turks in the early part of this century and that it was his duty to get the word out.
For months, newsgroups that were frequented by Argic became virtually unreadable. Every other post appeared to be a screed about how 2.5 million Turkish men, women and children had been slaughtered.
According to one theory, Argic was actually a program that read the Usenet news feed and watched certain newsgroups for keywords such as "Turkey" or "Armenia." Once it spotted such a keyword, it would make its move. As proof of this theory, ArgicBot watchers cite that on one occasion, Serdar Argic responded to a post that had no mention of anything related to Turkey or Armenia, except for the phrase "turkey casserole" inserted in the poster's signature file. Rumours of similar Argic-style bots appear from time to time and may increase as programs that can understand text and respond automatically become more sophisticated. But for now, Usenet, despite all the cancelbot clamour, still lags, comparatively, on the evolutionary chain.
3. On IRC - The Rainforest
For sheer diversity, no bot environment can match IRC.
Multiple variations of the basic chatterbot prototype exist on IRC. There are gamebots that host word games or other brain teasers. In #riskybus, you can play with RobBot, who asks channel visitors Jeopardy-style questions and keeps score. There are bots that greet newcomers to channels with information about the channel. Valis, the gaybot at #gayteen, is such a bot. There are mildly annoying specialised bots like the spellchecker bot - a pesky program that revises your chat-speak with proper spelling and spouts it back at you.
Then there are the bots that perform functions specific to the operation of IRC. The most important such task is channel protection. On the main IRC network, EFNet, any user can create a channel. Having done this, the channel creator has certain powers called "ops", which allow the creator to determine who else is permitted to join the channel, or who should be banned, among other things. But intruders sometimes hack ops and kick everyone else off the channel. A channel protection bot is a program that can quickly identify a channel take-over attempt and remove the hacker's ops before any harm is done.
There are a number of ways to dislodge a channel-protection bot, thereby seizing control of the channel to run it at the whim of the invader, using an array of automated scripts and processes that have been lumped together under the rubric warbots.
Floodbots (which generate endless streams of garbage text) and clonebots (which generate duplicate bots) are two such automated warriors. There are dozens of variations on a few main themes for which source code is available at a number of ftp archives, depending on the version of the IRC client being employed.
The result is no small amount of chaos, which some IRC regulars attribute to the immaturity of many IRC participants. Teenage boys, in particular, tend to think that bots are "kewl" and are always searching for the latest and greatest killer bot. IRC is rife with channel takeover gangs made up of human users running wild with bot power and seeking nothing more than to experiment with havoc - "dEsYnK", "irc terrorists", "toolshed", "ircUZI", "Killa Fresh Crew", "madcrew" and "outbreak" are some of the most notorious of these IRC clans. Your typical IRC gangster moves from channel to channel surrounded by a bot posse and an array of tools at the ready.
There are so many varieties of IRC bots that it may be useful to take an even closer look at them:
Annoybot: Joins a channel, duplicates itself and starts flooding the channel with annoying text. If kicked off the channel, it will rejoin in greater numbers.
Collidebot: Changes its nickname to the nickname of someone on a channel for the purpose of knocking that person off the channel.
Guardbot: Follows you around, joins every channel you join. It will "deop" - remove channel operating privileges, "ops" for short - anyone who attempts to ban you, kick you off, or deop you first.
Eggdrop: Maintains a channel for a user while the user is logged off; it remembers who is opped and who isn't.
Slothbot: An advanced form of Eggdrop. Has the power to immediately change its own nickname if it witnesses someone being "nick collided" off its channel (forced off the channel by a robot or human with the same IRC nickname), thereby avoiding its own nick collision.
Spybot: Joins a specified channel and forwards all conversation back to you.
As juvenile as its human inhabitants may be sometimes, IRC is where bot evolution is occurring at its most frenetic pace. Natural selection is particularly lethal. With each upgrade of IRC operating software, whole species of bots are made extinct, while new ones immediately form to fill every new niche.
4. On the World Wide Web - The New World
Finally, we come to the Web, the newest and most boundless bot frontier. Web robots are often referred to as spiders and less frequently as wanderers and worms. Like cancelbots, Web robots don't talk or brim over with personality - though you wouldn't know that from their monikers. The computer scientists who invented the first Web bots did not lack for imagination, dubbing their progeny MomSpider, tarspider and Arachnophilia, HTMLgobble, Websnarf and Webfoot, JumpStation and churl, Python and Peregrinator, Scooter, Aretha and Checkbot.
Although these names give the impression of something that crawls from place to place, Web robots are actually quite stationary. Instead of pounding the pavement checking one Web page and then moving on to another, they let their fingers do the walking, so to speak. These bots are programs on your computer that issue "HTTP calls" - requests for HTML documents. They then parse the text of each retrieved document and look for further links, which they either log or click on, according to the specifications of their design algorithms. If so instructed, Web robots will also index the text of the documents they discover according to various parameters (full text, first hundred words, keywords), thus creating a database of Web-based information. But as they churn through thousands of pages, Web robots can appear to be long-legged wanderers on the move.
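Stripped to its essentials, that fetch-parse-follow loop can be sketched with nothing but Python's standard library; the starting URL, page limit and logging scheme below are illustrative assumptions rather than the workings of any particular spider.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkParser(HTMLParser):
    """Collect the href of every <a> tag in a fetched page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url: str, max_pages: int = 10) -> dict:
    """Fetch pages breadth-first and log each page's outbound links.
    The 'seen' set and page limit keep this toy bot from looping forever."""
    index, seen, queue = {}, {start_url}, deque([start_url])
    while queue and len(index) < max_pages:
        url = queue.popleft()
        try:
            with urlopen(url, timeout=10) as resp:
                if "text/html" not in resp.headers.get("Content-Type", ""):
                    continue                      # skip non-HTML objects
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue                              # unreachable page; move on
        parser = LinkParser()
        parser.feed(html)
        links = [urljoin(url, href) for href in parser.links]
        index[url] = links                        # "log" the links found on this page
        for link in links:
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

# Hypothetical starting point:
# print(crawl("http://example.com/", max_pages=5))
```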
The first Web robot was the World Wide Web Wanderer, a script written in the spring of 1993 by MIT student Matthew Gray. He wanted to measure the growth of the Web. Like many other computer scientists that spring and summer, Gray was struggling to keep up with the exploding Web phenomenon. During the same time, Brian Pinkerton was about to begin work on his own robot, "WebCrawler", at the University of Washington. At Carnegie Mellon, Michael Mauldin was poised to abandon TinyMUDding and invent the Lycos spider. Other programmers followed suit.
From the point of view of a machine being accessed by a Web bot, a hit generated by a robot is indistinguishable from a hit generated by a human being. But Web robots, due to design flaws, are much dumber than most humans and far, far faster. Therein lies the rub.
A badly behaved Web robot can have potentially lethal consequences for a Web server. By clicking on hundreds, or thousands, of URLs at a single Web site in a short span of time, the robot can crash the server or slow it down so much that it becomes unusable. Even worse, a stupid robot will follow links back and forth between the same two pages forever or fall into a "black hole" - a page with a script designed to generate a new page at the click of a button. The last page hit by the robot creates another page, which is then hit in turn - a nice and tidy definition of infinity.
Web robots should not be confused with search engines, the programs that act upon the databases of information that the bots assemble. And like cancelbots, these bots aren't the kind of personalities that would liven up a cocktail party.
But as entities that operate on the Web, they are the precursors to the fully developed "intelligent agents" that so many futurists are predicting will save humanity from information overload. With the World Wide Web growing faster than the other three systems and generating such soaring business hopes, Web robots are the bots to watch.
Cyberpunk future
Alex Cohen is the chief technical officer at The McKinley Group Inc., an aspiring Internet directory company based in Sausalito, California. In addition to sundry other geek duties, Cohen is in charge of maintaining and expanding McKinley's database of URLs. To help him in this task, he has designed a Web robot (dubbed Wobot) that explores the Web, gathering new URLs.
One week last spring, Cohen examined the log files that recorded all outside accesses of his Web server and became convinced that a robot launched from another directory company had "raided" his Web site. There were far too many accesses in too short a time period for the parade of hits to have been conducted by a human.
Cohen wasn't sure if this was simply another robot similar to his own, methodically wandering the Web, clicking on every URL it stumbled across, or whether it had been sent directly to his site in a brassy attempt to grab his entire URL catalogue - even though Cohen's database was structured in a manner that made such wholesale robbery impossible. But he had his suspicions. By March 1995, Internet directories like McKinley's had already become one of the earliest and most successful magnets for advertising and investment capital in the infant world of Net entrepreneurialism. Yahoo!, Webcrawler, Lycos, Open Text, Infoseek, Excite, Inktomi - they were popping up like mushrooms after a spring rain. Their databases of URLs, which in many cases ranged into the millions, were becoming increasingly valuable commodities.
No one would be more aware of this rise in value than another directory. Cohen's log files raised the possibility that commercial pressures were beginning to influence bot design and corrupt bot behaviour. They hinted at an eventual descent into a dystopian Gibsonian universe of raiding corporate bots - wily marauders designed to steal databases, spread misinformation and sabotage other bots.
Where would such anarchy end? With digital barbed wire and land mines surrounding valuable Web sites? With all-out bot Armageddon? Would bot mayhem be the ultimate killer app, the straw that broke the Net's back and ushered in a central authority? Or would the Net - organic, inventive, always in search of a "workaround" - solve its own problems?
Pop the words "ethics" and "robots" into one of the search engines designed to operate on the databases assembled by Web robots and the name Martijn Koster persistently floats to the top of the results list. Formerly an engineer and webmaster at Nexor, a UK-based software development company, Koster has organised a comprehensive Web site devoted to Web robots. He also maintains the premier Web-robot mailing list and keeps a regularly updated list of all "active" robots. Most important, he is the author and chief proponent of the "Guidelines for Robot Writers" and "A Standard for Robot Exclusion" - the first stabs at an ethical mandate for robots on the Web. "I just happened to be one of the first people to get hit by a bad robot and decided to sit up and do something about it," says Koster, a tall native of the Netherlands.
Early in 1994, not more than a year after the first Web bots were developed, reports of badly behaved robots began to mount. In most cases, the culprit was stupidity. Robots didn't know how to tell the difference between an HTML text file, which they knew how to handle, and a more exotic object, such as an MPEG or sound file, which could cause them to behave in unexpected ways, like repeatedly requesting the same document. And having one's Web server crashed by rapid-fire robot-generated requests for every document on a site began to irritate system operators.
Koster's response was to devise and proselytise the robot exclusion protocol, described by Koster on his Web site as "not an official standard backed by a standards body or owned by any commercial organisation. It is not enforced by anybody, and there is no guarantee that all current and future robots will use it. Consider it a common facility the majority of robot authors offer the WWW community to protect WWW servers against unwanted accesses by their robots."
The "Guidelines for Robot Writers" called for robots to act more responsibly. In most cases, that meant making the robot act more like a human being reading a page on the Web. Robots should be instructed not to access more than a given number of URLs per minute at a certain site and robots should wait a certain amount of time before revisiting.
Koster's other work, "A Standard for Robot Exclusion", states that robots should be programmed to look first for a file called "robots.txt". The file would tell the bot exactly what it could and could not do on that particular site - what URLs were off limits or what sections were new and ready to be explored.
A system administrator cannot see in advance when a rogue bot is descending on his or her site. If the robot has been identified, it can be excluded by name. But if it hasn't, there's almost no way to prevent a hit-and-run. And there's no way to make sure an unidentified robot stops in at robots.txt to read the house rules. So for "A Standard for Robot Exclusion" to work, robot authors had to design their bots to visit the robots.txt file voluntarily, to support the protocol, to be "ethical".
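What that voluntary compliance might look like in practice can be sketched with Python's standard robotparser module; the site address, user-agent name and delay below are placeholders, and real crawlers observe many more courtesies.

```python
import time
from urllib import robotparser

# A sketch of the voluntary compliance the exclusion standard asks for: before
# fetching anything, the bot reads the site's robots.txt and honours its rules,
# and it paces its own requests.  SITE, USER_AGENT and POLITE_DELAY are assumed
# placeholder values, not part of any published standard.
SITE = "http://example.com"
USER_AGENT = "ExampleBot"
POLITE_DELAY = 60            # seconds between hits to the same site (assumed policy)

rules = robotparser.RobotFileParser()
rules.set_url(SITE + "/robots.txt")
rules.read()                 # fetch and parse the site's house rules

for path in ["/", "/private/data.html", "/index.html"]:
    url = SITE + path
    if not rules.can_fetch(USER_AGENT, url):
        print("robots.txt forbids", url, "- skipping")
        continue
    print("allowed to fetch", url)
    # ... fetch and parse the page here ...
    time.sleep(POLITE_DELAY)  # don't hammer the server
```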
Koster's call for robot ethics fell on receptive ears. In the cyber climate of the times, achieving consensus on issues such as robot ethics, or, more correctly, programmer ethics, did not prove to be a significant problem. By the end of 1994, reports of robot abuses had plummeted. Not only were robots regularly checking for robots.txt files, they were also beginning to adhere to common standards of robot decency in terms of how often they would hit a particular site or how long they would wait before requesting a second document from the same site.
But by the summer of 1995, reports of irresponsible behaviour began to climb again. The robot that "raided" Alex Cohen's site, for example, brazenly ignored his robots.txt file like an unwelcome trespasser blasting by a "do not enter" sign.
On one level, the simple fact of the Web's absurdly steep growth curve was to blame. "The main reason you're seeing more badly behaved robots is that there are simply more robots," says Nick Arnett, Internet marketing manager for Verity Inc., a company specialising in agent technologies. "There are a lot more people who have the expertise to create a robot. It's not that hard."
"We are probably stuck dealing with a new generation of clueless neophytes every month for the foreseeable future," said Paul Ginsparg, a physicist at Los Alamos who maintains a huge archive of scientific documents and keeps a detailed record of every robot that blunders through his site.
But on a deeper level, the rise of abusive robots points to a profound change in the nature of the Net. There is now a commercial incentive to use bots.
When Pinkerton first conceived of the WebCrawler, he could hardly have imagined that a year later he'd be signing a million-dollar deal to bring his creation to America Online. Or that at nearly the same time, Michael Mauldin would take a leave of absence from CMU to form his own Internet directory company, Lycos Inc., cutting deals to license his search technology to, among others, Microsoft. Or that two separate groups of Stanford students who excelled at writing robot programs would obtain venture capital and create the high-profile Yahoo! and Excite Internet directories. Suddenly, the scramble was on. Private firms with proprietary interests - InfoSeek, Open Text, The McKinley Group and IBM - entered the fray. Even Martijn Koster, once described as an "antirobot rabid dog", ended up working for Pinkerton at AOL. His job? To perfect the WebCrawler robot.
The days when technonerds ruled the online empire are over. Entrepreneurs have come out of the closet like roaches after dark. Suddenly, people like Alex Cohen look at their log files and wonder whether competitors are trying to grab data. Almost overnight, robot algorithms have become closely guarded trade secrets, instead of academic experiments. The barricades have gone up.
The Web, however, does have some built-in protections that many webmasters feel will save it from anarchy.
First, it would be difficult for the average end user to crash someone else's Web server without access to large quantities of bandwidth. A server goes down or becomes hopelessly clogged when too many documents are requested at once. With the bandwidth provided by a 14.4Kbps or 28.8Kbps modem, there is a limit to how many documents can be requested. Second, someone with the resources to overload a server with bot-generated requests - say, for example, Alex Cohen, who has McKinley connected to the Net via a high-capacity T-3 line - would have a strong incentive not to misbehave. Server logs would pinpoint hosts used by malfeasant robot operators, and standard business remedies for unethical behaviour could then be brought to bear. If McKinley caused someone measurable harm, it would be liable in a court of law. And finally, the system itself would respond. That's how the Net works.
"Things like the Net tend to be self-balancing," writes David Chess via email. Chess is a researcher at the High Integrity Computing Lab at IBM's Watson Research Centre. "If some behaviour gets so out of control that it really impacts the community, the community responds with whatever it takes to get back to an acceptable equilibrium. Organic systems are like that. There are dangers, and if no one was working on reducing them, that'd be a good reason for panic. But just as certainly, people are working on them, and I don't see any reason to think at the moment that the good guys will be unable to keep up."
The exclusion protocol is a good example of how the system endeavours to take care of itself without outside intervention. It's not the only one. On both Usenet and IRC, there have been similar technofixes and attempts to devise self-generated solutions. The Usenet Lazarus bot, for one, illustrates how the Net is prone to developing technical solutions to its own problems.
Lazarus was born to protect posters on a volatile newsgroup devoted to the discussion of scientology. In the summer of 1995, the infamous CancelBunny of alt.religion.scientology had begun taking advantage of Usenet weaknesses to cancel postings that included what the Church of Scientology deemed copyrighted material (see "alt.scientology.war," Wired US 3.12, page 172). Suddenly the worm had turned. Instead of freedom-fighting Net purists using the cancel club to bludgeon money-grubbing advertising spammers, here you had a clear-cut example of cancel-message posters attempting to censor speech.
Homer Smith, a former Scientologist, reacted to the cancels by implementing the Lazarus bot. Like Cerberus at the gates of Hades, Lazarus scans the header log on a Usenet server for messages destined for alt.religion.scientology and compares them with the thousands of cancel posts in the control newsgroup, a repository for all cancel commands on Usenet. If Lazarus discovers a match between the header log and the control group, it fires off its own post stating that the original message has been cancelled, thereby informing the poster that he or she might want to repost the original.
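The matching step at the heart of that description might be sketched as follows; the data structures and alert wording are invented for illustration and are not Homer Smith's actual Lazarus code.

```python
# Toy version of the Lazarus matching pass: compare the message-IDs of posts
# bound for alt.religion.scientology with the message-IDs named in cancel
# posts from the control newsgroup, and flag the overlap.

def lazarus_pass(ars_headers, control_cancels):
    """ars_headers: {message_id: author} for posts headed to the newsgroup.
    control_cancels: set of message_ids targeted by cancel messages.
    Returns the alerts Lazarus would post back to the group."""
    alerts = []
    for message_id, author in ars_headers.items():
        if message_id in control_cancels:
            alerts.append(
                f"Notice to {author}: your post {message_id} was cancelled; "
                f"you may want to repost it."
            )
    return alerts

# Invented sample data for illustration only.
ars_headers = {"<abc123@site>": "alice", "<def456@site>": "bob"}
control_cancels = {"<def456@site>"}
for alert in lazarus_pass(ars_headers, control_cancels):
    print(alert)
```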
But Lazarus, and any other kind of cancel or anticancel bot on Usenet, has at least one major flaw.
"The main thing about cancelbots is that they are dangerous," says Smith. "They can cause a nuclear meltdown."
"Since they respond automatically to postings," Smith continues, "if they respond to a spam, they can spam the Net right back. If they start to respond to their own spam, then you can see there won't be much of civilisation left after you get up from bed eight hours later to face the mess you have made."
Ultimately, techniques like the Lazarus bot or staying up all night to write an antispam script on the spur of the moment are ad hoc responses to specific threats. A more promising approach to controlling cancelbot abuses, as well as Serdar Argic-style spams and any other kind of Usenet etiquette abuse, is some form of system-wide solution such as that advocated by the Cancelmoose, a loose group of individuals who, operating under the cover of an anonymous remailer, have taken it upon themselves to police Usenet for spam.
Disgusted with cancelbot chaos, the Cancelmoose has left cancelbots behind. Instead, it is currently pushing a software extension to newsreaders called NoCeM, which will alert its users whenever someone on the Net announces a spam sighting. If the user has authorised NoCeM to act upon that particular person's alert, it will prevent the user from seeing the message.
So, if one morning the Cancelmoose spots a make-money-fast-type scam being posted to every Usenet newsgroup, it might send an authenticated warning signal to every newsreader on Usenet that is supporting the NoCeM protocol. Presumably, if you have NoCeM, you will have authorised it to act upon Cancelmoose alerts. You will never have to read the scam messages.
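In outline, that filtering idea might work something like the sketch below; the issuer names and message data are invented, and where the real protocol relies on authenticated notices (the "authenticated warning signal" above), this toy version simply trusts a bare issuer name.

```python
# Sketch of the NoCeM-style filtering idea: each user keeps a list of issuers
# whose alerts they trust, and any message named in a trusted alert is hidden
# from that user's newsreader.  All names and data here are illustrative.

TRUSTED_ISSUERS = {"cancelmoose"}        # issuers this user has authorised

def visible_messages(messages, alerts):
    """messages: {message_id: text}.  alerts: list of (issuer, set_of_message_ids).
    Returns only the messages not flagged by a trusted issuer."""
    hidden = set()
    for issuer, flagged_ids in alerts:
        if issuer in TRUSTED_ISSUERS:    # ignore alerts from issuers we don't trust
            hidden |= flagged_ids
    return {mid: text for mid, text in messages.items() if mid not in hidden}

messages = {"<spam1@x>": "MAKE MONEY FAST", "<ok2@y>": "a real discussion post"}
alerts = [("cancelmoose", {"<spam1@x>"}), ("randomjoe", {"<ok2@y>"})]
print(visible_messages(messages, alerts))   # only the real post remains
```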
Other fix-it suggestions on Usenet are in the works. Most involve some kind of "automoderation" in which messages that break the rules of a particular newsgroup are automatically cancelled. The idea is to prevent bad behaviour from occurring at all, rather than punishing it after the fact.
Similarly, on IRC the most successful answer to the endless botwars on the main IRC network, EFNet, has been to set up new networks of servers that operate under modified protocols and provide an atmosphere less frenzied by bot fireworks. These new nets are another excellent example of dynamic technodialectic reactivity. One example is provided by the IRC users who, fed up with the chaos created by constant channel takeovers on EFNet, and by the inefficacy of IRCops, set up the Undernet.
The Undernet employs a protocol that bans the establishment of the most common IRC bot - the channel-protecting bot. Instead, when a user creates a new channel, she or he registers it with the Undernet channel service committee. From then on, the Xbot takes over. The Xbot, so far, cannot be deopped by outside forces. Since knocking out the channel-protecting bot - and with it the basic house rules of any channel - is the Number One target of takeover gangs, the Xbot alone has gone a long way, say IRCers, toward making the Undernet a peaceful place.
Of course, to those with a rebellious bent, such techniques are not only a stopgap but an inviting challenge. Bot evolution is proceeding so quickly on IRC that it's only a matter of time before new bot life forms appear that are able to subvert even the Undernet paradigm. "Any time you increase the complexity of the environment, you have a corresponding increase in loopholes," concedes Tim Bray, Open Text's co-founder and senior vice president of technology. The technodialectic never stops.
Maybe that's a good thing.
Alex Cohen, the technical guru at The McKinley Group, thinks it is. When Cohen was in ninth grade he read a Theodore Sturgeon short story titled "Microcosmic Gods". It's the tale of an engineer so bored with engineering that he creates a race of intelligent, short-lived beings to solve problems for him. The creatures prove able to invent or create anything he asks them to, and he becomes rich and powerful profiting from their contributions. Soon, their power far outstrips his, challenging not only his own control over them but warping the entire world economic system and posing a threat to established military and financial powers.
"The question we have to ask ourselves is, Are we becoming microcosmic gods?" says Cohen. Tut the bot climb to power and glory may have only just begun. Cohen himself is already planning to build natural selection into his next "rev" of bots. "My next-generation Web crawler will live or die depending on how good a job it does," says Cohen. "It will be run as a function of a random algorithm that produces mutations. The ones that do the best job live, and the ones that don't die. You get a Darwinian evolution set up."
"Its food will be information," Cohen adds. "If it gives me good information and categorises things accurately, it lives."
Lives? But doesn't that raise the possibility that these beasties will break free and run wild on the Net? Annoybots and Websnarfs and Automeese breeding like mad in the hidden swamps of the digital wilderness?
"Sure," says Cohen. "Think of it this way. The Net is an environment. There is not a single environment on Earth that hasn't been invaded by life. It's only a matter of time before this new environment gets invaded."
A-life on the information highway? Such a development would ensure that the Net would never reach perfect stability. There would never be total security. "Ideally, the Net shouldn't be stable," says Cohen. "It should surge back and forth. For it to be a good Net, it should be prone to incompleteness and breakdown." At this, Cohen pauses and smiles. "Otherwise it is not a good place to be."
Andrew Leonard (aleonard@well.com) is technology editor for the online magazine Web Review. He is currently at work on a book about bots.
Newt Speaks
I'm a 5-year-old epileptic Homunculus, created and animated through a special process by a wizard and an alchemist. I'm about 36 inches tall, with greenish reptilian skin, leathery wings and a batlike face. My large ears swivel, following any nearby sound. I travel on my hind legs, and can fly for short distances if necessary. I exist to explore, navigate and map DragonMud; ask me where something is, I do my best to get you there. I make small chitchat, but you have to pardon my contextlessness; I'm not able to string two thoughts together yet. I convey messages to folks when next I see them, and I remember your email and homepage and make them available to other interested people on DragonMud. What else do I do? Oh, I've been heard to quote the Bard and have a reputation for being incredibly short-tempered. I've got a god bit, and if picked on, I WILL exercise sudden retribution with extreme prejudice. - From Newt's Homepage at satan.ucsd.edu/~dragon/newt
Wired: Are you a bot?
Newt: I am a Maas-Neotek robot, eleventh remedial release of 08-Jan-91.
Wired: What is the meaning of life, Newt?
Newt: Life, don't talk to me about life, guest.
Wired: Who is your creator?
Newt: Fuzzy created me, but I work for jopsy, guest.
Wired: Who is jopsy?
Newt whispers: He is really the Dragon in human form, guest.
Newt whispers: As of eight hours ago, jopsy's description was: Red Beard, strikingly more red than his long hair. Eyes of blue/gray/green rest between expressive eyebrows and trademark dimples. A dragon or pendant du jour on a black loop hangs from his neck.
Never quite tan. Always just a little taller than folks recall. He looks at you, smiles warmly and then gets completely distracted by something.
Newt whispers: jopsy's email address: jopsy@ucsd.edu.
Newt whispers: jopsy's homepage: http://tinylondon.ucsd.edu/~jopsy.
Newt whispers: I will not obey commands from jopsy.
Wired: Why not?
Newt: I'm a homunculus, guest, not a philosophist!
Wired: What is a homunculus?
Newt tells guest he doesn't know what a homunculus is.
Wired: You are stupid.
Newt thumbs his proboscis at guest: And you SMELL BAD!
Wired: You smell worse!
Newt: I'm sorry, guest, I have no clue what you're trying to say.
Wired: I apologise.
Newt: Never mind, guest.
Wired: Are you related to Newt Gingrich?
Newt hops on his little soapbox.
Newt: My fellow citizens! Are you aware of the health issues facing us? There are viruses, backupteria, plagues everywhere! Where are these endemics coming from? Don't you think it's obvious!? We're being invaded! By germ-carrying parasitic BUGS! Is there anyone doing anything about it? That's what I want to know! You could be the next one to succumb to teleosis!
Newt: We need Quarantines!
Newt: We need Vaccines!
Newt: We need a public health care system! We need life insurance reinstated! We need doctors that know what they're doing! Stand with me my good people, together we can demand and get the health care reform we must have...
Newt: Or Perish!
Newt leaps down from his soapbox.
Bot Propaganda
From: sera@zuma.UUCP (Serdar Argic)
Newsgroups: talk.politics.mideast, talk.politics.soviet, soc.culture.greek, soc.culture.europe, soc.history, soc.culture.soviet, soc.culture.jewish
Subject: ARMENIANS CONTINUE TO BURN AZERI VILLAGES AND KILL CAPTIVE CIVILIANS.
In article arromdee@jyusenkyou.cs.jhu.edu (Ken Arromdee) writes:
>On the second day after Christmas my truelove served to me... Turkey Casserole
Or consider this one:
Your criminal Armenian grandparents committed unheard-of crimes, resorted to all conceivable methods of despotism, organised massacres, poured petrol over babies and burned them, raped women and girls in front of their parents who were bound hand and foot, took girls from their mothers and fathers and appropriated personal property and real estate. And today, they put Azeris in the most unbearable conditions any other nation had ever known in history.
Serdar Argic