F E A T U R E S    Issue 1.06 - October 1995

IPhone

By Fred Hapgood



Every form of human communication, from body language to improv, can be sorted into categories of real time or store-and-forward. In real time, both ends of the communication are chronologically tuned, meshed, and running on the same clock. In store-and-forward, sender and receiver are talking from different timescapes. Writing a letter is store-and-forward; talking on the street is real time. Making love is real time while the sexual transmission of genes is an extreme case of store-and-forward (the communication having been written over a billion years and forwarded in seconds). Store-and-forward represents a virtual IQ booster, since more time is invested in creating the message than consuming it, making the sender seem smarter than the receiver (often just what the receiver wants). Real time draws its powers from direct access to consciousness. It's messy, domestic, impulsive, subliminal, warm, fast, and emotional; it's the medium of gossip, anger, and intimacy. Store-and-forward is formal, highly conscious, hierarchical, cool, slow, and concentrated - the perfect medium of control.

For reasons that run straight to its technological roots, the Internet has always been a store-and-forward medium organised around e-mail, newsgroup postings, ftp sites, and Web pages; its primary focus is on content and efficiency. From the point of view of this culture, the fate of the Net hangs on the battle between "clueless newbies" drowning the Net in a flood of meaningless babble and the in-groups who understand the importance (as much spiritual as practical) of not wasting bandwidth.

Still, there never was a wall that nobody wanted to break down. For at least 10 years, hobbyists have been experimenting with the idea of bringing the telephone, the core real-time communications technology, to the Net. Last spring, several programs were released to support this concept. These programs did not represent some half-and-half integration with conventional telephony, but rather an entirely parallel and independent system. If Alice in New York wanted to speak to Bob in Los Angeles, she now had a choice of either dialing his number or clicking on his name. If she used the Net she would speak into a microphone instead of a conventional handset, whereupon her speech would be sampled, digitised, compressed, turned into packets, and piped through her access provider and over the Net to Bob's computer, just like a text file or a program. Bob's system would then gather up Alice's packets and reconstitute her speech through a loudspeaker or headphones.
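The journey Alice's voice takes can be sketched in a few lines of Python. This is only an illustration of the sample, digitise, packetise, and reassemble steps, not VocalTec's code: the 8,000-samples-per-second rate, the 160-sample packet size, the one-byte quantisation, and every function name are assumptions, a pure tone stands in for the microphone, and the compression step is left out entirely.

```python
import math
import random

SAMPLE_RATE = 8000      # samples per second (assumed; telephone-grade audio)
PACKET_SAMPLES = 160    # samples carried per packet (assumed; 20 ms of sound)


def sample_tone(freq_hz, seconds):
    """Stand-in for a microphone: sample a pure tone as floats in [-1, 1]."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def digitise(samples):
    """Quantise each sample to a single unsigned byte (0..255)."""
    return bytes(min(255, max(0, round((s + 1.0) * 127.5))) for s in samples)


def packetise(pcm):
    """Split the byte stream into (sequence_number, payload) packets."""
    return [(seq, pcm[start:start + PACKET_SAMPLES])
            for seq, start in enumerate(range(0, len(pcm), PACKET_SAMPLES))]


def reassemble(packets):
    """Bob's side: sort packets by sequence number and join the payloads."""
    return b"".join(payload for _, payload in sorted(packets))


if __name__ == "__main__":
    pcm = digitise(sample_tone(440, 0.1))
    packets = packetise(pcm)
    random.shuffle(packets)            # the Net may deliver packets out of order
    print(reassemble(packets) == pcm)  # the speech survives the reshuffling
```

The sequence number is what lets Bob's machine put the speech back in order however the packets happen to arrive.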

This technology was more than well received; it was as if a dam had broken. VocalTec Inc. - a company based in Israel but with operations in Northvale, New Jersey - reported 150,000 downloads of its "Internet Phone" demo (aka the IPhone) in a three-month period. Despite some fairly stringent hardware constraints (the program needs a fast 486, 8 Mbytes of RAM, a 16-bit sound card, and a SLIP or PPP connection), the reaction was intense, if somewhat mixed. While tens of thousands of users eagerly downloaded the program as soon as it became available, others, like one member from alt.irc, complained, "What the hell do you think the Internet is for? It isn't a replacement for radio, TV, and telephones. It's for exchanging information, not free phone calls." Another commented: "This is not new technology, it is taking an idea that has been around for a long time (read telephone) and putting it on an infrastructure that was meant for something totally different." From a store-and-forward perspective, the Internet was a new chapter in the history of human intelligence. It was supposed to lead us somewhere higher, to something better. That some company could then turn this spiritual adventure into another vehicle to support eighth-grade schoolgirls babbling about Luke Perry was nothing short of criminal. Yet all this grousing delayed the spread of the technology only a few hours.

The first group of IPhone customers found each other through Internet Relay Chat (a chat-by-typing forum on the Net), the nearest the Internet comes to mass real-time communication. Users log on to a server for a list of topics to join. After signup, they get access to members worldwide already hooked into these chosen topics, delivered by a global network of servers. The user can then pick any name from the list and call for a "chat." Those particularly interested in using the IPhone simply join the IPhone topic and click on a name. The program takes it from there.

A few weeks after VocalTec's début, IRC network administrators, to general newsgroup applause, began installing patches to prevent VocalTec clients from "interrogating" the IPhone feature. (A conflict of commercial values was one reason given for the action: "We get SPARE cycles on computer systems to run these servers; VocalTec makes money from this," one sysop complained.) IPhone users barely missed a beat. Jeff Pulver, a New York infoworker, quickly installed a new IRC server dedicated exclusively to the application. Soon after, VocalTec added a second and began automatically connecting its clients to the new network. (There are now 14 IPhone servers.) Once again, the floodgates were open to the chattering hordes.

Events during the last few months suggest those gates are open to stay. Camelot Corp., of Dallas, expected to release the DigiPhone in August. (According to project vision director Kevin Corson, the product will be a general-purpose Web browser that will include Internet telephony along with the standard browser features.) The Electric Magic Company of San Francisco is selling a Macintosh client called the NetPhone, and VocalTec has recently upgraded its software to full duplex (the original version was half duplex, giving it a distinctly CB radio feel), as well as signing bundling agreements with Motorola Inc. and Crystal Semiconductor Corp., a leading manufacturer of sound-card chips.

Just as the high priests of the store-and-forward culture had warned, the introduction of the Internet Phone changed the look and feel of the Net. Night after night I fired up the IPhone and connected with callers from around the world - callers who, with no long-distance bills to worry about, had no other purpose than to be neighbourly. You could hear them lean over the backyard fence, hold out their hands, and introduce themselves. Not once did I hear anyone assailed for being a "clueless newbie" or stigmatised for wasting bandwidth or told to "Get a life." I hardly recognised the place.

The media reaction has been as dramatic: according to VocalTec's PR firm, 1,000 stories were published about IPhone in the first month of its existence. For a moment the offline media stopped reheating their leftover Internet reportage about nerds, spies, hackers, porn merchants, mad bombers, and child molesters and struggled to write what was for many their first serious Internet news stories. "The Internet, now a boon for telephone companies, could well become their bane," The Wall Street Journal ruminated gravely. "If I had stock in the telephone company," the Journal quoted one user, "I would sell it."

The reasoning here is easy enough to follow. A prime-time call from New York to Los Angeles is about £10 (US$15) per hour; from New York to Paris it runs about £40 per hour. Many countries bill originating long-distance calls at even higher rates. (Howard Jonas of IDT Communications, a call-back services company, quotes the price of a call from Tahiti to France at £180 per hour.) Internet access costs no more than £2 to £3 per hour, often much less. Many access services offer hundreds of hours of 28.8-Kbps modem connections for as little as £13 to £16 a month. So, we're talking about making international phone calls for £2 to £3 per hour. The arbitrage prospects seem to be on the same order as dealing cocaine - only legal.
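The size of the gap is easy to work out. The sketch below uses only the hourly rates quoted above; the route labels and the function name are mine, and the flat-rate monthly plans are left out because the number of hours they include isn't pinned down.

```python
# Hourly rates in pounds, as quoted in the text above.
TELCO_RATES = {
    "New York to Los Angeles": 10,
    "New York to Paris": 40,
    "Tahiti to France": 180,
}
INTERNET_RATE_RANGE = (2, 3)  # pounds per hour of Internet access


def markup_range(telco_rate, internet_range=INTERNET_RATE_RANGE):
    """Return the (lowest, highest) ratio of telco price to Internet price."""
    cheap, dear = internet_range
    return telco_rate / dear, telco_rate / cheap


if __name__ == "__main__":
    for route, rate in TELCO_RATES.items():
        low, high = markup_range(rate)
        print(f"{route}: {low:.1f}x to {high:.1f}x the Internet price")
```

On these figures a New York to Paris call costs roughly 13 to 20 times the Internet price, and the Tahiti to France call 60 to 90 times.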

These cut-rate opportunities are not new: Internet communications have always been cheaper than conventional telephony, partly because of their different political and industrial environments, partly because Internet architecture is inherently more efficient. Conventional telephony establishes circuits that run the whole distance between callers and are reserved for the duration of the call. On the Internet, files are broken up into hundreds or thousands of packets and then mixed with packets from dozens of other connections. Internet communications share resources, which lowers costs. For years, techies, especially techies with families in distant countries, have been trying to circumvent telco pricing, but until recently, none of these lash-ups have worked especially well.

The brute-force solution to Net telephony is to record an audio file, e-mail it to someone, wait for that person to send one in return, play it, and so on. (Internet VoiceChat, a shareware program released in May 1994 and since discontinued, did basically that.) This store-and-forward system offers any desired level of audio quality - theoretically at CD levels - and works at any bandwidth, but you pay for this higher quality with significant response delays. These variable delays are acceptable in some channels, like e-mail or the postal system, but apparently not with voice communications. It was a response delay of only a few hundred milliseconds that forced long-distance users off geosynchronous satellites and onto undersea cable. Response delay is what people hate most about speakerphones. Though silences appear to be semantic vacuums, they are also read as communicating almost any emotional state, from engaged interest to puzzlement to boredom to contempt to rejection. Imposed delays only increase the burden of processing a conversation by dragging silences that really don't mean anything through the sensitive detector. Two people interacting by voice set up a dance, a beat, that starts the centre of the conversation swinging like a pendulum. They think as much like one person maintaining an interior dialogue as two. They are joined at the clock. Response delays unhinge that delicate relationship.

Research into Internet telephony has focused on finding tolerable ways to trade bandwidth and signal quality against response delay. The most intractable part of the problem runs to the core of what makes the Net so cheap in the first place: being able to break a file into packets and intermingle them with packets from many other communications. By mixing communications like this, there's always the risk that the packets of different communications will interfere with each other: a given packet might run into a region clogged with packets from another source or find itself stuck behind a long line of packets in need of extra routing time. Either way, the packets are delayed, which means the file has to sit and wait instead of heading straight to the sound card. Using a conventional phone is like transporting a group of cars cross-country by train rather than having a group of college students drive them. Driving might be cheaper, but shipping means that the cars will arrive together on time.

Each of the various Internet telephony programs announced this year has mounted its own attack on response delay. VocalTec's is clearly the program to beat, using fast processing to bombard the problem with machine cycles. When a burst of packets arrives at an IPhone client, the program lines them up and compiles a list of the gaps. If too many packets are missing (more than about one in five), the connection is scrubbed. If the number of missing packets seems manageable, the packets lying on either side of the no-shows are used to interpolate substitute packets, which are slotted into the right positions. The signal can then be pushed into the sound card and out through the speaker without waiting for the delayed packets to arrive.
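VocalTec has not published how its interpolation works, so the following is only a guess at the general shape of the idea. It assumes each packet decodes to a short list of sample values, marks a no-show as None, and fills each gap with a straight-line blend of the nearest packets that did arrive. The function name, the linear blend, and the exact handling of the one-in-five threshold are all assumptions.

```python
def conceal_losses(frames, max_loss=0.2):
    """Fill missing frames (None) by blending the nearest received neighbours.

    frames   -- list of frames in playback order; each frame is a list of
                floats, or None if its packet never arrived
    max_loss -- give up if more than this fraction is missing (assumed 1 in 5)
    """
    if not frames:
        return []
    missing = [i for i, frame in enumerate(frames) if frame is None]
    if len(missing) / len(frames) > max_loss:
        raise ConnectionError("too many packets missing; connection scrubbed")

    received = [i for i, frame in enumerate(frames) if frame is not None]
    repaired = list(frames)
    for i in missing:
        before = max((j for j in received if j < i), default=None)
        after = min((j for j in received if j > i), default=None)
        if before is None:            # gap at the very start: copy what follows
            repaired[i] = list(frames[after])
        elif after is None:           # gap at the very end: repeat what preceded
            repaired[i] = list(frames[before])
        else:                         # weight each neighbour by its closeness
            w = (i - before) / (after - before)
            repaired[i] = [(1 - w) * a + w * b
                           for a, b in zip(frames[before], frames[after])]
    return repaired


if __name__ == "__main__":
    burst = [[0.0, 0.0], None, [2.0, 4.0], [3.0, 3.0], [1.0, 1.0]]
    print(conceal_losses(burst)[1])   # the substitute for the no-show
```

The point of the design is that nothing here waits on the network: the substitute is built from packets already in hand, so the sound card never stalls.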

From the perspective of those who know how hard it is to run real-time voice over packet routing - who are amazed to see this dog playing chess at all - the program works well. Higher frequencies seem even more faithfully preserved than over some conventional phone connections, giving speech a warmer, closer feel. However, anyone expecting Internet telephony to become a seamless replacement for telco-supported calls is likely to feel that the technology still has a road to walk. Using a 14.4-Kbps modem, only about 80 per cent of the calls get through, and during some of these, the speech might break up (gaps appear in the phonetic structure) or may sound choppy. Because the contrast ratio between successive phonemes seems slightly lower with IPhone than with regular phones, low-redundancy information like a string of digits or a URL may need to be repeated two or three times. And IPhone technology uses voice activation, which cuts off those low-volume cues that create the "atmosphere" of an ordinary phone call (you can't listen to people "listening" anymore). The final hurdle is still that second or two delay, which only adds to the distraction.

For the short run, the Internet telephony market seems to be defined by low-cost applications that can tolerate less than 100 per cent reliability. And the accompanying suite of apps is potentially large: since Internet telephony is end-to-end digital, all the "integrated services" (represented by the first two letters in the much-vaunted, yet-to-be-delivered ISDN service) can be ported into the technology. Before the year's out, many Internet telephony programs should support conference calling, call answering (caller ID is built in from the beginning, so answering systems can be targeted to specific users), voice-data integration (two or more callers can play a game or mark up a text while discussing it vocally), Web site setups (users can call one another from Web sites instead of using IRC-type networks), and other interesting tricks like voice avatars, which allow you to design a voice you like. Probable Internet markets include Internet ham radio, casual social and family communications, event telephony (using an open mike during a meeting so those not present can still participate), games (improved by a real-time voice link), Web tours, talk radio, Web shopping, and "hoot-and-holler circuits." Right now, hoot-and-hollers act like wide-area intercoms for companies that want to alert people in a field to important developments. The lower costs mean that public spaces can be defined at will. For instance, a group of artists might want to connect their respective studios into a group atelier.

These applications do not raise an obvious threat to telco revenues, since they represent new business. The size of the "potential" leak in the tire of conventional voice revenues, however, is dependent on three factors - all of them unpredictable. The first is how willing users will be to trade reduced reliability and sound quality for price. We should know the answer to this toward the end of the year, when some commercial Web sites should be offering users the choice of connecting to their sales and customer service departments using either a conventional phone or one over the Internet. (This is particularly significant as the overseas markets figure heavily in Internet commerce: 30 per cent of the Internet Shopping Network's customer base lies outside the US.)

The second factor is Net bandwidth - not end-user or access-provider bandwidth, but backbone or service-provider bandwidth, available at network exchange points. Will it increase with demand or simply choke? On one hand, the Net is awash in investment, and the technology required to increase Net capacity up to tenfold (for instance, by replacing 1.5-Mbps T-1 technology with 45-Mbps T-3 connections) is well understood and widely available. On the other hand, the spread of 28.8-Kbps modems, ISDN installations, video- and teleconferencing, graphics-intense Web surfing, tens of thousands of Internet users participating in voice chat, voice-enhanced games, and all-day open phone links between friends suggests that several seasons of Net gridlock lie ahead. To some extent this is a self-correcting problem: as described earlier, real-time communications on a packet-based network are very sensitive to congestion. As the Net tightens up, these applications will be the first rendered unusable (quite rightly, the store-and-forward crowd might grumble). File-oriented Web surfing will just slow down.

The third factor affecting voice revenues is how fast the technology improves, including the penetration of support technologies like faster modems, ISDN lines, the redesign of corporate security systems to transfer telephony packets, and better speech compressors and telephony-optimised sound-card drivers. These developments should go hand in hand with the routine incorporation of telephony into signal-processing chips, multimedia computers, sound cards, modems, and browsers. Camelot's Kevin Corson points out that it's possible to patch conventional and Internet telephony together such that a person could dial an Internet access provider in his or her local calling area and dial in a city code. The provider could then set up a connection with another provider in that city - all for the cost of a local call plus change. Using area Internet providers in this way allows the caller to bypass the long-distance circuits completely. A technology like this may well capture a significant slice of business.

If all these pieces fit together, the telcos might indeed start sweating over lost revenue. Industry analysts have speculated that if the telcos come under fire, they'll respond by trying to persuade governments to ban Internet telephony altogether. And, let's face it, any foolishness is possible when dealing with a legislature. (IPhone customers from Ireland, Switzerland, and South Africa have reported disapproving noises from their state telecommunications monopolies.)

Doug Ashton, a telecommunications analyst at Hancock Institutional Equity Services of Boston, believes the arrival of Internet telephony is only likely to push the telephone companies a little faster in a direction they want to go anyway - the Internet-access business. Danny Briere, president of telecom industry consulting company TeleChoice Inc. in Verona, New Jersey, suggests the telcos might even create an alternate Internet, one that bills by traffic instead of bandwidth. They would then market this alternative as the "uncongested" Internet where applications like Internet telephony could reach their full potential. Paradoxically, Internet telephony might give the telcos just the leverage they need to take control of the Net. And Ashton finds it hard to see how the Internet can become a consumer base with hundreds of millions of users without the involvement of the phone companies.

A little-noted but exceptional feature of Internet telephony technologies is that the connections established by these devices range from difficult to - with encryption add-ons - flat-out impossible to tap. Filtering all the packets for a specific conversation out of a router-based network is doable, but nowhere near as straightforward as a conventional phone tap. VocalTec's voice compression algorithms require session-specific keys to uncompress the packets. These are generated and exchanged in an instant at the beginning of each session: if a tapper misses these, he has at least made a lot of extra work for himself. And by the end of the year, a wide range of programs, both commercial and shareware, including one from Phil Zimmermann, will be available to support public key encryption, the most secure code around.

In theory, governments could simply ban encryption, but there's also another way to protect communications. "The government believes the real threat comes from encryption, but they're wrong," says Kevin Hales of Cogon Electronics Inc. in Warrenton, Virginia. (Cogon produces the AquaFone, a highly secure phone system that connects modem-to-modem rather than over the Internet.) "The enabling technology is speech coding, or just data compression in general," says Hales. Data sufficiently compressed, he says, "can be hidden such that classic encryption isn't even required." Hales suggests the best way to ensure a completely secure conversation is to scatter the bytes carrying the voice through a completely unrelated audio or video file, thus hiding the fact that the conversation is going on at all. Computers at both ends would filter the "real" communication from the packaging, but anyone tapping the line would hear only (for example) two old friends playing a duet. Such a scheme defeats the key escrow idea because of the difficulty in identifying which communications are even encrypted and which should be required to produce an escrowed key.
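Hales doesn't say how the scattering would be done. One textbook way to do it, offered here purely as an illustration and not as Cogon's method, is least-significant-bit embedding: each bit of the hidden message replaces the lowest bit of one sample in the cover recording, changing that sample by at most one step out of 256. The sketch assumes 8-bit cover samples held in a Python bytes object, and the function names are invented.

```python
def hide(cover, secret):
    """Embed the bits of `secret` in the lowest bit of successive cover samples."""
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover recording is too short to carry the message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # overwrite only the lowest bit
    return bytes(stego)


def reveal(stego, n_bytes):
    """Read back `n_bytes` of hidden data from the lowest bits of the samples."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for k in range(8):
            value = (value << 1) | (stego[b * 8 + k] & 1)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    duet = bytes(range(256))                 # stand-in for the cover audio
    carrier = hide(duet, b"hello")
    print(reveal(carrier, 5))
```

A real scheme would also scatter the bits according to a shared secret rather than writing them in order, and would need a cover far larger than the conversation it carries: at one hidden bit per sample, every byte of voice costs eight samples of duet.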

In short, it seems likely that in a year, anybody is going to be able to buy a machine that with a minimum of configuring will give them absolute protection against wiretaps. And, no doubt, governments from Syria to Singapore will react poorly to this idea. Inevitably, though, nothing can be done: given good speech recognition and voice-synthesis algorithms running on either end of a connection, the bandwidth required by a phone conversation can be shrunk to a few hundred baud or less, and the signal carrying the communication can be hidden in the faintest trickle of line noise. Pessimists, in the interim, might feel this gives governments license to act stupidly in their continuing war against the Internet. But there are optimists, too. I asked Elon Ganor, president of VocalTec, how he thought the Israeli government would react to his company's technology expanding throughout the West Bank. "What can they do?" he said. "Maybe they'll have to make peace."

A version of VocalTec's IPhone with 60 seconds of talk per session can be downloaded at www.vocaltec.com. Registration is $69 (£45). IPhone aficionados can be found chatting at www.pulver.com. Two major players in Internet telephony are Camelot Corp. (www.ikon.com/digiphone) and The Electric Magic Co. (www.emagic.com). If you'd like more information about AquaFone, Cogon can be reached at cogon@aol.com.

Fred Hapgood is a freelance writer specialising in science and technology. He can be reached at fhapgood@user1.channel1.com.