The World Wide Web (cf. 2.1.14) was born at CERN, the international laboratory for particle physics in Geneva. CERN’s scientific community is made up of several thousand people. They use a wide variety of different hardware and software. Keeping track of this organism is difficult; exchanging documents electronically is even more difficult because the incompatibilities between the systems are manifold. Furthermore, many physicists are not located in Geneva itself. They work remotely from all over the world.
Tim Berners-Lee and Robert Cailliau propose a distributed hypertext system to solve these problems. Information Management: A Proposal is written in 1989 [Berners-Lee 89], WorldWideWeb: Proposal for a HyperText Project a year later [Berners-Lee/Cailliau 90]. The first Web server becomes operational by the end of 1990 with the URL address
Fig. 2.2 Tim Berners-Lee’s diagram for the World Wide Web, then named “Mesh”, 1989
Fig. 2.2 is taken from the cover of Berners-Lee’s proposal. It depicts one of the reasons for the success of the World Wide Web. The Web unifies existing information networks, as early Web browsers are capable of accessing services like UUCP newsgroups, FTP and WAIS. Just one application program is needed to access all the diverse sources – a boost in ease of use compared to the many different programs on many different platforms that were necessary before.
Furthermore, the Web builds on existing technology: the Hypertext Transfer Protocol (HTTP), for example, runs on top of TCP/IP. Web pages are encoded with the Hypertext Markup Language (HTML), a simple application of SGML, with the intended side effect that only ASCII characters are used. This makes it easy to edit HTML files with existing software and to transfer them between all platforms.
During the following years browsers are developed for all main operating systems, e.g. Mosaic for X-Windows by Marc Andreessen at NCSA, released early in 1993; Macintosh and Windows versions follow by November of the same year. Ten years after the first Web server started at CERN, the number of Web servers reaches ten million worldwide.*
* The mark of 10 million is passed in February 2000. In July 2001 there are over 30 million Web servers. Source: Hobbes’ Internet Timeline v5.4 [Zakon 2001].
«Vague but exciting…», commented Mike Sendall† on Tim Berners-Lee’s proposal of the World Wide Web in 1989. Sendall, then head of CERN’s on-line computing group, decided to support the project.
Tim Berners-Lee and Robert Cailliau present the key concepts of the Web in The World-Wide Web for the Communications of the ACM [Berners-Lee/Cailliau et al. 94]. They introduce a universal address system, a network protocol for Web servers, and a markup language.
The address system defines Universal Resource Identifiers (URI). They should be unique and globally persistent for each object they refer to. URIs are concatenated strings of network protocol, server name and parameters to identify the object. For example
is the URI of the file ‘proposal.HTML’ on the Web server ‘www.w3.org’. Slashes represent the hierarchy of the file directory structure on the server. It is possible to attach further information to the URI to specify a predefined anchor position.
URIs do not necessarily have to identify a specific file. They can also represent a query to a server. The result is recalculated each time the URI is used.
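The composition of a URI described above can be illustrated with a short Python sketch. It uses the standard library's `urllib.parse`; the URI itself, the helper name `describe` and the keys of the returned dictionary are assumptions made for this illustration only and do not appear in the source.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Split a URI into the parts named in the text (hypothetical helper)."""
    parts = urlsplit(uri)
    return {
        "protocol": parts.scheme,                  # network protocol, e.g. http
        "server": parts.hostname,                  # server name
        "path": parts.path.strip("/").split("/"),  # slashes mark the hierarchy
        "anchor": parts.fragment,                  # predefined anchor position
    }


# Hypothetical URI, chosen only for illustration.
print(describe("http://www.example.org/docs/history/proposal.html#summary"))
```

The anchor after the `#` is interpreted by the client and is not part of the path on the server.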
is an example to show how the ‘?’ is used to separate the query from the server address.
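How the ‘?’ separates the query from the address can be sketched in Python with `urllib.parse`. The URI, the parameter names `topic` and `year`, and the helper `split_query` are hypothetical and serve only as an illustration.

```python
from urllib.parse import urlsplit, parse_qs


def split_query(uri):
    """Return the address before the '?' and the parsed query after it."""
    parts = urlsplit(uri)
    address = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs maps every parameter name to a list of its values.
    return address, parse_qs(parts.query)


# Hypothetical query URI, chosen only for illustration.
address, query = split_query("http://www.example.org/search?topic=hypertext&year=1989")
print(address)
print(query)
```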
The network protocol defined for Web servers is the Hypertext Transfer Protocol (HTTP). Other protocols for valid URIs might be FTP or NNTP for news. But HTTP offers some features not otherwise available. It is «a protocol for transferring information with the efficiency necessary for making hypertext jumps» [Ibid., p. 78]. HTTP is originally stateless: the server keeps no memory of earlier requests, and the TCP connection is closed after each transmission.* The GET and POST methods are implemented to transfer files from the Web server and to upload new files to the server.
* Connections that persist across several requests are introduced with HTTP 1.1.
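To make the two methods more concrete, the following Python sketch serializes minimal GET and POST requests as they would travel over the TCP connection. It is a simplified illustration under stated assumptions, not a complete client: the function name `build_request`, the host `www.example.org` and the paths are hypothetical, and only the `Host` and `Content-Length` headers are written.

```python
def build_request(method, path, host, body=b""):
    """Serialize a minimal HTTP/1.0 request as bytes (illustrative sketch)."""
    lines = [f"{method} {path} HTTP/1.0", f"Host: {host}"]
    if body:
        # A request that carries content announces its length in bytes.
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the header lines from the optional body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# GET asks the server for a file; POST sends new content to it.
print(build_request("GET", "/index.html", "www.example.org"))
print(build_request("POST", "/notes/new.html", "www.example.org", b"<h1>Hi</h1>"))
```

Each request is self-contained: it carries everything the server needs to answer it, which is what allows the connection to be closed after every transmission.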
Web pages are encoded with the Hypertext Markup Language. HTML is a formal language with a conforming SGML Document Type Definition (DTD). Special HTML elements are used to tag headlines, lists, tables, etc. Media files like images can be included and hyperlinks can be attached to pieces of text. The markup for a link would be:
<a href="http://www.mprove.de">Matthias’ Home Page</a>
The text “Matthias’ Home Page” is the visible marker for a link to the URI ‘http://www.mprove.de’.
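A client has to separate the visible marker from the target URI when it reads such markup. The following Python sketch does this with the standard library's `html.parser`; the class `LinkCollector` and the function `extract_links` are hypothetical names for this illustration, and the sketch ignores nested or malformed anchors.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (target URI, visible text) pairs from anchor elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # target of the anchor currently open, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


print(extract_links('<a href="http://www.mprove.de">Home Page</a>'))
```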
Fig. 2.12 The Web browser WorldWideWeb on NeXTStep as it looked in 1993. In contrast with the original version from 1990 this version is capable of displaying images within the text flow. The first version opened separate windows for images.
A Web client is used to display HTML pages. The first is the WorldWideWeb browser by Tim Berners-Lee [Berners-Lee 93]. Not all of the features implemented in the early 1990s have survived the evolution to the present browsers Netscape Communicator and Microsoft Internet Explorer. The next paragraphs will outline the capabilities of the browser WorldWideWeb. In order to avoid confusion with the abstract information space of the World Wide Web the browser program was later renamed to Nexus.
Links are still typically blue and underlined today. Unlike today, however, URIs are not displayed by WorldWideWeb/Nexus. They are considered too technical and disgusting to be revealed to the average user [Gillies/Cailliau 2000, p. 206]. If a link is followed, the resulting page opens in a new window. Initially, images are also displayed in separate windows. This has the advantage that they stay visible whilst the user scrolls through the text.
The Navigate menu has BACK, NEXT and PREVIOUS commands. The last two should not be mixed up with the FORWARD command of today’s browsers. They do not reverse the BACK operation – they mean to go back a step and then take the next or previous link from the same page [Berners-Lee 93]. The NEXT and PREVIOUS commands make it possible to leaf through the Web like a book, page by page, with a single keystroke. In Hypertext in the Web – a History [Cailliau/Ashman 99], Robert Cailliau and Helen Ashman compare this feature with Memex’s trails. A Web page stores a sequence of arbitrary references to other pages that can be displayed in succession with ease.
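The difference between NEXT/PREVIOUS and today's FORWARD command can be modelled in a few lines of Python. This is only a sketch of the behaviour as described in the text, not the browser's actual implementation; the class `Navigator`, its method names and the page names are hypothetical.

```python
class Navigator:
    """Toy model of BACK, NEXT and PREVIOUS as described in the text."""

    def __init__(self, links, start):
        self.links = links      # page -> ordered list of pages it links to
        self.current = start
        self.trail = []         # (page left behind, index of the link taken)

    def follow(self, index):
        self.trail.append((self.current, index))
        self.current = self.links[self.current][index]

    def back(self):
        self.current, _ = self.trail.pop()

    def _sibling(self, step):
        # Go back a step, then take the neighbouring link on that page.
        parent, index = self.trail[-1]
        new_index = index + step
        if not 0 <= new_index < len(self.links[parent]):
            raise IndexError("no such link on the previous page")
        self.trail[-1] = (parent, new_index)
        self.current = self.links[parent][new_index]

    def next(self):
        self._sibling(+1)

    def previous(self):
        self._sibling(-1)


# A page "index" that lists three chapters acts as a trail through them.
web = {"index": ["ch1", "ch2", "ch3"], "ch1": [], "ch2": [], "ch3": []}
nav = Navigator(web, "index")
nav.follow(0)   # now on ch1
nav.next()      # now on ch2, without returning to the index by hand
print(nav.current)
```

In this model the page that holds the links plays the role of the trail: its ordered list of references determines which page NEXT and PREVIOUS lead to.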
Tim Berners-Lee’s browser offers no bookmark capabilities. It has a different notion of ‘home page’ instead. A home page used to be a private HTML page where references can be stored. URIs are captured for later use with COPY & PASTE from any Web page currently displayed to the personal home page. Our current understanding of a home page was known as a ‘welcome page’ back then.
WorldWideWeb/Nexus is an integrated environment for browsing and editing Web pages following the WYSIWYG paradigm. It is as easy to write pages as to read them. For example, just one command with the shortcut Cmd-Shift-N is needed to create a new HTML page and link to it from the previous text selection. The user is prompted to specify a new URI and the page content is sent to the server with the HTTP POST method.
This last feature in particular is missing from all commercial browsers today. Only Amaya is still a combination of Web browser and Web editor. Amaya is being developed by the World Wide Web Consortium (W3C) as an Open Source project.
Recent research activities at W3C strive to separate content, structure and style of Web pages from each other. Cascading Style Sheets (CSS) and the Extensible Markup Language (XML) are two specifications that lead in this direction. Linking information should also be stored outside of the Web pages in separate files. XPointer is the formalism developed by W3C for this purpose.