The DACS is currently studying tools and techniques for electronic publishing. Our focus is on creating information products for the Internet, using the World Wide Web (WWW) as a distribution medium. This article is an introduction to the construction and publication of documents for the WWW, based on our experience and research results so far.
As the DACS expands our Internet services, we encourage you to explore using the Internet to communicate with others in the software engineering community, and to publish information about your own software experiences. This introduction is provided to help you get started.
The World Wide Web, or Web for short, is an Internet resource discovery service that combines hypertext capabilities with information discovery techniques, allowing users to access hypertext information remotely. The Web is implemented by a set of servers, where each server provides access to its own documents. Users access and retrieve information from the Web by using WWW clients called "browsers." The most common browsers are Mosaic, Lynx, Cello and the WWW Line Mode Browser.
Hypertext is a document organization paradigm that results in non-linear documents. The distinguishing characteristic of hypertext is the incorporation of links to other parts of a document, or to other documents entirely, at appropriate places. So instead of a footnoted reference at the bottom of a page, for instance, there may be a link to the reference itself. When a hypertext document says, "as discussed in Section 3," the reader can push a button and immediately jump to Section 3. On the Web, hypertext documents can also be linked to other Internet tools, such as Gopher or FTP, and to non-textual material, such as graphics, images, sound and video.
The destination of a hypertext link is specified by its Uniform Resource Locator (URL). URLs are the standard way to specify the location of objects on the Internet. The components of a URL indicate the access method (protocol), the host name and port number of the server where the files are located, and the directory path and file name of the target resource. The protocol used for hypertext files on the Web is the Hypertext Transfer Protocol, indicated as http.
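As a rough illustration of the URL components just described, the following Python sketch splits a URL using the standard library's urllib.parse module. The helper name describe_url and the table of default ports (used when a URL does not state a port) are assumptions made for this example and are not part of any DACS tool. The sample URL is the DACS form page cited later in this article.

```python
from urllib.parse import urlsplit

# Conventional default ports, assumed here for URLs that omit the port.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "gopher": 70}

def describe_url(url):
    """Split a URL into the components named in the text (hypothetical helper)."""
    parts = urlsplit(url)
    # Separate the directory path from the file name at the last slash.
    directory, _, filename = parts.path.rpartition("/")
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "directory": directory + "/",
        "file": filename,
    }

info = describe_url("http://www.dacs.dtic.mil/forms/userform.shtml")
for name, value in info.items():
    print(f"{name}: {value}")
```

Because this URL names no port, the sketch falls back to the conventional HTTP port. A URL such as http://example.org:8080/a/b.html (a made-up address) would report port 8080 instead.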
Web browsers read and display hypertext documents that are coded in the Hypertext Markup Language (HTML). HTML specifies the logical organization of a document, including the hypertext links. HTML also includes tags that specify basic formatting commands. The tags in an HTML file are interpreted by the WWW client browser, which follows any link references to retrieve other HTML files or resources as needed, and interprets the formatting tags to determine how to display the files on the user's screen.
In HTML files, the tags are set off by angle brackets <...>. Most of the formatting tags occur in pairs, to mark the beginning and end of a text block. For example, the title of a document is coded as: <TITLE>Internet Tools</TITLE>. Paragraphs of HTML text are separated by the single tag <p>. Other basic tags indicate various levels of headings: <H1>..</H1>, <H2>..</H2>, <H3>..</H3>, etc.; emphasis: <STRONG>..</STRONG>; bulleted lists: <UL><LI>..<LI>..<LI>..</UL>; and so forth. A page containing the complete syntax of HTML, which can be used for testing or just exploring, is currently available at http:/submit/test-pattern.html.
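To make the distinction between paired and single tags concrete, here is a minimal Python sketch that uses the standard library's html.parser module to record which tags in a small fragment are opened and which are closed. The class name TagCollector and the sample fragment are invented for this illustration. Note that the parser reports tag names in lower case.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Record every opening and closing tag seen in the input."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.closed = []

    def handle_starttag(self, tag, attrs):
        self.opened.append(tag)

    def handle_endtag(self, tag):
        self.closed.append(tag)

# A made-up fragment using only the tags described in the text.
SAMPLE = (
    "<TITLE>Internet Tools</TITLE>"
    "<H1>Tools</H1>"
    "First paragraph.<p>Second paragraph."
    "<UL><LI>Gopher<LI>FTP</UL>"
)

collector = TagCollector()
collector.feed(SAMPLE)
collector.close()

# Tags that appear with a closing partner versus tags used on their own.
paired = sorted(set(collector.opened) & set(collector.closed))
single = sorted(set(collector.opened) - set(collector.closed))
print("paired:", paired)
print("single:", single)
```

In this fragment the title, heading and list tags come out as paired, while the paragraph separator and the list item marker appear only as single tags, matching the usage described above.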
Here is an example of how the beginning of this article would be coded in HTML:
<TITLE>Internet Publishing with HTML</TITLE>
The DACS is currently studying tools and techniques for electronic publishing. Our focus is on creating information products for the Internet, using the World Wide Web (WWW) as a distribution medium. This article is an introduction to the construction and publication of documents for the WWW, based on our experience and research results so far.<p>
HTML tags that define hypertext links are called anchors, and are identified by <A>...</A>. The text between the tags is displayed on the screen as a link (in Mosaic, these are underlined). The URL of the link destination is coded in the opening tag of the anchor. Thus <A HREF="http:/">DACS</A> would be used to specify a link to the DACS Home Page.
The HTML syntax for links to images includes the URL of the image file, but only a single tag is needed. Thus <IMG SRC="http:/newsletters/images/home.gif"> indicates a Graphics Interchange Format (GIF) file in the DACS Newsletters' images directory.
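The anchor and image syntax can be explored with a short Python sketch that pulls link destinations, link text and image sources out of a fragment, roughly as a browser must do before it can follow a link or fetch an image. It relies on the standard library's html.parser module. The class name LinkCollector, the sample fragment and the relative image path images/home.gif are assumptions made for illustration. The anchor URL is the DACS form page cited in this article.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (URL, link text) pairs from anchors and sources from images."""

    def __init__(self):
        super().__init__()
        self.links = []     # (href, displayed text) for each anchor
        self.images = []    # src value for each image tag
        self._href = None   # href of the anchor currently open, if any
        self._text = []     # text pieces seen inside the open anchor

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attribute names arrive in lower case
        if tag == "a" and "href" in attrs:
            self._href = attrs["href"]
            self._text = []
        elif tag == "img" and "src" in attrs:
            self.images.append(attrs["src"])

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

# A made-up fragment: one anchor and one image (hypothetical relative path).
PAGE = (
    'Visit the <A HREF="http://www.dacs.dtic.mil/forms/userform.shtml">'
    "registration form</A>.<p>"
    '<IMG SRC="images/home.gif">'
)

collector = LinkCollector()
collector.feed(PAGE)
collector.close()
print(collector.links)
print(collector.images)
```

The anchor needs both an opening and a closing tag because it wraps the text to be displayed as a link, whereas the image is fully described by the attributes of its single tag.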
Here is an example of a page of information from the WWW that includes images and links. This page can be reached from a link on the DACS Home Page. It is located at the URL http://www.dacs.dtic.mil/forms/userform.shtml.
From the URL, we see that the file from which our browser produced this page, the "Newsletter Registration Form," is located on the DACS' Web server. The file is marked up with HTML tags.
The DACS has developed a tutorial on the use of HTML. It is a handbook that provides design and maintenance guidelines for creating and publishing Web documents from a software engineering perspective, and it is called Electronic Publishing on the World Wide Web; An Engineering Approach.