Authoring Web Pages
This is a historical document and does not necessarily reflect current practice.
PRODUCING HTML DOCUMENTS
HTML is the simple markup system used to create hypertext documents.
HTML is not intended to be a comprehensive page-layout system.
Instead, HTML aims to let you describe the _structure_ of your
document by indicating headings, emphasis, links to other documents
and so forth. The more you work with HTML rather than against
it, the happier you'll be.
You can include images and other multimedia objects in your documents,
but it should be remembered that not all web users have graphical
clients, and many web users voluntarily turn graphics _off_ to
save downloading time! If you try to spite such users, you will
only lose readers (and customers).
You can in fact specify a great deal about the appearance of your
document in the latest web browsers. There is no harm in taking
advantage of these features, but as a rule of thumb, always make
sure your document looks good in a text-based browser such as
Lynx as well as in the graphical browser of your dreams.
This is more than a simple matter of taste. Keep in mind that
not all users can see!
There are three ways to produce HTML documents: writing them yourself,
which is not a very difficult skill to acquire; using an HTML editor,
which assists in doing the above; and converting documents in
other formats to HTML. The following three sections cover these
possibilities in sequence:
- Writing HTML yourself
- HTML editing tools
- Conversion tools
WRITING HTML DOCUMENTS YOURSELF
You can write an HTML document with any text editor. Try the "source"
button of your browser (or "save as" HTML) to look at the HTML
for a page you find particularly interesting. The odds are that
it will be a great deal simpler than you would expect. If you're
used to marking up text in any way (even red-pencilling it), HTML
should be rather intuitive.
A beginner's guide to HTML is available at the URL http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html
. You can also find a compressed PostScript version (at the URL
ftp://ftp.ncsa.uiuc.edu/ncsapubs/WWW/HTMLPrimer.ps.Z). (Since
the latter is an FTP URL, you can fetch it by hand using
FTP if you do not yet have a web browser.)
There is also an HTML primer by Nathan Torkington at the URL http://www.vuw.ac.nz/who/Nathan.Torkington/ideas/www-html.html
.
HTML EDITORS
Some editors are WYSIWYG (What You See Is What You Get), or close
to it; others simply assist you in writing HTML by plugging in
the desired markup tags for you from a menu. The latter are surprisingly
useful, and the former surprisingly limited. As a rule of thumb,
if you are keenly interested in using the very latest new HTML
feature, you will probably be disappointed with WYSIWYG editors.
Some WYSIWYG editors do support entry of unfamiliar tags, however.
A few can even display them in the color or style of your choice.
This document covers editors for the following systems:
HTML Editors for the Mac
Web Warrior
- Web Warrior is a free HTML editor which features user-definable
tags, command key equivalents, HTML correctness checking, and
"semi-WYSIWYG" editing. [URL:http://www.bact.wisc.edu/WebWarriorTop.html]
HTML-HyperEditor
- [URL:http://www.lu.se/info/Editor/HTML-HyperEditor.html] HTML-HyperEditor
allows for European non-English characters, imports existing HTML
files, and has built-in FTP compatibility for easy installation
of your finished HTML. HTML-HyperEditor also provides a facility
to convert tab-delimited text files to HTML tables (most spreadsheets
can "save as" ASCII, and this program can be used to convert the
result to a table).
HTML Editor
- A near-WYSIWYG, stand-alone program (URL is [URL:http://dragon.acadiau.ca/~giles/HTML_Editor]).
ANT_HTML
- ANT_HTML [URL:http://mcia.com/ant] is a Microsoft Word for the
Macintosh template designed to convert Word documents into HTML
documents in a WYSIWYG environment. It includes a utility to convert
existing HTML files for editing within the system. ANT_PLUS also
converts HTML files to ASCII, RTF, or any other format possible
in Word. ANT_HTML works in all versions of Word, including international
versions. ANT_HTML supports customization; when new tags appear,
the user can add them even though they did not exist when ANT_HTML
was installed. Also available for Windows. See [URL:http://mcia.com/ant]
for more information.
GT_HTML
- GT_HTML [URL:http://www.gatech.edu/word_html/] is also a Microsoft
Word 6.0 (or higher) for Windows or Macintosh template to create
HTML documents under Word. Limited support for tables is included.
BBEdit HTML extensions
- This package of extensions allows the BBEdit and BBEdit Lite text
editors for the Macintosh to conveniently edit HTML documents.
(URL is [URL:http://www.uji.es/bbedit-html-extensions.html] .)
You can also obtain the extensions package by anonymous ftp from
sumex-aim.stanford.edu as info-mac/bbedit-html-ext-b3.hqx. Also
see below.
BBEditTools
- There is an alternative BBEdit extension package available as
well (URL is [URL:http://www.york.ac.uk/~ld11/BBEditTools.html]
). It is available by FTP from ftp.york.ac.uk as /pub/users/ld11/BBEdit_HTML_Tools.sea.hqx.
SoftQuad HoTMetaL
- SoftQuad's HoTMetaL is a WYSIWYG HTML editor designed from the
ground up to edit HTML. Unlike HTML modes for existing word processors,
every aspect of HoTMetaL reflects this purpose. [URL:http://www.sq.com/products/hotmetal/hmp-org.htm]
Web-Knitter
- Web-Knitter is an HTML editor for the Macintosh which claims ease
of use. [URL:http://www.suba.com/~chicago/web]
html-helper-mode for EMACS
- Users of the EMACS editor will want to consider html-helper-mode,
an EMACS "mode" for HTML editing (see [URL:http://www.santafe.edu/~nelson/tools/]).
NaviPress
- NaviPress is a combination WYSIWYG HTML editor/Web browser with
remote save functionality, an unusual convenience. Version 1.1
supports much of HTML 3.0, and it includes site and link management
features. [URL:http://www.navisoft.com/index.htm]
HTML Editors for Microsoft Windows
ANT_HTML
- ANT_HTML [URL:http://mcia.com/ant] is a Microsoft Word template
designed to convert Word documents into HTML documents in a WYSIWYG
environment. It includes a utility to convert existing HTML files
for editing within the system. ANT_PLUS also converts HTML files
to ASCII, RTF, or any other format possible in Word. ANT_HTML
works in all versions of Word, including international and 32-bit
versions. ANT_HTML supports customization; when new tags appear,
the user can add them even though they did not exist when ANT_HTML
was installed. Also available for the Macintosh. See [URL:http://mcia.com/ant]
for more information.
GT_HTML
- GT_HTML [URL:http://www.gatech.edu/word_html/] is a Microsoft
Word 6.0 (or higher) for Windows or Macintosh template to create
HTML documents under Word. Limited support for tables is included.
TC-Director
- TC-Director is a standalone HTML editor for Windows. TC-Director
supports all standard HTML 2.0 tags and allows insertion of new
tags as well. "Creation wizards" are provided to assist in the
correct entry of the more complex tags. [URL:http://www.euronet.nl/users/rpe/readme.html]
Internet Assistant
- Microsoft has released Internet Assistant, a Word for Windows
template which can edit HTML in a WYSIWYG manner, including the
capability to load existing HTML documents. It also includes simple
browsing capabilities, intended to assist in editing. [URL:http://www.microsoft.com/msoffice/freestuf/msword/download/ia/default.htm]
WebMania
- WebMania is an HTML editor with unusually strong support for frames,
JavaScript, and forms. Most notable is the ability to painlessly
review the output of forms submitted via mailto: links, sidestepping
the need for CGI programming. [URL:http://www.q-d.com/]
DiDa
- DiDa is a free HTML editor by Godfrey Ko. The editing window is
non-WYSIWYG, but a WYSIWYG previewer is included, and the previewer
lets you see how your page will look with or without a Netscape-compatible
browser. [URL:http://www.mcs.net/~grossman]
NaviPress
- NaviPress is a combination WYSIWYG HTML editor/Web browser with
remote save functionality, an unusual convenience. Version 1.1
supports much of HTML 3.0, and it includes site and link management
features. [URL:http://www.navisoft.com/index.htm]
Quarterdeck WebAuthor
- Yet another commercial Word for Windows HTML editing template
is available from Quarterdeck (URL is [URL:http://www.qdeck.com/webauthor/fact.html]
) and is rumored to be superior to Internet Assistant.
HTML Assistant
- A non-WYSIWYG editor called HTML Assistant is available, with
features to assist in the rapid creation of HTML documents. A
good choice for experienced HTML authors wishing to save keyboarding
time. Available by anonymous FTP from ftp.cs.dal.ca in the directory
/htmlasst/. Read the README.1ST file in this directory for information
on which files to download. See also: [URL:http://fox.nstn.ca/~harawitz/index.html]
HTMLed
- HTMLed [URL:http://www.ist.ca/htmled/] is a well-reviewed non-WYSIWYG
HTML editor. The Pro version features context-sensitive highlighting
of HTML tags, a near-WYSIWYG feature. The Pro version can also
directly import RTF documents for easy conversion of existing
documents.
EdWin
- EdWin is a Windows-based non-WYSIWYG HTML editor which supports
a wide range of tags. [URL:http://www.vantek.net/pages/msutton/edwin.htm]
Live Markup
- Live Markup ([URL:http://www.mediatec.com/mediatech/]) is a WYSIWYG HTML
editor for Windows which insulates the user completely from HTML.
Excel 5.0 to HTML Table Creator
- Most HTML editing facilities leave out table-editing capabilities.
Fill that gap with Jordan Evans' Excel 5.0 to HTML Table Converter
(URL is [URL:http://rs712b.gsfc.nasa.gov/704/dgd/xl2html.html]
).
WEB Wizard
- For beginners in search of a quick and easy way to build a
home page, consider WEB Wizard (URL is [URL:http://www.halcyon.com/webwizard/]
), a simple package which prepares a home page after a question-and-answer
session with the user. 16-bit and 32-bit Windows versions are
available.
HTML Writer
- A simple, useful non-WYSIWYG HTML editor that cooperates closely
with most web browsers is HTML Writer, [URL:http://lal.cs.byu.edu/people/nosack/].
"Donationware."
SoftQuad HoTMetaL
- SoftQuad's HoTMetaL is a WYSIWYG HTML editor designed from the
ground up to edit HTML. Unlike HTML modes for existing word processors,
every aspect of HoTMetaL reflects this purpose. [URL:http://www.sq.com/products/hotmetal/hmp-org.htm]
Visual HTML++
- Ellussion offers a basic, very easy-to-use WYSIWYG HTML creation
tool. Visual HTML++ can create attractive, simple HTML documents
but cannot edit existing HTML pages. Shareware. [URL:http://www.nfinity.com/~ellussion]
WebEdit
- WebEdit is a non-WYSIWYG editor with a WYSIWYG previewer and a
WYSIWYG editor for HTML 3.0 tables. Spell-checking is standard,
and support is claimed for all HTML 3.0 features. See: [URL:http://www.nesbitt.com/]
Emissary
- Wollongong's Emissary is a complete Internet software suite which
includes WYSIWYG HTML editing features (see [URL:http://www.twg.com/]
).
html-helper-mode for EMACS
- Users of the EMACS editor will want to consider html-helper-mode,
an EMACS "mode" for HTML editing (see [URL:http://www.santafe.edu/~nelson/tools/]
).
Gomer
- Gomer is a straightforward non-WYSIWYG editor with basic tag-match
checking. [URL:http://www.clever.net/gomer/]
HTML Editors for Unix (non-graphical)
asWedit
- asWedit (URL is [URL:http://www.advasoft.com/asWedit.html]) is a
friendly, graphical editor for the X Window System on many Unix
platforms. asWedit validates HTML and does not allow tags to be
entered in the wrong context.
TkWWW
- TkWWW (URL is [URL:http://www.w3.org/hypertext/WWW/TkWWW/Status.html])
supports WYSIWYG HTML editing, and since it's also a browser,
you can try out links immediately after creating them.
Phoenix
- A fully WYSIWYG HTML editor which insulates the user from direct
control of the HTML tags. Available by anonymous FTP from www.bsd.uchicago.edu
in the pub/phoenix subdirectory. [ftp://www.bsd.uchicago.edu/pub/phoenix]
ASHE
- A WYSIWYG HTML editor which takes advantage of the NCSA Mosaic
HTML "widget" (URL is [URL:ftp://ftp.cs.rpi.edu/pub/puninj/ASHE/README.html]).
htmltext
- htmltext supports WYSIWYG HTML editing. More information
is available at the URL [URL:http://web.cs.city.ac.uk/homes/njw/htmltext/htmltext.html].
html-helper-mode for EMACS
- Users of the EMACS editor will want to consider html-helper-mode,
an EMACS "mode" for HTML editing (see [URL:http://www.santafe.edu/~nelson/tools/]).
WebAuthor
- A fully WYSIWYG commercial HTML editing product from Silicon Graphics
(URL is [URL:http://www.sgi.com/Products/WebFORCE/WebForceSoft.html]).
SoftQuad HoTMetaL
- SoftQuad's HoTMetaL is a WYSIWYG HTML editor designed from the
ground up to edit HTML. Unlike HTML modes for existing word processors,
every aspect of HoTMetaL reflects this purpose. [URL:http://www.sq.com/products/hotmetal/hmp-org.htm]
Miscellaneous editors
HTML Interactive Editor
- The HIE takes advantage of Netscape 2.0's JavaScript and frames
features to split the screen between a text editing window and
a rendered view of the HTML. Works on any platform that supports
Netscape 2.0. [URL:http://www.math.macalstr.edu/~smcguire/HIE/]
html-helper-mode for EMACS
- Users of the EMACS editor will want to consider html-helper-mode,
an EMACS "mode" for HTML editing (see [URL:http://www.santafe.edu/~nelson/tools/]
).
jed
- jed, by John Davis, is a general-purpose text editor with a special
HTML mode which can highlight tags and perform other context-sensitive
tasks. Available by anonymous FTP from space.mit.edu in the directory:
/pub/davis/jed
HTML DTD
- Another option, if you have an SGML editor, is to use it with
the HTML DTD (URL is [URL:http://www.w3.org/hypertext/WWW/MarkUp/DTDHeading.html]
).
NCSA's List of Filters and Editors
- See [URL:http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/faq-software.html#editors]
for another list of available HTML editing products.
CONVERTING OTHER FORMATS TO HTML
There is a collection of filters for converting your existing
documents (in TeX and other non-HTML formats) into HTML automatically,
including filters that can allow more or less WYSIWYG editing
using various word processors:
Rich Brandwein and Mike Sendall's List (URL is http://www.w3.org/hypertext/WWW/Tools/Filters.html
).
(Note that this URL contains uppercase and lowercase letters;
certain operating systems such as VMS require you to quote mixed-case
URLs when launching a browser from the command line. This is NOT
a bug in the browser.)
CHECKING YOUR HTML FOR ERRORS
Tools to validate your HTML documents (check them for errors)
are available. There is a form at the URL [URL:http://www.halsoft.com/html-val-svc/]
which will check HTML documents for errors according to the latest
specification; note that you are encouraged to set up the program
on your own system if you make heavy use of the form. There is
also a tool which will check the links in your documents for links
to nonexistent resources, such as pages that have moved (URL is
http://wsk.eit.com/wsk/dist/doc/admin/webtest/verify_links.html
).
Also try weblint (URL is http://www.khoros.unm.edu/staff/neilb/weblint.html
), a Perl script that checks your HTML for errors; you can even
try it out over the web through an HTML form. The script is available
by anonymous FTP from ftp.khoros.unm.edu in the directory pub/perl/www.
Another such tool is htmlchek (URL is: http://uts.cc.utexas.edu/~churchh/htmlchek.html
), which checks HTML documents for errors, creates a cross-reference,
automatically expands entities (such as European characters) to
their proper HTML form, and performs other useful services. htmlchek
is available by anonymous FTP from ftp.cs.buffalo.edu in the directory
pub/htmlchek.
lvrfy is a simple, Unix-based link-checking program which checks
your pages for broken links (URL is [http://www.cs.dartmouth.edu/~crow/lvrfy.html]
).
Checker, at [URL:http://www.ugrad.cs.ubc.ca/spider/q7f192/branch/checker.html],
is another useful broken-link finder; binaries are available for
numerous systems.
HOW CAN I "INCLUDE" ONE HTML DOCUMENT IN ANOTHER?
Often HTML authors have a copyright notice, logo or other piece
of HTML which needs to be included on many different pages. Doing
this by hand is, obviously, painful.
One might think there would be an <HTML SRC=""> tag, much like
<IMG SRC>, to include one document in another. But this has several
problems, one of which is that it would require opening a second
connection to the server. This is very inefficient (translation:
SLOW for your readers).
"So what can I do about it?"
The most common solution is the "server-side include" mechanism.
The NCSA web server, among others, can be configured to recognize
documents ending in ".shtml" instead of ".html" as documents that
it should scan for server-side include commands referencing other
documents or scripts. This is much more efficient because the
server, which presumably has direct access to the files involved,
does all the work. For details, see the NCSA server documentation
[URL:http://hoohoo.ncsa.uiuc.edu/]. Also see the documentation
of the WebCom server [URL:http://www.webcom.com/~webcom/help/inc/include.shtml],
which is particularly clear on this topic. Most servers do handle
server side includes in a manner consistent with these two.
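To make the mechanism concrete, here is a rough sketch of the kind of work a server does when it scans a ".shtml" document. It is written in Python purely for illustration and is not any real server's code. It assumes the NCSA-style directive `<!--#include file="name" -->`; the function name `expand_includes` and the `base_dir` parameter are made up for this example, and a real server adds safety checks (for instance, refusing paths outside the document tree) that are left out here.

```python
import re
from pathlib import Path

# Matches directives of the form <!--#include file="name" -->
# (assumed NCSA-style syntax; other directives are ignored in this sketch).
INCLUDE_RE = re.compile(r'<!--#include\s+file="([^"]+)"\s*-->')


def expand_includes(text, base_dir):
    """Replace each include directive with the contents of the named file.

    base_dir is the directory the included files are read from.
    Included files are inserted as-is and are not scanned again.
    """
    def replace(match):
        return (Path(base_dir) / match.group(1)).read_text()

    return INCLUDE_RE.sub(replace, text)
```

A page containing `<!--#include file="footer.html" -->` would come back with the contents of footer.html in place of the directive, which is why a copyright notice kept in one file can appear on many pages.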
CAN I PUT A BACK BUTTON IN MY HTML PAGE?
The short answer: "no." The user is well aware of the "back" button
in their web browser and will use it when appropriate.
The long answer: if you are writing a CGI program, you may be
able to take advantage of the HTTP_REFERER environment variable,
which sometimes contains the URL of the page the user came from.
This information is intended to help you discover which pages
link to your own. However, there is no guarantee that HTTP_REFERER
will be set, or that it will be set to the immediately previous
page. Also, if the user follows such a link, the real "back" button
will not behave as the user might reasonably expect afterwards.
Under most circumstances, you are best off exhorting the user
to "back up to the previous page." All web browsers offer such
a feature.
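For those who want to try the HTTP_REFERER approach anyway, the following is a minimal sketch of how a CGI program might use it, written in Python for illustration. The function name `back_link` and the wording of the link are assumptions of this example. Note that it falls back to plain advice when the variable is missing or does not look like a web URL, since there is no guarantee it will be set.

```python
import html
import os


def back_link(environ=os.environ):
    """Return HTML for a 'previous page' link, or plain advice if no referer."""
    referer = environ.get("HTTP_REFERER", "")
    # Only trust values that look like ordinary web URLs.
    if referer.startswith(("http://", "https://")):
        return '<A HREF="%s">Return to the previous page</A>' % html.escape(
            referer, quote=True
        )
    return "Please use your browser's Back button to return to the previous page."
```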
HOW CAN I CREATE A CUSTOM BACKGROUND AND SET THE COLORS FOR MY
WEB PAGE?
The capability to do this was introduced by Netscape in version
1.1 of that product. By now, many web browsers support it.
This is accomplished using attributes of the BODY tag. HTML authors
who are unfamiliar with the BODY tag need to know that the <BODY>...</BODY>
tags should enclose everything in the document except for the
TITLE tag at the beginning. (The TITLE tag should be enclosed
in a HEAD tag, although this is optional. For further details
about the proper syntax of an HTML page, consult the w3 consortium
web pages [URL:http://www.w3.org/].)
The following short HTML document makes extensive use of this
feature. Try pasting this HTML into a file and opening it with
your web browser.
<HEAD>
<TITLE>Color Test</TITLE>
</HEAD>
<BODY BGCOLOR="FFFFFF" TEXT="000000" LINK="00FF00" VLINK="CC33FF"
ALINK="FF0000">
<H1>Color Test</H1>
<P>
This page contains black text on a white background.
<a href="http://www.boutell.com/">Links are displayed in green.</a>
Once visited, links are displayed in purple. When active,
links are displayed in red.
</BODY>
The BGCOLOR attribute sets the background color. The TEXT attribute
sets the text color. The LINK attribute sets the color for links.
The VLINK attribute sets the color for links you have already
visited, and the ALINK attribute sets the color for links that
are active at that moment.
"Sure, but what do the numbers and letters mean?"
Those are hexadecimal digits: two digits for red, two digits for
green, and two digits for blue. 00 is black (absence of color)
and FF is maximum intensity.
_Fortunately,_ you don't need to understand that. Just use Doug
Jacobson's RGB Hex Triplet Color Chart [URL:http://www.phoenix.net/~jacobson/rgb.html],
which provides the appropriate values for lots of nifty colors.
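If you would rather compute a triplet yourself than look it up, the arithmetic is short. The sketch below, in Python, turns three intensities from 0 to 255 into the six hexadecimal digits described above; the function name `hex_triplet` is just a name chosen for this example.

```python
def hex_triplet(red, green, blue):
    """Return the six-digit hexadecimal triplet for three 0-255 intensities."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each component must be between 0 and 255")
    # Two uppercase hexadecimal digits per component: red, green, blue.
    return "%02X%02X%02X" % (red, green, blue)
```

For instance, full red, full green and full blue together give FFFFFF (white), while green alone at full intensity gives 00FF00.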
_Please note:_ if your page is difficult to read, people will
not read it. Please use the background attributes tastefully unless
it is your intention to alienate your readership.
"How about background images?"
To set a background image, use the BACKGROUND attribute of the
<BODY> tag. For instance, a page with the body tag <BODY BACKGROUND="tile.gif"
BGCOLOR="FFFFFF"> displays the contents of the image file tile.gif
as its background. If the user prefers not to load images, which
is quite common, then the BGCOLOR attribute is used instead (see
above). It is important to set a reasonable BGCOLOR as an alternative
to your BACKGROUND attribute.
For More Information
A separate FAQ [URL:http://www.sci.kun.nl/thalia/guide/color/faq.html]
on the subject is maintained by Mark Koenen. Consult that document
for more information.
A library of background images, icons and replacements for the
normal <HR> tag is also available. [URL:http://jupiter.ufr924.jussieu.fr:1998/nabil.html]
HOW DO I GENERATE WEB PAGES FROM A PROGRAM (CGI)? ARE THERE LIBRARIES
TO MAKE IT EASIER?
Most web servers support one variation or another of a standard
for adding your own programs to the web server. The standard is
called CGI (Common Gateway Interface).
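As a rough illustration of what such a program looks like, here is a minimal sketch in Python (a language chosen only for this example; the libraries listed below are for Perl, C and Pascal). A CGI program writes a Content-type header, a blank line, and then the document itself to standard output. The function name `cgi_response` and the page contents are assumptions of the example.

```python
import html
import sys


def cgi_response(title, body_html):
    """Build a complete CGI reply: header, blank line, then the HTML page."""
    header = "Content-type: text/html\n\n"
    page = (
        "<HTML><HEAD><TITLE>%s</TITLE></HEAD>\n"
        "<BODY>\n%s\n</BODY></HTML>\n"
    ) % (html.escape(title), body_html)
    return header + page


if __name__ == "__main__":
    # The web server captures standard output and sends it to the browser.
    sys.stdout.write(cgi_response("Hello", "<H1>Hello from a CGI program</H1>"))
```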
Marc Hedlund has written a FAQ on CGI programming (URL is [URL:http://www.best.com/~hedlund/cgi-faq/]
) which makes a good introduction to the subject. The standard
itself can be found at NCSA (URL is [URL:http://hoohoo.ncsa.uiuc.edu/]
).
For tips on overcoming common CGI problems, consult the CGI problems
section and the section on granting CGI access to users.
Perl CGI programmers will be interested in the CGI Perl modules
[URL:http://www-genome.wi.mit.edu/WWW/tools/scripting/CGIperl/],
which provide an elegant Perl 5 interface to CGI programming.
C-language CGI programmers will want to consider the author's
cgic library. [URL:http://www.boutell.com/cgic/] Another C-language
library for CGI programming is cgihtml [URL:http://hcs.harvard.edu/~eekim/web/cgihtml/]
from Eugene Kim. A third library is libcgi from EIT. [URL:http://wsk.eit.com/wsk/dist/doc/libcgi/libcgi.html]
Borland Delphi enthusiasts should check out the Delphi class library
for WIN-CGI programming offered by HREF Tools Corporation. [URL:http://www.href.com/]
Turbo Pascal for Windows users will be interested in a Turbo Pascal
WINCGI interface written by Markus Schlarmann. [URL:http://141.2.61.48/tpwcgi/tpwcgi.htm]
C-language CGI programmers of the Macintosh system should check
out Grant's C-language CGI framework for the Mac. [URL:http://arpp.carleton.ca/grant/mac/grantscgi/]
HOW CAN I KEEP "STATE" INFORMATION BETWEEN CALLS TO MY CGI PROGRAM?
Using Hidden Form Fields
One valid approach is to use hidden fields in forms. For example:
<INPUT TYPE=hidden NAME=state VALUE="hidden info to be returned
with form">
By now, most browsers can handle the hidden type, but understand
that some browsers will fail to hide the field (and probably confuse
the user). Note that "hidden" doesn't mean "secret"; the user
can always click on "view source."
The ugliness of a "hidden" field appearing on a browser that doesn't
understand hidden fields can be minimized by setting SIZE=0 for
that field.
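When a CGI program writes the hidden field itself, the state value has to be quoted safely so that quotation marks or angle brackets in it do not break the form. A minimal sketch in Python follows; the function name `hidden_field` is an assumption of this example.

```python
import html


def hidden_field(name, value):
    """Return an INPUT tag carrying state back to the server with the form."""
    return '<INPUT TYPE="hidden" NAME="%s" VALUE="%s">' % (
        html.escape(name, quote=True),
        html.escape(value, quote=True),
    )
```

Remember that the value travels through the user's browser, so the program receiving the form should treat it as untrusted input.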
Using PATH_INFO
Another approach is to take advantage of the PATH_INFO
environment variable. PATH_INFO contains any additional text in
the URL that accessed the CGI program *after* the name of the
CGI program itself. For instance, if your CGI program's URL is:
http://mysite.com/cgi-bin/mycgi
But you open the following URL instead:
http://mysite.com/cgi-bin/mycgi/Bob/27
The program "mycgi" will still be executed -- and the environment
variable PATH_INFO will contain the text /Bob/27. You can take
advantage of this by always outputting URLs that contain the state
information you are trying to keep from one call to the next.
Keep in mind that URLs are limited to 1024 characters; browsers
are not required to cope with more than that. If you need more,
or dislike long URLs, simply keep the name of a temporary file
in the PATH_INFO section of the URL and store information about
that session in the temporary file.
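The PATH_INFO technique can be sketched in a few lines of Python. The two helper names below are invented for this example, and it assumes the state values contain no slashes: one function splits the extra path text into its pieces, and the other builds a URL that carries the state to the next call.

```python
import os
from urllib.parse import quote


def state_from_path_info(environ=os.environ):
    """Split PATH_INFO such as '/Bob/27' into the list ['Bob', '27']."""
    path_info = environ.get("PATH_INFO", "")
    return [part for part in path_info.split("/") if part]


def url_with_state(script_url, *values):
    """Append state values to the CGI program's URL, one path segment each."""
    return script_url + "".join("/" + quote(str(v), safe="") for v in values)
```

With the URLs used above, `url_with_state("http://mysite.com/cgi-bin/mycgi", "Bob", 27)` produces the second URL, and the program then recovers the two values from PATH_INFO.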
Using HTTP "Cookies"
"Cookies" are a new mechanism, proposed by Netscape, which allows
the browser to keep state information supplied to it by the server.
The next time a request is made for a URL in a particular portion
of the server, the relevant "cookie" will be sent to the server
as part of the request! Cookies are currently implemented by Netscape
and by Microsoft's Internet Explorer (2.0). By the time you read
this more browsers may support them. But it is best to ensure
that your pages are still usable without them.
For example, your CGI program might output the following to set
a cookie. (Note that the Set-Cookie header must appear in its
entirety on one line.)
Content-type: text/html
Set-Cookie: cookiename=valueofcookie; expires=Wednesday, 28-Feb-96 23:59:59 GMT; path=/cgi-bin/mycgiprogram

<h1>Web page follows.</h1>
This sets a cookie which will always be sent _back_ to your server
along with every request for a document on your server with a
local URL beginning with /cgi-bin/mycgiprogram. The cookie will
continue to be sent until the expiration time. The expiration
time should be set using Greenwich Mean Time as shown above, but
note that the browser may have a poor idea of the local time zone.
For that reason it is best to set cookies to expire at least 24
hours in the future.
When your CGI program is accessed again by the user, the cookies
sent by the browser will appear in the HTTP_COOKIE environment
variable. Each cookie will appear as a NAME=VALUE pair; pairs
will be separated by a semicolon followed by optional white space.
As with form submissions, unusual characters in cookies should
be escaped using the %xx notation (% followed by two hexadecimal
digits specifying the ASCII code of the character).
See Netscape's Cookie Specification Page [URL:http://www.netscape.com/newsref/std/cookie_spec.html]
for more detailed and precise information.
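Here is a minimal sketch, in Python, of both halves of the exchange described above: building the Set-Cookie header line and reading the pairs back out of HTTP_COOKIE. The function names are assumptions of this example, and the parsing is done by hand to keep the steps visible; it is not a complete implementation of the specification.

```python
import os
from urllib.parse import quote, unquote


def set_cookie_header(name, value, expires, path):
    """Build a single-line Set-Cookie header, escaping the value as %xx."""
    return "Set-Cookie: %s=%s; expires=%s; path=%s" % (
        name,
        quote(value, safe=""),
        expires,
        path,
    )


def parse_cookies(environ=os.environ):
    """Return the NAME=VALUE pairs from HTTP_COOKIE as a dictionary."""
    cookies = {}
    for pair in environ.get("HTTP_COOKIE", "").split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = unquote(value)  # undo the %xx escaping
    return cookies
```

If HTTP_COOKIE is not set, `parse_cookies` returns an empty dictionary, which makes it easy to keep the page usable for browsers that do not support cookies.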
HOW CAN I IDENTIFY THE USER WHO IS ACCESSING MY CGI SCRIPT?
Five important environment variables are available to your CGI
script to help in identifying the end user.
HTTP_FROM
- This environment variable is, theoretically, set to the e-mail
address of the user. However, many browsers do not set it at all,
and most browsers that do support it allow the user to set any
value for this variable. As such, it is recommended that it be
used only as a default for the reply e-mail address in an e-mail
form.
REMOTE_USER
- This variable is only set if secure authentication was used to access
the script. The AUTH_TYPE variable can be checked to determine
what form of secure authentication was used. REMOTE_USER will
then contain the name the user authenticated under. Note that
REMOTE_USER is only set if authentication was actually used, and
is not supported by all web servers. Authentication may unexpectedly
fail to happen under the NCSA server if the method used for the
transaction is not listed in the access.conf file (i.e., <Limit
GET POST> should be set rather than the default, <Limit GET>).
REMOTE_IDENT
- This variable is set if the server has contacted an IDENTD server
on the client machine. This is a slow operation, usually turned
off in most servers, and there is no way to ensure that the client
machine will respond honestly to the query, if it responds at
all.
REMOTE_HOST
- This variable will not identify the user specifically, but does
provide information about the site the user has connected from,
if the hostname was retrieved by the server. In the absence of
any certainty regarding the user's precise identity, making decisions
based on a list of trusted addresses is sometimes an adequate
workaround. This variable is not set if the server failed to look
up the host name or skipped the lookup in the interest of speed;
see REMOTE_ADDR below. Also keep in mind that you may see all
users of a particular proxy server listed under one hostname.
REMOTE_ADDR
- This variable will not identify the user specifically, but does
provide information about the site the user has connected from.
REMOTE_ADDR will contain the dotted-decimal IP address of the
client. In the absence of any certainty regarding the user's precise
identity, making decisions based on a list of trusted addresses
is sometimes an adequate workaround. This variable is always set,
unlike REMOTE_HOST, above. Also keep in mind that you may see
all users of a particular proxy server listed under one address.
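A CGI script can gather whatever hints happen to be available with very little code. The following Python sketch is only an illustration; the helper names are made up for this example. It collects the five variables that are actually set and falls back from REMOTE_HOST to REMOTE_ADDR when the server skipped the hostname lookup.

```python
import os

# The five variables discussed above, from least to most reliably set.
IDENTITY_VARIABLES = (
    "HTTP_FROM",
    "REMOTE_USER",
    "REMOTE_IDENT",
    "REMOTE_HOST",
    "REMOTE_ADDR",
)


def identity_hints(environ=os.environ):
    """Return only those identity variables that are set and non-empty."""
    return {name: environ[name] for name in IDENTITY_VARIABLES if environ.get(name)}


def best_site_name(environ=os.environ):
    """Prefer the hostname, fall back to the IP address, else 'unknown'."""
    return environ.get("REMOTE_HOST") or environ.get("REMOTE_ADDR", "unknown")
```

None of these values proves who the user is; at best they narrow down the site the request came from.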
MY CGI SCRIPTS DON'T WORK. HOW CAN I DEBUG THEM?
Several common causes are described here. Note that every web
server is different; your mileage will almost certainly vary.
In particular, Windows and Macintosh servers differ drastically
from Unix servers. See your server's documentation.
HOW CAN I MAKE SURE MY CGI-GENERATED PAGE IS NOT CACHED BY THE
CLIENT?
If your CGI-generated page is intended to produce completely different
content on each access, it is important to convince the web client
_not_ to display a cached copy the next time the user accesses
it.
One workaround is to make sure that all links the CGI program
generates to itself contain a unique, random piece of information
which is then ignored by the program when it arrives as part of
the PATH_INFO environment variable. But this is not ideal, since
the user will still see the same output again upon returning to
a bookmark.
However, consider the following alternatives:
Some browsers support the Pragma: no-cache header. In this case,
the following output at the beginning of your CGI program will
specify both the content type and the fact that the page should
never be cached:
Content-type: text/html
Pragma: no-cache
Note the two carriage returns at the end, always required before
the beginning of the actual document.
Alternatively, if the page is "good" for some fixed amount of
time, the "Expires:" HTTP header can be used to specify the time
after which the page must be fetched again. _Important:_ The Greenwich
Mean Time (GMT) must be specified, not the local time.
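Both headers can be produced by a couple of small helpers. This is a minimal sketch in Python, with function names invented for the example; it uses the standard library's `formatdate` to write the expiration time in GMT rather than local time, and ends each header block with the required blank line.

```python
from email.utils import formatdate


def no_cache_headers():
    """Headers asking the browser never to reuse a cached copy."""
    return "Content-type: text/html\nPragma: no-cache\n\n"


def expiring_headers(now, lifetime_seconds):
    """Headers declaring the page good for lifetime_seconds after now.

    now is a time in seconds since the epoch, e.g. from time.time().
    """
    expires = formatdate(now + lifetime_seconds, usegmt=True)
    return "Content-type: text/html\nExpires: %s\n\n" % expires
```

A CGI program would write one of these strings to standard output first, followed immediately by the document.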
HOW CAN USERS SEND ME COMMENTS AND/OR EMAIL?
There are two ways:
Using a mailto: URL
- You can simply create a link which looks like this:
<A HREF="mailto:me@my.address">Send Me Mail</A>
This works great for browsers that support the mailto: URL. Perhaps
80% of web users will be able to use such a link. But not all
browsers support it.
Installing a comment form
- If you have access to the server's configuration files, or if
your server administrator permits users to create their own CGI
scripts, you can create a form which sends mail to you from any
browser that supports forms. A really flexible package for this
is the mit-dcns-cgi package (URL is [URL:http://web.mit.edu/wwwdev/www/dist/mit-dcns-cgi.html]
). I've written a simple e-mail forms package (URL is [URL:http://www.boutell.com/e-mail/]
), which does it in ANSI C. There is also a package written in
Perl, known as the WWW Mailto Gateway (URL is [URL:http://www.mps.ohio-state.edu/mailto/mailto_info.html]
). GetComments (URL is [URL:http://seclab.cs.ucdavis.edu/~hoagland/])
is a more general package, also written in Perl, which can do
many different things in response to a form submission. Tcl programmers
may wish to try J.M. Ivler's TCL mail forms package [URL:http://www.crl.com/~ivler/jmi.html].
InfoMania offers a tool called Uniform which automatically formats
e-mail based on the input received from a form posting. This "one
size fits all" CGI program is a convenient alternative to writing
custom CGI programs. [URL:http://www.mornini.com/]
Macintosh users should check out forms.acgi [URL:http://www.biola.edu/cgi-bin/forms/],
a comment-handling package for Macintosh web servers.
If you want to learn how these forms actually work, see the entry
on CGI scripts.
WHERE CAN I LEARN HOW TO CREATE FILL-OUT FORMS?
Writing an HTML form is easy, but the form doesn't accomplish
anything until you write a CGI program to interpret the results
on the server side! For more information, see the section on CGI
scripts.
_"I know how to write CGI programs, I just don't know how to write
forms in HTML."_
See the w3 consortium pages [URL:http://www.w3.org/hypertext/WWW/MarkUp/html-spec/]
for the HTML specification. Also consult virtually any book about
HTML for information about forms. A good HTML reference guide
is a worthwhile investment.
See the section on e-mail forms for a simple solution to the most
commonly desired form.
HOW CAN I CREATE DECENT-LOOKING TABLES AND STOP USING <PRE>?
Tables are a standard feature in HTML Level 3, a new version of
HTML. Unfortunately, not all browsers implement them, although
they are supported by the latest versions of Netscape, NCSA Mosaic,
and Viola.
There is a way to use HTML Level 3 tables while writing your pages
and convert them automatically to HTML 2.0, allowing you to design
proper tables and install those pages directly when table support
arrives in whatever clients your users prefer. You can do this
using the html+tables package, by Brooks Cutter (bcutter@paradyne.com),
which is available for anonymous ftp from sunsite.unc.edu in the
directory pub/packages/infosystems/WWW/tools/html+tables.shar.
This package requires Perl, a scripting language which is primarily
used on Unix systems but is also available for other systems (such
as MSDOS machines). html+tables accepts HTML Level 3 and outputs
HTML using the <PRE> construct to represent tables, allowing you
to write HTML Level 3 now, knowing that it will look better when
clients are ready for it. (This is less of an issue now that table
support is becoming widespread in better browsers.)
HOW CAN I USE INLINE IMAGES WITHOUT ALIENATING MY USERS?
If you pay any attention to comments from users of your web pages,
you will quickly learn that 500K GIFs are only pretty to the four
or five users who have a personal T1 line. I'm exaggerating, but
not all that much. It's astonishing how many web site producers
have never tested their site through one of the 14.4kbps modems
(that's only 1600 bytes per second on a good day, remember) that
the _actual customer_ is using.
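The arithmetic is easy to check. The sketch below uses the 1600 bytes per second figure quoted above and assumes that "500K" means 500 * 1024 bytes; the function name download_seconds is invented for this example.

```python
def download_seconds(size_bytes, bytes_per_second=1600):
    """Estimate how long a file takes to arrive at a given transfer rate."""
    return size_bytes / bytes_per_second

if __name__ == "__main__":
    size = 500 * 1024  # a "500K" GIF, assuming K means 1024 bytes
    print("%.0f seconds" % download_seconds(size))
```

At that rate the image takes 320 seconds, which is more than five minutes of waiting.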
But inline images can be useful, provocative and amusing. What
can be done to make them available to those who can wait for them
and unobtrusive to those who can't?
- _ALWAYS_ provide alternatives to imagemaps. Even users who run
Netscape often turn off image loading or don't want to wait long
enough for an interlaced GIF to become recognizable on their screen
in order to navigate your site. Always provide a set of text-based
links to the same destinations.
- Keep image file sizes modest. For ways to make your images download
faster _without_ throwing away image quality, see the guidelines
maintained by the Bandwidth Conservation Society (URL is [URL:http://www.infohiway.com/way/faster]
).
- Provide a text-only page. If you follow the guidelines above, you
may not need to provide a text-only version of your page, but
if you insist on having an image-heavy page, provide a plaintext
page as well. Please consider the needs of blind users as well
as those with limited bandwidth, and keep in mind that nearly
_all_ your users are in the latter category and will be for several
years yet!
- Use width and height <IMG> attributes. Many browsers, especially
Netscape, support the width and height attributes to the <IMG>
tag. By indicating the size of your image in the <IMG> tag, you
let the browser format the entire page without waiting for that
image to start downloading. This allows the user to read your
page much sooner and makes images much less annoying. _Tip:_ although
Netscape supports scaling the image by specifying a width and
height that do not actually match the image size, using this feature
is not recommended. Not all otherwise compatible browsers handle
this scaling feature, and some older versions of Netscape have
trouble with it also.
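If you have many images, the width and height values can be read from the files rather than typed by hand. The following Python sketch reads the two 16-bit size fields from a GIF header and writes a matching tag. The function names gif_size and img_tag are made up for this example, and it assumes the logical screen size stored in the header is the size of the image, which is the usual case for single-image GIFs.

```python
import struct

def gif_size(data):
    """Return (width, height) read from the header of a GIF file."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # Bytes 6-9 hold the width and height as little-endian 16-bit integers
    return struct.unpack("<HH", data[6:10])

def img_tag(src, data):
    """Build an <IMG> tag that states the true size of the image."""
    width, height = gif_size(data)
    return '<IMG SRC="%s" WIDTH=%d HEIGHT=%d>' % (src, width, height)
```

For example, given the contents of a 120 by 45 pixel file named logo.gif, img_tag("logo.gif", data) produces a tag with WIDTH=120 and HEIGHT=45.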
HOW CAN I CREATE ANIMATIONS IN MY WEB PAGE?
There are now several ways to create animations in a web page.
These include server push, GIF89 multi-part image files, MPEG
and QuickTime plug-ins, Shockwave, and various Java applets designed
to present animations.
At the moment, nearly all of these options are limited to various
versions of the Netscape browser only. Right now, the best way
to create simple animations that can be seen by any user of Netscape
2.0 for any platform without breaking other browsers too severely
is probably GIF89 animation. This is an especially good solution
because these multi-part GIF files simply display their first
frame (and do not animate) on browsers that don't support animation.
This works because the GIF89 standard has included the little-used
multi-image feature for a long time and existing GIF viewers know
how to at least ignore the later frames. This is an easy way to
achieve backwards compatibility with older browsers.
One tool that can be used to create such GIF89 animations on the
PC platform is GIFMake [URL:http://www.fastlane.net/~samiel].
Tools are also available for other platforms (submissions welcome!).
One note of warning: Netscape always delays one-tenth of a second
between frames, regardless of the frame delay you set in your
GIF89 file.
HOW CAN I DISTRIBUTE AUDIO THROUGH THE WEB?
Not all web browsers have audio support built-in, but nearly all
can launch external "viewers" to handle audio. These player programs
are widely available as freeware or shareware for most architectures
(or standard with your operating system).
Audio is a particularly thorny case owing to the need to download
the entire audio program before it can be heard. Fortunately,
there are now systems available which avoid this problem by playing
the audio as it is downloaded.
RealAudio
- By Progressive Networks (URL is [URL:http://www.realaudio.com]).
The RealAudio player can communicate with a specialized RealAudio
server in order to play back audio as it is downloaded, eliminating
download delays even over long distances and slow modems. RealAudio
now supports a variety of quality levels and non-audio features
such as HTML pages displayed in synchronization with the audio.
RealAudio players are available for Microsoft Windows, the Macintosh,
and several Unix platforms.
StreamWorks
- By Xing Technology (URL is [URL:http://www.xingtech.com/]). StreamWorks
also provides streaming audio playback, again in conjunction with
a special StreamWorks server. Video is also available and can
be played back over modems at reduced frame rates. StreamWorks
players are available for Microsoft Windows, the Macintosh and
several Unix platforms.
Winplay
- Winplay [URL:http://www.iis.fhg.de/departs/amm/layer3/winplay3]
is unusual in that it offers very high-quality audio using MPEG
Level 3 compression, which is currently not available in other
products. For Windows only at the time of this writing. Winplay
does not currently offer a specialized streaming server.
VocalTec
- VocalTec [URL:http://www.vocaltec.com/] also offers streaming
audio technology for the web. VocalTec's Internet Wave product
is available for the Microsoft Windows platform only.
HOW CAN I GENERATE GIFS ON THE FLY FROM MY CGI SCRIPTS?
If you want to generate GIF images on the fly as part of your
application, examine the gd library (URL is: http://www.boutell.com/gd/
). _Hint:_ your HTML page and your inline images are separate
documents with separate URLs. Generate them in response to separate
requests! (Yes, there are tricks to speed this up, but be careful
not to break inline images on HTML pages you didn't write that
refer to your gd-generated image.)
Adaptations of gd are available for Tcl, Perl, Python, and other
languages. See the gd page, listed above, for more information.
Macintosh users should look into clip2gif (available by anonymous
FTP from orathost.cfa.ilstu.edu in the directory /public/oratClasses/ART389.88Seminar/software
and from many other sites including Info-Mac in the directory
gst/grf). clip2gif supports assembling images from other images
via AppleScript.
Perl users may also be interested in pgperl [URL:http://www.ast.cam.ac.uk/~kgb/pgperl.html],
an extended version of Perl which supports GIF output and can
be used to good effect in CGI applications.
It's also possible to use gnuplot and the pbmplus utilities. These
approaches are slower, but can require less programming if gnuplot
is sufficient for your purposes. (See archie for both tools.)
WHAT IS HTML LEVEL 3 AND WHERE CAN I LEARN MORE ABOUT IT?
HTML Level 3, formerly known as HTML+, is an enhanced version
of HTML designed to address some of the limitations of HTML. HTML
Level 3 supports true tables, right-justified text, centered text,
line breaks that do not double space, and many other desired features.
However, most clients support only a handful of HTML Level 3 features
at the time of this writing. The most commonly implemented major
feature is table support. If you have access to a Unix system
with the X Window System installed, you can try out many features
of HTML Level 3 using the experimental Arena browser.
You can access information about new developments in HTML at the
CERN server (at the URL http://www.w3.org/hypertext/WWW/MarkUp/MarkUp.html).
(HTML Level 1 is the original version. HTML Level 2 is essentially
the same, but with the addition of forms.)
HOW DO I COMMENT AN HTML DOCUMENT?
Place <!-- at the beginning of EACH line commented out; close
this for EACH line with --> . Note that comments do not nest,
and the sequence "--" may not appear inside a comment except as
part of the closing --> tag.
You should not try to use this to "comment out" HTML that would otherwise be
shown to the user, since some browsers (notably Mosaic) will still
pay attention to tags inside the comment and close it prematurely.
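The per-line rule is simple enough to automate for plain notes (not for hiding markup, for the reason just given). This is a small Python sketch; the function name comment_lines is an assumption made for the example.

```python
def comment_lines(text):
    """Wrap each line of a plain-text note in its own HTML comment."""
    commented = []
    for line in text.splitlines():
        # "--" may only appear as part of the closing delimiter
        if "--" in line:
            raise ValueError("'--' may not appear inside a comment: %r" % line)
        commented.append("<!-- %s -->" % line)
    return "\n".join(commented)

if __name__ == "__main__":
    print(comment_lines("Last revised by the webmaster\nCheck links monthly"))
```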
HOW DO I SET UP A CLICKABLE IMAGE MAP?
There are really two issues here: how to indicate in HTML that
you want an image to be clickable, and how to configure your server
to do something with the clicks returned by Netscape, Mosaic,
and other clients capable of delivering them. Client side imagemaps,
a new feature of Spyglass Mosaic, Microsoft Internet Explorer
and Netscape (2.0 and beyond), change this picture by allowing
imagemaps to be created as part of an HTML document without the
need for special server software.
One of the best resources available on the subject is the Imagemap
Help Page [URL:http://www.hway.net/ihip/], maintained by Steve
Rogers.
You can also read about image maps and the NCSA server at [URL:http://hoohoo.ncsa.uiuc.edu/docs/tutorials/imagemapping.html].
Also see Joseph Walker's collection of imagemap resources (URL
is [URL:http://www.ncsu.edu/bae/people/faculty/walker/hotlist/imagemap.html]).
Using imagemaps requires that you create a map file; you can do
this by hand or with a WYSIWYG tool.
VERY IMPORTANT: Creating server-side imagemaps, which all web browsers understand,
requires a real web server (not an FTP server) and a cooperative
web server administrator. _It is not usually as simple as wrapping
a link around an IMG SRC tag and adding the ISMAP directive;_
the server must also be told about the map file, and the way to
accomplish this varies from server to server. So _read your server
documentation,_ and don't waste time making maps before making
sure you have the necessary tools to deliver them.
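To show what the server has to do with a click, here is a simplified Python sketch of the lookup step. It is not the map file format or code of any particular server: the region list, the function name resolve_click and the URLs are hypothetical, and only rectangles are handled.

```python
def resolve_click(x, y, regions, default_url=None):
    """Return the URL of the first rectangle containing the click, if any."""
    for (left, top, right, bottom), url in regions:
        if left <= x <= right and top <= y <= bottom:
            return url
    # The click landed outside every region
    return default_url

if __name__ == "__main__":
    # Hypothetical map: two side-by-side buttons on a 200x50 image
    regions = [
        ((0, 0, 99, 49), "/products.html"),
        ((100, 0, 199, 49), "/contact.html"),
    ]
    print(resolve_click(150, 20, regions, default_url="/index.html"))
```

Real servers read the regions from a map file and usually support circles and polygons as well, which is why the server documentation matters.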
What about "client-side" imagemaps?
Several of the newer browsers, notably Microsoft Internet Explorer,
Spyglass Mosaic and Netscape 2.0 and later, support "client-side"
imagemaps. This is a Good Thing, because the imagemap is part
of the HTML page and a server need not be involved. However, keep
in mind that not every user has a browser that supports such imagemaps.
Client-side imagemaps can be used side by side with old-fashioned
server-side imagemaps, which are used if the browser does not understand
the newer type. Many of the imagemap editors listed below support
the creation of both types of maps.
Programs to Edit Imagemaps
- Mapedit
The author's Mapedit (URL is [URL:http://www.boutell.com/mapedit/])
is a straightforward WYSIWYG imagemap editing tool for both Microsoft
Windows and the X Window System. Versions 2.0 and later support
client-side imagemaps, will open GIF, JPEG and PNG-format images,
and offer new editing controls. Mapedit makes it particularly
easy to create client-side imagemaps by opening existing HTML
documents directly. Shareware.
- Map THIS
Map This (URL is: http://galadriel.ecaetc.ohio-state.edu/tc/mt)
is a feature-laden WYSIWYG imagemap editing tool for Microsoft
Windows 32-bit environments (Win32s, Windows 95 or Windows NT
required; Win32s is available from Microsoft's FTP site, ftp.microsoft.com,
among other places). Supports both client-side and server-side
imagemaps. Free.
- Web Hotspots
Web Hotspots (URL is [URL:http://www.hooked.net/users/1auto] )
is a feature-rich imagemap editor for all Windows systems, supporting
zoom, advanced shape manipulation, and multiple-document interface.
The latest version has support for client-side imagemaps, subtracting
interior regions from hotspots, live testing if you are connected
to the net, and other high-end features. Shareware.
- HoTTmapP
Another WYSIWYG imagemap editor for Windows. Features permanent
associations between images and map files for convenient reopening
and manipulation of existing shapes. The capability to merge data
from multiple MAPs is also provided. See [URL:http://www.tikipub.com/jc/]
for more information.
- MapMaker
For users of John Bradley's _xv_ image display software for the
X Window System, Mapmaker can turn the miniature images created
by xv's Visual Schnauzer into an imagemap. This is useful if you
would like to make an entire directory of images available (but
note that you should also make textual links to allow those with
text-based browsers to download the images for external viewing).
(URL is: http://icg.stwing.upenn.edu:80/~mengwong/mapmaker.html
)
- MacMapMaker
On the Macintosh, you may want to use MacMapMaker [URL:ftp://ftp.uwtc.washington.edu/pub/Mac/Network/WWW].
MacMapMaker produces both NCSA and CERN-compatible maps, which
can also be used with MacImagemap and a Macintosh-based server
(MacImagemap is found in the same directory).
- WebMap
The original Macintosh imagemap editor. [URL:http://www.city.net/cnx/software/webmap.html]
- Tkmapedit
For Unix systems and other systems on which the Tk/Tcl language
toolkit has been installed, Tkmapedit provides a WYSIWYG imagemap
editor which is capable of directly testing links if the tkWWW
web browser is available. Available by anonymous FTP from the
TCL archive on ftp.aud.alcatel.com.
- glorglox
For Unix systems, glorglox is a unique imagemapping tool which
allows color indexes in GIF images to be associated with URLs.
It's easier to use this than to describe it (or pronounce it),
so check out the glorglox home page (URL is [URL:http://www.uunet.ca/~tomr/glorglox/]).
HOW CAN I MAKE TRANSPARENT AND INTERLACED GIFS? AND WHAT ARE THEY?
Transparent GIFs are useful because they appear to blend in smoothly
with the user's display, even if the user has set a background
color that differs from the one the developer expected. They do this
by assigning one color to be transparent -- if the web browser
supports transparency, that color will be replaced by the browser's
background color, whatever it may be.
Interlaced GIFs appear first with poor resolution and then improve
in resolution until the entire image has arrived, as opposed to
arriving linearly from the top row to the bottom row. This is
great to get a quick idea of what the entire image will look like
while waiting for the rest. This doesn't do much for you if your
web browser doesn't support progressive display as the image is
downloaded, but non-progressive-display web browsers will still
display interlaced GIFs once they have arrived in their entirety.
You can make transparent and interlaced GIFs through the web without
running any utility software on your own system through the Visioneering
image manipulation page (URL is [URL:http://www.vrl.com/Imaging/]),
which will access your image through the web and produce an enhanced
version for you to save.
To create transparent and interlaced GIFs under Unix, check out
David Koblas' giftool, a program which can manipulate those options
and many more aspects of your GIF file.
For Windows PCs, try Lview Pro, version 1A or later, available
by anonymous FTP from oak.oakland.edu in the directory SimTel/win3/graphics:
[URL:ftp://oak.oakland.edu/SimTel/win3/graphics]
As well as from many mirror sites.
You can also create transparent and interlaced GIFs using the
widely available NETPBM tools (an enhanced version of the older
pbmplus tools, which do _not_ support these options). The following
Unix shell script, contributed by Shane Castle, can make any GIF
image transparent if a recent version of the netpbm utilities
has been installed:
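#!/bin/sh
# transparize: make one color of a GIF transparent and interlace the image.
# Requires a recent version of the netpbm utilities.
if [ $# -lt 2 ]
then
    echo "Usage: transparize gifname color"
    echo "  gifname - name of GIF file"
    echo "  color   - color ID to make transparent"
    exit 1
fi
# Convert to PPM and back to GIF, writing the result to a temporary file
giftoppm "$1" | ppmtogif -interlace -transparent "$2" > /tmp/$$.gif
if [ $? -eq 0 ]
then
    # Success: replace the original image
    mv /tmp/$$.gif "$1"
else
    # Failure: discard the partial output and keep the original
    rm -f /tmp/$$.gif
fi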
Make the script executable using the chmod command. Usage is as
follows:
transparize [image.gif] [transparent-color]
In addition, there is a document explaining transparent GIFs available
at the URL http://melmac.corp.harris.com/transparent_images.html
. You can fetch the program giftrans by anonymous ftp from ftp.rz.uni-karlsruhe.de
at the path /pub/net/www/tools/giftrans.c.
There is also a Perl Script (URL is: [URL:http://www.wg.omron.co.jp/~jfriedl/perl/#transgif]
) which makes transparent GIFs.
There are also several utilities for the Macintosh: Transparency
([URL:http://www.med.cornell.edu/~giles/projects.html#transparency]
), Graphic Converter (available from the "usual Macintosh FTP
sites", such as mac.archive.umich.edu; see the Macintosh newsgroups
for general information on where to retrieve Macintosh software),
Imagery (again, available from many Macintosh FTP sites), and
clip2gif (available by anonymous FTP from orathost.cfa.ilstu.edu
in the directory /public/oratClasses/ART389.88Seminar/software
and from many other sites including Info-Mac in the directory
gst/grf).
A unique approach to the problem is offered by Imagizer (URL is
[URL:http://www.minet.com/minet/imagizer.html] ), which transforms
your images on the fly when sending them to the user, supporting
thumbnails and TIFF-GIF conversion as well as interlacing. (Of
course, there is a tradeoff between storage space and CPU usage.)
WHY DO MY TRANSPARENT GIFS LOOK (GRAINY, CHUNKY, NOT SO TRANSPARENT)...
This is usually due to a browser bug. Make them non-transparent
or accept that a particular browser will not display them very
well. In particular, some browsers do not display transparent
GIFs well in 16-color modes under Windows. You may also wish to
respectfully suggest to users of 16-color Windows that they try
switching to a 256-color mode. Most graphics cards support 256
colors, but a surprising number of machines are configured for
16 colors instead.
WHICH FORMAT IS BETTER FOR WWW IMAGE PURPOSES, JPEG OR GIF?
Almost all browsers can view inline JPEG, and there are free libraries
available to do that, so the remaining browser vendors are _very
short on excuses._ There is no need to avoid inline JPEG any longer.
So the proper question is, which format is better for your specific
purpose?
JPEG is for photographic images. GIF is for line-art images, such
as icons, graphs and line-art logos. You will very likely find
that JPEG produces smudgy line art and GIF produces large and
washed-out photographs. Use them accordingly.
However, _never convert GIF to JPEG_ if you can possibly help
it. Once your photograph has been reduced to the mere 256 colors
supported by GIF, it's too late. Go straight from a lossless 24-bit
format supported by your scanner, such as TIFF or PNG, to JPEG.
Since JPEG is an approximate representation of the image, you
shouldn't save things as JPEG and then edit them further later
and save them again. You can expect progressive loss of quality
each time you do that, especially with different JPEG quality
settings. If you must edit a photographic image, work with it
in TIFF or PNG format until it is ready for publication, then
convert it to JPEG for the web.
If your images can't tolerate being reduced to 8 bits for GIF
_or_ losing precise accuracy for JPEG, TIFF and PNG are your best
options. Web browsers are beginning to support the latter, and
many external viewers support both. The vast majority of web sites
should be using GIF for line art and JPEG for everything else,
and migrating from GIF to PNG as browser support for PNG becomes
more widely available.
Also see the Independent JPEG Group's JPEG FAQ [URL:http://www.cis.ohio-state.edu/hypertext/faq/usenet/jpeg-faq/top.html]
for more information about JPEG and software that can produce
JPEG-format images, including progressive JPEGs.
WHAT'S A PROGRESSIVE JPEG? HOW CAN I MAKE ONE?
"Progressive" JPEG is a new variation on the JPEG image format.
(This document also offers a comparison of the JPEG and GIF formats.)
Progressive JPEGs are like interlaced GIFs in that they "fade
in" gradually instead of being drawn from top to bottom. This
enables the viewer to understand what the image is about very
quickly.
Progressive JPEG is much smoother than interlaced GIF.
There is one problem: most web browsers _don't support progressive
JPEG yet._ This means that progressive JPEG images will _not_
display in those browsers at all (they will appear to be "broken").
Netscape's Netscape Navigator 2.0 beta 1 and Spyglass' Enhanced
Mosaic 2.1f5 and later _do_ support progressive JPEG.
A note to browser authors: the Independent JPEG Group library supports progressive JPEG,
so get off your butts and implement it! This is impressive stuff,
and there are no fees to use the technology.
"OK, I understand that not everyone can see progressive JPEGs
yet. How do I make these nifty new images?"
InTouch Technology [URL:http://www.in-touch.com/pjpeg.html] offers
an informative page on the subject, including information about
their Image Transmogrifier software, which can produce progressive
JPEG images.
Also see the Independent JPEG Group's JPEG FAQ [URL:http://www.cis.ohio-state.edu/hypertext/faq/usenet/jpeg-faq/top.html]
for more information about JPEG and software that can produce
JPEG-format images, including progressive JPEGs.
CAN I BUY SPACE ON AN EXISTING SERVER?
Yes, you can. A list of sites offering WWW space for lease is
available (at the URL http://union.ncsa.uiuc.edu/HyperNews/get/www/leasing.html
).
HOW DO I MAKE A "LINK" THAT DOESN'T LOAD A NEW PAGE?
Such links are useful when a form is intended to perform some
action on the server machine without sending new information to
the client, or when a user has clicked in an undefined area in
an image map; these are just two possibilities.
A CGI script (see the CGI section) can accomplish this by outputting
just the following:
Status: 204 No Content
Followed by two line feeds (ASCII 10 decimal). The web browser
will take no action.
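As a minimal sketch, a CGI script written in Python could produce that output as follows. The function name no_content_response is an assumption made for the example.

```python
import sys

def no_content_response():
    """Return CGI output telling the browser to stay on the current page."""
    # The status line, then a blank line: two line feeds in a row
    return "Status: 204 No Content\n\n"

if __name__ == "__main__":
    sys.stdout.write(no_content_response())
```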
HOW CAN I REDIRECT THE BROWSER TO A DIFFERENT PAGE?
It is possible to redirect the browser to a different URL, effectively
"forwarding the call" to a different page. To do this, either
take advantage of the redirection features offered by your web
server or write a CGI program which outputs the following:
Location: http://desiredsite.com/desiredpath/document.html
Note that two line breaks must follow this line.
A few older browsers may have difficulty following such directives.
You can combat this problem by outputting a short page of HTML
to the user after the above information, explaining that the page
has moved.
There are also a few browsers which expect to see a URI: header
as well as a Location: header. If you wish to be agonizingly thorough,
output both headers before the double line break.
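A rough Python sketch of such a CGI program follows, including the short fallback page for older browsers. The function name redirect_response and the wording of the page are assumptions. The Content-type line is there because CGI output that carries a body needs one, and the URI: header is left out because its exact format is not given here.

```python
import html
import sys

def redirect_response(url):
    """Build CGI output that redirects to url, with a fallback page."""
    link = html.escape(url, quote=True)
    body = (
        "<HTML><HEAD><TITLE>Document moved</TITLE></HEAD>\n"
        '<BODY>This page has moved to <A HREF="%s">%s</A>.</BODY></HTML>\n'
        % (link, link)
    )
    # Header lines, the blank line that ends them, then the fallback HTML
    return "Location: %s\nContent-type: text/html\n\n%s" % (url, body)

if __name__ == "__main__":
    sys.stdout.write(
        redirect_response("http://desiredsite.com/desiredpath/document.html"))
```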
HOW CAN THE USER DOWNLOAD BINARIES (SUCH AS .ZIP AND .EXE FILES)
FROM MY SERVER?
There is no specialized <DOWNLOAD> tag in HTML. Just do two things:
link to the binary in question using a normal <A HREF=...> tag,
and make sure that your web server is configured to output a reasonable
content type for .zip, .ZIP, .exe and .EXE files.
You probably do not want to edit the mime.types file of your server,
because new versions of this file regularly become available with
new versions of the web server. Instead, under the NCSA server
and its derivatives, use the AddType and AddEncoding commands
in your server configuration file. After making a change to your
configuration files, always signal the server process to reexamine
those files by using the kill -1 command.
Consider adding the following lines:
AddType application/zip zip
AddType application/zip ZIP
AddType application/octet-stream exe
AddType application/octet-stream EXE
In general, the content type application/octet-stream is an excellent
choice when there is no appropriate "external viewer." A typical
browser will then prompt the user to save the file. However, if
there is a more appropriate content type, you should of course use that
type instead.
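For CGI programmers who serve files themselves, the same idea can be sketched in a few lines of Python. This is not how the NCSA server is implemented; the table CONTENT_TYPES and the function name content_type_for are assumptions made for the example.

```python
import os

# Hypothetical table mirroring the AddType lines above
CONTENT_TYPES = {
    ".zip": "application/zip",
    ".exe": "application/octet-stream",
}

def content_type_for(filename):
    """Pick a content type from the file extension, ignoring case."""
    extension = os.path.splitext(filename)[1].lower()
    # Fall back to the generic download type for unknown extensions
    return CONTENT_TYPES.get(extension, "application/octet-stream")

if __name__ == "__main__":
    print(content_type_for("ARCHIVE.ZIP"))
```

Lowercasing the extension plays the role of the paired zip/ZIP and exe/EXE lines in the server configuration.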
On occasion, users may have encountered very badly behaved servers
which encourage users to set up a specialized external viewer
for the application/octet-stream content type. This makes life
difficult for everyone. _Programmers: please don't encourage users
to configure an external viewer for application/octet-stream.
That content type should be reserved for downloads. If you have
created an external viewer for a brand-new form of information,
invent a new and appropriate content type for your application's
data and configure your server to output that content type. Make
your content type known to the public and to the authors of web
servers so it can be added to the mime.types file._
"How do I suggest a filename?"
To encourage the user to save the file under an appropriate filename,
do the following:
- Tell the user exactly what name to save the file under, in case
the web browser is not cooperative and suggests an absurd filename
or no filename at all.
- Make sure the URL the user is accessing actually ends in a reasonable
filename. (This warning is intended primarily for CGI programmers.
Under normal circumstances this is not a problem.)
"So I can use http to download binaries?"
Yes, and practically all browsers are bright enough to save them
properly if you follow the suggestions above. It is not necessary
to use the ftp protocol for binary downloads.
HOW CAN I MIRROR PART OF ANOTHER SERVER?
Scripts are available to do this, but at this time they are not
very friendly to the server you are attempting to mirror; their
behavior resembles that of the more poorly written WWW robots.
If you are trying to improve access times to a distant server,
you will likely find the "proxy" capabilities of CERN's WWW server
to be a more effective and general solution to your problem.
If you are interested in mirroring part of a server you control,
perhaps to improve reliability by providing alternatives, then
there are many available mirroring tools. Keep in mind such simple
possibilities as (under Unix) a cron job that tars and compresses
the web site on the "master" server at a particular time of day,
and another cron job on the "slave" server which takes advantage
of the lynx browser to retrieve that document, then uncompresses
and untars it. For many purposes, such simple mirroring setups
are effective and near-foolproof.
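The two halves of such a setup can be sketched in Python using the standard tarfile module in place of the tar and compress commands. The function names pack_site and unpack_site and all paths are hypothetical, and the step that carries the archive from one machine to the other (lynx in the description above) is deliberately left out.

```python
import os
import tarfile

def pack_site(site_dir, archive_path):
    """On the master: tar and compress the whole site directory."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname="." stores paths relative to the top of the site
        archive.add(site_dir, arcname=".")

def unpack_site(archive_path, mirror_dir):
    """On the mirror: uncompress and untar the retrieved archive."""
    os.makedirs(mirror_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(mirror_dir)
```

Each function would be run from its own cron job, with the mirror's job scheduled late enough that the master's archive is complete. Only unpack archives that come from a server you control.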
DO MAILTO: URLS WORK IN ALL BROWSERS?
The mailto: URL is a feature found in Lynx, Netscape, Spry Mosaic,
the latest NCSA Mosaics, Emacs w3 mode and many other browsers.
In general, about 80% of web browsers support mailto: at the time
of this writing. However, numerous older browsers do not support it.
It is of course also possible to set up forms which send mail
to you; see the entry regarding e-mail forms.
HOW CAN I SERVE [WORD DOCUMENTS, EXCEL SPREADSHEETS, DOUGHNUTS]?
In order to deliver documents of new and different types from
your server, you need to configure the correct "content type"
for each type of document, and use the proper extension when naming
the file on the server. If the document type is highly unusual,
you will also need to see to it that users know what content type
to configure their browsers for, and what application to launch
for that content type.
Presented below is a list of the better-known content types with
commentary on those the author is familiar with. This information
is drawn from appendix 2 of the author's book, CGI Programming
in C and Perl [URL:http://www.boutell.com/cgibook/]. The original
list of content types was taken from the public domain NCSA web
server [URL:http://hoohoo.ncsa.uiuc.edu/].
Please note: new media types are coming into existence regularly.
The official registry is often well behind actual practice. This
list is based on that included with NCSA's public domain web server
as of September 1995.
No attempt is made here to document the format of the data associated
with these mime types. This list is intended to make it easier
to determine what content type should be assigned to documents
produced by various well-known applications.
Media Content Type              Comments
application/activemessage
application/andrew-inset
application/applefile
application/atomicmail
application/dca-rft
application/dec-dx
application/mac-binhex40
application/macwriteii          MacWrite Document
application/msword              Microsoft Word Document
application/news-message-id
application/news-transmission
application/octet-stream        Use for binary file downloads
application/oda
application/pdf                 Adobe Acrobat Documents
application/postscript          Postscript
application/remote-printing
application/rtf                 Rich Text Format
application/slate
application/x-mif
application/wita
application/wordperfect5.1      WordPerfect 5.1 Documents
application/wordperfect6.0      WordPerfect 6.0 Documents
application/x-csh               Potentially dangerous [1]
application/x-dvi               TeX/LaTeX Output (not TeX source)
application/x-hdf
application/x-latex             LaTeX Source
application/x-netcdf
application/x-sh                Potentially dangerous [1]
application/x-tcl               Potentially dangerous [1]
application/x-tex               TeX Source
application/x-texinfo
application/x-troff             Troff Formatter Source
application/x-troff-man         Troff Source, -man argument assumed
application/x-troff-me          Troff Source, -me argument assumed
application/x-troff-ms          Troff Source, -ms argument assumed
application/x-wais-source
application/zip                 Many users have ZIP helper apps
application/x-bcpio
application/x-cpio              cpio tape format (Unix)
application/x-gtar              gnu tar tape format (Unix)
application/x-shar              Potentially dangerous [1]
application/x-sv4cpio
application/x-sv4crc
application/x-ustar
audio/basic                     Sun-style .au format audio
audio/x-aiff                    Amiga-format .aiff audio
audio/x-wav                     Microsoft Windows-format .wav audio
image/gif                       Compuserve GIF 8-bit lossless images
image/ief
image/jpeg                      JPEG lossy photographic images
image/png                       w3 consortium PNG lossless images
image/tiff                      TIFF format images
image/x-cmu-raster
image/x-portable-anymap         netpbm/pbmplus images (any subtype)
image/x-portable-bitmap         netpbm/pbmplus black and white images
image/x-portable-graymap        netpbm/pbmplus grayscale images
image/x-portable-pixmap         netpbm/pbmplus truecolor images
image/x-rgb
image/x-xbitmap                 X Window System black and white images
image/x-xpixmap                 X Window System color images
image/x-xwindowdump             X Window System screen dump format
message/external-body
message/news
message/partial
message/rfc822
multipart/alternative
multipart/appledouble
multipart/digest
multipart/mixed                 Server push
multipart/parallel
text/html                       HTML documents
text/x-sgml                     SGML documents, not limited to HTML
text/plain                      Plain ASCII text
text/richtext                   This is not RTF (see above)
text/tab-separated-values       Useful for spreadsheet interchange
text/x-setext
video/mpeg                      MPEG video format; common on PCs, Unix
video/quicktime                 Apple video format
video/x-msvideo                 Microsoft/Intel AVI video format
video/x-sgi-movie
[1]: Browsers should almost never be configured to execute shell
scripts. This is a dangerous practice as the script in question
could simply consist of rm * or another harmful command. Those
interested in sending code to the browser should consider safe
scripting languages such as Java, Safe-TCL and PGP-SafePerl.
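As a rough illustration of the footnote, a program that hands
documents to helper applications could refuse to launch anything
whose content type is marked "Potentially dangerous" in the table.
The Python sketch below is only that, a sketch: the names
DANGEROUS_TYPES and may_auto_launch are hypothetical and do not
belong to any real browser.

```python
# Hypothetical sketch: the content types flagged
# "Potentially dangerous [1]" in the table.
DANGEROUS_TYPES = {
    "application/x-csh",
    "application/x-sh",
    "application/x-tcl",
    "application/x-shar",
}


def may_auto_launch(content_type):
    """Return True only if the type is not on the dangerous list."""
    # Drop parameters such as "; charset=..." and normalize case.
    base = content_type.split(";", 1)[0].strip().lower()
    return base not in DANGEROUS_TYPES
```

A caller would check may_auto_launch(content_type) before starting
a helper, and fall back to saving the file or showing it as text.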
HOW DO I PUBLICIZE MY WORK?
There are several things you can do to publicize your new web
server or other offering:
- Post to comp.infosystems.www.announce. PLEASE READ THE CHARTER
POSTING FIRST. In general, always read a newsgroup first to familiarize
yourself before posting to it.
- Submit it to Yahoo (URL is [URL:http://www.yahoo.com/]), an impressive
index of the web which expands its knowledge automatically but
permits the direct submission of URLs as well.
- Submit it to a large number of different catalogs using Submit
It [URL:http://www.submit-it.com/], a service which allows you
to register with many indexes by filling out a single form.
- A similar one-step submission service is wURLd Presence
[URL:http://www.ogi.com/wurld/].
- Submit it to the NCSA What's New Page at the URL http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/whats-new.html
(see the page for details on how to submit your listing!).
- Register your URL in the Lycos Database (URL is [URL:http://www.lycos.com/]).
- Submit your URL to the maintainers of various catalogs, such as
the WWW Virtual Library (at the URL http://www.w3.org/hypertext/DataSources/bySubject/Overview.html)
and the ALIWEB index (at the URL http://web.nexor.co.uk/aliweb/doc/aliweb.html).
- Read Gareth Rees' guide to publishing on the World Wide Web (URL
is http://www.cl.cam.ac.uk/users/gdr11/publish.html).
- Consult Pete Page's How to Announce your New Web Site (URL is
[URL:http://ep.com/faq/webannounce.html]).
HEY, I KNOW, I'LL WRITE A WWW-EXPLORING ROBOT! WHY NOT?
Programs that automatically traverse the web can be quite useful,
but have the potential to make a serious mess of things. Robots
have been written which do a "breadth-first" search of the web,
exploring many sites in a gradual fashion instead of aggressively
"rooting out" the pages of one site at a time. Some of these robots
now produce excellent indexes of information available on the
web.
But others have written simple depth-first searches which, at
worst, can bring servers to their knees in minutes by recursively
downloading information from CGI script-based pages that contain
an infinite number of possible links. (Often a robot has no way
to detect this!) Imagine what happens when a robot decides to
"index" the CONTENTS of several hundred MPEG movies. Shudder.
The moral: a robot that does what you want may already exist;
if it doesn't, please study the document World Wide Web Robots,
Wanderers and Spiders (URL is: http://web.nexor.co.uk/mak/doc/robots/robots.html)
and learn about the emerging standards for exclusion of robots
from areas in which they are not wanted. You can also read about
existing robots there.
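To make the difference concrete, here is a minimal Python sketch
of a breadth-first traversal that honors exclusions and stops after
a fixed number of pages, so that an endless supply of CGI-generated
links cannot trap it. The function name and its two hooks,
fetch_links and is_allowed, are assumptions made for illustration;
they are not taken from any existing robot.

```python
from collections import deque


def crawl_breadth_first(start_url, fetch_links, is_allowed, max_pages=100):
    """Visit pages level by level, skipping excluded URLs.

    fetch_links(url) -> iterable of URLs linked from that page
                        (hypothetical hook supplied by the caller).
    is_allowed(url)  -> False for URLs from which robots are excluded.
    max_pages        -> hard cap on the number of pages visited.
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()        # FIFO queue gives breadth-first order
        if not is_allowed(url):
            continue                 # respect the exclusion, fetch nothing
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:     # never queue the same URL twice
                seen.add(link)
                queue.append(link)
    return visited
```

In a real robot, is_allowed could be backed by the site's
robots.txt file; Python's standard urllib.robotparser module offers
one way to do that. Replacing popleft() with pop() would turn this
into the depth-first search warned about above.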
HOW CAN I PUT AN ACCESS COUNTER ON MY HOME PAGE?
First of all, don't. It defeats caching proxy servers, putting
more load on your server. It forces your server to run an external
program for every page with a counter on it, putting more load
on your server. And it advertises your demographics or lack of
them to the world.
_"Yeah, but I want to know how many people are accessing my page."_
Of course you do. Use one of the many statistics tools available
to analyze the access log of your web server. Even if you are
not the webmaster of your server, your admin will probably give
you read-only access to the log files.
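As a small example of what such a tool does, the Python sketch
below counts successful GET requests per page. It assumes the log
is in the Common Log Format, with the request line in double quotes
followed by the status code; your server's format may differ. The
function name count_page_hits is hypothetical.

```python
from collections import Counter


def count_page_hits(log_lines):
    """Count successful GET requests per path (Common Log Format assumed)."""
    hits = Counter()
    for line in log_lines:
        try:
            # The request is the first double-quoted field,
            # e.g. "GET /index.html HTTP/1.0"
            _before, request, after = line.split('"', 2)
            method, path = request.split()[:2]
            status = after.split()[0]
        except (ValueError, IndexError):
            continue                 # skip malformed lines
        if method == "GET" and status.startswith("2"):
            hits[path] += 1
    return hits
```

Given read access to the log, something like
count_page_hits(open("access_log")) would return the counts; the
file name here is only a placeholder.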
Also check out Tim Drozinski's amazing FAQ on counting accesses
without the need for CGI programs [URL:http://erau.db.erau.edu/~tjd/log_faq.html].
Most of the techniques recommended there don't abuse the server,
which is a Good Thing.
_"I want an access counter anyway."_
In that case, in addition to Tim Drozinski's page above, consider
the index of access counter software at Yahoo (URL is [URL:http://www.yahoo.com/Computers/World_Wide_Web/Programming/Access_Counts/]).
Keep in mind that you must have CGI access at a minimum,
and server-side includes must also be turned on unless you are
willing to build your entire page with CGI or use a program that
generates the access count as an inline image. None of the above
approaches are efficient.