Validating Australian Museum Online
How do you go about validating approximately 10,000 pages? The Australian Museum web team decided it was possible, and worthwhile. So how did we do it? Here are the steps we took:
GLOBAL CHANGES
These changes were made across the entire site. This meant that we were sometimes changing over 10,000 pages at a time.
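A site-wide change of this kind can be scripted. The sketch below is only an illustration in Python, not the tool we used: the function name `apply_to_site`, the `.htm`/`.html` filter and the ISO-8859-1 default are all assumptions. It applies one text transformation to every HTML file under a folder and rewrites only the files that actually change.

```python
from pathlib import Path
from typing import Callable


def apply_to_site(root: str, transform: Callable[[str], str],
                  encoding: str = "iso-8859-1") -> int:
    """Apply `transform` to every .htm/.html file under `root`.

    Returns the number of files that were rewritten. The function name and
    the default encoding are assumptions made for this sketch.
    """
    changed = 0
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in {".htm", ".html"}:
            continue
        # Work on raw bytes so that line endings survive the round trip.
        original = path.read_bytes().decode(encoding)
        updated = transform(original)
        if updated != original:
            path.write_bytes(updated.encode(encoding))
            changed += 1
    return changed
```

Running something like this on a downloaded copy of the site, and validating a sample of the results before uploading, keeps a mistake in the transformation from reaching the live pages.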
Character Set
We added the character set below to all files:
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
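As a rough sketch of how this insertion could be automated (the helper name `add_charset` is hypothetical, and placing the tag directly after the opening head tag is an assumption), a Python function can add the meta tag only to pages that do not already declare a character set:

```python
import re

META = '<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">'
# Opening <head> tag, with or without attributes (but not e.g. <header>).
HEAD_OPEN = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)
# Any existing meta tag that already declares a charset.
HAS_CHARSET = re.compile(r"<meta[^>]+charset\s*=", re.IGNORECASE)


def add_charset(html: str) -> str:
    """Insert the charset meta tag after <head> unless one is already present."""
    if HAS_CHARSET.search(html):
        return html
    return HEAD_OPEN.sub(lambda m: m.group(0) + "\n" + META, html, count=1)
```

Because pages that already have a charset are skipped, the function can be run more than once without adding duplicates.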
Doctype
Many of our HTML pages had the old HTML 3.2 and HTML 4.0 Doctypes. The latter threw errors on the "name" attribute in image tags, which is required for JavaScript image rollovers (and was among the reasons that W3C went to version 4.01), so we replaced this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
With this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
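The Doctype swap is a straight text replacement. A minimal Python sketch, assuming the old declaration appears once near the top of each file (the function name `upgrade_doctype` is made up for this example):

```python
import re

# Matches the HTML 4.0 Transitional doctype, tolerating case and spacing.
OLD_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"-//W3C//DTD HTML 4\.0 Transitional//EN"\s*>',
    re.IGNORECASE,
)
NEW_DOCTYPE = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'


def upgrade_doctype(html: str) -> str:
    """Replace an HTML 4.0 Transitional doctype with the 4.01 equivalent."""
    return OLD_DOCTYPE.sub(NEW_DOCTYPE, html, count=1)
```

Pages that already carry the 4.01 declaration do not match the pattern and are left alone. HTML 3.2 pages would need their own pattern, which is not shown here.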
JavaScript Tag
Our JavaScript tags often had no type attribute, which is invalid HTML. So, we changed this:
<script language="JavaScript">
To this:
<script language="JavaScript" type="text/javascript">
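A sketch of how the missing attribute might be added automatically, in Python. It assumes every script on the site is JavaScript (a VBScript block would be mislabelled) and that no attribute value contains a ">" character; `add_script_type` is a hypothetical name.

```python
import re

SCRIPT_TAG = re.compile(r"<script\b([^>]*)>", re.IGNORECASE)
HAS_TYPE = re.compile(r"\btype\s*=", re.IGNORECASE)


def add_script_type(html: str) -> str:
    """Add type="text/javascript" to opening script tags that lack a type."""
    def fix(match):
        if HAS_TYPE.search(match.group(1)):
            return match.group(0)  # already has a type attribute
        # Drop the closing ">" and append the attribute before it.
        return match.group(0)[:-1] + ' type="text/javascript">'
    return SCRIPT_TAG.sub(fix, html)
```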
Bad body tags
We are removing the invalid body tag "hack" (commonly referred to as the "four horsemen of the apocalypse") that is used to tuck elements into the top left corner of the page in older browsers. This code in the body tag:
leftmargin="0" topmargin="0" marginwidth="0" marginheight="0"
is deleted and replaced with an addition to the CSS file. Something like this:
body {color: #xxx; background-color: #xxx; font-family: xxx; padding: 0; margin: 0; }
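Removing the four attributes from the markup can also be scripted. This Python sketch is an illustration only (the name `strip_body_margins` is an assumption); it edits the opening body tag and leaves every other attribute in place. The CSS rule still has to be added to the stylesheet separately.

```python
import re

BODY_TAG = re.compile(r"<body\b[^>]*>", re.IGNORECASE)
# One margin attribute with a double-quoted, single-quoted or bare value.
MARGIN_ATTR = re.compile(
    r"""\s+(?:leftmargin|topmargin|marginwidth|marginheight)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)


def strip_body_margins(html: str) -> str:
    """Remove the four proprietary margin attributes from the body tag."""
    return BODY_TAG.sub(lambda m: MARGIN_ATTR.sub("", m.group(0)), html)
```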
Bold and Italic tags
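Bold and italic tags are presentational rather than structural (they are not formally deprecated in HTML 4.01, but we preferred structural markup), so we did global changes across all of our sites and replaced this: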
<b></b> with this: <strong></strong>
and this:
<i></i> with this: <em></em>
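A plain search and replace on "<b" would also hit tags such as br and body, so the pattern has to match the whole tag name. A minimal Python sketch (the name `to_structural` is hypothetical):

```python
import re

TAG_MAP = {"b": "strong", "i": "em"}
# Opening or closing b/i tag; the name must be followed by white space or ">".
BOLD_ITALIC = re.compile(r"<(/?)(b|i)(\s[^>]*)?>", re.IGNORECASE)


def to_structural(html: str) -> str:
    """Replace b with strong and i with em, leaving other tags untouched."""
    def swap(match):
        slash = match.group(1)
        name = match.group(2).lower()
        attrs = match.group(3) or ""
        return "<" + slash + TAG_MAP[name] + attrs + ">"
    return BOLD_ITALIC.sub(swap, html)
```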
SECTION BY SECTION CHANGES
There were many changes that could not be done globally. These were generally done by downloading a section of the site at a time and doing mini global changes. Site-section changes included:
Image-based submit buttons
We had always added width, height and border attributes to all image-based submit buttons. We did this for older browsers: if you leave off the border attribute in Netscape 4+, you get an ugly blue (active) border around the button. However, none of these attributes is valid on an input element in HTML 4.01.
To fix this we simply removed all width, height and border attributes from image-based submit buttons.
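The same attributes are perfectly valid on ordinary images, so a script must restrict itself to input tags of type "image". The Python sketch below shows one way to do that; the function name `clean_image_buttons` is an assumption, and the pattern assumes no attribute value contains a ">" character.

```python
import re

INPUT_TAG = re.compile(r"<input\b[^>]*>", re.IGNORECASE)
IS_IMAGE = re.compile(r"""\btype\s*=\s*["']?image\b""", re.IGNORECASE)
# A width, height or border attribute with a quoted or bare value.
SIZE_ATTR = re.compile(
    r"""\s+(?:width|height|border)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)


def clean_image_buttons(html: str) -> str:
    """Strip width, height and border from input tags of type image only."""
    def fix(match):
        tag = match.group(0)
        return SIZE_ATTR.sub("", tag) if IS_IMAGE.search(tag) else tag
    return INPUT_TAG.sub(fix, html)
```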
Invalid characters
As we often pulled content from MSWord into HTML Editors we often found invalid characters that needed to be replaced or removed (as well as hundreds of horrid local styles). Some of the more common invalid characters include:
- † - replace with a space
- í - replace with ' (single quote mark)
- & - replace with &amp; (only in non-HTML text)
- ñ - replace with - (dash)
- ë - replace with ' (single quote mark)
- … (three dots within a single character) - replace with ... (three dots)
- ‘ - replace with ' (single quote mark)
- ’ - replace with ' (single quote mark)
- – - replace with - (dash)
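Most of these replacements map one character to a fixed string, which suits a translation table. The Python sketch below is illustrative only (the names are made up). It leaves the ampersand out because that replacement depends on context, and it should only be run on text where í, ñ and ë are known to be conversion debris, since they are legitimate letters in many languages.

```python
# Map each stray character to its plain replacement.
REPLACEMENTS = {
    "\u2020": " ",    # dagger, shown where a non-breaking space was pasted
    "\u00ed": "'",    # i-acute, shown in place of a curly quote
    "\u00f1": "-",    # n-tilde, shown in place of a dash
    "\u00eb": "'",    # e-diaeresis, shown in place of a curly quote
    "\u2026": "...",  # single-character ellipsis
    "\u2018": "'",    # left curly single quote
    "\u2019": "'",    # right curly single quote
    "\u2013": "-",    # en dash
}
TABLE = str.maketrans(REPLACEMENTS)


def clean_word_characters(text: str) -> str:
    """Replace the characters listed above with plain equivalents."""
    return text.translate(TABLE)
```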
CSS CHANGES
Finally, there were many CSS files that needed to be edited by hand, including:
- We added "background-color" whenever "color" was specified and vice versa. Often this meant setting the background colour to "transparent". This is recommended by W3C.
- We added quote marks around font names that contain white space. Without quote marks, white space before and after the name is ignored and runs of white space inside it are collapsed to a single space.
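Although the edits themselves were made by hand, finding the rules that need attention can be automated. The Python sketch below is a rough helper, not a CSS parser: it assumes simple, un-nested rules without comments, ignores the "background" shorthand, and the name `missing_color_pairs` is hypothetical. It lists selectors that set only one of "color" and "background-color".

```python
import re

# A selector followed by a declaration block, with no nested braces.
RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def missing_color_pairs(css: str) -> list:
    """Return selectors that set color without background-color, or the reverse."""
    problems = []
    for selector, body in RULE.findall(css):
        # Collect the property names declared in this rule.
        props = {
            decl.split(":", 1)[0].strip().lower()
            for decl in body.split(";")
            if ":" in decl
        }
        if ("color" in props) != ("background-color" in props):
            problems.append(selector.strip())
    return problems
```

Each selector reported can then be fixed by hand, for example by adding "background-color: transparent" where no specific background is wanted.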
There were other minor adjustments we made throughout our site during this process. However, it should be mentioned that our files were reasonably close to valid when we began. We always used quote marks around attribute values and all images had "alt" attributes, so our task was not as large as it could have been.
The bottom line is that making a large site 100% valid can be done. Apart from a few sections that we are still working on, our site is 100% valid. The proof is here.
Russ Weakley
15-Feb-03
Copyright © Webboy, 2008

