This is a copy of the article that I wrote for XML Magazine.  DevX has archived the original, and viewing archived articles there appears to require a membership.  Because of this, some of the download links may not work correctly.

 I provide it here as a courtesy for my students and prospective clients.  Enjoy!


 
 
June / July 2001

 PRESENTATION

 

Take a Lesson from a Class Act

Learn from this real-life XML example: Design patterns can improve the efficiency and maintainability of your Web site

by Andrew C. Mayo

 

As an instructor of XML at a local community college, I thought it would be a good idea to create a Web site for my students that provides a practical example of how XML can be used in an Internet application.

I designed this site mainly in response to a question I hear often at the beginning of my course: "Why would I want to use a tagging language in application development?" However, the site goes beyond simply addressing this question. By using a handy mapping technique and design pattern, and transforming XML to HTML on the server side, I created a scalable Web site that's easy to maintain and that can be delivered using just two Active Server Pages (ASP). You could probably extend this technology to a more interactive Web site, such as one with heavy forms to process, or maybe even to an e-commerce Web site, but I will leave that decision and implementation up to you.

My goals for this site were that it be scalable, easily maintainable, and, of course, constructed using XML technology. Since most of the information is of a static nature, I decided that my project should take on the look and feel of a corporate Web site (here we are, this is what we do, here is some information, and so on). Thus the site needed a common menu system throughout for easy site navigation.

 
Figure 1. What a Site.

The Web site is an information resource, where class members can obtain a course outline, lab assignments, homework assignments, sample code, resources/links, and the course syllabus. This way, instead of photocopying many documents, taking them to class, and handing them out (which is environmentally incorrect and wrecks my back), I can direct students to the Web site. They make the decision whether or not to print the information.

I also wanted to be able to add information to the site as needed. For example, I wanted to make lab and homework assignments available on the Web the same day that I planned to hand them out in class. I did not want to edit an HTML page to do this. Basically, I wanted to "turn on" features of the site as they were needed.

The Epiphany
Have you ever noticed a similarity between an XML document and a Web site diagram like the one presented in Microsoft FrontPage? A site diagram is a hierarchy of the pages in a Web site, all of which stem from the home, start, or default page. An XML document is also a hierarchy, except it is a hierarchy of elements that stem from the document or root element. The only difference that I saw between the two was that the Web site diagram is typically viewed from the top-down-left-to-right fashion (see Figure 1), whereas an XML document is typically viewed from left to right and then down (see Figure 2).

 
Figure 2. Separated at Birth?

Because of the similarity between a site diagram and an XML document, I concluded that I could use an XML document structure to map the contents of a Web site; in other words, I could construct a site diagram using an XML document. The difference between a standard HTML-based site and the one I wanted to build was that each page in my site would be created using Extensible Stylesheet Language Transformations (XSLT). I decided to name the XML document containing the Web site hierarchy Master, or Master.xml.

A Web site is a collection of Web pages; one page is designated as the main or default page and all other pages are subordinate to it. The Master.xml document has a root element of Master and a subordinate element of Page that designates a page participating in the Web site. Each Page element is adorned with attributes designating the Page as a participant in the Main Menu for the site, and telling if the page is active (displayable), its Name (which is used in display), its ID or internal identifier, and if it is the default or start page.

The ID attribute would be used for site navigation to abstract the site contents from the user. This ID attribute is designated as an ID in the DTD, so its value must be unique within the XML document. A page can also be referenced via an IDREFS attribute, which means that the page has references to other pages within the site.

 
Figure 3. Follow the Map.

To fulfill my design goal of making aspects of the system available on an as-needed basis, I created an Active attribute that simply contains a value of "Yes" (available for display) or "No" (not available for display). This allows me to release pages for the site as they are completed. Each page must designate whether it participates in the Main or common menu of the Web site. In addition, since the root element of an XML document is there only to anchor the others, I had to designate a page as the Default page for the system. I did this with the Default='Yes' attribute. Finally, since I was using server-side transformations to create an HTML document, each Page contained subelements describing the data and presentation parts of the rendered HTML page. I named this element the Doc element; each page has two.

The Doc element contains a Type attribute, which designates the element as content (xml) or presentation (xsl). A Doc element with a type of xml references an XML document that contains the data for the page. Conversely, an xsl type references an XML document that contains the transformations necessary to present the page to the user of the Web site. The specific name and location of the referenced XML document are contained within the URL attribute of the Doc element. For a diagram outlining the hierarchy of an XML document structure in a Web site, see Figure 3.

This sample of the contents of Master.xml contains the default page for the site, which is appropriately named Home:

<Master>
  <Page active='Yes' default='Yes'
        MainMenu='Yes' id="P00" Name="Home">
    <doc Type="xml" URL="\xml\home.xml" />
    <doc Type="xsl" URL="\xsl\home.xsl" />
  </Page>
  <!-- remaining Page elements omitted from this sample -->
</Master>

So, a Web site contains pages that contain XML documents that are used to create the HTML page using XSLT.
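To make the site map idea concrete, here is a minimal sketch that reads a Master.xml-style document and lists each page with its content and presentation documents. The site itself uses ASP and VBScript with the Microsoft parser; Python's standard library is used here only for illustration. The second page (P01, "Labs") and the function name describe_site are hypothetical, and the attribute spellings follow the sample above.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-page site map modelled on the Master.xml sample.
# A raw string keeps the backslashes in the URL attributes literal.
MASTER_XML = r"""
<Master>
  <Page active="Yes" default="Yes" MainMenu="Yes" id="P00" Name="Home">
    <doc Type="xml" URL="\xml\home.xml" />
    <doc Type="xsl" URL="\xsl\home.xsl" />
  </Page>
  <Page active="No" default="No" MainMenu="Yes" id="P01" Name="Labs">
    <doc Type="xml" URL="\xml\labs.xml" />
    <doc Type="xsl" URL="\xsl\labs.xsl" />
  </Page>
</Master>
"""


def describe_site(master_xml):
    """Return one dictionary per Page element in the site map."""
    root = ET.fromstring(master_xml)
    pages = []
    for page in root.findall("Page"):
        # Map each doc element's Type ("xml" or "xsl") to its URL.
        docs = {doc.get("Type"): doc.get("URL") for doc in page.findall("doc")}
        pages.append({
            "id": page.get("id"),
            "name": page.get("Name"),
            "active": page.get("active") == "Yes",
            "default": page.get("default") == "Yes",
            "content": docs.get("xml"),
            "presentation": docs.get("xsl"),
        })
    return pages


if __name__ == "__main__":
    for entry in describe_site(MASTER_XML):
        print(entry)
```

Because XML names are case sensitive, the lookups must use exactly the spellings that appear in the document (for example id and Name, not ID and name).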

The Transformation
Now that we can structure a Web site in the form of an XML document, how do we present it to the user? The W3C's XSL Transformations (XSLT) specification, a companion to XML, defines how to transform one XML document into another. The XSLT spec is at recommendation status; therefore it is considered final and application vendors may safely write code to the specification without fear of a syntax change. Until the rest of the Extensible Stylesheet Language (XSL) reaches recommendation status, XSLT transformations are the practical way to present XML documents as HTML in the browser.

Note that the XML document containing the XSLT instructions can also contain HTML tags. In our case, this is mandatory because we deliver the transformed document to a browser. Furthermore, since the transformation document is an XML document, it must be well formed. This means that each HTML tag within the XSLT transformation document must have an end tag (or be written as an empty element, such as <br/>). Start and end tags must match in case, and elements must nest properly rather than overlap. For example, every <p> tag must have a </p> end tag, and you cannot use a <P> tag with a </p> end tag (incorrect case).
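The well-formedness rules are easy to check mechanically. The following is a small sketch using Python's standard XML parser (not the Microsoft parser that the site uses) to show which HTML fragments an XML parser will accept; the helper name is_well_formed and the sample fragments are my own.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML fragments as they might appear inside a style sheet.
SAMPLES = [
    "<p>Hello</p>",                  # matching start and end tags
    "<P>Hello</p>",                  # case mismatch between the tags
    "<p>Line one<br>Line two</p>",   # <br> is never closed
    "<p>Line one<br/>Line two</p>",  # empty element written the XML way
    "<b><i>text</b></i>",            # elements overlap instead of nesting
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample, "->", is_well_formed(sample))
```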

There are two ways to transform an XML document: embedding a processing instruction within the "to be transformed" or target document, and using DOM methods to perform the transformation. If you load an XML document into IE 5.0, you may notice that the document is color-coded and viewable as a formatted tree structure. This is because IE 5.0 uses a default style sheet to render the document. It will do this only if the XML document does not have a processing instruction specifying an XSLT style sheet. The processing instruction references an external document that has the XSLT commands to render the document. If a processing instruction is present that designates a style sheet, the XML document will be rendered in IE 5.0 according to that designated style sheet's instructions. A sample processing instruction to render the Home page in our system would be:

<?xml-stylesheet type=
   "text/xsl" href="..\xsl\home.xsl" ?>

Since I wanted the course Web site to be widely viewable, and not only by individuals who have IE 5.0, I decided to perform server-side transformations of the XML documents. The server-side transformation will use DOM to transform the XML documents on the server and send them to the client as formatted HTML (see Figure 4).

 
Figure 4. XML to HTML Transformation.

The XML parser for the Microsoft platform, MSXML.DLL, is a COM component, so it can be used to instantiate DOM objects within application code written in languages such as Visual Basic, Java, C++, and Visual FoxPro. We will implement this using ASP, with VBScript creating our DOM objects. The DOM specification is defined by the World Wide Web Consortium (W3C). Its purpose is to provide a common set of interfaces for processing an XML document within application code. The specification is defined using Interface Definition Language (IDL) syntax, which makes it language neutral.

The specification includes no methods to load or open an XML document, but Microsoft has extended its DOM implementation to include a method to read an XML document (so has Sun in its Java implementation). Microsoft has two methods to load an XML document: Load, which uses the URL of the XML document, and LoadXML, which uses a string variable containing XML elements. We will use both of these methods in our code.

Since the XML parser is a validating parser, it both parses and validates the document as it loads. Parsing ensures that the document is well formed; validation ensures that, if the document has a DTD or Schema, the document also conforms to it. Once loaded, the XML document is available to the application through the remainder of the methods and properties contained within the instance of the DOM object. Another method that we use is TransformNodeToObject. This method takes as its arguments two DOM objects. One DOM object contains transformation instructions and the other is an empty DOM object that becomes the recipient of the transformation. This method is performed on the DOM object that has loaded the XML data document to be transformed.

The Pattern
Now we have the Web site modeled as an XML document, with each page in the site containing two XML documents: one with the content and the other with the presentation logic. We know that we can present this to the user with XSLT. How do we do this without having to create an ASP page for each HTML page in our Web site?

To answer this question, I present to you the Director/Builder design pattern. This pattern uses the object-oriented technique of delegation, whereby specialized work is delegated to a specific component of a system that can perform the job. The theory is that by segmenting an application in this way, it becomes easier to construct and debug. In an object-oriented language such as Java, an application creates the Director and the Builder objects. The Director is responsible for validating and assembling the metadata to construct a Web page. It then communicates this information to the Builder, which constructs the Web page and presents it to the client.

Since I was using ASP, I ran into a problem with this pattern. ASP scripted with VBScript is not an object-oriented environment. I got around this limitation by creating two Active Server Pages; one is the Director and the other the Builder. They communicate with one another through query strings (arguments to the page) and session variables (hidden values on the server that are unique to a client session). I also used other techniques specific to ASP to cache the Master.xml file and to create and cache the common menu, which is a transformation of the XML document containing the site diagram (Master.xml).

The implementation of the pattern in our application is straightforward and depends heavily on our site map, or Master.xml. The Director, through the DOM, searches the master file and locates the page requested by its ID. If no ID is supplied, the Director searches the Master.xml for the Page that is designated as the start page for the site (Default='Yes'). It then determines if this page is active (Page Active attribute is "Yes"). If it is not active, the Director displays an error message to the user. If the page is active, the Director accesses the Doc elements for the Page creating session variables for the Doc element that contains the XML file for the page content (Type='xml') and the Doc element that contains the XML file for the presentation (Type='xsl'). The Director then redirects to the Builder page, which uses the session variables to locate the content and presentation XML files. These are each loaded into DOM objects and then transformed into an HTML page.
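The Director's decision logic can be sketched in a few lines. This is a Python illustration under stated assumptions, not the ASP code from the site: the function name direct, the PageUnavailable exception, and the inactive P01 page are hypothetical, and the returned dictionary stands in for the session variables that the real Director hands to the Builder.

```python
import xml.etree.ElementTree as ET

# Hypothetical site map; attribute spellings follow the Master.xml sample.
SITE_MAP = r"""
<Master>
  <Page active="Yes" default="Yes" MainMenu="Yes" id="P00" Name="Home">
    <doc Type="xml" URL="\xml\home.xml" />
    <doc Type="xsl" URL="\xsl\home.xsl" />
  </Page>
  <Page active="No" default="No" MainMenu="Yes" id="P01" Name="Labs">
    <doc Type="xml" URL="\xml\labs.xml" />
    <doc Type="xsl" URL="\xsl\labs.xsl" />
  </Page>
</Master>
"""


class PageUnavailable(Exception):
    """Raised when a page is missing or not yet switched on."""


def direct(master_xml, page_id=None):
    """Pick the requested page (or the default) and return what a
    Builder would need: the page name and its two document URLs."""
    pages = ET.fromstring(master_xml).findall("Page")
    if page_id is None:
        # No ID supplied: fall back to the page marked as the start page.
        matches = [p for p in pages if p.get("default") == "Yes"]
    else:
        matches = [p for p in pages if p.get("id") == page_id]
    if not matches:
        raise PageUnavailable("no matching page in the site map")
    page = matches[0]
    if page.get("active") != "Yes":
        raise PageUnavailable("page %s is not active" % page.get("id"))
    docs = {doc.get("Type"): doc.get("URL") for doc in page.findall("doc")}
    return {
        "name": page.get("Name"),
        "content": docs["xml"],
        "presentation": docs["xsl"],
    }


if __name__ == "__main__":
    print(direct(SITE_MAP))          # default page
    print(direct(SITE_MAP, "P00"))   # same page, requested by ID
```

Keeping this lookup separate from the transformation step is what lets a single pair of pages serve the whole site: adding a page changes only the site map.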

Our implementation of the transformation process within the Builder creates three instances of the DOM: one to load the data document, another to load the transformation document, and a third to receive the transformation. The method used is TransformNodeToObject. It is a Microsoft extension to the W3C DOM specification. This transformation is then shipped back to the client through a Response.Write command. This transformation can also be performed in other ways. You can add a processing instruction that references the presentation document to the content XML file and then issue a LoadXML command on an empty DOM object with the input coming from the target DOM object through the XML property. You can also perform the transformation with another Microsoft extension method to the DOM known as transformNode, which takes a DOM object containing the style sheet as an argument and outputs a string containing the transformation.

I chose the TransformNodeToObject method because it provides better error diagnostics. Specifically, I saw three areas where something could go wrong: the data XML document might not be valid or well formed, the presentation XSLT document might not be well formed, or the transformation just might not work. By creating three objects, one for data, one for presentation, and one for transformation, it's easier to diagnose an error should one occur.

Because we are using server-side transformations, the Director/Builder design pattern permits us to add as many pages as we desire without adding more ASP pages to the system. This is power!

System Specifics
Let's discuss some features of this system's construction, namely the caching of the Master.xml, the creation of a common menu, and using multiple views on a single source document. A site designed in this manner depends highly on the Master.xml file, the XML document that lists the pages in the site and instructions on how they are to be built. It is accessed every time the Director.asp page is invoked, and the Builder uses it to create the common menu.

Because of this dependency, I decided to cache the contents of the Master.xml into global storage at application startup. This feature of ASP is implemented in the Application_OnStart event in the global.asa for the site. This event, as its name implies, is invoked once the application is started on the server. To save resources, and because of threading issues in the current implementation of the Microsoft parser, the Master.xml document was cached as a string variable and not as a DOM object. (A free-threaded instance of the DOM did not exist in version 2 of the MS parser but has been incorporated in version 3, which is available now from the MS Web site.) This caching as a string variable was done with the xml property of the DOM, whereby the contents of an XML document can be assigned to a string. Therefore, whenever the Director or Builder requires the Master.xml as a DOM object, it can create a DOM object and issue a LoadXML method with the global string variable as an argument.

One of the site's original design goals was to "turn on" features of the system as they became available, but with the Master.xml cached as a string variable on application startup, the only way this can be achieved is to recycle the application (shut it down and start it up). To get around this problem, when the Master.xml is cached, we also save its file creation and modification dates. Then, whenever a client session is started (with the global.asa Session_OnStart event) we compare the creation and modification dates of the master file to the dates that were saved at application startup. If they don't match, the Master.xml has been altered, and we reload and recache it into the global string variable. Once it is recached, new and existing clients will get the latest version of the system. This technique also correctly builds the common menu for the system.
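The reload-on-change idea is independent of ASP. Here is a minimal Python sketch of it, assuming a hypothetical MasterCache class in place of the global.asa events. It compares the file's modification time and size rather than creation and modification dates, because file creation time is not reported consistently across platforms; that substitution is mine.

```python
import os


class MasterCache:
    """Keep the site map as a string and reread it only when the
    file on disk has changed since the last load."""

    def __init__(self, path):
        self.path = path
        self.text = None
        self.stamp = None
        self.reload()

    def _current_stamp(self):
        # Modification time plus size stands in for the saved dates.
        info = os.stat(self.path)
        return (info.st_mtime_ns, info.st_size)

    def reload(self):
        """Read the file and remember its stamp (application startup)."""
        with open(self.path, encoding="utf-8") as handle:
            self.text = handle.read()
        self.stamp = self._current_stamp()

    def get(self):
        """Return the cached text, refreshing it first if the file
        changed (the check made when a session starts)."""
        if self._current_stamp() != self.stamp:
            self.reload()
        return self.text
```

A caller would create one MasterCache when the application starts and call get() wherever the Director or Builder needs the site map text.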

The Web site's common menu system is built from the Page elements of the site map contained in the Master.xml, specifically those elements having an attribute of MainMenu='Yes'. The common menu is built with a transformation of the master file into an HTML <Body> section. In our Web site, this section contains the Raritan Valley Community College graphic and a table containing links to the other pages in the system that are to be included in the main menu. This transformation is done in the same manner as all transformations are done for all pages in the site: with the Master Page element that contains a Doc element with a type attribute of xsl. Since the common menu is also heavily referenced throughout the application (every page has the common menu), we cache it similarly to the way we cache the Master.xml file, with the exception that the string variable contains the transformed Master.xml, or the HTML syntax that will render the common menu in the browser. The code in Listing 1 shows how this is done.
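As a rough illustration of what the menu transformation produces, the sketch below builds a table of links from the site map. It is written in plain Python rather than XSLT, so it is a stand-in for the style sheet, not a copy of it. The function name build_menu, the id query-string parameter, the extra pages, and the decision to skip inactive pages are all assumptions of mine.

```python
import xml.etree.ElementTree as ET

# Hypothetical site map with one page kept out of the main menu.
SITE_MAP = """
<Master>
  <Page active="Yes" default="Yes" MainMenu="Yes" id="P00" Name="Home" />
  <Page active="Yes" default="No" MainMenu="Yes" id="P01" Name="Syllabus" />
  <Page active="Yes" default="No" MainMenu="No" id="P02" Name="Lab 1" />
  <Page active="No" default="No" MainMenu="Yes" id="P03" Name="Homework" />
</Master>
"""


def build_menu(master_xml, director_url="director.asp"):
    """Return an HTML table (as a string) with one link per page that
    is active and flagged for the main menu."""
    root = ET.fromstring(master_xml)
    table = ET.Element("table")
    row = ET.SubElement(table, "tr")
    for page in root.findall("Page"):
        if page.get("MainMenu") == "Yes" and page.get("active") == "Yes":
            cell = ET.SubElement(row, "td")
            # Every link goes back through the Director with a page ID.
            href = "%s?id=%s" % (director_url, page.get("id"))
            link = ET.SubElement(cell, "a", href=href)
            link.text = page.get("Name")
    return ET.tostring(table, encoding="unicode")


if __name__ == "__main__":
    print(build_menu(SITE_MAP))
```

Because the result is generated once from the site map, it can be cached as a string and reused on every page, as described above.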

The string resulting from the transformation of the Master file contains the common menu HTML. As the Builder builds each page, this HTML transformation is loaded into a DOM object. That's right: the transformed document, even though it contains HTML, is also a well-formed XML document that can be loaded, searched, and updated in the same manner as any XML document.

The Builder.asp loads the HTML and changes the caption to the title of the current page being rendered. It can do this with the Microsoft extension to the DOM specification, selectSingleNode. This method takes an XPath expression and returns the first Node that meets the expression criteria, or returns a null node if nothing matches. For our purposes, we search the XML document containing HTML tags for the first occurrence of the underline tag, or <U>, and change its text value to the name of the page. The Director sets the name of the page in a session variable. Listing 2 shows the code from the Builder that creates the common menu. Since this type of processing is specific to a Web site, the code contained within the buildCommonHeader function should probably be moved into an include file so that the Builder.asp is not tightly coupled to the site.
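The same find-and-replace step can be sketched with Python's ElementTree, whose find method plays the role of selectSingleNode here. The cached menu markup and the function name set_caption are hypothetical; the real menu comes from the transformed Master.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical cached menu: well-formed HTML with a <U> caption.
MENU_HTML = (
    "<body><h1><U>Course Home</U></h1>"
    "<table><tr><td><U>Labs</U></td></tr></table></body>"
)


def set_caption(menu_html, page_name):
    """Replace the text of the first <U> element with the page name."""
    root = ET.fromstring(menu_html)
    # find() returns the first match in document order, or None.
    caption = root.find(".//U")
    if caption is None:
        return menu_html
    caption.text = page_name
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_caption(MENU_HTML, "Homework"))
```

Only the first <U> element is changed, so any later underlined text in the menu is left alone.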

Finally we come to the concept of a single document with multiple views. One of the great things about XSLT is that you can separate content from presentation, or present one piece of content in several ways. I wanted students to be able to download homework assignments, lab assignments, or sample code/data. Instead of creating a separate XML document for each type of download, I defined a single Document element that covers every download type. I gave this element an attribute of Category that would describe the type of download it was (homework, lab, and so on) and then another attribute to designate it as being available (Yes/No). The Document element has subelements of Description, which contains a description of the document, and another element called URL to designate the location of the document to be downloaded.

The transformation process creates a table row for every Downloadable document for the category, lists its descriptions, and creates a hyperlink if the item is available. This facilitated my design goal of turning features on within the system. When I am ready to hand out an assignment, I just set the available attribute to Yes and the transformation process creates the hyperlink using the value of the URL tag. All the data relating to a download is placed in one document named Downloadable.xml. Figure 5 shows separate transformations that I created based on the category.
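Here is a small sketch of the "one document, several views" idea, again in Python rather than XSLT. The root element name Downloadable, the attribute spelling Available, the sample entries, and the function name download_rows are assumptions; only the Document, Category, Description, and URL names come from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents for a Downloadable.xml-style document.
DOWNLOADS_XML = """
<Downloadable>
  <Document Category="homework" Available="Yes">
    <Description>Homework 1</Description>
    <URL>downloads/hw1.zip</URL>
  </Document>
  <Document Category="homework" Available="No">
    <Description>Homework 2</Description>
    <URL>downloads/hw2.zip</URL>
  </Document>
  <Document Category="lab" Available="Yes">
    <Description>Lab 1</Description>
    <URL>downloads/lab1.zip</URL>
  </Document>
</Downloadable>
"""


def download_rows(downloads_xml, category):
    """Return (description, link) pairs for one category; the link is
    None until the item has been switched on."""
    rows = []
    for doc in ET.fromstring(downloads_xml).findall("Document"):
        if doc.get("Category") != category:
            continue
        available = doc.get("Available") == "Yes"
        link = doc.findtext("URL") if available else None
        rows.append((doc.findtext("Description"), link))
    return rows


if __name__ == "__main__":
    for description, link in download_rows(DOWNLOADS_XML, "homework"):
        print(description, "->", link or "(not yet available)")
```

Each category view is just a different filter over the same source document, which is the job the separate style sheets do on the site.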

 
Figure 5. One on One.

Weighing the Benefits
After implementing these concepts, I took a step back and tried to determine whether it was worthwhile. How will implementing a Web site like this positively affect the way I work? Has this increased my ability to maintain a Web site? I realized that I now need two types of documents to create one HTML page. Is this better? In my opinion, yes. Separating content from presentation can only increase site maintainability.

In general, if you have done a good job analyzing the problem domain (the data needs or structure of the XML documents that will be used in your site) the application will be fairly stable, requiring only the inevitable adjustment for changes in business processes. With a separation of presentation from content, the presentation is more likely to change than the content. However, separating content from presentation in the XML world doesn't come without a price. You have to learn to code and debug the XSLT language with its template and match commands. The Microsoft MSDN Web site (msdn.microsoft.com/xml) has many examples of XSLT. Debugging XSLT can be difficult without a tool. The one that I use is Microsoft's XSL Debugger, downloadable from http://msdn.microsoft.com/workshop/c-frame.htm?/workshop/xml/index.asp.

Site maintainability has increased dramatically. Because data is contained within XML documents, content can be added, removed, or adjusted without ruining the presentation of that content. If you want to add a page to the site, all you have to do is create a page entry in the Master.xml and set its active property to Yes. Of course, you must build the corresponding XML and XSLT pieces of the page before turning this on. To change the presentation of a page, just redeploy the XSLT document that transforms it into HTML. Because of the common menu concept, a change in the menu structure of the site is automatically propagated throughout every page in the site. Finally, because of the design pattern, you can deliver the entire content of your site with just two ASP pages.

One extra benefit: the combined VBScript code of the application, including comments, is less than 500 lines. More importantly, other than the customization of a common site menu, these very same ASP pages can be used to produce any site that you want to deploy. That is impressive. You can visit the course Web site and check it out for yourself at http://hol-nt1.raritanval.edu/cis227y/director.asp. Or, of course, you can build and deploy your own site. You'll go to the head of the class.




Andrew C. Mayo is an adjunct professor at Raritan Valley Community College in New Jersey and the principal of Carlton Software Solutions, Inc., an information technology consulting company that provides customized development, mentoring, and training to corporate clients on the practical application of XML technology. In his spare time, he can be found skiing in Vermont with his wife Joan. Reach Andrew at XMLProfessor@CarltonSolutions.com.

 

. Design Patterns: Elements of Reusable Object-Oriented Software Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides (Addison Wesley, 1994, ISBN 0-201-63361-2)


. MSDN Online Developer Center
. MSDN Online Web Workshop
. XML CIS227Y Course Web site
. XSL Transformations (XSLT)
. Document Object Model (DOM) specification



 


