<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE qandaset PUBLIC "+//Silmaril//DTD FAQ based on DocBook 4.4//EN//XML"
  "faq.dtd" [
<!ENTITY xmllogo PUBLIC "+//Silmaril//NONSGML XML Logo//EN" 
  "xmllogo-crop.png" NDATA PNG>
]>
<?PSGML nofill programlisting literal?>
<qandaset revisionflag="changed" revision="2010-02-27">
  <blockinfo>
    <titleabbrev>The XML FAQ</titleabbrev>
    <title>Frequently-Asked Questions <?site br?>about the Extensible Markup
      Language</title>
    <graphic entityref="xmllogo" format="PNG"/>
    <editor>
      <firstname>Peter</firstname>
      <surname>Flynn</surname>
      <affiliation>
	<orgname>Silmaril Consultants</orgname>
	<orgdiv>Textual Therapy Division</orgdiv>
      </affiliation>
      <email>http://silmaril.ie/cgi-bin/blog</email>
    </editor>
    <edition use="current.date" for="date"><conref
    use="current.version"/></edition>
    <authorgroup id="contributors" role="Contributors">
      <collab>
	<collabname>The following people helped with the original
	contributions, plus many other members of the W3C XML SIG as
	well as FAQ readers around the world.</collabname>
      </collab>
      <othercredit>
	<firstname>Terry</firstname>
	<surname>Allen</surname>
      </othercredit>
      <othercredit>
	<firstname>Tom</firstname>
	<surname>Borgman</surname>
      </othercredit>
      <othercredit>
	<firstname>Tim</firstname>
	<surname>Bray</surname>
      </othercredit>
      <othercredit>
	<firstname>Robin</firstname>
	<surname>Cover</surname>
      </othercredit>
      <othercredit>
	<firstname>Bob</firstname>
	<surname>DuCharme</surname>
      </othercredit>
      <othercredit>
	<firstname>Christopher</firstname>
	<surname>Maden</surname>
      </othercredit>
      <othercredit>
	<firstname>Eve</firstname>
	<surname>Maler</surname>
      </othercredit>
      <othercredit>
	<firstname>Makoto</firstname>
	<surname>Murata</surname>
      </othercredit>
      <othercredit>
	<firstname>Peter</firstname>
	<surname>Murray-Rust</surname>
      </othercredit>
      <othercredit>
	<firstname>Liam</firstname>
	<surname>Quin</surname>
      </othercredit>
      <othercredit>
	<firstname>Michael</firstname>
	<surname>Sperberg-McQueen</surname>
      </othercredit>
      <othercredit>
	<firstname>Joel</firstname>
	<surname>Weber</surname>
      </othercredit>
    </authorgroup>
    <keywordset>
      <keyword>xml</keyword>
      <keyword>sgml</keyword>
      <keyword>html</keyword>
      <keyword>markup</keyword>
      <keyword>structure</keyword>
      <keyword>xslt</keyword>
      <keyword>latex</keyword>
    </keywordset>
    <abstract id="index">
      <title>Summary</title>
      <para>This is the list of Frequently-Asked Questions about the
	Extensible Markup Language (XML). It has answers to most of
	the common questions people ask about XML. If you are seeking
	answers to questions about related areas such as HTML, SGML,
	CGI scripts, PHP, JSP, Java, databases, or penguins, you may
	find some pointers, but you should probably look elsewhere as
	well.</para>
      <para>The FAQ is intended as a first resource for users,
	authors, developers, and the interested reader. Details of its
	organisation, contributors, availability, translations, and
	revisions are in the Admin sections. Updates to the FAQ are
	notified to the mailing lists and newsgroups listed in <link
	  linkend="discussions"></link>.</para>
      <para>The full document is available for download in many
	different formats: see <link linkend="availability"></link>
	for a list.</para>
      <note>
	<title>WTF</title>
	<para><ulink url="http://seanmcgrath.blogspot.com">Seán
	    McGrath</ulink>&nbsp;<ulink
	    url="http://seanmcgrath.blogspot.com/#112988775713608464">suggested</ulink>: 
	  <quote>It would be great if FAQs had a WTF section to direct
	    the eyes of the exasperated to Q's with a high desperation
	    index <literal>:-)</literal></quote>, so here are the top
	  dozen most-wanted:</para>
	<simplelist>
	  <member><link linkend="whatisxml"></link></member>
	  <member><link linkend="style"></link></member>
	  <member><link linkend="dtds"></link></member>
	  <member><link linkend="browsers"></link></member>
	  <member><link linkend="whatissgml"></link></member>
	  <member><link linkend="specials"></link></member>
	  <member><link linkend="markup"></link></member>
	  <member><link linkend="whatfor"></link></member>
	  <member><link linkend="software"></link></member>
	  <member><link linkend="schemas"></link></member>
	  <member><link linkend="namespaces"></link></member>
	  <member><link linkend="glossary"></link></member>
	</simplelist>
      </note>
      <section id="organisation">
	<title>Organisation</title>
	<para>This FAQ was originally maintained on behalf of the
	  World Wide Web Consortium's XML Special Interest Group. It
	  is divided into four sections: <link linkend="basics"
	    xreflabel="simple">Basics</link>, <link xreflabel="simple"
	    linkend="users">Users</link>, <link xreflabel="simple"
	    linkend="authors">Authors</link>, and <link
	    xreflabel="simple" linkend="developers">Developers</link>.
	  The questions are numbered independently within each
	  section. As the numbering may change with each version,
	  comments and suggestions should refer to the version number
	  (see <link linkend="revisions"></link>) as well as the section
	  and question number. See <link linkend="cite"></link> for
	  details of citation and reference.</para>
	<para>Please submit bug reports, suggestions for improvement,
	  and other comments about <emphasis>this FAQ only</emphasis>
	  to <ulink url="xmlfaq@silmaril.ie">the editor</ulink>.
	  Questions and comments about XML should go to the relevant
	  <link linkend="discussions" xreflabel="simple">mailing list or
	    newsgroup</link>. Comments about the <link xreflabel="simple"
	    linkend="spec">XML Specification</link> itself and related
	  specifications should be directed to the <ulink
	    url="http://www.w3.org/">W3C</ulink>.</para>
	<note>
	  <title>Updates</title>
	  <para>In minor updates the following symbols are
	    used:</para>
	  <itemizedlist>
	    <listitem>
	      <para xreflabel="added"><emphasis>Additions</emphasis>
		since the last version are indicated with a plus
		sign.</para>
	    </listitem>
	    <listitem>
	      <para xreflabel="changed"><emphasis>Changes</emphasis>
		since the last version are indicated with a plus/minus
		sign.</para>
	    </listitem>
	    <listitem>
	      <para xreflabel="deleted"><emphasis>Deletions</emphasis>
		retained temporarily for information are indicated
		with a minus sign.</para>
	    </listitem>
	  </itemizedlist>
	  <para>In major updates these are not used because almost
	    every question will have been changed.</para>
	</note>
      </section>
      <section id="availability">
	<title>Availability</title>
	<para>This XML document is at <ulink
	    url="http://xml.silmaril.ie/"></ulink>. It is XML, but it is
	  converted to HTML by Saxon before being served, so what you
	  read online is HTML in your browser.</para>
	<itemizedlist>
	  <listitem>
	    <para>You can <ulink
		url="http://xml.silmaril.ie/faq.sgml">download the
		unconverted file</ulink> (avoiding the
	      <filename>.xml</filename> filetype which
	      over-enthusiastic browsers want to usurp&mdash;just
		rename it after downloading);</para>
	  </listitem>
	  <listitem>
	    <para>The <ulink
		url="http://xml.silmaril.ie/faq.dtd">DTD</ulink> is a
	      lightly modified version of <ulink
		url="http://www.docbook.org/">DocBook</ulink>;</para>
	  </listitem>
	  <listitem>
	    <para>There is a MindMap version available by clicking on
	      the MindMap logo in the banner at the top of the page.
	      This is an XML format used by <ulink
		url="http://freemind.sourceforge.net/">FreeMind</ulink> 
	      and other MindMap software.</para>
	  </listitem>
	  <listitem>
	    <para>There are <ulink
		url="http://xml.silmaril.ie/webfaq.xsl.tar.gz">XSL
		stylesheets</ulink> for the conversion to HTML and
	      <LaTeX/> to make the PDF and PostScript versions;</para>
	  </listitem>
	  <listitem>
	    <para>A notification of new versions is posted
	      periodically to the <ulink url="comp.text.xml"
		type="news"></ulink> Usenet newsgroup, the <ulink
		url="http://listserv.heanet.ie/xml-l.html">XML-L</ulink>, 
	      <ulink
		url="http://lists.xml.org/archives/xml-dev/">xml-dev</ulink>, 
	      and <ulink
		url="http://www.mulberrytech.com/xsl/xsl-list">XSL-List</ulink> 
	      mailing lists, and to the <ulink
		url="http://www.linkedin.com/groups?gid=664967">XML/XSL 
		forum</ulink> on LinkedIn.</para>
	  </listitem>
	  <listitem>
	    <para>For printed copies there are versions for <ulink
		url="http://xml.silmaril.ie/faq_a4.ps">A4
		PostScript</ulink>, <ulink
		url="http://xml.silmaril.ie/faq_a4.pdf">A4 PDF</ulink>,
	      <ulink url="http://xml.silmaril.ie/faq_letter.ps">Letter
		PostScript</ulink> and <ulink
		url="http://xml.silmaril.ie/faq_letter.pdf">Letter
		PDF</ulink> available.</para>
	  </listitem>
	  <listitem>
	    <para>WAP (if anyone's still using it), OEB (eBook), and
	      cHTML versions have been proposed for your handheld
	      devices, and I'm open to offers if anyone wants to write
	      app code.</para>
	  </listitem>
	</itemizedlist>
	<para>The FAQ is also available in carbon-based toner on
	  flattened dead trees by sending &euro;10 (&dollar;15 or
	  equivalent in any convertible currency) to the <ulink
	    url="xmlfaq@silmaril.ie">editor</ulink> (email first to
	  check rates and postal address).</para>   
      </section>
      <section id="translations" remap="langs languages">
	<title>Translations</title>
	<para>Those I know about are in:</para>
	<itemizedlist>
	  <listitem>
	    <para><ulink
		url="http://www.oreilly.de/xml/xml_faq_fragen.html">German</ulink> 
	      (partial translation of some questions) [<personname>
		<firstname>Karin</firstname>
		<surname>Driesen</surname>
	      </personname>];</para>
	  </listitem>
	  <listitem>
	    <para><ulink
		url="http://www.senamirmir.com/xml/faq/xml_faq_amh.html">Amharic</ulink> 
	      [<personname>
		<firstname>Abass</firstname>
		<surname>Alamnehe</surname>
	      </personname>];</para>
	  </listitem>
	  <listitem>
	    <para><ulink
		url="http://www.fxis.co.jp/DMS/sgml/cafe/library/etc/xmlfaq.html">Japanese</ulink> 
	      [<personname lang="jp">
		<firstname>Makoto</firstname>
		<surname>Murata</surname>
	      </personname>];</para>
	  </listitem>
	  <listitem>
	    <para><ulink
		url="http://slug.ctv.es/~olea/sgml-esp/xfaq15.html">Spanish</ulink> 
	      (currently inaccessible) [<personname>
		<firstname>Jaime</firstname>
		<surname>Sagarduy</surname>
	      </personname>];</para>
	  </listitem>
	  <listitem>
	    <para><ulink
		url="http://xml.t2000.co.kr/faq/index.html">Korean</ulink> 
	      (currently inaccessible) [<personname>
		<firstname>Kangchan</firstname>
		<surname>Lee</surname>
	      </personname>];</para>
	  </listitem>
	  <listitem>
	    <para><ulink
		url="http://zxd.webjump.com/xml.html">Chinese</ulink>
	      (currently inaccessible) [Neko]. Also in <ulink
		url="http://weblab.crema.unimi.it/xmlzh/XML_FAQ.htm">Chinese</ulink> 
	      (also inaccessible) [<personname>
		<firstname>Jiang</firstname>
		<surname>Luqin</surname>
	      </personname>];</para>
	  </listitem>
	  <listitem>
	    <para><ulink
		url="http://www.gutenberg.eu.org/pub/GUTenberg/publications/HTML/FAQXML/faqxml-fr.html">French</ulink> 
	      [<personname>
		<firstname>Jacques</firstname>
		<surname>André</surname>
	      </personname>];</para>
	  </listitem>
	  <listitem>
	    <para><ulink
		url="http://zvon.vscht.cz/ZvonHTML/Translations/xmlFAQ/front_all.html">Czech</ulink> 
	      [<personname>
		<firstname>Miloslav</firstname>
		<surname>Nic</surname>
	      </personname>].</para>
	  </listitem>
	</itemizedlist>
	<para>I would be grateful if the translators of those copies
	  which have become inaccessible would contact me with the new
	  URI.</para>
      </section>
    </abstract>
    <legalnotice id="legal" role="Legal Notice">
      <para>This document is joint copyright &copy; 1996&ndash;2011 by
      Silmaril Consultants and the editor and is released under the
      terms of the GNU Free Documentation License (see below).
      Quotations of the contributions of others remain copyright of
      the individual contributors. You may copy and distribute this
      document in any form provided you acknowledge this source and
      the individual (in the case of a contribution) [see <link
	linkend="cite"></link> for how] and don't try to pretend you
      or someone other than the author wrote it. If you want to
      republish or reprint the FAQ in bulk, or copy all or part of it
      onto another web site, please ask the editor first to make sure
      you get the right edition, to make provision for periodic
      updating, and to ensure you use the correct legal
      wording.</para>
      <para><quote>Permission is granted to copy, distribute and/or
	modify this document under the terms of the GNU Free
	Documentation License, Version 1.3 or any later version
	published by the Free Software Foundation; with no Invariant
	Sections, no Front-Cover Texts, and no Back-Cover Texts. A
	copy of the license is available <ulink
	  url="http://www.gnu.org/licenses/fdl.html">here</ulink>. 
	You are allowed to distribute, reproduce, and modify it
	without fee or further requirement for consent subject to the
	conditions in <ulink
	  url="http://www.gnu.org/licenses/fdl-howto-opt.html">the 
	  section on Modifications</ulink>.</quote></para>
      <para>The editor and contributing authors assert their right to
      be identified as the editor and contributing authors of this
      document.</para>
      <para id="cite">For citations of this FAQ, use:</para>
      <blockquote>
	<para>Flynn, P (Ed.), <citetitle>The XML FAQ</citetitle>
	  v.<userinput><conref use="current.version"/></userinput>, Cork,
	  <userinput><conref use="current.date"/></userinput>,
	<literal>http://xml.silmaril.ie/</literal>, Q.xxx <quote>[insert
	  the question title here]</quote></para>
      </blockquote>
      <para>In bibliographic referencing systems this would be
	something like this (using BIB<TeX/> as an example):</para>
      <blockquote>
	<programlisting>
@Booklet{xmlfaq,
  title =        {The XML FAQ},
  editor =       {Peter Flynn},
  howpublished = {Webpage},
  address =      {Cork},
  month =        {<userinput><conref use="current.date" format="month" start="6" length="2"/></userinput>},
  year =         <userinput><conref use="current.date" start="1" length="4"/></userinput>,
  edition =      {v<userinput><conref use="current.version"/></userinput>},
  url =          {http://xml.silmaril.ie/},
  pages =        {Q.#}
}</programlisting>
      </blockquote>
      <para id="citefrag">A suitable format for citing
	individually-authored fragments would be:</para>
      <blockquote>
	<para><userinput>AN Other</userinput>, <quote><userinput>Title
	    of question</userinput></quote>. In Flynn, P (Ed.),
	  <citetitle>The XML FAQ</citetitle> 
	  v.<userinput><conref use="current.version"/></userinput>,
	    Silmaril Consultants, Cork, 
	  <userinput><conref use="current.date" format="month"
	    start="6" length="2"/>&nbsp;<conref use="current.date"
	    start="1" length="4"/></userinput>, 
	  <userinput>Q.xxx</userinput>.
	  <literal>http://xml.silmaril.ie/<userinput>question</userinput>.html</literal></para> 
      </blockquote>
      <para>In bibliographic referencing systems this would be
	something like this (again using BIB<TeX/> as an example):</para>
      <blockquote>
	<programlisting>
@InCollection{xmlfaq,
  author =       {<userinput>AN Other</userinput>},
  title =        {<userinput>Title of question</userinput>},
  booktitle =    {The XML FAQ},
  publisher =    {Silmaril Consultants},
  month =        {<userinput><conref use="current.date" format="month" start="6" length="2"/></userinput>},
  year =         <userinput><conref use="current.date" start="1" length="4"/></userinput>,
  editor =       {Peter Flynn},
  volume =       {<userinput>section number</userinput>},
  number =       {<userinput>question number</userinput>},
  address =      {Cork},
  url =          {http://xml.silmaril.ie/<userinput>section</userinput>/<userinput>question</userinput>/},
  edition =      {v.<userinput><conref use="current.version"/></userinput>}
}</programlisting>
      </blockquote>
    </legalnotice>
    <revhistory id="revisions" role="Revision History">
      <revision>
	<revnumber>0.0</revnumber>
	<date>1996-12-27</date>
	<revremark>First test. Unpublished.</revremark>
      </revision>
      <revision>
	<revnumber>0.1</revnumber>
	<date>1997-01-31</date>
	<revremark>First draft. Sample questions devised by
	  participants.</revremark>
      </revision>
      <revision>
	<revnumber>0.2</revnumber>
	<date>1997-02-03</date>
	<revremark>Revised draft. Additional questions and
	  answers.</revremark>
      </revision>
      <revision>
	<revnumber>0.3</revnumber>
	<date>1997-02-17</date>
	<revremark>Extensive revision following comments from the
	  group. Changes to markup and organization.</revremark>
      </revision>
      <revision>
	<revnumber>0.4</revnumber>
	<date>1997-02-23</date>
	<revremark>Minor editorial changes</revremark>
      </revision>
      <revision>
	<revnumber>0.5</revnumber>
	<date>1997-04-01</date>
	<revremark>Added Multidoc Pro as SGML browser; question on XML
	  math; fixed ambiguity in explanation of NETs; added JUMBO;
	  ERB changes of March 26; more details of linking and tools;
	  adding element declaration minimisation to the forbidden
	  list.</revremark>
      </revision>
      <revision>
	<revnumber>1.0</revnumber>
	<date>1997-05-01</date>
	<revremark>Added reference to ToC and printed URIs; added
	  disclaimer at A6; combined old A11 with A5 to explain
	  SGML/XML/HTML; clarified explanation of XML not replacing
	  HTML at C1; added new course and conference at (new) A11;
	  clarified B1, C4, C8; added FPI server at C12; removed
	  examples in C13.</revremark>
      </revision>
      <revision>
	<revnumber>1.1</revnumber>
	<date>1997-10-01</date>
	<revremark>No more minimisation parameters in element
	  declarations; parsers must now pass all white-space to the
	  application; everything is now case-sensitive, including all
	  markup; a new proposal for stylesheets: XSL, which combines
	  DSSSL and CSS in an XML format; Java[Script] and
	  metadata and their use in XML; updated list of software;
	  first XML book is published; new public mailing list
	  XML-L</revremark>
      </revision>
      <revision>
	<revnumber>1.2</revnumber>
	<date>1998-02-01</date>
	<revremark>Added a Mac icon (thanks to Martin Winter and
	  others); removed Draft from references to the spec; changed
	  revision colours; the RMD is gone: replaced references to it
	  with standalone; updated some broken URIs; [1.21] minor
	  edits to URIs and updates on translation; added XUA to
	  details of MIME types.</revremark>
      </revision>
      <revision>
	<revnumber>1.3</revnumber>
	<date>1998-06-01</date>
	<revremark>Removed the math plugin (Linux Netscape is broken
	  and refused to elide it); updated list of events (need
	  more); fixed some broken URIs; added Spanish and Korean
	  translations and the Annotated Spec; updated details of
	  MS/NS browser development; clarified the use of FPI vs
	  SysID; updated link to Feb 10 Rec Spec; added pointers to
	  the SGML Decl for XML; updated references to XLink and
	  XPointer; corrected a reference to ancient Sumerian writing;
	  clarified the need for conversion of HTML DTDs to
	  XML.</revremark>
      </revision>
      <revision>
	<revnumber>1.4</revnumber>
	<date>1998-10-01</date>
	<revremark>Added maintainer's email address under
	  Availability; added note about ISO representation and voting
	  on standards; added Greek translation; updated details of
	  conferences; changed the URI for the new SGML/XML Web Pages;
	  updated details of browsers; corrected reference to the SGML
	  omitted features from XML; updated details of converting
	  HTML to XML; added mention of comp.text.xml; extended the
	  questions on graphics and how to use XML with current
	  browsers; added questions on DOM, conformance testing, DTD
	  includes, SGML DTDs into XML, EDI; (1.41) corrected errors
	  in MIME types, URIs, SDD, and images.</revremark>
      </revision>
      <revision>
	<revnumber>1.5</revnumber>
	<date>1999-06-01</date>
	<revremark>Added new XML mailing lists in Italian and in
	  French; added details of developer resources in Chinese; two
	  more translations under way (Chinese and Czech); updated
	  links to the question on DTDs; added question on the use of
	  Java to generate and manage XML; added question on when to
	  use attributes and when to use element markup; added
	  question on the use of XML syntax to describe DTD data
	  (schemas); expanded on the explanation of the use of formal
	  language in the spec; added question on the difference
	  between XML and C++; separated information on XML versions
	  of HTML into a separate question.</revremark>
      </revision>
      <revision>
	<revnumber>1.6</revnumber>
	<date>2000-07-01</date>
	<revremark>Added French and Czech translations and a Finnish
	  mailing list, and reorganised the list of translations;
	  updated URIs for newsgroups; clarified reference to Unicode;
	  reworded question on terminology; added more links to the
	  question on conformance testing; corrected error in content
	  model example for mixed content; updates to the question on
	  stylesheets; minor edits to the question on software; major
	  changes to the question on servers and media types; updated
	  question on XML Schemas; added new question on `executing'
	  XML `programs'; replaced the math example with one less
	  likely to distress the gentle susceptibilities of some
	  readers; added a new question on knowing SGML/HTML before
	  XML.</revremark>
      </revision>
      <revision>
	<revnumber>2.0</revnumber>
	<date>2001-06-01</date>
	<revremark>DTD changed from DocBook SGML to QAML XML; removed
	  query form due to abuse; most questions revised and in some
	  cases rewritten; updated references to new versions of
	  associated standards, recommendations, and working drafts;
	  added pointer to Jon Noring's Unicode test page and NIST's
	  XSLT/XPath test suite; updated Eve Maler's links to the DTD
	  for the spec; added warnings on speling and punk chew asian;
	  added question on namespaces; fixed bug in question on
	  stylesheets; inserted explanation of `document' vs `data'
	  software; added new mailing list on XSL:FO; updated Robin
	  Cover's URI throughout; updated the question on media types
	  for RFC 3023; extended the question on graphics to cover SVG.
	  For 2.01 there were minor typos, some updated links (to
	  recent versions of the standards, and in the section on More
	  Information), and a few wording changes. Thanks to James
	  Cummings for a very thorough proofread. Editing was done
	  using GNU Emacs and psgml-mode.</revremark>
      </revision>
      <revision>
	<revnumber>2.1</revnumber>
	<date>2002-01-01</date>
	<revremark>Added <link linkend="x-hum">Humanities mailing
	    list</link>; added more references for <link
	    linkend="dbarts">XML and databases</link>; added the
	  Namespaces FAQ; corrected some misunderstandings in <link
	    linkend="utf-16">character encodings</link>; changed the
	  editor's email address; added a new question on <link
	    linkend="rootelement">root elements</link>; updated the
	  <link linkend="linkspecs">XLink</link> to W3C
	  Recommendation; updated the <link linkend="whatissgml">SGML
	    FAQ address</link>; fixed some broken links; added
	  translations into <link linkend="translations">German</link>
	  and <link linkend="translations">Amharic</link>; minor
	  revisions to some wording. Editing this time was done in
	  <ulink url="http://www.epcedit.com">epcEdit 1.02</ulink>.
	  V2.11 includes new material on <link
	    linkend="browsers">expectations and XML browsers</link>,
	  the removal of a mailing list, and a few corrections to
	  typos and links. Thanks to Seán Cannon and Dave&ampers;Nikki
	  for debugging the CSS style-sheet.</revremark>
      </revision>
      <revision>
	<revnumber>3.0</revnumber>
	<date>2003-01-01</date>
	<revremark>Added information on <link linkend="officeapps">Office
	    Applications</link> including Corel, Microsoft, and Sun
	  (to keep alphabetical order :-); updated details of <link
	    linkend="moreinfo">conferences and training</link>; updated
	  <link  linkend="browsers">browser</link> details; reworded a
	  few ungainly sentences; removed some obsolete URIs (mostly
	  for <emphasis>nice idea</emphasis> sites which died);
	  changed the phrasing of the <link linkend="databases">question on
	    databases</link>; added details on how to do standalone
	  validation to <link linkend="parsers">the question on
	    parsing</link> (thanks to Bill Rayer); added question on
	  <link linkend="management">how to present XML to
	    management</link> (thanks to Tad McClellan); the questions
	  on APIs and the DOM have been subsumed into <link
	    linkend="software">the question on software</link>, which
	  has been extensively rewritten; added yet more explanation
	  to the <link linkend="characters">section on Unicode</link>;
	  3.01 fixes minor typos; 3.02 adds updated dates for 2004
	  events.</revremark>
      </revision>
      <revision>
	<revnumber>3.01</revnumber>
	<date>2004-01-01</date>
	<revremark>Minor typographic changes</revremark>
      </revision>
      <revision>
	<revnumber>3.02</revnumber>
	<date>2004-01-12</date>
	<revremark>Added updates for 2004 events</revremark>
      </revision>
      <revision>
	<revnumber>4.0</revnumber>
	<date>2005-01-01</date>
	<revremark>Went back to <ulink
	    url="http://www.docbook.org/">DocBook</ulink> markup using
	  <ulink
	    url="http://www.docbook.org/tdg/en/html/qandaset.html"><sgmltag>qandaset</sgmltag></ulink> 
	  instead of the QAML that has been used for the last two
	  major releases. Revised text in most sections for clarity in
	  wording, and recast some now-established explanatory
	  material into the past tense. Added new dates for 2005.
	  Added explicit references to the GNU FDL in the legal
	  section. Took the tip on types of XML out into <link
	    linkend="docdata">a new question</link>, and added new
	  questions on <link linkend="includes">file
	    inclusions</link> and <link linkend="cdata">the use of
	    CDATA Marked Sections</link>.</revremark>
      </revision>
      <revision>
	<revnumber>4.1</revnumber>
	<date>2005-05-15</date>
	<revremark>Revised structure and new stylesheet for new
	  location at <ulink url="http://xml.silmaril.ie/"></ulink>.
	  The four main sections remain, but the text is served in
	  separate questions and sections rather than one huge file
	  (the PDF remains as a single document, of course). Removed
	  references to the now-defunct Balise language, added a Tip
	  on editor selection and some notes on WYSIWYG XSL[T]
	  editing.</revremark>
      </revision>
      <revision>
	<revnumber>4.2</revnumber>
	<date>2005-07-01</date>
	<revremark>Added new <link linkend="rng-list">RNG mailing
	    list</link>, updated section on <link
	    linkend="schemas">Schemas</link>, added links to the <link
	    linkend="sgmldec">SGML Declaration for XML</link>.
	  Retagged personal names for recognition, and ID'd related
	  FAQs. Expanded question on Why XML. Added link to email a
	  page to someone. Added and expanded the tips on ways of
	  getting typeset output, eg <LaTeX/>. Added new section on
	  special characters.</revremark>
      </revision>
      <revision>
	<revnumber>4.3</revnumber>
	<date>2005-09-05</date>
	<revremark>Added the notes culled from failed searches as a
	  <link linkend="glossary">Glossary</link>; updated some URLs,
	  and added one for XQuery to <link linkend="databases">the question
	    on databases</link> (thanks, Liam); updated <link
	    linkend="whatfor"></link>, <link
	    linkend="internals"></link>, <link
	    linkend="parsers"></link>, and <link linkend="cdata">the
	    question on CDATA Sections</link>. Added a new <link
	    linkend="conditionals">question on Conditionals</link>.
	  Tightened up on the indexing for searches, including the
	  removal of enclosing quotes, and added a bunch more
	  metadata.</revremark>
      </revision>
      <revision>
	<revnumber>4.31</revnumber>
	<date>2005-09-09</date>
	<revremark>Added notes on <link
	linkend="pipelines">Pipelining</link> and <link
	linkend="attribs">Attributes</link>.</revremark>
      </revision>
      <revision>
	<revnumber>4.32</revnumber>
	<date>2005-09-10</date>
	<revremark>Added details of <sgmltag
	class="attribute">xml:id</sgmltag> to the <link
	linkend="attribs">note on Attributes</link>.</revremark>
      </revision>
      <revision>
	<revnumber>4.33</revnumber>
	<date>2005-09-12</date>
	<revremark>Added more keywords, and a tip to the <link
	    linkend="asp">note on asp.net</link>.</revremark>
      </revision>
      <revision>
	<revnumber>4.34</revnumber>
	<date>2005-10-01</date>
	<revremark>Split the question on CDATA into two: one for CDATA
	  per se, and one for other ways of handling embedded HTML.
	  Added some more keywords, and revised the questions <link
	    linkend="discussions"></link> and <link
	    linkend="programming"></link>. Fixed a minor date bug in
	  the search script.</revremark>
      </revision>
      <revision>
	<revnumber>4.35</revnumber>
	<date>2005-10-08</date>
	<revremark>Fixed some broken links and removed a couple of
	obsolete ones. Added a note about the BOM.</revremark>
      </revision>
      <revision>
	<revnumber>4.36</revnumber>
	<date>2005-10-16</date>
	<revremark>Updated dates of events in <link
	    linkend="moreinfo"></link>.</revremark>
      </revision>
      <revision>
	<revnumber>4.37</revnumber>
	<date>2005-10-31</date>
	<revremark>Removed ambiguities in <link
	    linkend="includes"></link>.</revremark>
      </revision>
      <revision>
	<revnumber>4.38</revnumber>
	<date>2005-11-01</date>
	<revremark>Added personal views on patent, copyright, and
	  intellectual property.</revremark>
      </revision>
      <revision>
	<revnumber>4.39</revnumber>
	<date>2005-12-01</date>
	<revremark>Refined some keywords, changed presentations of
	some examples, reworded a paragraph on treatment of space, and
	added details of assigning a Schema to an instance.</revremark>
      </revision>
      <revision>
	<revnumber>4.4</revnumber>
	<date>2006-01-01</date>
	<revremark>Minor grammatical edits, major changes to the
	  indexing and DC metadata. Added glossary entry on data
	  export to CSV and expanded the description of nodes and the
	  grove. Fixed elusive bug in RSS feed. Added contributor
	  names to search index.</revremark>
      </revision>
      <revision>
	<revnumber>4.41</revnumber>
	<date>2006-01-07</date>
	<revremark>Fixed a cross-referencing bug in generated content.</revremark>
      </revision>
      <revision>
	<revnumber>4.5</revnumber>
	<date>2006-02-27</date>
	<revremark>Added more keywords taken from failed
	searches. Expanded on file URIs, the use of compiled DTDs,
	self-describing data, the boolean nature of parameter entity
	switches, how to get HTML features (forms, etc).</revremark>
      </revision>
      <revision>
	<revnumber>4.51</revnumber>
	<date>2006-02-28</date>
	<revremark>Added explanation of xml:id, xml:space, and
	xml:lang. Added new question on how to read (open) an XML file
	you have been sent.</revremark>
      </revision>
      <revision>
	<revnumber>4.52</revnumber>
	<date>2006-03-26</date>
	<revremark>Added more keywords and fixed a broken link to the
	  XSL FAQ.</revremark>
      </revision>
      <revision>
	<revnumber>4.53</revnumber>
	<date>2006-04-12</date>
	<revremark>Updated details of XML for Safari, and added a
	curious new enquiry.</revremark>
      </revision>
      <revision>
	<revnumber>4.54</revnumber>
	<date>2006-06-01</date>
	<revremark>Corrected an error in the description of
	xml-stylesheet. Added link targets to Quick Answers.</revremark> 
      </revision>
      <revision>
	<revnumber>4.55</revnumber>
	<date>2007-08-01</date>
	<revremark>Updated events for 2007&ndash;2008. Updated details
	of ODF and OOXML. Added section on broken software. Revised
	handling of failed searches. </revremark> 
      </revision>
      <revision>
	<revnumber>4.56</revnumber>
	<date>2007-08-08</date>
	<revremark>Added details and links for HTML5.</revremark>
      </revision>
      <revision>
	<revnumber>4.57</revnumber>
	<date>2010-02-27</date>
	<revremark>Updated events, added interim changes to formatting
	in preparation for extensive relaunch later in 2010.</revremark> 
      </revision>
      <revision>
	<revnumber>4.58</revnumber>
	<date>2010-04-24</date>
	<revremark>Updated events, removed XML Prague, added Balisage
	Symposium.</revremark>
      </revision>
      <revision>
	<revnumber id="current.version">5.00</revnumber>
	<date id="current.date">2011-01-09</date>
	<revremark>Removed obsolete information and links, reformatted
	  presentation (thanks to Parker and the PWA for help with the
	  CSS). Moved questions about Java and Javascript from Authors
	  to Developers, and question on running XML to Basics.
	  Renamed IDs appendix to appendices, contrib to
	  contributions, and revhist to revisions so that sectioning
	  can be done by ID rather than number and title. Rewrote
	  search script. Updated events. <LaTeX/> transformation also
	  updated and the PDFs reset.</revremark> 
      </revision>
    </revhistory>
  </blockinfo>
  <qandadiv id="basics" remap="FAQ-GENERAL, General">
    <title>Basics: general information about XML</title>
    <qandaentry id="whatisxml" remap="FAQ-ACRO, acro">
      <question>
	<formalpara>
	  <title>What is XML?</title>
	  <para>The Extensible Markup Language.</para>
	</formalpara>
      </question>
      <answer>
	<para>XML is the Extensible Markup Language. It improves the
	  functionality of the Web by letting you identify your
	  information in a more accurate, flexible, and adaptable
	  way.</para>
	<para>It is extensible because it is not a fixed format like
	  HTML (which is a single,
	  <emphasis>predefined</emphasis>&nbsp;<firstterm
	    linkend="markup">markup language</firstterm>). Instead,
	  XML is a <firstterm>metalanguage</firstterm>&mdash;a
	  language for describing other languages&mdash;which lets you
	  design your own markup languages for limitless different
	  types of documents. XML can do this because it's written in
	  <link linkend="whatissgml" xreflabel="simple">SGML</link>,
	  the international standard metalanguage for text document
	  markup (ISO 8879).</para>
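	<para>As a rough illustration of designing your own markup
	  language, here is a minimal sketch of a Document Type
	  Definition for a recipe. The element type names
	  (<sgmltag>recipe</sgmltag>, <sgmltag>title</sgmltag>, and
	  <sgmltag>ingredient</sgmltag>) are invented for this example
	  and do not come from any published standard:</para>
	<programlisting><![CDATA[
<!ELEMENT recipe (title, ingredient+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT ingredient (#PCDATA)>
	  ]]></programlisting>
	<para>A document conforming to this sketch would contain one
	  title followed by one or more ingredients, all enclosed in a
	  single <sgmltag>recipe</sgmltag> element.</para>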
      </answer>
    </qandaentry>
    <qandaentry id="markup">
      <question>
	<formalpara>
	  <title>What is a markup language?</title>
	  <para>A way of describing what's what in a document.</para>
	</formalpara>
      </question>
      <answer>
	<para>A markup language is a set of words and symbols for
	  describing the <firstterm>identity</firstterm> or
	  <firstterm>function</firstterm> of the component parts of a
	  document (for example <quote>this is a paragraph</quote>,
	  <quote>this is a heading</quote>, <quote>this is a
	    list</quote>, <quote>this is the caption of this
	    figure</quote>, etc). Programs can use markup with a
	  stylesheet to transform the document into output for screen,
	  print, audio, video, Braille, or reprocessable data formats.</para>
	<para>Some markup languages (especially those used in
	  wordprocessors) only describe
	  <firstterm>appearances</firstterm> instead (<quote>this is
	    italics</quote>, <quote>this is bold</quote>, <quote>this
	    has 3mm space below</quote>, etc), so these systems can
	  only be used for display, and are not easily re-usable for
	  anything else.</para>
	<para>XML is sometimes referred to as
	  <quote>self-describing</quote> because the names of the
	  markup elements can represent the type of content they hold
	  (eg <sgmltag>title</sgmltag>, <sgmltag>chapter</sgmltag>,
	  <sgmltag>link</sgmltag>, etc).</para>
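	<para>The difference between describing identity and
	  describing appearance can be sketched with a short example.
	  Both fragments below are hypothetical: the element type
	  names and attributes are made up for illustration. The first
	  identifies what each part of the text is:</para>
	<programlisting><![CDATA[
<chapter>
  <title>Keeping rabbits</title>
  <para>Rabbits need <emph>daily</emph> exercise.</para>
</chapter>
	  ]]></programlisting>
	<para>The second records only how the same text should look,
	  so a program cannot tell that the first line is a chapter
	  title:</para>
	<programlisting><![CDATA[
<text font="bold" size="14pt" space-below="3mm">Keeping rabbits</text>
<text>Rabbits need <text font="italic">daily</text> exercise.</text>
	  ]]></programlisting>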
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-DEF, def" id="whatfor" >
      <question>
	<formalpara>
	  <title>What is XML for?</title>
	  <para>XML is for identification, transmission, and storage.</para>
	</formalpara>
      </question>
      <answer remap="examples roles web real world used for uses usage
      technology technologies single sourceing e-publishing publishing
      ecommerce"> 
	<blockquote xreflabel="spec">
	  <title>Goal</title>
	  <para>&hellip;to enable generic SGML to be served, received,
	    and processed on the Web in the way that is now possible
	    with HTML. XML has been designed for ease of
	    implementation and for interoperability with both SGML and
	    HTML.</para>
	</blockquote>
	<para>Despite <ulink
	    url="http://www.oasis-open.org/cover/sgmlwww.html">early
	    attempts</ulink>, browsers never allowed other SGML, only
	  HTML (although there were <link linkend="panorama"
	    xreflabel="simple">plugins</link>). Browser vendors also
	  allowed (even encouraged) HTML to be corrupt or broken in
	  order to make it <wordasword>easier</wordasword>. This
	  enabled HTML to become widespread, but held development back
	  for over a decade by making it impossible to program for it
	  reliably. XML fixes that by making it compulsory to stick to
	  the rules, and by making the rules much simpler than
	  SGML.</para>
	<para>But XML is not just for Web pages: in fact it's very
	  rarely used on its own for Web pages because browsers still
	  don't provide reliable support for it. Common uses for XML
	  include:</para>
	<variablelist>
	  <varlistentry>
	    <term>Information identification</term>
	    <listitem>
	      <para>You can define your own markup, so you can define
		meaningful names for all your information
		items.</para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term>Information storage</term>
	    <listitem>
	      <para>Because XML is portable and non-proprietary, it
		can be used to store information across any platforms.
		Because it is backed by an international standard, it
		will remain accessible and processable as a data
		format.</para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term>Information structure</term>
	    <listitem>
	      <para>XML structures can <wordasword>nest</wordasword>,
		so they can be used to store and identify any kind of
		hierarchical information, especially long, deep, or
		complex document sets or data sources, which makes it
		ideal for an information-management back-end to
		serving the Web. This is one of its most common Web
		applications, with a transformation system to serve it
		as HTML until such time as browsers are able to handle
		XML consistently.</para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term>Publishing</term>
	    <listitem>
	      <para>The original goal of XML as defined in the
		quotation at the start of this section. Combining the
		three previous topics (identity, storage, and
		structure) means it is possible to get all the
		benefits of robust document management and control
		(with XML) and publish to the Web (as HTML) as well as
		to paper (as PDF) and to other formats (eg Braille,
		audio, etc) from a single source document by using the
		appropriate stylesheets.</para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term>Messaging and data transfer</term>
	    <listitem>
	      <para>XML is also very heavily used for enclosing or
		encapsulating information in order to pass it between
		different computing systems which would otherwise be
		unable to communicate because of their proprietary or
		secret data formats. By providing a
		<foreignphrase>lingua franca</foreignphrase> for data
		identity and structure, XML provides a common
		<wordasword>envelope</wordasword> for inter-process
		communication (messaging).</para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term>Web services</term>
	    <listitem>
	      <para>Building on all of these, as well as its use in
		browsers, machine-processable data can be exchanged
		between consenting systems, where before it was only
		comprehensible by humans (HTML). Weather services,
		e-commerce sites, blog newsfeeds, <link linkend="ajax"
		  xreflabel="simple">AJaX</link> sites, and thousands
		of other data-exchange services use XML for data
		management and transmission, and the web browser for
		display and interaction.</para>
	    </listitem>
	  </varlistentry>
	</variablelist>
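	<para>As a sketch of the messaging and Web-services uses
	  above, a weather service might send a reading to another
	  system in a form like the following. The element type names,
	  attribute names, and values are all invented for
	  illustration, and are not taken from any real service:</para>
	<programlisting><![CDATA[
<observation station="EXAMPLE1" time="2011-01-09T12:00:00Z">
  <temperature unit="celsius">4.5</temperature>
  <wind direction="SW" unit="km/h">22</wind>
</observation>
	  ]]></programlisting>
	<para>Because the names identify each item of data, the
	  receiving system can pick out the temperature or the wind
	  speed without needing to know anything about the software
	  that produced the message.</para>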
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-SGML, sgml" id="whatissgml">
      <question>
	<formalpara>
	  <title>What is SGML?</title>
	  <para>Standard Generalized Markup Language, ISO
	    8879:1986</para>
	</formalpara>
      </question>
      <answer>
	<para id="faq:SGML">SGML is the Standard Generalized Markup
	  Language (<ulink url="http://www.iso.ch/">ISO
	    8879:1986</ulink>), the international standard for
	  defining descriptions of the structure of different types of
	  electronic document. There is an SGML FAQ from <personname>
	    <firstname>David</firstname>
	    <surname>Megginson</surname>
	  </personname> at <ulink id="FAQ:sgml"
	    url="http://math.albany.edu:8800/hm/sgml/cts-faq.html"></ulink>; 
	  and <personname>
	    <firstname>Robin</firstname>
	    <surname>Cover</surname>
	  </personname>'s SGML Web pages are at <ulink
	    url="http://www.oasis-open.org/cover/general.html"></ulink>. 
	  For a little light relief, try <personname>
	    <firstname>Joe</firstname>
	    <surname>English</surname>
	  </personname>'s <quote>Not the SGML FAQ</quote> at <ulink
	  id="FAQ:not-sgml" 
	    url="http://www.flightlab.com/~joe/sgml/faq-not.txt"></ulink>.</para>
	<para>SGML is very large, powerful, and complex. It has been
	  in heavy industrial and commercial use for nearly two decades,
	  and there is a significant body of expertise and software to
	  go with it.</para>
	<para>XML is a lightweight cut-down version of SGML
	  which keeps enough of its functionality to make it useful
	  but removes all the optional features which made SGML too
	  complex to program for in a Web environment.</para>
	<note>
	  <para>ISO standards like SGML are governed by the
	    International Organization for Standardization in Geneva,
	    Switzerland, and voted into or out of existence by
	    representatives from every country's national standards
	    body.</para>
	  <para>If you have a query about an international standard,
	    you should contact your national standards body for the
	    name of your country's representative on the relevant ISO
	    committee or working group.</para>
	  <para>If you have a query about your country's
	    representation in Geneva or about the conduct of your
	    national standards body, you should contact the relevant
	    government department in your country, or speak to your
	    public representative.</para>
	  <para>The representation of countries at the ISO is not a
	    matter for this FAQ. Please do not submit queries to the
	    editor about how or why your country's ISO representatives
	    have or have not voted on a specific standard.</para>
	</note>
      </answer>
    </qandaentry>
    <qandaentry id="whatishtml" remap="FAQ-HTML, html"
    revisionflag="changed">
      <question>
	<formalpara>
	  <title>What is HTML?</title>
	  <para>HyperText Markup Language, RFC 1866, the language of
	    Web pages.</para>
	</formalpara>
      </question>
      <answer>
	<para>HTML is the <ulink
	    url="http://www.w3.org/MarkUp">HyperText Markup
	    Language</ulink> (<ulink
	    url="ftp://ftp.rfc-editor.org/in-notes/rfc1866.txt">RFC
	    1866</ulink>), which started as a small application of
	  <link linkend="whatissgml" xreflabel="simple">SGML</link>
	  for the Web, originating with <ulink
	    url="http://public.web.cern.ch/Public/Content/Chapters/AboutCERN/Achievements/WorldWideWeb/WWW-en.html"><personname> 
	      <firstname>Tim</firstname>
	      <surname>Berners-Lee</surname> </personname> at
	    CERN</ulink> in 1989&ndash;90.</para>
	<para id="faq:HTML">It defines a very simple class of
	  report-style documents, with section headings, paragraphs,
	  lists, tables, and illustrations, with a few informational
	  elements, but very few presentational elements <biblioref
	    linkend="nopres" role="footnote"/>, plus some hypertext
	  and multimedia. See the question on <link
	    linkend="extendhtml" xreflabel="simple">extending
	    HTML</link>. The current recommendation is to use the XML
	  version, <link linkend="xhtml"
	    xreflabel="simple">XHTML</link>. There is a HTML and XHTML
	  FAQ maintained by <personname>
	    <firstname>Steven</firstname>
	    <surname>Pemberton</surname>
	  </personname> at <ulink id="FAQ:html"
	    url="http://www.w3.org/MarkUp/2004/xhtml-faq"></ulink>.</para>
	<para id="faq:HTML5">Recent moves by
	  the W3C have led to the development of a revision of HTML
	  called <ulink
	    url="http://www.w3.org/TR/html5/">HTML5</ulink>. There is an <ulink
	    url="http://www.ibm.com/developerworks/library/x-html5/?ca=dgr-lnxw01NewHTML">explanation</ulink> 
	  from Elliotte Rusty Harold, and a <ulink id="FAQ:html5"
	    url="http://blog.whatwg.org/faq/">FAQ</ulink> from the
	  WhatWG.</para>
      </answer>
    </qandaentry>
    <qandaentry id="differences" remap="FAQ-SAME, same">
      <question>
	<formalpara>
	  <title>Aren't XML, SGML, and HTML all the same
	    thing?</title>
	  <para>No, SGML and XML are
	    metalanguages. HTML is an application of them.</para>
	</formalpara>
      </question>
      <answer remap="differences similarity similarities different
	    between xml sgml html compiled case sensitive senstive">
	<para>Not quite; <link linkend="whatissgml"
	    xreflabel="simple">SGML</link> is the mother tongue, and
	  has been used for describing thousands of different document
	  types in many fields of human activity, from transcriptions
	  of <ulink url="http://celt.ucc.ie/">ancient Irish
	    manuscripts</ulink> to the <ulink
	    url="http://web.deskbook.osd.mil/">technical documentation
	    for stealth bombers</ulink>, and from <ulink
	    url="http://www.hl7.org">patients' medical and clinical
	    records</ulink> to <ulink
	    url="http://www.tecno.com/smdl.htm">musical
	    notation</ulink>. SGML is very large and complex, however,
	  and probably overkill for most common office desktop
	  applications.</para>
	<para>XML is an abbreviated version of SGML, to make it easier
	  to use over the Web, easier for you to define your own
	  document types, and easier for programmers to write programs
	  to handle them. It omits all the complex and less-used
	  options of SGML in return for the benefits of being easier
	  to write applications for, easier to understand, and more
	  suited to delivery and interoperability over the Web. But it
	  is still SGML, and XML files may still be processed in the
	  same way as any other SGML file (see the question on <link
	    linkend="software" xreflabel="simple">XML
	    software</link>).</para>
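	<para>One example of an omitted option is tag minimization.
	  The following sketch uses element type names from HTML. In
	  SGML, a document type can permit end-tags to be left out, as
	  HTML does for list items:</para>
	<programlisting><![CDATA[
<ul>
<li>First item
<li>Second item
</ul>
	  ]]></programlisting>
	<para>In XML every element must be explicitly closed, so the
	  same list has to be written with all its end-tags:</para>
	<programlisting><![CDATA[
<ul>
<li>First item</li>
<li>Second item</li>
</ul>
	  ]]></programlisting>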
	<para><link linkend="whatishtml"
	    xreflabel="simple">HTML</link> is just one of many SGML or
	  XML applications&mdash;the one most frequently used on the
	  Web.</para>
	<para>Technical readers may find it more useful to think of
	  XML as being SGML&minus;&minus; rather than HTML++.</para>
	<para>(Ed: In respect of this last paragraph, see <link
	    linkend="programming"></link> and <link
	    linkend="execute"></link>.)</para>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-OWNS, owns" id="responsible">
      <question>
	<formalpara>
	  <title>Who is responsible for XML?</title>
	  <para>The W3C</para>
	</formalpara>
      </question>
      <answer remap="function W3C windows microsoft office owns market">
	<para>XML is a Recommendation of the <ulink
	    url="http://www.w3.org/">World Wide Web Consortium
	    (W3C)</ulink>, and the development of the specification is
	  supervised by an XML Working Group. A Special Interest Group
	  of co-opted contributors and experts from various fields
	  contributed comments and reviews by email.</para>
	<para>XML is a public format: it is not a proprietary
	  development of any company, although the membership of the
	  WG and the SIG represented companies as well as research and
	  academic institutions. <link linkend="spec" xreflabel="simple">The v1.0
	    specification</link> was accepted by the W3C as a
	  Recommendation on Feb 10, 1998.</para>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-IMPORT, import" id="important">
      <question>
	<formalpara>
	  <title>Why is XML such an important development?</title>
	  <para>It overcomes the inflexibility of HTML and the
	    complexity of SGML</para>
	</formalpara>
      </question>
      <answer remap="advantages using xml">
	<para>It removes two constraints which were holding back Web
	  developments:</para>
	<orderedlist>
	  <listitem>
	    <para>dependence on a single, inflexible document type
	      (<link linkend="whatishtml"
		xreflabel="simple">HTML</link>) which was being much
	      abused for tasks it was never designed for;</para>
	  </listitem>
	  <listitem>
	    <para>the complexity of full <link linkend="whatissgml"
		xreflabel="simple">SGML</link>, whose syntax allows
	      many powerful but hard-to-program options.</para>
	  </listitem>
	</orderedlist>
	<para>XML allows the flexible development of user-defined
	  document types. It provides a robust, non-proprietary,
	  persistent, and verifiable file format for the storage and
	  transmission of text and data both on and off the Web; and
	  it removes the more complex options of SGML, making it
	  easier to program for.</para>
      </answer>
    </qandaentry>
    <qandaentry id="extendhtml" remap="FAQ-EXTEND, extend">
      <question>
	<formalpara>
	  <title>Why not just carry on extending HTML?</title>
	  <para>HTML is already too overburdened with proprietary
	    add-ons.</para>
	</formalpara>
      </question>
      <answer>
	<para><link linkend="whatishtml" xreflabel="simple">HTML</link>
	  was already overburdened with dozens of interesting but
	  incompatible inventions from different manufacturers,
	  because it provides only one way of describing your
	  information.</para>
	<para>XML allows groups of people or organizations to <link
	    linkend="owndoctype" xreflabel="simple">create their own
	    customized markup applications</link> for exchanging
	  information in their domain (music, chemistry, electronics,
	  hill-walking, finance, surfing, petroleum geology,
	  linguistics, cooking, knitting, stellar cartography,
	  history, engineering, rabbit-keeping, <link
	    xreflabel="simple"
	    linkend="mathematics">mathematics</link>, <ulink
	    url="http://users.iclway.co.uk/mhkay/gedml/index.html">genealogy</ulink>, 
	  etc).</para>
	<para>HTML as originally conceived is now well beyond the
	  limit of its usefulness as a way of describing information,
	  and while <link linkend="whatishtml"
	    xreflabel="simple">HTML5</link> will continue to play an
	  important role for the content it represents, many new
	  applications require a more robust and flexible
	  infrastructure.</para>
      </answer>
    </qandaentry>
    <qandaentry id="whyxml" remap="FAQ-WORD, word">
      <question>
	<formalpara>
	  <title>Why should I use XML?</title>
	  <para>It's a robust, durable, manipulable, and free format
	    for information identification, storage and
	    transfer.</para>
	</formalpara>
      </question>
      <answer remap="seperate data">
	<para>Here are a few reasons for using XML (in no particular
	  order). Not all of these will apply to your own
	  requirements, and you may have additional reasons not
	  mentioned here (if so, please let the editor of the FAQ
	  know!).</para>
	<itemizedlist>
	  <listitem>
	    <para>XML can be used to describe and identify information
	      accurately and unambiguously, in a way that computers
	      can be programmed to <quote>understand</quote> your
	      information (well, at least manipulate it as if they could
	      understand it).</para>
	  </listitem>
	  <listitem>
	    <para>XML allows documents which are all the same type to
	      be created and handled consistently and without
	      structural errors, because it provides a standardised
	      way of describing, controlling, or allowing/disallowing
	      particular types of document structure. [Note that this
	      has absolutely nothing whatever to do with formatting,
	      appearance, or the actual text or data content of your
	      documents, only the structure of them. If you want
	      styling or formatting, see <link
		linkend="style"></link>.]</para>
	  </listitem>
	  <listitem>
	    <para>XML provides a robust and durable format for
	      information storage and transmission. Robust because it
	      is based on a proven standard, and can thus be tested
	      and verified; durable (persistent) because it uses
	      plain-text file formats which will outlast proprietary
	      binary ones.</para>
	  </listitem>
	  <listitem>
	    <para>XML provides a common syntax for messaging systems
	      for the exchange of information between applications.
	      Previously, each messaging system had its own format and
	      all were different, which made inter-system messaging
	      unnecessarily messy, complex, and expensive. If everyone
	      uses the same syntax it makes writing these systems much
	      faster and more reliable.</para>
	  </listitem>
	  <listitem>
	    <para>XML is free. Not just free of charge (free as in
	      beer) but free of legal encumbrances (free as in
	      speech). It doesn't belong to anyone, so it can't be
	      hijacked or pirated. And you don't have to pay a fee to
	      use it (you can of course choose to use commercial
	      software to deal with it, for lots of good reasons, but
	      you don't pay for XML itself).</para>
	  </listitem>
	  <listitem>
	    <para>XML information can be manipulated programmatically
	      (under machine control), so XML documents can be pieced
	      together from disparate sources, or taken apart and
	      re-used in different ways. They can be converted into
	      any other format with no loss of information.</para>
	  </listitem>
	  <listitem id="separate">
	    <para>XML lets you separate form (appearance) from
	      content. Your XML file contains your document
	      information (text, data) and identifies its structure:
	      your formatting and other processing needs are
	      identified separately in a <link linkend="style"
	      xreflabel="simple">stylesheet</link> or processing
	      system. The two are combined at output time to apply the
	      required formatting to the text or data identified by
	      its structure (location, position, rank, order, or
	      whatever).</para>
	  </listitem>
	  <listitem>
	    <para>Any of the Design Goals listed in the <ulink
		url="http://www.w3.org/TR/2004/REC-xml-20040204/#sec-origin-goals">XML 
		Specification</ulink>.</para>
	  </listitem>
	</itemizedlist>
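	<para>The separation of form from content described in the
	  list above can be sketched with two small files. The file
	  names and element type names here are hypothetical. The XML
	  file holds the information and uses an
	  <literal>xml-stylesheet</literal> processing instruction to
	  point to a stylesheet:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="recipe.css"?>
<recipe>
  <title>Soda bread</title>
  <ingredient>450g plain flour</ingredient>
  <ingredient>400ml buttermilk</ingredient>
</recipe>
	  ]]></programlisting>
	<para>The stylesheet (CSS in this sketch; XSLT is another
	  option) holds the formatting:</para>
	<programlisting><![CDATA[
title      { display: block; font-weight: bold; }
ingredient { display: block; margin-left: 2em; }
	  ]]></programlisting>
	<para>Changing the appearance then means editing only the
	  stylesheet: the XML file is left untouched.</para>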
	<tip xreflabel="Peter Flynn">
	  <title>Why not just use Word or Notes?</title>
	  <para>Restricted proprietary data formats are unsuitable
	    for durable public information.</para>
	  <para>Information on a network which connects many different
	    types of computer has to be usable on all of them. Public
	    information in particular cannot afford to be restricted
	    to one make or model or manufacturer, or to cede control
	    of its data format to private hands. It is also helpful
	    for such information to be in a form that can be reused in
	    many different ways, as this will minimize wasted time and
	    effort. <ulink
	      url="http://publish.ucc.ie/doc/markup?sectoc=1">Proprietary 
	      data formats</ulink>, no matter how well documented or
	    publicized, are simply not an option: their control still
	    resides in private hands and they can be changed or
	    withdrawn arbitrarily without notice.</para>
	  <para><link linkend="whatissgml" xreflabel="simple">SGML</link> is the
	    international standard for defining this kind of
	    application, and was therefore the natural choice for XML,
	    but those who need an alternative based on different
	    software for other purposes are entirely free to implement
	    similar services using such a system, especially if they
	    are for private use.</para>
	</tip>
      </answer>
    </qandaentry>
    <qandaentry id="moreinfo" remap="FAQ-HOWTO, FAQ-MORE, more"
    revisionflag="changed">
      <question>
	<formalpara>
	  <title>Where do I find more information about XML?</title>
	  <para>At http://xml.coverpages.org/</para>
	</formalpara>
      </question>
      <answer remap="documentation help conferences forums mailing
	  lists discussion groups books articles summer school">
	<para id="faq:XML-Condensed">Online, there's the <link
	    linkend="spec" xreflabel="simple">XML Specification</link>
	  and the ancillary documentation available from the <ulink
	    url="http://www.w3.org/">W3C</ulink>; Robin Cover's <ulink
	    url="http://xml.coverpages.org/">SGML/XML Web
	    pages</ulink> with an extensive list of online reference
	  material and links to software; and a <ulink
	    url="http://www.textuality.com/xml/">summary</ulink> and
	  <ulink id="FAQ:xml-condensed"
	    url="http://www.textuality.com/xml/faq.html">condensed
	    FAQ</ulink> from <personname>
	    <firstname>Tim</firstname>
	    <surname>Bray</surname>
	  </personname>; and thousands of reference resources
	  available by typing <quote>xml</quote> into Google or another
	  search engine.</para>
	<para>For offline resources, see the
	  lists of books, articles, and software for XML in
	    <personname>
	    <firstname>Robin</firstname>
	    <surname>Cover</surname>
	  </personname>'s <ulink
	    url="http://xml.coverpages.org/sgml-xml.html">SGML and XML
	    Web pages</ulink>. That site should always be your first
	  port of call.</para>
	<para>The events listed below are the
	  ones I have been told about. Please <ulink
	    url="xmlfaq@silmaril.ie">mail me</ulink> if you come
	  across others: there are many other XML events around the
	  world, and most of them are announced on the <link xreflabel="simple"
	    linkend="discussions">mailing lists and
	    newsgroups</link>.</para>
	<note id="events">
	  <title>Events</title>
	  <itemizedlist>
	    <listitem>
	      <para>The <ulink
		  url="http://balisage.net/">Balisage</ulink>
		conference (the principal technical meeting) will be
		in Montréal on 1st&ndash;5th August 2011.</para>
	    </listitem>
	    <listitem>
	      <para id="summer">The 2011 annual <ulink
		  url="http://www.xmlsummerschool.com/">XML Summer
		  School</ulink>, organised by <ulink
		  url="http://elevenllp.co.uk/">Eleven
		  Informatics</ulink>, will be held in St Edmund Hall,
		Oxford on 18th&ndash;23rd September 2011.</para>
	    </listitem>
	    <listitem>
	      <para>The <ulink
		  url="http://www.idealliance.org/conferences_and_events/find?industry=xml">XML-in-Practice
		  2011 Conference &ampers; Exposition</ulink> (run by
		  IDEAlliance, formerly the GCA)
		will be held in October.</para>
	    </listitem>
	    <listitem>
	      <para><ulink
	      url="http://www.xmlprague.cz/2011/index.html">XML
	      Prague</ulink> will be held on March 26th &ampers; 27th,
	      2011 at Charles University.</para>
	    </listitem>
	  </itemizedlist>
	</note>
      </answer>
    </qandaentry>
    <qandaentry id="discussions" remap="FAQ-MAILINGLIST, mailinglist">
      <question>
	<formalpara>
	  <title>Where can I discuss implementation and development of
	    XML?</title>
	  <para>On mailing lists, Usenet newsgroups, web-based
	    bulletin-boards, and IRC channels</para> 
	</formalpara>
      </question>
      <answer remap="forums">
	<para>Two of the principal online support media are Usenet
	  newsgroups and mailing lists. The IRC network is also used
	  to some extent, and most individual projects and programs
	  have their own topic-specific bulletin-boards on their web
	  sites. There is also an unknown number of
	  question-and-answer forum sites which are findable using
	  search engines.</para>
	<para>For off-line support, see <link
	    linkend="moreinfo"></link> for details of conferences and
	  summer schools.</para>
	<itemizedlist>
	  <listitem>
	    <para>The main Usenet newsgroup is <ulink type="news"
		url="comp.text.xml">comp.text.xml</ulink>, although it
	      is less used than formerly. Ask your Internet Provider
	      for access to Usenet, or use a Web interface like the
	      <ulink
		url="http://groups.google.com/group/comp.text.xml/topics">searchable 
		archive</ulink> maintained by Google. If your browser
	      or mailer doesn't provide newsreading facilities,
	      install one that does, or (better) use a standalone
	      newsreader.</para>
	    <para>The <ulink type="news"
		url="comp.text.sgml">comp.text.sgml</ulink> newsgroup is
	      for all practical purposes no longer used. The
	      Microsoft-specific newsgroups are being phased out in
	      favour of web-based forums hosted by Microsoft
	      themselves.</para>
	  </listitem>
	  <listitem>
	    <para>The general-purpose mailing list for public
	      discussion is <ulink
		url="http://listserv.heanet.ie/xml-l.html">XML-L</ulink>: 
	      to subscribe, visit <ulink
		url="https://listserv.heanet.ie/cgi-bin/wa?SUBED1=xml-l&ampers;A=1">the 
		Web site</ulink> and click on the link to join.</para>
	  </listitem>
	  <listitem>
	    <para>For those developing software components for XML
	      there is the <ulink
		url="http://lists.xml.org/archives/xml-dev/">xml-dev
		mailing list</ulink>. You can subscribe by sending a
	      one-line mail message to <ulink
		url="xml-dev-request@lists.xml.org"></ulink> saying
	      just <literal>SUBSCRIBE</literal>. Note that this list
	      is for those people actively involved in developing
	      resources for XML. It is not for general information
	      about XML (use the XML-L list above for that).</para>
	  </listitem>
	  <listitem>
	    <para>The XSL-List is for discussing XSL (both XSLT
	      and XSL-FO). For details of how to subscribe, see <ulink
		url="http://www.mulberrytech.com/xsl/xsl-list"></ulink>.</para>
	  </listitem>
	  <listitem>
	    <para>There is a long list of other discussion groups,
	      mailing lists, and forums on Robin Cover's site at
	      <ulink
		url="http://xml.coverpages.org/lists.html"></ulink>.</para>
	  </listitem>
	</itemizedlist>
	<tip xreflabel="Andrew Watt">
	  <para>There is a mailing list specifically for <ulink
	      url="http://www.egroups.com/group/XSL-FO">XSL-FO only,
	      on eGroups.com</ulink>. You can subscribe by sending a
	    message to <ulink
	      url="XSL-FO-subscribe@egroups.com"></ulink>.</para>
	</tip>
	<warning>
	  <para>Be aware that the Yahoo E-Groups XSL-FO list sends out
	    regular automated spam to non-members falsely claiming
	    that they have asked to join.</para>
	</warning>
	<tip xreflabel="Gianni Rubagotti">
	  <para>A new Italian mailing list about XML is born: to
	    subscribe, send a mail message without a subject line but
	    with text saying <literal>subscribe XML-IT</literal> to
	    <ulink url="majordomo@ananas.usr.dsi.unimi.it"></ulink>.
	    Everyone, Italian or not, who wants to debate about XML in
	    our tongue is welcome.</para>
	  <para id="x-hum">Gianni also runs the <ulink
	      url="http://groups.yahoo.com/group/x-humanities/">Humanities 
	      XML List</ulink>.</para>
	</tip>
	<tip xreflabel="J-P Theberge">
	  <para>A French mailing list about XML has been created. To
	    subscribe, send <literal>subscribe</literal> to <ulink
	      url="xml-request@trisome.com"></ulink>.</para>
	</tip>
	<tip id="rng-list" xreflabel="Murata Makoto" lang="jp">
	  <para>Please mention this mailing list to your colleagues
	    who use RELAX NG. Go to: <ulink
	      url="http://groups.yahoo.com/group/rng-users/"></ulink>.</para>
	</tip>
	<note>
	  <title>Mailing lists</title>
	  <para>When you join a mailing list you will be sent details
	    of how to use it.  Please Read The Fine Documentation
	    because it  contains important information, particularly
	    about what to do if your company or ISP changes your email
	    address.</para>
	  <para>Please note that there is a lot of inaccurate and
	    misleading information published in print and on the Web
	    about subscribing to and unsubscribing from mailing lists.
	    Don't guess: Read The Fine Documentation.</para>
	</note>
      </answer>
    </qandaentry>
    <qandaentry id="programming" remap="langs javascript cobol pl/1 pl/i
	  pascal perl python ruby tcl/tk ppl differences" >
      <question>
	<formalpara>
	  <title>What is the difference between XML and C or
	    C++ or Java?</title>
	  <para>C and Java are for writing programs; XML is for
	    storing text.</para>
	</formalpara>
      </question>
      <answer>
	<para>C and C++ (and other languages like FORTRAN, or Pascal,
	  or Visual Basic, or Java or hundreds more) are
	  <emphasis>programming languages</emphasis> with which you
	  specify calculations, actions, and decisions to be carried
	  out in order:</para>
	<programlisting><![CDATA[
mod curconfig[if left(date,6) = "01-Apr", 
    t.put "April Fool!", 
    f.put days('31102011','DDMMYYYY') -
          days(sdate,'DDMMYYYY')
    " more shopping days to Samhain"];
	  ]]></programlisting>
	<para>XML is a markup specification language with which you
	  can design ways of describing information (text or data),
	  usually for storage, transmission, or processing by a
	  program. It says nothing about what you should do with the
	  data (although your choice of element names may hint at what
	  they are for):</para> 
	<programlisting><![CDATA[
<part num="DA42" models="LS AR DF HG KJ" update="2001-11-22">
  <name>Camshaft end bearing retention circlip</name>
  <image drawing="RR98-dh37" type="SVG" x="476" y="226"/>
  <maker id="RQ778">Ringtown Fasteners Ltd</maker>
  <notes>An <tool id="GH25">angle-nosed insertion tool</tool> is 
    required for the removal and replacement of this part.</notes>
</part>
	  ]]></programlisting>
	<para>On its own, an SGML or XML file (including HTML) doesn't
	  do anything. It's a data format which just sits there until
	  you run a program which does something
	  <emphasis>with</emphasis> it. See also the question about
	  <link linkend="execute" xreflabel="simple">how to run or
	    execute XML files</link>.</para>
	<tip xreflabel="William Hammond">
	  <para>(in article
	    <literal><![CDATA[<i7ll1362ib.fsf@hilbert.math.albany.edu>]]></literal>)</para>
	  <para>SGML is a category of <wordasword>document
	      types</wordasword>, with a configurable shared syntax,
	    most of which (like classic HTML) cannot be compiled to
	    produce executable programs.  XML is a subcategory of SGML
	    with syntactic restrictions.  For example, with XML the
	    vocabulary of a document type is always case sensitive,
	    while with SGML it may be either case sensitive or case
	    insensitive.  So, for example, classic HTML is an SGML
	    document type, and XHTML+MathML is an XML document
	    type.</para>
	  <para>While some document types correspond to document
	    markup languages, other document types (like a CTAN catalog
	    entry) are just for structured data[...]</para>
	  <para>I doubt seriously, however, that a computer language
	    like C is in any reasonable sense equivalent to an SGML
	    document type.
	  </para>
	</tip>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-REPLACE, replace" id="replacehtml">
      <question>
	<formalpara>
	  <title>Does XML replace HTML?</title>
	  <para>No.</para>
	</formalpara>
      </question>
      <answer>
	<para>No. XML itself does not replace HTML. Instead, it
	  provides an alternative which allows you to define your own
	  set of markup elements. HTML is expected to remain in common
	  use on the web, and the current version of HTML (<link
	    linkend="xhtml" xreflabel="simple">XHTML</link>) is in XML
	  syntax, although HTML5 may depart from this.</para>
	<para>XML is designed to make the writing of processing
	  software much easier than with SGML, which is what the
	  original HTML was based on.</para>
      </answer>
    </qandaentry>
    <qandaentry id="xhtml" remap="htmlxml">
      <question>
	<formalpara>
	  <title>Is there an XML version of HTML?</title>
	  <para>Yes, XHTML from W3C</para>
	</formalpara>
      </question>
      <answer>
	<para>Yes, the W3C Recommendation is <ulink
	    url="http://www.w3.org/TR/xhtml1/">XHTML</ulink> which is
	  <quote>a reformulation of HTML 4 in XML 1.0</quote>. This
	  specification defines HTML as an XML application, and
	  provides three DTDs corresponding to the ones defined by
	  HTML 4.* (Strict, Transitional, and Frameset).</para>
	<para>The semantics of the elements and their attributes are
	  as defined in the W3C Recommendation for HTML 4. These
	  semantics were intended to provide the foundation for future
	  extensibility of XHTML. Compatibility with existing HTML
	  browsers is possible by following a small set of guidelines
	  (see the W3C site).</para>
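	<para>As a rough illustration (a minimal sketch, not copied
	  from the Recommendation), an XHTML 1.0 Strict document
	  might look like the following. The title and paragraph text
	  are invented for the example:</para>
	<programlisting><![CDATA[
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <!-- illustrative title only -->
    <title>Hello</title>
  </head>
  <body>
    <!-- every element is closed, including empty ones -->
    <p>Hello, world!<br />
      This is a second line.</p>
  </body>
</html>
	  ]]></programlisting>
	<para>The visible differences from HTML 4 are those imposed by
	  XML syntax: element and attribute names are in lowercase,
	  attribute values are always quoted, and every element is
	  explicitly closed (empty elements such as
	  <sgmltag>br</sgmltag> use the
	  <literal><![CDATA[<br />]]></literal> form).</para>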
      </answer>
    </qandaentry>
  </qandadiv>
  <qandadiv remap="FAQ-USER, User" id="users">
    <title>Existing users (including everyone who uses a
      browser)</title>
    <qandaentry remap="FAQ-USEXML, usexml" id="usexml">
      <question>
	<formalpara>
	  <title>What do I have to do to use XML?</title>
	  <para>To read it: an XML browser (eg Firefox or IE). To
	    create: an XML editor (Emacs, Spy, etc).</para>
	</formalpara>
      </question>
      <answer>
	<para>For the average user of the Web, nothing except use a
	  browser which works with XML (see the <link
	    linkend="browsers" xreflabel="simple">question about
	    browsers</link>). Remember some XML components are still
	  being invented or implemented (see the <ulink
	    url="http://www.w3.org/">W3C</ulink> web site), so some
	  features are still either undefined or have yet to be
	  written.</para>
	<para>You can use XML-conformant browsers to look at some of
	  the stable XML material, such as <ulink
	    url="ftp://sunsite.unc.edu/pub/sun-info/standards/xml/eg/">Jon 
	    Bosak's Shakespeare plays</ulink> and the molecular
	  experiments of the <ulink
	    url="http://www.xml-cml.org">Chemical Markup Language
	    (CML)</ulink>. There are some more example sources listed
	  at <ulink
	    url="http://xml.coverpages.org/xml.html#examples"></ulink>, 
	  and you will find XML (particularly in the guise of <link
	    linkend="xhtml" xreflabel="simple">XHTML</link>) being
	  introduced in places where it won't break older
	  browsers.</para>
	<para>If you want to start preparations for creating your own
	  XML files, see the questions in the <link xreflabel="simple"
	    linkend="authors">Authors' Section</link> and the <link
	    xreflabel="simple" linkend="developers">Developers'
	    Section</link>.</para>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-XMLDOC, xmldoc" id="internals" >
      <question>
	<formalpara>
	  <title>What does an XML document actually look like (inside)?</title>
	  <para>Pointy brackets like HTML</para>
	</formalpara>
      </question>
      <answer remap="top level internalsubset">
	<para>The basic structure of XML is similar to other
	  applications of SGML, including HTML. The basic components
	  can be seen in the following examples. An XML document
	  starts with an optional <firstterm>Prolog</firstterm>, which
	  can have two (optional) parts:</para>
	<orderedlist>
	  <listitem>
	    <para>The <firstterm>XML Declaration</firstterm></para>
	    <programlisting><![CDATA[
<?xml version="1.0" encoding="utf-8"?>
	 ]]></programlisting> 
	    <para>which specifies that this is an XML document and
	    that it uses the UTF-8 character encoding (the default);</para>
	  </listitem>
	  <listitem>
	    <para>A Document Type Declaration</para>
	    <programlisting><![CDATA[
<!DOCTYPE report SYSTEM "http://sales.acme.corp/dtds/salesrep.dtd">
	    ]]></programlisting>
	    <para>which identifies the type of document (here,
	      <wordasword>report</wordasword>) and says where the
	      Document Type <emphasis>Definition</emphasis> (DTD) is
	      stored;</para>
	  </listitem>
	</orderedlist>
	<para>The Prolog is followed by the <firstterm>Document
	    Instance</firstterm>:</para>
	<orderedlist>
	  <listitem>
	    <para>A <firstterm>root element</firstterm>, which is the
	      outermost (top level) element (start-tag plus end-tag)
	      which encloses everything else: in the examples below
	      the root elements are <sgmltag>conversation</sgmltag>
	      and <sgmltag>titlepage</sgmltag>;</para>
	  </listitem>
	  <listitem>
	    <para>A structured mix of descriptive or prescriptive
	      <firstterm>elements</firstterm> enclosing the
	      <firstterm>character data content</firstterm> (text),
	      and optionally any <firstterm>attributes</firstterm>
	      (<quote>name="value"</quote> pairs) inside some
	      start-tags.</para>
	  </listitem>
	</orderedlist>
	<para>XML documents can be very simple, with straightforward
	  nested markup of your own design:</para>
	<programlisting><![CDATA[
<?xml version="1.0" standalone="yes"?>
<conversation>
  <greeting>Hello, world!</greeting>
  <response>Stop the planet, I want to get 
   off!</response>
</conversation>
	  ]]></programlisting>
	<para>Or they can be more complicated, with a <link
	    linkend="schemas" xreflabel="simple">Schema</link> or
	  <link linkend="dtds" xreflabel="simple">DTD</link>, and
	  maybe an <firstterm>internal subset</firstterm> (local DTD
	  changes in [square brackets]); and an arbitrarily complex
	  nested structure:</para>
	<programlisting><![CDATA[
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE titlepage 
  SYSTEM "http://www.foo.bar/dtds/typo.dtd" 
[<!ENTITY % active.links "INCLUDE">]>
<titlepage id="BG12273624">
  <white-space type="vertical" amount="36"/>
  <title font="Baskerville" alignment="centered" 
   size="24/30">Hello, world!</title>
  <white-space type="vertical" amount="12"/>
  <!-- In some copies the following 
       decoration is hand-colored, presumably 
       by the author -->
  <image location="http://www.foo.bar/fleuron.eps" 
   type="URI" alignment="centered"/>
  <white-space type="vertical" amount="24"/>
  <author font="Baskerville" size="18/22" 
   style="italic">Vitam capias</author>
  <white-space type="vertical" role="filler"/>
</titlepage>
	  ]]></programlisting>
	<para>Or they can be anywhere between: a lot will depend on
	  how you want to define your document type (or whose you use)
	  and what it will be used for. Database-generated or
	  program-generated XML documents used in e-commerce are
	  usually unformatted because they are for machine
	  consumption, not for human reading, and they may use very
	  long names or values, with multiple redundancy and sometimes
	  no character data content at all, just values in
	  attributes:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<ORDER-UPDATE AUTHMD5="4baf7d7cff5faa3ce67acf66ccda8248"
 ORDER-UPDATE-ISSUE="193E22C2-EAF3-11D9-9736-CAFC705A30B3"
 ORDER-UPDATE-DATE="2005-07-01T15:34:22.46"
 ORDER-UPDATE-DESTINATION="6B197E02-EAF3-11D9-85D5-997710D9978F"
 ORDER-UPDATE-ORDERNO="8316ADEA-EAF3-11D9-9955-D289ECBC99F3">
  <ORDER-UPDATE-DELTA-MODIFICATION-DETAIL ORDER-UPDATE-ID="BAC352437484">
    <ORDER-UPDATE-DELTA-MODIFICATION-VALUE ORDER-UPDATE-ITEM="56"
     ORDER-UPDATE-QUANTITY="2000"/>
  </ORDER-UPDATE-DELTA-MODIFICATION-DETAIL>
</ORDER-UPDATE> 
	  ]]></programlisting>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-XMLOFFER, xmloffer" id="xmlhtml">
      <question>
	<formalpara>
	  <title>Should I use XML instead of HTML?</title>
	  <para>Yes if you need robustness, accuracy, and
	    persistence.</para>
	</formalpara>
      </question>
      <answer remap="cascading style sheets">
	<para>Yes, if you need robustness, accuracy, and persistence.
	  XML allows authors and providers to <link
	    linkend="owndoctype" xreflabel="simple">design their own
	    document markup</link> instead of being limited by HTML.
	  Document types can be explicitly tailored to an application,
	  so the cumbersome fudging and poodlefaking that has to take
	  place with <link linkend="whatishtml"
	    xreflabel="simple">HTML</link> becomes a thing of the
	  past: your markup can always say what it means. Trivial
	  example:</para>
	  <programlisting><![CDATA[
<date YYYY-MM-DD="2005-12-26">next Monday</date>
          ]]></programlisting>
	<itemizedlist>
	  <listitem>
	    <para>Information content can be richer and easier to use,
	      because the descriptive and <link xreflabel="simple"
		linkend="links">hypertext linking abilities of
		XML</link> are much greater than those available in
	      HTML.</para>
	  </listitem>
	  <listitem>
	    <para>XML can provide more and better facilities for
	      browser presentation and performance, using XSLT and CSS
	      stylesheets.</para>
	  </listitem>
	  <listitem>
	    <para>It removes many of the underlying complexities of
	      SGML-format HTML (which led to them being ignored and
	      broken) in favor of a more flexible model, so writing
	      programs to handle XML is much easier than doing the
	      same for all the old broken HTML.</para>
	  </listitem>
	  <listitem>
	    <para>Information becomes more accessible and reusable,
	      because the more flexible markup of XML can be used by
	      any XML software instead of being restricted to specific
	      manufacturers as has become the case with HTML.</para>
	  </listitem>
	  <listitem>
	    <para>XML files can be used outside the Web as well, in
	      existing document-handling environments (eg
	      publishing).</para>
	  </listitem>
	</itemizedlist>
	<para>If your information is transient,
	  or completely static <emphasis>and</emphasis> unreferenced,
	  or very short and simple, and unlikely to need updating,
	  HTML may be all you need.</para>
      </answer>
    </qandaentry>
    <qandaentry id="readxml">
      <question>
	<formalpara>
	  <title>Someone sent me an XML file. How do I read it?</title>
	  <para>Open it in an XML browser or XML editor.</para>
	</formalpara>
      </question>
      <answer remap="reading opening">
	<para>If the file is well-formed or valid XML, you can just
	  open it with any XML-conformant browser (see <link
	    linkend="browsers"></link>). This will display the file in
	  an unformatted view, showing all the markup in a format that
	  lets you fold up or unfold the nested hierarchy (click on
	  the little plus and minus symbols), which will at least let
	  you read something.</para>
	<para>If the file contains a link to an XSLT or CSS stylesheet
	  (and the stylesheet was provided or is web-accessible) then
	  the browser should format the file in a readable manner (but
	  beware that in-browser formatting is not robust).</para>
	<para>If you want to edit the file, you need an XML editor
	  (see <link linkend="editors"></link>). Unless you are very
	  skilled with pointy-bracket markup, do
	  <emphasis>not</emphasis> try to edit XML files with non-XML
	  editors.</para>
      </answer>
    </qandaentry>
    <qandaentry id="style" remap="FAQ-STYLE, style">
      <question>
	<formalpara>
	  <title>How do I control formatting and appearance?</title>
	  <para>Use a CSS or XSLT stylesheet.</para>
	</formalpara>
      </question>
      <answer remap="calling assigning stylesheets document format
	language styling converting putting transform cascading style
	sheets layout ie internet explorer ie6">
	<para>In HTML, default styling was built into the browsers
	  because the tagset of HTML was predefined and hardwired into
	  browsers. In XML, where you can define your own tagset,
	  browsers cannot possibly be expected to guess or know in
	  advance what names you are going to use and what they will
	  mean, so you need a stylesheet if you want to display
	  formatted text.</para>
	<para><link xreflabel="simple" linkend="browsers">Browsers
	    which read XML</link> will accept and use a CSS stylesheet
	  at a minimum, but you can also use the more powerful XSLT
	  stylesheet language to transform your XML into
	  HTML&mdash;which browsers, of course, already know how to
	  display (and that HTML can still use a CSS stylesheet). This
	  way you get all the document management benefits of using
	  XML, but you don't have to worry about your readers needing
	  XML smarts in their browsers.</para>
	<tip xreflabel="Mike Brown">
	  <para>XSLT is an XML document processing language that uses
	    source code that happens to be written in XML. An XSLT
	    document declares a set of rules for an XSLT processor to
	    use when interpreting the contents of an XML document.
	    These rules tell the XSLT processor how to generate a new
	    XML-like data structure and how that data should be
	    emitted&mdash;as an XML document, as an HTML document, as
	    plain text, or perhaps in some other format.</para>
	  <para>This transformation can be done either inside the
	    browser, or by the server before the file is sent.
	    Transformation in the browser offloads the processing from
	    the server, but may introduce browser dependencies,
	    leading to some of your readers being excluded.
	    Transformation in the server makes the process
	    browser-independent, but places a heavier processing load
	    on the server.</para>
	</tip>
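	<para>To give a flavour of what such rules look like, here is
	  a minimal sketch of an XSLT 1.0 stylesheet which would turn
	  the <sgmltag>conversation</sgmltag> example from the
	  question on <link linkend="internals"
	    xreflabel="simple">what an XML document looks
	    like</link> into HTML. The choice of output markup (a page
	  title of <quote>Conversation</quote>, one
	  <sgmltag>p</sgmltag> per utterance) is an assumption made
	  purely for illustration:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- the root element becomes the HTML page
       (the title text is invented for this example) -->
  <xsl:template match="/conversation">
    <html>
      <head>
        <title>Conversation</title>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- each greeting or response becomes a paragraph -->
  <xsl:template match="greeting|response">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

</xsl:stylesheet>
	  ]]></programlisting>
	<para>Each <sgmltag>xsl:template</sgmltag> is one rule: the
	  <sgmltag class="attribute">match</sgmltag> attribute says
	  which elements it applies to, and the content says what to
	  output in their place. Real stylesheets are usually much
	  longer, but follow the same pattern.</para>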
	<para>As with any system where files can be viewed at random
	  by arbitrary users, the author cannot know what resources
	  (such as fonts) are on the user's system, so the same care
	  is needed with fonts as in HTML. To invoke a stylesheet
	  from an XML file for standalone processing in the browser,
	  include one of the stylesheet declarations:</para>
	<programlisting><![CDATA[ 
<?xml-stylesheet href="foo.xsl" type="text/xsl"?> 
<?xml-stylesheet href="foo.css" type="text/css"?> 
	  ]]></programlisting>
	<para>(substituting the URI of your stylesheet, of
	  course). See <ulink
	    url="http://www.w3.org/TR/xml-stylesheet/"></ulink> for
	  the full details.
	  The <ulink url="http://www.w3.org/Style/css">Cascading
	    Style Sheets (CSS) specification</ulink> provides a simple
	  syntax for assigning styles to elements, and has been
	  implemented in most browsers.</para>
	<para id="faq:XSL"><personname>
	    <firstname>Dave</firstname>
	    <surname>Pawson</surname>
	  </personname> maintains a comprehensive XSL FAQ at <ulink
	    id="FAQ:xsl"
	    url="http://www.dpawson.co.uk/xsl/"></ulink>,
	  and his book <biblioref linkend="fox"/> [the Fox book] is
	  available from O'Reilly. XSL uses XML syntax (an XSL
	  stylesheet is just an XML file) and has widespread support
	  from several major  browser vendors (see the questions on
	  <link linkend="browsers" xreflabel="simple">browsers</link>
	  and <link linkend="software" xreflabel="simple">other
	    software</link>). XSL comes in two flavours:</para>
	<itemizedlist>
	  <listitem>
	    <para>XSL itself, which is a pure formatting language,
	      outputting a Formatting Objects (FO) file, which needs a
	      text formatter like <ulink
		url="http://xml.apache.org/">FOP</ulink>, <ulink
		url="http://www.renderx.com/">XEP</ulink>, or others
	      to create printable (PDF) output (but see <link
		linkend="TeX" xreflabel="directional"></link>).
	      Currently I am not aware of any Web browsers which
	      support direct XSL rendering to PDF;</para>
	  </listitem>
	  <listitem>
	    <para>XSLT (T for Transformation), which is a language to
	      specify transformations of XML into HTML either inside
	      the browser or at the server before transmission. It can
	      also specify transformations from one vocabulary of XML
	      to another, and from XML to plaintext (which can be any
	      format, including RTF and <LaTeX/>).</para>
	  </listitem>
	</itemizedlist>
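	<para>For comparison, an FO file is itself just XML. A minimal
	  sketch might look like the following, assuming an A4 page
	  and a page-master name (<literal>page</literal>) chosen
	  purely for illustration; in practice an FO file is normally
	  generated by an XSLT stylesheet rather than written by
	  hand:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- page geometry: the name "page" is arbitrary -->
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
      page-height="297mm" page-width="210mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- content flowed into the body region of that page -->
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-family="serif" font-size="24pt"
        text-align="center">Hello, world!</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
	  ]]></programlisting>
	<para>A file like this is then passed to an FO processor to
	  produce the PDF; how you invoke it depends on the processor
	  you use.</para>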
	<para>Currently only Microsoft Internet Explorer 5.5 and
	  above, and <ulink
	    url="http://www.mozilla.org/">Firefox</ulink> 0.9.6 and
	  above handle XSLT inside the browser (MSIE5.5 needs some
	  <ulink
	    url="http://www.netcrucible.com/xslt/msxml-faq.htm">post-installation 
	    surgery</ulink> to remove the obsolete WD-xsl and replace
	  it with the current XSL-Transform processor; MSIE6 and
	  Firefox work as installed).</para>
	<tip>
	  <title>WYSIWYG for XSL</title>
	  <para>There have been attempts to produce pseudo-WYSIWYG
	    editors for creating XSL[T] stylesheets, but they have
	    mostly been restricted to simple mapping between input
	    elements and output elements (eg a DocBook
	    <sgmltag>para</sgmltag> to an HTML <sgmltag>p</sgmltag>).
	    Anything beyond this seems likely to fail because of the
	    infinite complexity of what people want to do with their
	    information. If you have access to the ACM database, see
	    the <ulink
	      url="http://portal.acm.org/citation.cfm?id=502189">paper
	      by Pietriga, Vion-Dury, and Quint on VXT</ulink>, from
	    the ACM DocEng'01 (Atlanta) Proceedings.</para>
	</tip>
	<tip>
	  <title>Generating HTML on the server</title>
	  <para>There is a growing use of server-side processors like
	    <ulink url="http://cocoon.apache.org/">Cocoon</ulink>,
	    <ulink url="http://axkit.org/">AxKit</ulink>, <ulink
	      url="http://www.propylon.com/products/propelx/">PropelX</ulink>, 
	    and others, which let you create, store, and manage your
	    information in XML but serve it auto-converted to HTML or
	    some other format, thus allowing the output to be used by
	    any browser. XSLT is also widely used to transform XML
	    into non-SGML formats for input to other systems (for
	    example to transform XML into <LaTeX/> for
	    typesetting).</para>
	</tip>
	<tip id="TeX">
	  <title>Alternatives to XSL:FO</title>
	  <para>Instead of generating PDF via
	    an FO processor, it is possible to use XSLT to transform
	    XML to <LaTeX/> for typesetting PDF (as is done for the
	    print versions of this FAQ, from DocBook to
	    <LaTeX/>). This has the advantage of being able to make
	    use of <LaTeX/>'s extensive library of prewritten
	    formatting modules (<quote>packages</quote>), which avoids
	    much of the wheel-reinventing currently required with
	    XSL:FO.</para>
	  <para>Alternatively, <personname>
	      <firstname>David</firstname>
	      <surname>Carlisle</surname>
	    </personname>'s <productname>xmltex</productname> reads
	    XML directly, offering another practical if experimental
	    solution to typesetting XML. One use of a <TeX/> system
	    that can typeset XML files is as a backend processor for
	    XSL:FO, serialized as XML. <personname>
	      <firstname>Sebastian</firstname>
	      <surname>Rahtz</surname>
	    </personname>'s Passive<TeX/> uses
	    <productname>xmltex</productname> to achieve this
	    end.</para>
	  <para id="faq:TeX">The <TeX/> FAQ is at <ulink id="FAQ:tex"
	      url="http://www.tex.ac.uk/faq"></ulink>.</para>
	</tip>
	<para>SGML systems used a similar stylesheet mechanism: some
	  of the common ones were the FOSI (Formatted Output
	  Specification Instance), which was standard in defence and
	  industrial engineering applications, especially when using
	  the Arbortext editor (Adept, now Epic); the DynaText/DynaWeb
	  stylesheet used in SGML publishing to the web; and the Synex
	  stylesheet used in browsers based on the Synex engine (eg
	  Panorama, whose styling interface was partly adopted in
	  XMetaL), the expertise of whose designers persists in the
	  DocZilla browser.</para>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-BROWSER, browser" id="browsers">
      <question>
	<formalpara>
	  <title>Where can I get an XML browser?</title>
	  <para>MSIE 7; Firefox 3</para>
	</formalpara>
      </question>
      <answer remap="brousers web browsers compatible difference
	  cascading style sheets ie">
	<para>Current state of existing browser support for XML (1 January
	  2011):</para>
	<itemizedlist>
	  <listitem>
	    <para>Current versions of Microsoft Internet Explorer,
	      Firefox, Safari, Chrome, and Opera all appear to support
	      XML with CSS and/or XSLT stylesheets. The editor would
	      welcome additional information and corrections.</para>
	  </listitem>
	  <listitem>
	    <para>Don't use Netscape (any version), Internet Explorer
	      6 or earlier, or any early versions of Mozilla if you
	      want XML support: they either don't have it or were
	      hopelessly broken. Upgrade to the current version of
	      <ulink url="http://www.mozilla.org/">Firefox</ulink> as
	      soon as possible.</para>
	  </listitem>
	  <listitem>
	    <para>The remainder of this list is of historical interest
	  only.</para>
	  </listitem>
	  <listitem>
	    <para id="faq:MSXML">Microsoft Internet Explorer
	      5.0 and 5.5 handled XML, processing it by default using a
	      built-in stylesheet written in a Microsoft-specific,
	      obsolete predecessor of XSLT called XSL (not to be
	      confused with the real XSLT). The output of the
	      stylesheet was DHTML, which, when rendered in the
	      browser, showed a coloured, syntax-highlighted version of
	      the XML document, with collapsible views. If the XML
	      document referenced a stylesheet, that stylesheet was
	      used instead, within the limitations of MSIE's
	      incomplete implementation of CSS. MSIE 5.0 and 5.5 could
	      also use stylesheets in another obsolete syntax called
	      WD-xsl, which should be avoided. These versions could be
	      upgraded to support real XSLT: see the <ulink id="FAQ:msxml"
		url="http://www.netcrucible.com/xslt/msxml-faq.htm">MSXML 
		FAQ</ulink>.</para>
	    <para>MSIE 6.0 and later use real XSLT 1.0, but
	      can use both the obsolete syntaxes as well.</para>
	  </listitem>
	  <listitem>
	    <para>Mozilla <ulink
		url="http://www.mozilla.org/">Firefox</ulink> 0.9 up,
	      Netscape&nbsp;6 and 7 (there is no Netscape&nbsp;5), and Galeon
	      all have full XML support with XSLT and CSS. In
	      general, Firefox is more robust than MSIE, and provides
	      better standards adherence.</para>
	    <para>I have a user report that Netscape 4.6 and 4.8 support XML,
		but no independent verification.</para>
	  </listitem>
	  <listitem>
	    <para>The authors of the former
	      MultiDoc Pro SGML browser, <ulink
		url="http://www.citec.fi/">CITEC</ulink> (whose engine
	      was also used in Panorama and other browsers),
	      joined forces with Mozilla to produce a multi-everything
	      browser called DocZilla, which reads HTML, XML, and
	      SGML, with XSLT and CSS stylesheets. This runs under
	      Windows and Linux and is currently at release 1.0. See
	      <ulink url="http://www.doczilla.com"></ulink> for
	      details. This is by far the most ambitious browser
	      project, and is backed by very solid markup-handling
	      expertise.</para>
	  </listitem>
	  <listitem>
	    <para>Contrary to earlier reports, <ulink
		url="http://www.opera.com/opera5/specs.html">Opera</ulink> 
	      does <emphasis>not</emphasis> appear to support XML. The
	      browser size is tiny by comparison with the others, but HTML/CSS
	      features are good and the speed is excellent, although
	      the earlier slavish insistence on mimicking everything
	      old (pre-Mozilla) Netscape did, especially the bugs,
	      still shows through in places.</para>
	  </listitem>
	</itemizedlist>
	<para>I have less information on the XML capabilities of
	  Safari, the new Mac OS X browser, which is based on the KHTML
	  engine used in Konqueror. Konqueror itself does not appear
	  to support XML or XSLT (at least in KDE under Fedora Core 4,
	  for example). Safari 1.3.2 (v312.6) under Mac OS X 10.3 does
	  provide partial support for XML, but does not honour an
	  external DTD modified by an internal subset (thanks to
	  <personname>
	    <firstname>John</firstname>
	    <surname>Haynie</surname>
	  </personname> for testing this).</para>
	<tip xreflabel="Mike Brown">
	  <para>The concept of <quote>browsing</quote> is primarily
	    the result of HTML having the semantics that it does. In
	    an HTML document there are sections of text called anchors
	    that are <quote>hyperlinked</quote> to other documents
	    that might be at remote locations on a network or
	    filesystem. HTML documents provide cues to a web browser
	    regarding how the document should be displayed and what
	    kind of behaviors are expected of the browser when the
	    user interacts with it. The HTML specification provides
	    many suggestions and requirements for the browser, and
	    provides specific meanings for many different examples of
	    markup, such as the fact that an
	    <sgmltag>img</sgmltag> element
	    refers to an image that should be retrieved by the browser
	    and rendered inline with the adjacent text.</para>
	  <para>Unlike HTML, XML does not have such inherent semantics
	    at all. There is no prescribed method for rendering XML
	    documents. Therefore, what it means to
	    <quote>browse</quote> XML is open to interpretation. For
	    example, an XML document describing the characteristics of
	    a machine part does not carry any information about how
	    that information should be presented to a user. An
	    application is free to use the data to produce an image of
	    the part, generate a formatted text listing of the
	    information, display the XML document's markup with a
	    pretty color scheme, or restructure the data into a format
	    for storage in a database, transmission over a network, or
	    input to another program.</para>
	  <para>However, despite the fact that XML documents are
	    purely descriptive data files, it is possible to
	    <quote>browse</quote> them in a sense, by rendering them
	    with stylesheets. A stylesheet is a separate document that
	    provides hints and algorithms for rendering or
	    transforming the data in the XML document. HTML users may
	    be familiar with Cascading Style Sheets (CSS). The CSS
	    stylesheet language is general and powerful enough to be
	    applied to XML documents, although it is oriented toward
	    visual rendering of the document and does not allow for
	    complex processing of the document's data. By associating
	    an XML document with a CSS stylesheet, it may be possible
	    to load an XML document in a CSS-aware web browser, and
	    the browser may be able to provide some kind of rendering
	    of it, even if the browser does not otherwise know how to
	    read and process XML documents. However, not all web
	    browsers will load an XML document correctly, and they are
	    not required to recognize the XML markup that associates
	    the document with a stylesheet, so one cannot assume that
	    XML documents can be opened with just any web
	    browser.</para>
	  <para>A more complex and powerful <link
	      linkend="style">stylesheet language</link> is XSLT, the
	    Transformations part of the Extensible Stylesheet
	    Language, which can be used to transform XML to other
	    formats, including HTML, other forms of XML, and plain
	    text.  If the output of this transformation is HTML, it
	    can be viewed in a web browser as any other HTML document
	    would.</para>
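	  <para>The following is only a sketch of what such a
	    transformation can look like. It assumes an input document
	    whose root element <sgmltag>part</sgmltag> contains
	    <sgmltag>name</sgmltag> and <sgmltag>price</sgmltag>
	    children (names made up for this example) and produces a
	    small HTML page:</para>
	  <programlisting><![CDATA[<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <!-- Match the (hypothetical) root element and build an HTML page -->
  <xsl:template match="/part">
    <html>
      <head><title><xsl:value-of select="name"/></title></head>
      <body>
        <h1><xsl:value-of select="name"/></h1>
        <p>Price: <xsl:value-of select="price"/></p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>]]></programlisting>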
	  <para>The degree of support for XML and stylesheets in web
	    browsers varies greatly. Although loading and rendering
	    XML in the browser is possible in some cases, it is not
	    universally supported. Therefore, much XML content on the
	    web is translated to HTML on the servers. It is this
	    generated HTML that is delivered to the browsers. Most of
	    <ulink url="http://www.microsoft.com">Microsoft</ulink>'s
	    web site, for example, exists as XML that is converted to
	    HTML on the fly. The web browser never knows the
	    difference.</para>
	</tip>
	<para>See also the notes on <link linkend="software"
	    xreflabel="simple">software for authors</link> and <link
	    linkend="developers" xreflabel="simple">XML for
	    developers</link>, and the more detailed list on the XML
	  pages in the SGML Web site at  <ulink
	    url="http://xml.coverpages.org/"></ulink>.</para>
      </answer>
    </qandaentry>
    <qandaentry id="execute" remap="exec">
      <question>
	<formalpara>
	  <title>How do I execute or run an XML file?</title>
	  <para>Not a meaningful question. XML is a data
	    format, not a programming language.</para>
	</formalpara>
      </question>
      <answer remap="executable execution run running">
	<para>You can't and you don't. XML itself is not a programming
	  language, so normal XML documents don't <quote>run</quote>
	  or <quote>execute</quote>. XML is a markup specification
	  language and XML files are just data: they sit there until
	  you run a program which displays them (like a browser) or
	  does some work with them (like a converter which writes the
	  data in another format, or a database which reads the data),
	  or modifies them (like an editor).</para>
	<para>If you want to view or display an XML file, open it with
	  an <link linkend="editors" xreflabel="simple">XML
	    editor</link> or an <link xreflabel="simple"
	    linkend="browsers">XML browser</link>.</para>
	<para>The water is muddied by XSL (both XSLT and XSL-FO), which
	  uses XML syntax to implement a declarative programming
	  language. In these special cases you
	  <emphasis>can</emphasis>&nbsp;<quote>execute</quote> an XML
	  file, by running a processing application like Saxon, which
	  compiles the directives specified in XSLT files into Java
	  bytecode to process XML.</para>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-SWITCH, switch" id="switchover">
      <question>
	<formalpara>
	  <title>Do I have to switch from SGML or HTML to XML?</title>
	  <para>Not if you don't want to</para>
	</formalpara>
      </question>
      <answer>
	<para>No, existing SGML and HTML application software will
	  continue to work with existing files. But as with any
	  enhanced facility, if you want to view or download and use
	  XML files, you will need to use XML-aware software. There is
	  much more being developed for XML than there ever was for
	  SGML, so a lot of users are moving.</para>
	<para>However, for some static SGML applications (eg large
	  document archives) with well-established and stable
	  software, a good case can be made for <quote>not fixing it
	    if it ain't bust</quote>, and deferring a move to XML
	  until an appropriate time comes for a revision of the
	  service or features.</para>
      </answer>
    </qandaentry>
    <qandaentry id="officeapps" revisionflag="changed">
      <question>
	<formalpara>
	  <title>Can I use XML for ordinary office
	    applications?</title>
	  <para>Yes, use Star Office, Open Office, WordPerfect, or
	    even MS-Office (11/XP only).</para>
	</formalpara>
      </question>
      <answer>
	<para>Yes, most office productivity suites already do this,
	  and there are more on the way:</para>
	<itemizedlist>
	  <listitem id="odf">
	    <para><ulink
		url="http://www.openoffice.org/">OpenOffice</ulink>
	      has been saving its files as XML by default for
	      several years (<filename>.odt</filename>,
	      <filename>.ods</filename>, and <filename>.odp</filename>
	      file types).  The package comprises a wordprocessor,
	      spreadsheet, presentation software, and a vector drawing
	      package, and they share related Schemas. The OpenDocument
	      Format (ODF) is now an official International
	      Standard (ISO/IEC 26300) for office documents.</para>
	  </listitem>
	  <listitem>
	    <para>Corel's <ulink
		url="http://www.corel.com/servlet/Satellite?pagename=Corel2/Products/Home&ampers;pid=1047022958453">WordPerfect</ulink> 
	      suite has shipped with a fully-fledged XML editor for
	      many years (which also handles full SGML). It
	      can save the formatted output as a Microsoft Word
	      <filename>.doc</filename> file, but it uses its own
	      stylesheet technology to format documents, not XSLT or
	      CSS. It can also save its own (WordPerfect) document
		format to an XML representation.</para>
	  </listitem>
	  <listitem>
	    <para>The <ulink
		url="http://www.abisource.com/">AbiWord</ulink>
	      wordprocessor (all platforms) can open Word and
	      OpenOffice documents and save them in DocBook XML
	      format, although it does not provide native XML
	      editing.</para>
	  </listitem>
	  <listitem>
	    <para>Microsoft 
		Office 2003 provided a
	      <quote>Save As&hellip;XML</quote> option in all parts of the
	      suite except PowerPoint, using WordML to represent the
		visual appearance of the document, although it will
		preserve style names if they are in use.</para>
	    <para>Office 2007 saves natively as XML document instances
	      (<filename>.docx</filename>, <filename>.xlsx</filename>,
	      and <filename>.pptx</filename> file types), using Office
	      Open XML (similar to WordML) which is Microsoft's
	      equivalent to <link linkend="odf">ODF</link>, which they
	      managed to have recognised as a parallel international
	      standard.</para>
	    <para>Word also contains a real XML editor, supporting
	      other W3C Schemas as well as its own (but not DTDs), and
	      this also provides a method for binding element types to
	      Word's named styles (like Microsoft's earlier product
	      <ulink
		url="http://xml.coverpages.org/micrfac1.html#msauth">SGML 
		Author for Word</ulink> did).</para>
	  </listitem>
	  <listitem>
	    <para>Avoid Microsoft's <quote>Works</quote> package, as
	      it is incompatible both with Office and with XML.</para>
	  </listitem>
	  <listitem>
	    <para>I have no information on Lotus office products.</para>
	  </listitem>
	</itemizedlist>
	<para>There is more detail under <quote><ulink
	    url="http://xml.coverpages.org/xmlFileFormats.html">XML
	    File Formats for Office Documents</ulink></quote> in the XML Cover
	  Pages which briefly describes and points to further
	  information on: GNOME Office, KOffice, Microsoft XDocs,
	  OASIS TC for Open Office XML File Format, 1DOK.org Project,
	  and OpenOffice.org XML File Format.</para>
      </answer>
    </qandaentry>
  </qandadiv>
  <qandadiv id="authors" remap="FAQ-AUTHOR, Author">
    <title>Authors (including writers of HTML and Web page
      owners)</title>
    <qandaentry id="foreknowledge" remap="prelearn">
      <question>
	<formalpara>
	  <title>Do I have to know HTML or SGML before I learn
	    XML?</title>
	  <para>No, but it's useful.</para>
	</formalpara>
      </question>
      <answer>
	<para>No, although it's useful because a lot of XML
	  terminology and practice derives from two decades'
	  experience of SGML.</para>
	<para>Be aware that <quote>knowing HTML</quote> is not the
	  same as <quote>understanding SGML</quote>. Although HTML was
	  written as an SGML application, browsers ignore most of it
	  (which is why so many useful things don't work), so just
	  because something is done a certain way in HTML browsers
	  does not mean it's correct, least of all in XML.</para>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-SPACE, space" id="whitespace">
      <question>
	<formalpara>
	  <title>How does XML handle white-space in my
	    documents?</title>
	  <para>Parsers keep it all. It's up to the application to
	    decide what to do with it.</para>
	</formalpara>
      </question>
      <answer remap="line break linefeed feed line-end line end
	    lineend eol white spaces blanks">
	<para>All white-space, including linebreaks (Mac CR, Win
	  CR/LF, Unix LF), TAB characters, and normal spaces,
	  <emphasis>even between <quote>structural</quote> elements
	    where no text can ever appear</emphasis>, is passed by the
	  parser <emphasis>unchanged</emphasis> to the application
	  (browser, formatter, viewer, converter, etc), identifying
	  the context in which the white-space was found (element
	  content, data content, or mixed content, if this information
	  is available to the parser, eg from a DTD or Schema). This
	  means <emphasis>it is the application's responsibility to
	    decide what to do with such space, not the
	    parser's</emphasis>:</para>
	<itemizedlist>
	  <listitem>
	    <para><emphasis>insignificant white-space</emphasis>
	      between structural elements (space which occurs where
	      only element content is allowed, ie between other
	      elements, where text data never occurs) will get passed
	      to the application (in SGML this white-space gets
	      suppressed, which is why you can put all that extra
	      space in HTML documents and not worry about it);</para>
	  </listitem>
	  <listitem>
	    <para><emphasis>significant white-space</emphasis> (space
	      which occurs within elements which
	      <emphasis>can</emphasis> contain text and markup mixed
	      together, usually mixed content or PCDATA) will still
	      get passed to the application exactly as under SGML. It
	      is the application's responsibility to handle it
	      correctly.</para>
	  </listitem>
	</itemizedlist>
	<para>The parser must inform the application that white-space
	  has occurred in element content, if it can detect it. (Users
	  of SGML will recognize that this information is not in the
	  <ulink
	    url="http://xml.coverpages.org/WG8-n931a.html">ESIS</ulink>, 
	  but it <emphasis>is</emphasis> in the <ulink
	    url="http://xml.coverpages.org/topics.html#groves">Grove</ulink>.)</para>
	<programlisting><![CDATA[ 
<chapter> 
  <title> 
   My title for
   Chapter 1. 
  </title> 
    <para> 
text 
    </para> 
</chapter>
	  ]]></programlisting>
	<para>In the example above, the application will receive all
	  the pretty-printing linebreaks, TABs, and spaces between the
	  elements <emphasis>as well as those</emphasis> embedded in
	  the chapter title. It is the function of the application,
	  not the parser, to decide which type of white-space to
	  discard and which to retain. Many XML applications have
	  configurable options to allow programmers or users to
	  control how such white-space is handled.</para>
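	<para>One standard way for a document to signal its intent is
	  the <sgmltag class="attribute">xml:space</sgmltag> attribute
	  described in the XML Specification (section 2.10). The
	  sketch below uses invented element type names; it is only a
	  request, and it remains up to the application whether to
	  honour it. In a valid document the attribute must also be
	  declared in the DTD.</para>
	<programlisting><![CDATA[<poem xml:space="preserve">
  <line>Roses are red,</line>
  <line>    Violets are blue.</line>
</poem>]]></programlisting>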
	<note>
	  <title>Why?</title>
	  <para>In SGML, a DTD is always compulsory. A parser
	    therefore always knows in advance whether white-space has
	    occurred in element content (and can therefore be
	    discarded) or in mixed content or PCDATA (where it must be
	    preserved). XML allows processing without a DTD or Schema,
	    so it may be impossible to tell whether space should be
	    discarded or not, so the general rule was imposed that
	    <emphasis>all</emphasis> white-space must be reported to
	    the application.</para>
	</note>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-CASE, case" id="case">
      <question>
	<formalpara>
	  <title>Which parts of an XML document are
	    case-sensitive?</title>
	  <para>All of it, both markup and text.</para>
	</formalpara>
      </question>
      <answer remap="case sensitive senstive sensative sensitivity">
	<para>All of it, both markup and text. This is significantly
	  different from HTML and most other SGML applications. It was
	  done to allow markup in non-Latin-alphabet languages, and to
	  obviate problems with case-folding in writing systems which are
	  caseless.</para>
	<itemizedlist>
	  <listitem>
	    <para>Element type names are case-sensitive: you must
	      follow whatever combination of upper- or lower-case you
	      use to define them (either by first usage or in a <link
		linkend="dtds" xreflabel="simple">DTD or
		Schema</link>). So you can't say <sgmltag
		class="starttag">BODY</sgmltag>&hellip;<sgmltag
		class="endtag">body</sgmltag>: upper- and lower-case
	      must match; thus <sgmltag
		class="emptytag">Img</sgmltag>, <sgmltag
		class="emptytag">IMG</sgmltag>, and <sgmltag
		class="emptytag">img</sgmltag> are three different
	      element types;</para>
	  </listitem>
	  <listitem>
	    <para>For well-formed XML documents with no DTD, the first
	      occurrence of an element type name defines the
	      casing;</para>
	  </listitem>
	  <listitem>
	    <para>Attribute names are also
	      case-sensitive, for example the two width attributes in
	    <programlisting><![CDATA[<PIC width="7in"/>]]></programlisting> and
	    <programlisting><![CDATA[<PIC WIDTH="6in"/>]]></programlisting> (if they occurred in
	      the same file) are separate attributes, because of
	      the different case of <sgmltag
		class="attribute">width</sgmltag> and <sgmltag
		class="attribute">WIDTH</sgmltag>;</para>
	  </listitem>
	  <listitem>
	    <para>Attribute values are also
	      case-sensitive. CDATA values (eg
	    <programlisting>Url="MyFile.SGML"</programlisting>) always
	      have been, but NAME types (ID and IDREF attributes, and
	      token list attributes) are now case-sensitive as
	      well;</para>
	  </listitem>
	  <listitem>
	    <para>All general and parameter entity names (eg <sgmltag
		class="genentity">Aacute</sgmltag>), and your data
	      content (text), are case-sensitive as always.</para>
	  </listitem>
	</itemizedlist>
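	<para>The fragments below illustrate these rules. They are
	  separate snippets with made-up element type names, not a
	  single document:</para>
	<programlisting><![CDATA[<!-- Not well-formed: the end-tag does not match the start-tag -->
<Body>Some text</body>

<!-- Well-formed: the case matches exactly -->
<body>Some text</body>

<!-- Well-formed, but width and WIDTH are two different attributes -->
<pic width="7in" WIDTH="6in"/>]]></programlisting>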
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-EXIST, exist" id="conversion">
      <question>
	<formalpara>
	  <title>How can I make my existing HTML files work in
	    XML?</title>
	  <para>Either make them XHTML or use a different document
	    type.</para>
	</formalpara>
      </question>
      <answer>
	<para>Either convert them to conform to some new document type
	  (with or without a DTD or Schema) and write a stylesheet to
	  go with them; or edit them to conform to <link xreflabel="simple"
	    linkend="xhtml">XHTML</link>.</para>
	<para>It is necessary to convert existing HTML files because
	  XML does not permit end-tag minimisation (missing <sgmltag
	    class="endtag">p</sgmltag>, etc), unquoted attribute
	  values, and a number of other SGML shortcuts which have been
	  normal in most HTML DTDs. However, many HTML authoring tools
	  already produce almost (but not quite) <link xreflabel="simple"
	    linkend="wf">well-formed XML</link>.</para>
	<para>You may be able to convert HTML
	  to XHTML using
	  <personname>
	    <firstname>Dave</firstname>
	    <surname>Raggett</surname>
	  </personname>'s <ulink
	    url="http://tidy.sourceforge.net/">HTML Tidy</ulink>
	  program, which can clean up some of the horrible formatting
	  mess left behind by inadequate HTML editors, and even
	  separate out some of the formatting to a stylesheet, but
	  there is usually still some hand-editing to do.</para>
	<para>Most modern website design
	  programs, including Dreamweaver, still don't produce
	  anything like clean HTML, largely because they are for
	  making pages look pretty, rather than getting the
	  information right. If you get the information right in XML
	  first, and export it to a page design produced using a
	  website design program, it's probably less important that
	  the HTML is a mess. Using a website design program and its
	  HTML pages as the sole repository of your information can be
	  a dangerous mistake, though.</para>
      </answer>
      <answer remap="normalize normalization normalise normalisation
      escape characters sequences"> 
	<tip>
	  <title>Converting valid HTML to XHTML</title>
	  <para>If your HTML files are valid (full formal validation
	    with an SGML parser, not just a simple syntax check), then
	    try validating them as XHTML with an XML parser. If you
	    have been creating clean HTML without embedded formatting
	    then this process should throw up only mismatches in
	    upper/lowercase element and attribute names, and empty
	    elements (plus perhaps the odd non-standard element type
	    name if you use them). Simple hand-editing or a short
	    script should be enough to fix these problems.</para>
	  <para>If your HTML validly uses end-tag omission, this can
	    be fixed automatically by a normalization program like
	    <productname>sgmlnorm</productname> (from
	    <productname>OpenSP</productname>, which is part of <ulink
	      url="http://sourceforge.net/projects/openjade/"><productname>OpenJade</productname></ulink>) 
	    or by the <function>sgml-normalize</function> function in
	    an editor like
	    <productname>Emacs</productname>/<productname>psgml</productname> 
	    (don't be put off by the names, they both do XML).</para>
	  <para>If you have a lot of valid HTML files, you could write
	    a script to do this in a programming language which
	    understands SGML markup, such as <ulink
	      url="http://www.omnimark.com"><productname>Omnimark</productname></ulink>, 
	    <ulink
	      url="http://sgml.dircon.co.uk/"><productname>SGMLC</productname></ulink>, 
	    or one of the popular scripting languages (eg
	    <productname>Perl</productname>,
	    <productname>Python</productname>,
	    <productname>Tcl</productname>, etc), using their SGML/XML
	    libraries; or you could even use editor macros if you know
	    what you're doing.</para>
	</tip>
	<tip>
	  <title>Converting to a new document type</title>
	  <para>If you want to move your files out of HTML into some
	    other DTD entirely, there are many native XML application
	    DTDs, and standard XML versions of popular DTDs like
	    <productname>TEI</productname> and
	    <productname>DocBook</productname> or
	    <productname>DITA</productname> to choose from. There
	    is a pilot site run by CommerceNet (<ulink
	      url="http://www.xmlx.com/"></ulink>) for the exchange of
	    XML DTDs.</para>
	  <para>Alternatively you could just make up your own markup:
	    so long as it makes sense and you create a well-formed
	    file, you should be able to write a CSS or XSLT stylesheet
	    and have your document displayed in a browser.</para>
	</tip>
	<tip>
	  <title>Converting invalid HTML to well-formed XHTML</title>
	  <para>If your files are invalid HTML (95&pct; of the Web)
	    they can be converted to well-formed DTDless files as
	    follows:</para>
	  <orderedlist>
	    <listitem>
	      <para>Replace the DOCTYPE Declaration with the XML
		Declaration <programlisting><![CDATA[<?xml version="1.0" encoding="iso-8859-1"?>]]></programlisting> (using the
		appropriate character encoding);</para> 
	    </listitem>
	    <listitem>
	      <para>If there was no DOCTYPE Declaration, just prepend
		the XML Declaration.</para>
	    </listitem>
	    <listitem>
	      <para>Change any EMPTY elements (eg every
		<sgmltag>BASE</sgmltag>, <sgmltag>ISINDEX</sgmltag>,
		<sgmltag>LINK</sgmltag>, <sgmltag>META</sgmltag>,
		<sgmltag>NEXTID</sgmltag> and <sgmltag>RANGE</sgmltag>
		in the header, and every <sgmltag>AREA</sgmltag>,
		<sgmltag>ATOPARA</sgmltag>,
		<sgmltag>AUDIOSCOPE</sgmltag>,
		<sgmltag>BASEFONT</sgmltag>, <sgmltag>BR</sgmltag>,
		<sgmltag>CHOOSE</sgmltag>, <sgmltag>COL</sgmltag>,
		<sgmltag>FRAME</sgmltag>, <sgmltag>HR</sgmltag>,
		<sgmltag>IMG</sgmltag>, <sgmltag>KEYGEN</sgmltag>,
		<sgmltag>LEFT</sgmltag>, <sgmltag>LIMITTEXT</sgmltag>,
		<sgmltag>OF</sgmltag>, <sgmltag>OVER</sgmltag>,
		<sgmltag>PARAM</sgmltag>, <sgmltag>RIGHT</sgmltag>,
		<sgmltag>SPACER</sgmltag>, <sgmltag>SPOT</sgmltag>,
		<sgmltag>TAB</sgmltag>, and <sgmltag>WBR</sgmltag> in
		the body of the document) so that they end with
	      <programlisting>/></programlisting> instead, for example
	      <programlisting><![CDATA[<img src="mypic.gif"
		  alt="Picture"/>]]></programlisting>;</para>
	    </listitem>
	    <listitem>
	      <para>Make all element names and attribute names
		lowercase;</para>
	    </listitem>
	    <listitem>
	      <para>Ensure there are correctly-matched explicit
		end-tags for all non-EMPTY elements; eg every <sgmltag
		  class="starttag">p</sgmltag> must have a <sgmltag
		  class="endtag">p</sgmltag>, etc;</para>
	    </listitem>
	    <listitem>
	      <para>Escape all <literal><![CDATA[<]]></literal> and
		<literal><![CDATA[&]]></literal> non-markup (ie
		literal text) characters as <sgmltag
		  class="genentity">lt</sgmltag> and <sgmltag
		  class="genentity">amp</sgmltag> respectively (there
		shouldn't be any isolated
		<literal><![CDATA[<]]></literal> characters to start
		with, anyway!);</para>
	    </listitem>
	    <listitem>
	      <para>Ensure all attribute values are in matched quotes
		(values with embedded single quotes must be in double
		quotes, and <foreignphrase>vice
		  versa</foreignphrase>&mdash;if you need both, use
		the <sgmltag class="genentity">quot</sgmltag>
		character entity reference);</para>
	    </listitem>
	    <listitem>
	      <para id="semicolon">Ensure all script URIs which have
		<literal><![CDATA[&]]></literal> as a field separator
		are changed to use <sgmltag
		  class="genentity">amp</sgmltag> or a semicolon
		instead.</para>
	    </listitem>
	  </orderedlist>
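	  <para>As a worked sketch of the steps above, take a small
	    HTML file that uses the usual SGML shortcuts (the content,
	    the URI and the ISO-8859-1 encoding are assumptions made
	    up for this example):</para>
	  <programlisting><![CDATA[<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<HTML>
<HEAD><TITLE>Q&A</TITLE>
<META NAME=author CONTENT="A N Other">
<BODY>
<P>First paragraph<BR>
<A HREF="/cgi-bin/find?a=1&b=2">Search</A>
<P>Second paragraph
</BODY>
</HTML>]]></programlisting>
	  <para>After applying the steps, the well-formed DTDless
	    equivalent could look like this:</para>
	  <programlisting><![CDATA[<?xml version="1.0" encoding="iso-8859-1"?>
<html>
<head><title>Q&amp;A</title>
<meta name="author" content="A N Other"/>
</head>
<body>
<p>First paragraph<br/>
<a href="/cgi-bin/find?a=1&amp;b=2">Search</a></p>
<p>Second paragraph</p>
</body>
</html>]]></programlisting>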
	</tip>
	  <para>Be aware that some obsolete HTML browsers may not
	  accept XML-style EMPTY elements with the trailing slash, so
	  the above changes may not be backwards-compatible. An
	  alternative is to add a dummy end-tag to all EMPTY elements,
	  so <programlisting><![CDATA[<img
	      src="foo.gif"/>]]></programlisting> becomes
	    <programlisting><![CDATA[<img
	      src="foo.gif"></img>]]></programlisting>. This is valid
	  XML but you must be able to guarantee no-one will ever put
	  any text content in such elements. Adding a space before the
	  closing slash in EMPTY elements (eg
	  <programlisting><![CDATA[<img
	      src="foo.gif" />]]></programlisting>) may also fool
	  older browsers into accepting XHTML as HTML.</para>
	<para>If you answer Yes to any of the questions in the <link
	    linkend="checklist"></link>, you can save
	    yourself a lot of grief by fixing those problems first
	    before doing anything else. You will likely then be
	    getting close to having well-formed files.</para> 
	  <para>Markup which is syntactically correct but semantically
	    meaningless or void should be edited out before
	    conversion. Examples are spacing devices such as repeated
	    empty paragraphs or linebreaks, empty tables, invisible
	    spacing GIFs etc. XML uses stylesheets, so you won't need
	    any of these.</para>
	  <para>Unfortunately there is rather a lot of work to do if
	    your files are invalid: this is why many Webmasters now
	    insist that only valid or well-formed files are used (and
	    why you should instruct your designers to do the same), in
	    order to avoid unnecessary manual maintenance and
	    conversion costs later.</para>
	<tip id="checklist">
	  <title>Checklist for invalid HTML</title>
	  <para>If your HTML files fall into this category (HTML
	    created by most WYSIWYG editors is usually invalid)
	    then they will almost certainly have to be converted
	    manually, although if the deformities are regular and
	    carefully constructed, the files may actually be almost
	    well-formed, and you could write a program or script to do
	    as described above. The oddities you may need to check for
	    include:</para>
	  <itemizedlist role="checklist">
	    <listitem>
	      <para>Do the files contain markup syntax errors? For
		example, are there any missing angle-brackets,
		backslashes instead of forward slashes on end-tags, or
		elements which nest incorrectly (eg
	      <programlisting><![CDATA[<B>starting <I>inside one
		element</B> but ending outside</I> it]]></programlisting>)?</para>
	    </listitem>
	    <listitem>
	      <para>Are there any URIs (eg in <sgmltag
		  class="attribute">href</sgmltag>s or <sgmltag
		  class="attribute">src</sgmltag>s) which use
		Microsoft Windows-style backslashes instead of normal
		forward slashes?</para>
	    </listitem>
	    <listitem>
	      <para>Do the files contain markup which conflicts with
		HTML DTDs, such as headings or lists inside
		paragraphs, list items outside list environments,
		header elements like <sgmltag>base</sgmltag> preceding
		the first <sgmltag>html</sgmltag>, etc? (another
		sloppy editor trick)</para>
	    </listitem>
	    <listitem>
	      <para>Do the files use imaginary elements which are not
		in any known HTML DTD? (large amounts of these are
		used in proprietary markup systems masquerading as
		HTML). Although this is easy to transform to a DTDless
		well-formed file (because you don't have to define
		elements in advance) most proprietary or
		browser-specific extensions have never been formally
		defined, so it is often impossible to work out
		meaningfully where the element types can be
		used.</para>
	    </listitem>
	    <listitem>
	      <para>Are there any invalid (non-XML) characters in your
		files? Look especially for native Apple Mac Roman-8
		characters left by careless designers; any of the
		illegal characters (the 32 characters at decimal codes
		128&ndash;159 inclusive) inserted by MS-Windows
		editors; and any of the ASCII control characters
		0&ndash;31 (except those permitted like TAB, CR, and
		LF). These need to be converted to the correct
		characters in ISO 8859-1 (a common default in
		browsers), or the relevant plane of Unicode (in which
		case you will probably need to use UTF-8 as your
		document encoding).</para>
	    </listitem>
	    <listitem>
	      <para>Do your files contain invalid (old
		Mosaic/Netscape-style) comments? Comments must look
		<programlisting><![CDATA[<!-- like this
		-->]]></programlisting> with double-dashes each end
		and no double (especially not multiple) dashes in
		between.</para>
	    </listitem>
	  </itemizedlist>
	</tip>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-SUBSET, subset" id="tools">
      <question>
	<formalpara>
	  <title>If XML is just a subset of SGML, can I use XML files
	    directly with existing SGML tools?</title>
	  <para>Yes, if they are up to date</para>
	</formalpara>
      </question>
      <answer>
	<para>Yes, provided you use up-to-date SGML software which
	  knows about the <ulink
	    url="http://www.ornl.gov/sgml/sc34/document/0029.htm">WebSGML 
	    Adaptations TC to ISO 8879</ulink> (the features needed to
	  support XML, such as the variant form for EMPTY elements;
	  some aspects of the SGML Declaration such as NAMECASE
	  GENERAL NO; multiple attribute token list declarations,
	  etc).</para>
	<para>An alternative is to use an SGML DTD to let you create a
	  fully-normalised SGML file, but one which does not use empty
	  elements; and then remove the DocType Declaration so it
	  becomes a well-formed DTDless XML file. Most SGML tools now
	  handle XML files well, and provide an option to switch between
	  the two standards (see the pointers in <link
	    linkend="software"></link>).</para>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-LEARN, learn" id="learning">
      <question>
	<formalpara>
	  <title>I'm used to authoring and serving HTML. Can I learn
	    XML easily?</title>
	  <para>Yes</para>
	</formalpara>
      </question>
      <answer>
	<para>Yes, very easily, but at the moment there is still a
	  need for more tutorials, simpler tools, and more examples of
	  XML documents. <link linkend="wf"
	    xreflabel="simple"><quote>Well-formed</quote> XML
	    documents</link> may look similar to HTML except for some
	  small but very important points of syntax. </para>
	<para>The big practical difference is that XML has to stick to
	  the rules. HTML browsers let you serve them even fatally
	  broken or ridiculously corrupt HTML because they don't do a
	  formal parse but elide all the broken bits instead. With XML
	  your files have to be completely correct or they simply
	  won't work at all. One outstanding problem is that some
	  browsers claiming XML conformance are also broken, and some
	  browsers' support for CSS styling is dubious at best.
	  Try yours on the <ulink
	    url="http://xml.silmaril.ie/test.xml">test file</ulink>
	  and the <ulink
	    url="http://xml.silmaril.ie/hotels.xml">list of real hotel
	  web sites</ulink>.</para>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-CHARENTS, charents" id="characters">
      <question>
	<formalpara>
	  <title>Can XML use non-Latin characters?</title>
	  <para>Yes, this is the default</para>
	</formalpara>
      </question>
      <answer remap="charset">
	<para id="faq:Unicode">Yes, the <link linkend="spec" xreflabel="simple">XML
	    Specification</link> explicitly says XML uses <ulink
	    url="http://www.iso.ch/">ISO 10646</ulink>, the
	  international standard character repertoire which covers
	  most known languages. <ulink
	    url="http://www.unicode.org/">Unicode</ulink> is an
	  identical repertoire, and the two standards track each
	  other. The spec says (2.2): <quote>All XML processors must
	    accept the UTF-8 and UTF-16 encodings of ISO
	    10646&hellip;</quote>. There is a Unicode FAQ at <ulink
	    id="FAQ:unicode"
	    url="http://www.unicode.org/faq/"></ulink> and an example
	  of the range of alphabets and symbols at <ulink
	    url="http://www.cogsci.ed.ac.uk/~richard/unicode-sample-3-2.html"></ulink>.</para>
	<warning>
	  <para>While XML software may allow you to enter any Unicode
	    character into a document, you can only see the characters
	    if your computer has a suitable font! Not all typefaces
	    and font files have the entire Unicode repertoire (ones
	    that do are <emphasis>huge</emphasis>).</para>
	</warning>
	<para>UTF-8 is an encoding of Unicode into 8-bit units:
	  the first 128 are the same as ASCII, and <ulink
	    url="http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt">higher-order 
	    characters are used to encode anything else from Unicode
	    into sequences of between 2 and 4 bytes</ulink>. UTF-8 in
	  its single-octet form is therefore the same as ISO 646 IRV
	  (ASCII), so you can continue to use ASCII for English or
	  other languages using the Latin alphabet without diacritics.
	  Note that UTF-8 is incompatible with ISO 8859-1 (ISO
	  Latin-1) after code point 127 decimal (the end of
	  ASCII).</para>
	<para>UTF-16 is an encoding of Unicode into 16-bit units,
	  which lets it represent all 17 planes of Unicode. UTF-16 is
	  incompatible with ASCII because it uses two 8-bit bytes per
	  character (four bytes above U+FFFF).</para>
	<tip xreflabel="Peter Flynn" id="iso-8859-1">
	  <para>The encoding specification can refer to any character
	  set your software supports, but the XML Specification only
	  requires that applications support UTF-8 and UTF-16. Some of
	  the common encodings supported by software include:</para>
	  <variablelist>
	    <varlistentry>
	      <term>US-ASCII</term>
	      <listitem>
		<para>Character codes TAB, LF, CR, space, and the
		  printable characters 33 to 126 (decimal) only (all other
		  control characters are forbidden by XML).</para>
	      </listitem>
	    </varlistentry>
	    <varlistentry>
	      <term>ISO-8859-1</term>
	      <listitem>
		<para>(Western European Latin-1) As ASCII plus codes
		128 to 255 (decimal). Covers most (but not all)
		western European accented letters.</para>
	      </listitem>
	    </varlistentry>
	    <varlistentry>
	      <term>ISO-8859-2 to 15</term>
	      <listitem>
		<para>The other parts of ISO-8859 (2 to 15) cover
		  different sets of Latin-based alphabetic and other
		  symbols.</para>
	      </listitem>
	    </varlistentry>
	    <varlistentry>
	      <term><quote>Codepages</quote> and other obsolescent sets</term>
	      <listitem>
		<para>Some software may also support various obsolete
		  <quote>codepages</quote>, such as IBM-850, Microsoft
		  Windows-1252, Apple Macintosh Roman, DEC
		  Multinational and other non-standard character
		  encodings, but these are generally non-portable and
		  should be avoided where possible.</para>
	      </listitem>
	    </varlistentry>
	  </variablelist>
	  <para>One common practice in western Europe is to use
	    ISO-8859-1 so that the majority of common accented letters
	    can be used as single bytes, and to use character entity
	    references or numeric character references for all other
	    characters. This has the advantage that such files can be
	    opened in almost any single-byte editor. The drawback is
	    that numeric character references are not mnemonic, and
	    character entities have to be declared in the DTD or
	    internal subset, but if they are rare, this may not be a
	    serious problem.</para>
	</tip>
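	<para>A minimal sketch of the ISO-8859-1 practice described in
	  the tip above. The <sgmltag>note</sgmltag> element type and
	  the <sgmltag>euro</sgmltag> entity are invented for this
	  illustration. The accented letter is stored directly as a
	  single byte, while the Euro sign, which ISO-8859-1 does not
	  contain, is reached either through an entity declared in the
	  internal subset or through a numeric character
	  reference:</para>
	<programlisting><![CDATA[
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note [
<!ELEMENT note (#PCDATA)>
<!-- hypothetical entity: a mnemonic name for U+20AC -->
<!ENTITY euro "&#8364;">
]>
<note>Café au lait: 3&euro; (or, without the declaration, 3&#8364;)</note>
	  ]]></programlisting>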
	<tip xreflabel="Bertilo Wennergren" id="utf-16">
	  <para>UTF-16 is an encoding that represents each Unicode
	    character of the first plane (the first 64K characters) of
	    Unicode with a 16-bit unit&mdash;in practice with two
	    bytes for each character. Thus it is backwards compatible
	    with neither ASCII nor Latin-1. UTF-16 can also access an
	    additional 1 million characters by a mechanism known as
	    surrogate pairs (two 16-bit units for each
	    character).</para>
	  <para><quote>&hellip;the mechanisms for signalling which of
	      the two are in use, and for bringing other encodings
	      into play, are [&hellip;] in the discussion of character
	      encodings.</quote> The <link linkend="spec" xreflabel="simple">XML
	      Specification</link> explains how to specify in your XML
	    file which coded character set you are using.</para>
	  <para><quote>Regardless of the specific encoding used, any
	      character in the ISO 10646 character set may be referred
	      to by the decimal or hexadecimal equivalent of its bit
	      string</quote>: so no matter which character set you
	    personally use, you can still refer to specific individual
	    characters from elsewhere in the encoded repertoire by
	    using <sgmltag class="numcharref">dddd</sgmltag> (decimal
	    character code) or <sgmltag
	      class="numcharref">xHHHH</sgmltag> (hexadecimal
	    character code: the <literal>x</literal> must be lowercase,
	    but the digits may be in either case). The terminology can get
	    confusing, as can the numbers: see the <ulink
	      url="http://cns-web.bu.edu/pub/djohnson/web_files/i18n/ISO-10646.html">ISO 
	      10646 Concept Dictionary</ulink>. <personname>
	      <firstname>Rick</firstname>
	      <surname>Jelliffe</surname>
	    </personname> has
	    <ulink
	      url="http://xml.coverpages.org/xml-ISOents.txt">XML-ized
	      the ISO character entity sets</ulink>. <personname>
	      <firstname>Mike</firstname>
	      <surname>Brown</surname>
	    </personname>'s
	    encoding information at <ulink
	      url="http://skew.org/xml/tutorial/">http://skew.org/xml/tutorial/</ulink> 
	    is a very useful explanation of the need for correct
	    encoding. There is an excellent online database of glyphs
	    and characters in many encodings from the Estonian
	    Language Institute server at <ulink
	      url="http://www.eki.ee/letter/">http://www.eki.ee/letter/</ulink>.</para>
	</tip>
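	<para>As a small illustration of the two forms of numeric
	  character reference mentioned above (the sentence is
	  invented for the example), all three references below
	  denote the same character, U+00E9, whatever encoding the
	  file itself is stored in:</para>
	<programlisting><![CDATA[
<para>caf&#233; (decimal), caf&#xE9; and caf&#xe9; (hexadecimal)</para>
	  ]]></programlisting>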
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-DOCTYPE, doctype" id="dtds">
      <question>
	<formalpara>
	  <title>What's a Document Type Definition (DTD) and where do
	    I get one?</title>
	  <para>A specification of document structure. You can write
	    one or download them.</para>
	</formalpara>
      </question>
      <answer remap="compiled dtds xsds differences">
	<para>A DTD is a description in XML Declaration Syntax of a
	  particular type or class of document. It sets out what names
	  are to be used for the different types of element, where
	  they may occur, and how they all fit together. (A <link
	    linkend="schemas">Schema</link> does the same thing in XML
	  Document Syntax, and allows more extensive
	  data-checking.)</para>
	<para>For example, if you want a document type to be able to
	  describe Lists which contain Items, the relevant part of
	  your DTD might contain something like this:</para>
	<programlisting><![CDATA[ 
<!ELEMENT List (Item)+> 
<!ELEMENT Item (#PCDATA)> 
	  ]]></programlisting>
	<para>This defines a list as an element type containing one or
	  more items (that's the plus sign); and it defines items as
	  element types containing just plain text (Parsed Character
	  Data or PCDATA). Validators read the DTD before they
	  read your document so that they can identify where every
	  element type ought to come and how each relates to the
	  other, so that applications which need to know this in
	  advance (most editors, search engines, navigators, and
	  databases) can set themselves up correctly. The example
	  above lets you create lists like:</para>
	<programlisting><![CDATA[
<List>
  <Item>Chocolate</Item>
  <Item>Music</Item>
  <Item>Surfing</Item>
</List> 
	  ]]></programlisting>
	<para>(The
	  indentation in the example is just for legibility while
	  editing: it is not required by XML.)</para>
	<para>A DTD provides applications with advance notice of what
	  names and structures can be used in a particular document
	  type. Using a DTD and a validating editor means you can be
	  certain that all documents of that particular type
	  will be constructed and named in a consistent and conformant
	  manner.</para>
	<para>DTDs are not required for processing <link
	    linkend="wf">well-formed documents</link>, but they are
	  needed if you want to take advantage of XML's special
	  attribute types like the built-in ID/IDREF cross-reference
	  mechanism; or the use of default attribute values; or
	  references to external non-XML files
	  (<quote>Notations</quote>); or if you simply want a check on
	  document validity before processing.</para>
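	<para>A minimal sketch of two of those DTD-dependent features,
	  using element type and attribute names invented for the
	  illustration: the <sgmltag class="attribute">id</sgmltag>
	  and <sgmltag class="attribute">target</sgmltag> attributes
	  form an ID/IDREF cross-reference that a validating parser
	  will check, and <sgmltag class="attribute">status</sgmltag>
	  has a default value which the parser supplies when the
	  attribute is omitted:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<!DOCTYPE Report [
<!ELEMENT Report  (Section+)>
<!ELEMENT Section (#PCDATA | Ref)*>
<!ATTLIST Section id     ID            #REQUIRED
                  status (draft|final) "draft">
<!ELEMENT Ref     EMPTY>
<!ATTLIST Ref     target IDREF         #REQUIRED>
]>
<Report>
  <!-- no status given here, so the default "draft" applies -->
  <Section id="intro">Overview.</Section>
  <Section id="detail" status="final">As promised in
    <Ref target="intro"/>.</Section>
</Report>
	  ]]></programlisting>
	<para>Without the declarations, a parser would still accept
	  the document as well-formed, but it would treat
	  <sgmltag class="attribute">id</sgmltag> and
	  <sgmltag class="attribute">target</sgmltag> as ordinary
	  strings and would supply no default.</para>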
	<para>There are thousands of DTDs already in existence in all
	  kinds of areas (see the <ulink
	    url="http://xml.coverpages.org/">SGML/XML Web
	    pages</ulink> for pointers). Many of them can be
	  downloaded and used freely; or you can write your own (see
	  the question on <link linkend="owndtd"
	    xreflabel="simple">creating your own DTD</link>). Old SGML
	  DTDs need to be converted to XML for use with XML systems:
	  <link linkend="dtdconv" xreflabel="simple">read the question
	    on converting SGML DTDs to XML</link>, but most popular
	  SGML DTDs are already available in XML form.</para>
	<para>Some XML editors use a binary
	  compiled format of DTD produced by their own management
	  routines to allow a single person in an organisation to be
	  in charge of modifications, and to distribute only an
	  unmodifiable (binary compiled) version to users.</para>
	<para>The alternatives to a DTD are various forms of <link
	    linkend="schemas">Schema</link>. These provide more
	  extensive validation features than DTDs, including character
	  data content validation.</para>
      </answer>
    </qandaentry>
    <qandaentry id="makeup" remap="makeup">
      <question>
	<formalpara>
	  <title>Does XML let me make up my own tags?</title>
	  <para>Yes but they're not called tags. They're element
	    types.</para>
	</formalpara>
      </question>
      <answer>
	<para>No, it lets you make up names for your own element
	  types. If you think tags and elements are the same thing you
	  are already in considerable trouble: read the rest of this
	  question carefully.</para>
	<para>The same applies if you are thinking in terms of
	  <wordasword>fields</wordasword> (see <link
	    linkend="databases"></link>). Wrong paradigm, wrong
	  language.</para>
	<tip xreflabel="Bob DuCharme">
	  <para>Don't confuse the term <quote>tag</quote> with the
	    term <quote>element</quote>. They are not interchangeable.
	    An element usually contains two different kinds of tag: a
	    start-tag and an end-tag, with text or more markup between
	    them.</para>
	  <para>XML lets you decide which elements you want in your
	    document and then indicate your element boundaries using
	    the appropriate start- and end-tags for those elements.
	    Each
	    <programlisting><![CDATA[<!ELEMENT...]]></programlisting>
	    declaration defines a type of element that may be used in
	    a document conforming to that DTD. We call this type of
	    element an <quote>element type</quote>. Just as the HTML
	    DTD includes the <sgmltag>H1</sgmltag> and
	    <sgmltag>P</sgmltag> element types, your document can have
	    <sgmltag>color</sgmltag> and <sgmltag>price</sgmltag>
	    element types, or anything else you want.</para>
	  <para>Normal non-empty elements are made up of a start-tag,
	    the element's content, and an end-tag.
	  <programlisting><![CDATA[<color>red</color>]]></programlisting> 
	    is a complete instance of the <sgmltag>color</sgmltag>
	    element. <sgmltag class="starttag">color</sgmltag> is only
	    the start-tag of the element, showing where it begins; it
	    is not the element itself.</para>
	  <para>Empty elements are a special case that may be
	    represented either as a pair of start- and end-tags with
	    nothing between them (eg <programlisting><![CDATA[<price
	    retail="123"></price>]]></programlisting>) or as a single
	    empty-element tag, which has a closing slash to tell
	    the parser <quote>don't go looking for an end-tag to match
	      this</quote> (eg <programlisting><![CDATA[<price
	      retail="123"/>]]></programlisting>).</para>
	</tip>
      </answer>
    </qandaentry>
    <qandaentry id="owndoctype">
      <question>
	<formalpara>
	  <title>How do I create my own document type?</title>
	  <para>Analyse the class of documents, and write a DTD or
	    Schema</para>
	</formalpara>
      </question>
      <answer>
	<para>Document types usually need a formal description, either
	  a DTD or a Schema. Whilst it is possible to process
	  well-formed XML documents without any such description,
	  trying to create them without one is asking for trouble. A
	  DTD or Schema is used with an XML editor or API to
	  guide and control the construction of the document, making
	  sure the right elements go in the right places.</para>
	<para>Creating your own document type therefore begins with an
	  analysis of the class of documents you want to describe:
	  reports, invoices, letters, configuration files, credit-card
	  verification requests, or whatever. Once you have the
	  structure correct, you write code to express this formally,
	  using <link linkend="owndtd" xreflabel="simple">DTD</link>
	  or Schema syntax.</para>
      </answer>
    </qandaentry>
    <qandaentry id="owndtd" remap="owndtd">
      <question>
	<formalpara>
	  <title>How do I write my own DTD?</title>
	  <para>Learn XML Declaration Syntax</para>
	</formalpara>
      </question>
      <answer>
	<para>You need to use the XML Declaration Syntax (very simple:
	  declaration keywords begin with
	  <programlisting><![CDATA[<!]]></programlisting> rather than
	  just the open angle bracket, and the way the declarations
	  are formed also differs slightly). Here's an example of a
	  DTD for a shopping list, based on the fragment used <link
	  linkend="dtds" xreflabel="simple">earlier</link>:</para>
	<programlisting><![CDATA[
<!ELEMENT Shopping-List (Item)+>
<!ELEMENT Item (#PCDATA)>
	  ]]></programlisting>
	<para>It says that there shall be an element called
	  <sgmltag>Shopping-List</sgmltag> and that it shall contain
	  elements called <sgmltag>Item</sgmltag>: there must be at
	  least one Item (that's the plus sign) but there may be more
	  than one. It also says that the <sgmltag>Item</sgmltag>
	  element may contain only parsed character data (PCDATA, ie
	  text: no further markup).</para>
	<para>Because there is no other element which contains
	  <sgmltag>Shopping-List</sgmltag>, that element is assumed to
	  be the <quote>root</quote> element, which encloses
	  everything else in the document. You can now use it to
	  create an XML file: give your editor the
	  declarations:</para>
	<programlisting><![CDATA[ 
<?xml version="1.0"?> 
<!DOCTYPE Shopping-List SYSTEM "shoplist.dtd"> 
	  ]]></programlisting>
	<para>(assuming you put the DTD in that file). Now your editor
	  will let you create files according to the pattern:</para>
	<programlisting><![CDATA[
<Shopping-List>
  <Item>Chocolate</Item>
  <Item>Sugar</Item>
  <Item>Butter</Item>
</Shopping-List>
	  ]]></programlisting>
	<para>It is possible to develop complex and powerful DTDs of
	  great subtlety, but for any significant use you should learn
	  more about document systems analysis and document type
	  design. See for example <biblioref 
	    linkend="devdtd"/>: this was written for SGML but perhaps
	  95&pct; of it applies to XML as well, as XML is much simpler
	  than full SGML&mdash;see the <link linkend="restrict"
	    xreflabel="simple">list of restrictions</link> which shows
	  what has been cut out.</para>
	<warning>
	  <para>Incidentally, a DTD file <emphasis>never</emphasis>
	    has a DOCTYPE Declaration in it: that only occurs in an
	    XML document instance (it's what references the DTD). Nor
	    does a DTD file have an XML Declaration at the top: the
	    most it may carry is a Text Declaration, which looks
	    similar but exists only to state the encoding.
	    Unfortunately there is still software around which
	    inserts one or both of these.</para>
	</warning>
      </answer>
    </qandaentry>
    <qandaentry id="rootelement" remap="documentElement .documentElement">
      <question>
	<formalpara>
	  <title>Can a root element type be explicitly declared in the
	    DTD?</title>
	  <para>No, use the Document Type Declaration.</para>
	</formalpara>
      </question>
      <answer>
	<para>No. This is done in the document's Document Type
	  Declaration, not in the DTD.</para>
	<tip xreflabel="Bob DuCharme">
	  <para>In a Document Type Declaration like this:
	  <programlisting><![CDATA[ 
<!DOCTYPE chapter SYSTEM "docbookx.dtd"> 
	    ]]></programlisting> the whole point of the
	    <sgmltag>chapter</sgmltag> part is to identify which of
	    the element types declared in the specified DTD should be
	    used as the root element. I believe the highest level
	    element in DocBook is <sgmltag>set</sgmltag>, but I find
	    it hard to imagine someone creating a document to
	    represent a set of books. We are free to use
	    <sgmltag>set</sgmltag>, <sgmltag>book</sgmltag>,
	    <sgmltag>chapter</sgmltag>, <sgmltag>article</sgmltag>, or
	    even <sgmltag>para</sgmltag> as the document element for a
	    valid DocBook document.</para>
	  <para>[One job some parsers do is determine which element
	    type[s] in a DTD are not contained in the content model of
	    any other element type: these are by deduction the prime
	    candidates for being default root elements. (PF)]</para>
	  <para>This is A Good Thing, because it adds flexibility to
	    how the DTD is used. It's the reason that XML (and SGML)
	    have lent themselves so well to electronic publishing
	    systems in which different elements were mixed and matched
	    to create different documents all conforming to the same
	    DTD.</para>
	  <para>I've seen schema proposals that let you specify which
	    of a schema's element types could be a document's root
	    element, but after a quick look at <ulink
	      url="http://www.w3.org/TR/xmlschema-1/#cElement_Declarations">section 
	      3.3 of Part 1 of the W3C Schema Recommendation</ulink>
	    and the RELAX NG schema for RELAX, I don't believe that
	    either of these let you do this. I could be wrong.</para>
	</tip>
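	<para>To illustrate the point made in the tip above, here is a
	  sketch of a complete document which takes
	  <sgmltag>para</sgmltag> rather than a larger unit as its
	  root element, simply by naming it in the Document Type
	  Declaration. The file name is the one used in the tip, and
	  the sentence is invented for the example:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<!DOCTYPE para SYSTEM "docbookx.dtd">
<para>A complete, if very short, document.</para>
	  ]]></programlisting>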
      </answer>
    </qandaentry>
    <qandaentry id="schemas" remap="schemata dtds xsds differences">
      <question>
	<formalpara>
	  <title>I keep hearing about alternatives to DTDs. What's a
	    Schema?</title>
	  <para>Like a DTD for validating content as well as
	    structure.</para>
	</formalpara>
      </question>
      <answer remap="modelling modeling schemalocation">
	<para><ulink url="http://www.w3.org/TR/xmlschema-0/">The W3C
	    XML Schema recommendation</ulink> provides a means of
	  specifying formal data typing and validation of element
	  content in terms of data types, so that document type
	  designers can provide criteria for checking the data content
	  of elements as well as the markup itself. Schemas are
	  written in XML Document Syntax, like XML documents are,
	  avoiding the need for processing software to be able to read
	  XML Declaration Syntax (used for DTDs).</para>
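	<para>As a rough sketch of what that looks like, the
	  shopping-list DTD from the question on <link linkend="owndtd"
	    xreflabel="simple">writing your own DTD</link> could be
	  expressed in W3C XML Schema along these lines. The choice of
	  <literal>xs:string</literal> for the items is an assumption
	  made for the illustration:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Shopping-List">
    <xs:complexType>
      <xs:sequence>
        <!-- minOccurs defaults to 1, so this means "one or more" -->
        <xs:element name="Item" type="xs:string"
                    maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
	  ]]></programlisting>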
	<para id="faq:Schema">There is a separate Schema FAQ at <ulink
	    id="FAQ:schema" url="http://www.schemavalid.com"></ulink>.
	  The term <quote>vocabulary</quote> is sometimes used to
	  refer to DTDs and Schemas together. Schemas are aimed at
	  e-commerce, data control, and database-style applications
	  where character data content requires validation and where
	  stricter data control is needed than is possible with DTDs;
	  or where strong data typing is required. They are usually
	  unnecessary for traditional text document publishing
	  applications.</para>
	<para>Unlike DTDs, Schemas cannot be specified in an XML
	  Document Type Declaration. They can be associated with a <link
	    xreflabel="simple" linkend="namespaces">Namespace</link>
	  by giving the namespace URI and the location of the Schema
	  as a pair in the <sgmltag
	    class="attribute">xsi:schemaLocation</sgmltag> attribute,
	  where Schema-aware software should pick it up, but this is
	  optional:</para>
	<programlisting><![CDATA[
<invoice id="abc123"
         xmlns="http://example.org/ns/books/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://example.org/ns/books/
                             http://acme.wilycoyote.org/xsd/invoice.xsd">
...
</invoice>         
	  ]]></programlisting>
	<para>More commonly, you specify the Schema in your processing
         software, which should record separately which Schema is used
         by which XML document instance.</para>
	<para>In contrast to the complexity of the W3C Schema model,
	  RELAX NG is a lightweight, easy-to-use XML schema language
	  devised by <personname>
	    <firstname>James</firstname>
	    <surname>Clark</surname>
	  </personname> (see <ulink
	    url="http://relaxng.org/"></ulink>) with development
	  hosted by <ulink
	    url="http://www.oasis-open.org/committees/relax-ng/">OASIS</ulink>. 
	  It allows similar richness of expression and the use of XML
	  as its syntax, but it provides an additional, simplified,
	  syntax which is easier to use for those accustomed to
	  DTDs.</para>
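	<para>For comparison, a sketch of the same shopping-list
	  structure used in the question on <link linkend="owndtd"
	    xreflabel="simple">writing your own DTD</link>, first in
	  the XML syntax of RELAX NG:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<element name="Shopping-List"
         xmlns="http://relaxng.org/ns/structure/1.0">
  <oneOrMore>
    <element name="Item">
      <text/>
    </element>
  </oneOrMore>
</element>
	  ]]></programlisting>
	<para>and then in the simplified (compact) syntax:</para>
	<programlisting><![CDATA[
element Shopping-List { element Item { text }+ }
	  ]]></programlisting>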
	<warning>
	  <para>Authors and publishers should note that the English
	    plural of Schema is Schemas: the use of the singular to do
	    duty for the plural is a foible dear to the semi-literate;
	    the use of the old (Greek) plural schemata is unnecessary
	    didacticism.</para>
	  <para>Writers should also note that the plural of DTD is
	    <ulink
	      url="http://xml.coverpages.org/properSpellingForPluralOfDTD.html">DTDs</ulink>: 
	    there is no apostrophe&mdash;see <biblioref
	      linkend="esl"/>.</para>
	</warning>
	<tip xreflabel="Bob DuCharme">
	  <para>Many XML developers were dissatisfied with the syntax
	    of the markup declarations described in the XML spec for
	    two reasons. First, they felt that if XML documents were
	    so good at describing structured information, then the
	    description of a document type's own structure (its
	    schema) should be in an XML document instead of written
	    with its own special syntax. In addition to being more
	    consistent, this would make it easier to edit and
	    manipulate the schema with regular document manipulation
	    tools. Secondly, they felt that traditional DTD notation
	    didn't allow document type designers the power to impose
	    enough constraints on the data&mdash;for example, the
	    ability to say that a certain element type must always
	    have a positive integer value, that it may not be empty,
	    or that it must be one of a list of possible choices. This
	    eases the development of software using that data because
	    the developer has less error-checking code to
	    write.</para>
	</tip>
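	<para>The three kinds of constraint mentioned in the tip above
	  might be written in W3C XML Schema roughly as follows. The
	  element names and the list of colors are invented for the
	  illustration:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- must always have a positive integer value -->
  <xs:element name="quantity" type="xs:positiveInteger"/>
  <!-- may not be empty -->
  <xs:element name="description">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:minLength value="1"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <!-- must be one of a list of possible choices -->
  <xs:element name="color">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="red"/>
        <xs:enumeration value="green"/>
        <xs:enumeration value="blue"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
	  ]]></programlisting>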
	<tip xreflabel="Peter Flynn">
	<para>A <link linkend="dtds" xreflabel="simple">DTD</link> is
	    only for specifying the element structure of an XML file,
	    with a very limited amount of control over attribute
	    values. It gives the names of the elements, attributes,
	    and entities that can be used, and how they fit together.
	    DTDs are designed for use with traditional text documents,
	    not rectangular or tabular data, so the concept of data
	    types is not relevant: text is just text. If you need to
	    specify numeric ranges or to define limitations or checks
	    on the character data (text) content, a DTD is the wrong
	    tool.</para>
	</tip>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-HYPERTEXT, hypertext" id="links">
      <question>
	<formalpara>
	  <title>How will XML affect my document links?</title>
	  <para>XML Links are much more powerful, but not yet
	    implemented in browsers</para>
	</formalpara>
      </question>
      <answer remap="extending linking anchors hrefs multiref">
	<para>The linking abilities of XML systems are potentially
	  much more powerful than those of HTML, so you'll be able to
	  do much more with them. Existing <sgmltag
	    class="attribute">href</sgmltag>-style links will remain
	  usable, but the new linking technology is based on the
	  lessons learned in the development of other standards
	  involving hypertext, such as <ulink
	    url="http://www.tei-c.org/">TEI</ulink> and <ulink
	    url="http://xml.coverpages.org/hytime.html">HyTime</ulink>, 
	  which let you manage bidirectional and multi-way links, as
	  well as links to a whole element or span of text (within
	  your own or other documents) rather than to a single point.
	  These features have been available to SGML users for many
	  years, so there is considerable experience and expertise
	  available in using them. Currently only Mozilla Firefox
	  implements XLink.</para>
	<para id="linkspecs">The <ulink
	    url="http://www.w3.org/TR/xlink/">XML Linking
	    Specification (XLink)</ulink> and the <ulink
	    url="http://www.w3.org/TR/WD-xptr">XML Extended Pointer
	    Specification (XPointer)</ulink> documents contain the
	  details. An XLink can be either a URI or a TEI-style
	  Extended Pointer (<link linkend="loc2" id="loc1"
	    xreflabel="simple">XPointer</link>), or both. A URI on its
	  own is assumed to be a resource; if an XPointer follows it,
	  it is assumed to be a sub-resource of that URI; an XPointer
	  on its own is assumed to apply to the current document (all
	  exactly as with HTML).</para>
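	<para>As a sketch of the simplest case, an XLink simple link
	  in the form defined by the XLink Recommendation puts the
	  linking attributes on any element you choose. The
	  <sgmltag>citation</sgmltag> element type and the fragment
	  identifier are assumptions made for the example:</para>
	<programlisting><![CDATA[
<citation xmlns:xlink="http://www.w3.org/1999/xlink"
          xlink:type="simple"
          xlink:href="http://xml.silmaril.ie/faq.xml#links">the
  question on links</citation>
	  ]]></programlisting>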
	<para>An XLink may use one of <literal>#</literal>,
	  <literal>?</literal>, or <literal>|</literal>. The
	  <literal>#</literal> and <literal>?</literal> mean the same
	  as in HTML applications; the <literal>|</literal> means the
	  sub-resource can be found by applying the link to the
	  resource, but the method of doing this is left to the
	  application. An XPointer can only follow a
	  <literal>#</literal>.</para>
	<para>The <ulink
	    url="http://etext.virginia.edu/bin/tei-tocs?div=DIV2;id=SAXR">TEI 
	    Extended Pointer Notation</ulink> (EPN) is much more
	  powerful than the fragment address on the end of some URIs,
	  as it allows you to specify the location of a link end using
	  the structure of the document as well as (or instead of)
	  known, fixed points like IDs. For example, <link
	    xreflabel="simple" linkend="loc1" id="loc2">the linked
	    second occurrence</link> of the word
	  <quote>XPointer</quote> two paragraphs back could be
	  referred to with the URI (shown here with linebreaks and
	  spaces for clarity: in practice it would of course be all
	  one long string):</para>
	<programlisting><![CDATA[
http://xml.silmaril.ie/faq.xml#ID(hypertext)
    .child(1,#element,'answer')
    .child(2,#element,'para')
    .child(1,#element,'link')
	  ]]></programlisting> 
	<para>This means the first <sgmltag>link</sgmltag> element
	  within the second paragraph within the
	  <sgmltag>answer</sgmltag> in the element whose ID is
	  <sgmltag class="attvalue">hypertext</sgmltag> (this
	  question). Count the objects from the start of this question
	  (which has the ID <sgmltag
	    class="attvalue">hypertext</sgmltag>) in the <ulink
	    url="http://xml.silmaril.ie/faq.sgml">XML
	    source</ulink>:</para>
	<orderedlist>
	  <listitem>
	    <para>the first child object is the element containing the
	      question (<sgmltag>qandaentry</sgmltag>);</para>
	  </listitem>
	  <listitem>
	    <para>the second child object is the answer (the
	      <sgmltag>answer</sgmltag> element);</para>
	  </listitem>
	  <listitem>
	    <para>within this element go to the second
	      paragraph;</para>
	  </listitem>
	  <listitem>
	    <para>find the first <sgmltag>link</sgmltag>
	      element.</para>
	  </listitem>
	</orderedlist>
	<para>Eve Maler explained the relationship of XLink and
	  XPointer as follows:</para>
	<blockquote>
	  <para>XLink governs how you insert links
	    <emphasis>into</emphasis> your XML document, where the
	    link might point to anything (eg a GIF file); XPointer
	    governs the fragment identifier that can go on a URL when
	    you're linking <emphasis>to</emphasis> an XML document,
	    <emphasis>from</emphasis> anywhere (eg from an HTML
	    file).</para>
	  <para>[Or indeed from an XML file, a URI in a mail message,
	    etc&hellip;Ed.]</para>
	</blockquote>
	<para><personname>
	    <firstname>David</firstname>
	    <surname>Megginson</surname>
	  </personname> has produced an <ulink
	    url="http://www.megginson.com/Software/psgml-xpointer.el">xpointer</ulink> 
	  function for Emacs/psgml which will deduce an XPointer for
	  any location in an XML document. XML Spy has a similar
	  function.</para>
      </answer>
    </qandaentry>
    <qandaentry id="mathematics" remap="FAQ-MATH, math">
      <question>
	<formalpara>
	  <title>Can I encode mathematics using XML?</title>
	  <para>Yes, using MathML.</para>
	</formalpara>
      </question>
      <answer remap="add subtract multiply divide addition subtraction
	multiplication division">
	<para>Yes, if the <link linkend="dtds"
	    xreflabel="simple">document type</link> you use provides
	  for math, and your users' browsers are capable of rendering
	  it. The mathematics-using community has developed the <ulink
	    url="http://www.w3.org/Math/">MathML
	    Recommendation</ulink> at the W3C, which is a native XML
	  application suitable for embedding in other DTDs and
	  Schemas.</para>
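	<para>As a brief illustration (the expression is chosen
	  arbitrarily), <quote>x squared plus one</quote> in MathML
	  presentation markup looks like this:</para>
	<programlisting><![CDATA[
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <msup>
      <mi>x</mi>
      <mn>2</mn>
    </msup>
    <mo>+</mo>
    <mn>1</mn>
  </mrow>
</math>
	  ]]></programlisting>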
	<para>It is also possible to make XML fragments from other
	  DTDs, such as <ulink
	    url="http://xml.coverpages.org/gen-apps.html#iso12083DTDs">ISO 
	    12083 Math</ulink>, or <ulink
	    url="http://www.openmath.org/">OpenMath</ulink>, or one of
	  your own making. Browsers which display math embedded in
	  SGML existed for many years (eg DynaText, Panorama, Multidoc
	  Pro), and mainstream browsers are now rendering MathML.
	  <personname>
	    <firstname>David</firstname>
	    <surname>Carlisle</surname>
	  </personname> has produced a <ulink
	    url="http://www.mathmlconference.org/2002/presentations/carlisle/">set 
	    of stylesheets</ulink> for rendering MathML in browsers.
	  It is also possible to use XSLT to convert XML math markup
	  to <LaTeX/> for print (PDF) rendering, or to use
	  XSL-FO.</para>
	<para>Please note that XML is not itself a programming
	  language, so concepts such as arithmetic and
	    <wordasword>if</wordasword>-statements (if-then-else
	    logic) are not meaningful in normal XML documents.</para>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-META, metadata" id="metadata">
      <question>
	<formalpara>
	  <title>How does XML handle metadata?</title>
	  <para>Any way you want.</para>
	</formalpara>
      </question>
      <answer>
	<para>Because XML lets you define your own markup languages,
	  you can make full use of the extended hypertext features of
	  XML (see the question on <link xreflabel="simple"
	    linkend="links">Links</link>) to store or link to metadata
	  in any format (eg using <ulink
	    url="http://www.sdct.itl.nist.gov/~ftp/x3l8/other/Standards/iso11179/">ISO&nbsp;11179</ulink>, 
	  as a <ulink
	    url="http://www.oasis-open.org/committees/tm-pubsubj/">Topic 
	    Maps Published Subject</ulink>, with <ulink
	    url="http://purl.oclc.org/metadata/dublin_core/">Dublin
	    Core, Warwick Framework</ulink>, or with <ulink
	    url="http://www.dstc.edu.au/RDU/RDF/">Resource Description
	    Framework (RDF)</ulink>, or even <ulink
	    url="http://www.w3.org/PICS/">Platform for Internet
	    Content Selection (PICS)</ulink>).</para>
	<para>There are no predefined elements in XML, because it is
	  an architecture, not an application, so it is not part of
	  XML's job to specify how or if authors should or should not
	  implement metadata. You are therefore free to use any
	  suitable method. Browser makers may also have their own
	  architectural recommendations or methods to propose.</para>
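	<para>One possible approach among many, sketched here with
	  Dublin Core elements inside a wrapper element type
	  (<sgmltag>metadata</sgmltag>) invented for the illustration.
	  The values are those of this FAQ:</para>
	<programlisting><![CDATA[
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Frequently-Asked Questions about the Extensible
    Markup Language</dc:title>
  <dc:creator>Peter Flynn</dc:creator>
  <dc:date>2010-02-27</dc:date>
  <dc:format>text/xml</dc:format>
</metadata>
	  ]]></programlisting>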
      </answer>
    </qandaentry>
    <qandaentry id="graphics" remap="FAQ-GRAPH, graph">
      <question>
	<formalpara>
	  <title>How do I use graphics in XML?</title>
	  <para>Reference them as for HTML or use XLink. Or embed
	    SVG.</para>
	</formalpara>
      </question>
      <answer remap="drawings sounds nonparsed raster images">
	<para>Graphics have traditionally just been links which happen
	  to have a picture file at the end rather than another piece
	  of text. They can therefore be implemented in any way
	  supported by the XLink and XPointer specifications (see
	  <link linkend="links"></link>),
	  including using similar syntax to existing HTML images. They
	  can also be referenced using XML's built-in
	  <sgmltag>NOTATION</sgmltag> and <sgmltag>ENTITY</sgmltag>
	  mechanism in a similar way to standard SGML, as external
	  unparsed entities.</para>
	<para>However, the SVG specification (see <link
	    linkend="svg"></link>) lets you use XML markup to
	  draw vector graphics objects directly in your XML file. This
	  provides enormous power for the inclusion of portable
	  graphics, especially interactive or animated sequences, and
	  it is now slowly becoming supported in browsers.</para>
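	<para>A minimal sketch of an inline drawing: the
	  <sgmltag>figure</sgmltag> wrapper is a hypothetical element
	  type from your own document type, and the
	  <sgmltag>svg</sgmltag> content is kept distinct from it by
	  its namespace:</para>
	<programlisting><![CDATA[
<figure>
  <svg xmlns="http://www.w3.org/2000/svg"
       width="100" height="100" viewBox="0 0 100 100">
    <!-- a circle centred in the drawing area -->
    <circle cx="50" cy="50" r="40" fill="none" stroke="black"/>
  </svg>
</figure>
	  ]]></programlisting>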
	<para>The XML linking specifications for external images give
	  you much better control over the traversal and activation of
	  links, so an author can specify, for example, whether or not
	  to have an image appear when the page is loaded, or on a
	  click from the user, or in a separate window, without having
	  to resort to scripting.</para>
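	<para>In XLink terms this control is expressed with the
	  <sgmltag class="attribute">xlink:show</sgmltag> and
	  <sgmltag class="attribute">xlink:actuate</sgmltag>
	  attributes. In this sketch the <sgmltag>figure</sgmltag> and
	  <sgmltag>image</sgmltag> element types and the file names
	  are assumptions made for the example. The first image is
	  embedded as the page loads, and the second opens in a new
	  window only when the user asks for it:</para>
	<programlisting><![CDATA[
<figure xmlns:xlink="http://www.w3.org/1999/xlink">
  <image xlink:type="simple" xlink:href="lassie.jpg"
         xlink:show="embed" xlink:actuate="onLoad"/>
  <image xlink:type="simple" xlink:href="lassie-large.jpg"
         xlink:show="new" xlink:actuate="onRequest"/>
</figure>
	  ]]></programlisting>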
	<para>XML itself doesn't prescribe or restrict graphic file
	  formats: GIF, JPG, TIFF, PNG, CGM, EPS, and SVG at a minimum
	  would seem to make sense; however, vector formats (EPS, SVG)
	  are normally essential for non-photographic images
	  (diagrams).</para>
	<para>You cannot embed a raw binary graphics file (or any
	  other binary [non-text] data) directly into an XML file
	  because any bytes happening to resemble markup would get
	  misinterpreted: you must refer to it by linking (see below).
	  It is, however, possible to include a text-encoded
	  transformation of a binary file as a CDATA Marked Section,
	  using something like UUencode with the markup characters
	    <literal>]</literal>, <literal>&amp;</literal> and <literal>></literal>
	  removed from the map so that they could not occur as an
	  erroneous CDATA termination sequence and be misinterpreted.
	  You could even use simple hexadecimal encoding as used in
	  PostScript, or Base64, whose alphabet contains no markup
	  characters. For vector graphics, however, the solution is to
	  use SVG (see <link linkend="svg"></link>).</para>
	<para>Sound files are binary objects in the same way that
	  external graphics are, so they can only be referenced
	  externally (using the same techniques as for graphics).
	  Music files written in MusiXML or an XML variant of SMDL
	  could however be embedded in the same way as for SVG.</para>
	<para>The point about using entities to manage your graphics
	  is that you can keep the list of entity declarations
	  separate from the rest of the document, so you can re-use
	  the names if an image is needed more than once, but only
	  store the physical file specification in a single place.
	  This is available only when using a DTD, not a
	  Schema.</para>
	<tip xreflabel="Bob DuCharme">
	  <para>All the data in an XML document entity must be
	    parsable XML. You can define an external entity as either
	    a parsed entity (parsable XML) or an unparsed entity
	    (anything else). Unparsed entities can be used for picture
	    files, sound files, movie files, or whatever you like.
	    They can only be referenced from within a document as the
	    value of an attribute (much like a bitmap picture on an
	    HTML Web page is the value of the <sgmltag>img</sgmltag>
	    element's <sgmltag>src</sgmltag> attribute) and not part
	    of the actual document. In an XML document, this attribute
	    must be declared to be of type <sgmltag>ENTITY</sgmltag>,
	    and the entity's declaration must specify a declared
	    <sgmltag>NOTATION</sgmltag>, because if the entity isn't
	    XML, the XML processor needs to know what it is. For
	    example, in the following document, the
	    <sgmltag>colliepic</sgmltag> entity is declared to have a
	    JPEG notation, and it's used as the value of the empty dog
	    element's <sgmltag class="attribute">picfile</sgmltag>
	    attribute.</para>
	  <programlisting><![CDATA[ 
<?xml version="1.0"?> 
<!DOCTYPE dog [ 
<!NOTATION JPEG SYSTEM "Joint Photographic Experts Group"> 
<!ENTITY colliepic SYSTEM "lassie.jpg" NDATA JPEG>
<!ELEMENT dog EMPTY> 
<!ATTLIST dog picfile ENTITY #REQUIRED> 
]> 
<dog picfile="colliepic"/> 
	    ]]></programlisting>
	  <para>The Entity method is particularly useful when you have
	    many images, or many repeated uses of the same images,
	    because you only declare them once, at the top of the
	    document, making image management much easier.</para>
	  <para>The XLink and XPointer linking specifications describe
	    other ways to point to a non-XML file such as a graphic.
	    These offer more sophisticated control over the external
	    entity's position, handling, and appearance within the XML
	    document.</para>
	</tip>
	<tip id="svg" xreflabel="Peter Murray-Rust">
	  <para>GIFs and JPEGs cater for bitmaps (pixel
	    representations of images: all made up of coloured dots).
	    Vector graphics (scalable, made up of drawing
	    specifications) are addressed in the W3C's graphics
	    activity as Scalable Vector Graphics (see <ulink
	      url="http://www.w3.org/Graphics/SVG"></ulink>). 
	    With the specification now complete, it is
	    possible to transmit the graphical representation as
	    vectors directly within the XML file. For many graphics
	    objects this will mean greatly decreased download time and
	    scaling without loss of detail.</para> 
	</tip>
	<tip xreflabel="Max Dunn">
	  <para id="faq:SVG">SVG has really taken off recently, and is
	    quite an XML success story [&hellip;] there are already
	    nearly conformant implementations. We recently started an
	    SVG FAQ at <ulink id="FAQ:svg"
	      url="http://www.siliconpublishing.org/svgfaq/"></ulink> 
	    which we are planning to move to <ulink
	      url="http://www.svgfaq.com/"></ulink>.</para>
	  <para>XSLT can be used to generate SVG from XML; details are
	    at <ulink
	      url="http://www.siliconpublishing.org/svgfaq/XSLT.asp"></ulink> 
	    (be careful to use XSLT, not <ulink
	      url="http://www.netcrucible.com/xslt/msxml-faq.htm">Microsoft's 
	      obsolete WD-xsl</ulink>). Documents can also interact
	    with SVG images (see <ulink
	      url="http://www.xml.com/pub/a/2000/03/22/style/index.html">http://www.xml.com/pub/a/2000/03/22/style/index.html</ulink>).</para>
	</tip>
      </answer>
    </qandaentry>
    <qandaentry id="parsers" >
      <question>
	<formalpara>
	  <title>What is parsing and how do I do it in XML?</title>
	  <para>Parsing is splitting up information into its component
	    parts</para>
	</formalpara>
      </question>
      <answer remap="penguins">
	<para>Parsing is the act of splitting up information into its
	  component parts (schools used to teach this in language
	  classes until the teaching profession collectively caught
	  the anti-grammar disease).</para>
	<para><quote>Mary feeds Spot</quote> parses as</para>
	<orderedlist>
	  <listitem>
	    <para>Subject = Mary, proper noun, nominative case</para>
	  </listitem>
	  <listitem>
	    <para>Verb = feeds, transitive, third person singular,
	      present tense</para>
	  </listitem>
	  <listitem>
	    <para>Object = Spot, proper noun, accusative case</para>
	  </listitem>
	</orderedlist>
	<para>In computing, a parser is a program (or a piece of code
	  or API that you can reference inside your own programs)
	  which analyses files to identify the component parts. All
	  applications that read input have a parser of some kind,
	  otherwise they'd never be able to figure out what the
	  information means. Microsoft Word contains a parser which
	  runs when you open a <filename>.doc</filename>
	  file and checks that it can identify all the hidden codes.
	  Give it a corrupted file and you'll get an error
	  message.</para>
	<para>XML applications are just the same: they contain a parser
	  which reads XML and identifies the function of each of the pieces of
	  the document, and it then makes that information available in
	  memory to the rest of the program.</para>
	<para>While reading an XML file, a parser checks the syntax
	  (pointy brackets, matching quotes, etc) for well-formedness,
	  and reports any violations (reportable errors). The <link
	    xreflabel="simple" linkend="spec">XML Specification</link>
	  lists what these are.</para>
	<para>Validation is another stage beyond parsing. As the
	  component parts of the document are identified, a validating
	  parser can compare them with the pattern laid down by a DTD
	  or a Schema, to check that they conform. In the process,
	  default values and datatypes (if specified) can be added to
	  the in-memory result of the validation that the validating
	  parser gives to the application.</para>
	<programlisting><![CDATA[
<person corpid="abc123" birth="1960-02-31" gender="female">
  <name>
    <forename>Judy</forename>
    <surname>O'Grady</surname>
  </name>
</person> 
	  ]]></programlisting>
	<para>The example above parses as:</para>
	<orderedlist>
	  <listitem>
	    <para>Element <sgmltag class="gi">person</sgmltag>
	      identified with Attribute <sgmltag
		class="attribute">corpid</sgmltag> containing <sgmltag
		class="attvalue">abc123</sgmltag> and Attribute
	      <sgmltag class="attribute">birth</sgmltag> containing
	      <sgmltag class="attvalue">1960-02-31</sgmltag> and
	      Attribute <sgmltag class="attribute">gender</sgmltag>
	      containing <sgmltag class="attvalue">female</sgmltag>
	      containing ...</para>
	  </listitem>
	  <listitem>
	    <para>Element <sgmltag class="gi">name</sgmltag>
	      containing ...</para>
	  </listitem>
	  <listitem>
	    <para>Element <sgmltag class="gi">forename</sgmltag>
	      containing text <quote>Judy</quote> followed by
	      ...</para>
	  </listitem>
	  <listitem>
	    <para>Element <sgmltag class="gi">surname</sgmltag>
	      containing text <quote>O'Grady</quote></para>
	  </listitem>
	</orderedlist>
	<para>(and lots of other stuff too).</para>
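	<para>To illustrate the validation stage, here is a sketch of
	  the same document with an internal DTD subset that a
	  validating parser could check it against. The declarations
	  are assumptions made for illustration, and the <sgmltag
	    class="attribute">status</sgmltag> attribute is
	  hypothetical: it is not given in the document, so a
	  validating parser would supply its default value (<sgmltag
	    class="attvalue">active</sgmltag>) to the
	  application.</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<!DOCTYPE person [
<!ELEMENT person (name)>
<!ATTLIST person corpid ID    #REQUIRED
                 birth  CDATA #IMPLIED
                 gender CDATA #IMPLIED
                 status (active|retired) "active">
<!ELEMENT name (forename, surname)>
<!ELEMENT forename (#PCDATA)>
<!ELEMENT surname (#PCDATA)>
]>
<person corpid="abc123" birth="1960-02-31" gender="female">
  <name>
    <forename>Judy</forename>
    <surname>O'Grady</surname>
  </name>
</person>
	  ]]></programlisting>
	<para>Because a DTD can only describe <sgmltag
	    class="attribute">birth</sgmltag> as character data, the
	  impossible date passes validation here; it would take a
	  Schema datatype such as <literal>xs:date</literal> to catch
	  it.</para>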
	<para>As well as built-in parsers, there are also stand-alone
	  parser-validators, which read an XML file and tell you if
	  they find an error (like missing angle-brackets or quotes,
	  or misplaced markup). This is essential for testing files in
	  isolation before doing something else with them, especially
	  if they have been created by hand without an XML editor, or
	  by an API which may be too deeply embedded elsewhere to
	  allow easy testing.</para>
	<tip id="howval" xreflabel="Bill Rayer" role="helped">
	  <para>For standalone parsing/validation use software like
	    <personname>
	      <firstname>James</firstname>
	      <surname>Clark</surname>
	    </personname>'s <ulink
	      url="http://www.jclark.com/sp">nsgmls</ulink> or
	      <personname>
	      <firstname>Richard</firstname>
	      <surname>Tobin</surname>
	    </personname>'s <ulink
	      url="http://www.cogsci.ed.ac.uk/~richard/rxp.html">rxp</ulink>. 
	    Both work under Linux and Windows/DOS. The difference is
	    in the format of the error listing (if any), and that some
	    versions of nsgmls do not retrieve DTDs or other files
	    over the network, whereas rxp does.</para>
	  <para>Make sure your XML file correctly references its DTD
	    in a Document Type Declaration, and that the DTD file[s]
	    are locally accessible (rxp will retrieve them if you have
	    an Internet connection; nsgmls may not, so it may need a
	    local copy).</para>
	  <para>Download and install the software. Make sure it is
	    installed to a location where your operating system can
	    find it. If you don't know what any of this means, you
	    will need some help from someone who knows how to download
	    and install software on your type of operating
	    system.</para>
	  <para>For nsgmls, copy <filename>pubtext/xml.soc</filename>
	    and <filename>pubtext/xml.dcl</filename> to your working
	    directory.</para>
	  <para>To validate <filename>myfile.xml</filename>, open a
	    shell window (Linux) or an MS-DOS (<quote>command</quote>)
	    window (Microsoft Windows). In these examples we'll assume
	    your XML file is called <filename>myfile.xml</filename>
	    and it's in a folder called <filename>myfolder</filename>.
	    Use the real names of your folder and file when you type
	    the commands.</para>
	  <variablelist>
	    <varlistentry>
	      <term>For onsgmls:</term>
	      <listitem>
		<para><programlisting><![CDATA[
$ onsgmls -wxml -wundefined -cxml.soc -s myfile.xml
		  ]]></programlisting>
		  There are many other options for
		  <productname>onsgmls</productname> which 
		  are described on the <ulink
		    url="http://openjade.sourceforge.net/">Web page</ulink>.
		  The ones given here are required because it's based
		  on an SGML parser and these options switch it to XML mode
		  and suppress the normal output, leaving just the
		  errors (if any).</para>
		<para>(In Microsoft Windows you may have to prefix the
		  <command>onsgmls</command> command with the full
		  path to wherever it was installed, eg
		  <filename>C:\Program
		    Files\OpenSP\bin\onsgmls</filename>).</para>
	      </listitem>
	    </varlistentry>
	    <varlistentry>
	      <term>For rxp:</term>
	      <listitem>
		<para><programlisting><![CDATA[
$ rxp myfile.xml
		  ]]></programlisting>
		Rxp also has some options which are described on
		  its <ulink
		    url="http://www.cogsci.ed.ac.uk/~richard/rxp.html">Web 
		    page</ulink>.</para>
		<para>(In Microsoft Windows you may have to prefix the
		  <command>rxp</command> command with the full path to
		  wherever it was installed, eg <filename>C:\Program
		    Files\ltxml2\bin\rxp</filename>).</para>
	      </listitem>
	    </varlistentry>
	  </variablelist>
	</tip>
      </answer>
    </qandaentry>
    <qandaentry id="includes">
      <question>
	<formalpara>
	  <title>How do I include one XML file in another?</title>
	  <para>Use a general entity, same as for SGML</para>
	</formalpara>
      </question>
      <answer remap="xinclude">
	<para>One method is to use Document Entities, which work
	  exactly the same as for SGML. First you declare the entity
	  you want to include, and then you reference it by
	  name:</para>
	<programlisting><![CDATA[ 
<?xml version="1.0"?>
<!DOCTYPE novel SYSTEM "/dtd/novel.dtd" [
<!ENTITY chap1 SYSTEM "mydocs/chapter1.xml">
<!ENTITY chap2 SYSTEM "mydocs/chapter2.xml">
<!ENTITY chap3 SYSTEM "mydocs/chapter3.xml">
<!ENTITY chap4 SYSTEM "mydocs/chapter4.xml">
<!ENTITY chap5 SYSTEM "mydocs/chapter5.xml">
]>
<novel>
  <header>
    ...blah blah...
  </header>
&chap1; 
&chap2; 
&chap3; 
&chap4; 
&chap5; 
</novel>
	  ]]></programlisting>
	<para>The difference between this method and the one used for
	  including a DTD fragment (see <link
	    linkend="dtdincludes"></link>) is that this uses an
	  external general (file) entity which is referenced in the
	  same way as for a character entity (with an ampersand).</para>
	<para>The one thing to make sure of is that the included file
	  <emphasis>must not</emphasis> have an XML or DOCTYPE
	  Declaration on it. If you've been using one for editing the
	  fragment, remove it before using the file in this way. Yes,
	  this is a pain in the butt, but if you have lots of
	  inclusions like this, write a script to strip off the
	  declaration (and paste it back on again for editing).</para>
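	<para>A file such as <filename>mydocs/chapter1.xml</filename>
	  would then contain nothing but the markup to be included,
	  as in this sketch (the element names and the text are
	  assumptions about what <filename>novel.dtd</filename>
	  allows):</para>
	<programlisting><![CDATA[
<chapter>
  <title>Chapter the First</title>
  <para>It was a dark and stormy night...</para>
</chapter>
	  ]]></programlisting>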
	<para>Schemas do not support entities, so the alternative is
	  to use <ulink
	    url="http://www.w3.org/TR/xinclude/">XInclude</ulink>.
	  This is a W3C specification for including one XML document
	  (or fragment) inside another.</para> 
	<programlisting><![CDATA[
<?xml version="1.0"?>
...
<article xmlns="http://docbook.org/ns/docbook"
      xmlns:xi="http://www.w3.org/2001/XInclude">
   <articleinfo>
     <xi:include href="metadata.xml" parse="xml"
         xpointer="title"/>
   </articleinfo>
   <sect1>
      ...
   </sect1>
</article>
	  ]]></programlisting>
	<para>Your processing software must be able to handle XInclude
	  for this to work. The <ulink
	    url="http://www.w3.org/TR/xptr/">XPointer</ulink> syntax
	  can direct the parser to a specific location within the
	  document, unlike entities, where the entire document is
	  included.</para>
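	<para>In the example above, the bare name
	  <literal>title</literal> in the <sgmltag
	    class="attribute">xpointer</sgmltag> attribute is a
	  shorthand pointer, which selects the element whose ID is
	  <literal>title</literal>. A
	  <filename>metadata.xml</filename> to match might look like
	  this sketch. The element names and text are assumptions,
	  and it relies on the processor recognising <sgmltag
	    class="attribute">xml:id</sgmltag> as an ID (otherwise the
	  attribute would need to be declared as type ID in a
	  DTD).</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<metadata>
  <title xml:id="title">An Example Article</title>
  <subtitle xml:id="subtitle">With an included title</subtitle>
</metadata>
	  ]]></programlisting>
	<para>Only the <sgmltag class="gi">title</sgmltag> element and
	  its content would be copied into the article; the rest of
	  the file is ignored.</para>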
      </answer>
    </qandaentry>
    <qandaentry id="cdata" >
      <question>
	<formalpara>
	  <title>When should I use a CDATA Marked Section?</title>
	  <para>CDATA is only for text containing markup-like characters.</para>
	</formalpara>
      </question>
      <answer>
	<para>You should almost never need to use CDATA Sections. The
	  CDATA mechanism was designed to let an author quote
	  fragments of text containing markup characters (the
	  open-angle-bracket and the ampersand), for example when
	  documenting XML (this FAQ uses CDATA Sections quite a lot,
	  for obvious reasons). A CDATA Section turns off markup
	  recognition for the duration of the section (it gets turned
	  on again only by the closing sequence of double
	  end-square-brackets and a close-angle-bracket).</para>
	<para>Consequently, <emphasis>nothing</emphasis> in a CDATA
	  section can ever be recognised as anything to do with
	  markup: it's just a string of opaque characters, and if you
	  use an XML transformation language like XSLT, <emphasis>any
	    markup characters in it will get turned into their
	    character entity equivalents</emphasis>.</para>
	<para>If you try, for example, to use:</para>
	<programlisting><![CDATA[
some text with <![CDATA[<em>markup</em>]]]]>&gt; in it.
	</programlisting>
	<para>in the expectation that the embedded markup would remain
	  untouched, it won't: it will just output</para>
	<programlisting>
some text with &amp;lt;em>markup&amp;lt;/em> in it.
	</programlisting>
	<para>In other words, CDATA Sections
	  <emphasis>cannot</emphasis> preserve the embedded markup
	  <emphasis>as markup</emphasis>. Normally this is exactly
	  what you want because this technique was designed to let
	  people do things like write documentation about markup. It
	  was <emphasis>not</emphasis> designed to allow the passing
	  of little chunks of (possibly bogus or invalid) unparsed HTML
	  embedded inside your own XML through to a subsequent
	  process&mdash;because that would risk invalidating the
	  output.</para>
	<para>As a result you <emphasis>cannot</emphasis> expect to
	  keep markup untouched simply because it looked as if it was
	  safely <quote>hidden</quote> inside a CDATA section: it
	  can't be used as a magic shield to preserve HTML markup for
	  future use <emphasis>as markup</emphasis>, only as
	  characters.</para>
	<tip>
	  <para>Read <link linkend="html"></link> as
	    well, which is very closely related.</para>
	</tip>
      </answer>
    </qandaentry>
    <qandaentry id="html" >
      <question>
	<formalpara>
	  <title>How can I handle embedded HTML in my XML?</title>
	  <para>Provide for it in the output, use a deep copy, or try
	    disable-output-escaping.</para>
	</formalpara>
      </question>
      <answer>
	<para>Apart from using <link linkend="cdata" xreflabel="simple">CDATA
	    Sections</link>, there are two common occasions when
	  people want to handle embedded HTML inside an XML
	  element:</para>
	<orderedlist>
	  <listitem>
	    <para>when they have received (possibly poorly-designed)
	      XML from somewhere else which they must find a way to
	      handle;</para>
	  </listitem>
	  <listitem>
	    <para>when they have an application which has been
	      explicitly designed to store a string of characters
	      containing <literal><![CDATA[&lt;]]></literal> and
	      <literal><![CDATA[&amp;]]></literal> character entity
	      references with the objective of turning them back into
	      markup in a later process (eg FreeMind, Atom).</para>
	  </listitem>
	</orderedlist>
	<para>Generally, you want to avoid this kind of trick, as it
	  usually indicates that the document structure and design has
	  been insufficiently thought out. However, there are
	  occasions when it becomes unavoidable, so if you really need
	  or want to use embedded HTML markup inside XML,
	  <emphasis>and</emphasis> have it processable later as
	  markup, there are a couple of techniques you may be able to
	  use:</para>
	<itemizedlist>
	  <listitem>
	    <para>Provide templates for the handling of that markup in
	      your XSLT transformation or whatever software you use
	      which simply replicates what was there, eg</para>
	    <programlisting><![CDATA[
<xsl:template match="b">
  <b>
    <xsl:apply-templates/>
  </b>
</xsl:template>
	      ]]></programlisting>
	  </listitem>
	  <listitem>
	    <para>Use XSLT's <quote>deep copy</quote> instruction,
	      which outputs nested well-formed markup verbatim,
	      eg</para>
	    <programlisting><![CDATA[
<xsl:template match="ol">
  <xsl:copy-of select="."/>
</xsl:template>
	      ]]></programlisting>
	  </listitem>
	  <listitem>
	    <para>As a last resort, use the
	      <sgmltag>disable-output-escaping</sgmltag> attribute on
	      the <sgmltag>xsl:text</sgmltag> element of XSL[T] which
	      is available in some processors, eg</para>
	    <programlisting><![CDATA[
<xsl:text disable-output-escaping="yes"><![CDATA[<b>Now!</b>]]]]><![CDATA[></xsl:text>
	      ]]></programlisting>
	  </listitem>
	  <listitem>
	    <para>Some processors (eg JX) are now providing their own
	      equivalents for disabling output escaping. Their
	      proponents claim it is <quote>highly desirable</quote>
	      or <quote>what most people want</quote>, but it still
	      needs to be treated with care to prevent unwanted
	      (possibly dangerous) arbitrary code from being passed
	      untouched through your system. It also adds another
	      dependency to your software.</para>
	  </listitem>
	</itemizedlist>
	<para>For more details of using these techniques in XSL[T],
	  see <ulink
	    url="http://www.dpawson.co.uk/xsl/sect2/cdata.html">the
	    relevant question in the XSL FAQ</ulink>.</para>
	<tip>
	  <para>Read <link linkend="cdata"></link> as
	    well, which is very closely related.</para>
	</tip>
      </answer>
    </qandaentry>
    <qandaentry id="specials" >
      <question>
	<formalpara>
	  <title>What are the special characters in XML?</title>
	  <para>Just five: 
	    <literal><![CDATA[&lt;]]></literal> (<literal>&lt;</literal>),
	    <literal><![CDATA[&amp;]]></literal> (<literal>&amp;</literal>),
	    <literal><![CDATA[&gt;]]></literal> (<literal>&gt;</literal>),
	    <literal><![CDATA[&quot;]]></literal> (<literal>&quot;</literal>), 
	    and
	    <literal><![CDATA[&apos;]]></literal>
	  (<literal>&apos;</literal>).</para> 
	</formalpara>
      </question>
      <answer remap="hex code hexcode specials reserved words nbsp lt
	  amp gt quot apos reserved tags restricted characters">
	<para>For normal text (<emphasis>not</emphasis> markup), there
	  are no special characters except <literal>&lt;</literal> and
	  <literal>&amp;</literal>: just make sure your XML
	  Declaration refers to the correct encoding scheme for the
	  language and/or writing system you want to use,
	  <emphasis>and</emphasis> that your computer correctly stores
	  the file using that encoding scheme. See <link
	    xreflabel="simple" linkend="characters">the question on
	    non-Latin characters</link> for a longer
	  explanation.</para>
	<para>Apart from the invisible ASCII control characters (the
	  ones you can't type), all other characters are just normal
	  text. Currency signs (&euro;, &pound;, &dollar;, &florin;,
	  and others), all the punctuation (except
	  <literal>&lt;</literal> and <literal>&amp;</literal>), and
	  all other letters, signs, and symbols in any language or
	  writing system are just text.</para>
	<para>If your keyboard will not allow you to type the
	  characters you want, or if you want to use characters
	  outside the limits of the encoding scheme you have chosen,
	  you can use a symbolic notation called <quote>entity
	    referencing</quote>. Entity references can either be
	  <emphasis>numeric</emphasis>, using the decimal or
	  hexadecimal <ulink
	    url="http://www.unicode.org/">Unicode</ulink> code point
	  for the character (eg if your keyboard has no Euro symbol
	  (&euro;) you can type
	  <literal><![CDATA[&#8364;]]></literal>); or they can be
	  <emphasis>character</emphasis>, using an established set of
	  names which you can declare in your DTD (eg
	  <literal><![CDATA[<!ENTITY euro "&#8364;">]]></literal>)
	  which then lets you use the name
	  <literal><![CDATA[&euro;]]></literal> in your document. If
	  you are using a Schema, you must use the numeric form for
	  all except the five below because Schemas have no way to
	  make character entity declarations.</para>
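	<para>The three forms can be compared side by side in a small
	  sketch. The <sgmltag class="gi">price</sgmltag> element is
	  invented for illustration; <literal>8364</literal> and
	  <literal>20AC</literal> are the decimal and hexadecimal
	  forms of the same Unicode code point for the Euro
	  sign.</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<!DOCTYPE price [
<!ELEMENT price (#PCDATA)>
<!ENTITY euro "&#8364;">
]>
<!-- decimal, hexadecimal, and named forms of the same character -->
<price>&#8364;10 and &#x20AC;10 and &euro;10 are the same amount</price>
	  ]]></programlisting>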
	<para>If you use XML with no DTD, then these five character
	  entities are assumed to be predeclared, and you can use them
	  without declaring them:</para>
	<variablelist>
	  <varlistentry>
	    <term><literal><![CDATA[&lt;]]></literal></term>
	    <listitem>
	      <para>The less-than character (<literal>&lt;</literal>) starts
		<firstterm>element markup</firstterm> (the first
		character of a start-tag or an end-tag).</para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term><literal><![CDATA[&amp;]]></literal></term>
	    <listitem>
	      <para>The ampersand character (<literal>&amp;</literal>)
		starts <firstterm>entity markup</firstterm> (the first
		character of a character entity reference).</para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term><literal><![CDATA[&gt;]]></literal></term>
	    <listitem>
	      <para>The greater-than character (<literal>&gt;</literal>)
		ends a start-tag or an end-tag.</para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term><literal><![CDATA[&quot;]]></literal></term>
	    <listitem>
	      <para>The double-quote character (<literal>&quot;</literal>)
		can be symbolised with this character entity reference
		when you need to embed a double-quote inside a string
		which is already double-quoted.</para>
	    </listitem>
	  </varlistentry>
	  <varlistentry>
	    <term><literal><![CDATA[&apos;]]></literal></term>
	    <listitem>
	      <para>The apostrophe or single-quote character
		(<literal>&apos;</literal>) can be symbolised with this
		character entity reference when you need to embed a
		single-quote or apostrophe inside a string which is
		already single-quoted.</para>
	    </listitem>
	  </varlistentry>
	</variablelist>
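	<para>A short sketch shows all five in use. The <sgmltag
	    class="gi">note</sgmltag> element and its attributes are
	  hypothetical.</para>
	<programlisting><![CDATA[
<note title="The &quot;big&quot; one" by='Judy O&apos;Grady'>
  Write 1 &lt; 2 and 3 &gt; 2, and AT&amp;T, using entity references.
</note>
	  ]]></programlisting>
	<para>Only <literal><![CDATA[&lt;]]></literal> and
	  <literal><![CDATA[&amp;]]></literal> are strictly needed in
	  the text of this example;
	  <literal><![CDATA[&gt;]]></literal> is used for
	  symmetry.</para>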
	<para>If you are using a DTD then you
	  <emphasis>must</emphasis> declare any
	  <emphasis>other</emphasis> character entities you need to
	  use. The five above remain predeclared: every XML processor
	  must recognise them whether they are declared or not,
	  although the <link xreflabel="simple"
	    linkend="spec">Specification</link> recommends that valid
	  documents declare them as well, for interoperability with
	  SGML tools.</para>
	<warning>
	  <para>There are circumstances where you can use special
	    characters as themselves, such as in <link
	      xreflabel="simple" linkend="cdata">CDATA
	      Sections</link>. Most control characters are prohibited
	    in XML: see the <link xreflabel="simple"
	      linkend="spec">Specification</link> for exact
	    details.</para>
	</warning>
	<para>There are also no reserved words as such in the user
	  namespace of XML: you can call an element
	  <wordasword>element</wordasword> and an attribute
	  <wordasword>attribute</wordasword> and so on as in the
	  following (perverse) example:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<!DOCTYPE DOCTYPE SYSTEM "SYSTEM" [
<!ELEMENT DOCTYPE (ELEMENT+)>
<!ATTLIST ELEMENT ATTLIST ENTITY #IMPLIED>
<!NOTATION DOCTYPE SYSTEM "ENTITY">
<!ENTITY NOTATION SYSTEM "ENTITY" NDATA DOCTYPE>
]>
<DOCTYPE>
  <ELEMENT ATTLIST="NOTATION">foo</ELEMENT>
</DOCTYPE>
	]]></programlisting>
	<para>where the file <filename>SYSTEM</filename> contains the
	  declaration: <literal><![CDATA[<!ELEMENT ELEMENT
	    (#PCDATA)>]]></literal> and the file
	  <filename>ENTITY</filename> does not even exist.</para>
	<para>There are <firstterm>keywords</firstterm> like
	  <literal>DOCTYPE</literal> and <literal>IMPLIED</literal>
	which are reserved Names, but they are prefixed by a flag
	character (the Markup Declaration Open character or the
	Reserved Name Indicator) so that they cannot be confused with
	user-specified Names.</para>
      </answer>
    </qandaentry>
  </qandadiv>
  <qandadiv id="developers" remap="FAQ-DEVELOPER, Developer">
    <title>Developers and Implementors</title>
    <qandaentry remap="FAQ-SPEC, spec" id="spec">
      <question>
	<formalpara>
	  <title>Where's the spec?</title>
	  <para>Right <ulink
	    url="http://www.w3.org/TR/REC-xml">here</ulink></para>
	</formalpara>
      </question>
      <answer>
	<para>Right here: <biblioref linkend="thespec"/>
	  (<filename>http://www.w3.org/TR/REC-xml</filename>).
	  Includes the EBNF, and all the normative material. There are
	  also versions in <ulink
	    url="http://www.fxis.co.jp/DMS/sgml/xml/">Japanese</ulink>; 
	  <ulink
	    url="http://xml.silmaril.ie/faq-es.html">Spanish</ulink>;
	  <ulink
	    url="http://xml.t2000.co.kr/faq/index.html">Korean</ulink>; 
	  a <ulink
	    url="http://www.xml.com/axml/testaxml.htm">Java-ised
	    annotated version</ulink>, and <author of="xmlann">
	    <contrib></contrib>
	  </author>'s book, <citetitle
	  author="xmlann"></citetitle>.</para>
	<para><personname>
	    <firstname>Eve</firstname>
	    <surname>Maler</surname>
	  </personname> maintains <ulink
	    url="http://www.w3.org/XML/1998/06/xmlspec-v21.dtd">the
	    DTD used for the spec itself</ulink>; the DTD is also used to
	  encode several other W3C specifications, such as XLink,
	  XPointer, DOM, XML Schema, etc. There is <ulink
	    url="http://www.w3.org/XML/1998/06/xmlspec-report-v21.htm">documentation</ulink> 
	  available for the DTD. Note that the XML spec needs to use
	  <ulink
	    url="http://www.w3.org/XML/1998/06/xmlspec-v21a.dtd">a
	    special one-off version of the DTD</ulink>, since the real
	  original DTD used for it has long since been lost.</para>
      </answer>
    </qandaentry>
    <qandaentry remap="FAQ-TERMS, terms" id="terminology"> 
      <question>
	<formalpara> 
	  <title>I'm trying to understand the XML Spec: why does it
	    have such difficult terminology?</title> 
	  <para>It has to be formal to be accurate.</para>
	</formalpara> 
      </question> 
      <answer> 
	<para>For implementation to succeed, the terminology needs to
	  be precise. Design goal eight of the specification tells us
	  that <quote>the design of XML shall be formal and
	    concise</quote>. To describe XML, the specification
	  therefore uses formal language drawn from several fields,
	  specifically those of document engineering, international
	  standards and computer science.  This is often confusing to
	  people who are unused to these disciplines because they use
	  well-known English words in a specialised sense which can be
	  very different from their common meanings&mdash;for example:
	  grammar, production, token, or terminal.</para> 
	<para>The specification does not explain these terms because
	  of the other part of the design goal: the specification
	  should be concise. It doesn't repeat explanations that are
	  available elsewhere: it is assumed you know this and either
	  know the definitions or are capable of finding them. In
	  essence this means that to grok the fullness of the spec,
	  you do need a knowledge of some SGML and computer science,
	  and have some exposure to the language of formal
	  standards.</para> 
	<para>Sloppy terminology in specifications causes
	  misunderstandings and makes it hard to implement
	  consistently, so formal standards have to be phrased in
	  formal terminology. This FAQ is not a formal document, and
	  the astute reader will already have noticed it refers to
	  <quote>element names</quote> where <quote>element type
	    names</quote> is more correct; but the former is more
	  widely understood.</para> 
	<para>Those new to the terminology may find it useful to read
	  something like the <biblioref linkend="tei"/> or <biblioref
	    linkend="xmlann"/>.</para> 
      </answer> 
    </qandaentry>
    <qandaentry remap="FAQ-VALIDWF, validwf" id="validity">
      <question>
	<formalpara>
	  <title>What are these terms DTDless, valid, and
	    well-formed?</title>
	  <para>Well-formed means syntactically correct (DTD or not);
	    valid means a DTD has been used.</para>
	</formalpara>
      </question>
      <answer remap="internalsubset well from formed modelling modeling">
	<para>XML lets you use a Schema or Document Type Definition
	  (DTD) to describe the markup (elements and other constructs)
	  available in any specific type of document. However, the
	  design and construction of Schemas and DTDs can be complex
	  and non-trivial, so XML also lets you work without one.
	  DTDless operation means you can invent markup without having
	  to define it formally, provided you stick to the rules of
	  XML syntax.</para>
	<para>To make this work, a DTDless file is assumed to define
	  its own markup purely by the existence and location of
	  elements where you create them. When an XML application
	  encounters a DTDless file, it builds its internal model of
	  the document structure while it reads it, because it has no
	  Schema or DTD to tell it what to expect. There must
	  therefore be no surprises or ambiguous syntax. To achieve
	  this, the document must be <quote>well-formed</quote> (must
	  follow the rules).</para>
	<para>To understand why this concept is needed, look at
	  standard HTML as an example:</para>
	<itemizedlist>
	  <listitem>
	    <para>The <sgmltag class="gi">img</sgmltag> element is
	      declared (in the DTDs for HTML) as EMPTY, so it doesn't
	      have an end-tag (there is no such thing as <sgmltag
		class="endtag">img</sgmltag>);</para>
	  </listitem>
	  <listitem>
	    <para>Many other HTML elements (such as <sgmltag
		class="gi">p</sgmltag>) allow you to omit
	      the end-tag for brevity when using the SGML version of
	      HTML.</para> 
	  </listitem> 
	  <listitem> 
	    <para>If an XML processor reads an HTML file without
	      knowing this (because it isn't using a DTD), and it
	      encounters an <sgmltag class="starttag">img</sgmltag> or
	      a <sgmltag class="starttag">p</sgmltag> (or any other
	      start-tag), it would have no way to know whether or not
	      to expect an end-tag. This makes it impossible to know
	      if the rest of the file is correct or not, because it
	      has now no evidence of whether it is inside an element
	      or if it has finished with it.</para> 
	  </listitem> 
	</itemizedlist> 
	<para>Well-formed documents therefore
		<emphasis>require</emphasis> start-tags and end-tags
		on every normal element, and any EMPTY elements must
		be made unambiguous, either by using normal start-tags
		and end-tags, or by appending a slash to the name of
		the start-tag before the closing <literal>></literal>
		as a sign that there will be no separate
		end-tag.</para> 
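	<para>As a minimal sketch of the difference (the file name and
	  text are invented for illustration), the first fragment below
	  is acceptable in SGML-based HTML but is not well-formed; the
	  second is its well-formed equivalent:</para>
	<programlisting><![CDATA[
<!-- HTML (SGML): end-tag of p omitted, img has no end-tag -->
<p>First paragraph
<p>Second paragraph <img src="pic.png" alt="A picture">

<!-- well-formed XML: every element explicitly closed -->
<p>First paragraph</p>
<p>Second paragraph <img src="pic.png" alt="A picture"/></p>
]]></programlisting>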
	<para>All XML documents, both DTDless and valid, must be
		well-formed. They must start with an XML Declaration
		if necessary (for example, identifying the character
		encoding or using the Standalone Document
		Declaration):</para> 
	<programlisting><![CDATA[ 
<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?> 
<foo> 
  <bar>...<blort/>...</bar> 
</foo> 
	  ]]></programlisting> 
	<tip xreflabel="David Brownell"> 
	  <para>XML that's just well-formed doesn't need to use a
	    Standalone Document Declaration at all. Such declarations
	    are there to permit certain speedups when processing
	    documents while ignoring external parameter
	    entities&mdash;basically, you can't rely on external
	    declarations in standalone documents. The types that are
	    relevant are entities and attributes. Standalone documents
	    must not require any kind of attribute value normalisation
	    or defaulting, otherwise they are invalid.</para> 
	</tip> 
	<para>It's also possible to use a Document Type Declaration
	  with DTDless files, even though there is no Document Type to
	  refer to: </para> 
	<tip xreflabel="Richard Lander"> 
	  <para>If you need character entities [other than the five
	    built-in ones] in a DTDless file, you can declare them in
	    an internal subset without referencing anything other than
	    the root element type:</para>
	  <programlisting><![CDATA[ 
<?xml version="1.0" standalone="yes"?> 
<!DOCTYPE example [ 
<!ENTITY mdash "---"> 
]> 
<example>Hindsight&mdash;a wonderful thing.</example> 
]]></programlisting> 
	</tip> 
	<tip id="wf"> <title>Rules for well-formedness:</title>
	  <itemizedlist> 
	    <listitem> 
	      <para>All tags must be balanced: that is, every element
		which may contain character data or sub-elements must
		have both the start-tag and the end-tag present
		(omission is not allowed except for EMPTY elements,
		see below);</para> 
	    </listitem> 
	    <listitem> 
	      <para>All attribute values must be in quotes. The
		single-quote character (the apostrophe) may be used if
		the value contains a double-quote character, and vice
		versa. If you need isolated quotes as data as well,
		you can use <sgmltag class="genentity">apos</sgmltag>
		or <sgmltag class="genentity">quot</sgmltag>. Do not
		under any circumstances use the automated typographic
		(<quote>curly</quote>) inverted commas substituted by
		some wordprocessors for quoting attribute
		values.</para> 
	    </listitem> 
	    <listitem> 
	      <para>Any EMPTY elements (eg those with no end-tag like
		HTML's <sgmltag class="gi">img</sgmltag>, <sgmltag
		  class="gi">hr</sgmltag>, and <sgmltag
		  class="gi">br</sgmltag> and others) must
		<emphasis>either</emphasis> end with
		<literal>/></literal>&nbsp;<emphasis>or</emphasis>
		they must look like non-EMPTY elements by having a
		real end-tag (but no content). Example: <sgmltag
		  class="starttag">br</sgmltag> would become either
		<sgmltag class="emptytag">br</sgmltag> or <sgmltag
		  class="starttag">br</sgmltag><sgmltag
		  class="endtag">br</sgmltag> (with nothing in
		between).</para> 
	    </listitem> 
	    <listitem> 
	      <para>There must not be any isolated markup-start
		characters (<literal><![CDATA[<]]></literal> or
		<literal><![CDATA[&]]></literal>) in your text data.
		They must be given as <sgmltag
		  class="genentity">lt</sgmltag> and <sgmltag
		  class="genentity">amp</sgmltag> respectively, and
		the sequence
		<literal>]]</literal><literal><![CDATA[>]]></literal>
		may only occur as the end of a CDATA marked section:
		if you are using it for any other purpose it must be
		given as <literal>]]</literal><sgmltag
		  class="genentity">gt</sgmltag>.</para> 
	    </listitem>
	    <listitem> 
	      <para>Elements must nest inside each other properly (no
		overlapping markup, same as for HTML);</para>
	    </listitem> 
	    <listitem> 
	      <para>DTDless well-formed documents may use attributes
		on any element, but the attributes are all assumed to
		be of type CDATA. You cannot use ID/IDREF attribute
		types for parser-checked cross-referencing in DTDless
		documents.</para> 
	    </listitem> 
	    <listitem> 
	      <para>XML files with no DTD are considered to have
		<sgmltag class="genentity">lt</sgmltag>, <sgmltag
		  class="genentity">gt</sgmltag>, <sgmltag
		  class="genentity">apos</sgmltag>, <sgmltag
		  class="genentity">quot</sgmltag>, and <sgmltag
		  class="genentity">amp</sgmltag> predefined and thus
		available for use. With a DTD, all other character
		entities used must be declared; for interoperability,
		valid documents should also declare these five like
		any others.</para>
	    </listitem> 
	  </itemizedlist> 
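	  <para>A short illustrative instance that follows all these
	    rules might look like this (the element and attribute
	    names are invented for the example):</para>
	  <programlisting><![CDATA[
<?xml version="1.0"?>
<memo status="draft" author='Pat "Red" Smith'>
  <to>Sales &amp; Marketing</to>
  <body>Prices &lt; 10 euro are <emph>not</emph> affected.<br/>
    The sequence ]]&gt; is written with an entity reference.</body>
</memo>
]]></programlisting>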
	</tip> 
	<tip id="valid">
	  <title>Rules for validity</title> 
	  <para>Valid XML files are well-formed files which have a
	    <link linkend="dtds" xreflabel="simple">Document Type
	      Definition (DTD)</link> and which conform to it. They
	    must already be <link xreflabel="simple"
	      linkend="wf">well-formed</link>, so all the rules above
	    apply.</para> 
	  <para>A valid file begins with a Document Type Declaration
	    specifying a DTD (a W3C Schema is normally associated with
	    the document separately: see below). It may have an
	    optional XML Declaration prepended.</para>
	  <programlisting><![CDATA[ 
<?xml version="1.0"?> 
<!DOCTYPE advert SYSTEM "http://www.foo.org/ad.dtd"> 
<advert>
  <headline>...<pic/>...</headline> 
  <text>...</text>
</advert> 
	    ]]></programlisting> 
	</tip> 
	<para id="fpis">The XML Specification predefines an SGML
	  Declaration for XML which is fixed for all instances and is
	  therefore hard-coded into all XML software and never
	  specified separately (except when using an SGML/XML
	  switchable validator like
	  <productname>onsgmls</productname>: see below).</para>
	<tip id="sgmldec"> 
	  <para>The SGML Declaration for XML has been removed from the
	    text of the Specification but is available as <ulink
	      url="http://www.w3.org/TR/NOTE-sgml-xml-971215">a
	      separate document</ulink>. As this appears to suffer
	    occasionally from bitrot or neglect, there is a copy
	    <ulink url="xml-websgml.dec">here (WebSGML TC)</ulink> and
	    <ulink url="xml-enr.dec">here (Extended Naming Rules
	      TC)</ulink>, and a version for
	    <productname>onsgmls</productname>&nbsp;<ulink
	      url="/xml-onsgmls.dec">here</ulink>.</para> 
	</tip> 
	<para>The specified DTD must be accessible to the XML
	  processor using the URI supplied in the SYSTEM Identifier,
	  either by being available locally (ie the user already has a
	  copy on disk), or by being retrievable via the network. Note
	  that DTD specifications <emphasis>must</emphasis> be URIs
	  (local, relative, or absolute). Platform-specific
	  filesystem references (eg
	  <filename>C:\dtds\my.dtd</filename>) are not URIs and cannot
	  be used: use the <filename>file:///C|/dtds/my.dtd</filename>
	  format instead.</para> 
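	<para>As an illustration, each of the following alternative
	  Document Type Declarations uses a SYSTEM Identifier in URI
	  form (the file names and locations are hypothetical, and a
	  document has only one such declaration):</para>
	<programlisting><![CDATA[
<!-- relative URI: DTD in the same directory as the document -->
<!DOCTYPE advert SYSTEM "ad.dtd">

<!-- absolute URI: DTD retrieved over the network -->
<!DOCTYPE advert SYSTEM "http://www.foo.org/ad.dtd">

<!-- absolute URI for a local file on a Windows drive -->
<!DOCTYPE advert SYSTEM "file:///C|/dtds/my.dtd">
]]></programlisting>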
	<para>It is possible (many people would say preferable) to
	  supply a Formal Public Identifier with the PUBLIC keyword,
	  and use an XML Catalog to dereference it, but the
	  Specification mandates a SYSTEM Identifier so this must
	  still be supplied (after the PUBLIC identifier: no further
	  keyword is needed):</para>
	<programlisting><![CDATA[ 
<!DOCTYPE advert PUBLIC	
   "-//Foo, Inc//DTD Advertisements//EN"
   "http://www.foo.org/ad.dtd"> 
<advert>...</advert>
	  ]]></programlisting> 
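	<para>A minimal sketch of a catalog entry that would resolve
	  the Formal Public Identifier above to a local copy, assuming
	  the OASIS XML Catalogs format and a hypothetical local
	  path:</para>
	<programlisting><![CDATA[
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- map the FPI to a local copy of the DTD (path is an assumption) -->
  <public publicId="-//Foo, Inc//DTD Advertisements//EN"
          uri="file:///usr/local/share/dtds/ad.dtd"/>
</catalog>
]]></programlisting>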
	<para>The test for validity is that a validating parser finds
	  no errors in the file: it must conform absolutely to the
	  definitions and declarations in the DTD.</para> 
	<para>XML (W3C) Schemas are not usually linked directly from
	  within an XML document instance in the way that DTDs are:
	  the relevant Schema (XSD file) for a document instance is
	  normally specified to the parser separately, either by file
	  system reference, or using a <ulink
	    url="http://www.w3.org/TR/xmlschema-0/#NS">Target
	    Namespace</ulink>.</para> 
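	<para>Where an instance does carry a hint to the Schema's
	  location, it normally uses attributes from the XML Schema
	  instance namespace. In this sketch the namespace URI and the
	  location of the XSD file are invented for illustration, and
	  a parser is free to ignore the hint:</para>
	<programlisting><![CDATA[
<advert xmlns="http://www.foo.org/ns/advert"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.foo.org/ns/advert
                            http://www.foo.org/ad.xsd">
  <headline>...</headline>
  <text>...</text>
</advert>
]]></programlisting>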
      </answer> 
    </qandaentry>
    <qandaentry id="attributes" remap="attriborelem">
      <question> 
	<formalpara> 
	  <title>Which should I use in my DTD/Schema, attributes or
	    elements?</title> 
	  <para>See <ulink
	      url="http://xml.coverpages.org/elementsAndAttrs.html">http://xml.coverpages.org/elementsAndAttrs.html</ulink></para>
	</formalpara> 
      </question> 
      <answer remap="attributes	versus elements"> 
	<para>There is no single answer to this: a lot depends on what
	  you are designing the document type for.</para> 
	<para>Traditional editorial practice for normal text documents
	  is to put the real text (what would be printed) as character
	  data content, and keep the metadata (information about the
	  text) in attributes, from where they can more easily be
	  isolated for analysis or special treatment like display in
	  the margin or in a mouseover:</para> 
	<programlisting><![CDATA[
<l n="184">
  <spara>Portia</spara>
  <text>The quality of mercy is not strain'd,</text>
</l>
]]></programlisting> 
	<para>But from the systems point of view, there is nothing
	  wrong with storing the data the other way round, especially
	  where the volume of text data on each occasion is relatively
	  small:</para>
	<programlisting><![CDATA[
<line speaker="Portia"
      text="The quality of mercy is not strain'd,">184</line>
]]></programlisting> 
	<para>A lot will depend on what you want to do with the
	  information and which bits of it are most easily accessed
	  by each method. A rule of thumb for conventional text documents
	  is that if the markup were all stripped away, the bare text
	  should still be readable and usable, even if unformatted and
	  inconvenient. For database output, however, or other
	  machine-generated documents like e-commerce transactions,
	  human reading may not be meaningful, so it is perfectly
	  possible to have documents where all the data is in
	  attributes, and the document contains no character data in
	  content models at all.  See <ulink
	    url="http://xml.coverpages.org/elementsAndAttrs.html"></ulink> 
	  for more information.</para> 
	<tip xreflabel="Mike Kay"> 
	  <para>From a user: <quote><emphasis>[&hellip;] do most of
		you out there use element-based or attribute-based
		xml? why?</emphasis></quote></para> 
	  <para>Beginners always ask this question. Those with a
	    little experience express their opinions passionately.
	    Experts tell you there is no right answer. (<ulink
	      url="http://lists.xml.org/archives/xml-dev/200006/msg00293.html"></ulink>)</para>
	</tip> 
      </answer> 
    </qandaentry> 
    <qandaentry	id="sgmlchanges" remap="FAQ-DTD, dtd"> 
      <question>
	<formalpara> 
	  <title>What has changed between SGML and XML?</title> 
	  <para>Stricter syntax and no options.</para> </formalpara>
      </question> 
      <answer>
	<para>The main syntactic change is that EMPTY elements can no
	  longer be written as a bare start-tag: they
	  <emphasis>must</emphasis> use <emphasis>either</emphasis>
	  the Null End-Tag trick (eg <sgmltag class="emptytag">img
	    src="pic"</sgmltag>) <emphasis>or</emphasis> the full
	  end-tag syntax (eg <sgmltag class="starttag">img
	    src="pic"</sgmltag><sgmltag class="endtag">img</sgmltag>),
	  because without a DTD or Schema there is no way for the
	  parser to know not to expect an end-tag. The same applies
	  when an element type is declared as EMPTY in the DTD/Schema:
	  a well-formed document cannot depend on the declaration
	  being read.</para>
	<para>Other syntactic changes are that
	  <emphasis>all</emphasis> attribute values must be quoted;
	  there is no minimisation of attributes or elements; and
	  everything is case-sensitive. One important addition is that
	  multiple ATTLIST  declarations are allowed, so an internal
	  subset can add to the attributes already declared for an
	  element type.</para>
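	<para>For example, assuming a DTD <filename>ad.dtd</filename>
	  that already declares attributes for
	  <sgmltag class="gi">headline</sgmltag> (the names here are
	  hypothetical), an internal subset could add another one like
	  this:</para>
	<programlisting><![CDATA[
<!DOCTYPE advert SYSTEM "ad.dtd" [
<!-- adds to, rather than replaces, the attributes declared in ad.dtd -->
<!ATTLIST headline revision CDATA #IMPLIED>
]>
<advert>
  <headline revision="2">...</headline>
</advert>
]]></programlisting>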
	<para id="restrict">The principal changes in Document Type
	  Definitions (DTDs) are in what you can specify. To simplify
	  the language and make it easier to write processing software, a large
	  number of SGML markup declaration options have been
	  suppressed (see the <link linkend="dtdconv"
	    xreflabel="simple">list of omitted features</link>). The
	  biggest change in vocabulary management is the introduction
	  of W3C Schemas, which allow a level of content-type
	  validation not available in DTDs, and are themselves
	  expressed in XML Document Syntax.</para> 
	<para>The main addition here is <link xreflabel="simple"
	    linkend="namespaces">namespaces</link>, which enable
	  Schemas and documents to distinguish element-type and
	  attribute-type source (ownership, origin, or application).
	  This lets you have element types with the same name but
	  different meanings in the same document, eg
	  <sgmltag>DocBook:table</sgmltag> and
	  <sgmltag>TEI:table</sgmltag>. An extra Name Start Character
	  (the colon) was added in XML Names to allow this. Despite
	  its classification, a colon may only appear in mid-name,
	  <emphasis>not</emphasis> at the start or the end, and the
	  prefix <sgmltag>xml:</sgmltag> is Reserved.</para> 
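	<para>A sketch of how this looks in an instance (the wrapper
	  element and the prefixes are arbitrary choices; the URIs are
	  the ones used by DocBook 5 and TEI P5):</para>
	<programlisting><![CDATA[
<report xmlns:DocBook="http://docbook.org/ns/docbook"
        xmlns:TEI="http://www.tei-c.org/ns/1.0">
  <DocBook:table>...</DocBook:table>
  <TEI:table>...</TEI:table>
</report>
]]></programlisting>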
      </answer> 
    </qandaentry>
    <qandaentry id="scripts" remap="FAQ-JAVA, java">
      <question>
	<formalpara>
	  <title>Can I use JavaScript, ActiveX, etc in XML
	    files?</title>
	  <para>Not in the XML file itself, but via a
	    stylesheet.</para>
	</formalpara>
      </question>
      <answer>
	<para>This will depend on what facilities your users' browsers
	  implement. XML is about describing information; scripting
	  languages and languages for embedded functionality are
	  software which enables the information to be manipulated at
	  the user's end, so these languages do not normally have any
	  place in an XML file itself, but belong in stylesheets like
	  XSL and CSS, from where they can be added to generated HTML.</para>
	<para>XML itself provides a way to define the markup needed to
	  implement scripting languages: as a neutral standard it
	  neither encourages nor discourages their use, and does not
	  favour one language over another, so it is possible to use
	  XML markup to store the program code, from where it can be
	  retrieved by (for example) XSLT and re-expressed in an HTML
	  <sgmltag>script</sgmltag> element.</para>
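	<para>A minimal sketch of such a transformation, assuming a
	  hypothetical source element <sgmltag class="gi">code</sgmltag>
	  with a <sgmltag class="attribute">language</sgmltag>
	  attribute, holding the program text:</para>
	<programlisting><![CDATA[
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- the html output method does not escape script content -->
  <xsl:output method="html"/>
  <!-- copy stored JavaScript into an HTML script element -->
  <xsl:template match="code[@language='javascript']">
    <script type="text/javascript">
      <xsl:value-of select="."/>
    </script>
  </xsl:template>
</xsl:stylesheet>
]]></programlisting>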
	<para>Server-side script embedding, like PHP or ASP, can be
	  used with the relevant server to modify the XML code on the
	  fly, as the document is served, just as they can with HTML.
	  Authors should be aware, however, that embedding server-side
	  scripting may mean the file as stored is not valid XML: it
	  only becomes valid when processed and served, so care must
	  be taken when using validating editors or other software to
	  handle or manage such files. A better solution may be to use
	  an XML serving solution like <ulink
	    url="http://cocoon.apache.org/">Cocoon</ulink>, <ulink
	    url="http://axkit.org/">AxKit</ulink>, or <ulink
	    url="http://www.propylon.com/products/propelx/">PropelX</ulink>.</para>
      </answer>
    </qandaentry>
    <qandaentry id="java" remap="java-gen">
      <question>
	<formalpara>
	  <title>Can I use Java to create or manage XML files?</title>
	  <para>Sure.</para>
	</formalpara>
      </question>
      <answer>
	<para>Yes, any programming language can be used to output data
	  from any source in XML format. There is a growing number of
	  front-ends and back-ends for programming environments and
	  data management environments to automate this. Java is just
	  the most popular one at the moment.</para>
	<para>There is a large body of middleware (APIs) written in
	  Java and other languages for managing data either in XML or
	  with XML input or output. There is a suite of Java tutorials
	  (with source code and explanation) available at <ulink
	    url="http://developerlife.com"></ulink>.</para>
	<note>
	  <para>Please do not mail the FAQ editor with questions about
	    your Java programming bugs. Ask one of the Java newsgroups
	    instead.</para>
	</note>
      </answer>
    </qandaentry>
    <qandaentry id="databases" remap="db">
      <question>
	<formalpara>
	  <title>How do I get XML into or out of my database?</title>
	  <para>Ask your database manufacturer.</para>
	</formalpara>
      </question>
      <answer remap="mysql msql sql oracle db2 server records">
	<para>Ask your database manufacturer: they all provide XML
	  import and export modules to connect XML applications with
	  databases.</para>
	<para>In some trivial cases there will be a 1:1 match
	  between field names in the database table and element type
	  names in the XML Schema or DTD, but in most cases some
	  programming will be required to establish the desired match.
	  This can usually be stored as a procedure so that subsequent
	  uses are simply commands or calls with the relevant
	  parameters.</para>
	<para>Alternatively, most database systems now provide an XML
	  dump format that lets you export a table as-is, for example
	  by surrounding the field values with tags called after the
	  fieldnames.</para>
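	<para>The exact format differs between products, but such a
	  dump typically looks something like this sketch (the table
	  and field names are invented):</para>
	<programlisting><![CDATA[
<customers>
  <row>
    <id>1001</id>
    <name>Mr. Big</name>
    <city>Cork</city>
  </row>
  <row>
    <id>1002</id>
    <name>A. N. Other</name>
    <city>Dublin</city>
  </row>
</customers>
]]></programlisting>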
	<para>In less trivial, but still simple, cases, you could
	  export by writing a report routine that formats the output
	  as an XML document by adding the relevant tags as literals
	  before and after each data value; and you could import by
	  writing an XSLT transformation that formats the XML data
	  as a load file in your database's preferred format.</para>
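	<para>As a sketch of the import direction, this transformation
	  writes one tab-separated line per record. It assumes a
	  hypothetical input whose root element
	  <sgmltag class="gi">customers</sgmltag> contains
	  <sgmltag class="gi">row</sgmltag> elements, each with
	  <sgmltag class="gi">id</sgmltag>,
	  <sgmltag class="gi">name</sgmltag>, and
	  <sgmltag class="gi">city</sgmltag> subelements; your
	  database's load format may well differ:</para>
	<programlisting><![CDATA[
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- one tab-separated line per row element -->
  <xsl:template match="/customers">
    <xsl:for-each select="row">
      <xsl:value-of select="id"/>
      <xsl:text>&#9;</xsl:text>
      <xsl:value-of select="name"/>
      <xsl:text>&#9;</xsl:text>
      <xsl:value-of select="city"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
]]></programlisting>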
	<warning>
	  <para>Users from a database or computer science background
	    should be aware that XML is not a database management
	    system: it is a text markup system. While there are many
	    similarities, some of the concepts of one are simply
	    non-existent in the other: XML does not possess some
	    database-like features in the same way that databases do
	    not possess markup-like ones. It is a common error to
	    believe that XML is a DBMS like Oracle or Access and
	    therefore possesses the same facilities. It
	    doesn't.</para>
	</warning>
	<para id="dbarts">Database users should read the article
	  <biblioref linkend="docdb"/> [thanks to <personname>
	    <firstname>Bart</firstname>
	    <surname>Lateur</surname>
	  </personname> for identifying this.] <personname>
	    <firstname>Ronald</firstname>
	    <surname>Bourret</surname>
	  </personname> also maintains a good resource on XML and
	  Databases discussing native XML databases at <ulink
	    url="http://www.rpbourret.com/xml/XMLAndDatabases.htm"></ulink>.</para>
	<para id="faq:XQL">There is some information about the <ulink
	    url="http://www.w3.org/XML/Query">XQuery</ulink> (XQL)
	  Language in the <link linkend="searching"
	    xreflabel="simple">note on Searching</link>.</para>
      </answer>
    </qandaentry>
    <qandaentry id="namespaces" remap="namespaces">
      <question> 
	<formalpara> 
	  <title>What's a namespace?</title> 
	  <para>A named DTD/Schema fragment identified by a URI
	    (URL).</para> 
	</formalpara> 
      </question> 
      <answer> 
	<tip xreflabel="Randall Fowle"> 
	  <para>A namespace is a collection of element and attribute
	    names identified by a Uniform Resource Identifier
	    reference. The reference may appear in the root
	    element as a value of the <sgmltag
	      class="attribute">xmlns</sgmltag> attribute. For
	    example, the namespace reference for an XML document
	    with a root element <sgmltag class="gi">x</sgmltag>
	    might appear like this:</para>
	  <programlisting><![CDATA[ 
<x xmlns="http://www.company.com/company-schema">
	    ]]></programlisting> 
	  <para>More than one namespace may appear in a single XML
	    document, to allow the same name to be used with
	    different meanings. Each reference can declare a prefix
	    to be used by each name, so the previous example might
	    appear as:</para> 
	  <programlisting><![CDATA[ 
<x xmlns:spc="http://www.company.com/company-schema">
	    ]]></programlisting> 
	  <para>which would nominate the namespace for the
	    <quote>spc</quote> prefix:</para>
	  <programlisting><![CDATA[ 
<spc:name>Mr. Big</spc:name>
	    ]]></programlisting> 
	</tip> 
	<tip xreflabel="James Anderson"> 
	  <para>In general, note that the binding may also be effected
	    by a default value for an attribute in the DTD.</para> 
	  <para id="faq:Namespaces">The reference does not need to be
	    a physical file; it is simply a way to distinguish
	    between namespaces. The reference should tell a person
	    looking at the XML document where to find definitions
	    of the element and attribute names using that
	    particular namespace. <personname>
	      <firstname>Ronald</firstname>
	      <surname>Bourret</surname> </personname> maintains the
	    Namespace FAQ at <ulink id="FAQ:namespaces"
	      url="http://www.rpbourret.com/xml/NamespacesFAQ.htm">http://www.rpbourret.com/xml/NamespacesFAQ.htm</ulink>.</para>
	</tip> 
      </answer> 
    </qandaentry> 
    <qandaentry remap="FAQ-XMLSOFT, xmlsoft" id="software"> 
      <question>
	<formalpara> 
	  <title>What XML software is available?</title> 
	  <para>Thousands of programs: too many to list here.</para>
	</formalpara> 
      </question> 
      <answer remap="vb5 vb6 visual basic"> 
	<para>Hundreds, possibly thousands, of programs. Details are
	  no longer listed in this FAQ as they are now too many and
	  are changing too rapidly to be kept up to date: see the XML
	  Web pages at <ulink
	    url="http://xml.coverpages.org/">http://xml.coverpages.org/</ulink> 
	  and watch for announcements on the <link xreflabel="simple"
	    linkend="discussions">mailing lists and
	    newsgroups</link>.</para> 
	<para>For a detailed guide to some examples of XML programs
	  and the concepts behind them, see the editor's book
	  <biblioref linkend="toolbook"/>.</para> 
	<para>Details of some XML software products are held on the
	  <ulink url="http://xml.coverpages.org/sgml-xml.html">XML Web
	    pages</ulink>. For browsers see the question on <link
	    xreflabel="simple" linkend="browsers">XML Browsers</link>
	  and the details of the <link linkend="discussions"
	    xreflabel="simple">xml-dev mailing list</link> for
	  software developers. Bert Bos keeps <ulink
	    url="http://www.w3.org/XML/notes.html">a list of some XML
	    developments</ulink> in Bison, Flex, Perl, and Python. The
	  long-established conversion and application development
	  engines like Omnimark and SGMLC all have XML capability and
	  they all provide APIs.</para> 
	<tip id="editors"> 
	  <title>Editors</title> 
	  <para>Choosing an editor is one of the hardest tasks,
	    because everyone has different requirements and levels of
	    knowledge, and what appears to be incredibly simple to one
	    user may seem dauntingly difficult to another. All XML
	    editors guide the user in the construction or maintenance
	    of XML documents&mdash;that's their purpose in
	    life.</para> 
	  <para>The simplest ones just keep track of matching pointy
	    brackets, start-tags and end-tags, and balanced quotes,
	    leading to a <link linkend="wf"
	      xreflabel="simple">well-formed</link> file. More
	    powerful editors can read a DTD or Schema and provide menu
	    choices for element manipulation and attribute editing,
	    and prevent the creation of invalid documents. The most
	    powerful ones can also be used for DTD or Schema
	    development, and for XML processing.</para> 
	  <para>Some are text-mode editors&mdash;they show all the
	    markup and the text with nothing hidden, often using
	    colour to distinguish markup characters. Some have a
	    synchronous typographic mode as well, using a stylesheet
	    to format the information, so you appear to be editing a
	    typeset view of the document (incorrectly called
	    <acronym>WYSIWYG</acronym>). Text-mode editors worry some
	    users because the pointy brackets are visible (they think
	    it's programming); synchronous typographic editors worry
	    other people because the pointy brackets are
	    <emphasis>not</emphasis> visible, which makes it hard to
	    see where stuff begins and ends.</para> 
	  <para>The more sophisticated editors are programmable, so
	    the nature and effect of the markup and the user's actions
	    can be limited or enhanced by scripts in JavaScript,
	    VBscript, Python, Tcl, Lisp, etc, even XSLT.</para> 
	  <para>Do <emphasis>not</emphasis> be tempted to use a
	    non-XML editor like <productname>Notepad</productname>,
	    <productname>vi</productname>, or
	    <productname>textedit</productname> for XML documents: it
	    will only end in tears, shame, and recriminations. Get
	    properly-equipped. (Microsoft's separate <productname>XML
	      Notepad</productname> product <emphasis>is</emphasis>
	    usable for editing small instances, but not for DTD or
	    Schema development.)</para> 
	  <para>There is a fairly recent (2004) <ulink
	      url="http://ota.ahds.ac.uk/documents/creating/xml-editors/index.html">comparative 
	      paper on choosing an XML editor</ulink> from Thijs van
	    den Broek which may help (<ulink
	      url="http://xml.silmaril.ie/Choosing%20an%20XML%20editor%3A%20Thijs%20van%20den%20Broek.html">local 
	      copy</ulink>).</para> </tip>
	<para id="faq:XML-Dutch">There is a page of useful links for
	  XML users in Dutch at <ulink id="FAQ:xml-dutch"
	    url="http://xml.beginthier.nl/">http://xml.beginthier.nl/</ulink>.</para>
	<para id="faq:XML-Chinese">Information for developers of
	  Chinese XML systems can be found at the Chinese XML Now!
	  website of Academia Sinica: <ulink id="FAQ:xml-chinese"
	    url="http://www.ascc.net/xml/">http://www.ascc.net/xml/</ulink>. 
	  This site includes a FAQ and test files.</para>
      </answer> 
    </qandaentry> 
    <qandaentry id="docdata" remap="FAQ-API, api, dom"> 
      <question> 
	<formalpara>
	  <title>What is my information? DATA or DOCUMENT?</title> 
	  <para>It depends on what you're using it for.</para>
	</formalpara> 
      </question> 
      <answer remap="nodes apis sax dom data text"> 
	<para>Some important distinctions exist between the major
	  classes of XML applications and the way in which they
	  are used.</para> 
	<para>Two classes of applications are usually referred to as
	  <quote>document</quote> and <quote>data</quote>
	  applications, and this is reflected in the software,
	  which is usually (but not always) aimed at one class
	  or the other.</para> 
	<variablelist> 
	  <varlistentry>
	    <term>Document-style applications</term> <listitem> 
	      <para>These are like traditional publishers' work: text
		and images in a structured environment, with fonts and
		formatting. In most cases this includes Web pages as
		well as material destined for print like books and
		magazines. The hallmark of document applications is
		that they make heavy use of Mixed Content (eg
		subelements in text).</para> 
	    </listitem> 
	  </varlistentry>
	  <varlistentry> 
	    <term>Data-style applications</term>
	    <listitem> 
	      <para>These are found mostly in e-commerce, web
		services, and process or application control, with XML
		being used as a container for information being stored
		or passed between systems, usually unformatted and
		unseen by humans. Their hallmark is the absence of
		Mixed Content, and the prevalence of numeric or
		categorical data.</para> 
	    </listitem> 
	  </varlistentry>
	</variablelist> 
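	<para>The contrast can be seen in two small fragments (the
	  element and attribute names are invented for illustration).
	  The first has Mixed Content, with a subelement embedded in
	  running text; the second has none, and carries only numeric
	  or categorical values:</para>
	<programlisting><![CDATA[
<!-- document-style: text with embedded subelements -->
<para>The <emphasis>quality</emphasis> of mercy is not strain'd.</para>

<!-- data-style: no Mixed Content -->
<order id="A1" status="paid">
  <item sku="123" quantity="2"/>
  <total currency="EUR">19.98</total>
</order>
]]></programlisting>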
	<para>There is a third major area, Web Development, whose
	  requirements are often hybrid, and span the features
	  of both document and data applications because they
	  contain partly static descriptive text and partly
	  dynamic data.</para> 
	<para>While in theory it would be possible to use data-class
	  software to write a novel, or document-class software
	  to create invoices, it would probably be severely
	  suboptimal. Because of the nature of the information
	  used by the two classes, data-class applications tend
	  to use <link xreflabel="simple"
	    linkend="schemas">Schemas</link>, and document-class
	  applications tend to use <link linkend="dtds"
	    xreflabel="simple">DTDs</link>, but there is a
	  considerable degree of overlap.</para> 
	<para>The way in which XML gets used in these two classes is
	  also divided in two: XML can be used manually or under
	  program control.</para> 
	<variablelist> 
	  <varlistentry>
	    <term>Manual usage</term> 
	    <listitem> 
	      <para>This means editing and maintaining the files with
		an editor, from the keyboard, seeing the information
		on the screen as you do so. This is suitable for
		individual documents, especially in the publishing
		field, for web pages, and for developers working on
		single instances such as sample files or web site
		templates. Manual processing also implies running
		production programs like formatters, converters, and
		database queries on a one-by-one basis, using the
		keyboard and mouse in the normal way. Much of the
		software for manual usage can be run from the command
		line, which makes it easy to use for one-off
		applications and in hidden applications like Web
		scripts.</para> 
	    </listitem>
	  </varlistentry> 
	  <varlistentry> 
	    <term>Programmable usage</term> 
	    <listitem> 
	      <para>This means writing programs which call on software
		services from APIs, libraries, or the network to
		handle XML files from inside the program. XML files in
		data applications are almost never edited by hand.
		This is the normal method of operating for e-commerce
		applications, web automation, web services, and other
		process or application controls. There are libraries
		and APIs for many languages, including Java, C, and
		C++ as well as the usual scripting languages like
		Python, Perl, Tcl, Ruby, etc.</para> 
	    </listitem> 
	  </varlistentry>
	</variablelist> 
	<para>In addition to these axes, there are currently two
	  different ways of processing XML, memory-mapped or
	  event-triggered, usually referred to by the names of their
	  original instantiations, the <ulink
	    url="http://www.w3.org/TR/REC-DOM-Level-1"
	    id="dom">Document Object Model (DOM)</ulink> and the
	  <ulink url="http://www.saxproject.org/">Simple API for XML
	    (SAX)</ulink> respectively. Both use a model of document
	  engineering based on a tree-like structure of hierarchical
	  document markup known as a <ulink
	    url="http://xml.coverpages.org/topics.html#groves">Grove</ulink> 
	  (a collection of trees, effectively an in-memory map of the
	  result of parsing the document markup). In this model, every
	  <wordasword>node</wordasword> (item of information) from the
	  outermost element down through every element and attribute
	  to each piece of unmarked text can be identified. For
	  applications using Schemas, a Post-Schema-Validation Infoset
	  (PSVI, equivalent to a grove) is defined, which specifies
	  what information a parser should make available to the
	  application.</para> 
	<tip xreflabel="Joe Fawcett"> 
	  <para>(in article
	    <literal><![CDATA[<eFIrHKtCGHA.2920@tk2msftngp13.phx.gbl>]]></literal>)</para> 
	  <para>Briefly <wordasword>node</wordasword> is a generic
	    term for any of the many types of XML building blocks,
	    including <firstterm>element</firstterm>: <sgmltag
	      class="emptytag">myElement</sgmltag>;
	    <firstterm>attribute</firstterm>: <sgmltag
	      class="emptytag">myElement
	      myAttribute="myValue"</sgmltag>; and <firstterm>text
	      node</firstterm>: <sgmltag class="element"
	      name="myElement">my Text Node</sgmltag></para> 
	  <para>There are also comments [<firstterm>Comment
	      Declarations</firstterm>], <firstterm>Processing
	      Instructions</firstterm> and the invisible
	    <firstterm>Document Node</firstterm> representing the
	    <firstterm>root</firstterm> of the XML document, as
	    well as others.</para> </tip> 
	<para>Grossly oversimplified, a <firstterm>DOM-based
	    application</firstterm> reads an entire XML document
	  into memory and then provides programmable access to
	  every node in every tree in the grove; whereas a
	  <firstterm>SAX-based application</firstterm> reads the
	  XML document sequentially and triggers an event as each
	  node is encountered, running whatever rules or actions
	  have been programmed for it. (In reality it's more
	  complex than that, and the two methods share many
	  concepts.)</para> 
	<para>Both models provide an abstract API for constructing,
	  accessing, and manipulating XML documents. A binding
	  of the abstract API to a particular programming
	  language provides a concrete API. Vendors provide
	  concrete APIs which let you use one or other method to
	  query and manipulate XML documents. Both types of
	  parser have been implemented in many languages and
	  under many operating systems and interfaces.  There
	  are FAQs for both <ulink
	    url="http://www.w3.org/DOM/faq.html"
	    id="FAQ:dom">DOM</ulink> and <ulink
	    url="http://www.saxproject.org/faq.html"
	    id="FAQ:sax">SAX</ulink>.</para> 
      </answer>
    </qandaentry> 
    <qandaentry remap="FAQ-SWCHX, mime" id="serversoftware"> 
      <question> 
	<formalpara> 
	  <title>Do I have to change any of my server software to work
	    with XML?</title> 
	  <para>Make sure your server sends XML files as
	    <literal>text/xml</literal></para> 
	</formalpara>
      </question> 
      <answer remap="content-type media-type media content type http https"> 
	<para>If you are just serving static files, the only changes
	  needed are to make sure your server serves up
	  <filename>.xml</filename>, <filename>.css</filename>,
	  <filename>.dtd</filename>, <filename>.xsl</filename>, and
	  whatever other file types you will use as the correct MIME
	  content (media) types.</para> 
	<para>The details of the settings are specified in <ulink
	    url="ftp://ftp.rfc-editor.org/in-notes/rfc3023.txt">RFC
	    3023</ulink>.  Popular server software like Apache HTTPD
	  knows this already.</para> 
	<para>If not, all that is needed is to edit the
	  <filename>mime-types</filename> file (or its
	  equivalent: as a server operator you already know
	  where to do this, right?) and add or edit the relevant
	  lines for the right media types. In some servers (eg
	  Apache), individual content providers or directory
	  owners may also be able to change the MIME types for
	  specific file types from within their own directories
	  by using directives in a
	  <filename>.htaccess</filename> file. The media types
	  required are:</para> 
	<itemizedlist> 
	  <listitem> 
	    <para><literal>text/xml</literal> for XML documents which
	      are <quote>readable by casual users</quote>;</para>
	  </listitem> 
	  <listitem> 
	    <para><literal>application/xml</literal> for XML documents
	      which are <quote>unreadable by casual
		users</quote>;</para> 
	  </listitem> 
	  <listitem> 
	    <para><literal>text/xml-external-parsed-entity</literal>
	      for external parsed entities such as document
	      fragments (eg separate chapters which make up a book)
	      subject to the readability distinction of
	      <literal>text/xml</literal>;</para> 
	  </listitem>
	  <listitem> 
	    <para><literal>application/xml-external-parsed-entity</literal>
		for external parsed entities subject to the
		readability distinction of
		<literal>application/xml</literal>;</para> 
	  </listitem>
	  <listitem> 
	    <para><literal>application/xml-dtd</literal> for DTD files
		and modules, including character entity sets.</para>
	  </listitem> 
	</itemizedlist> 
	<para>The RFC has further suggestions for the use of the
	  <literal>+xml</literal> media type suffix for
	  identifying ancillary files such as XSLT
	  (<literal>application/xslt+xml</literal>). </para> 
	<para>If you run scripts generating XHTML which you wish to be
	  treated as XML rather than HTML, they may need to be
	  modified to produce the relevant Document Type
	  Declaration as well as the right media type if your
	  application requires them to be validated.</para>
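	<para>As a rough illustration (a sketch, not a requirement
	  of any particular server or script), the start of such
	  generated output for XHTML 1.0 Strict might look like
	  this. The media type in that case is normally
	  <literal>application/xhtml+xml</literal>, which is sent
	  in the HTTP <literal>Content-Type</literal> header
	  rather than in the file itself:</para>
	<programlisting><![CDATA[
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  ...
</html>
	  ]]></programlisting>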
      </answer> 
    </qandaentry> 
    <qandaentry remap="FAQ-SSINCLUDES, ssincludes" id="serverincludes"> 
      <question> 
	<formalpara>
	  <title>Can I still use server-side inclusions?</title> 
	  <para>Yes, just make sure the output conforms to XML</para>
	</formalpara> 
      </question> 
      <answer> 
	<para>Yes, so long as what they generate ends up as part of an
	  XML-conformant file (ie either <link xreflabel="simple"
	    linkend="valid">valid</link> or just <link
	    xreflabel="simple"
	    linkend="wf">well-formed</link>).</para> 
	<para>Server-side tag-replacer scripting languages like shtml,
	  PHP, JSP, ASP, Zope, etc store almost-valid files using
	  comments, Processing Instructions, or non-XML markup, which
	  gets replaced at the point of service by text or XML markup
	  (it is unclear why some of these systems use non-HTML/XML
	  markup). There are also some XML-based preprocessors for
	  formats like <ulink url="http://www.xvrl.org">XVRL</ulink>
	  (eXtensible Value Resolution Language) which resolve
	  specialised references to external data and output a
	  normalised XML file.</para> 
      </answer> 
    </qandaentry> 
    <qandaentry remap="FAQ-CSINCLUDES, csincludes"
	  id="clientincludes"> 
      <question> <formalpara>
	  <title>Can I (and my authors) still use client-side
	    inclusions?</title> 
	  <para>Yes, just make sure the output conforms to XML</para>
	</formalpara> 
      </question> 
      <answer remap="vb5 vb6 visual basic"> 
	<para>The same rule applies as for <link xreflabel="simple"
	    linkend="serverincludes">server-side</link>
	  inclusions, so you need to ensure that any embedded
	  code which gets passed to a third-party engine (eg
	  calls to SQL, VB, Java, etc) does not contain any
	  characters which might be misinterpreted as XML markup
	  (ie no angle brackets or ampersands). Either use a
	  CDATA marked section to avoid your XML application
	  parsing the embedded code, or use the standard
	  <sgmltag class="genentity">lt</sgmltag> and <sgmltag
	    class="genentity">amp</sgmltag> character entity
	  references instead.</para> 
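	<para>As a sketch of the two approaches (the
	  <literal>query</literal> element and the SQL are
	  hypothetical, purely for illustration), the same
	  embedded code can be protected either with a CDATA
	  marked section or with character entity
	  references:</para>
	<programlisting>&lt;!-- 1. CDATA marked section: the parser does not look for markup inside it -->
&lt;query>&lt;![CDATA[SELECT name FROM drugs WHERE maker = 'Smith &amp; Jones' AND price &lt; 10]]&gt;&lt;/query>

&lt;!-- 2. Entity references: the parser turns them back into the literal characters -->
&lt;query>SELECT name FROM drugs WHERE maker = 'Smith &amp;amp; Jones' AND price &amp;lt; 10&lt;/query></programlisting>
	<para>Either way, the third-party engine should receive the
	  same string once the XML has been parsed.</para>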
      </answer> 
    </qandaentry> 
    <qandaentry id="management"> 
      <question> 
	<formalpara>
	  <title>I have to do an overview of XML for my
	    manager/client/investor/advisor. What should I
	    mention?</title> 
	  <para>Non-proprietary multi-purpose flexible markup</para>
	</formalpara> 
      </question> 
      <answer> 
	<tip xreflabel="Tad McClellan"> 
	  <itemizedlist> 
	    <listitem> 
	      <para>XML is <emphasis>not</emphasis> a markup
		language. XML is a <quote>metalanguage</quote>, that
		is, it's a language that lets you define
		<emphasis>your own</emphasis> markup languages (see
		<link xreflabel="simple"
		  linkend="whatishtml">definition</link>).</para>
	    </listitem> 
	    <listitem> 
	      <para>XML <emphasis>is</emphasis> a markup language [two
		(seemingly) contradictory statements one after another
		is an attention-getting device that I'm fond of],
		<emphasis>not</emphasis> a programming language. XML
		is data: it does not <quote>do</quote> anything, it
		has things done to it.</para> 
	    </listitem> 
	    <listitem> 
	      <para>XML is non-proprietary: your data cannot be held
		hostage by someone else.</para> 
	    </listitem> 
	    <listitem> 
	      <para id="multi">XML allows multi-purposing of your
		data.</para> 
	    </listitem> 
	    <listitem> 
	      <para id="sep">Well-designed XML applications most often
		separate <quote>content</quote> from
		<quote>presentation</quote>. You should describe what
		something <emphasis>is</emphasis> rather than what
		something <emphasis>looks like</emphasis> (the
		exception being numerical or categorical data content
		which never gets presented to humans).</para> 
	    </listitem>
	  </itemizedlist> 
	</tip> 
	<para>Saying <quote>the data is in XML</quote> is a relatively
	  useless statement, similar to saying <quote>the book
	    is in a natural language</quote>. To be useful, the
	  former needs to specify <quote>we have used XML to
	    define our own markup language</quote> (and say what
	  it is), similar to specifying <quote>the book is in
	    French</quote>.</para> 
	<para>A classic example of <link xreflabel="simple"
	    linkend="multi">multipurposing</link> and <link
	    xreflabel="simple" linkend="sep">separation</link>
	  that I often use is a pharmaceutical company. They
	  have a large base of data on a particular drug that
	  they need to publish as:</para> 
	<itemizedlist>
	  <listitem> 
	    <para>reports to the FDA;</para> 
	  </listitem> 
	  <listitem> 
	    <para>drug information for publishers of drug
		directories/catalogs;</para> 
	  </listitem> 
	  <listitem> 
	    <para><quote>prescribe me!</quote> brochures to send to
		doctors;</para> 
	  </listitem> 
	  <listitem> 
	    <para>little pieces of paper to tuck into the
	      boxes;</para> 
	  </listitem> 
	  <listitem> 
	    <para>labels on the bottles;</para> 
	  </listitem> 
	  <listitem> 
	    <para>two pages of fine print to follow their ad in
	      Reader's Digest;</para> 
	  </listitem> 
	  <listitem> 
	    <para>instructions to the patient that the local
	      pharmacist prints out;</para> 
	  </listitem> 
	  <listitem> 
	    <para>etc.</para> 
	  </listitem> 
	</itemizedlist> 
	<para>Without separation of content and presentation, they
	  need to maintain essentially identical information in
	  20 places. If they miss a place, people die, lawyers
	  get rich, and the drug company gets poor. With XML (or
	  SGML), they maintain one set of carefully validated
	  information, and write 20 programs to extract and
	  format it for each application. The same 20 programs
	  can now be applied to all the hundreds of drugs that
	  they sell.</para> 
	<para>In the Web development area, the biggest thing that XML
	  offers is fixing what is wrong with HTML:</para>
	<itemizedlist> 
	  <listitem> 
	    <para>browsers allow non-compliant HTML to be
	      presented;</para> 
	  </listitem> 
	  <listitem> 
	    <para>HTML is restricted to a single set of markup
	      (<quote>tagset</quote>).</para> 
	  </listitem>
	</itemizedlist> 
	<para>If you let broken HTML work (be presented), then there
	  is no motivation to fix it. Web pages are therefore
	  tag soup that is useless for further processing. XML
	  specifies that processing must not continue if the XML
	  is non-compliant, so you keep working at it until it
	  complies. This is more work up front, but the result
	  is not a dead-end.</para> 
	<para>If you wanted to mark up the names of things: people,
	  places, companies, etc in HTML, you don't have many
	  choices that allow you to distinguish among them. XML
	  allows you to name things as what they are:</para>
	<programlisting><![CDATA[ 
<person>Charles Goldfarb</person> worked at <company>IBM</company>
	  ]]></programlisting> 
	<para>gives you a flexibility that you don't have with
	  HTML:</para> 
	<programlisting><![CDATA[ 
<B>Charles Goldfarb</B> worked at <B>IBM</B> 
	  ]]></programlisting> 
	<para>With XML you don't have to shoe-horn your data into
	  markup that restricts your options.</para> 
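	<para>Because the names say what each thing is, a
	  stylesheet can treat them differently. A minimal XSLT
	  sketch (the choice of bold and italic output here is
	  only an assumption for illustration) might be:</para>
	<programlisting><![CDATA[
<xsl:template match="person">
  <b><xsl:apply-templates/></b>
</xsl:template>

<xsl:template match="company">
  <i><xsl:apply-templates/></i>
</xsl:template>
	  ]]></programlisting>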
      </answer>
    </qandaentry> 
    <qandaentry id="conformance" remap="test"> 
      <question> 
	<formalpara> 
	  <title>Is there a conformance test suite for XML
	    processors?</title> 
	  <para>Yes, see <ulink
	      url="http://www.oasis-open.org/committees/xmltest/testsuite.htm">http://www.oasis-open.org/committees/xmltest/testsuite.htm</ulink></para>
	</formalpara> 
      </question> 
      <answer> 
	<para><personname> <firstname>James</firstname>
	    <surname>Clark</surname> </personname> has a
	  collection of test cases for testing XML parsers at
	  <ulink
	    url="http://www.jclark.com/xml/">http://www.jclark.com/xml/</ulink>
	  which includes a conformance test against
	  <quote>canonical XML</quote>.</para> 
	<tip xreflabel="Mary Brady" id="conftest"> 
	  <para>A much larger and more comprehensive suite is the
	    NIST/OASIS Conformance Test Suite, available from
	    <ulink
	      url="http://www.oasis-open.org/committees/xmltest/testsuite.htm">http://www.oasis-open.org/committees/xmltest/testsuite.htm</ulink>,
	    which contains contributions from <personname>
	      <firstname>James</firstname> <surname>Clark</surname>
	    </personname>, OASIS and NIST, Sun, and Fuji
	    Xerox.</para> 
	</tip> 
	<tip xreflabel="Carmelo Montanez"> 
	  <para>NIST has developed a number of XSLT/XPath tests, which
	    will be part of the official OASIS XSLT/XPath suite
	    (not yet released).  These tests are available from
	    our web site at <ulink
	      url="http://xw2k.sdct.itl.nist.gov/xml/index.html">http://xw2k.sdct.itl.nist.gov/xml/index.html</ulink>
	    (click on <quote>XSL Testing</quote>). The expected
	    output may be slightly different from one
	    implementation to another.  The OASIS XSLT technical
	    committee has a solution for that problem; however, our
	    tests do not yet implement it. Please
	    forward any comments to <ulink
	      url="mailto:carmelo@nist.gov">carmelo@nist.gov</ulink>.</para> 
	</tip> 
	<tip xreflabel="Jon Noring"> 
	  <para>For those who are interested, I took the current and
	    complete Unicode 3.0 <quote>cast</quote> of characters
	    and their hex codes, and created a simple XML document
	    of it to test XML browsers for Unicode conformity. It
	    is not finished yet&mdash;I need to add comments and
	    to fix the display of rtl characters (ie Hebrew,
	    Arabic). It is found at: <ulink
	      url="http://www.windspun.com/unicode-test/unicode.xml">http://www.windspun.com/unicode-test/unicode.xml</ulink>. It
	    is quite large, almost 900K in size, so be
	    prepared. IE5 renders many of the characters in this
	    XML document&mdash;and for the ones it does render it
	    appears to do so correctly.  I look forward to when
	    Opera will do likewise.  I haven't tested the current
	    version of Mozilla/Netscape for Unicode
	    conformity.</para> 
	</tip> 
      </answer> 
    </qandaentry>
    <qandaentry id="dtdconv" remap="dtdconv"> 
      <question>
	<formalpara> 
	  <title>I've already got SGML DTDs: how do
	    I convert them for use with XML?</title> 
	  <para>Edit by hand or use software like Near+Far
	    Designer.</para> 
	</formalpara> 
      </question> 
      <answer remap="internalsubset"> 
	<para>There are numerous projects to convert common or popular
	  SGML DTDs to XML format (for example, both the <ulink
	    url="http://www.tei-c.org/">TEI DTD</ulink> (Lite and
	  full versions) and the <ulink
	    url="http://www.docbook.org/">DocBook DTD</ulink> are
	  available in both SGML and XML, in Schema and DTD
	  formats).</para> 
	<tip xreflabel="Seán McGrath">
	  <title>To convert SGML DTDs to XML:</title>
	  <orderedlist> 
	    <listitem> 
	      <para>No equivalent of the SGML Declaration. So
		keywords, character set etc are essentially
		fixed;</para> 
	    </listitem> 
	    <listitem> 
	      <para>Tag minimisation is not allowed, so
		<programlisting><![CDATA[
<!ELEMENT x - O (A,B)>
		  ]]></programlisting> becomes
		<programlisting><![CDATA[
<!ELEMENT x (A,B)>
		  ]]></programlisting> and
		<programlisting><![CDATA[
<!ELEMENT x - O EMPTY>
		  ]]></programlisting> becomes
		<programlisting><![CDATA[
<!ELEMENT x EMPTY>
		  ]]></programlisting>;</para> 
	    </listitem>
	    <listitem> 
	      <para id="mixedcont"><sgmltag>#PCDATA</sgmltag> must
		only occur at the extreme left (ie first) in an OR
		model, eg <programlisting><![CDATA[
<!ELEMENT x - - (A|B|#PCDATA|C)>
		  ]]></programlisting> (in SGML) becomes
		<programlisting><![CDATA[
<!ELEMENT x (#PCDATA|A|B|C)*>
		  ]]></programlisting>, and
		<programlisting><![CDATA[
<!ELEMENT x (A,#PCDATA)>
		  ]]></programlisting> is illegal;</para>
	    </listitem> 
	    <listitem> 
	      <para>No CDATA, RCDATA elements [declared
		content];</para> 
	    </listitem> 
	    <listitem> 
	      <para>Some SGML attribute types are not allowed in XML
		eg NUTOKEN;</para> 
	    </listitem> 
	    <listitem> 
	      <para>Some SGML attribute defaults are not allowed in
		XML eg CONREF and CURRENT;</para> 
	    </listitem> 
	    <listitem> 
	      <para>Comments cannot be inline to declarations like
		<programlisting><![CDATA[
<!ELEMENT x - - (A,B) -- an SGML comment in a declaration -->
		  ]]></programlisting>;</para> 
	    </listitem> 
	    <listitem> 
	      <para>A whole bunch of SGML optional features are not
		present in XML: all forms of tag minimisation
		(OMITTAG, DATATAG, SHORTREF, etc); Link Process
		Definitions; Multiple DTDs per document; and many
		more: see <ulink
		  url="http://www.w3.org/TR/NOTE-sgml-xml-971215"
		  id="howto"></ulink> for the list of bits of SGML that
		were removed for XML;</para> 
	    </listitem> 
	    <listitem> 
	      <para>And [nearly] last but not least, no CONCUR!</para>
	    </listitem> 
	    <listitem> 
	      <para>There are some important differences between the
		internal and external subset portion of a DTD in XML:
		Marked Sections can only occur in the external subset;
		and Parameter Entities must be used to replace entire
		declarations in the internal subset portion of a DTD,
		eg the following is invalid XML:</para>
	      <programlisting><![CDATA[ 
<!DOCTYPE x [ 
<!ENTITY % modelx "(A|B)*"> 
<!ELEMENT x %modelx;> 
]> 
<x></x>
		]]></programlisting> 
	    </listitem> 
	  </orderedlist> 
	  <para>For more information, see <biblioref
	      linkend="xmlexample"/>.</para> 
	</tip> 
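	<para>As a sketch of the last point, one way to make that
	  example acceptable in the internal subset is to have
	  the Parameter Entity hold the whole declaration (the
	  entity name <literal>declx</literal> is
	  hypothetical):</para>
	<programlisting><![CDATA[
<!DOCTYPE x [
<!ENTITY % declx "<!ELEMENT x (A|B)*>">
%declx;
]>
<x></x>
	  ]]></programlisting>
	<para>The alternative is to move the declarations to an
	  external DTD file, where a Parameter Entity reference
	  inside a declaration is allowed.</para>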
      </answer>
    </qandaentry> 
    <qandaentry id="dtdincludes" remap="includes"> 
      <question> 
	<formalpara> 
	  <title>How do I include one DTD (or fragment) in
	    another?</title> 
	  <para>Use a parameter entity, same as for SGML</para>
	</formalpara> 
      </question> 
      <answer> 
	<para>This works exactly the same as for SGML. First you
	  declare the entity you want to include, and then you
	  reference it by name as a parameter entity:</para>
	<programlisting><![CDATA[ 
<!ENTITY % mylists SYSTEM "dtds/listfrag.ent"> 
... 
%mylists;
	  ]]></programlisting> 
	<para>Such declarations traditionally go all together towards
	  the top of the main DTD file, where they can be
	  managed and maintained, but this is not essential so
	  long as they are declared before they are used. You
	  use Parameter Entity Syntax for this (the percent
	  sign) because the file is to be included at DTD
	  compile time, not when the document instance itself is
	  parsed.</para> 
	<para>Note that a URI is compulsory in XML as the System
	  Identifier for all external file references: standard
	  rules for dereferencing URIs apply (assume the same
	  method, server, and directory as the containing
	  document). A Formal Public Identifier can also be
	  used, following the same rules as <link linkend="fpis"
	    xreflabel="simple">elsewhere</link>.</para> 
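	<para>For illustration, the same declaration with a Formal
	  Public Identifier as well as the System Identifier
	  might read as follows (the FPI shown is made up for
	  this example):</para>
	<programlisting><![CDATA[
<!ENTITY % mylists PUBLIC "-//Example//ENTITIES List Fragments//EN"
  "dtds/listfrag.ent">
...
%mylists;
	  ]]></programlisting>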
      </answer>
    </qandaentry> 
    <qandaentry id="conditionals" >
      <question> 
	<formalpara> 
	  <title>How can I include a conditional statement in my
	    XML?</title> 
	  <para>You can't, as such: XML isn't a programming language.</para> 
	</formalpara> 
      </question> 
      <answer remap="conditionals conditional logic"> 
	<para>You can't, as such: <link linkend="execute"
	    xreflabel="simple">XML isn't a programming
	    language</link>, so you can't say things like</para>
	<programlisting conformance="no"><![CDATA[ 
<foo if{DB}="A">bar</foo> 
	  ]]></programlisting>
	<para>But you can have conditional criteria in a Schema, DTD, or
	    a processor, and some DTDs provide attributes for
	    conditional processing.</para>
	<para>If you need to make an element optional, based on some
	  internal or external criteria, you can do so in a
	  Schema. DTDs have no internal referential mechanism,
	  so it isn't possible to express this kind of
	  conditionality in a DTD at the individual element
	  level.</para> 
	<para>It <emphasis>is</emphasis> possible to express
	  presence-or-absence conditionality in a DTD for the
	  whole document, by using Parameter Entities as Boolean
	  switches to include or ignore certain sections of the
	  DTD based on settings either hardwired in the DTD or
	  supplied in the internal subset. Both the TEI and
	  DocBook DTDs have used this mechanism to implement
	  modularity.</para> 
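	<para>A minimal sketch of such a switch follows; the file
	  name <filename>doc.dtd</filename>, the entity name
	  <literal>draft</literal>, and the element and attribute
	  names are all hypothetical. In the external DTD, the
	  attribute is declared only if the switch is set to
	  <literal>INCLUDE</literal>:</para>
	<programlisting>&lt;!ENTITY % draft "IGNORE">
&lt;![%draft;[
&lt;!ATTLIST para status (new|changed|deleted) #IMPLIED>
]]&gt;</programlisting>
	<para>A document can then turn the switch on by declaring
	  the entity in its internal subset, which is read
	  before the external DTD, so that its declaration takes
	  precedence:</para>
	<programlisting><![CDATA[
<!DOCTYPE doc SYSTEM "doc.dtd" [
<!ENTITY % draft "INCLUDE">
]>
	  ]]></programlisting>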
	<para>Alternatively you can make the element entirely optional
	  in the DTD or Schema, and provide code in your
	  processing software that checks for its presence or
	  absence. This defers the checking until the processing
	  stage: one of the reasons for Schemas is to provide
	  this kind of checking at the time of document creation
	  or editing.</para>
	<para>In processing languages such as XSLT, there are
	  constructs for conditional processing, both for simple IFs and
	  for exclusive case-by-case choices:</para>
	<programlisting><![CDATA[
<xsl:if test="@foo='bar'">
  <xsl:text>Hello, world!</xsl:text>
</xsl:if>

<xsl:choose>
  <xsl:when test="$type=1">
    <xsl:apply-templates select="//*[@class='special']"/>
  </xsl:when>
  <xsl:when test="$type=2">
    <xsl:apply-templates select="/foo/bar"/>
  </xsl:when>
  <xsl:otherwise>
    <xsl:apply-templates/>
  </xsl:otherwise>
</xsl:choose>
	]]></programlisting>
	<para>DocBook and many other DTDs and Schemas provide
	  attributes on some elements for the specification of
	  <firstterm>effectivities</firstterm>, saying which parts of
	  the document apply in which circumstances. Processing
	  software can then isolate these and process them
	  accordingly.</para>
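	<para>In DocBook, for example, the <literal>os</literal>
	  attribute is one of these. In the sketch below the
	  wording of the paragraphs is invented for
	  illustration:</para>
	<programlisting><![CDATA[
<para os="linux">Install the package with your package manager.</para>
<para os="windows">Run the installer program.</para>
	  ]]></programlisting>
	<para>A processor could then select the relevant version
	  with a test like the XSLT ones above, assuming a
	  hypothetical stylesheet parameter
	  <literal>$target</literal> and HTML output:</para>
	<programlisting><![CDATA[
<xsl:template match="para[@os]">
  <xsl:if test="@os=$target">
    <p><xsl:apply-templates/></p>
  </xsl:if>
</xsl:template>
	  ]]></programlisting>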
      </answer> 
    </qandaentry> 
    <qandaentry id="edi" remap="edi"> 
      <question> 
	<formalpara>
	  <title>What's the story on XML and EDI?</title> 
	  <para>Getting there: still needs more work and
	    agreement.</para> 
	</formalpara> 
      </question> 
      <answer> 
	<para>Electronic Data Interchange has been used in e-commerce
	  for many years to exchange documents between commercial
	  partners to a transaction. It requires special proprietary
	  software and is prohibitively expensive to implement for
	  small and medium-sized enterprises. There are moves to
	  enable EDI documents to travel inside XML, as well as
	  proposals to replace the existing EDI formats with XML ones.
	  There are guideline documents at  <ulink
	    url="http://www.eccnet.com/xmledi/guidelines-styled.xml"></ulink> 
	  and <ulink
	    url="http://www.geocities.com/WallStreet/Floor/5815/guide.htm"></ulink>.</para> 
	
	<para>Probably the biggest effect on EDI is the rise of
	  standardisation attempts for XML business documents and
	  transactions. The standard jointly sponsored by OASIS and
	  United Nations/CEFACT is <ulink
	    url="http://www.ebxml.org/">ebXML</ulink> (Electronic
	  Business XML) which provides Schemas for the common
	  commercial transaction document types. Normal office
	  documents (letters, reports, spreadsheets, etc) are already
	  covered by the formats in the charge of the OASIS
	  Open Office XML Formats TC, detailed <link
	    xreflabel="simple" linkend="officeapps">above</link>.
	  Other standards such as <ulink
	    url="http://www.openapplications.org">OAGI</ulink> and
	  <ulink url="http://www.rosettanet.org">RosettaNet</ulink>
	  are undergoing interoperability testing with ebXML.</para> 
	<para>In addition to full standards, there are many sets of
	  shims, interoperability tools, and component libraries such
	  as the XML Common Business Library (<ulink
	    url="http://www.xcbl.org/">xCBL</ulink>).</para>
      </answer> 
    </qandaentry> 
  </qandadiv> 
  <qandadiv id="appendices" remap="FAQ-FORM, app">
    <title>Appendices</title> 
    <qandaentry id="bibliography"> 
      <question> 
	<formalpara>
	  <title>References</title> 
	  <para>There is a much larger XML and SGML bibliography at
	    <ulink
	      url="http://xml.coverpages.org/biblio.html"></ulink>.</para>
	</formalpara> 
      </question> 
      <answer> 
	<para>This list covers only documents directly referenced in
	  this FAQ.</para> 
	<bibliodiv> 
	  <biblioentry id="toolbook" role="book"> 
	    <author>
	      <firstname>Peter</firstname> 
	      <surname>Flynn</surname>
	    </author> 
	    <title>Understanding SGML and XML Tools</title>
	    <publisher>
	      <publishername>Kluwer</publishername> 
	      <address>Boston, MA</address> 
	    </publisher> 
	    <date>1998</date>
	    <isbn>0-7923-8169-6</isbn>
	    <releaseinfo>http://www.amazon.com/exec/obidos/tg/detail/-/0792381696/qid=1128202814/sr=1-1/ref=sr_1_1/102-0476289-3244914?v=glance&amp;s=books</releaseinfo>
	  </biblioentry> 
	  <biblioentry id="devdtd" role="book">
	    <authorgroup> 
	      <author> 
		<firstname>Eve</firstname>
		<surname>Maler</surname> 
	      </author> 
	      <author>
		<firstname>Jeanne</firstname> 
		<surname remap="preserve">el Andaloussi</surname> 
	      </author>
	    </authorgroup> 
	    <title>Developing SGML DTDs</title>
	    <subtitle>From Text to Model to Markup</subtitle>
	    <publisher> 
	      <publishername>Prentice Hall PTR</publishername> 
	      <address>Upper Saddle River, NJ</address> 
	    </publisher> 
	    <date>1995</date>
	    <isbn>0133098818</isbn>
	    <releaseinfo>http://www.amazon.com/exec/obidos/tg/detail/-/0133098818/qid=1104447963/sr=8-1/ref=sr_8_xs_ap_i1_xgl14/002-9386245-9385639?v=glance&amp;s=books&amp;n=507846</releaseinfo>
	  </biblioentry> 
	  <biblioentry id="esl" role="book">
	    <author> 
	      <firstname>Lynne</firstname>
	      <surname>Truss</surname> 
	    </author> 
	    <title>Eats, Shoots &ampers; Leaves</title> 
	    <subtitle>The Zero-Tolerance Approach to
	      Punctuation</subtitle> 
	    <publisher>
	      <publishername>Profile Books</publishername>
	      <address>London</address> 
	    </publisher>
	    <date>2003</date> 
	    <isbn>1-86197-612-7</isbn>
	    <releaseinfo>http://www.amazon.com/exec/obidos/tg/detail/-/1592400876/qid=1104449308/sr=8-1/ref=pd_csp_1/002-9386245-9385639?v=glance&amp;s=books&amp;n=507846</releaseinfo>
	  </biblioentry> 
	  <biblioentry id="docdb" role="inproceedings"> 
	    <articleinfo> 
	      <authorgroup>
		<author> 
		  <firstname>Airi</firstname>
		  <surname>Salminen</surname> 
		</author> 
		<author>
		  <firstname>Frank</firstname>
		  <surname>Tompa</surname>
		</author> 
	      </authorgroup> 
	      <title>Requirements for XML Document Database
		Systems</title>
	      <releaseinfo>http://db.uwaterloo.ca/~fwtompa/.papers/xmldb-desiderata.pdf</releaseinfo>
	    </articleinfo> 
	    <confgroup> 
	      <conftitle>ACM Symposium on Document
		Engineering</conftitle> 
	      <address>Atlanta, GA</address> 
	      <confdates>November 2001</confdates>
	    </confgroup> 
	  </biblioentry> 
	  <biblioentry id="xmlann" role="book"> 
	    <author>
	      <firstname>Bob</firstname>
	      <surname>DuCharme</surname> 
	    </author> 
	    <title>XML: The Annotated Specification</title> 
	    <publisher>
	      <publishername>Prentice Hall PTR</publishername>
	      <address>Upper Saddle River, NJ</address> 
	    </publisher>
	    <date>1999</date> 
	    <isbn>0-13-082676-6</isbn>
	    <releaseinfo>http://www.snee.com/bob/xmlann</releaseinfo>
	  </biblioentry> 
	  <biblioentry id="xmlexample" role="book"> 
	    <author>
	      <firstname>Se&aacute;n</firstname>
	      <surname>McGrath</surname> 
	    </author> 
	    <title>XML by Example</title> 
	    <subtitle>Building E-Commerce Applications</subtitle> 
	    <publisher>
	      <publishername>Prentice Hall PTR</publishername>
	      <address>Upper Saddle River, NJ</address> 
	    </publisher>
	    <date>1998</date> 
	    <isbn>0139601627</isbn>
	    <releaseinfo>http://www.amazon.com/exec/obidos/tg/detail/-/0139601627/qid=1104449400/sr=8-1/ref=sr_8_xs_ap_i1_xgl14/002-9386245-9385639?v=glance&amp;s=books&amp;n=507846</releaseinfo>
	  </biblioentry> 
	  <biblioentry id="nopres" role="inproceedings"> 
	    <articleinfo> 
	      <author>
		<firstname>Peter</firstname> 
		<surname>Flynn</surname>
	      </author> 
	      <title>Making more use of markup</title>
	      <artpagenums>158&ndash;167</artpagenums>
	      <releaseinfo>http://imbolc.ucc.ie/~pflynn/articles/moreuse.html</releaseinfo>
	    </articleinfo> 
	    <confgroup>
	      <conftitle>SGML'95</conftitle> 
	      <address>Boston, MA</address> 
	      <confdates>December 1995</confdates>
	    </confgroup> 
	  </biblioentry> 
	  <biblioentry id="richsgml" role="inproceedings"> 
	    <articleinfo> 
	      <author>
		<firstname>Chet</firstname> 
		<surname>Ensign</surname>
	      </author> 
	      <title>If SGML Is So Smart, How Come It Ain't
		Rich?</title>
	      <artpagenums>136&ndash;145</artpagenums>
	    </articleinfo> 
	    <confgroup>
	      <conftitle>SGML'95</conftitle> 
	      <address>Boston, MA</address> 
	      <confdates>December 1995</confdates>
	    </confgroup> 
	  </biblioentry> 
	  <biblioentry role="book" id="fox"> 
	    <author>
	      <firstname>Dave</firstname>
	      <surname>Pawson</surname> 
	    </author>
	    <title>XSL-FO</title> 
	    <subtitle>Making XML Look Good in Print</subtitle> 
	    <publisher>
	      <publishername>O'Reilly</publishername>
	      <address>Sebastopol, CA</address> 
	    </publisher>
	    <date>2002</date> 
	    <isbn>0-596-00355-2</isbn>
	    <releaseinfo>http://www.oreilly.com/catalog/xslfo/</releaseinfo>
	  </biblioentry> 
	  <biblioentry id="tei" role="inbook">
	    <articleinfo> 
	      <authorgroup> 
		<editor>
		  <firstname>Michael</firstname>
		  <surname>Sperberg-McQueen</surname> 
		</editor>
		<editor>
		  <firstname>Lou</firstname>
		  <surname>Burnard</surname>
		</editor> 
	      </authorgroup> 
	      <title>Gentle Introduction to XML</title>
	      <releaseinfo>http://www.tei-c.org/release/doc/tei-p5-doc/en/html/SG.html</releaseinfo>
	      <artpagenums></artpagenums> 
	    </articleinfo> 
	    <title>TEI P4: Guidelines for Electronic Text Encoding and
	      Interchange</title> 
	    <publisher> 
	      <publishername>Text Encoding Initiative
		Consortium</publishername>
	      <address>Oxford, Providence, Charlottesville,
		Bergen</address> 
	    </publisher> 
	    <date>2002</date>
	  </biblioentry> 
	  <biblioentry id="thespec" role="techreport"> 
	    <authorgroup> 
	      <editor>
		<firstname>Tim</firstname> 
		<surname>Bray</surname>
	      </editor> 
	      <editor> 
		<firstname>Jean</firstname>
		<surname>Paoli</surname> 
	      </editor> 
	      <editor>
		<firstname>CM</firstname>
		<surname>Sperberg-McQueen</surname> 
	      </editor> 
	      <editor>
		<firstname>Eve</firstname> 
		<surname>Maler</surname>
	      </editor> 
	      <editor> 
		<firstname>François</firstname>
		<surname>Yergeau</surname> 
	      </editor> 
	    </authorgroup>
	    <title>Extensible Markup Language (XML) 1.0</title>
	    <releaseinfo>http://www.w3.org/TR/REC-xml/</releaseinfo>
	    <publisher> 
	      <publishername>W3C</publishername>
	      <address>Boston</address> 
	    </publisher>
	    <edition>3rd</edition> 
	    <date>4 February 2004</date>
	  </biblioentry> 
	</bibliodiv> 
      </answer> 
    </qandaentry>
    <qandaentry id="future"> 
      <question> 
	<formalpara>
	  <title>How far are we going?</title> 
	  <para>To infinity and beyond!</para> 
	</formalpara>
      </question> 
      <answer remap="sex pornography pornographic pictures anal"> 
	<para>Running a search facility on this FAQ has produced some
	  interesting results from the notifications of both
	  matches and non-matches. <ulink
	    url="http://dylan.tweney.com/prophet/981019prophet.htm">Sex</ulink>
	  has dropped to 10th place.</para> 
	<itemizedlist>
	  <listitem> 
	    <para>The most frequent request (5&percnt; overall) is now
	      individual characters, either as character entity
	      names or as numeric values, or one of the markup
	      characters (<literal>&lt;</literal> or
	      <literal>&amp;</literal>).</para> 
	  </listitem>
	  <listitem> 
	    <para>In recent months the second largest category has
		stabilised as the word <literal>dtd</literal>
		(3&percnt;).</para> 
	  </listitem> 
	  <listitem> 
	    <para>Third comes CDATA at 2&percnt; (hardly surprising
	      given the abuse so widespread).</para> 
	  </listitem>
	  <listitem> 
	    <para>Fourth equal at 1&percnt; come XSD and XSL, neither
	      of which is dealt with in detail here as they have
	      their own FAQs.</para> 
	  </listitem> 
	</itemizedlist> 
	<para id="lite">The entertaining bits are deep in the tail,
	  like the user from Broomfield, CO, who typed in
	  <quote>How can I analyze a telephone to understand it
	    better?</quote> (taking it to pieces is probably a
	  start); the one from the Philippines who wanted to
	  know how to <quote>describe the five fundamental
	    interactions between X-rays or Gamma rays with
	    matter</quote> (try DS9); the one from Culver City,
	  CA, who asked <quote>how are echinodermata organisms
	    different from lower invertebrates?</quote> (like I
	  care?); and the one from Lexington, KY, who asked
	  <quote>How do I add two text fields?</quote> (got me
	  there, d00d, how do you multiply a lettuce and a
	  cucumber?).</para> 
      </answer> 
      <answer>
	<programlisting><![CDATA[ 
Date: Fri, 09 Jul 1999 14:26:17 -0500 (EST) 
From: The Internet Oracle <oracle@cs.indiana.edu> 
Subject: The Oracle replies!
To: <address-removed> 
X-Planation: X-Face can be viewed with ftp.cs.indiana.edu:/pub/faces. 

The Internet Oracle has pondered your question
deeply. Your question was: 

> Oh Oracle most wise, all-seeing and all-knowing, 
> in thy wisdom grant me a response to my request: 
> 
> Is XML really going to cut the mustard? 

And in response, thus spake the Oracle: 
Well, since XML is a subset of SGML, and SGML 
has a <cut mustard> tag, I'd have to say yes.

You owe the Oracle a B1FF parser. 
	  ]]></programlisting> 
	<para>For the SGML-curious among our readers, that's:</para>
	<programlisting><![CDATA[ 
<!element cut - o empty>
<!attlist cut mustard (mustard) #required> 
<!-- :-) --> 
	  ]]></programlisting> 
      </answer> 
    </qandaentry>
    <qandaentry id="glossary"> 
      <question> 
	<formalpara>
	  <title>Not the XML FAQ</title> 
	  <para>Infrequently Asked Questions</para> </formalpara>
      </question> 
      <answer remap="infrequently"> 
	<para>This is a list of topics that people have asked about or
	  searched for in relation to the XML FAQ, which are not
	  necessarily directly connected to XML and its
	  technology, nor <emphasis>frequently</emphasis> asked
	  questions. It also includes some fall-back definitions
	  for the benefit of users who have come to XML by
	  different routes and may not have been exposed to any
	  document publishing background.</para> 
	<para>Readers may also want to look at <personname>
	    <firstname>Joe</firstname> <surname>English</surname>
	  </personname>'s <quote>Not the SGML FAQ</quote> at
	  <ulink
	    url="http://www.flightlab.com/~joe/sgml/faq-not.txt"></ulink>.</para>
	<glosslist> 
	  <glossentry id="xls"> 
	    <glossterm remap="xls export convert">XLS</glossterm> 
	    <glossdef> 
	      <para>Microsoft proprietary spreadsheet file format
		written by their <productname>Excel</productname>
		spreadsheet program. XLS files are not XML files, but
		modern versions of <productname>Excel</productname> 
		save their data in Microsoft's own Office XML
		format (OOXML).</para> 
	      <para>Do not confuse XLS with XSL (see <link
		linkend="style"></link>).</para> 
	    </glossdef>
	  </glossentry> 
	  <glossentry id="xml"> 
	    <glossterm
	      remap="faq">XML</glossterm> <glossdef> 
	      <para>This is the XML FAQ. Everything in it is about
		XML. For introductory explanations, see <link
		  linkend="basics" xreflabel="simple"></link>.</para>
	    </glossdef>
	  </glossentry> 
	  <glossentry id="color"> 
	    <glossterm
	      remap="colors colours">Colour</glossterm> 
	    <glossdef> 
	      <para>XML is designed for identifying information about
		the structure and content of text documents, rather
		than their appearance. Although it is perfectly
		possible to identify and store information about
		appearances, this information is usually kept in a CSS
		or XSL stylesheet. If you need to record information
		about the formatting or appearance of an existing
		document, there are features in the <ulink
		url="http://www.tei-c.org/">TEI</ulink> Schema/DTD for
		doing so.</para> </glossdef> </glossentry> <glossentry
		id="editing"> <glossterm remap="opening
		docs">Editing</glossterm> <glossdef> 
	      <para>To edit (open) an XML file you should use an <link
		xreflabel="simple" linkend="editors">XML
		editor</link>. It is possible to open an XML file
		using any standard plaintext editor or even a
		wordprocessor, but be aware that they may try to
		reformat the file incorrectly because they don't
		understand XML.</para> </glossdef> </glossentry>
		<glossentry id="games"> <glossterm
		remap="nintendo">Games</glossterm> <glossdef> 
	      <para>I am not aware of any computer games written using
		XML yet, although XML is used in some of the
		internal control and configuration files used by
		games.</para> </glossdef> </glossentry> <glossentry
		id="soap"> <glossterm remap="simple object access
		protocol">SOAP</glossterm> <glossdef> 
	      <para>A <ulink url="http://www.w3.org/TR/soap/">W3C
		standard</ulink> for the <quote>definition of the
		XML-based information which can be used for exchanging
		structured and typed information between peers in a
		decentralized, distributed environment</quote>. Most
		commonly used in Web Services for
		message-passing.</para> 
	      <para>Originally the <ulink
		url="http://xml.coverpages.org/soap.html">Simple
		Object Access Protocol</ulink>, the acronym is now
		undefined, or expressed as the Service-Oriented Access
		Protocol.</para> </glossdef> </glossentry> <glossentry
		id="serving"> <glossterm remap="text/xml">Serving
		XML</glossterm> <glossdef> 
	      <para>See <link linkend="serversoftware"></link></para>
		</glossdef> </glossentry> <glossentry id="newlines">
		<glossterm remap="newlines embed linebreaks line
		breaks end crlf lfcr lf cr line feed linefeed carriage
		returns cr-lf">Line breaks</glossterm> <glossdef> 
	      <para>XML files can be created using any of the three
		standard newline representations: CR (Mac), LF (Unix),
		or CR/LF (Windows). Use of anything else may lead to
		undefined behaviour (so old DOS editors that use LF/CR
		may create unusable files).</para> 
	      <para>Line-breaking in your output is governed by your
		rendering engine (eg a browser, a typesetter,
		etc). Your DTD or Schema may define special elements
		or entities to be used on rare occasions when a forced
		linebreak is required, but this is not normally
		something done in XML (exception: reconstruction of
		historical documents using the TEI).</para>
		</glossdef> </glossentry> <glossentry id="protocol">
		<glossterm>XML Protocol</glossterm> <glossdef> 
	      <para>There is a Working Group for Web Services at the
		W3C, and part of their remit is to work on an XML
		Protocol. See <ulink
		url="http://www.w3.org/2000/xp/Group/"></ulink> for
		details.</para> </glossdef> </glossentry> <glossentry
		id="javascript"> <glossterm>Javascript</glossterm>
		<glossdef> 
	      <para>ECMAScript (to give it its real name) has nothing
		to do with the Java language. It's designed to run
		inside browser windows, navigating or acting on the
		markup of a page to create dynamic content, validate
		forms, or instantiate objects in ways that are not
		possible with static HTML. It is also designed so that
		it cannot write to the user's local filesystem, for
		obvious security reasons, so it cannot easily be used
		to create XML files locally, although there are some
		back-doors in Microsoft software which allow modified
		pages to be saved to disk.</para> </glossdef>
		</glossentry> <glossentry id="tmx"> <glossterm
		remap="oscar">TMX</glossterm> <glossdef> 
	      <para><ulink
		url="http://www.lisa.org/tmx/tmx.htm">TMX</ulink> is a
		standard method to describe translation memory data
		that is being exchanged among tools and/or translation
		vendors for human-language translation (part of the
		OSCAR project from LISA).</para> </glossdef>
		</glossentry> <glossentry id="xul"> <glossterm
		remap="interface">XUL</glossterm> <glossdef> 
	      <para>The <ulink
		url="http://www.mozilla.org/projects/xul/">XML User
		Interface Language</ulink>, designed for specifying
		the user interface in the Mozilla browser.</para>
		</glossdef> </glossentry> <glossentry id="xmlhttp">
		<glossterm remap="ajax">XMLHTTP</glossterm> <glossdef> 
	      <para>Feature implemented in MSXML and elsewhere to
		allow the retrieval of web pages, binary data, or
		scripted responses under program control (like using
		<ulink
		  url="http://curl.haxx.se/"><productname>curl</productname></ulink>, 
		<ulink
		  url="http://www.gnu.org/software/wget/wget.html"><productname>wget</productname></ulink> 
		or <ulink
		  url="http://packages.debian.org/lenny/dog"><productname>dog</productname></ulink> 
		in a shell script). Used asynchronously in <link
		  xreflabel="simple" linkend="ajax">AJaX</link>
		applications to pre-fetch data, saving time to make it
		appear that an application is operating
		locally.</para> </glossdef> </glossentry> <glossentry
		id="white-space"> <glossterm remap="whitespace white
		spaces tabs xml:space">White-space</glossterm>
		<glossdef> 
	      <para>See <link linkend="whitespace"></link>.</para>
		</glossdef> </glossentry> <glossentry id="searching" >
		<glossterm remap="extracting">Searching</glossterm>
		<glossdef> 
	      <para>You can search individual XML files on a
		sequential, stand-alone, unindexed command-line basis
		using programs such as <ulink
		  url="http://www.cogsci.ed.ac.uk/~richard/ltxml2/lxgrep.html"><productname>lxgrep</productname></ulink> 
		or <ulink
		  url="http://www.cogsci.ed.ac.uk/~richard/ltxml2/lxprintf.html"><productname>lxprintf</productname></ulink>, 
		parts of the <ulink
		  url="http://www.ltg.ed.ac.uk/software/ltxml2">LTXML2</ulink> 
		toolkit. Many editors include a search facility as well.</para> 
	      <para><link linkend="style">XSLT</link> allows a limited
		search facility simply by using functions like
		<literal>contains</literal> and
		<literal>starts-with</literal>. XSLT2 adds
		<literal>ends-with</literal> and Regular
		Expressions. <ulink
		  url="http://www.w3.org/TR/xquery/">XQuery</ulink> is
		a fully-fledged search language for XML.</para>
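	      <para>As a rough sketch of that kind of search in XSLT
		1.0 (the <sgmltag>chapter</sgmltag> and
		<sgmltag>title</sgmltag> element type names and the
		search string are only assumed here for illustration),
		this stylesheet lists the titles of those chapters
		whose title contains a given string:</para>
	      <programlisting><![CDATA[
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- select only chapters whose title contains "XML" -->
    <xsl:for-each select="//chapter[contains(title,'XML')]">
      <xsl:value-of select="title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
	      ]]></programlisting>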
	      <para>The <productname>Saxon</productname> XSLT
		processor comes with an implementation of <ulink
		url="http://www.w3.org/XML/Query">XQuery</ulink> (see
		also the <ulink url="http://www.ibiblio.org/xql/">XQL
		FAQ</ulink>), which can accept queries either from the
		command line or from a file. Saxon can also use a
		control file to specify groups of XML files to be
		searched together.</para> 
	      <para>For indexed searching (for speed) you need an
		XQuery search tool that implements an indexing engine
		which reads and understands markup. These are usually
		implemented as part of a
		<wordasword>native</wordasword> XML database system
		such as <ulink
		  url="http://exist.sourceforge.net/"><productname>eXist</productname></ulink> 
		(and many others), which run either stand-alone or in
		parallel with an XML server like <ulink
		  url="http://cocoon.apache.org/"><productname>Cocoon</productname></ulink>.</para> 
	      <para>Traditional relational databases (MySQL, Oracle,
		etc) tend to store XML as undistinguished strings or
		BLOBs, using bolt-on XML backends to handle the markup
		on import and export. <wordasword>Native</wordasword>
		XML databases have the XML handling built-in, and can
		be configured for granularity, to store at a specific
		element level, making markup-sensitive searching much
		more effective.</para> 
	    </glossdef>
	  </glossentry> 
	  <glossentry id="asp"> 
	    <glossterm remap="asp dot net framework language
	      .net">asp.net</glossterm> 
	    <glossdef> 
	      <para>ASP (Active Server Pages) is a Microsoft language
		for serving dynamic web pages, similar in concept to
		JSP, PHP, and others. In itself, ASP has nothing
		inherently to do with XML, although like any
		server-side system, it can be used for serving XML
		just as well as any other type of file.</para> 
	      <para>.NET itself is an application platform and
		methodology for web services development on Microsoft
		servers. Most web services are predicated on XML as
		the <wordasword>common carrier</wordasword> of
		inter-business messaging, so .NET has a significant
		XML component.</para> 
	<tip xreflabel="Marc Hadley"> 
	  <para>There are many alternatives to ASP, most of which use
	    a similar page-based approach. Java-based
	    alternatives include <ulink
	      url="http://java.sun.com/products/jsp/">Java Server
	      Pages</ulink> (JSP), <ulink
	      url="http://java.sun.com/j2ee/javaserverfaces/">Java
		Server Faces</ulink> (JSF) and <ulink
		url="http://cocoon.apache.org/">Cocoon</ulink> (which
		includes <ulink
		url="http://cocoon.apache.org/2.1/userdocs/xsp/logicsheet.html">eXtensible
		Server Pages</ulink>&mdash;XSP). Popular scripting
		language  alternatives include <ulink
		url="http://www.zope.org/">Zope</ulink> (Python) and
		<ulink url="http://www.rubyonrails.org/">Rails</ulink>
		(Ruby) [all of which have extensive XML
		support.&mdash;Ed.]</para> </tip> </glossdef>
		</glossentry> <glossentry id="disadvantages">
		<glossterm>Disadvantages</glossterm> <glossdef> 
	    <para>XML markup has a few disadvantages:</para>
		<itemizedlist> <listitem> 
		<para>It can be verbose unless element and attribute
		names are chosen with care. In large documents the
		markup overhead need not be large, but in short
		messages it can be significantly more than the actual
		data, especially when the element or attribute names
		are concocted by machine.</para> </listitem>
		<listitem> 
		<para>Overlapping markup is not permitted (an element
		cannot start inside one element and end inside
		another): element markup must nest
		hierarchically.</para> </listitem>
		<listitem>
		  <para>Most applications require the document to be
		    loaded to memory in its entirety before it can be
		    parsed and processed. This can become a problem
		    for truly huge documents (larger than the
		    addressable memory of a computer system).
		    Arguably, XML is perhaps the wrong tool to use for
		    files this size, but there are streaming systems
		    which will enable them to be processed.</para>
		</listitem>
		<listitem> 
		<para>Some of the software is truly mediocre.</para>
		</listitem> </itemizedlist> </glossdef> </glossentry>
		<glossentry id="rendering">
		<glossterm>Rendering</glossterm> <glossdef> 
	    <para>Using XSLT or XSL-FO transformation (or other
	    similar conversion systems), information marked up in XML
	    can be rendered to almost any target: HTML, PDF, audio,
	    Braille, and almost any plain-text format (eg
	    <LaTeX/>). How it appears (or sounds) is the result of
	    using stylesheets or other transformation logic activated
	    by the markup.</para> </glossdef> </glossentry>
	    <glossentry id="fp"> <glossterm remap="floating point
	    numbers integers">Floating-point</glossterm> <glossdef> 
	    <para>You cannot declare character data content or
	    attribute values as floating-point (or many other data
	    types) using DTDs. To do that
	    you need to use a Schema.</para> </glossdef> </glossentry>
	    <glossentry id="counting"> <glossterm
	    remap="counting">Enumeration</glossterm> <glossdef> 
	    <para>To count the number of occurrences of a node in an
	    XML document, you can use the <function>count</function>
	    function in XSL[T], eg</para> <programlisting><![CDATA[
<xsl:value-of select="count(//chapter)"/>
	    ]]></programlisting> 
	    <para>To apply a counter to a repetitive element type, use
	    the <function>xsl:number</function> element, eg</para>
	    <programlisting><![CDATA[ 
<xsl:number count="appendix" level="any" format="A"/> 
	      ]]></programlisting> 
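	    <para>In context, a minimal sketch of a template using
	    it might look like this (it assumes that each
	    <sgmltag>appendix</sgmltag> element has a
	    <sgmltag>title</sgmltag>, and the HTML heading is just
	    one possible output; the <sgmltag
	    class="attribute">count</sgmltag> attribute names the
	    nodes to be numbered):</para>
	    <programlisting><![CDATA[
<xsl:template match="appendix">
  <h2>
    <xsl:text>Appendix </xsl:text>
    <!-- number this appendix among all appendixes, at any depth,
         using letters: A, B, C... -->
    <xsl:number count="appendix" level="any" format="A"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="title"/>
  </h2>
</xsl:template>
	      ]]></programlisting>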
	    <para>For more on XSLT, see <link
	    linkend="style"></link>.</para> </glossdef> </glossentry>
	    <glossentry id="xll"> <glossterm remap="hyperlinks xmllink
	    linking anchors">XLL</glossterm> <glossdef> 
	    <para>The XML Linking Language comprises the XLink
	    specification and the XPointer specification. For details,
	    see the <ulink
	    url="http://www.w3.org/XML/Linking.html">XML Linking
	    Working Group</ulink> at the W3C.</para> </glossdef>
	    </glossentry> <glossentry id="specialchars"> <glossterm
	    remap="&lt; &amp; % ! &gt; &quot; aquot > less than
	    greater ampersand percent exclamation mark sign symbol
	    tilde acute grave circumflex umlaut diaeresis">Special
	    characters</glossterm> <glossdef> 
	    <para>XML has only two special markup characters in normal
	    documents:</para> <itemizedlist> <listitem> 
		<para>The open angle bracket or less-than sign
	    (<literal><![CDATA[<]]></literal>) which begins a
	    start-tag or end-tag like
	    <literal><![CDATA[<report>]]></literal> or
	    <literal><![CDATA[</table>]]></literal>;</para>
	    </listitem> <listitem> 
		<para>The ampersand character
	    (<literal><![CDATA[&]]></literal>) which starts an
	    <firstterm>entity reference</firstterm> like
	    <literal><![CDATA[&aacute;]]></literal> for á or
	    <literal><![CDATA[&#x00A7;]]></literal> for &sect;.</para>
	    </listitem> </itemizedlist> 
	    <para>Contrary to popular opinion, the closing angle
	    bracket or greater-than (<literal>></literal>) and the
	    semicolon (<literal>;</literal>) are not special
	    characters in normal text: they only acquire their
	    temporary special meaning once one of the two markup
	    characters has been encountered.</para> 
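	    <para>As a small illustration (the
	    <sgmltag>para</sgmltag> element type is just an
	    assumption for the example), only the less-than sign and
	    the ampersand need to be written as references when they
	    occur as data; the greater-than sign and the semicolon
	    can be typed as they are:</para>
	    <programlisting><![CDATA[
<para>If a &lt; b &amp; b &lt; c; then c > a.</para>
	      ]]></programlisting>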
	    <para>In DTDs, the percent sign (<literal>%</literal>) has
	    a special meaning in <firstterm>entity
	    declarations</firstterm>: it defines the entity as a
	    <firstterm>parameter entity</firstterm>, meaning that it
	    can only be used inside the DTD, not in a document text,
	    and only for data substitution (a kind of simple
	    macro).</para> 
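	    <para>A minimal sketch of such a declaration and its use
	    (the names <literal>inline</literal>,
	    <sgmltag>para</sgmltag>, and the rest are invented for
	    illustration, and the fragment is assumed to live in an
	    external DTD file, because parameter entity references
	    are not allowed inside other declarations in the
	    internal subset):</para>
	    <programlisting><![CDATA[
<!-- declare a parameter entity holding part of a content model -->
<!ENTITY % inline "emphasis | quote | literal">
<!-- reference it (percent sign, name, semicolon) in a declaration -->
<!ELEMENT para (#PCDATA | %inline;)*>
	      ]]></programlisting>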
	    <para>The exclamation mark (<literal>!</literal>) acquires
	    a special meaning immediately after a less-than sign: when
	    followed by one of the declaration keywords in a DTD it
	    signals the start of a Declaration; when followed by two
	    dashes it signals the start of a comment (ended by another
	    two dashes and a greater-than sign).</para> </glossdef>
	    </glossentry> <glossentry id="loops"> <glossterm
	    remap="repetition">Loops</glossterm> <glossdef> 
	    <para>To process some XML repetitively, you need to use a
	    processing language which allows looping or the cyclical
	    handling of a defined set of nodes. For example in XSLT,
	    to output all chapter titles to make a table of contents
	    (ie out of natural document position), you could
	    say:</para> <programlisting><![CDATA[ 
<xsl:for-each select="//chapter"> 
  <li> 
    <xsl:value-of select="title"/>
  </li> 
</xsl:for-each> 
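
<!-- A sketch of an alternative, assuming the same chapter and
     title element types: let templates do the looping. The
     mode name "toc" is made up for this example. -->
<xsl:apply-templates select="//chapter" mode="toc"/>
...
<xsl:template match="chapter" mode="toc">
  <li>
    <xsl:value-of select="title"/>
  </li>
</xsl:template>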
	      ]]></programlisting> </glossdef>
	    </glossentry> <glossentry id="uml">
	    <glossterm>UML</glossterm> <glossdef> 
	    <para>The <ulink url="http://www.uml.org/">Unified
	    Modeling Language</ulink> has nothing to do with XML,
	    although there are many points of contact, and <ulink
	    url="http://xml.coverpages.org/ni2001-10-10-a.html">some
	    software is available</ulink> to express some UML
	    structures in XML for the purposes of inter-process
	    messaging.</para> </glossdef> </glossentry> <glossentry
	    id="media"> <glossterm remap="include play avi mpg wmv
	    audio video">Multimedia</glossterm> <glossdef> 
	    <para>The <ulink
	    url="http://www.w3.org/AudioVideo/">Synchronized
	    Multimedia Integration Language</ulink> (SMIL) provides an
	    XML vocabulary for simple authoring of interactive
	    audiovisual presentations. SMIL is typically used for
	    <wordasword>rich media</wordasword>/multimedia
	    presentations which integrate streaming audio and video
	    with images, text or any other media type.</para>
	    </glossdef> </glossentry> <glossentry id="wellformed">
	    <glossterm remap="wellformed">Well-formed</glossterm>
	    <glossdef> 
	    <para>See <link linkend="wf"></link>.</para> </glossdef>
	    </glossentry> <glossentry id="sml">
	    <glossterm>SML</glossterm> <glossdef> 
	    <para>The <ulink url="">Spacecraft Markup Language</ulink>
	    is an application of XML.</para> 
	    <para>The <ulink
	    url="http://www.smlnj.org/sml97.html">Standard ML</ulink>
	    programming language is not.</para> 
	    <para>Did you mean <link xreflabel="simple"
	    linkend="whatissgml">SGML</link>?</para> </glossdef>
	    </glossentry> <glossentry id="sorting">
	    <glossterm>Sorting</glossterm> <glossdef> 
	    <para>To sort a repetitive set of XML elements in XSL[T],
	    use the <function>xsl:sort</function> element, eg</para>
	    <programlisting><![CDATA[ 
<xsl:for-each select="//acronym"> 
  <xsl:sort select="@abbrev"/>
  <xsl:value-of select="@abbrev"/> 
  <xsl:text>: </xsl:text>
  <xsl:apply-templates/> 
</xsl:for-each>
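
<!-- A variant sketch, assuming each acronym also carries a
     numeric "uses" attribute (hypothetical): sort numerically,
     highest value first -->
<xsl:for-each select="//acronym">
  <xsl:sort select="@uses" data-type="number" order="descending"/>
  <xsl:value-of select="@abbrev"/>
  <xsl:text>: </xsl:text>
  <xsl:value-of select="@uses"/>
</xsl:for-each>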
	    ]]></programlisting> </glossdef> </glossentry> <glossentry
	    id="wap"> <glossterm>WAP</glossterm> <glossdef> 
	    <para>The Wireless Application Protocol (WAP) is now
	    handled by the <ulink
	    url="http://www.openmobilealliance.org/tech/affiliates/wap/wapindex.html">Open
	    Mobile Alliance</ulink>.</para> </glossdef> </glossentry>
	    <glossentry id="gtt"> <glossterm>GTT</glossterm>
	    <glossdef> 
	    <para>The Gnome Time Tracker is a component of the Gnome
	    interface used extensively on Linux systems. Part of its
	    internal data is configured in XML.</para> 
	    </glossdef> </glossentry> <glossentry
	    id="bpel"> <glossterm>BPEL</glossterm> <glossdef> 
	    <para>The <ulink
	    url="http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsbpel">Business
	    Process Execution Language</ulink> is an XML-based
	    specification of the steps required for a cooperative
	    business process to take place between consenting
	    servers.</para> </glossdef> </glossentry> <glossentry
	    id="idempotent"> <glossterm
	    remap="idempotent">Idempotency</glossterm> <glossdef> 
	    <para>A term used in <ulink
	    url="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html">the
	    HTTP specification</ulink> to describe request methods for
	    which the side-effects of several identical requests are
	    the same as those of a single request.</para> </glossdef> </glossentry> <glossentry
	    id="rss"> <glossterm remap="news reader news feed
	    newsfeed">RSS</glossterm> <glossdef> 
	    <para>The <ulink
	    url="http://en.wikipedia.org/wiki/RSS_(protocol)">Really
	    Simple Syndication</ulink> format was designed to allow
	    news sites to process updates by machine, and it evolved
	    into a semi-standard format for blogs and other
	    frequently-changing sites to notify the world of
	    changes. Unfortunately it was never properly defined, and
	    has multiple incompatible and undocumented versions. It
	    was about to be superseded by a vastly better language
	    called Atom, but Microsoft have recently announced their
	    support for RSS, so it looks like we may be stuck with a
	    lemon for years to come.</para> 
	    <para><wordasword>Newsreaders</wordasword> (RSS readers)
	    are available for all platforms, both standalone and as
	    browser plugins. Do not confuse these with programs of the
	    same description designed to provide access to the Usenet
	    News service, which is a different thing entirely (and
	    which you will need to read <ulink type="news"
	    url="comp.text.xml">comp.text.xml</ulink>).</para>
	    </glossdef> </glossentry> <glossentry id="variables">
	    <glossterm>Variables</glossterm> <glossdef> 
	    <para>XML doesn't have variables or parameters, nor does
	    it have fields or records. These are all terms from
	    programming and database technology, and do not have exact
	    equivalents in XML.</para> 
	    <para>XML identifies your information with
	    <firstterm>elements</firstterm> and
	    <firstterm>attributes</firstterm>.</para> </glossdef>
	    </glossentry> <glossentry id="envvar">
	    <glossterm>Environment variables</glossterm> <glossdef> 
	    <para>XML is a markup language, not a programming
	    language, so it has no concept of environment
	    variables. However, if you are using a DTD, and accessing
	    your XML files under program control (eg in a script
	    rather than by hand) it is possible to modify the value of
	    declared attributes or entities (eg with a stream-editor
	    like sed) before the file is opened, and thereby to pass
	    values from the external environment into the document. A
	    similar approach would be possible with Schemas.</para>
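	    <para>As a sketch of the kind of file this works with
	    (the <sgmltag>report</sgmltag> document type, the entity
	    name <literal>build.date</literal>, and its placeholder
	    value are all invented for the example), the internal
	    subset declares an entity whose quoted value a script
	    could rewrite before the file is parsed:</para>
	    <programlisting><![CDATA[
<?xml version="1.0"?>
<!DOCTYPE report [
<!-- a script replaces the value between the quotes before parsing -->
<!ENTITY build.date "PLACEHOLDER">
]>
<report>
  <para>Generated on &build.date;.</para>
</report>
	      ]]></programlisting>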
	    </glossdef> </glossentry> <glossentry id="entities">
	    <glossterm remap="entitiese semicolon semi colon accents
	    diacriticals">Entities</glossterm> <glossdef> 
	    <para>An <firstterm>entity</firstterm> is a unit of
	    storage in XML. It can be as small as a character or as
	    large as a whole document. Four types of entity are
	    <firstterm>declarable</firstterm>:</para> <variablelist>
	    <varlistentry> <term>General entities</term> <listitem> 
		  <para>which can be like string-replacement
	    macros:</para> <programlisting><![CDATA[ 
<!ENTITY IBM "International Business Machines"> 
		    ]]></programlisting> 
		  <para>These can be used for shorthand data entry or
	    to guarantee uniform spelling like
	    <literal><![CDATA[&IBM;]]></literal> and they get replaced
	    when the file is parsed.</para> 
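		  <para>For instance, a small self-contained sketch
	    (the <sgmltag>memo</sgmltag> document type is invented
	    for illustration) with the declaration in the internal
	    subset and a reference in the text:</para>
	    <programlisting><![CDATA[
<?xml version="1.0"?>
<!DOCTYPE memo [
<!ENTITY IBM "International Business Machines">
]>
<memo>
  <!-- the application receives the full name in place of &IBM; -->
  <para>Our supplier is &IBM;.</para>
</memo>
		    ]]></programlisting>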
		  <para>They can also represent external files:</para>
	    <programlisting><![CDATA[ 
<!ENTITY chap5 SYSTEM "chapter5.xml"> 
		    ]]></programlisting> 
		  <para>which can be used as a file-inclusion
	    mechanism at the point where you insert
	    <literal><![CDATA[&chap5;]]></literal>. External general
	    file entities must not contain the XML Declaration or any
	    Document Type Declaration.</para> </listitem>
	    </varlistentry> <varlistentry> <term>Document
	    entities</term> <listitem> 
		  <para>These are like external general file entities
	    except that they specify the type of data they contain,
	    using a declared Notation, so that the parser and
	    application can decide how to handle them (eg include them
	    or hand them to another program specific to their type of
	    medium):</para> 
		  <programlisting><![CDATA[ 
<!ELEMENT link (#PCDATA)> 
<!ATTLIST link to ENTITY #REQUIRED>
... 
<!NOTATION PDF PUBLIC 
   "-//Adobe//NOTATION Portable Document Format//EN//PDF"
   "http://partners.adobe.com/public/developer/pdf/index_reference.html">
<!ENTITY pricelist SYSTEM "/sales/pricelist.pdf" NDATA PDF> 
... 
  <para>Please refer to our <link to="pricelist">current 
        price list</link>.</para>
	    ]]></programlisting> 
		  <para>This provides an extremely robust method of
	    defining an external entity once and allowing it to be
	    referenced multiple times (if the external filename
	    changes, you only have to update the entity
	    declaration).</para> </listitem> </varlistentry>
	    <varlistentry> <term>Character entities</term> <listitem> 
		  <para>like <literal><![CDATA[&aacute;]]></literal>
	    to represent characters that users without the required
	    keyboard features may want to enter like
	    <wordasword>á</wordasword>;</para> </listitem>
	    </varlistentry> <varlistentry> <term>Parameter
	    Entities</term> <listitem> 
		  <para>are like General Entities but can only be
	    referenced within a DTD. They are used for control of
	    content models, inclusion or exclusion of declarations,
	    and modification of modular constructs:</para>
	    <programlisting><![CDATA[ 
<!ENTITY % local.qandaset.mix "|bibliodiv"> 
		    ]]></programlisting> 
		  <para>(to use an example from the DTD for this FAQ)
	    where the mix of element types in the content model for
	    <sgmltag>qandaset</sgmltag> is specified by the entities
	    <literal>qandaset.mix</literal> (defined by DocBook)
	    <emphasis>and</emphasis> by
	    <literal>local.qandaset.mix</literal> (definable by the
	    user [me]) so that the DTD can be tweaked without having
	    to be edited.</para> </listitem> </varlistentry>
	    </variablelist> 
	    <para>General entity names, including XML document
	    entities and character entities, always start with an
	    ampersand (<literal><![CDATA[&]]></literal>) and end with
	    a semicolon (<literal>;</literal>), and can be used
	    anywhere in your document. Parameter entities can only be
	    used in a DTD: they start with a percent sign
	    (<literal>%</literal>) and end with a semicolon.</para>
	    </glossdef> </glossentry> <glossentry id="ajax">
	    <glossterm>AJaX</glossterm> <glossdef> 
	    <para>Asynchronous Javascript and XML. A technique
	    for improving the interactivity of web pages whereby
	    in-browser scripting detects user activity and pre-fetches
	    the required data asynchronously from an XML-backed
	    data-store, instead of waiting until the user clicks on a
	    link and requesting it synchronously from the
	    server.</para> </glossdef> </glossentry> <glossentry
	    id="pipelines"> <glossterm>Pipelining</glossterm>
	    <glossdef> 
	    <para>Technique for reducing complex sequential and
	    parallel processing requirements to a set of components
	    which can be completed under program control. The term is
	    taken from the Unix facility for redirecting the output of
	    one command into the input of another (called a
	    <wordasword>pipe</wordasword>), in effect creating a chain
	    or pipeline through which data passes on its way from
	    source to result.</para> 
	    <para>The W3C has a <ulink
	    url="http://www.w3.org/TR/2002/NOTE-xml-pipeline-20020228/">Note</ulink>
	    pending submission on an <citetitle>XML Pipeline
	    Definition Language</citetitle> which could be used to
	    define a pipeline in a portable, vendor-independent
	    manner.</para> </glossdef> </glossentry> <glossentry
	    id="attribs"> <glossterm remap="xml:id namechar
	    xml:lang">Attributes</glossterm> <glossdef> 
	    <para>These are items of <firstterm>metadata</firstterm>
	    or <firstterm>metainformation</firstterm> (information
	    about information) which can be added to the start-tag of
	    an element. Usually attributes are a way of refining the
	    meaning, function, or some other quality of an
	    element. They take the form of a name and a quoted value
	    joined by an equals sign, eg</para>
	    <programlisting><![CDATA[ 
<part id="B22" catnum="51N1573R" level="App">Left-handed Screwdriver</part>
	    ]]></programlisting> 
	    <para>Attribute names must follow the XML rules for Names
	    (see the <link linkend="spec"
	    xreflabel="simple">spec</link>).  If your application does
	    not use a DTD or Schema, the attribute values are treated
	    as plain text (CDATA) and cannot have any special meaning
	    to XML (with the exception of <sgmltag
	    class="attribute">xml:id</sgmltag> and <sgmltag
	    class="attribute">xml:lang</sgmltag>, see below). In a DTD
	    or Schema, attributes can be assigned datatypes, the most
	    common being (using DTD terminology for
	    simplicity):</para> <variablelist> <varlistentry
	    id="ididref"> <term>ID or IDREF</term> <listitem> 
		  <para>ID attribute values must be XML Names (no
	    spaces; must begin with a letter or underscore) and they must be unique
	    in a document. An IDREF attribute value can occur any
	    number of times, but it must be the value of an ID
	    attribute in the same document. ID and IDREF are most
	    frequently used for cross-referencing within
	    documents.</para> 
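		  <para>A minimal sketch of a cross-reference, using
	    an internal DTD subset so that it is self-contained (the
	    element and attribute names are invented for
	    illustration):</para>
	    <programlisting><![CDATA[ 
<?xml version="1.0"?>
<!DOCTYPE catalogue [
<!ELEMENT catalogue (part+)>
<!ELEMENT part (#PCDATA | seealso)*>
<!ELEMENT seealso EMPTY>
<!-- every part must carry a unique identifier -->
<!ATTLIST part id ID #REQUIRED>
<!-- ref must match the id of some part in this document -->
<!ATTLIST seealso ref IDREF #REQUIRED>
]>
<catalogue>
  <part id="B22">Left-handed Screwdriver</part>
  <part id="B23">Right-handed Screwdriver <seealso ref="B22"/></part>
</catalogue>
		    ]]></programlisting> 
		  <para>A validating parser should reject this
	    document if two parts shared the same
	    <wordasword>id</wordasword> value, or if
	    <wordasword>ref</wordasword> pointed at a value that no
	    part declares.</para> 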
		  <para>Note that an ID attribute can have any name:
	    it doesn't have to be
	    <emphasis>called</emphasis>&nbsp;<wordasword>ID</wordasword>,
	    although it frequently is. Conversely&mdash;as a matter of
	    best practice&mdash;you should never use the name
	    <wordasword>ID</wordasword> (<wordasword>id</wordasword>)
	    for an attribute which is not of type ID, simply because
	    it's confusing. If your application has unique identity
	    values that the community calls IDs, and which are
	    <emphasis>not</emphasis> XML Names, either name the
	    attribute something different (eg
	    <wordasword>Product-ID</wordasword>) or document
	    <emphasis>heavily</emphasis> that the value is not an XML
	    ID.</para> 
		  <para>There is a <ulink
	    url="http://www.w3.org/TR/xml-id/">W3C
	    Recommendation</ulink> that document type designers should
	    use the <emphasis>attribute name</emphasis>&nbsp;<sgmltag
	    class="attribute">xml:id</sgmltag>, and this can be
	    interpreted by parsers as being a unique ID without the
	    need for the document to use a DTD or Schema.</para>
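		  <para>For example, the following document has no
	    DTD or Schema at all, but a processor which implements
	    the xml:id Recommendation should still treat the two
	    values as unique IDs (the element names are invented for
	    illustration):</para>
	    <programlisting><![CDATA[ 
<?xml version="1.0"?>
<catalogue>
  <part xml:id="B22">Left-handed Screwdriver</part>
  <part xml:id="B23">Right-handed Screwdriver</part>
</catalogue>
		    ]]></programlisting> 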
	    </listitem> </varlistentry> <varlistentry>
	    <term>CDATA</term> <listitem> 
		  <para>Just text.</para> </listitem> </varlistentry>
	    <varlistentry> <term>Token List</term> <listitem> 
		  <para>The attribute must have one of a restricted
	    number of values (specified in parentheses in the
	    declaration, separated by vertical bars), eg</para>
	    <programlisting><![CDATA[ 
<!ATTLIST part level (App|Jny|Mst) #REQUIRED> 
<!ATTLIST Q.27 resp (Yes|No) "Yes"> 
		    ]]></programlisting> 
		  <para>In the first example there is no default, and
		    a value is compulsory. In the second,
		    <wordasword>Yes</wordasword> is the default value
		    (if the attribute is omitted, the parser will take
		    the default value from the declaration).</para>
		</listitem>
	    </varlistentry> <varlistentry> <term>ENTITY</term>
	    <listitem> 
		  <para>The attribute value must be the name of a declared
	    unparsed <link xreflabel="simple"
	    linkend="entities">Entity</link>.</para> </listitem>
	    </varlistentry> <varlistentry> <term>NMTOKEN</term>
	    <listitem> 
		  <para>An XML Name Token is like an ID value (no
	    spaces) but it <emphasis>can</emphasis> begin with a
	    non-letter (eg a digit or punctuation).</para> </listitem>
	    </varlistentry> <varlistentry> <term>Special
	    attributes</term> <listitem> 
		  <para>In addition to <sgmltag
	    class="attribute">xml:id</sgmltag> (mentioned above),
	    there are two others allowed by the XML
	    Specification:</para> <variablelist> <varlistentry>
	    <term>xml:space</term> <listitem> 
			<para>to signal an intention that in that
	    element, white space should be preserved by
	    applications;</para> </listitem> </varlistentry>
	    <varlistentry> <term>xml:lang</term> <listitem> 
			<para>to specify the language used in the
	    contents and attribute values of any element.</para>
	    </listitem> </varlistentry> </variablelist> 
		  <para>See sections 2.10 and 2.12 of the Spec for
	    more detail.</para> </listitem> </varlistentry>
	    </variablelist> 
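	    <para>The remaining types in the list above can be
	    sketched in a single small document. All the names here
	    (<wordasword>catalogue</wordasword>,
	    <wordasword>picture</wordasword>,
	    <wordasword>bin</wordasword>, and the file
	    <literal>b22.png</literal>) are invented for
	    illustration:</para>
	    <programlisting><![CDATA[ 
<?xml version="1.0"?>
<!DOCTYPE catalogue [
<!-- an unparsed entity needs a notation to say what kind of data it is -->
<!NOTATION PNG SYSTEM "image/png">
<!ENTITY screwdriver SYSTEM "b22.png" NDATA PNG>
<!ELEMENT catalogue (part+)>
<!ELEMENT part (#PCDATA)>
<!ATTLIST part
  picture   ENTITY             #IMPLIED
  bin       NMTOKEN            #IMPLIED
  xml:lang  CDATA              #IMPLIED
  xml:space (default|preserve) #IMPLIED>
]>
<catalogue>
  <!-- bin starts with a digit: allowed for NMTOKEN, not for ID -->
  <part picture="screwdriver" bin="51N1573R" xml:lang="en"
        xml:space="preserve">Left-handed   Screwdriver</part>
</catalogue>
	    ]]></programlisting> 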
	    <para>In Schemas a much greater range of datatypes is
	    available than in DTDs, and complex validation criteria
	    can be attached to each.</para> 
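	    <para>As a hedged illustration in W3C XML Schema (one of
	    several schema languages), the fragment below gives a
	    <wordasword>part</wordasword> element a numeric attribute
	    and a pattern-constrained one. The attribute names and
	    the pattern are assumptions made up for the
	    example:</para>
	    <programlisting><![CDATA[ 
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="part">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="id" type="xs:ID" use="required"/>
          <!-- a datatype with no DTD equivalent -->
          <xs:attribute name="price" type="xs:decimal"/>
          <!-- two digits, a letter, four digits, a letter -->
          <xs:attribute name="catnum">
            <xs:simpleType>
              <xs:restriction base="xs:string">
                <xs:pattern value="[0-9]{2}[A-Z][0-9]{4}[A-Z]"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
	    ]]></programlisting> 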
	    <para>Attributes in a DTD can be declared as <sgmltag
	    class="declparam">REQUIRED</sgmltag> (compulsory),
	    <sgmltag class="declparam">IMPLIED</sgmltag> (optional),
	    or <sgmltag class="declparam">FIXED</sgmltag> (predefined
	    and invariable).</para> 
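	    <para>A short sketch of all three in one declaration (the
	    attribute names and the fixed value are invented for
	    illustration):</para>
	    <programlisting><![CDATA[ 
<!ATTLIST part
  id      ID     #REQUIRED
  catnum  CDATA  #IMPLIED
  units   CDATA  #FIXED "metric">
	    ]]></programlisting> 
	    <para>Here <wordasword>id</wordasword> must always be
	    given, <wordasword>catnum</wordasword> may be omitted,
	    and <wordasword>units</wordasword> can only ever have the
	    value <literal>metric</literal>, whether or not it is
	    written in the document.</para> 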
	    <para>There is not intended to be any limit on the length
	    of an attribute value, but you should check that your
	    processing software can handle unusual data volumes if you
	    intend to use very large lengths.</para> </glossdef>
	    </glossentry> <glossentry id="uriparse"> <glossterm
	    remap="semicolon">URI parsing errors</glossterm>
	    <glossdef> 
	    <para>See <link linkend="semicolon"></link>.</para>
	    </glossdef> </glossentry> <glossentry id="tables">
	    <glossterm>Tables</glossterm> <glossdef> 
	    <para>You can define tables any way you wish in XML (see
	    <link linkend="makeup"></link>) but there are a few
	    existing table models which have become so widely-used
	    (and supported by software) that it would need a very
	    compelling reason to invent something new. There are more
	    details in <biblioref linkend="toolbook"/> &sect;2.3.7.</para>
	    <variablelist> <varlistentry> <term>HTML</term> <listitem> 
		  <para>HTML tables were invented by Mosaic (now
	    Netscape) and first appeared in a W3C Recommendation with HTML 3.2. In all
	    versions of HTML and XHTML they define a very simple but
	    practical model, with very few refinements, suitable for
	    web use and for rudimentary printing. Their chief
	    advantage is that in a browser the cell heights and widths
	    (and thus the column widths) expand or contract
	    automatically to accommodate the amount of text contained
	    in them. Most other table models assume the widths of the
	    columns and the height of the cells will be specified in
	    advance (which you can do in HTML but this is rarely
	    used).</para> </listitem> </varlistentry> <varlistentry>
	    <term>CALS</term> <listitem> 
		  <para>Computer-Aided Logistics and Support (and
	    several other acronyms over the years) was (is) part of
	    the US military project to ensure a consistent markup for
	    all documentation, originally in SGML, now in XML. As part
	    of this activity the CALS table model has become the most
	    widely-used in technical documentation, especially for
	    Interactive Electronic Technical Manuals (IETMs), with
	    extensive support in all the major editors, and it is the
	    default table model in the DocBook DTD and Schema. The
	    CALS definitions are very powerful but quite complex, and
	    can handle virtually all requirements for spanning,
	    ruling, and aligning.</para> </listitem> </varlistentry>
	    <varlistentry> <term>SASOUT</term> <listitem> 
		  <para>This model has been used extensively in the
	    social sciences and elsewhere for defining tables based on
	    the semantics of the data, rather than the appearance. At
	    one time they were an alternative in DocBook (enabled by a
	    simple parameter entity switch).</para> </listitem>
	    </varlistentry> <varlistentry> <term>TEI</term> <listitem> 
		  <para>The TEI model is designed to allow the encoder
	    to represent existing tables being transcribed from
	    historical, literary, or archive material, rather than for
	    the generation of new data. The markup is at the same
	    level of simplicity as the HTML model, but it is designed
	    to allow the inclusion of the much denser markup and
	    metadata needed in research texts.</para> </listitem>
	    </varlistentry> <varlistentry> <term><LaTeX/></term>
		  <listitem> 
		    <para>The <LaTeX/> model is not of direct concern
		      to the XML user except insofar as <LaTeX/> is a
		      common target for transformations from XML using
		      XSLT in order to create PDFs.  Like CALS,
		      <LaTeX/> tables can handle almost any
		      formatting, but the default alignments assume
		      that each column format is defined beforehand,
		      and that each cell will occupy one line of data:
		      an additional package
		      (<application>array</application>) is needed to
		      handle multi-line cells in the way that other
		      models do.</para>
		  </listitem> </varlistentry> </variablelist> 
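	    <para>To give a flavour of the difference in weight
	    between the first two models listed above, here is the
	    same small table sketched in each (the data is invented
	    for illustration, and only the commonest elements and
	    attributes are shown):</para>
	    <programlisting><![CDATA[ 
<!-- HTML model: rows and cells only; widths are left to the browser -->
<table>
  <tr><th>Part</th><th>Level</th></tr>
  <tr><td>Left-handed Screwdriver</td><td>App</td></tr>
</table>

<!-- CALS model: columns are declared first, with proportional widths -->
<table>
  <title>Parts</title>
  <tgroup cols="2">
    <colspec colname="part" colwidth="3*"/>
    <colspec colname="level" colwidth="1*"/>
    <thead>
      <row><entry>Part</entry><entry>Level</entry></row>
    </thead>
    <tbody>
      <row><entry>Left-handed Screwdriver</entry><entry>App</entry></row>
    </tbody>
  </tgroup>
</table>
	    ]]></programlisting> 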
	    <para>In XML, it is not necessary to use tables to mark up
	    lists as is often done in wordprocessors, because the
	    processing facilities of languages like XSLT allow you to
	    transform the document to use non-tabular methods (like
	    HTML's <sgmltag>div</sgmltag>s). Table markup should
	    therefore be confined to <wordasword>real</wordasword>
	    tables (data arranged in rows and columns) and not abused
	    simply because you want something displayed on a level
	    with something else: it is better to pick markup which is
	    designed to do the job properly rather than to distort
	    existing facilities.</para> 
	    <para>Wordprocessor users are usually unaware that many
	    structures that they currently use wordprocessor tables
	    for are in fact segmented lists, which wordprocessors are
	    incapable of handling correctly. One of the major reasons
	    for doing it properly is that the data can then be
	    reprocessed to make sense when read in the natural
	    order.</para> </glossdef> </glossentry> <glossentry
	    id="bom"> <glossterm>Byte Order Mark</glossterm>
	    <glossdef> 
	    <para>A two-byte signature (<literal>0xFEFF</literal>,
	    defined in Unicode and ISO 10646) which must be prepended
	    to the XML document when using the UCS-2 encoding, in
	    order to allow processors to differentiate between the
	    UCS-2 and UTF-8 encodings.</para> </glossdef>
	    </glossentry> <glossentry id="ip" > <glossterm remap="who
	    owns xml copyright trademark symbol">Patents, Copyright,
	    and Intellectual Property</glossterm> <glossdef> 
	    <para>I'm not a lawyer, and this is not legal advice. If
	    you're worried, see a psychiatrist first &smile;&#x0308;</para> 
	    <para>Since the USA (and, increasingly, elsewhere) stopped
	    sanity-checking patent applications, pretty much anyone
	    can patent anything in these countries, regardless of
	    whether or not it already exists. If you are sufficiently
	    intellectually bankrupt, you can then start sending
	    invoices to companies and even individuals demanding
	    payment of license fees for continued use.</para> 
	    <para>XML was drafted during 1995 and first published in
	    1996, so anyone claiming they invented pointy-bracket
	    self-defining hierarchically-nested structured markup
	    after that is probably a few elements short of a DTD. XML
	    is based on SGML, which is an international standard
	    codified as ISO 8879:1986, and it was preceded by numerous
	    other closely-related markup systems, so anyone claiming
	    they invented it after that date is equally wide of the
	    markup.</para> 
	    <para>Lots of subsequent derivative technologies which owe
	    their existence to the SGML and XML groundwork quite
	    possibly <emphasis>are</emphasis> valid patents, in the
	    same way that fire was not originally patented but matches
	    and lighters were.</para> 
	    <para>Patents were originally designed for new physical
	    inventions. Their use for methodologies and algorithms
	    extended the concept into the realm of ideas, which many
	    people regard as deeply suspect. The patenting of natural
	    phenomena like genes (which are pre-existing parts of
	    Nature like politicians or pond scum), is meaningless and
	    intellectually void, although legally enforceable in the
	    USA and elsewhere.</para> 
	    <para>Copyright subsists automatically in anything you
	    create, but in some countries (notably the USA and France)
	    you cannot enforce this unless you register your
	    interest. Copyright persists for a number of years after
	    your death (EU: 70, different elsewhere) in order to let
	    your descendants benefit from sales of your work.</para> 
	    <para>Copyright is for the physical form of intellectual
	    expression like books, newspapers, works of art, web
	    sites, or computer programs. It exists to prevent others
	    stealing your work and selling it.  You can quote snippets
	    of other people's work without permission, such as a line
	    of a poem, or a bar of music, or a sentence from a novel,
	    provided you say whose it is and where to find it:
	    otherwise you need to ask permission beforehand. Copyright
	    already provides more than adequate protection for
	    computer programs, making the use of patents for them
	    unnecessary overkill.</para> 
	    <para>Intellectual Property identifies you as the owner of
	    the thoughts and ideas which may find their physical
	    manifestation in patentable inventions or copyrightable
	    publications. Even if you sell off your patents, and for
	    long after your copyrights have expired, you can still be
	    seen as the person who dreamed up the idea, and some
	    countries (eg the UK) allow you formally to assert your
	    right to be so identified, regardless of what happens to
	    the book or the gizmo.</para> 
	    <para>You should <emphasis>always</emphasis> acknowledge
	    the intellectual property of others, especially when you
	    use it in furtherance of your own aims. Pretending that
	    someone else's smart ideas are your own is probably a
	    worse offence than trying to patent fire, water, the
	    wheel, or XML.</para> </glossdef> </glossentry>
	    <glossentry id="escape"> <glossterm remap="escape
	    characters sequences">Escaping</glossterm> <glossdef> 
	    <para>Escaping means temporarily switching the way a
		program works to do something different with the data.
		In SGML, it was conventional to use only ASCII
		characters in your documents because keyboards,
		screens, and fonts for other characters were often
		unavailable. To escape from the limitations of this
		format for non-ASCII characters like accents and
		symbols a set of mnemonic names was available,
		prefixed by an ampersand (<literal>&amp;</literal>) to
		turn the escapement on, and followed by a semicolon
		(;) to turn it off, so an á was given as <sgmltag
		  class="genentity">aacute</sgmltag>.</para> 
	    <para>XML allows you to use Unicode, so any character or
		symbol in any language can be entered as itself. If
		you are using UTF-8 encoding in your documents, there
		is no need to use escaping except for the two markup
		symbols (<literal>&lt;</literal> and
		<literal>&amp;</literal>). However, not everyone has a
		Unicode editor, and complete Unicode fonts are very
		large, so it is conventional in alphabetic languages
		to pick an encoding which allows you to use the
		majority of the characters you need, and to use
		escaping for the occasional other characters.</para>
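	    <para>A small sketch of both kinds of escaping in one
	    line of an XML document: the two markup symbols use their
	    predefined entity names, and the occasional other
	    characters (here the euro sign, U+20AC, and e-acute,
	    U+00E9) use numeric character references, which need no
	    declaration. The sentence itself is invented for
	    illustration:</para>
	    <programlisting><![CDATA[ 
<para>Fish &amp; chips cost &lt; &#x20AC;10 in the caf&#xE9;.</para>
	    ]]></programlisting> 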
	    </glossdef>
	    </glossentry> <glossentry id="security"> <glossterm>XML
	    and security, privacy, and identity
	    standards</glossterm> <glossdef> 
	    <para>Eve</para> </glossdef> </glossentry> <glossentry
	    id="csv"> <glossterm remap="csv export convert">Data
	    export</glossterm> <glossdef> 
	    <para>A common requirement in the flat data model used in
	    many e-commerce systems is to export XML data to the CSV
	    (Comma-Separated Values) data format used as input to
	    spreadsheets. There is a simple example of a short script
	    to do this <ulink
	    url="http://silmaril.ie/downloads/software/xml2csv.zip">here</ulink>. More
	    complex and sophisticated routines could easily be written
	    using XSLT or other XML processing software. Users should
	    note that while conversion to CSV is adequate for simple
	    data formats, it is an inappropriate format for normal XML
	    text documents which use Mixed Content models.</para>
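	    <para>As a minimal sketch of the XSLT approach (this is
	    not the script linked above, and the element name
	    <wordasword>catalogue</wordasword> and the record layout
	    are assumptions), the following XSLT 1.0 stylesheet
	    writes one quoted, comma-separated line per record. It
	    does not double any quotation marks occurring inside the
	    data, which a production version would need to
	    handle:</para>
	    <programlisting><![CDATA[ 
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- assumed input: <catalogue> containing flat records such as
       <part><name>...</name><level>...</level></part> -->
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/catalogue">
    <!-- each child element of the root is one record -->
    <xsl:for-each select="*">
      <!-- each child element of a record is one field -->
      <xsl:for-each select="*">
        <xsl:text>"</xsl:text>
        <xsl:value-of select="."/>
        <xsl:text>"</xsl:text>
        <!-- comma between fields, but not after the last one -->
        <xsl:if test="position() != last()">
          <xsl:text>,</xsl:text>
        </xsl:if>
      </xsl:for-each>
      <!-- end the record with a newline -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
	    ]]></programlisting> 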
	    </glossdef> </glossentry> <glossentry id="imp"> <glossterm
	    remap="import load convert conversion">Data
	    import</glossterm> <glossdef> 
	    <para>Many XML projects require the import of existing
	    documents in non-XML formats. The import of existing HTML
	    documents is explained in <link linkend="conversion"/>,
	    and if you can convert your documents to XHTML, this is
	    probably the simplest method. OpenOffice saves Open
	    Document Format (ODF) files, which are the international
	    standard for office XML documents. Word files can be saved
	    as WordML (2003) or Office Open XML (2007: Microsoft's
	    alternative to ODF). In both cases an XSLT transformation
	    can be written to create a suitable XML import format. For
	    complex documents in other formats, however, specialist
	    conversion software is needed. Some XML editors are
	    beginning to offer inbuilt conversion of other formats,
	    and there are many standalone conversion systems available
	    (some at high cost) for formats which are otherwise not
	    easily machine-accessible via markup, like PDF,
	    PostScript, <LaTeX/>, Quark XPress, and most proprietary
	    document formats. The critical point is that almost all
	    non-XML (non-SGML) documents are formatted to make them
	    human-readable and pretty, not to make them
	    machine-readable. It is therefore often the case that the
	    information required to make the document meaningful in
	    XML simply doesn't exist in these formats. The only
	    alternative for this class of documents is to have them
	    rekeyed or scanned into XML by one of the many companies
	    in the Indian subcontinent or the Pacific Rim.</para>
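	    <para>For the simple XHTML case mentioned above, an
	    import transformation might start out something like the
	    sketch below. The target element names
	    (<wordasword>article</wordasword>,
	    <wordasword>para</wordasword>,
	    <wordasword>emphasis</wordasword>) are assumptions chosen
	    for illustration; elements with no template of their own
	    fall through to XSLT's built-in rules, which keep their
	    text but drop their markup:</para>
	    <programlisting><![CDATA[ 
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:h="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="h" version="1.0">

  <xsl:output method="xml" indent="yes"/>

  <!-- the XHTML root becomes the (hypothetical) article element -->
  <xsl:template match="/h:html">
    <article>
      <title><xsl:value-of select="h:head/h:title"/></title>
      <xsl:apply-templates select="h:body/*"/>
    </article>
  </xsl:template>

  <!-- paragraphs and emphasis map one-to-one -->
  <xsl:template match="h:p">
    <para><xsl:apply-templates/></para>
  </xsl:template>

  <xsl:template match="h:em">
    <emphasis><xsl:apply-templates/></emphasis>
  </xsl:template>

</xsl:stylesheet>
	    ]]></programlisting> 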
	    </glossdef> </glossentry> <glossentry id="htmlfunc">
	    <glossterm remap="checkboxes radiobuttons textareas">Text
	    document formatting functions</glossterm> <glossdef> 
	    <para>Because XML is a metalanguage to let you define and
	    name your <emphasis>own</emphasis> information structures,
	    it has no built-in knowledge of anything to start with. It
	    therefore has no inherent understanding of any document
	    specifics like bulleted lists, sections, footnotes, or any
	    of the common online features like drop-down menus, forms
	    (inputs, check boxes, radio buttons, and text areas),
	    scripts, mouseovers, or other bells and
	    whistles&mdash;these are things which
	    <emphasis>you</emphasis> have to use XML to define, in a
	    DTD or Schema for your specific application. Contrary to
	    the impression given by some manufacturers these things
	    are <emphasis>not</emphasis> built into XML itself. You
	    first choose or design a document type (Schema or DTD) to
	    represent your information accurately, then you can
	    generate effects like the above by using CSS styling, or
	    writing an XSL[T] transformation of your XML to HTML,
	    Word, <LaTeX/>, PDF, or whatever other format is capable
	    of instantiating them.</para> 
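	    <para>As a hedged sketch of the second route, suppose
	    your own document type has a
	    <wordasword>checklist</wordasword> element containing
	    <wordasword>task</wordasword> elements (both names
	    invented for this example). XML knows nothing about how
	    they should look; it is a stylesheet such as the
	    following that turns them into an HTML bulleted
	    list:</para>
	    <programlisting><![CDATA[ 
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="html"/>

  <!-- the list container becomes an HTML unordered list -->
  <xsl:template match="checklist">
    <ul><xsl:apply-templates/></ul>
  </xsl:template>

  <!-- each task becomes a list item -->
  <xsl:template match="task">
    <li><xsl:apply-templates/></li>
  </xsl:template>

</xsl:stylesheet>
	    ]]></programlisting> 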
	    <para>There <emphasis>are</emphasis> additional native-XML
	    proposals and recommendations at the W3C for XML Forms
	    handling, XML Linking, XML Security, and a lot of other
	    features, but these are architectural enabling mechanisms,
	    not drop-in replacements for HTML.</para> </glossdef>
	    </glossentry> </glosslist>
	    </answer> </qandaentry> <qandaentry id="oldsoft"
	    revisionflag="added"> <question> <formalpara> <title>Lost
	    XML software</title> 
	  <para>Some of the best software that has disappeared</para>
	    </formalpara> </question> <answer remap="lost old good
	    software obsolete former early"> 
	<para>The most common cause of lost good software seems to be
	    that the company making it got taken over through no fault
	    of their own, by a corporate shark who didn't know what
	    they were buying, or who simply didn't care. In these
	    cases it wasn't the product that was at fault&mdash;often
	    it was popular and selling well; it just fell foul of
	    corporate stupidity.</para> <variablelist> <varlistentry>
	    <term><productname>Near&ampers;Far</productname>
	    (MicroStar)</term> <listitem> 
	      <para>A standalone visual (graphical) SGML DTD design
	    tool, originally for Microsoft <productname>Windows
	    95</productname>. N&ampers;F made it very easy to prototype a
	    new document type, although later stages of development
	    were usually hand-tuned. It was also an excellent tool for
	    displaying the structure of a newly-encountered DTD. When
	    XML arrived, they kept the internal SGML model but
	    provided a <wordasword>save-as</wordasword> in XML
	    syntax.</para> 
	      <para>Many current design tools have similar embedded
	    functionality (eg <productname>XML Spy</productname>), but
	    there is no equivalent standalone tool of the same
	    quality. A development to use
	    <productname>RelaxNG</productname> to generate different
	    syntaxes would be a major advance.</para> 
	      <para>MicroStar was bought by OpenText Corp and the
	    product was dropped on the floor just at the point when it
	    would have been most useful. If you have a copy (one was
	    embedded in the WordPerfect SGML/XML editor), it still
	    executes under XP, and in Codeweavers'
	    <productname>Wine</productname> under Linux.</para>
	    </listitem> </varlistentry> <varlistentry>
	    <term><productname>DynaWeb</productname> (EBT)</term>
	    <listitem> 
	      <para>A family of products:
	    <productname>DynaBase</productname>, the underlying SGML
	    database; <productname>DynaWeb</productname>, a Windows
	    server with a graphically-managed stylesheet system for
	    serving XML or SGML converted to HTML, and an excellent
	    markup search facility; and
	    <productname>DynaTag</productname>, a GUI system for
	    converting <productname>Word</productname> and
	    <productname>Frame</productname> documents to SGML or XML,
	    based on the original
	    <productname>RainbowMaker</productname> commandline
	    converter.</para> 
	      <para>EBT was bought up by Inso Corp, and the product
	    was ignored for several years. A page on Inso's
	    server now claims to provide details, but it is not known
	    if the product is still available. It appears that they
	    inherited some users, so for a while they still had a
	    <productname>DynaWeb</productname> training page.</para> 
	      <para>The good news is that Red Bridge Software now
	    occupies the old EBT factory (under the Red Bridge in
	    Providence, RI), selling a content management system that
	    includes <productname>DynaTag</productname> and some other
	    elements of the original range.</para> </listitem>
	    </varlistentry> <varlistentry id="panorama">
	    <term><productname>Panorama</productname>
	    (SoftQuad)</term> <listitem> 
	      <para>An SGML browser from <ulink
	    url="http://www.users.cloud9.net/~bradmcc/panorama-1.html">SoftQuad</ulink>
	    with an SGML-syntax stylesheet which worked both
	    standalone and as a Netscape plugin, based on Synex
	    <productname>Viewport</productname>. This let users open
	    direct links to SGML documents:
	    <productname>Panorama</productname> would download both
	    instance and DTD via an entity resolver, perform a
	    tokenised parse, and apply the specified
	    stylesheet.</para> 
	      <para>Its unique features included switching between
	    multiple stylesheets, a search result density indicator,
	    and the ability to implement double-ended HyTime links,
	    which let anyone publish their own set of links, even
	    multi-ended links, and even between documents that they
	    didn't own. The browser plugin was free, and the full
	    version included the stylesheet editor.</para> 
	      <para>SoftQuad faltered after Yuri Rubinsky passed away,
	    and was taken over by Corel
	    (<productname>WordPerfect</productname>), where the
	    product was ignored.</para> <note> 
		<para>SoftQuad's
	    <productname>Author/Editor</productname> SGML editor
	    product transmuted into <productname>XMeTaL</productname>,
	    which is still available from <ulink
	    url="http://na.justsystems.com/">JustSystems</ulink>.</para>
	    </note> </listitem> </varlistentry> </variablelist> 
	<para>If you have more information about useful products that
	    have disappeared, please email the editor.</para> 
      </answer> 
    </qandaentry> 
  </qandadiv>
</qandaset>

