C.3 What does an XML document actually look like (inside)?
The basic structure of XML is similar to other applications of SGML, including HTML. The basic components can be seen in the following examples. An XML document starts with a Prolog:
The XML Declaration
<?xml version="1.0" encoding="utf-8"?>
which specifies that this is an XML document;
Optionally a Document Type Declaration
<!DOCTYPE report SYSTEM "http://sales.acme.corp/dtds/salesrep.dtd">
which identifies the type of document and says where the Document Type Definition (DTD) is stored;
The Prolog is followed by the document instance:
A root element, which is the outermost (top level) element (start-tag plus end-tag) which encloses everything else: in the examples below the root elements are conversation and titlepage;
A structured mix of descriptive or prescriptive elements enclosing the character data content (text), and optionally any attributes (‘name=value’ pairs) inside some start-tags.
XML documents can be very simple, with straightforward nested markup of your own design:
<?xml version="1.0" standalone="yes"?>
<conversation>
  <greeting>Hello, world!</greeting>
  <response>Stop the planet, I want to get off!</response>
</conversation>
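To see the structure a parser recovers from such a document, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The choice of Python and the names DOC, root and children are illustrative assumptions, not part of this FAQ. It reads the conversation example and lists the root element and the elements nested inside it.

```python
import xml.etree.ElementTree as ET

# The simple example document, held as bytes so the XML Declaration
# is the very first thing the parser sees.
DOC = b"""<?xml version="1.0" standalone="yes"?>
<conversation>
  <greeting>Hello, world!</greeting>
  <response>Stop the planet, I want to get off!</response>
</conversation>"""

# fromstring() returns the root element of the document instance.
root = ET.fromstring(DOC)

# Each child of the root is an element with a name (tag) and
# character data content (text).
children = [(child.tag, child.text) for child in root]

print("root element:", root.tag)
for tag, text in children:
    print(f"  {tag}: {text}")
```

Any conforming XML parser should report the same root element and the same two child elements; only the API for walking the tree differs between tools.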
Or they can be more complicated, with a Schema or Document Type Definition (DTD; see question C.11), optionally an internal subset (local DTD changes in [square brackets]), and an arbitrarily complex nested structure:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE titlepage
SYSTEM "http://www.foo.bar/dtds/typo.dtd"
[<!ENTITY % active.links "INCLUDE">]>
<titlepage id="BG12273624">
<white-space type="vertical" amount="36"/>
<title font="Baskerville" alignment="centered"
size="24/30">Hello, world!</title>
<white-space type="vertical" amount="12"/>
<!-- In some copies the following
decoration is hand-colored, presumably
by the author -->
<image location="http://www.foo.bar/fleuron.eps"
type="URI" alignment="centered"/>
<white-space type="vertical" amount="24"/>
<author font="Baskerville" size="18/22"
style="italic">Vitam capias</author>
<white-space type="vertical" role="filler"/>
</titlepage>
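As a rough illustration of how the parts of this example reach a program, the sketch below parses an abridged copy of the titlepage document with Python's standard xml.etree.ElementTree module. The abridgement and the names DOC, root, title and empty are assumptions made for this sketch. Note that this parser is non-validating: it reads the Document Type Declaration and internal subset but does not fetch the external DTD, and by default it discards comments.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the titlepage example, encoded to match the
# encoding named in its XML Declaration.
DOC = """<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE titlepage
  SYSTEM "http://www.foo.bar/dtds/typo.dtd"
  [<!ENTITY % active.links "INCLUDE">]>
<titlepage id="BG12273624">
  <white-space type="vertical" amount="36"/>
  <title font="Baskerville" alignment="centered"
         size="24/30">Hello, world!</title>
  <!-- In some copies the following
       decoration is hand-colored -->
  <image location="http://www.foo.bar/fleuron.eps"
         type="URI" alignment="centered"/>
  <author font="Baskerville" size="18/22"
          style="italic">Vitam capias</author>
</titlepage>""".encode("iso-8859-1")

root = ET.fromstring(DOC)

# Attributes are name=value pairs taken from the start-tag.
title = root.find("title")
print(root.tag, root.attrib)
print(title.text, title.attrib)

# Empty elements (written <name .../>) carry no text and no children.
empty = [el.tag for el in root if el.text is None and len(el) == 0]
print("empty elements:", empty)
```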
Or they can be anywhere in between: a lot will depend on how you want to define your document type (or whose you use) and what it will be used for. Database-generated or program-generated XML documents used in e-commerce are usually unformatted (not meant for human reading) and may use very long names or values, with a lot of redundancy and sometimes no character data content at all, just values in attributes:
<?xml version="1.0"?>
<ORDER-UPDATE AUTHMD5="4baf7d7cff5faa3ce67acf66ccda8248"
ORDER-UPDATE-ISSUE="193E22C2-EAF3-11D9-9736-CAFC705A30B3"
ORDER-UPDATE-DATE="2005-07-01T15:34:22.46"
ORDER-UPDATE-DESTINATION="6B197E02-EAF3-11D9-85D5-997710D9978F"
ORDER-UPDATE-ORDERNO="8316ADEA-EAF3-11D9-9955-D289ECBC99F3">
<ORDER-UPDATE-DELTA-MODIFICATION-DETAIL ORDER-UPDATE-ID="BAC352437484">
<ORDER-UPDATE-DELTA-MODIFICATION-VALUE ORDER-UPDATE-ITEM="56"
ORDER-UPDATE-QUANTITY="2000"/>
</ORDER-UPDATE-DELTA-MODIFICATION-DETAIL>
</ORDER-UPDATE>
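A program consuming a document like this reads everything from attributes. The following is a minimal sketch, again assuming Python's standard xml.etree.ElementTree module; the variable names are hypothetical and the document is the example above.

```python
import xml.etree.ElementTree as ET

DOC = b"""<?xml version="1.0"?>
<ORDER-UPDATE AUTHMD5="4baf7d7cff5faa3ce67acf66ccda8248"
  ORDER-UPDATE-ISSUE="193E22C2-EAF3-11D9-9736-CAFC705A30B3"
  ORDER-UPDATE-DATE="2005-07-01T15:34:22.46"
  ORDER-UPDATE-DESTINATION="6B197E02-EAF3-11D9-85D5-997710D9978F"
  ORDER-UPDATE-ORDERNO="8316ADEA-EAF3-11D9-9955-D289ECBC99F3">
  <ORDER-UPDATE-DELTA-MODIFICATION-DETAIL ORDER-UPDATE-ID="BAC352437484">
    <ORDER-UPDATE-DELTA-MODIFICATION-VALUE ORDER-UPDATE-ITEM="56"
      ORDER-UPDATE-QUANTITY="2000"/>
  </ORDER-UPDATE-DELTA-MODIFICATION-DETAIL>
</ORDER-UPDATE>"""

root = ET.fromstring(DOC)

# All of the data lives in attribute values, so look them up by name.
order_date = root.get("ORDER-UPDATE-DATE")
value = next(root.iter("ORDER-UPDATE-DELTA-MODIFICATION-VALUE"))
item = value.get("ORDER-UPDATE-ITEM")
quantity = int(value.get("ORDER-UPDATE-QUANTITY"))

# Confirm that no element holds character data other than whitespace.
has_text = any((el.text or "").strip() for el in root.iter())

print(order_date, item, quantity, has_text)
```

Attribute values always arrive as strings, so the quantity is converted explicitly; whether "56" should be treated as a number or an identifier is a decision for the application, not the parser.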