The XML FAQ — Frequently-Asked Questions about the Extensible Markup Language

Section 4: Developers

Q 4.2: I'm trying to understand the XML Spec: why does it have such difficult terminology?

It has to be formal to be accurate.

For implementation to succeed, the terminology needs to be precise. Design goal eight of the specification tells us that ‘the design of XML shall be formal and concise’. To describe XML, the specification therefore uses formal language drawn from several fields, specifically document engineering, international standards, and computer science. This is often confusing to people who are unused to these disciplines, because the disciplines use well-known English words in a specialised sense which can be very different from their common meanings: for example, grammar, production, token, or terminal.
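As a rough illustration of how some of these words are used, here is a minimal sketch in Python built around two productions from the XML 1.0 specification: [3] S ::= (#x20 | #x9 | #xD | #xA)+ and [25] Eq ::= S? '=' S?. In this sense a grammar is the whole set of such rules, a production is one rule, and a terminal is a literal character (such as '=') that cannot be expanded any further. The function names match_s and match_eq and the constant S_TERMINALS are invented for this example; they are not part of the specification or of any parser.

```python
# Terminals of production S: the four literal white-space characters
# #x20 (space), #x9 (tab), #xD (carriage return), #xA (line feed).
S_TERMINALS = "\x20\x09\x0D\x0A"


def match_s(text, pos):
    """S ::= (#x20 | #x9 | #xD | #xA)+

    Return the position after one or more white-space characters,
    or None if there is no white space at pos.
    """
    end = pos
    while end < len(text) and text[end] in S_TERMINALS:
        end += 1
    return end if end > pos else None


def match_eq(text, pos):
    """Eq ::= S? '=' S?

    Return the position after an equals sign with optional white space
    on either side, or None if the text at pos does not match.
    """
    after = match_s(text, pos)  # S? : optional leading white space
    if after is not None:
        pos = after
    if pos >= len(text) or text[pos] != "=":  # '=' is a terminal
        return None
    pos += 1
    after = match_s(text, pos)  # S? : optional trailing white space
    return pos if after is None else after


if __name__ == "__main__":
    # Try the Eq production just after the word "version".
    print(match_eq("version = '1.0'", 7))
    print(match_eq("version'1.0'", 7))
```

Each function corresponds to one production, and the docstring quotes the rule it implements. A real parser would be organised differently and would report errors rather than return None; the point here is only to show how the terminology maps onto something concrete.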

The specification does not explain these terms because of the other part of the design goal: the specification should be concise. It doesn't repeat explanations that are available elsewhere: it assumes that you either know the definitions or are capable of finding them. In essence this means that to grok the fullness of the spec, you do need some knowledge of SGML and computer science, and some exposure to the language of formal standards.

Sloppy terminology in specifications causes misunderstandings and makes them hard to implement consistently, so formal standards have to be phrased in formal terminology. This FAQ is not a formal document, and the astute reader will already have noticed it refers to ‘element names’ where ‘element type names’ is more correct; but the former is more widely understood.
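The distinction can be seen in a small sketch using Python's standard xml.etree.ElementTree module. The sample document is made up for the purpose: it contains three elements but only two element types, because both item elements share one element type name. (ElementTree itself calls this name the tag, which is another example of informal usage.)

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = "<list><item>one</item><item>two</item></list>"

root = ET.fromstring(SAMPLE)

# Every element in the document, including the root element itself.
elements = list(root.iter())

# The distinct element type names; two item elements share one type.
element_type_names = sorted({element.tag for element in elements})

print(len(elements), element_type_names)
```

Strictly speaking, the name belongs to the element type, and each element is an instance of that type.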

Those new to the terminology may find it useful to read something like the TEI P4: Guidelines for Electronic Text Encoding and Interchange (Sperberg-McQueen and Burnard, 2002) or XML: The Annotated Specification (DuCharme, 1999).