The XML FAQ — Frequently-Asked Questions about the Extensible Markup Language
Section 1: Basics
Q 1.6: What is iXML?
Invisible XML is a pattern language for describing the implicit structure of data without explicit markup, and a set of technologies for making that structure explicit as XML. See https://invisiblexml.org.
This allows you to write a declarative description of the format or layout of some text or data which has a structure, and then use that description to transform the text or data into structured information in XML. There are two short examples below.
A Final Community Group Report by the W3C was published on 12 December, 2023 at https://www.w3.org/community/reports/ixml/CG-FINAL-ixml-20231212/. Development continues on the public-ixml@w3.org mailing list, with periodic virtual meetings with published minutes. The first iXML (online) conference is being held in early 2026.
Example 1: ‘Keyword=Value’ configuration format
The popular keyword="value" format commonly used in configuration files, e.g.
version="9"
logging="no"
could be described in iXML with the following statements:
config: rule+.
rule: keyword,-" "*,-"=",-" "*,value,-#a.
keyword: ["a"-"z"]+.
value: -'"',~['"';#a]+,-'"'.
These four lines say that
a config is made up of one or more (that's the plus sign) rules;
a rule is made up of a keyword followed by zero or more (that's the asterisk) spaces (the minus means we don't need to keep them), then an equals sign (also discardable), possibly more discardable spaces, and then a value and the end of a line (#a, also discarded);
a keyword is made up of one or more characters (here, letters) in the range a–z;
a value is made up of a discardable double-quote ("), one or more characters: anything except (that's the ~) another double-quote or a line-end, followed by a discardable double-quote.
That’s all an iXML processor such as coffeepot needs to work out the XML representation below (there is a list of processors on the Invisible XML web site):
<config>
<rule>
<keyword>version</keyword>
<value>9</value>
</rule>
<rule>
<keyword>logging</keyword>
<value>no</value>
</rule>
</config>
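For readers who think more readily in regular expressions, the four rules above can be approximated in Python. This is only an illustrative sketch, not how an iXML processor works internally: the function name config_to_xml and the pattern RULE are made up for this example, and the regular expression is a hand translation of the grammar rather than anything generated from it.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of the "rule" line of the grammar (an assumption, not iXML):
# a lowercase keyword, optional spaces around "=", then a quoted value that
# contains neither a double-quote nor a line-end.
RULE = re.compile(r'([a-z]+) *= *"([^"\n]+)"')

def config_to_xml(text: str) -> str:
    """Build the <config> document that the grammar describes."""
    config = ET.Element("config")
    for line in text.splitlines():
        match = RULE.fullmatch(line)
        if match is None:
            raise ValueError(f"line does not match the rule pattern: {line!r}")
        rule = ET.SubElement(config, "rule")
        ET.SubElement(rule, "keyword").text = match.group(1)
        ET.SubElement(rule, "value").text = match.group(2)
    if len(config) == 0:
        # The grammar says rule+, so at least one rule is required.
        raise ValueError("a config needs at least one rule")
    return ET.tostring(config, encoding="unicode")

print(config_to_xml('version="9"\nlogging="no"\n'))
```

The point of the comparison is that the iXML version states the structure once, declaratively, and the element names come from the rule names; in the Python version the pattern and the element-building code have to be kept in step by hand.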
The iXML language is much broader than this, and can be applied to any data or text in which the structure can be deduced from combinations of position, sequence, repetition, or spacing provided that they are consistent.
Example 2: File directory listing
A more complex example is to get a standard directory listing of your files into XML: here is the output of the ls -l command:
total 19168
drwxrwxr-x 5 peter peter 4096 Apr 26 2025 assets
-rw-rw-r-- 1 peter peter 10509 Dec 17 14:21 authors.html
-rw-rw-r-- 1 peter peter 8634 Dec 17 14:21 basics.html
-rw-rw-r-- 1 peter peter 10036 Dec 17 14:21 bibliography.html
drwxrwxr-x 3 peter peter 4096 Nov 17 22:13 cgi-bin
-rw-rw-r-- 1 peter peter 19138 Dec 17 14:21 conversion.html
drwxrwxr-x 4 peter peter 4096 Sep 17 22:57 css
-rw-rw-r-- 1 peter peter 399779 Nov 28 22:49 faq.xml
-rw-rw-r-- 1 peter peter 15086 Apr 26 2025 favicon.ico
-rw-rw-r-- 1 peter peter 44950 Dec 17 14:21 index.html
-rw-rw-r-- 1 peter peter 1633 Nov 28 22:53 Makefile
drwxrwxr-x 9 peter peter 36864 Dec 8 22:51 recipes
drwxrwxr-x 2 peter peter 4096 Apr 26 2025 sonnet18
-rw-rw-r-- 1 peter peter 134904 Oct 20 22:03 xmlwebsite.xsl
Here is one way to get this into an XML format that could be used in document processes, parts of pipelines, lookups, in fact anywhere you need XML access to a list of files. The output format could be any design you want: here I have condensed a lot of the categorical data into attributes (listing shortened):
<filelist space='19168'>
<file type='directory' links='5' owner='peter' group='peter' size='4096' month='04' day='26' year='2025'>
<permissions user='read/write/exec' group='read/write/exec' other='read/nowrite/exec'/>
<name>assets</name>
</file>
<file type='normal' links='1' owner='peter' group='peter' size='10509' month='12' day='17' time='14:21'>
<permissions user='read/write/noexec' group='read/write/noexec' other='read/nowrite/noexec'/>
<name>authors.html</name>
</file>
<file type='normal' links='1' owner='peter' group='peter' size='8634' month='12' day='17' time='14:21'>
<permissions user='read/write/noexec' group='read/write/noexec' other='read/nowrite/noexec'/>
<name>basics.html</name>
</file>
<file type='normal' links='1' owner='peter' group='peter' size='10036' month='12' day='17' time='14:21'>
<permissions user='read/write/noexec' group='read/write/noexec' other='read/nowrite/noexec'/>
<name>bibliography.html</name>
</file>
<file type='directory' links='3' owner='peter' group='peter' size='4096' month='11' day='17' time='22:13'>
<permissions user='read/write/exec' group='read/write/exec' other='read/nowrite/exec'/>
<name>cgi-bin</name>
</file>
<file type='normal' links='1' owner='peter' group='peter' size='19138' month='12' day='17' time='14:21'>
<permissions user='read/write/noexec' group='read/write/noexec' other='read/nowrite/noexec'/>
<name>conversion.html</name>
</file>
<file type='directory' links='4' owner='peter' group='peter' size='4096' month='09' day='17' time='22:57'>
<permissions user='read/write/exec' group='read/write/exec' other='read/nowrite/exec'/>
<name>css</name>
</file>
<file type='normal' links='1' owner='peter' group='peter' size='399779' month='11' day='28' time='22:49'>
<permissions user='read/write/noexec' group='read/write/noexec' other='read/nowrite/noexec'/>
<name>faq.xml</name>
</file>
<file type='normal' links='1' owner='peter' group='peter' size='15086' month='04' day='26' year='2025'>
<permissions user='read/write/noexec' group='read/write/noexec' other='read/nowrite/noexec'/>
<name>favicon.ico</name>
</file>
<file type='normal' links='1' owner='peter' group='peter' size='44950' month='12' day='17' time='14:21'>
<permissions user='read/write/noexec' group='read/write/noexec' other='read/nowrite/noexec'/>
<name>index.html</name>
</file>
<file type='normal' links='1' owner='peter' group='peter' size='1633' month='11' day='28' time='22:53'>
<permissions user='read/write/noexec' group='read/write/noexec' other='read/nowrite/noexec'/>
<name>Makefile</name>
</file>
<file type='directory' links='9' owner='peter' group='peter' size='36864' month='12' day='08' time='22:51'>
<permissions user='read/write/exec' group='read/write/exec' other='read/nowrite/exec'/>
<name>recipes</name>
</file>
<file type='directory' links='2' owner='peter' group='peter' size='4096' month='04' day='26' year='2025'>
<permissions user='read/write/exec' group='read/write/exec' other='read/nowrite/exec'/>
<name>sonnet18</name>
</file>
<file type='normal' links='1' owner='peter' group='peter' size='134904' month='10' day='20' time='22:03'>
<permissions user='read/write/noexec' group='read/write/noexec' other='read/nowrite/noexec'/>
<name>xmlwebsite.xsl</name>
</file>
</filelist>
To do this, the following iXML describes how the data is arranged, and which bits are needed and where.
Define a file list as an optional summary of how much space is used (question mark means optional), and then a list of zero or more files (asterisk means zero or more).
The space-used field has a label made of one or more letters (the Unicode group [L] and the + sign), some white-space, a number, and a line-end. The - prefix means the value gets discarded after having been matched: we want to find this line, but we're only interested in the number, not the label in English, ‘total’, nor the line-end character.
White-space is defined as one or more discardable space characters; and a number is defined as one or more of the characters from 0–9.
filelist: space?,file*.
@space: -[L]+,s,n,-#a.
-s: -" "+.
n: ["0"-"9"]+.
Define a file as the items it is made up of, followed by a (discardable) line-end. In a standard Linux file listing, recent files show the hour and minute, whereas older files show the year, so these two are alternatives (semicolon).
file: type, permissions, links, owner, group, size, month, day, (year;time), name, -#a.
The file type is a d for a directory, or a - for a normal file, but there are many other types which we will provide for even though they are less common. Again, prefixing the matched value with - makes it discardable, but the + provides a replacement value to put in the output. The @ means this value will be output as an attribute, not an element.
@type: (-"-",+"normal";
        -"b",+"block special";
        -"c",+"character special";
        -"d",+"directory";
        -"l",+"symbolic link";
        -"p",+"pipe";
        -"n",+"network";
        -"s",+"socket").
File permissions are three 3-character blocks defining what can be done with the file by the user, other users in their group, and any other user on the system (followed by white-space). Again, we use the - and + to substitute readable values for the letters.
permissions: user,group,other,s.
@user: (-"r",+"read";-"-",+"noread"),
       (-"w",+"/write/";-"-",+"/nowrite/"),
       (-"x",+"exec";-"-",+"noexec";
        -"s",+"setid";-"t",+"sticky").
@group: (-"r",+"read";-"-",+"noread"),
        (-"w",+"/write/";-"-",+"/nowrite/"),
        (-"x",+"exec";-"-",+"noexec";
         -"s",+"setid";-"t",+"sticky").
@other: (-"r",+"read";-"-",+"noread"),
        (-"w",+"/write/";-"-",+"/nowrite/"),
        (-"x",+"exec";-"-",+"noexec";
         -"s",+"setid";-"t",+"sticky").
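The effect of the - and + substitutions in the permissions rules can be pictured as three small lookup tables applied position by position. The following Python sketch is an illustration only (the names READ, WRITE, EXEC, decode_block, and decode_permissions are invented for it); the replacement strings are copied from the grammar fragment above.

```python
# Replacement text for each position of a three-character block,
# copied from the alternatives in the iXML permissions rules.
READ = {"r": "read", "-": "noread"}
WRITE = {"w": "/write/", "-": "/nowrite/"}
EXEC = {"x": "exec", "-": "noexec", "s": "setid", "t": "sticky"}

def decode_block(block: str) -> str:
    """Turn one block such as 'r-x' into 'read/nowrite/exec'."""
    if len(block) != 3:
        raise ValueError(f"expected three characters, got {block!r}")
    r, w, x = block
    # An unknown letter raises KeyError, much as the grammar would fail to match.
    return READ[r] + WRITE[w] + EXEC[x]

def decode_permissions(perms: str) -> dict[str, str]:
    """Split nine permission characters into user, group, and other."""
    if len(perms) != 9:
        raise ValueError(f"expected nine characters, got {perms!r}")
    return {
        "user": decode_block(perms[0:3]),
        "group": decode_block(perms[3:6]),
        "other": decode_block(perms[6:9]),
    }

print(decode_permissions("rwxrwxr-x"))
# {'user': 'read/write/exec', 'group': 'read/write/exec', 'other': 'read/nowrite/exec'}
```

In iXML the same mapping needs no separate tables or functions: each alternative both recognises the input character and names its replacement.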
The number of hard links to this file is just a number. The owner and the group are strings of any character except a space (the tilde is the negation). The size, like the links, is just a number. All these values are marked as attributes.
@links: n,s.
@owner: id.
@group: id.
-id: ~[" "]+, s.
@size: n,s.
The month names get substituted by their numbers using the same technique as before: this is to enable subsequent processes to use ISO dates like 2025-09-05 without having to bother parsing month names. The day is either a single digit to which we prefix a zero, or it's a double digit. A year is a number; the time is two numbers separated by a colon.
@month: (-"Jan",+"01";-"Feb",+"02";-"Mar",+"03";
         -"Apr",+"04";-"May",+"05";-"Jun",+"06";
         -"Jul",+"07";-"Aug",+"08";-"Sep",+"09";
         -"Oct",+"10";-"Nov",+"11";-"Dec",+"12"),s.
@day: (+"0",d;d,d),s.
d: ["0"-"9"].
@year: n,s.
@time: n,":",n,s.
Finally, the filename is any sequence of characters except a line-end.
name: ~[#a]+.
Processing the output of the ls -l command with the catenation of the above fragments and an iXML processor will produce the XML version of the listing (some processors can also produce JSON, CSV, or other formats). A separate block of month names would be needed if your operating system is set to a language other than English. (Thanks to Steven Pemberton and Fredrik Öhrström for this example.)
Note that in iXML you cannot rearrange or reorder the data: it is processed strictly as it occurs. If it needs rearranging, use XSLT to process it further.
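As an illustration of such further processing, here is a minimal sketch of a downstream step that assembles ISO dates from the month, day, and year attributes. It is written in Python rather than XSLT purely for brevity. The function name iso_date and its default_year parameter are assumptions made for this example: entries that carry a time instead of a year do not say which year they belong to, so the caller has to supply one.

```python
import xml.etree.ElementTree as ET

def iso_date(file_elem: ET.Element, default_year: str) -> str:
    """Join the year, month, and day attributes into YYYY-MM-DD.

    Entries with a time but no year get default_year (an assumption
    supplied by the caller, not something the listing states).
    """
    year = file_elem.get("year", default_year)
    return f"{year}-{file_elem.get('month')}-{file_elem.get('day')}"

# Two entries in the shape of the output shown earlier (attributes trimmed).
sample = """<filelist space='19168'>
<file type='directory' month='04' day='26' year='2025'><name>assets</name></file>
<file type='normal' month='12' day='17' time='14:21'><name>authors.html</name></file>
</filelist>"""

root = ET.fromstring(sample)
for entry in root.iter("file"):
    # default_year="2025" is a guess for this sample listing only.
    print(entry.findtext("name"), iso_date(entry, default_year="2025"))
```

Because the month and day were already normalised to two digits by the grammar, this step is plain string assembly with no month-name parsing.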
