Understanding XML → TypeScript
XML has three things JSON doesn't.
Attributes, text content, and repeating-element-as-array semantics — every XML-to-TS converter has to pick a representation for each.
XML's three peculiarities.
XML elements carry three things JSON keys don't. <book id="7">…</book> — the id is an attribute, not a child. <title>Hi</title> — the "Hi" is the element's text content, distinct from its children. <book/> <book/> <book/> — three siblings with the same name form an array, not three keys. A faithful XML-to-TS converter must encode each.
Attributes — three common encodings.
Three conventions are common: a $ key (all attributes go into a $ sub-object), a prefix (each attribute becomes its own key, such as @attrName), or merged (attributes and children sit on the same object, with attribute names prefixed with _ to avoid collisions). The xml2js library uses $ by default; the fast-xml-parser library prefixes attributes with @_ by default, although it only emits attributes at all when ignoreAttributes is set to false. Codegen tools pick one, and the resulting TypeScript types reflect it.
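To make the difference concrete, here is a sketch of how the same element, <book id="7"><title>Hi</title></book>, could be typed under each convention. The interface names are made up for illustration, and the shapes are simplified; what a real parser returns depends on its options.

```typescript
// 1. "$" sub-object: all attributes grouped under a single key.
interface BookWithDollar {
  $: { id: string };
  title: string;
}

// 2. Prefixed keys ("@_" here): one key per attribute, next to the children.
interface BookWithPrefix {
  "@_id": string;
  title: string;
}

// 3. Merged with a "_" prefix: attributes look almost like children.
interface BookMerged {
  _id: string;
  title: string;
}

const dollarBook: BookWithDollar = { $: { id: "7" }, title: "Hi" };
const prefixBook: BookWithPrefix = { "@_id": "7", title: "Hi" };
const mergedBook: BookMerged = { _id: "7", title: "Hi" };

// The same attribute, read three different ways.
const ids: string[] = [dollarBook.$.id, prefixBook["@_id"], mergedBook._id];
```

The prefixed form needs bracket access because @_id is not a valid identifier; the other two read with plain dot access.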
Text content — the #text problem.
A leaf element like <title>Hi</title> can become a plain string. An element like <title lang="en">Hi</title> can't, because it has both attributes and text. The usual convention is a special text key that holds the content, with the attributes living alongside: fast-xml-parser names it #text by default, while xml2js uses _. Pure-leaf elements often get flattened to strings; elements with attributes preserve the structure. Codegen tools have to pick a uniform convention or accept type complexity.
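One way to live with both shapes is a union type plus a small accessor. The sketch below assumes the #text key and an @_ attribute prefix; TextWithAttrs, TextNode and textOf are hypothetical names, not part of any library.

```typescript
// An element that carries attributes keeps its text under "#text".
interface TextWithAttrs {
  "#text": string;
  [attr: string]: string; // attributes such as "@_lang"
}

// A text-bearing element is either flattened to a string or kept as an object.
type TextNode = string | TextWithAttrs;

// Read the text content regardless of which shape the converter produced.
function textOf(node: TextNode): string {
  return typeof node === "string" ? node : node["#text"];
}

const plainTitle: TextNode = "Hi"; // <title>Hi</title>
const titleWithLang: TextNode = { "#text": "Hi", "@_lang": "en" }; // <title lang="en">Hi</title>
```

Callers then use textOf(title) everywhere and never branch on the shape themselves.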
Repeating elements — array detection.
In a sample document with a single <book>, is the parent's book field a single object or an array of one? The parser doesn't know until it sees a second sibling. Some libraries always emit arrays; others emit single objects unless a sibling exists; others let you configure "always-array names". Each choice is wrong for some real document. Codegen tools that emit Book | Book[] for the field are pessimistic but safe; ones that guess wrong silently produce broken types.
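When the generated type is Book | Book[], consumers usually normalise it once at the boundary. A minimal sketch, with OneOrMany and asArray as illustrative names:

```typescript
// The pessimistic type a converter may emit for a possibly repeating element.
type OneOrMany<T> = T | T[];

// Always hand back an array: [] for a missing element, [x] for a single one.
function asArray<T>(value: OneOrMany<T> | undefined): T[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

interface Book {
  title: string;
}

// A parent whose book field may be absent, single, or repeated.
interface Catalog {
  book?: OneOrMany<Book>;
}

const single: Catalog = { book: { title: "Hi" } };
const many: Catalog = { book: [{ title: "Hi" }, { title: "There" }] };
const empty: Catalog = {};

const counts = [single, many, empty].map((c) => asArray(c.book).length);
```

This keeps the uncertainty in one place; the rest of the code can iterate without caring how many siblings the document happened to contain.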
A worked example.
From

<catalog>
  <book id="7"><title>Hi</title></book>
  <book id="8"><title>There</title></book>
</catalog>

a reasonable inferrer emits:

interface Catalog {
  book: Book[];
}
interface Book {
  $: { id: string };
  title: string;
}

The repeating book element became an array; the id attribute moved into the $ sub-object; the leaf title became a plain string.
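Consuming those types might look like the sketch below. The catalog value is written by hand as a stand-in for parser output; it assumes a parser configured to produce exactly this encoding, which real libraries may only do with non-default options.

```typescript
interface Catalog {
  book: Book[];
}
interface Book {
  $: { id: string };
  title: string;
}

// Hand-written stand-in for the parsed document.
const catalog: Catalog = {
  book: [
    { $: { id: "7" }, title: "Hi" },
    { $: { id: "8" }, title: "There" },
  ],
};

// Attributes are read through $, children directly.
const summary: string[] = catalog.book.map((b) => `${b.$.id}: ${b.title}`);
```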
Figure: Attributes vs children. In <book id="7"><title>Hi</title></book>, attributes go into $ and children become typed properties, so { $: { id }, title } is the Book interface.
Namespaces.
XML namespaces (xmlns:foo="http://..." with <foo:book> elements) are mostly ignored by JSON-style converters. The prefixed element name is kept as a key with a colon in it, which is not a valid TypeScript identifier, so the consumer needs bracket access (obj["foo:book"]) or a renamed key. Real-world SOAP-flavoured XML often uses heavy namespacing; converting it to clean TypeScript types is a significant additional step.
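Both workarounds can be sketched in a few lines. NamespacedCatalog and stripPrefixes are hypothetical names; the helper assumes that dropping prefixes never makes two keys collide and that attribute keys do not themselves contain a colon, which real documents may not guarantee.

```typescript
// The prefix survives as part of the key, so the key must be quoted.
interface NamespacedCatalog {
  "foo:book": { "foo:title": string };
}

const raw: NamespacedCatalog = { "foo:book": { "foo:title": "Hi" } };

// Option 1: bracket access in the consumer.
const viaBrackets: string = raw["foo:book"]["foo:title"];

// Option 2: recursively rename keys by dropping everything up to the first colon.
function stripPrefixes(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(stripPrefixes);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      const colon = key.indexOf(":");
      const local = colon === -1 ? key : key.slice(colon + 1);
      out[local] = stripPrefixes(source[key]);
    }
    return out;
  }
  return value; // strings, numbers, null, etc. pass through unchanged
}

// The assertion states the shape we expect after renaming; nothing checks it at runtime.
const clean = stripPrefixes(raw) as { book: { title: string } };
```

Renaming gives cleaner types but throws away the namespace information, so it only suits documents where the local names are unambiguous.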
Why this conversion is hard to automate.
An XML document carries kinds of information that a JSON-style object has no natural slot for. Round-tripping XML → TypeScript → XML loses comments, processing instructions, insignificant whitespace, and the element-vs-attribute distinction unless the encoding preserves them deliberately. For one-way "I just need to read this XML in TypeScript" use cases, the converters work well. For "I want to round-trip XML through a typed system", the right tool is usually XML Schema-aware code generation (xsd-to-ts), not sample-based inference.
When to skip and use a runtime parser.
If the XML is heterogeneous, deeply nested, or carries non-trivial semantics, a typed wrapper is more pain than it's worth. Parse with xml2js or fast-xml-parser into a plain object; access fields as needed with type assertions. For XML you control end-to-end, generated types are useful; for XML you receive from a third party, expect to fight the converter once.
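The untyped route can stay fairly safe with a small path helper. In the sketch below, parsed is a hand-written stand-in for whatever the runtime parser returned, and dig is a hypothetical helper, not a library function.

```typescript
// Stand-in for parser output; in real code this comes from the XML parser.
const parsed: unknown = {
  catalog: { book: [{ $: { id: "7" }, title: "Hi" }] },
};

// Walk a path of keys and indexes, returning undefined as soon as a step is missing.
function dig(root: unknown, ...path: (string | number)[]): unknown {
  let current: unknown = root;
  for (const key of path) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string | number, unknown>)[key];
  }
  return current;
}

// The type assertion happens once, at the point of use.
const firstTitle = dig(parsed, "catalog", "book", 0, "title") as string | undefined;
const missing = dig(parsed, "catalog", "author", "name") as string | undefined;
```

The assertion is still a promise the compiler cannot check, so it is worth validating any value that the program's correctness depends on.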