MATLAB: How to write special characters in XML files using “xmlwrite”

I want to write XML files containing special characters, such as accented characters, using the MATLAB function "xmlwrite". However, if I use a numeric character reference for such a character (such as "&#xE9;" to represent "é"), "xmlwrite" replaces the "&" with "&amp;".
How can I use "xmlwrite" so that the ampersand "&" is not replaced by "&amp;"?

Best Answer

Please note that a literal "&" character in XML content is not permitted by the XML Specification, so a standards-compliant XML encoder should always escape "&" characters. See the following excerpt from the XML Specification (<https://www.w3.org/TR/xml/#syntax>):

"The ampersand character (&) and the left angle bracket (<) must not appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they must be escaped using either numeric character references or the strings " &amp; " and " &lt; " respectively."
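The escaping described in this excerpt can be observed directly. The snippet below is a minimal sketch, not part of the original answer: it assumes that "xmlwrite" called with a single input returns the serialized XML as text, and the sample string 'R&D' and the variable name "xmlText" are made up for illustration.

```matlab
% Build a one-element document whose text contains a literal ampersand.
% ('R&D' is an illustrative sample string, not from the original question.)
docNode = com.mathworks.xml.XMLUtils.createDocument('test');
docNode.getDocumentElement.appendChild(docNode.createTextNode('R&D'));

% With a single input, xmlwrite returns the serialized XML as text
% instead of writing a file.
xmlText = xmlwrite(docNode);

% The DOM text node holds '&', but the serialized form carries '&amp;'.
disp(xmlText);
```

This is why a hand-written reference such as "&#xE9;" cannot survive: its leading "&" is ordinary text as far as the DOM is concerned, so the serializer escapes it like any other ampersand.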

Instead of writing the numeric character reference yourself, you can insert the actual Unicode character into the text node. For instance, the "é" character has the Unicode code point 233 (U+00E9). Please find below a simple example that shows how to write special characters to an XML file using Unicode characters.

>> docNode = com.mathworks.xml.XMLUtils.createDocument('test');
>> curr_node = docNode.createElement('testitem');
>> curr_node.appendChild(docNode.createTextNode(char(233))); % char(233) is U+00E9, the accented e, regardless of the system's default encoding
>> docNode.getDocumentElement.appendChild(curr_node);
>> xmlwrite('testAccent.xml',docNode);

The code above generates the file "testAccent.xml" with the following contents:

<?xml version="1.0" encoding="utf-8"?>
<test>
   <testitem>é</testitem>
</test>
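To check that the character was stored correctly, the file can be parsed back with "xmlread". The following is a minimal sketch under some assumptions: it rebuilds the same document so that it runs on its own, it uses char(233) to produce the character, and the variable names "item", "parsed", "txt" and "codePoint" are illustrative.

```matlab
% Write a document whose <testitem> contains U+00E9 (the accented e).
% char(233) is an assumption of this sketch for producing the character.
docNode = com.mathworks.xml.XMLUtils.createDocument('test');
item = docNode.createElement('testitem');
item.appendChild(docNode.createTextNode(char(233)));
docNode.getDocumentElement.appendChild(item);
xmlwrite('testAccent.xml', docNode);

% Parse the file back and extract the text of the first <testitem>.
parsed = xmlread('testAccent.xml');
txt = char(parsed.getElementsByTagName('testitem').item(0).getTextContent);

% Convert the text to its numeric code point(s) for inspection.
codePoint = double(txt);
disp(codePoint);
```

If the round trip worked, "codePoint" should hold the single value 233, which confirms that the character was written as itself rather than as an escaped reference.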