Now I understand: fpspreadsheet must remove the special codes, and this works. But you want to keep them, and this does not work.
Of course, you can pass the extracted string to the function UTF8TextToXMLText in unit fpsxmlcommon. It simply replaces line breaks and other special characters with their XML equivalents (set "ProcessLineEndings" to true to have #10 replaced by '&#10;').
Kind of cumbersome though: First the xml reader removes them, and UTF8TextToXMLText brings them back in... It would be better to force the xml reader to keep them in the first place. I don't know, however, how to do this.
But what exactly do you want to achieve? Maybe laz2_dom and laz2_xmlread are not the correct units for your purpose.
The strange output of the RebuildChildNodes procedure comes from the fact that you do not initialize the string parameter (s) before passing it in. RebuildChildNodes is recursive and always appends the node name, node attributes, and node content to this string, which therefore gets longer with every recursion level. In the code below, s still holds the ss:Type value from the preceding GetAttrValue call, so that text ends up at the start of the rebuilt string. You simply must set s := '' before you call RebuildChildNodes:
if nodeName = 'Data' then
begin
  s := GetAttrValue(data_node, 'ss:Type');
  if (s = 'String') or (s = 'Number') then
  begin
    WriteLN(Format('GetNodeValue(data_node): "%s"', [GetNodeValue(data_node)]));
    WriteLN(Format('data_node.TextContent: "%s"', [data_node.TextContent]));
    s := '';   // <--------------------- ADDED -----------------<
    RebuildChildNodes(data_node, s);
    WriteLN('After rebuild: '+s);
    WriteLN(Format('GetNodeValue(data_node): "%s"', [GetNodeValue(data_node)]));
    WriteLN(Format('data_node.TextContent: "%s"', [data_node.TextContent]));
    ReadLN;
  end
  else
    WriteLN('');
end;
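
To see the effect in isolation, here is a minimal sketch that does not use laz2_dom at all. TNode and Collect are made-up stand-ins for a DOM node and for RebuildChildNodes, and I am assuming that your procedure takes the string as a var parameter and appends to it, as described above.

```pascal
program AccumulateDemo;
{$mode objfpc}{$H+}

type
  // Hypothetical stand-in for a DOM node; this is NOT the laz2_dom type.
  PNode = ^TNode;
  TNode = record
    Name: string;
    Child: PNode;   // single child, nil if there is none
  end;

// Appends the node names to s and recurses into the child.
// Assumed to behave like RebuildChildNodes: it never clears s itself.
procedure Collect(ANode: PNode; var s: string);
begin
  if ANode = nil then
    exit;
  s := s + '<' + ANode^.Name + '>';
  Collect(ANode^.Child, s);
end;

var
  leaf, root: TNode;
  s: string;
begin
  leaf.Name := 'B';
  leaf.Child := nil;
  root.Name := 'Data';
  root.Child := @leaf;

  s := 'String';       // leftover value, like the ss:Type attribute
  Collect(@root, s);
  WriteLn(s);          // prints: String<Data><B>

  s := '';             // reset before the call
  Collect(@root, s);
  WriteLn(s);          // prints: <Data><B>
end.
```

The first call drags the old content of s along; the second one starts from an empty string and returns only what the recursion collected.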