Ah yes, that's the one I wrote, but I can't believe it was back in 2016! I thought it was last year! There are one or two others, written by fellow users, that I recall reading over the last few weeks as well.
But the thread, whilst it has replies, still doesn't really solve the issue. The basic solution stated there is to effectively rename all such files as zips, then use a zip traversal procedure to find the inner "document.xml" file (in the case of MS Word), open that, and then use XML traversal. But I don't really want my program renaming files and then renaming them back again when it has finished. And even if I did, it seems like a bit of a "hack" to achieve what must be a fairly common need these days. Writing tools that can create and read Office files (Microsoft, LibreOffice etc.) must be a common requirement. And, as this post describes, one such library seems to exist (fpVectorial), but I just can't currently see how it is used to "get" the content that would be listed in "document.xml".