
Author Topic: XML what's wrong with my file?  (Read 3315 times)

mirce.vladimirov

  • Full Member
  • ***
  • Posts: 220
XML what's wrong with my file?
« on: September 14, 2015, 11:03:00 am »
XML editors open the attached XML file without problems, but my Lazarus program reports an error (see the attached picture).

rvk

  • Hero Member
  • *****
  • Posts: 4227
Re: XML what's wrong with my file?
« Reply #1 on: September 14, 2015, 11:06:35 am »
You have a BOM signature in your XML file (the first 3 bytes are 0xEF, 0xBB, 0xBF).

I don't think this is allowed for XML files (even when encoded as UTF-8). The header already declares the file as UTF-8, so there is no need for a BOM signature. Remove it and the file should read fine.

mirce.vladimirov

  • Full Member
  • ***
  • Posts: 220
Re: XML what's wrong with my file?
« Reply #2 on: September 14, 2015, 11:17:56 am »
My application has to import data from a third party. I receive the file from the third-party application, so I don't know how to remove this BOM signature.
Other XML editors handle the file fine, but the Lazarus XML component reports an error.

rvk

  • Hero Member
  • *****
  • Posts: 4227
Re: XML what's wrong with my file?
« Reply #3 on: September 14, 2015, 11:27:32 am »
What component are you using to read the XML? And what version of Lazarus are you using?

(If that component has a ReadFromStream, you could create a stream from the file and set the position to the 4th byte, just past the BOM.)

By the way, I checked ReadXML (ReadXMLFile), and that one does take BOM signatures into account. But it trips over the #0 bytes at the end of your file.
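
The stream idea can be sketched roughly as follows. This is only an illustration, assuming the laz2_XMLRead overload of ReadXMLFile that accepts a TStream and reads from the current stream position; SkipUtf8Bom is a hypothetical helper name, and the XML is built in memory so that the example does not depend on any file. It deals with the BOM only, not with trailing #0 bytes.

```pascal
program SkipBomDemo;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, laz2_DOM, laz2_XMLRead; // laz2_* units come from the LazUtils package

// Hypothetical helper: leaves Stream positioned just past a UTF-8 BOM
// (EF BB BF) when one is present, otherwise at offset 0.
procedure SkipUtf8Bom(Stream: TStream);
var
  Buf: array[0..2] of Byte;
begin
  Stream.Position := 0;
  if (Stream.Read(Buf, SizeOf(Buf)) = SizeOf(Buf))
     and (Buf[0] = $EF) and (Buf[1] = $BB) and (Buf[2] = $BF) then
    Exit;                 // position is now 3, right after the BOM
  Stream.Position := 0;   // no BOM found: rewind to the start
end;

const
  Bom: array[0..2] of Byte = ($EF, $BB, $BF);
var
  Stream: TMemoryStream;
  Doc: TXMLDocument;
  Xml: AnsiString;
begin
  Xml := '<?xml version="1.0" encoding="UTF-8"?><root><a>1</a></root>';
  Stream := TMemoryStream.Create;
  try
    // Build a small in-memory "file" that starts with a BOM.
    Stream.WriteBuffer(Bom, SizeOf(Bom));
    Stream.WriteBuffer(Xml[1], Length(Xml));

    SkipUtf8Bom(Stream);
    ReadXMLFile(Doc, Stream);   // ReadXMLFile creates Doc itself
    try
      WriteLn('Root element: ', Doc.DocumentElement.NodeName);
    finally
      Doc.Free;
    end;
  finally
    Stream.Free;
  end;
end.
```

Checking for the BOM before skipping keeps the same code usable for files that do not start with one.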
« Last Edit: September 14, 2015, 11:30:04 am by rvk »

mirce.vladimirov

  • Full Member
  • ***
  • Posts: 220
Re: XML what's wrong with my file?
« Reply #4 on: September 14, 2015, 11:43:27 am »
Here's my code. I use Lazarus 1.0.2. So far I have only exported data to XML from my applications; this is my first time importing. Exporting was fine; everything worked from the start.

Code:
unit Unit1;

{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils, FileUtil, Forms, Controls, Graphics, Dialogs, StdCtrls,
  LCLType, laz2_DOM, laz2_XMLRead ;

type

  { TForm1 }

  TForm1 = class(TForm)
    Button1: TButton;
    Button2: TButton;
    edit_filename: TEdit;
    Label1: TLabel;
    memo_dok: TMemo;
    memo_stavki: TMemo;
    procedure Button1Click(Sender: TObject);
    procedure Button2Click(Sender: TObject);
  private
    { private declarations }
  public
    { public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.lfm}

{ TForm1 }

procedure TForm1.Button2Click(Sender: TObject);
begin
  close;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  Doc: TXMLDocument;
  Child: TDOMNode;
  j: Integer;
begin
  memo_dok.Clear;
  memo_stavki.Clear;
  doc:=TXMLDocument.Create;
  ReadXMLFile(doc, 'ispratnici.xml');
  Child:=doc.DocumentElement.FirstChild;
  memo_dok.Append('Vcitav ' + child.NodeName + ':' + child.NodeValue);
  while assigned(child) do begin
     memo_dok.Append('Vcitav ' + child.NodeName + ':' + child.NodeValue);
     child:=child.NextSibling;
  end;
   doc.free;

end;

end.

rvk

  • Hero Member
  • *****
  • Posts: 4227
Re: XML what's wrong with my file?
« Reply #5 on: September 14, 2015, 12:18:37 pm »
First of all, Lazarus 1.0.2 is an old version. Lazarus 1.4 reads the BOM signature of your XML file fine, but it trips over the #0s at the end (see attached image). When I remove them manually, the file reads fine (with Laz 1.4).

You can look in your source at lazarus\fpc\2.6.0\source\packages\fcl-xml\src\xmlread.pp to see whether it has BOM reading in TXMLDecodingSource.Initialize.

With what version are you writing the XML-files? Why are there so many #0s at the end?
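
Instead of removing the trailing #0s by hand, they can be counted and cut off in code. The following is a rough sketch only: CountTrailingNulls and TrimTrailingNulls are hypothetical helper names, and the sample bytes stand in for a real file. It touches only the #0 bytes at the very end of the stream and leaves everything before them alone.

```pascal
program TrimNullsDemo;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: number of #0 bytes at the very end of the stream.
function CountTrailingNulls(Stream: TMemoryStream): Int64;
var
  P: PByte;
begin
  Result := 0;
  P := Stream.Memory;
  while (Result < Stream.Size) and (P[Stream.Size - 1 - Result] = 0) do
    Inc(Result);
end;

// Hypothetical helper: shrinks the stream so the trailing #0 bytes are gone.
procedure TrimTrailingNulls(Stream: TMemoryStream);
begin
  Stream.Size := Stream.Size - CountTrailingNulls(Stream);
end;

const
  // '<a/>' followed by three #0 bytes, standing in for the padded file
  Data: array[0..6] of Byte = (Ord('<'), Ord('a'), Ord('/'), Ord('>'), 0, 0, 0);
var
  S: TMemoryStream;
begin
  S := TMemoryStream.Create;
  try
    S.WriteBuffer(Data, SizeOf(Data));
    WriteLn('trailing #0 bytes: ', CountTrailingNulls(S)); // 3 for the sample data
    TrimTrailingNulls(S);
    WriteLn('size after trim:   ', S.Size);                // 4 bytes remain: <a/>
  finally
    S.Free;
  end;
end.
```

With a real file, S.LoadFromFile would replace the WriteBuffer call.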

eny

  • Hero Member
  • *****
  • Posts: 1600
Re: XML what's wrong with my file?
« Reply #6 on: September 14, 2015, 12:40:54 pm »
In addition, remove the line:
Code:
doc:=TXMLDocument.Create;
It creates a memory leak.
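
The reason is that ReadXMLFile takes the document as an out parameter and creates the TXMLDocument itself, so an instance created beforehand is overwritten and never freed. A minimal sketch of the usual pattern, assuming the laz2_XMLRead stream overload and a made-up in-memory XML string:

```pascal
program ReadXmlOwnership;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, laz2_DOM, laz2_XMLRead; // laz2_* units come from the LazUtils package

var
  Doc: TXMLDocument;
  Src: TStringStream;
begin
  // No TXMLDocument.Create here: ReadXMLFile assigns a new document to Doc.
  Src := TStringStream.Create('<root><item>abc</item></root>');
  try
    ReadXMLFile(Doc, Src);
    try
      WriteLn(Doc.DocumentElement.NodeName);
    finally
      Doc.Free;   // the caller owns the document that ReadXMLFile returned
    end;
  finally
    Src.Free;
  end;
end.
```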

mirce.vladimirov

  • Full Member
  • ***
  • Posts: 220
Re: XML what's wrong with my file?
« Reply #7 on: September 14, 2015, 12:41:43 pm »
Quote from: rvk on September 14, 2015, 12:18:37 pm
With what version are you writing the XML-files? Why are there so many #0s at the end?

It's not generated by my application but by a third-party one. I don't know how it's done.

rvk

  • Hero Member
  • *****
  • Posts: 4227
Re: XML what's wrong with my file?
« Reply #8 on: September 14, 2015, 12:49:22 pm »
So that application appends #0 bytes to the end of the file. I'm not sure whether this is really allowed according to the XML specification, but you need to strip them.

Here is a small example you can use. It reads the file into a TMemoryStream, removes all #0 bytes, and then skips the first 3 bytes (the BOM signature), which is only needed for Laz 1.0.2. After that the XML is read recursively.

Note: Node.NodeValue is empty for <tag>abc</tag>; you need to use FirstChild.NodeValue to read the abc (see the tutorial at http://wiki.lazarus.freepascal.org/XML_Tutorial).

Code:
uses laz2_DOM, laz2_XMLRead;

// from http://stackoverflow.com/a/18802225/1037511
procedure RemoveNullFromMemoryStream(Stream: TMemoryStream);
var
  i: integer;
  pIn, pOut: PByte;
begin
  pIn := Stream.Memory;
  pOut := pIn;
  for i := 0 to Stream.Size - 1 do
  begin
    if pIn^ <> 0 then
    begin
      pOut^ := pIn^;
      Inc(pOut);
    end;
    Inc(pIn);
  end;
  Stream.SetSize(NativeUInt(pOut) - NativeUInt(Stream.Memory));
end;

procedure TForm1.Button1Click(Sender: TObject);

  procedure RecurseXML(Node: TDOMNode; Margin: string);
  begin
    while assigned(Node) do
    begin
      if Node.ChildNodes.Count = 1 then
        memo_dok.Append(Margin + Node.NodeName + ' : ' + Node.FirstChild.NodeValue)
      else
      if Node.ChildNodes.Count > 0 then
        RecurseXML(Node.FirstChild, Margin + '      ');
      Node := Node.NextSibling;
    end;
  end;

var
  Doc: TXMLDocument;
  Stream1: TMemoryStream;
begin
  memo_dok.Clear;
  // memo_stavki.Clear;
  Stream1 := TMemoryStream.Create;
  // Doc := TXMLDocument.Create;      // <-- create is already done in ReadXMLFile
  try
    Stream1.LoadFromFile('ispratnici.xml');
    RemoveNullFromMemoryStream(Stream1);
    Stream1.Position := 3; // only for Lazarus 1.0.2; otherwise you can omit this
    ReadXMLFile(Doc, Stream1);
    try
      RecurseXML(Doc.DocumentElement.FirstChild, '');
    finally
      Doc.Free;
    end;
  finally
    Stream1.Free;
  end;
end;

Result from your file:
Code:
      Broj : 100102
      Datum : 10032015
      PrevoznikSifra : 0001
      Prevoznik : Transport DOOEL
      SopstvenPrevoz : 0
      KomintentEMBG : 2303985310005
      KomintentName : Гоце Георгиевски
      KomintentAdresa : Илинденска 98
      KomintentMesto : Скопје
      PodruznicaSifra : 1
      Podruznica : Š.S. "MALEŠEVO" - BEROVO

      SumskoStopanskaEdinicaSifra : 1-1
      SumskoStopanskaEdinica : Bregalnica
                  VidNaDrvoSifra : BK
                  VidNaDrvo : BUKA
                  SortimentSifra : 006
                  Sortiment : KOLARSKO DRVO
                  EdinecnaMera : m3
                  Kolicina : 5,00
                  VidNaDrvoSifra : CB
                  VidNaDrvo : CRN BOR
                  SortimentSifra : 002
                  Sortiment : PILANSKI TRUPCI II
                  EdinecnaMera : m3
                  Kolicina : 5,00


Edit: Yes, as eny says, Doc := TXMLDocument.Create does create a leak, because the TXMLDocument is already created in ReadXMLFile.
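
On the Node.NodeValue note above: the text of an element lives in a child text node, not in the element itself. A small sketch of the difference, assuming laz2_DOM and a made-up one-element document; TextContent is shown as an alternative that collects the text below a node.

```pascal
program NodeValueDemo;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, laz2_DOM, laz2_XMLRead; // laz2_* units come from the LazUtils package

var
  Doc: TXMLDocument;
  Src: TStringStream;
  Tag: TDOMNode;
begin
  Src := TStringStream.Create('<root><tag>abc</tag></root>');
  try
    ReadXMLFile(Doc, Src);
    try
      Tag := Doc.DocumentElement.FirstChild;   // the <tag> element
      // An element node has no value of its own, so this is empty.
      WriteLn('NodeValue:            "', Tag.NodeValue, '"');
      // The text "abc" is stored in the element's child text node.
      WriteLn('FirstChild.NodeValue: "', Tag.FirstChild.NodeValue, '"');
      // TextContent gathers the text of the node and its descendants.
      WriteLn('TextContent:          "', Tag.TextContent, '"');
    finally
      Doc.Free;
    end;
  finally
    Src.Free;
  end;
end.
```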
« Last Edit: September 14, 2015, 01:30:04 pm by rvk »

 
