
Author Topic: Reading a complex text file in Pascal  (Read 9123 times)

maurobio

  • Hero Member
  • *****
  • Posts: 640
  • Ecology is everything.
    • GitHub
Reading a complex text file in Pascal
« on: October 24, 2019, 09:00:12 pm »
Dear ALL,

I have a text file with data structured as follows:

#1. brand/
       1. ibm <the one and only>/
       2. osborn/
       3. amstrad/
       4. dick Smith <possibly The worst computer in the world>/
       5. epson/
       6. compaq/
       7. nec/
       8. datamini/

#2. processor <type of micro processor>/
       1. 8088 <trash80's>/
       2. 80186/
       3. 80286/
       4. 80386/
       5. 80486/

#3. harddisk size <capacity>/
       MB/

#4. memory size <capacity>/
       MB/

#5. number of drives/
       drives/

#6. appearance <what it looks like>/

#7. soundcard/
       1. present/
       2. absent/

#8. soundcard type <type of soundcard>/
       1. adlib/
       2. adlib gold/
       3. soundblaster one point five/
       4. soundblaster two/
       5. soundblaster pro/
       6. boom board/
       7. thunder board <rules!>/
       8. D-A converter/
       9. disney sounds/
       10. roland/

#9. monitor/
       1. present/
       2. absent/

Each '#' marks the start of a "record", which may have several characteristics ("states") attached to it, such as microprocessor type, number of drives or hard disk size. A blank line separates one record from the next.

What I want to do, as a first step, is to read each record and find out how many states it has (so record #1 has 8 states, whereas record #5 has one, and record #6 has none).

I have been having a hard time trying to parse this data correctly with the skeleton code below:

Code: Pascal
program readRecords;

{$APPTYPE CONSOLE}
{$MODE DELPHI}

uses
  Classes, SysUtils, StrUtils;

var
  infile: TextFile;
  recNum: string;
  recName: string;
  ch: char;
  recCount: integer;

// Return the index of the last occurrence of Ch in St, or 0 if it is absent.
function LastPos(Ch: Char; St: String): Integer;
var
  i: Integer;
begin
  i := Length(St);
  while (i > 0) and (St[i] <> Ch) do Dec(i);
  Result := i;
end;

begin
  recCount := 0;
  AssignFile(infile, 'chars_pc');
  Reset(infile);
  while not EoF(infile) do
  begin
    Read(infile, ch);
    if (ch = '#') then
    begin
      Inc(recCount);
      recName := '';
      // Collect the header up to the closing '/', skipping line ends.
      // Also stop at end of file, so a missing '/' cannot loop forever.
      repeat
        Read(infile, ch);
        if (ch <> #13) and (ch <> #10) then recName := Concat(recName, ch);
      until (ch = '/') or EoF(infile);
      // Keep only the text between the first '.' and the last '/'.
      recName := Copy(recName, Pos('.', recName) + 1, Length(recName));
      recName := Copy(recName, 1, LastPos('/', recName) - 1);
      WriteLn(recName);
    end;
  end;
  WriteLn(IntToStr(recCount), ' records read');
  CloseFile(infile);
end.

I can read each record name, but I cannot see how to read the characteristics associated with each record.

Could someone give me any hints?

Thanks in advance!

Best wishes,
« Last Edit: October 24, 2019, 10:02:15 pm by maurobio »
UCSD Pascal / Burroughs 6700 / Master Control Program
Delphi 7.0 Personal Edition
Lazarus 3.8 - FPC 3.2.2 on GNU/Linux Mint 19.1/20.3, Windows XP SP3, Windows 7 Professional, Windows 10 Home

maurobio

  • Hero Member
  • *****
  • Posts: 640
  • Ecology is everything.
    • GitHub
Re: Reading a complex text file in Pascal
« Reply #1 on: October 24, 2019, 09:31:31 pm »
This format is generated by third-party software. Even if I intended to convert it into another format, I would still have to read this file first.

Handoko

  • Hero Member
  • *****
  • Posts: 5425
  • My goal: build my own game engine using Lazarus
Re: Reading a complex text file in Pascal
« Reply #2 on: October 24, 2019, 09:35:40 pm »
It shouldn't be hard. But it's late at night here and I'm going to sleep now. If no one provides you with a solution, I will write it for you tomorrow.

maurobio

  • Hero Member
  • *****
  • Posts: 640
  • Ecology is everything.
    • GitHub
Re: Reading a complex text file in Pascal
« Reply #3 on: October 24, 2019, 09:40:57 pm »
Thank you very much, Handoko. There is no rush anyway!

MarkMLl

  • Hero Member
  • *****
  • Posts: 8393
Re: Reading a complex text file in Pascal
« Reply #4 on: October 24, 2019, 09:49:48 pm »
Record 1 has five states? Why?

For something open-ended I think I'd write it like a parser.

Assume that "read" works on individual characters and first looks at a temporary variable, so that you can "put a character back" (i.e. into that variable) if you don't like it. Different languages or data representations need different amounts of backtracking.

You're at the start of the file, expect a well-formed record.

You're at the start of a record, discard whitespace and expect # (anything else is an error i.e. report back that you don't have a well-formed record).

While you've got digits save them temporarily as a number. When you've not got a digit "put it back".

Expect . anything else is an error.

Discard whitespace.

Save everything until / as the record name.

Expect a sequence of content lines.

If blank or EOF you're at the end of the record.

Otherwise discard whitespace and expect a number (as above), incrementing the count of lines in that record.

And so on.

So your parser becomes something like

Code: Pascal
  repeat
  until not wellFormedRecord();

and wellFormedRecord() is defined like

Code: Pascal
  function wellFormedRecord(): boolean;
  begin
    result := true;
    if not wellFormedHeaderLine() then
      exit(false);
    if not wellFormedContentSequence() then
      exit(false)
  end;

And so on. You will find that looking at https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form helps. Basically, you write a function for every rule in the grammar, and where a rule has multiple possibilities you expect one of a number of functions to return success; otherwise you report an error.
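The steps above can be sketched as a small recursive-descent parser. This is only an illustration under stated assumptions: the input is held in a string (Buf) rather than read from a file, "putting a character back" is modelled by a Peek/Advance pair (one character of lookahead), and all helper names other than the wellFormed... functions are made up for the example. State lines without a leading number (such as "MB/") are accepted as well, since the sample data contains them.

```pascal
program ParserSketch;
{$MODE OBJFPC}{$H+}
uses
  SysUtils;

const
  EndOfInput = #0;            // sentinel once the buffer is exhausted

var
  Buf: string;                // whole input in memory (assumes a small file)
  BufPos: Integer = 1;        // index of the next unread character

function Peek: Char;          // look at the next character without consuming it
begin
  if BufPos <= Length(Buf) then Result := Buf[BufPos] else Result := EndOfInput;
end;

procedure Advance;
begin
  if BufPos <= Length(Buf) then Inc(BufPos);
end;

procedure SkipBlanks;         // spaces and tabs, but not line ends
begin
  while Peek in [' ', #9] do Advance;
end;

function ParseNumber: Boolean;  // one or more digits; the value is discarded
begin
  Result := False;
  while Peek in ['0'..'9'] do
  begin
    Advance;
    Result := True;
  end;
end;

function ParseUntilSlash(out S: string): Boolean;  // text up to the closing '/'
begin
  S := '';
  while not (Peek in ['/', #10, #13, EndOfInput]) do
  begin
    S := S + Peek;
    Advance;
  end;
  Result := Peek = '/';
  if Result then Advance;
  S := Trim(S);
end;

// Header rule: '#' digits '.' name '/'
function WellFormedHeaderLine(out RecName: string): Boolean;
begin
  RecName := '';
  while Peek in [' ', #9, #10, #13] do Advance;   // blank lines between records
  Result := Peek = '#';
  if not Result then Exit;
  Advance;
  if not ParseNumber then Exit(False);
  if Peek <> '.' then Exit(False);
  Advance;
  Result := ParseUntilSlash(RecName);
end;

// Content rule: zero or more state lines, ended by a blank line, '#' or EOF.
function WellFormedContentSequence(out Count: Integer): Boolean;
var
  S: string;
begin
  Count := 0;
  repeat
    SkipBlanks;
    if Peek = #13 then Advance;                   // end of the previous line
    if Peek = #10 then Advance;
    SkipBlanks;
    if Peek in [#10, #13, '#', EndOfInput] then Exit(True);  // record complete
    if ParseNumber then                           // optional "N." prefix
    begin
      if Peek <> '.' then Exit(False);
      Advance;
    end;
    if not ParseUntilSlash(S) then Exit(False);
    Inc(Count);
  until False;
end;

var
  RecName: string;
  States: Integer;
begin
  Buf := '#5. number of drives/' + LineEnding +
         '       drives/' + LineEnding + LineEnding +
         '#6. appearance <what it looks like>/' + LineEnding + LineEnding +
         '#7. soundcard/' + LineEnding +
         '       1. present/' + LineEnding +
         '       2. absent/' + LineEnding;
  while WellFormedHeaderLine(RecName) do
  begin
    if not WellFormedContentSequence(States) then
    begin
      WriteLn('Malformed state line in record: ', RecName);
      Halt(1);
    end;
    WriteLn(RecName, ': ', States, ' state(s)');
  end;
  if Peek <> EndOfInput then
    WriteLn('Stopped at unexpected input, position ', BufPos);
end.
```

Each function corresponds to one grammar rule, so a malformed file is reported at the point where the rule fails instead of silently producing wrong counts.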

If you were parsing something like Pascal, C, HTML or an XML-based data file you would find lots of tools to help. But when you've got a completely custom format you will find that being able to write a custom parser with moderate proficiency is a useful skill.

Alternatively you could read entire lines and process each as a regex (regular expression) deciding whether it was a header, content line and so on, but I really don't recommend trying that.
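For comparison only, the line-by-line regex alternative might look like the sketch below. It assumes the TRegExpr class from the RegExpr unit that ships with Free Pascal; the two patterns and the Classify helper are illustrative guesses at the format, not something taken from the thread.

```pascal
program RegexLineSketch;
{$MODE OBJFPC}{$H+}
uses
  SysUtils, RegExpr;

var
  HeaderRe, StateRe: TRegExpr;

// Decide what kind of line this is and print the captured parts.
procedure Classify(const Line: string);
var
  S: string;
begin
  S := Trim(Line);
  if S = '' then
    WriteLn('blank')
  else if HeaderRe.Exec(S) then
    WriteLn('header  #', HeaderRe.Match[1], ' -> ', HeaderRe.Match[2])
  else if StateRe.Exec(S) then
    WriteLn('state   ', StateRe.Match[2])
  else
    WriteLn('unrecognised: ', S);
end;

begin
  HeaderRe := TRegExpr.Create;
  StateRe := TRegExpr.Create;
  try
    // '#', a number, '.', the record name, and a closing '/'
    HeaderRe.Expression := '^#(\d+)\.\s*(.*)/$';
    // optional "N." prefix, the state text, and a closing '/'
    StateRe.Expression := '^(\d+\.\s*)?(.*)/$';

    Classify('#2. processor <type of micro processor>/');
    Classify('       1. 8088 <trash80''s>/');
    Classify('       MB/');
    Classify('');
    Classify('no trailing slash');
  finally
    StateRe.Free;
    HeaderRe.Free;
  end;
end.
```

The header pattern has to be tried first, because the looser state pattern would also match a header line; that ordering dependency is one reason the approach gets fragile as the format grows.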

MarkMLl
« Last Edit: October 24, 2019, 09:52:03 pm by MarkMLl »
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Logitech, TopSpeed & FTL Modula-2 on bare metal (Z80, '286 protected mode).
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

maurobio

  • Hero Member
  • *****
  • Posts: 640
  • Ecology is everything.
    • GitHub
Re: Reading a complex text file in Pascal
« Reply #5 on: October 24, 2019, 10:05:33 pm »
In fact, record #1 has eight states, not five. It was a typo (which I just corrected).

Thanks for the tips, MarkMLl. A parser would indeed be the best way of handling such a file, but I would be happy with something simpler and faster.

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: Reading a complex text file in Pascal
« Reply #6 on: October 24, 2019, 10:18:16 pm »
Hi!

Complicated thing for a newbie.

Try it this way:

Code: Pascal
program readTheRecords;

{$APPTYPE CONSOLE}
{$MODE DELPHI}

uses
  Classes, SysUtils;

procedure readRecords;
type
  TOneRec = record
    Header: String;
    data: TStringList;
  end;
var
  Records: array[1..9] of TOneRec;
  infile: TextFile;
  s, msg: string;
  RecNo, i: integer;
  fname: String;
begin
  fname := ExtractFilePath(Application.ExeName) + 'data.txt'; // change the name
  AssignFile(infile, fname);
  Reset(infile);
  RecNo := 0;
  while not EoF(infile) do
  begin
    readln(infile, s); // don't care about CR & LF
    s := trim(s); // strip leading and trailing blanks
    if length(s) > 0 then // forget empty strings
    begin
      if s[1] = '#' then
      begin
        inc(RecNo);
        Records[RecNo].header := s;
        Records[RecNo].data := TStringList.create;
      end
      else
        Records[RecNo].data.add(s);
    end; // if length
  end; // while

  CloseFile(infile);
  msg := '';
  for i := 1 to 9 do
    msg := msg + records[i].header + lineEnding +
           IntToStr(records[i].data.count) + lineEnding;
  showMessage(msg);

  msg := '';
  for i := 1 to 9 do
    msg := msg + Records[i].data.text + '================' + LineEnding;
  showMessage(msg);

  for i := 1 to 9 do records[i].data.free;
end;

begin
  readRecords;
end.

If you have any questions, just ask!

Winni

maurobio

  • Hero Member
  • *****
  • Posts: 640
  • Ecology is everything.
    • GitHub
Re: Reading a complex text file in Pascal
« Reply #7 on: October 25, 2019, 12:10:49 am »
Many thanks, Winni! Although I keep getting access violation errors with the code you provided, the core part of it, that is, reading the file correctly in just one pass, works well, and I hope to use it as inspiration for something workable.

(BTW, the line where I get the access violation is this: Records[RecNo].data.add(s); It looks right, but the program only runs if this line is commented out.)
« Last Edit: October 25, 2019, 12:59:43 am by maurobio »

jamie

  • Hero Member
  • *****
  • Posts: 6892
Re: Reading a complex text file in Pascal
« Reply #8 on: October 25, 2019, 01:35:13 am »
That code assumes you have exactly 9 records to read, no more and no less.

 None of the records are zero-filled, so there is no guarantee that all the fields are nil when you run the 1 to 9 loop to free the objects. Calling Free on a nil object is ignored, but if there is garbage data there, it will make an attempt to free it.

 Maybe a better solution would be to use the RecNo variable, as that holds the number of records actually read:

 For I := 1 to RecNo do Records[I].Data.Free;
etc.

 Also, the code looks like it could skip over a line and miss a '#'; if so...

 If Records[RecNo].Data <> nil Then Records[RecNo].Data.Add(….);

and so on.
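A minimal sketch of these suggestions, under a few assumptions of mine: the fixed array is swapped for a dynamic array (whose new elements are zero-initialised by SetLength), the file name comes from the first command-line argument with 'data.txt' as the fallback, and a line is only added once at least one '#' header has been seen. One plausible trigger for the access violation, not confirmed in the thread, is a line reaching the else branch while RecNo is still 0, for instance because a byte-order mark or other stray characters precede the first '#'.

```pascal
program ReadRecordsGuarded;
{$MODE OBJFPC}{$H+}
uses
  Classes, SysUtils;

type
  TOneRec = record
    Header: string;
    Data: TStringList;
  end;

var
  Records: array of TOneRec;   // dynamic, so there is no fixed limit of 9
  InFile: TextFile;
  S, FName: string;
  RecNo, I: Integer;
begin
  // Hypothetical convention: file name as first argument, else 'data.txt'.
  if ParamCount >= 1 then FName := ParamStr(1) else FName := 'data.txt';
  if not FileExists(FName) then
  begin
    WriteLn('File not found: ', FName);
    Halt(1);
  end;

  RecNo := 0;
  AssignFile(InFile, FName);
  Reset(InFile);
  try
    while not EoF(InFile) do
    begin
      ReadLn(InFile, S);
      S := Trim(S);
      if S = '' then Continue;              // blank separator line
      if S[1] = '#' then
      begin
        Inc(RecNo);
        SetLength(Records, RecNo);          // grow by one record
        Records[RecNo - 1].Header := S;
        Records[RecNo - 1].Data := TStringList.Create;
      end
      else if RecNo > 0 then
        Records[RecNo - 1].Data.Add(S)      // a state of the current record
      else
        WriteLn('Ignoring line before the first record: ', S);
    end;
  finally
    CloseFile(InFile);
  end;

  // Only the records actually read are reported and freed.
  for I := 0 to RecNo - 1 do
    WriteLn(Records[I].Header, ' -> ', Records[I].Data.Count, ' state(s)');
  for I := 0 to RecNo - 1 do
    Records[I].Data.Free;
end.
```

Compiling with range checking enabled (-Cr) is another cheap way to turn an out-of-range index into a clear run-time error instead of an access violation.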

The only true wisdom is knowing you know nothing

440bx

  • Hero Member
  • *****
  • Posts: 5302
Re: Reading a complex text file in Pascal
« Reply #9 on: October 25, 2019, 01:54:48 am »
@maurobio

The method that MarkMLl mentioned above is the better way of parsing the text.

Your file has a very simple structure: a line starts with either a # or a digit (skipping and ignoring leading spaces). If it's a #, you know it is the start of a group; if it's a digit, it's an element of the group. A CRLF/newline seems to be "ornamental" and can likely simply be ignored; anything else at the start of a line is a "syntactic" error. Every line ends with a / and is expected to be followed by a CRLF. That's a very simple tokenizer.

If you are using Windows, I'd suggest you simply map the file into memory and build an array of structures with the pointer, length and type (i.e., group or element) of each token. Blazingly fast and very easy to debug.
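A rough, portable sketch of that token table follows. It deliberately swaps the memory-mapped file and raw pointers for an ordinary string buffer with 1-based offsets, and it treats any non-'#' line ending in '/' as an element, because the sample data also contains unit lines such as "MB/" that do not start with a digit. The type and procedure names are made up for the illustration.

```pascal
program TokenTableSketch;
{$MODE OBJFPC}{$H+}
uses
  SysUtils;

type
  TTokenKind = (tkGroup, tkElement, tkError);

  TToken = record
    Kind: TTokenKind;
    Start: Integer;   // 1-based offset of the token in Buf
    Len: Integer;     // length, excluding the trailing '/'
  end;

const
  KindName: array[TTokenKind] of string = ('group', 'element', 'error');

var
  Buf: string;               // stands in for the mapped file contents
  Tokens: array of TToken;

procedure AddToken(AKind: TTokenKind; AStart, ALen: Integer);
begin
  SetLength(Tokens, Length(Tokens) + 1);
  Tokens[High(Tokens)].Kind := AKind;
  Tokens[High(Tokens)].Start := AStart;
  Tokens[High(Tokens)].Len := ALen;
end;

procedure Tokenize;
var
  P, First, Last: Integer;
begin
  P := 1;
  while P <= Length(Buf) do
  begin
    // Find the extent of the current line.
    First := P;
    while (P <= Length(Buf)) and not (Buf[P] in [#10, #13]) do Inc(P);
    Last := P - 1;
    // Step over the line terminator (CR, LF or CRLF).
    if (P <= Length(Buf)) and (Buf[P] = #13) then Inc(P);
    if (P <= Length(Buf)) and (Buf[P] = #10) then Inc(P);
    // Trim blanks at both ends without copying anything.
    while (First <= Last) and (Buf[First] in [' ', #9]) do Inc(First);
    while (Last >= First) and (Buf[Last] in [' ', #9]) do Dec(Last);
    if First > Last then Continue;                 // blank line
    if Buf[Last] <> '/' then
      AddToken(tkError, First, Last - First + 1)   // missing terminator
    else if Buf[First] = '#' then
      AddToken(tkGroup, First, Last - First)
    else
      AddToken(tkElement, First, Last - First);
  end;
end;

var
  I: Integer;
begin
  Buf := '#3. harddisk size <capacity>/' + LineEnding +
         '       MB/' + LineEnding + LineEnding +
         '#7. soundcard/' + LineEnding +
         '       1. present/' + LineEnding +
         '       2. absent' + LineEnding;   // deliberately missing its '/'
  Tokenize;
  for I := 0 to High(Tokens) do
    WriteLn(KindName[Tokens[I].Kind], ': ',
            Copy(Buf, Tokens[I].Start, Tokens[I].Len));
end.
```

Because the table stores only offsets and lengths, no text is copied until a token is actually needed, which is where most of the speed of this approach comes from.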

HTH.


(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v4.0rc3) on Windows 7 SP1 64bit.

maurobio

  • Hero Member
  • *****
  • Posts: 640
  • Ecology is everything.
    • GitHub
Re: Reading a complex text file in Pascal
« Reply #10 on: October 25, 2019, 02:07:08 am »
@jamie, I still get access violation errors after adding the modification you suggested.

@440bx, the code example provided by @Winni has been quite helpful for reading the file, and it suffices for practical purposes without the time-consuming process of implementing a full parser.

The problem now, as I see it, is a different one: how to solve the memory access errors when adding to the string list of characteristics.

Cheers,

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: Reading a complex text file in Pascal
« Reply #11 on: October 25, 2019, 03:36:31 am »
Hi!

The code is exactly for the data you showed us.

If you add one or more records then the code will crash; see: array[1..9] of ...

If you now have 12 records, you must replace every 9 in the code with 12.
If you write anything or display it on the screen, it must all be done before the
records.data.free!!!

If you changed the data then show us the data please.
And if you changed the code then show us the code please.
That is much easier than walking through the fog.

We will get it running!

Winni

maurobio

  • Hero Member
  • *****
  • Posts: 640
  • Ecology is everything.
    • GitHub
Re: Reading a complex text file in Pascal
« Reply #12 on: October 25, 2019, 04:33:38 am »
Winni, I am quite aware of what Pascal records and static arrays are. Also, I surely would not change the code, much less the data, in ways that would stop it working, when all I want is for you guys to help me!

That said, here is the code exactly as I am trying to run it:

Code: Pascal
program readTheRecords;

{$APPTYPE CONSOLE}
{$MODE DELPHI}

uses
  Classes, SysUtils;

procedure readRecords;
type
  TOneRec = record
    Header: String;
    data: TStringList;
  end;
var
  Records: array[1..9] of TOneRec;
  infile: TextFile;
  s, msg: string;
  RecNo, i: integer;
  fname: String;
begin
  //fname := ExtractFilePath(Application.ExeName)+'data.txt'; // change the name
  fname := 'data.txt';
  AssignFile(infile, fname);
  Reset(infile);
  RecNo := 0;
  while not EoF(infile) do
  begin
    readln(infile, s); // don't care about CR & LF
    s := trim(s); // strip leading and trailing blanks
    if length(s) > 0 then // forget empty strings
    begin
      if s[1] = '#' then
      begin
        inc(RecNo);
        Records[RecNo].header := s;
        Records[RecNo].data := TStringList.create;
      end
      else
        Records[RecNo].data.add(s);
    end; // if length
  end; // while

  CloseFile(infile);
  msg := '';
  for i := 1 to 9 do
    msg := msg + records[i].header + lineEnding +
           IntToStr(records[i].data.count) + lineEnding;
  //showMessage (msg);
  WriteLn(msg);

  msg := '';
  for i := 1 to 9 do
    msg := msg + Records[i].data.text + '================' + LineEnding;
  //showMessage (msg);
  WriteLn(msg);

  for i := 1 to 9 do records[i].data.free;
end;

begin
  readRecords;
end.

The only changes I have made were using a constant for the file name and replacing the calls to 'ShowMessage()' with good old WriteLn, since I am running this program outside Lazarus, in a console window, calling fpc from the prompt. However, I keep getting the dreadful access violation error anyway, whether running from a console window or from inside Lazarus. The program runs without errors only if the line Records[RecNo].data.add(s); is commented out (which, of course, it shouldn't be).

I thank you again for your patience and, above all, for providing me with code which can be tested and experimented with.

Best regards,
« Last Edit: October 25, 2019, 04:39:48 am by maurobio »

maurobio

  • Hero Member
  • *****
  • Posts: 640
  • Ecology is everything.
    • GitHub
Re: Reading a complex text file in Pascal
« Reply #13 on: October 25, 2019, 05:27:22 am »
Mysteriously, now the code works fine, with the Records[RecNo].data.add(s); in place!  :o

Not really sure what happened, but perhaps there were some characters at the end of my text file which I hadn't noticed before.

I will now proceed to get rid of the array of records of fixed size, and try using collections (instead of tlist's and dreadful pointers) to store the data.
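As a possible starting point for that change, here is a small sketch built on TCollection and TCollectionItem from the Classes unit. The class name TCharItem and its Header and States properties are invented for the illustration, and the sample lines are simply pasted in rather than read from the file.

```pascal
program CollectionSketch;
{$MODE OBJFPC}{$H+}
uses
  Classes, SysUtils;

type
  // One '#' record: its header text plus the list of its states.
  TCharItem = class(TCollectionItem)
  private
    FHeader: string;
    FStates: TStringList;
  public
    constructor Create(ACollection: TCollection); override;
    destructor Destroy; override;
    property Header: string read FHeader write FHeader;
    property States: TStringList read FStates;
  end;

constructor TCharItem.Create(ACollection: TCollection);
begin
  inherited Create(ACollection);
  FStates := TStringList.Create;   // every item owns its own state list
end;

destructor TCharItem.Destroy;
begin
  FStates.Free;
  inherited Destroy;
end;

var
  Chars: TCollection;
  Item: TCharItem;
  I: Integer;
begin
  Chars := TCollection.Create(TCharItem);
  try
    // Add creates a TCharItem, so the state list is never nil.
    Item := TCharItem(Chars.Add);
    Item.Header := '#7. soundcard/';
    Item.States.Add('1. present/');
    Item.States.Add('2. absent/');

    Item := TCharItem(Chars.Add);
    Item.Header := '#6. appearance <what it looks like>/';

    for I := 0 to Chars.Count - 1 do
    begin
      Item := TCharItem(Chars.Items[I]);
      WriteLn(Item.Header, ' -> ', Item.States.Count, ' state(s)');
    end;
  finally
    Chars.Free;   // frees every item, and each item frees its state list
  end;
end.
```

The collection grows as needed and owns its items, so neither the fixed upper bound of 9 nor the manual loop that frees each string list is required.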

Best regards,

440bx

  • Hero Member
  • *****
  • Posts: 5302
Re: Reading a complex text file in Pascal
« Reply #14 on: October 25, 2019, 06:59:26 am »
Quote from: maurobio
Mysteriously, now the code works fine, with the Records[RecNo].data.add(s); in place!  :o

Not really sure of what happened, but perhaps there were some characters in the end of my text file which I haven't noticed before.

It isn't my intention to give you a hard time, but one of the important reasons it would be good to parse the file is to ensure it is structured as desired and won't cause unexpected problems if it isn't.

Quote from: maurobio
...(instead of tlist's and dreadful pointers) to store the data.

Poor pointers... they get such a bad rap.


 
