Author Topic: Best way to parse file.  (Read 3842 times)

BSaidus

  • Sr. Member
  • ****
  • Posts: 453
  • lazarus 1.8.4 Win8.1 / cross FreeBSD
Re: Best way to parse file.
« Reply #15 on: January 27, 2023, 04:44:12 pm »
I think your way of doing this is just wrong.
When you cannot gather that information by code, what purpose shall it have?
In my thinking, when I am not able to get information by code, I would simply display the console output somewhere in my app.
I need this to get the Ethernet interface names and MAC addresses, and finally to parse the file "/var/run/dmesg.boot" to get the driver names associated with each of these network interfaces.
    If you have Pascal code that can get this information without parsing and without calling other helper programs, you are welcome to help.
    :)
lazarus 1.8.4 Win8.1 / cross FreeBSD
dhukmucmur vernadh!

KodeZwerg

  • Hero Member
  • *****
  • Posts: 1195
  • Fifty shades of code.
    • Delphi & FreePascal
Re: Best way to parse file.
« Reply #16 on: January 27, 2023, 05:47:48 pm »
Take a look here and tell me if that works for you. I only have Windows, and there I have APIs to call.

domasz

  • Full Member
  • ***
  • Posts: 202
Re: Best way to parse file.
« Reply #17 on: January 27, 2023, 05:53:59 pm »
Probably not the best, but the first one which came to my mind: Read the file into a stringlist and then split the lines at the fixed positions given by the start of the header columns by means of the good-old Copy command.
I believe this is the best way.

Curt Carpenter

  • Full Member
  • ***
  • Posts: 229
Re: Best way to parse file.
« Reply #18 on: January 27, 2023, 06:05:49 pm »
A philosophical note:  "best" is I think a normative rather than a descriptive concept, related to the problem of "what should I do" as opposed to "what can I do" or "what works."  "Best" often depends on our objectives and our frame of reference.  As any "C" programmer can tell you, what's "best" from an efficiency perspective may be the absolute worst from a readability-six-months-later perspective :)

domasz

  • Full Member
  • ***
  • Posts: 202
Re: Best way to parse file.
« Reply #19 on: January 27, 2023, 06:56:34 pm »
As any "C" programmer can tell you, what's "best" from an efficiency perspective may be the absolute worst from a readability-six-months-later perspective :)
In my experience, Pascal programmers tend to favor readable code over optimized code as long as it's not terribly slow. The solution with "Copy" is both fast enough and readable enough. It also won't fail on data on which TStringList.Split might fail.

KodeZwerg

  • Hero Member
  • *****
  • Posts: 1195
  • Fifty shades of code.
    • Delphi & FreePascal
Re: Best way to parse file.
« Reply #20 on: January 27, 2023, 07:31:03 pm »
This is how I would do it.
Code: Pascal
program project1;
{$mode objfpc}{$H+} // needed for try..finally when compiling outside a Lazarus project
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

type
  // One row of the netstat table; every column is kept as text.
  Tif_net = record
    Name,
    Mtu,
    Network,
    Address,
    Ipkts,
    Ifail,
    Opkts,
    Ofail,
    Colls : String;
  end;
  Tif_nets = array of Tif_net;

var
  xxx: Tif_nets;
  sl: TStrings;
  i: Integer;
begin
  xxx := nil;
  sl := TStringList.Create;
  try
    // test.txt is expected next to the executable
    sl.LoadFromFile(ExtractFilePath(ParamStr(0)) + 'test.txt');
    SetLength(xxx, sl.Count);
    // Cut every line at fixed column positions and trim the padding
    for i := 0 to Pred(sl.Count) do
      begin
        xxx[i].Name := Trim(Copy(sl.Strings[i], 1, 8));
        xxx[i].Mtu := Trim(Copy(sl.Strings[i], 8, 6));
        xxx[i].Network := Trim(Copy(sl.Strings[i], 14, 12));
        xxx[i].Address := Trim(Copy(sl.Strings[i], 26, 21));
        xxx[i].Ipkts := Trim(Copy(sl.Strings[i], 47, 6));
        xxx[i].Ifail := Trim(Copy(sl.Strings[i], 53, 8));
        xxx[i].Opkts := Trim(Copy(sl.Strings[i], 62, 6));
        xxx[i].Ofail := Trim(Copy(sl.Strings[i], 68, 6));
        xxx[i].Colls := Trim(Copy(sl.Strings[i], 74, 6));
      end;
    // Print every parsed record
    for i := Low(xxx) to High(xxx) do
      begin
        WriteLn('Name: ' + xxx[i].Name);
        WriteLn('Mtu: ' + xxx[i].Mtu);
        WriteLn('Network: ' + xxx[i].Network);
        WriteLn('Address: ' + xxx[i].Address);
        WriteLn('Ipkts: ' + xxx[i].Ipkts);
        WriteLn('Ifail: ' + xxx[i].Ifail);
        WriteLn('Opkts: ' + xxx[i].Opkts);
        WriteLn('Ofail: ' + xxx[i].Ofail);
        WriteLn('Colls: ' + xxx[i].Colls);
      end;
  finally
    sl.Free;
  end;
  {$IFDEF MSWINDOWS}ReadLn;{$ENDIF}
end.

Curt Carpenter

  • Full Member
  • ***
  • Posts: 229
Re: Best way to parse file.
« Reply #21 on: January 27, 2023, 07:59:45 pm »
Quote
From my experience Pascal programmers rather favor readable code over optimized code as long as it's not terribly slow.
I certainly agree -- although I guess "readability" is a function of one's experience.  And with the maturity of experience I think people develop a sense of aesthetics and what makes "beautiful" code, just as mathematical maturity seems to lead to an appreciation for the beauty of a definition or a proof.

KodeZwerg's code above could be more beautiful if -- IMHO!!! -- he would adopt a more descriptive naming strategy.  "xxx" looks too much like something I'd do, when "ANetRecord" (or something like that) would be more readable.  Otherwise -- very nice :)  Still, woe when somebody in a universe far away changes the tabs used in the creation of the records...

KodeZwerg

  • Hero Member
  • *****
  • Posts: 1195
  • Fifty shades of code.
    • Delphi & FreePascal
Re: Best way to parse file.
« Reply #22 on: January 27, 2023, 09:20:13 pm »
KodeZwerg's code above could be more beautiful
I agree with you, I just typed that while I was eating :P

BSaidus

  • Sr. Member
  • ****
  • Posts: 453
  • lazarus 1.8.4 Win8.1 / cross FreeBSD
Re: Best way to parse file.
« Reply #23 on: January 28, 2023, 12:59:38 pm »
Hello.
Thank you all for your suggestions and advice. Now I have a choice between parsing with TStringList or with RegExpr.
   Thank you, men.

Kays

  • Sr. Member
  • ****
  • Posts: 494
  • Whasup!?
    • KaiBurghardt.de
Re: Best way to parse file.
« Reply #24 on: January 28, 2023, 03:21:31 pm »
I need this to get the ethernet interfaces names,
Your netstat(1) output will possibly contain ethernet as well as IEEE 802.11 and other devices.
Mac addresses
This will evidently fail for devices that do not have a MAC address, notably lo0.
[…] parse the file "/var/run/dmesg.boot"
Just the wording “parse” dmesg is outrageous. >:(  dmesg is not (strictly) structured. (If you were suffering from SystemDisease, you could at least get structured message objects via the JournalD subsystem.)
to get the driver names associated to each of these network interfaces.
On FreeBSD, OpenBSD and probably other OSs the devices are named according to their chosen driver. Device re0 uses the re driver. Device em0 uses the em driver. Evidently this fails for lo0, which doesn’t need a driver to begin with.

Now this does not take account of interface renaming (ifconfig xl0 name foobar), but from your design I gather durability isn’t a major concern?
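
A minimal sketch of that naming rule, assuming the interface has not been renamed and that the name is simply the driver prefix followed by a unit number. SplitIfName is a hypothetical helper, not an existing RTL routine:

```pascal
program ifdriver;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: splits 'em0' into the driver prefix 'em' and unit 0.
// Returns False when the name has no prefix or no trailing unit number.
function SplitIfName(const AName: String; out ADriver: String;
  out AUnit: Integer): Boolean;
var
  p: Integer;
begin
  ADriver := '';
  AUnit := -1;
  p := Length(AName);
  // Walk backwards over the trailing digits
  while (p > 0) and (AName[p] in ['0'..'9']) do
    Dec(p);
  // Need at least one prefix character and at least one digit
  Result := (p > 0) and (p < Length(AName));
  if Result then
  begin
    Result := TryStrToInt(Copy(AName, p + 1, Length(AName) - p), AUnit);
    if Result then
      ADriver := Copy(AName, 1, p);
  end;
end;

const
  Samples: array[0..3] of String = ('re0', 'em10', 'lo0', 'foobar');
var
  i, num: Integer;
  drv: String;
begin
  for i := Low(Samples) to High(Samples) do
    if SplitIfName(Samples[i], drv, num) then
      WriteLn(Samples[i], ': driver=', drv, ' unit=', num)
    else
      WriteLn(Samples[i], ': no unit number, cannot derive a driver');
end.
```

Note that this also reports a "driver" for lo0, so pseudo-devices still have to be filtered separately, and a renamed interface such as foobar gives no usable answer at all.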

[That’s] the unix philosophy, instead of having to have one app with all the functionality, you have multiple specialized programs that do one thing very well and whose output can be reused by other programs. […]
This is absolutely legitimate for
  • machine-readable output, e. g. FreeBSD’s netstat provides the --libxo parameter to select a machine-readable format, or
  • standardized output, e. g. the output of ls is standardized (cf. § Stdout).
Yours Sincerely
Kai Burghardt

Warfley

  • Hero Member
  • *****
  • Posts: 1075
Re: Best way to parse file.
« Reply #25 on: January 28, 2023, 04:53:43 pm »
Hello.
Thank you all for your suggestions and advice. Now I have a choice between parsing with TStringList or with RegExpr.
   Thank you, men.
You must be careful with the Manual Parsing/TStringlist approach, because a simple implementation only works if the line is actually in that format. So with the example from KodeZwerg, if the line is not correctly formatted (e.g. there was an error and the program stopped halfway through a line, or worse, started printing an error message in the middle of the line), then it will not throw an error, but just read garbage input into your record.
So to get back to my example from before:
Code: Pascal
function read_config(const line: String): if_net;
var
  parts: TStringArray;
  has_address: boolean;
begin
  // split on spaces and tabs, dropping the empty fields caused by padding
  parts := line.split([' ', #9], TStringSplitOptions.ExcludeEmpty);
  has_address := length(parts) = 9;
  With Result do
  begin
    Name := parts[0];
    Mtu := parts[1];
    Network := parts[2];
    // very lazy way to only load address if the boolean is true
    // (IfThen for strings comes from the StrUtils unit)
    Address := ifthen(has_address, parts[3], '');
    // ord(boolean) = 1 if true, 0 if false, so it will be offset by 1 if there is the address
    Ipkts := parts[3 + ord(has_address)];
    Ifail := parts[4 + ord(has_address)];
    Opkts := parts[5 + ord(has_address)];
    Ofail := parts[6 + ord(has_address)];
    Colls := parts[7 + ord(has_address)];
  end;
end;
This would at most give you a range error if the split does not yield enough fields. To have some true error detection, you would need something like this:
Code: Pascal
type
  TIFNet = record
    Name: String;
    MTU: Integer;
    Network: String;
    Address: String;

    Ipkts: Integer;
    Ifail: Integer;
    Opkts: Integer;
    Ofail: Integer;
    Colls: Integer;
  end;

function read_config(const line: String): TIFNet;
var
  parts: TStringArray;
  has_address: boolean;
begin
  parts := line.split([' ', #9], TStringSplitOptions.ExcludeEmpty);
  if (Length(parts) < 8) Or (Length(parts) > 9) then
    raise Exception.Create('Invalid Line');
  has_address := length(parts) = 9;
  With Result do
  begin
    Name := parts[0];
    Mtu := parts[1].ToInteger; // Will raise an exception if not an integer
    Network := parts[2];
    // very lazy way to only load address if the boolean is true
    Address := specialize IfThen<String>(has_address, parts[3], '');
    // ord(boolean) = 1 if true, 0 if false, so it will be offset by 1 if there is the address
    Ipkts := parts[3 + ord(has_address)].ToInteger; // Will raise an exception if not an integer
    Ifail := parts[4 + ord(has_address)].ToInteger; // Will raise an exception if not an integer
    Opkts := parts[5 + ord(has_address)].ToInteger; // Will raise an exception if not an integer
    Ofail := parts[6 + ord(has_address)].ToInteger; // Will raise an exception if not an integer
    Colls := parts[7 + ord(has_address)].ToInteger; // Will raise an exception if not an integer
  end;
end;

Regex has the advantage that it not only parses the data into groups, but also tells you whether the line actually has the format you expect at this point. So you don't need to do all of that error management in code.

Also note that the hardcoded offsets only work for your current system, but the alignment is dynamically chosen depending on the length of the fields. So for example if you have any larger IPv6 Subnet Prefixes, the Network column might be larger. So you cannot simply hardcode the offsets.
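
To make the regex point concrete, here is a rough sketch using TRegExpr from the RegExpr unit. The pattern, the TIfNet record and TryParseLine are my own illustration, assuming the nine whitespace-separated columns discussed above with an optional Address column and purely numeric counters; the broken sample line is made up:

```pascal
program netstat_regex;
{$mode objfpc}{$H+}

uses
  SysUtils, RegExpr;

type
  TIfNet = record
    Name, Network, Address: String;
    Mtu, Ipkts, Ifail, Opkts, Ofail, Colls: Int64;
  end;

const
  // Groups: 1=Name 2=Mtu 3=Network 5=Address (optional) 6..10=counters.
  // Group 4 is only the wrapper that makes the address optional.
  LinePattern =
    '^(\S+)\s+(\d+)\s+(\S+)(\s+(\S+))?' +
    '\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$';

// Returns True only if the whole line has the expected shape.
function TryParseLine(const ALine: String; out ARec: TIfNet): Boolean;
var
  re: TRegExpr;
begin
  ARec := Default(TIfNet);
  re := TRegExpr.Create;
  try
    re.Expression := LinePattern;
    Result := re.Exec(ALine);
    if not Result then
      Exit;
    ARec.Name := re.Match[1];
    ARec.Network := re.Match[3];
    ARec.Address := re.Match[5]; // empty when the column is missing
    // The pattern guarantees digits; this still rejects values too large for Int64
    Result := TryStrToInt64(re.Match[2], ARec.Mtu)
      and TryStrToInt64(re.Match[6], ARec.Ipkts)
      and TryStrToInt64(re.Match[7], ARec.Ifail)
      and TryStrToInt64(re.Match[8], ARec.Opkts)
      and TryStrToInt64(re.Match[9], ARec.Ofail)
      and TryStrToInt64(re.Match[10], ARec.Colls);
  finally
    re.Free;
  end;
end;

const
  GoodLine = 'lo0     32768 ::1/128     ::1                      0     0        0     0     0';
  BadLine  = 'em0     1500  <Link#1>    some error text in the middle'; // made-up broken line
var
  rec: TIfNet;
begin
  if TryParseLine(GoodLine, rec) then
    WriteLn('ok: ', rec.Name, ' mtu=', rec.Mtu, ' address=', rec.Address);
  if not TryParseLine(BadLine, rec) then
    WriteLn('rejected: ', BadLine);
end.
```

Because the fields are matched by whitespace rather than by column offset, a wider Network column does not break it. If your netstat prints something non-numeric in a counter column, the pattern would have to be relaxed accordingly.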

BSaidus

  • Sr. Member
  • ****
  • Posts: 453
  • lazarus 1.8.4 Win8.1 / cross FreeBSD
Re: Best way to parse file.
« Reply #26 on: January 29, 2023, 11:33:58 am »
   Hello Kays,
   Yes, I know that netstat will output other devices like (ppp|sl|gif|faith|lo|ng|ngwan|vlan|wlan|tun|enc|ipfw|bridge|usbus|pflog), which I will skip.
   I will only get the Ethernet devices with MAC addresses.
   Yes, the BSD systems name interfaces according to the driver name (em for Intel, rl for Realtek, ...).
   I have no other choice than to try to get the driver names and their characteristics from dmesg:
Code:
OpenBSD 7.2 (GENERIC.MP) #4: Mon Dec 12 06:06:42 MST 2022
    root@syspatch-72-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2130640896 (2031MB)
avail mem = 2048720896 (1953MB)
....
pcn0 at pci0 dev 3 function 0 "AMD 79c970 PCnet-PCI" rev 0x10, Am79c970A, rev 0: apic 2 int 19, address 08:00:27:0a:65:2d
...
pcn1 at pci0 dev 8 function 0 "AMD 79c970 PCnet-PCI" rev 0x40, Am79c973, rev 0: apic 2 int 16, address 08:00:27:78:3c:29
em0 at pci0 dev 9 function 0 "Intel 82540EM" rev 0x02: apic 2 int 17, address 08:00:27:8d:4a:c6
em1 at pci0 dev 10 function 0 "Intel 82543GC" rev 0x02: apic 2 int 18, address 08:00:27:00:0c:c8
...
root on wd0a (d735dcbb2deb7613.a) swap on wd0b dump on wd0b
WARNING: clock gained 2 days
WARNING: CHECK AND RESET THE DATE!
 
   Concerning

BSaidus

  • Sr. Member
  • ****
  • Posts: 453
  • lazarus 1.8.4 Win8.1 / cross FreeBSD
Re: Best way to parse file.
« Reply #27 on: January 29, 2023, 11:59:21 am »
Hello, @Warfley
I think there will be no IPv6, because the command requests IPv4 only:

Code:
netstat -in -f inet

Warfley

  • Hero Member
  • *****
  • Posts: 1075
Re: Best way to parse file.
« Reply #28 on: January 29, 2023, 12:27:34 pm »
The example you've posted in the first post already contains the IPv6 loopback address:
Code:
lo0     32768 ::1/128     ::1                      0     0        0     0     0
This is why I mention this

BSaidus

  • Sr. Member
  • ****
  • Posts: 453
  • lazarus 1.8.4 Win8.1 / cross FreeBSD
Re: Best way to parse file.
« Reply #29 on: January 29, 2023, 01:45:18 pm »
The example you've posted in the first post already contains the IPv6 loopback address:
Code:
lo0     32768 ::1/128     ::1                      0     0        0     0     0
This is why I mention this

Yes, but it will be skipped since it is loopback.
Here is the code:
Code: Pascal
type
  tBsd_Interface = record
    if_name: String;
    if_mac: String;
    if_ip: String;
    if_mask: String;
    if_ip6: String;
    if_drv: String;
    if_desc: String;
  end;

  tBsd_Interfaces = array[0..11] of tBsd_Interface;

var
  gBsd_Interfaces: tBsd_Interfaces;

function fpGet_interfaces_list_Lt(): Boolean;
var
  I, J : Integer;
  lNetdata: TStringList;
  lNetdataLine, lName,
  lLink: String;
begin
  Result := False;  // note: Result is never set to True below
  lNetdata := TStringList.Create();
  try
    // Get Interfaces name & mac
    lNetdata.Text := fpExecCmdNetStat();
    lNetdata.Delete(0);  // drop the header line
    J := 0 ;  // Index for the gBsd_Interfaces[J]
    for I := 0 to Pred(lNetdata.Count) do
    begin
      // stop before writing past the end of the fixed-size array
      if J > High(gBsd_Interfaces) then
        Break;
      lNetdataLine := lNetdata.Strings[I];
      lName        := SysUtils.Trim( system.Copy( lNetdataLine,  1,  8 ) );
      lLink        := system.LowerCase( SysUtils.Trim(system.Copy( lNetdataLine, 15, 12 ) ) );
      // bypass lo0, enc0, pflog0, ....
      if ( system.Pos('lo'    , lName   ) <> 0 ) or ( system.Pos('enc'   , lName   ) <> 0 ) or
         ( system.Pos('pflog' , lName   ) <> 0 ) or ( system.Pos('ppp'   , lName   ) <> 0 ) or
         ( system.Pos('sl'    , lName   ) <> 0 ) or ( system.Pos('gif'   , lName   ) <> 0 ) or
         ( system.Pos('faith' , lName   ) <> 0 ) or ( system.Pos('ng'    , lName   ) <> 0 ) or
         ( system.Pos('ngwan' , lName   ) <> 0 ) or ( system.Pos('vlan'  , lName   ) <> 0 ) or
         ( system.Pos('wlan'  , lName   ) <> 0 ) or ( system.Pos('tun'   , lName   ) <> 0 ) or
         ( system.Pos('ipfw'  , lName   ) <> 0 ) or ( system.Pos('bridge', lName   ) <> 0 ) or
         ( system.Pos('usbus' , lName   ) <> 0 ) or ( system.Pos('<link>', lLink   )  = 0 ) then
      begin
        Continue;
      end;
      // delete the * from ex: pcn0*
      if (System.Pos('*', lName) <> 0) then
      begin
        lName := system.Copy( lName, 1, system.Length(lName)-1 );
      end;
      gBsd_Interfaces[J].if_name := lName;
      gBsd_Interfaces[J].if_mac  := SysUtils.Trim( system.Copy( lNetdataLine, 27, 21) ) ;
      J := J + 1;
    end;

    // Get driver names
    lNetdata.Text := fpExecCmddMesg();
    for I := 0 to 11 do
    begin
      lName := gBsd_Interfaces[I].if_name;
      // the first empty slot marks the end of the collected interfaces
      if lName = '' then
      begin
        Exit;
      end;

      for J := 0 to Pred(lNetdata.Count) do
      begin
        lNetdataLine :=  lNetdata.Strings[J];
        // first dmesg line that starts with the interface name
        if System.Pos( lName, lNetdataLine) = 1 then
        begin
          // keep everything from the first double quote to the end of the line
          gBsd_Interfaces[I].if_drv := system.Copy( lNetdataLine, system.Pos('"', lNetdataLine), system.Length(lNetdataLine));
          Break;
        end;
      end;
    end;

  finally
    lNetdata.Free();
  end;
end;
« Last Edit: January 29, 2023, 04:28:14 pm by BSaidus »
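
One thing to watch in the dmesg lookup: a check like Pos(lName, line) = 1 also matches an em10 line when looking for em1. A possible tightening is sketched below, assuming the OpenBSD-style "name at bus ..." lines from the sample above; FreeBSD's dmesg is formatted differently. IsAttachLine and QuotedDescription are hypothetical helper names:

```pascal
program dmesg_desc;
{$mode objfpc}{$H+}

uses
  SysUtils, StrUtils;

// True when ALine is the attach line for exactly AName, e.g. 'em1 at pci0 ...'.
// Comparing against AName + ' at ' keeps 'em1' from matching 'em10 at ...'.
function IsAttachLine(const AName, ALine: String): Boolean;
begin
  Result := (AName <> '') and (Pos(AName + ' at ', ALine) = 1);
end;

// Returns the text between the first pair of double quotes, or '' if none.
function QuotedDescription(const ALine: String): String;
var
  first, second: Integer;
begin
  Result := '';
  first := Pos('"', ALine);
  if first = 0 then
    Exit;
  second := PosEx('"', ALine, first + 1);
  if second = 0 then
    Exit;
  Result := Copy(ALine, first + 1, second - first - 1);
end;

const
  // Two lines taken from the dmesg sample posted earlier in this thread
  Lines: array[0..1] of String = (
    'em0 at pci0 dev 9 function 0 "Intel 82540EM" rev 0x02: apic 2 int 17, address 08:00:27:8d:4a:c6',
    'em1 at pci0 dev 10 function 0 "Intel 82543GC" rev 0x02: apic 2 int 18, address 08:00:27:00:0c:c8');
var
  i: Integer;
begin
  for i := Low(Lines) to High(Lines) do
    if IsAttachLine('em1', Lines[i]) then
      WriteLn('em1 -> ', QuotedDescription(Lines[i]));
end.
```

The same idea applies to the skip list: Pos('lo', lName) <> 0 matches any name that merely contains "lo", so comparing against the start of the name is probably closer to what is intended.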