Author Topic: [SOLVED] Putting unstructured data in one large string into a record  (Read 529 times)

Gizmo

  • Hero Member
  • *****
  • Posts: 751
I have searched for an answer to this, and one may exist, but I'm not quite sure of the right terms to use to find it, so...

I am interested in trying to put a fairly unstructured buffer of data (actually a large widestring) into a record, so I can selectively use certain parts of the string data, but not all of it.

The raw data string (let's call it MyRawData) contains many values of varying length, separated only by a CRLF (0x0D0A), so there's no ':' or ',' etc. For example:

Name Peter Pan
Age 25
Street My Great Street
Town My Awesome Town
County Some County
Country UK
Occupation Some Career

What I would like to do in one part of my program, instead of seeing the entire content of MyRawData, is use:

MyData.Name
MyData.Address
MyData.Occupation

but not use the rest. And somewhere else I might want to use the rest, or some of it :

MyData.Age
MyData.County
MyData.Country

I am not sure, though, how to ensure the right parts of the data string (MyRawData) go into the right parts of the record (MyData) when the values can be of varied length, can contain one or more space characters, and have only a CRLF to separate them. I'm sure it is possible, but I am not sure how. In other words, how do you "parse" an unpredictable string of data and break it into individual fields that may need to hold only 2 or 3 characters, or many? Can anyone help?
« Last Edit: February 26, 2021, 10:56:18 am by Gizmo »
Lazarus 2.0.10 and fpc 3.2.0 - Linux Mint 19 LTS, Windows 10 64 and Mac OSX Catalina
Useful Page to remember : http://wiki.freepascal.org/Cross_compiling#From_Linux_x64_to_Linux_i386

MarkMLl

  • Hero Member
  • *****
  • Posts: 2368
Re: Putting unstructured data in one large string into a record
« Reply #1 on: February 25, 2021, 08:35:51 pm »
I'd suggest that it would be worth changing your subject line to "parsing unstructured data": leave the "record" bit out of it since in practical terms you might want to put it into a database in which case "row" would be a more accurate term... in any event "record" strongly suggests the Pascal data structure of that name which might not be quite what you want.

Naively, I suggest looking at TStringList.Text as described at https://www.freepascal.org/docs-html/current/rtl/classes/tstrings.text.html  Use that to break your initial string up into multiple strings in a list, then iterate over them to trim leading/trailing space and eliminate any embedded = should it occur. After that, use heuristics (technical term: guess) to work out what type of data each line represents and get it into key=value form.
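
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the key is taken to be the first word of each line and to contain no '=', and the helper name ToKeyValue is made up for this example. Once the lines are in key=value form, the standard TStrings.Values property can look a value up by its key.

```pascal
program KeyValueSketch;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: rewrites 'Key some value' lines as 'Key=some value'.
// Assumes the key is the first word of the line and contains no '='.
procedure ToKeyValue(const ARawData: String; AList: TStrings);
var
  Lines: TStringList;
  Line: String;
  I, P: Integer;
begin
  AList.Clear;
  Lines := TStringList.Create;
  try
    Lines.Text := ARawData;          // one list entry per line of raw data
    for I := 0 to Lines.Count - 1 do
    begin
      Line := Trim(Lines[I]);        // drop leading/trailing space
      if Line = '' then
        Continue;                    // skip blank lines
      P := Pos(' ', Line);           // first space ends the key
      if P = 0 then
        AList.Add(Line + '=')        // key with no value
      else
        AList.Add(Copy(Line, 1, P - 1) + '=' +
                  Trim(Copy(Line, P + 1, MaxInt)));
    end;
  finally
    Lines.Free;
  end;
end;

var
  KV: TStringList;
begin
  KV := TStringList.Create;
  try
    ToKeyValue('Name Peter Pan' + #13#10 +
               'Age 25' + #13#10 +
               'Country UK', KV);
    WriteLn(KV.Values['Name']);      // prints: Peter Pan
    WriteLn(KV.Values['Country']);   // prints: UK
  finally
    KV.Free;
  end;
end.
```

Looking values up by key means the code does not depend on the order of the lines, at the cost of assuming every key is a single word.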

MarkMLl
Turbo Pascal v1 on CCP/M-86, multitasking with LAN and graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

Gizmo

  • Hero Member
  • *****
  • Posts: 751
Re: Putting unstructured data in one large string into a record
« Reply #2 on: February 25, 2021, 11:59:50 pm »
Yes, the SL idea might be a more practical compromise, especially as the order in which this data appears is always the same, even if the content varies. So I could then use SL.Strings[x]. Thanks for the idea.

I do think a record, as implied, would be useful to me, and ideas for implementing that would suit me best, but as you suggest it might not be a practical solution in this situation.

GetMem

  • Hero Member
  • *****
  • Posts: 3978
Re: Putting unstructured data in one large string into a record
« Reply #3 on: February 26, 2021, 07:26:29 am »
@Gizmo

You can try something like this:
Code: Pascal
// Requires the Classes unit (TStringList) in the uses clause.
type
  TData = record
    FName: String;
    FAge: String;
    FStreet: String;
    FTown: String;
    FCountry: String;
  end;

var
  DataList: array of TData;

// Parses one block of raw data and appends the result to DataList.
// Relies on the lines always arriving in the order
// Name, Age, Street, Town, Country.
procedure AddRawDataToList(const ARawData: String);
var
  SL: TStringList;
  Data: TData;
  I: Integer;
  Str: String;
begin
  Data := Default(TData);  // start with empty fields
  SL := TStringList.Create;
  try
    SL.Text := ARawData;   // one list entry per line
    for I := 0 to SL.Count - 1 do
    begin
      Str := SL.Strings[I];
      // Strip the label and the space after it, keep the rest as the value.
      case I of
        0: begin
             Delete(Str, 1, Length('Name') + 1);
             Data.FName := Str;
           end;
        1: begin
             Delete(Str, 1, Length('Age') + 1);
             Data.FAge := Str;
           end;
        2: begin
             Delete(Str, 1, Length('Street') + 1);
             Data.FStreet := Str;
           end;
        3: begin
             Delete(Str, 1, Length('Town') + 1);
             Data.FTown := Str;
           end;
        4: begin
             Delete(Str, 1, Length('Country') + 1);
             Data.FCountry := Str;
           end;
      end;
    end;
  finally
    SL.Free;
  end;
  // Grow the dynamic array by one and store the new record at the end.
  I := Length(DataList);
  SetLength(DataList, I + 1);
  DataList[I] := Data;
end;

Add a few records to the list:
Code: Pascal
var
  RawData: String;
begin
  RawData := 'Name Peter Pan' + sLineBreak +
             'Age 25' + sLineBreak +
             'Street My Great Street' + sLineBreak +
             'Town My Awesome Town' + sLineBreak +
             'Country Some County';
  AddRawDataToList(RawData);

  RawData := 'Name X Y' + sLineBreak +
             'Age 47' + sLineBreak +
             'Street The Street' + sLineBreak +
             'Town The Town' + sLineBreak +
             'Country Some other County';
  AddRawDataToList(RawData);
end;

Loop through the list and display the records:
Code: Pascal
var
  I: Integer;
begin
  for I := Low(DataList) to High(DataList) do
    ShowMessage('Name: ' + DataList[I].FName + sLineBreak +
                'Age: ' + DataList[I].FAge + sLineBreak +
                'Street: ' + DataList[I].FStreet + sLineBreak +
                'Town: ' + DataList[I].FTown + sLineBreak +
                'Country: ' + DataList[I].FCountry);
end;
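
The example above depends on each line sitting at a fixed position. If that ever stops being true, a variant that matches on the label instead might look like the sketch below. It is not part of the code above: the TPerson record and the ParsePerson function are hypothetical names, and it assumes every label is a single word followed by a space. It uses the full sample from the opening post, so County and Country are kept apart.

```pascal
program ParseByKey;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

type
  // Hypothetical record covering every label in the sample data.
  TPerson = record
    Name, Age, Street, Town, County, Country, Occupation: String;
  end;

// Fills a TPerson by matching the first word of each line, not its position.
function ParsePerson(const ARawData: String): TPerson;
var
  SL: TStringList;
  I, P: Integer;
  Line, Key, Value: String;
begin
  Result := Default(TPerson);        // start with empty fields
  SL := TStringList.Create;
  try
    SL.Text := ARawData;             // one list entry per line
    for I := 0 to SL.Count - 1 do
    begin
      Line := Trim(SL[I]);
      P := Pos(' ', Line);
      if P = 0 then
        Continue;                    // blank line, or a label with no value
      Key := Copy(Line, 1, P - 1);
      Value := Trim(Copy(Line, P + 1, MaxInt));
      if SameText(Key, 'Name') then Result.Name := Value
      else if SameText(Key, 'Age') then Result.Age := Value
      else if SameText(Key, 'Street') then Result.Street := Value
      else if SameText(Key, 'Town') then Result.Town := Value
      else if SameText(Key, 'County') then Result.County := Value
      else if SameText(Key, 'Country') then Result.Country := Value
      else if SameText(Key, 'Occupation') then Result.Occupation := Value;
      // Unknown labels are silently ignored in this sketch.
    end;
  finally
    SL.Free;
  end;
end;

var
  Person: TPerson;
begin
  Person := ParsePerson('Name Peter Pan' + #13#10 +
                        'Age 25' + #13#10 +
                        'Street My Great Street' + #13#10 +
                        'Town My Awesome Town' + #13#10 +
                        'County Some County' + #13#10 +
                        'Country UK' + #13#10 +
                        'Occupation Some Career');
  // prints: Peter Pan / Some County / UK
  WriteLn(Person.Name, ' / ', Person.County, ' / ', Person.Country);
end.
```

Each part of the program can then read only the fields it cares about, for example Person.Name and Person.Occupation in one place and Person.Age and Person.County in another.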

egsuh

  • Hero Member
  • *****
  • Posts: 682
Re: Putting unstructured data in one large string into a record
« Reply #4 on: February 26, 2021, 07:51:47 am »
TStringList is the best option.

MarkMLl

  • Hero Member
  • *****
  • Posts: 2368
Re: Putting unstructured data in one large string into a record
« Reply #5 on: February 26, 2021, 08:52:22 am »
Beware of anybody who tells you to use a neural network :-)

MarkMLl

Gizmo

  • Hero Member
  • *****
  • Posts: 751
Re: [SOLVED] Putting unstructured data in one large string into a record
« Reply #6 on: February 26, 2021, 10:57:49 am »
Thank you all, and GetMem in particular for putting so much time and effort into providing that example.

I've marked this as solved, as I clearly have several options and ways to think about it. I just wanted to check I wasn't missing some easy and obvious way. Thanks, all.

 
