
Author Topic: 7,970,312 As it turns out.  (Read 1570 times)

JLWest

  • Sr. Member
  • ****
  • Posts: 340
7,970,312 As it turns out.
« on: January 04, 2019, 08:34:24 pm »
I have to deal with them. They won't load in a TStringList or a Listbox.

I guess I could break them up into 10 or 20 files, search for what I need to do and put them all back together again.

I believe they load them into Oracle, add new data and/or edit, and then dump to text. Personal Oracle is limited to 1,000,000, MySQL 300 - 400 K.

So I'm thinking of a boxcar with a glass floor. The railroad ties are the text lines in the files and they pass through a listbox. They're read but not stored, just rewritten until I get to the ones I need to edit. After editing I rewrite them, then close the reads and writes to the file.

I don't know how opening and closing the reads and writes works. Do I have to go to the temp file and delete/rename?
JLWEST 
 FPC 3.0.4, Lazarus IDE v1.8.2 Windows 10 Pro
 Intel i7 770K CPU 4.2GHz 32702MB Ram
GeForce GTX 1080 Graphics - 8 Gig 1.5 Terabyte SSD
3 Terabyte conventional disk space.

lucamar

  • Hero Member
  • *****
  • Posts: 734
Re: 7,970,312 As it turns out.
« Reply #1 on: January 04, 2019, 11:03:31 pm »
Use one (or two) stream(s), but take extra care with boundary conditions: I can tell you, from bitter experience, that it's very easy to mangle a big file while using read/write windows on it.
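
A minimal sketch of the two-stream idea, assuming plain TFileStream objects and a hypothetical 64 KB buffer. It only copies the file block by block, but it shows the boundary condition that is easiest to get wrong: the last block is usually shorter than the buffer, so only the bytes actually read may be written. The names CopyInBlocks and BlockSize are made up for illustration.

```pascal
program StreamCopyDemo;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

const
  BlockSize = 64 * 1024; // hypothetical buffer size, not a recommendation

procedure CopyInBlocks(const SrcName, DstName: string);
var
  Src, Dst: TFileStream;
  Buffer: array of Byte;
  Got: LongInt;
begin
  SetLength(Buffer, BlockSize);
  Src := TFileStream.Create(SrcName, fmOpenRead or fmShareDenyWrite);
  try
    Dst := TFileStream.Create(DstName, fmCreate);
    try
      repeat
        // Read may return fewer bytes than requested (typically the last block)
        Got := Src.Read(Buffer[0], BlockSize);
        if Got > 0 then
          Dst.WriteBuffer(Buffer[0], Got); // write only what was actually read
      until Got <= 0;
    finally
      Dst.Free;
    end;
  finally
    Src.Free;
  end;
end;

begin
  if ParamCount = 2 then
    CopyInBlocks(ParamStr(1), ParamStr(2))
  else
    WriteLn('Usage: StreamCopyDemo <source> <destination>');
end.
```

Note that a text line can straddle two blocks, so any per-line editing on top of this would have to carry the unfinished tail of one block over to the next.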
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus 1.8.4/FPC 3.0.4 on:
(K)Ubuntu 12..16, Windows XP SP3 (Home/Prof.) and various DOS incarnations.

jamie

  • Hero Member
  • *****
  • Posts: 1123
Re: 7,970,312 As it turns out.
« Reply #2 on: January 04, 2019, 11:56:41 pm »
Why not just read in a small chunk at a time, process it, write it out, and then read in another small chunk, etc.?
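
One way this could look, as a rough sketch rather than a finished tool: read one line, edit it if needed, write it to a temporary file, and swap the files only after both are closed. The edit rule in EditLine (replacing 'OLD' with 'NEW' on lines starting with '1301 ') and the '.tmp' / '.bak' names are assumptions for illustration only.

```pascal
program RewriteDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical edit rule, for illustration only
function EditLine(const Line: string): string;
begin
  if Copy(Line, 1, 5) = '1301 ' then
    Result := StringReplace(Line, 'OLD', 'NEW', [])
  else
    Result := Line; // every other line passes through unchanged
end;

procedure RewriteFile(const FileName: string);
var
  InFile, OutFile: TextFile;
  TempName, BackupName, Line: string;
begin
  TempName := FileName + '.tmp';
  BackupName := FileName + '.bak';
  AssignFile(InFile, FileName);
  AssignFile(OutFile, TempName);
  Reset(InFile);                 // open the original for reading
  try
    Rewrite(OutFile);            // create the temporary file for writing
    try
      while not EOF(InFile) do
      begin
        ReadLn(InFile, Line);    // only one line is held in memory at a time
        WriteLn(OutFile, EditLine(Line));
      end;
    finally
      CloseFile(OutFile);
    end;
  finally
    CloseFile(InFile);
  end;
  // Both files are closed now, so they can be renamed.
  if FileExists(BackupName) then
    DeleteFile(BackupName);
  if not RenameFile(FileName, BackupName) then
    raise Exception.Create('Could not rename ' + FileName);
  if not RenameFile(TempName, FileName) then
    raise Exception.Create('Could not rename ' + TempName);
end;

begin
  if ParamCount = 1 then
    RewriteFile(ParamStr(1))
  else
    WriteLn('Usage: RewriteDemo <file>');
end.
```

Keeping the original as a '.bak' file instead of deleting it leaves a way back if the edit rule turns out to be wrong. WriteLn uses the platform's line ending, so the rewritten file may end up with different line endings than the original.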

JLWest

  • Sr. Member
  • ****
  • Posts: 340
Re: 7,970,312 As it turns out.
« Reply #3 on: January 05, 2019, 12:29:09 am »
I don't know. I don't think I know the best way for my skill level.

If lucamar has trouble it's probably a bit over my pay scale.

rvk

  • Hero Member
  • *****
  • Posts: 3525
Re: 7,970,312 As it turns out.
« Reply #4 on: January 05, 2019, 12:40:54 am »
Reading this topic is really confusing if you didn't read your previous one  %)

What is it that you want, exactly?
You have a file with 8 million lines and you want to convert it to another file, taking some fields from the big one?

How large (in bytes) is the big file?

If you post about 100 lines of that file and a few sample lines of your desired result I'm sure someone can give you a code snippet to convert it.

lucamar

  • Hero Member
  • *****
  • Posts: 734
Re: 7,970,312 As it turns out.
« Reply #5 on: January 05, 2019, 12:41:20 am »
Quote
I don't know. I don't think I know the best way for my skill level.

If lucamar has trouble it's probably a bit over my pay scale.

It was a lot of years ago, when a "big" file might be anything over 100 Kb and I was just a junior programmer. :)

The problem here is that we don't really know what you have to do so it's difficult to guess which approach would work best. Can't you give us a more detailed specification?

Is this related to your other post "Just advise"?

ETA: BTW, if you are looking for an industrial-strength database, check PostgreSQL
« Last Edit: January 05, 2019, 12:47:18 am by lucamar »

JLWest

  • Sr. Member
  • ****
  • Posts: 340
Re: 7,970,312 As it turns out.
« Reply #6 on: January 05, 2019, 03:14:55 am »
Working on a reply and some sample data. Had it typed in but it timed me out and I lost my reply.

JLWest

  • Sr. Member
  • ****
  • Posts: 340
Re: 7,970,312 As it turns out.
« Reply #7 on: January 05, 2019, 04:15:36 am »
When I read it I count 7.9 million records.
Explorer says it's 285,956 KB.

Why am I bothering you with this? I suppose because I can. And you guys are the Holy Grail.

If you Google X-Plane 11 KLAX you will be looking at KLAX as it appears in X-Plane 11. Everything except buildings and airplanes is described in the AptDotDat file: signs, taxiways, ground markings, runways, lights, ground vehicles and on and on and on.

A record with a '1 ' in the first column is the start of an airport. The record I'm interested in is the '1301 ' ramp start metadata, which says where the airplanes are parked, who parks there and what size airplane is allowed for what kind of operation: cargo, airline, military.
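
As a small sketch of how the leading code could be picked out of each line, assuming the code is a number followed by a space: the function name RowCode and the sample lines below are made up for illustration and are not copied from a real AptDotDat file.

```pascal
program RowCodeDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Returns the numeric code at the start of a line, or -1 if there is none.
function RowCode(const Line: string): Integer;
var
  P: Integer;
begin
  P := Pos(' ', Line);
  if P = 0 then
    P := Length(Line) + 1;   // no space: treat the whole line as the code
  Result := StrToIntDef(Copy(Line, 1, P - 1), -1);
end;

const
  // Made-up sample lines; only the leading codes matter here
  Sample: array[0..3] of string = (
    '1 airport header fields',
    '1300 ramp start fields',
    '1301 ramp start metadata fields',
    '');

var
  I, Airports, RampMeta: Integer;
begin
  Airports := 0;
  RampMeta := 0;
  for I := Low(Sample) to High(Sample) do
    case RowCode(Sample[I]) of
      1:    Inc(Airports);   // start of an airport
      1301: Inc(RampMeta);   // the record of interest
    end;
  WriteLn('Airports: ', Airports, ', 1301 records: ', RampMeta);
end.
```

Comparing the whole code, rather than testing whether the line merely starts with '1', avoids mistaking a '1301 ' or '100 ' line for the start of an airport.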

Some airports may have 15 or 20 of these records, some 300 - 400. I tried to cut KLAX out with Notepad++ but it crashes to the desktop, so I have to write a quick and dirty.

Why do I need access? A Canadian wrote a program called World Traffic 3 that is supposed to populate an airport with aircraft and park them at the gates. They land and take off. It makes an airport look real.

He is a one-man band, hasn't released a version update in over a year, and his program doesn't work. Claims he's still in the game.

To be truthful it never worked well.

One day I decided I would try and figure it out. Bought a payware airport KPDX Portland OR. and set it up in custom scenery. In custom scenery you only have to deal with the AptDotDat for a single airport. You can set the gate assignments a lot easier.

Now someone else supplied the flight information for WT3 called AFRE_Real_Time_Flights. Comes from an X-Planer out of Eastern Europe. Everyone was really big on that. Seldom got a flight parked at an airport with that data.

So I got KPDX with taxi routes and gate assignments and hand targeted flight information. Started X-Plane set the WT3 slider to 10,000 flights, Airport density 80% and pointed it at my hand edited targeted flight data. Got 78% density, takeoff is 8 min and landings 3 min later.  Now I'm waiting in line to take off.

I guess you could buy all the airports and set them up in Custom scenery but the data really needs to be in the global. It's the kind of thing people develop as an add on for X-Plane and sell.

Have no interest in that at all. Don't need the money at my age and would like to see WT3 flourish. So if I can do this I'll post it as a free download.

I'll be working on some data and start a new subject.
Thanks for the interest.

HeavyUser

  • Sr. Member
  • ****
  • Posts: 252
Re: 7,970,312 As it turns out.
« Reply #8 on: January 05, 2019, 04:26:08 am »
Quote
When I read it I count 7.9 million records.
Explorer says it's 285,956 KB.

That's impossible. Assuming that the format follows some standard (i.e. each record ends with a #10 character), counting only the end-of-record bytes would already make your file about 7.6 MB long. Assuming it contains more than empty records, with an average of 10 characters per record, a file with 7.9 million records should be around 76 MB. There is no way it will be in the KB range. So I guess you did not count properly or ..... ?

Blaazen

  • Hero Member
  • *****
  • Posts: 2731
  • POKE 54296,15
    • Eye-Candy Controls
Re: 7,970,312 As it turns out.
« Reply #9 on: January 05, 2019, 04:56:38 am »
Quote
Explorer says it's 285,956 KB.

It should mean ~280 MB (I guess that "," is a thousands separator here, not a decimal point).
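
The arithmetic can be checked with a few lines, assuming Explorer's KB means 1024 bytes. Both figures come from the posts above; it works out to about 279 MB and roughly 37 bytes per line, which is plausible for a text record.

```pascal
program SizeCheck;
{$mode objfpc}{$H+}
const
  SizeKB    = 285956;    // size reported by Explorer, in KB
  LineCount = 7970312;   // records counted while reading
var
  SizeBytes: Int64;
begin
  SizeBytes := Int64(SizeKB) * 1024;   // assumes 1 KB = 1024 bytes
  WriteLn('Size in MB:     ', SizeKB / 1024:0:2);
  WriteLn('Bytes per line: ', SizeBytes / LineCount:0:2);
end.
```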
Lazarus 2.1.0 r59757M FPC 3.3.1 r40507 x86_64-linux-qt Chakra, Qt 4.8.7/5.11.2, Plasma 5.14.2
Lazarus 1.8.2 r57369 FPC 3.0.4 i386-win32-win32/win64 Wine 3.21

Try Eye-Candy Controls: https://sourceforge.net/projects/eccontrols/files/

engkin

  • Hero Member
  • *****
  • Posts: 2194
Re: 7,970,312 As it turns out.
« Reply #10 on: January 05, 2019, 05:26:42 am »
Quote
I'm reading 8,000,000+ 7,970,312 records out of a text file into a listbox. But I only need to examine 78,000 plus or minus. The rest I just throw on the floor.

Here is a simple way:
Code: Pascal  [Select]
var
  fileName: string;
  f: Text;
  Line: string;
  RecordList: TStringList;
begin
  fileName := 'C:\Path\To\Your\FileName.txt';

  { AssignFile/CloseFile avoid a name clash with TForm.Assign/Close inside a form method }
  AssignFile(f, fileName);
  Reset(f);

  { prepare a place for important records }
  RecordList := TStringList.Create;
  try
    while not EOF(f) do  { Not End of File? }
    begin
      ReadLn(f, Line); { read one line/record from the file }

      // do you need this line/record?
      // ThrowOnTheFloor is a placeholder for your own test
      if not ThrowOnTheFloor(Line) then
        RecordList.Add(Line);
    end;

    { Do what you want with the important records }
    // ExamineRecord is a placeholder for your own processing
    for Line in RecordList do
      ExamineRecord(Line);

  finally
    RecordList.Free;
    CloseFile(f);  { always close the file, even after an error }
  end;
end;

JLWest

  • Sr. Member
  • ****
  • Posts: 340
Re: 7,970,312 As it turns out.
« Reply #11 on: January 05, 2019, 06:03:15 am »
I double checked the code.

RCDCtr : Integer = 0;

readln(TFIn, TFRecord);
Inc(RCDCtr);
lblReads.Caption := IntToStr(RCDCtr);


Will try and post the code along with a small data file with 29 records.
Bear in mind it only loads 13 records but reads 29. You can check that with notepad++.
The reads are counted in   procedure TForm1.ProcessAptDotDatIn;   

       
« Last Edit: January 05, 2019, 06:09:23 am by JLWest »

JLWest

  • Sr. Member
  • ****
  • Posts: 340
Re: 7,970,312 As it turns out.
« Reply #12 on: January 05, 2019, 06:07:41 am »
@engkin

That was a few steps back. Now I need to keep the file intact. All 7,970,312 records. Just need to edit a few thousand.

HeavyUser

  • Sr. Member
  • ****
  • Posts: 252
Re: 7,970,312 As it turns out.
« Reply #13 on: January 05, 2019, 08:10:18 am »
Quote
@engkin

That was a few steps back. Now I need to keep the file intact. All 7,970,312 records. Just need to edit a few thousand.

Then you have to remember each record's position in the file as well as the record data (for example, by increasing a counter every time you read a line). After you have finished editing, read the original file again from the start and write every line you do not have in memory to a temporary or new file, one by one. When you reach the first line you have in memory, write it from memory instead of from the file (but still read the old data, to force the file cursor to move to the next record), then continue writing the lines you read from the file until you reach the next line you have in memory, and so on.
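
The procedure described above can be sketched roughly as follows. This is an illustration under assumptions, not a tested tool: the edited lines sit in a TStringList in ascending line order, with the 1-based line number stored in Objects[], and the file names and the MergeEdits procedure are hypothetical.

```pascal
program MergeEditsDemo;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

// Copies SrcName to DstName. Edits holds the replacement lines in ascending
// line order; Objects[i] carries the 1-based line number each one replaces.
procedure MergeEdits(const SrcName, DstName: string; Edits: TStringList);
var
  Src, Dst: TextFile;
  Line: string;
  LineNo: PtrInt;
  NextEdit: Integer;
begin
  LineNo := 0;
  NextEdit := 0;
  AssignFile(Src, SrcName);
  AssignFile(Dst, DstName);
  Reset(Src);
  try
    Rewrite(Dst);
    try
      while not EOF(Src) do
      begin
        ReadLn(Src, Line);   // always read, so the file position advances
        Inc(LineNo);
        if (NextEdit < Edits.Count) and
           (PtrInt(Edits.Objects[NextEdit]) = LineNo) then
        begin
          WriteLn(Dst, Edits[NextEdit]);  // write the edited copy from memory
          Inc(NextEdit);
        end
        else
          WriteLn(Dst, Line);             // untouched line, copied as read
      end;
    finally
      CloseFile(Dst);
    end;
  finally
    CloseFile(Src);
  end;
end;

var
  Edits: TStringList;
  F: TextFile;
  I: Integer;
begin
  // Build a tiny sample file (hypothetical name) so the sketch runs on its own
  AssignFile(F, 'merge_demo_in.txt');
  Rewrite(F);
  for I := 1 to 5 do
    WriteLn(F, 'line ', I);
  CloseFile(F);

  Edits := TStringList.Create;
  try
    Edits.AddObject('EDITED 2', TObject(PtrInt(2)));
    Edits.AddObject('EDITED 4', TObject(PtrInt(4)));
    MergeEdits('merge_demo_in.txt', 'merge_demo_out.txt', Edits);
  finally
    Edits.Free;
  end;
end.
```

Because the edits are consumed in file order, a single index (NextEdit) is enough and no lookup per line is needed. With a few thousand edits out of nearly 8 million lines, memory use stays small.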

Handoko

  • Hero Member
  • *****
  • Posts: 2766
  • My goal: build my own game engine using Lazarus
Re: 7,970,312 As it turns out.
« Reply #14 on: January 05, 2019, 09:33:59 am »
Quote
I have to deal with them. They won't load in a TStringList or a Listbox.

Are you sure?

Below is a demo that generates a text file containing 10,000,000 records and then searches through it. It runs perfectly okay on my Linux 64-bit machine.

Code: Pascal  [Select]
  1. unit Unit1;
  2.  
  3. {$mode objfpc}{$H+}
  4.  
  5. interface
  6.  
  7. uses
  8.   Classes, SysUtils, Forms, Controls, Dialogs, StdCtrls,
  9.   ComCtrls, ExtCtrls, LCLType, LCLIntf;
  10.  
  11. type
  12.  
  13.   { TForm1 }
  14.  
  15.   TForm1 = class(TForm)
  16.     btnGenerate: TButton;
  17.     btnSearch: TButton;
  18.     Label1: TLabel;
  19.     Memo1: TMemo;
  20.     Panel1: TPanel;
  21.     ProgressBar1: TProgressBar;
  22.     procedure btnGenerateClick(Sender: TObject);
  23.     procedure btnSearchClick(Sender: TObject);
  24.     procedure FormCreate(Sender: TObject);
  25.   private
  26.     function RandomChars(Count: Integer): string;
  27.   end;
  28.  
  29. var
  30.   Form1: TForm1;
  31.  
  32. implementation
  33.  
  34. const
  35.   DataFileName = 'Data.txt';
  36.  
  37. {$R *.lfm}
  38.  
  39. { TForm1 }
  40.  
  41. procedure TForm1.btnGenerateClick(Sender: TObject);
  42. const
  43.   MaxRecordCount = 10000000;
  44. var
  45.   OutputFile : TextFile;
  46.   i          : LongInt;
  47. begin
  48.  
  49.   // Preparing
  50.   btnSearch.Enabled  := False;
  51.   Panel1.Caption     := 'Generating ' + MaxRecordCount.ToString + ' items';
  52.   Panel1.Visible     := True;
  53.   ProgressBar1.Min   := 0;
  54.   ProgressBar1.Max   := MaxRecordCount;
  55.   ProgressBar1.Style := pbstNormal;
  56.   AssignFile(OutputFile, DataFileName);
  57.   Rewrite(OutputFile);
  58.  
  59.   // Generating
  60.   for i := 1 to MaxRecordCount do
  61.   begin
  62.     WriteLn(OutputFile, 'Item' + i.ToString + ' ' + RandomChars(8));
  63.     ProgressBar1.Position := i;
  64.     if (i mod 50) <> 0 then Continue; // this line is for better performance
  65.     Application.ProcessMessages;
  66.     if GetKeyState(VK_ESCAPE) < 0 then Break; // Allow user cancellation
  67.   end;
  68.  
  69.   // Finishing
  70.   CloseFile(OutputFile);
  71.   ShowMessage(ProgressBar1.Position.ToString + ' items generated.');
  72.   Panel1.Visible    := False;
  73.   btnSearch.Enabled := True;
  74.   Memo1.Clear;
  75.  
  76. end;
  77.  
  78. procedure TForm1.btnSearchClick(Sender: TObject);
  79. var
  80.   InputFile : TextFile;
  81.   Count     : LongInt;
  82.   S         : string;
  83. begin
  84.  
  85.   if not(FileExists(DataFileName)) then
  86.   begin
  87.     ShowMessage('Data file not found.' + LineEnding +
  88.       'Please generate the data first.');
  89.     Exit;
  90.   end;
  91.   // Preparing
  92.   btnGenerate.Enabled := False;
  93.   Panel1.Caption      := 'Searching';
  94.   Panel1.Visible      := True;
  95.   ProgressBar1.Min    := 0;
  96.   ProgressBar1.Max    := 100;
  97.   ProgressBar1.Style  := pbstMarquee;
  98.   Memo1.Visible       := False;
  99.   Memo1.Clear;
  100.   AssignFile(InputFile, DataFileName);
  101.   Reset(InputFile);
  102.   Count := 0;
  103.  
  104.   // Searching
  105.   while not EOF(InputFile) do
  106.   begin
  107.     ReadLn(InputFile, S);
  108.     if Pos('hi', S) > 0 then
  109.     begin
  110.       Memo1.Append(S);
  111.       Inc(Count);
  112.       if (Count mod 50) <> 0 then Continue; // this line is for better performance
  113.       Application.ProcessMessages;
  114.       if GetKeyState(VK_ESCAPE) < 0 then Break; // Allow user cancellation
  115.     end;
  116.   end;
  117.  
  118.   // Finishing
  119.   CloseFile(InputFile);
  120.   ShowMessage('Found ' + Count.ToString + ' records containing ''hi''');
  121.   Memo1.Visible       := True;
  122.   Panel1.Visible      := False;
  123.   btnGenerate.Enabled := True;
  124.  
  125. end;
  126.  
  127. procedure TForm1.FormCreate(Sender: TObject);
  128. begin
  129.   Memo1.Clear ;
  130.   Memo1.ScrollBars                  := ssAutoVertical;
  131.   ProgressBar1.BorderSpacing.Around := 4;
  132.   ProgressBar1.Align                := alBottom;
  133.   Panel1.Align                      := alBottom;
  134.   Panel1.Visible                    := False;
  135. end;
  136.  
  137. function TForm1.RandomChars(Count: Integer): string;
  138. var
  139.   S: string;
  140.   i: Integer;
  141. begin
  142.   if Count < 1  then Count := 1;
  143.   if Count > 20 then Count := 20;
  144.   S := '';
  145.   for i := 1 to Count do
  146.     S := S + chr(Ord('a')+Random(26));
  147.   Result := S;
  148. end;
  149.  
  150. end.

Note:
  • The generated text file is ±208 MB
  • It may run very slowly on old computers
  • You can change line #43 if you want to test bigger data
  • Lines #64 and #112 improve performance by processing GUI messages only once every 50 items (line #64) or matches (line #112), at the cost of some responsiveness
  • You should add an exception-handling block (which I rarely use)
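
For the last point, a minimal sketch of what such a block might look like, assuming the default {$I+} setting with SysUtils in use, so that I/O failures surface as EInOutError. The CountLines function is made up for illustration; 'Data.txt' is the file name used in the demo above.

```pascal
program SafeReadDemo;
{$mode objfpc}{$H+}
{$I+} // I/O checking on: with SysUtils, I/O errors are raised as EInOutError
uses
  SysUtils;

function CountLines(const FileName: string): Int64;
var
  F: TextFile;
  S: string;
begin
  Result := 0;
  AssignFile(F, FileName);
  Reset(F);               // raises EInOutError if the file cannot be opened
  try
    while not EOF(F) do
    begin
      ReadLn(F, S);
      Inc(Result);
    end;
  finally
    CloseFile(F);         // runs even if reading fails part-way
  end;
end;

begin
  try
    WriteLn(CountLines('Data.txt'), ' lines');
  except
    on E: EInOutError do
      WriteLn('I/O error ', E.ErrorCode, ': ', E.Message);
  end;
end.
```

The try..finally guarantees the file is closed, while the outer try..except decides what to tell the user. In the GUI demo the except branch would more likely call ShowMessage and re-enable the buttons.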