
Topic: The function ReadLn(file, str) ignores tab characters.  (Read 1371 times)

Galmer

  • New Member
  • *
  • Posts: 10
The function ReadLn(file, str) ignores tab characters.
« on: April 07, 2024, 01:02:27 pm »
Hi all!
Tell me why the ReadLn function ignores tab characters when reading from a file?

I have a file in which the data is separated by tab characters, but when this function reads from the file, no tab characters are found. At the same time, if you use the Read(file, char) function, the tab character (#10) is detected.

If nothing can be done about this, can someone tell me how to create strings character by character? I'm just new to Pascal, before this I programmed more in C.

Thaddy

  • Hero Member
  • *****
  • Posts: 15507
  • Censorship about opinions does not belong here.
Re: The function ReadLn(file, str) ignores tab characters.
« Reply #1 on: April 07, 2024, 01:08:54 pm »
The TAB character is #9, NOT #10: #10 is a linefeed (LF). That is the same in C, by the way.
ReadLn does not return the linefeed because it is part of the line ending, which ReadLn consumes instead of storing it in the string. What a line ending looks like depends on the platform. You can use the predefined constant sLineBreak for cross-platform code (or LineEnding, which is the same).
These resolve to #13#10 on Windows, DOS and family, and to just #10 on Unix. #13 is CR, a carriage return, again the same as in C. On the Windows/DOS family ReadLn treats #13#10 as one line ending, so neither character ends up in the string.
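
To make the point above concrete, here is a minimal sketch (not from the original poster's program) that writes a small tab-separated file, reads it back with ReadLn, and builds each field character by character. The file name tabdemo.txt and the sample words are assumptions made up for the example.

```pascal
program TabDemo;
{$mode objfpc}{$H+}

const
  TabChar  = #9;             // horizontal tab; #10 would be a line feed
  DemoName = 'tabdemo.txt';  // hypothetical file name, used only for this demo

var
  F: Text;
  Line, Field: string;
  i: Integer;
begin
  // Create a small tab-separated file so the example is self-contained.
  Assign(F, DemoName);
  Rewrite(F);
  WriteLn(F, 'alpha', TabChar, 'beta', TabChar, 'gamma');
  Close(F);

  Reset(F);
  while not Eof(F) do
  begin
    ReadLn(F, Line);  // the line ending is consumed, the tabs stay in Line
    WriteLn('first tab at position ', Pos(TabChar, Line));

    // Build each field character by character.
    Field := '';
    for i := 1 to Length(Line) do
      if Line[i] = TabChar then
      begin
        WriteLn('field: ', Field);
        Field := '';
      end
      else
        Field := Field + Line[i];
    WriteLn('field: ', Field);  // the last field has no trailing tab
  end;
  Close(F);
  Erase(F);  // remove the demo file again
end.
```

Searching for #10 in Line instead of #9 would find nothing, because the line feed belongs to the line ending and never reaches the string.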

Things change a bit when you misuse strings to read binary data, which is a common mistake among people coming from C.
You cannot assume that a char * is binary data, since char here refers to a Pascal Char, not a byte. If that is what you need, report back and I will write you a simple example of how to handle binary data that C code would treat as char *.
« Last Edit: April 07, 2024, 01:26:46 pm by Thaddy »
My great hero has found the key to the highway. Rest in peace John Mayall.
Playing: "Broken Wings" in your honour. As well as taking out some mouth organs.

Galmer

  • New Member
  • *
  • Posts: 10
Re: The function ReadLn(file, str) ignores tab characters.
« Reply #2 on: April 07, 2024, 01:23:21 pm »
Thank you very much, you helped me a lot! But now I'm ashamed of my stupidity. I spent so much time simply because I mixed up the code of the tab character...

Please tell me why the search for '\t' doesn't work then?

Thaddy

  • Hero Member
  • *****
  • Posts: 15507
  • Censorship about opinions does not belong here.
Re: The function ReadLn(file, str) ignores tab characters.
« Reply #3 on: April 07, 2024, 01:34:29 pm »
Because \t starts with \, which is the escape character in C: it tells the C compiler to interpret what follows as a special value and not as a literal 't'. Pascal is not aware of C escapes, only of its own # notation, so in a Pascal string '\t' is simply two characters, a backslash and the letter t, and a search for it finds nothing. C's \t corresponds to #9 in Pascal. Both denote the value 9 in the ASCII table, not the character '9'. Neither C nor Pascal stores the \ or the #, just the value 09 as a single byte.
The escape notation exists only in the source code. It is not part of the data, unless you deliberately store the notation itself as text.
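
A small sketch of the difference, assuming nothing beyond the System unit. The variable names and the sample string are made up for illustration.

```pascal
program EscapeDemo;
{$mode objfpc}{$H+}

var
  CStyle, PascalTab, Sample: string;
begin
  CStyle    := '\t';       // two characters: a backslash and the letter t
  PascalTab := #9;         // one character with byte value 9
  Sample    := 'a'#9'b';   // 'a', a real tab, 'b'

  WriteLn(Length(CStyle));          // prints 2
  WriteLn(Length(PascalTab));       // prints 1
  WriteLn(Ord(PascalTab[1]));       // prints 9
  WriteLn(Pos(CStyle, Sample));     // prints 0: the text \t is not in Sample
  WriteLn(Pos(PascalTab, Sample));  // prints 2: the tab is the second character
end.
```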
« Last Edit: April 07, 2024, 01:43:31 pm by Thaddy »

Galmer

  • New Member
  • *
  • Posts: 10
Re: The function ReadLn(file, str) ignores tab characters.
« Reply #4 on: April 07, 2024, 01:49:04 pm »
Thanks again!

Should I delete this topic if the problem is my stupid mistake?
Will it make it difficult to search the forum?

Thaddy

  • Hero Member
  • *****
  • Posts: 15507
  • Censorship about opinions does not belong here.
Re: The function ReadLn(file, str) ignores tab characters.
« Reply #5 on: April 07, 2024, 02:28:33 pm »
No, just leave it. Although it has been answered many times in some form or other, it is always good that it shows up easily in a search, and not all answers are correct. Mine is.

KodeZwerg

  • Hero Member
  • *****
  • Posts: 2269
  • Fifty shades of code.
    • Delphi & FreePascal
Re: The function ReadLn(file, str) ignores tab characters.
« Reply #6 on: April 07, 2024, 02:29:20 pm »
Quote from Galmer: "can someone tell me how to create strings character by character? I'm just new to Pascal, before this I programmed more in C."
Welcome to the World of Pascal!
I would suggest learning what Lazarus/FreePascal offers to make things way easier.
From the conversation above I am still unsure what exactly you want to use it for, so I will just show a super easy to follow way to load a plain ASCII file into a list of strings.
(just written in a hurry in here, untested)
You need to include the units SysUtils and Classes.
Code: Pascal
program ShowFile;
{$mode objfpc}{$H+}
uses
  SysUtils, Classes;
const
  FileName = 'full path to a textfile.txt';
var
  sl: TStringList;
  i: Integer;
begin
  // pre-check that the target exists
  if not FileExists(FileName) then
    Exit; // stop here without executing the following code
  // create a stringlist instance
  sl := TStringList.Create;
  try
    // load the file into the stringlist
    sl.LoadFromFile(FileName);
    // at this point everything about loading is already done
    // we just check whether it found something
    if sl.Count > 0 then // Count is the number of loaded lines
    begin
      // do something with the content
      // you can access every char of every line
      // you can also split each line into another list to get "words"
      // for demonstration, just show the content line by line in the CLI
      // i indexes the list, which starts at 0 and not 1, so the last valid index is Count - 1
      for i := 0 to Pred(sl.Count) do
        WriteLn(sl.Strings[i]); // here we access a line to be displayed
    end;
  finally
    sl.Free; // free the created instance
  end;
end.
The TStringList class offers way more possibilities out-of-the-box, like delimited text, so you could have by default an array of "words".
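
As a rough sketch of the delimited-text feature mentioned above, the following splits one tab-separated line into fields. The sample line is made up; in real code it would be one of the lines loaded from the file.

```pascal
program SplitDemo;
{$mode objfpc}{$H+}

uses
  Classes;

var
  Fields: TStringList;
  i: Integer;
begin
  Fields := TStringList.Create;
  try
    Fields.Delimiter := #9;          // split on the tab character
    Fields.StrictDelimiter := True;  // do not treat spaces as delimiters too
    // Hypothetical input line: three fields separated by tabs.
    Fields.DelimitedText := 'first name'#9'last name'#9'42';
    for i := 0 to Fields.Count - 1 do
      WriteLn(i, ': ', Fields[i]);
  finally
    Fields.Free;
  end;
end.
```

StrictDelimiter matters here: without it, spaces inside a field would also split it, so 'first name' would end up as two entries.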
« Last Edit by KodeZwerg »

jamie

  • Hero Member
  • *****
  • Posts: 6518
Re: The function ReadLn(file, str) ignores tab characters.
« Reply #7 on: April 07, 2024, 06:54:05 pm »
I wonder what would happen if you did this:

Code: Pascal
// After the file has been opened and before it is actually used.
// F stands for your own Text variable (File itself is a reserved word).
TTextRec(F).LineEnd := #10;
The only true wisdom is knowing you know nothing

PeterHu

  • New Member
  • *
  • Posts: 20
Re: The function ReadLn(file, str) ignores tab characters.
« Reply #8 on: August 04, 2024, 04:53:16 am »
I tried KodeZwerg's code from Reply #6. It is great that it reads and prints the file content properly in the console (Win10) for various encodings (ASCII, Unicode, UTF-8/UTF-16 with or without BOM), except:
1. UTF-8 without BOM shows garbage;
2. for a file name with Chinese characters, FileExists(fname) seems to return False.

Help would be appreciated.

Thaddy

  • Hero Member
  • *****
  • Posts: 15507
  • Censorship about opinions does not belong here.
Re: The function ReadLn(file, str) ignores tab characters.
« Reply #9 on: August 04, 2024, 07:29:52 am »
The console default on Windows 10 is usually NOT UTF-8, but a legacy code page (the system OEM code page, for example 437 or 850).
You can set the console code page with these two calls from the Windows unit, somewhere near the top of your program:
Code: Pascal
  SetConsoleCP(CP_UTF8);       // code page for console input
  SetConsoleOutputCP(CP_UTF8); // code page for console output
Then UTF-8, with or without BOM, will be shown correctly.
« Last Edit: August 04, 2024, 07:33:01 am by Thaddy »

PeterHu

  • New Member
  • *
  • Posts: 20
Re: The function ReadLn(file, str) ignores tab characters.
« Reply #10 on: August 05, 2024, 06:10:23 am »
Thank you!

 
