Topic: readln and lineEnding

jack616

readln and lineEnding
« on: March 31, 2015, 12:07:42 am »

When readln reads a line of text, it silently drops the line ending.
Is there a variable that holds the actual line ending that was found?
I'm looking for the simplest way to re-inject a blank line in the
source file's own format, without knowing beforehand which line ending
the originating text uses.

TIA

howardpc

Re: readln and lineEnding
« Reply #1 on: March 31, 2015, 12:22:35 am »
You could read part of the text file as a binary file into an appropriately sized buffer, and scan the buffer for a linefeed character, then check if there is an adjacent carriage return or not.
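
A minimal sketch of that idea in Free Pascal, for illustration only. The program name, the function name DetectLineEnding and the 4096-byte block size are assumptions, not something from this thread. It reads only the first block of the file as raw bytes and reports the first line ending it can identify.

```pascal
program LineEndDemo;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: reads the first block of a file as raw bytes and
// returns the first line ending found (#13#10, #10 or #13), or '' if none
// can be identified within that block.
function DetectLineEnding(const FileName: string): string;
var
  FS: TFileStream;
  Buf: array[0..4095] of Byte;
  Count, i: Integer;
begin
  Result := '';
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    Count := FS.Read(Buf, SizeOf(Buf));
  finally
    FS.Free;
  end;
  for i := 0 to Count - 1 do
    case Buf[i] of
      10: // LF: check whether a CR sits directly before it
        begin
          if (i > 0) and (Buf[i - 1] = 13) then
            Result := #13#10
          else
            Result := #10;
          Exit;
        end;
      13: // CR: only a lone CR if the next byte is available and is not LF
        if (i + 1 < Count) and (Buf[i + 1] <> 10) then
        begin
          Result := #13;
          Exit;
        end;
    end;
end;

var
  LE: string;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: lineenddemo <file>');
    Halt(1);
  end;
  LE := DetectLineEnding(ParamStr(1));
  if LE = #13#10 then
    WriteLn('CRLF')
  else if LE = #10 then
    WriteLn('LF')
  else if LE = #13 then
    WriteLn('CR')
  else
    WriteLn('no line ending found in the first block');
end.
```

A CR that happens to be the very last byte of the block is left undecided in this sketch, because the following byte is not available to check.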

ChrisF

Re: readln and lineEnding
« Reply #2 on: March 31, 2015, 03:07:35 am »
This probably won't work in all cases (depending mostly on the file data), and a lot of checks are missing, but just for fun...

Code: [Select]
program ATest;

uses SysUtils;

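{$IOCHECKS OFF} // Disable I/O runtime checks so an I/O error doesn't abort the program

// Call right after ReadLn: looks at the text buffer to see which line ending was just consumed.
function WhichLineBreak(var TF: TextFile; var sLineEndFound: string): boolean;
var c1, c2: Char;
begin
  result := false;
  sLineEndFound := '';
  // Give up if the buffer is empty or BufPos lies beyond the buffered data
  if (TextRec(TF).BufEnd < 1) or (TextRec(TF).BufPos > (TextRec(TF).BufEnd) + 1) then
    exit;
  // Last character consumed by ReadLn
  c1 := TextRec(TF).Bufptr^[TextRec(TF).BufPos - 1];
  if c1 in [#10, #13] then
    begin
      // Character just before it, to tell a two-character ending from a single one
      c2 := TextRec(TF).Bufptr^[TextRec(TF).BufPos - 2];
      if c1=#10 then
        if c2=#13 then
          sLineEndFound := #13 + #10
        else
          sLineEndFound := #10
      else
        if c2=#10 then
          sLineEndFound := #10 + #13
        else
          sLineEndFound := #13;
      result := true;
    end;
end;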

var TF: TextFile;
var LineNum: integer;
var s, sr: string;
begin
  AssignFile(TF ,'MyTextFile.txt');
  Filemode := fmOpenRead;
  Reset(TF);
  LineNum := 0;
  while not Eof(TF) do
    begin
      ReadLn(TF, s);
      inc(LineNum);
      if WhichLineBreak(TF, s) then
        begin
          sr := IntToStr(Ord(s[1]));
          if Length(s) = 2 then
            sr := sr + ' ' + IntToStr(Ord(s[2]));
          WriteLn('LineEnd found for line ' + IntToStr(LineNum) + ' = ' + sr);
        end
      else
        WriteLn('No LineEnd found for line ' + IntToStr(LineNum));
    end;
  CloseFile(TF);
end.

For a few more checks, you could have a look at the Eoln function in text.inc, for instance.

jack616

Re: readln and lineEnding
« Reply #3 on: March 31, 2015, 01:53:30 pm »
Thanks for the ideas, guys.
I'm really looking for a shortcut to speed up the process
without having to add code to read blocks of data myself.
I thought readln might be an easy way,
but I can't find the source file for it. Maybe I could modify
readln to grab the line ending. That way I wouldn't need to modify any other code.
(Is the source for readln included? I can't seem to find it.)
 

rvk

Re: readln and lineEnding
« Reply #4 on: March 31, 2015, 02:22:26 pm »
Quote from jack616:
I'm really looking for a shortcut to speed up the process without having to add code to read blocks of data myself.

I thought that's what ChrisF was doing. His code doesn't read any extra blocks; it uses the existing buffer (after a readln) to determine whether the line ending is LF or CRLF. If you look at WhichLineBreak you won't see any block reads, so there is no speed penalty.

But if you really want to, you could hack the source of readln yourself; it's somewhere in fpc\rtl\inc\text.inc (fpc_ReadLn_End).

Code: [Select]
sLineEndFound := #10 + #13
Wow, you won't see LF+CR much these days but it's still good to take it into account :)

ChrisF

Re: readln and lineEnding
« Reply #5 on: March 31, 2015, 03:46:06 pm »
Quote from rvk:
I thought that's what ChrisF was doing.

You are right, indeed.

Quote from rvk:
Wow, you won't see LF+CR much these days but it's still good to take it into account :)

Well, as you wrote, it's just theoretical, because I think that readln doesn't process them correctly (at least on the current OSes: Windows, Linux, macOS).


@jack616:

I think I missed a necessary check. So:
Code: [Select]
if (TextRec(TF).BufEnd < 1) or (TextRec(TF).BufPos > (TextRec(TF).BufEnd) + 1) then
should be changed to
if (TextRec(TF).BufEnd < 1) or (TextRec(TF).BufPos < 2) or (TextRec(TF).BufPos > (TextRec(TF).BufEnd) + 1) then

As I said before, it most probably won't work in all cases, especially:
- when the line ending is 2 characters long and they are split across 2 buffer reads,
- when the line ending is 1 character long and it sits at the beginning of the buffer (see the additional check above),
- ...
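
The caveats above can be folded into the check so that it reports "unknown" instead of guessing. This is only a sketch: the program name and GuardedWhichLineBreak are illustrative names, it relies on the TextRec fields behaving as they do in the code earlier in this thread, and it deliberately ignores the theoretical LF+CR case.

```pascal
program GuardedLineBreak;

{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical variant of WhichLineBreak. Call it right after ReadLn.
// Returns False whenever the buffer cannot prove which ending was consumed.
function GuardedWhichLineBreak(var TF: TextFile; out LineEnd: string): Boolean;
var
  P: SizeInt;
  c1: Char;
begin
  Result := False;
  LineEnd := '';
  P := TextRec(TF).BufPos;
  // Nothing consumed from this buffer, or position outside the buffered data
  if (TextRec(TF).BufEnd < 1) or (P < 1) or (P > TextRec(TF).BufEnd) then
    Exit;
  c1 := TextRec(TF).Bufptr^[P - 1]; // last character consumed
  if c1 = #13 then
  begin
    // The line ended on a bare CR
    LineEnd := #13;
    Result := True;
  end
  else if c1 = #10 then
  begin
    // LF is the first character of the buffer: a CR may have been at the
    // end of the previous buffer read, so the ending is ambiguous
    if P < 2 then
      Exit;
    if TextRec(TF).Bufptr^[P - 2] = #13 then
      LineEnd := #13#10
    else
      LineEnd := #10;
    Result := True;
  end;
end;

var
  TF: TextFile;
  Line, LE: string;
  N: Integer;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: guardedlinebreak <file>');
    Halt(1);
  end;
  AssignFile(TF, ParamStr(1));
  Reset(TF);
  N := 0;
  while not Eof(TF) do
  begin
    ReadLn(TF, Line);
    Inc(N);
    if GuardedWhichLineBreak(TF, LE) then
      WriteLn('line ', N, ': line ending has ', Length(LE), ' byte(s), first = #', Ord(LE[1]))
    else
      WriteLn('line ', N, ': line ending not determinable from the buffer');
  end;
  CloseFile(TF);
end.
```

When the function returns False, a caller could fall back to the last ending that was positively identified, since most files use a single line-ending style throughout.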

Concerning the readln modification, rvk has already given you the necessary information.

For the TextRec structure itself, see 'textrec.inc' or http://www.freepascal.org/docs-html/rtl/sysutils/textrec.html


***  Edit ***  BTW, I guess it's safer to use TTextRec (= TextRec in SysUtils) instead of TextRec directly; and it's also compliant with Delphi.
« Last Edit: March 31, 2015, 06:35:58 pm by ChrisF »

jack616

Re: readln and lineEnding
« Reply #6 on: April 01, 2015, 12:30:49 am »
Yes... things start to get complicated for my situation. Reading back, I see I didn't point out
that this isn't a new project: I'm maintaining a lot of existing code at high speed, or trying to.

Thanks rvk, no wonder I couldn't find it. I was grep'ing for 'function readln' or 'procedure readln'. Damn.

My thinking on the readln idea was that readln is already widely used in all the code,
and on each call it must already have identified an EOL. If I can patch it to store that
in a variable, I don't actually have to mess with any existing code. This should be very simple
to do and can remain totally transparent to anything not using that (global) variable.

This just seemed a bit more elegant than patching in an entirely new custom function.
It's just an idea, but I'll give it a go and see how it works out.

If nothing else, this has reminded me to be wary of EOF markers (which had skipped
what little grey matter I have entirely).  :-[