Author Topic: Incorrect buffering in multithread programs  (Read 12608 times)

Threadwalker

  • Newbie
  • Posts: 4
Incorrect buffering in multithread programs
« on: July 01, 2021, 06:30:14 am »
I tried to write my own program in Free Pascal using multiple threads, and I found that it produces correct output when the output goes to a terminal. When the output is redirected, however, it becomes mangled.

At first I thought it was a problem with my own program, but I have been able to reproduce the same problem with the generic multithreading example from the Free Pascal wiki.

The problem only appears from time to time, so if you get the correct output ("10"), press the Up arrow and Enter to rerun the same command a few more times:

Code: Bash
  $ ./wiki_threadtest | grep -v '^thread [0-9]* \(thri [0-9]* Len(S)= [0-9]*\|started\|finished\)$'
  thread 5 tthread 3 started
  hri 18 Len(S)= 19
  thread 1 t10
  hri 18 Len(S)= 19

The compiler was Free Pascal Compiler version 3.2.2 for x86_64, and the program was compiled with a plain fpc wiki_threadtest.pas.

As the example shows, the grep command filters out every correct line of the output except the final "10". Sometimes mangled pieces of other lines appear, and the grep invocation makes them easy to spot.

As far as I can tell, the most likely reason is that every thread independently buffers its output in blocks of 256 bytes. Blocks from different threads can therefore appear in the redirected output in arbitrary order, and a line from one thread is torn apart whenever it straddles a 256-byte boundary.

I think the best solution to this problem would be for FPC to buffer the output of a multithreaded program in one common buffer instead of creating a separate buffer for every thread. This is especially important if threads have to communicate with each other to produce output in the correct order. When you view the output on the terminal it's fine, but when you want to analyze it with external tools such as grep, it's mangled. I don't think that's good.
« Last Edit: July 01, 2021, 06:49:09 am by Threadwalker »

alpine

  • Hero Member
  • *****
  • Posts: 1038
Re: Incorrect buffering in multithread programs
« Reply #1 on: July 01, 2021, 08:56:45 am »
Just add:
Code: Pascal
  flush(output);
after the write/writeln.

And why would you expect nice output from threads without some rudimentary synchronization?
"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Incorrect buffering in multithread programs
« Reply #2 on: July 01, 2021, 08:58:10 am »
Quote
I think the best solution to this problem would be for FPC to buffer the output of a multithreaded program in one common buffer instead of creating a separate buffer for every thread. This is especially important if threads have to communicate with each other to produce output in the correct order. When you view the output on the terminal it's fine, but when you want to analyze it with external tools such as grep, it's mangled. I don't think that's good.

No, we have decided to provide each thread with its own buffer, because then the RTL does not need to do any synchronization. This was done by choice and won't be changed.

If you want uncorrupted output per thread then you need to synchronize yourself and use Flush in each thread to ensure that the output buffer is flushed.
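
A minimal sketch of what "synchronize yourself and use Flush" could look like, assuming a single critical section shared by all writers. SafeWriteLn, OutputLock and TWorker are names made up for this illustration; they are not part of the RTL or of the wiki example.

```pascal
program SyncedOutput;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils;

type
  TWorker = class(TThread)
  public
    Index: Integer;
  protected
    procedure Execute; override;
  end;

var
  OutputLock: TRTLCriticalSection;  // hypothetical name: one lock for all writers

// Write one complete line and flush it while holding the lock, so that no
// other thread can push its own buffer to stdout in between.
procedure SafeWriteLn(const S: string);
begin
  EnterCriticalSection(OutputLock);
  try
    WriteLn(S);
    Flush(Output);
  finally
    LeaveCriticalSection(OutputLock);
  end;
end;

procedure TWorker.Execute;
var
  i: Integer;
begin
  for i := 1 to 100 do
    SafeWriteLn(Format('thread %d line %d', [Index, i]));
end;

var
  Workers: array[1..4] of TWorker;
  i: Integer;
begin
  InitCriticalSection(OutputLock);
  for i := Low(Workers) to High(Workers) do
  begin
    Workers[i] := TWorker.Create(True);  // suspended until Index is set
    Workers[i].Index := i;
    Workers[i].Start;
  end;
  for i := Low(Workers) to High(Workers) do
  begin
    Workers[i].WaitFor;
    Workers[i].Free;
  end;
  DoneCriticalSection(OutputLock);
end.
```

Because each line is written and flushed while the lock is held, it should reach stdout in one piece, provided it fits into the thread's buffer. The order of lines from different threads is still up to the scheduler.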

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: Incorrect buffering in multithread programs
« Reply #3 on: July 01, 2021, 09:44:28 am »
Quote
If you want uncorrupted output per thread then you need to synchronize yourself and use Flush in each thread to ensure that the output buffer is flushed.

Presumably going directly to the file handle, which is a per-process rather than a per-thread resource, would work as OP expected.

I think the Wiki example could usefully have a note that, to be strictly correct, there should be a flush, since the extent to which WriteLn() etc. buffer should be assumed to be implementation-dependent.
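
One way to read "going directly to the filehandle" is to bypass the per-thread Text buffer and pass each finished line to the OS in a single call. The sketch below assumes FileWrite from SysUtils and the StdOutputHandle value from the System unit; WriteLineDirect is a hypothetical helper name.

```pascal
program DirectHandleWrite;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: build the whole line first, then hand it to the
// OS in a single FileWrite call on the process-wide stdout handle.
procedure WriteLineDirect(const S: string);
var
  Line: string;
begin
  Line := S + LineEnding;
  FileWrite(StdOutputHandle, Line[1], Length(Line));
end;

var
  i: Integer;
begin
  // The same helper could be called from any thread.
  for i := 1 to 3 do
    WriteLineDirect(Format('line %d', [i]));
end.
```

Whether concurrent calls can still interleave is up to the OS. POSIX only promises atomic writes for pipes, and only up to PIPE_BUF bytes, so treat this as an illustration rather than a guarantee. Mixing it with buffered WriteLn in the same program can also reorder the output.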

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

Threadwalker

  • Newbie
  • Posts: 4
Re: Incorrect buffering in multithread programs
« Reply #4 on: July 01, 2021, 12:12:34 pm »
Quote
Code: Pascal
  flush(output);
Thank you.
That exact line worked after I inserted it into my code. Is output a file descriptor for stdout, though?
I had already tried this procedure before your suggestion, but it didn't work because I wrote stdout between the parentheses. This page suggests that flush's parameter is a text file descriptor, as far as I understand.

Quote
And why would you expect nice output from threads without some rudimentary synchronization?
If we are talking about the example, I expected a writeln() call to be atomic. Suppose two threads call writeln at the same time: what is going to happen? I don't think a random mix of characters from the two threads makes any sense. And if writeln is atomic, then a complete line it outputs won't be mangled in the resulting output; it can only end up between lines output by other threads, depending on synchronization.

Quote
No, we have decided to provide each thread with its own buffer, because then the RTL does not need to do any synchronization. This was done by choice and won't be changed.

If you want uncorrupted output per thread then you need to synchronize yourself and use Flush in each thread to ensure that the output buffer is flushed.

Well, if every writeln call has to be followed by Flush(), that is going to be terribly inefficient. What's worse, what if the threads switch between the writeln and the Flush?

I have another suggestion in this case. What if every thread flushed its buffer whenever the next line does not fit in it, instead of splitting the line across two blocks? And if the output of a write/writeln is bigger than the block size, the RTL should flush automatically both before and after that call.

This way, although the order of lines from different threads isn't guaranteed to be preserved without additional code, at least complete lines won't be arbitrarily broken. I think the output should not depend on whether it goes to the terminal directly or through a pipe to tee, less, grep or whatever.

Quote
Presumably going directly to the file handle, which is a per-process rather than a per-thread resource, would work as OP expected.

Did you mean that if I use writeln(output, 'some message'); in place of every writeln('some message'), it should solve the problem? I have tried that, and it didn't work.
« Last Edit: July 01, 2021, 12:15:24 pm by Threadwalker »

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: Incorrect buffering in multithread programs
« Reply #5 on: July 01, 2021, 12:16:07 pm »
Quote
Presumably going directly to the file handle, which is a per-process rather than a per-thread resource, would work as OP expected.

Quote
Did you mean that if I use writeln(output, 'some message'); in place of every writeln('some message'), it should solve the problem? I have tried that, and it didn't work.

I did not.

MarkMLl

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: Incorrect buffering in multithread programs
« Reply #6 on: July 01, 2021, 12:34:18 pm »
Quote
If we are talking about the example, I expected a writeln() call to be atomic. Suppose two threads call writeln at the same time: what is going to happen? I don't think a random mix of characters from the two threads makes any sense. And if writeln is atomic, then a complete line it outputs won't be mangled in the resulting output; it can only end up between lines output by other threads, depending on synchronization.

I'm not quite sure that "atomic" is the right term here, but everybody knows what's meant from the context.

WriteLn() is atomic in the context of the thread. But interaction with the underlying OS is handled via a file handle owned at the process level, and there is no guarantee, in any field and not just I/O, that something done at the thread level is atomic at the process level unless the programmer takes appropriate precautions (critical sections in the case of memory accesses, Flush() in the case of I/O, and so on).

MarkMLl

alpine

  • Hero Member
  • *****
  • Posts: 1038
Re: Incorrect buffering in multithread programs
« Reply #7 on: July 01, 2021, 12:57:17 pm »
*snip*
Quote
That exact line worked after I inserted it into my code. Is output a file descriptor for stdout, though?
Pascal's input and output are the equivalents of stdin and stdout in C/C++.

Quote
I had already tried this procedure before your suggestion, but it didn't work because I wrote stdout between the parentheses. This page suggests that flush's parameter is a text file descriptor, as far as I understand.
output is implicitly assumed when the first parameter of writeln is omitted.

Quote
And why would you expect nice output from threads without some rudimentary synchronization?
If we are talking about the example, I expected a writeln() call to be atomic. Suppose two threads call writeln at the same time: what is going to happen?
Chances are they will overlap.

Quote
I don't think a random mix of characters from the two threads makes any sense. And if writeln is atomic, then a complete line it outputs won't be mangled in the resulting output; it can only end up between lines output by other threads, depending on synchronization.
There is no such thing as atomic here. Files are shared resources, just like any global variable or other object in the global context. That is what synchronization techniques are for.

Quote
Well, if every writeln call has to be followed by Flush(), that is going to be terribly inefficient. What's worse, what if the threads switch between the writeln and the Flush?
Nothing bad, since (in the example) the strings are much shorter than the buffer size. If you want to be efficient, dedicate a thread to the output alone and feed it through a synchronized message queue.
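
A rough sketch of the "dedicated output thread plus synchronized message queue" idea, under the assumption that a TStringList guarded by a critical section is good enough as the queue. TWriterThread, TWorker, PostLine, Drain, Pending, QueueLock and Wake are all names invented for this illustration.

```pascal
program OutputThreadSketch;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils;

type
  TWriterThread = class(TThread)  // hypothetical: the only thread that writes
  protected
    procedure Execute; override;
  end;

  TWorker = class(TThread)
  public
    Index: Integer;
  protected
    procedure Execute; override;
  end;

var
  QueueLock: TRTLCriticalSection;  // protects Pending
  Pending: TStringList;            // the message queue
  Wake: PRTLEvent;                 // set whenever a line is posted

// Called from worker threads: enqueue one line and wake the writer.
procedure PostLine(const S: string);
begin
  EnterCriticalSection(QueueLock);
  try
    Pending.Add(S);
  finally
    LeaveCriticalSection(QueueLock);
  end;
  RTLEventSetEvent(Wake);
end;

// Print and clear everything queued so far, then flush once.
procedure Drain;
var
  i: Integer;
begin
  EnterCriticalSection(QueueLock);
  try
    for i := 0 to Pending.Count - 1 do
      WriteLn(Pending[i]);
    Pending.Clear;
  finally
    LeaveCriticalSection(QueueLock);
  end;
  Flush(Output);
end;

procedure TWriterThread.Execute;
begin
  while not Terminated do
  begin
    RTLEventWaitFor(Wake, 100);  // wake on a post, or after 100 ms
    Drain;
  end;
  Drain;  // lines posted just before termination
end;

procedure TWorker.Execute;
var
  i: Integer;
begin
  for i := 1 to 100 do
    PostLine(Format('thread %d line %d', [Index, i]));
end;

var
  Writer: TWriterThread;
  Workers: array[1..4] of TWorker;
  i: Integer;
begin
  InitCriticalSection(QueueLock);
  Pending := TStringList.Create;
  Wake := RTLEventCreate;
  Writer := TWriterThread.Create(False);
  for i := Low(Workers) to High(Workers) do
  begin
    Workers[i] := TWorker.Create(True);  // suspended until Index is set
    Workers[i].Index := i;
    Workers[i].Start;
  end;
  for i := Low(Workers) to High(Workers) do
  begin
    Workers[i].WaitFor;
    Workers[i].Free;
  end;
  Writer.Terminate;  // all workers are done; let the writer drain and exit
  Writer.WaitFor;
  Writer.Free;
  RTLEventDestroy(Wake);
  Pending.Free;
  DoneCriticalSection(QueueLock);
end.
```

Only the writer thread ever touches Output here, so the per-thread buffering no longer matters and the workers pay for a short lock instead of a flush per line. The sketch holds the lock while printing for brevity; a real program would probably copy the batch out first.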
"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

Threadwalker

  • Newbie
  • Posts: 4
Re: Incorrect buffering in multithread programs
« Reply #8 on: July 01, 2021, 01:41:21 pm »
Quote
Nothing bad, since (in the example) the strings are much shorter than the buffer size. If you want to be efficient, dedicate a thread to the output alone and feed it through a synchronized message queue.

I mean that, in general, a flush() after every writeln() does not solve the problem: it still does not guarantee that lines won't be mangled, it only makes that less likely.

However, if the FPC implementation were adjusted according to my suggestion above, so that a writeln argument is never split across blocks, it should work. Given the decision that each thread has its own output buffer, this can be done by flushing the buffer whenever the writeln argument doesn't fit in the remaining space. In the current implementation, it seems that output which doesn't fit in the block is split between output blocks. I suggest flushing the incomplete block in that case and starting a new one. Or, if writeln tries to output something big (more than 256 bytes at a time), it can simply be sent to the output directly, bypassing the buffer, after the incomplete block has been flushed.
« Last Edit: July 01, 2021, 01:43:01 pm by Threadwalker »

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Incorrect buffering in multithread programs
« Reply #9 on: July 01, 2021, 01:49:33 pm »
Quote
No, we have decided to provide each thread with its own buffer, because then the RTL does not need to do any synchronization. This was done by choice and won't be changed.

If you want uncorrupted output per thread then you need to synchronize yourself and use Flush in each thread to ensure that the output buffer is flushed.

Well, if every writeln call has to be followed by Flush(), that is going to be terribly inefficient. What's worse, what if the threads switch between the writeln and the Flush?

If you don't want the performance impact of flushing then you need to use synchronization.

Quote
I have another suggestion in this case. What if every thread flushed its buffer whenever the next line does not fit in it, instead of splitting the line across two blocks? And if the output of a write/writeln is bigger than the block size, the RTL should flush automatically both before and after that call.

This way, although the order of lines from different threads isn't guaranteed to be preserved without additional code, at least complete lines won't be arbitrarily broken. I think the output should not depend on whether it goes to the terminal directly or through a pipe to tee, less, grep or whatever.

We don't desire a change in the current behavior.

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: Incorrect buffering in multithread programs
« Reply #10 on: July 01, 2021, 03:52:07 pm »
Quote
Pascal's input and output are the equivalents of stdin and stdout in C/C++.

Oh no they're not, and that's central to OP's confusion. stdin, stdout and stderr are numeric file handles, associated with the process that's animating the program's code. Input and Output are files of the special type Text, implemented by Pascal runtimes (aka libraries etc.), with an application-specific amount of per-thread buffering.

Output /eventually/ maps to stdout. However, while every occurrence of (per-thread) Output maps to the same (per-process) stdout, you can't say the reverse, since the per-thread Output files operate asynchronously (which is, after all, the whole idea of having threads).

This leaves aside any consideration of redirection, which could only muddy the water further.

MarkMLl

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: Incorrect buffering in multithread programs
« Reply #11 on: July 01, 2021, 03:55:33 pm »
Quote
I mean that, in general, a flush() after every writeln() does not solve the problem: it still does not guarantee that lines won't be mangled, it only makes that less likely.

It will guarantee the expected behaviour provided that every line fits into the relevant buffer.

MarkMLl

alpine

  • Hero Member
  • *****
  • Posts: 1038
Re: Incorrect buffering in multithread programs
« Reply #12 on: July 01, 2021, 05:07:26 pm »
Quote
Pascal's input and output are the equivalents of stdin and stdout in C/C++.

Quote
Oh no they're not, and that's central to OP's confusion. stdin, stdout and stderr are numeric file handles, associated with the process that's animating the program's code. Input and Output are files of the special type Text, implemented by Pascal runtimes (aka libraries etc.), with an application-specific amount of per-thread buffering.
In C, stdin, stdout and stderr are of type FILE*, which is analogous to Text (or TextRec). Like FPC, C stdio also supports buffering, text/binary mode, etc. They're not numeric handles.

Quote
Output /eventually/ maps to stdout. However, while every occurrence of (per-thread) Output maps to the same (per-process) stdout, you can't say the reverse, since the per-thread Output files operate asynchronously (which is, after all, the whole idea of having threads).
Although I'm not 100 percent sure (I never dug into the details), I doubt that the input and output files get special treatment or separate copies per thread.

@Threadwalker
You can implement your suggestion easily by allocating a separate buffer, assigning it to output with SetTextBuf(output, Buf, BufSize), and then:
Code: Pascal
  WriteStr(S, 'whatever to be written', ...);
  // Flush first if S plus the line ending would not fit in the rest of the buffer.
  if TextRec(output).BufPos + Length(S) + Length(LineEnding) >= BufSize then Flush(output);
  WriteLn(S);
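
A fuller sketch of the same idea, assuming each thread installs its own buffer because Output is per-thread. WriteWholeLine, TWorker, FBuf and LineBufSize are hypothetical names, and the 4096-byte size is an arbitrary choice for the illustration.

```pascal
program WholeLineFlush;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils;

const
  LineBufSize = 4096;  // assumed size; must exceed the longest line

type
  TWorker = class(TThread)
  private
    FBuf: array[0..LineBufSize - 1] of Byte;  // this thread's own output buffer
  public
    Index: Integer;
  protected
    procedure Execute; override;
  end;

// Hypothetical helper: flush first if S plus its line ending would not fit
// in what is left of the calling thread's buffer, then write the line.
procedure WriteWholeLine(const S: string);
begin
  if TextRec(Output).BufPos + Length(S) + Length(LineEnding)
     >= TextRec(Output).BufSize then
    Flush(Output);
  WriteLn(S);
end;

procedure TWorker.Execute;
var
  i: Integer;
begin
  SetTextBuf(Output, FBuf, SizeOf(FBuf));  // applies to this thread's Output
  for i := 1 to 1000 do
    WriteWholeLine(Format('thread %d line %d', [Index, i]));
  Flush(Output);  // push out the last partial block
end;

var
  Workers: array[1..4] of TWorker;
  i: Integer;
begin
  for i := Low(Workers) to High(Workers) do
  begin
    Workers[i] := TWorker.Create(True);  // suspended until Index is set
    Workers[i].Index := i;
    Workers[i].Start;
  end;
  for i := Low(Workers) to High(Workers) do
  begin
    Workers[i].WaitFor;
    Workers[i].Free;
  end;
end.
```

With this, every block handed to the OS should consist of whole lines only. Whether blocks from different threads can still be interleaved depends on how the OS treats concurrent writes, and a line longer than the buffer would still be split.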
"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Incorrect buffering in multithread programs
« Reply #13 on: July 02, 2021, 09:07:28 am »
Quote
Output /eventually/ maps to stdout. However, while every occurrence of (per-thread) Output maps to the same (per-process) stdout, you can't say the reverse, since the per-thread Output files operate asynchronously (which is, after all, the whole idea of having threads).
Although I'm not 100 percent sure (I never dug into the details), I doubt that the input and output files get special treatment or separate copies per thread.

The standard I/O Text variables are simply declared as threadvar, that's all that's needed for "thread safety".
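
A small illustration of what a threadvar declaration means, using an ordinary integer instead of a Text file. Counter and TProbe are made-up names; the point is only that each thread gets its own copy of the variable.

```pascal
program ThreadVarDemo;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes;

threadvar
  Counter: Integer;  // hypothetical variable: one independent copy per thread

type
  TProbe = class(TThread)
  public
    Seen: Integer;
  protected
    procedure Execute; override;
  end;

procedure TProbe.Execute;
begin
  Counter := 100;   // changes only this thread's copy
  Seen := Counter;
end;

var
  Probe: TProbe;
begin
  Counter := 1;
  Probe := TProbe.Create(False);
  Probe.WaitFor;
  WriteLn('main thread sees ', Counter);      // still 1
  WriteLn('worker thread saw ', Probe.Seen);  // 100
  Probe.Free;
end.
```

Output behaves the same way: every thread has its own Text record, including its own buffer position, which is why no locking is needed inside the RTL and also why blocks from different threads reach stdout independently.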

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: Incorrect buffering in multithread programs
« Reply #14 on: July 02, 2021, 10:30:07 am »
Quote
Output /eventually/ maps to stdout. However, while every occurrence of (per-thread) Output maps to the same (per-process) stdout, you can't say the reverse, since the per-thread Output files operate asynchronously (which is, after all, the whole idea of having threads).
Although I'm not 100 percent sure (I never dug into the details), I doubt that the input and output files get special treatment or separate copies per thread.

The standard I/O Text variables are simply declared as threadvar, that's all that's needed for "thread safety".

... and you had already explicitly said that there's per-thread buffering, so I suspect that somebody's posted without reading the whole topic.

MarkMLl

 
