Problem with pipes on infinite streams

Here is some example code that shows the issue I have:

#!/bin/bash
counter() {
  seq 1000 | while read -r NUM; do
    echo "$NUM"
    echo "debug: $NUM" >&2
    sleep 0.1 # slow it down so we know when this loop really ends
  done
}

counter | grep --line-buffered "[27]" | head -n1

Outputs:

debug: 1
debug: 2
2
debug: 3
debug: 4
debug: 5
debug: 6
debug: 7

If I understand it correctly, "head" finishes on the first match (as expected), but "grep" is not aware of it until it tries to write the next line (the second match). When it does, it finds out the pipe is closed so it also finishes.
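A minimal sketch of that behaviour, assuming a hypothetical writer called late_writer that pauses between its two writes. The writer keeps running after "head" has exited and only learns about the closed pipe when it attempts the second write:

```shell
#!/bin/bash
# late_writer is a hypothetical stand-in for any upstream command.
late_writer() {
  echo "first"
  sleep 0.2                                      # give head time to print and exit
  echo "writer: still running after head exited" >&2
  echo "second"                                  # this write hits the closed pipe
  echo "writer: survived the failed write" >&2   # normally not reached: SIGPIPE ends the writer
}

late_writer | head -n1
```

With the default signal handling, the last message should not appear, because the writer is terminated by SIGPIPE on the second write. If SIGPIPE is ignored in the calling environment, the write fails with an error instead and the writer carries on.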

That's normally not a problem, but if you have an infinite input stream containing only one match, the pipeline will never stop. Any solution?

Hello, tokland:

If using GNU grep:

counter | grep -m1 '[27]'

If that's not available:

counter | sed -n '/[27]/{p;q;}'

Regards,
Alister

Hi!

Thanks, those are good solutions. However, the grep in my code was just an example. Let's imagine you cannot change how the stream is generated:

stream_generator | head -n1

By the way, using process substitution "works":

head -n1 <(stream_generator)

but it keeps the generator running in the background until the next match.

You can run the stream generator in the background, asynchronously, and use a named pipe to communicate with it:

mkfifo sg_pipe
stream_generator > sg_pipe &
head -n1 sg_pipe
kill %?stream
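
The same idea can be wrapped in a function that cleans up after itself. This is only a sketch under a few assumptions: first_line is a hypothetical helper name, the stream_generator defined here is a stand-in endless counter, and the background process is killed through $! rather than a %? job spec so that it does not depend on the text of the command line:

```shell
#!/bin/bash
# Stand-in for the real generator (hypothetical): an endless, slow counter.
stream_generator() {
  local n=0
  while :; do
    n=$((n + 1))
    echo "$n"
    sleep 0.1
  done
}

# first_line (hypothetical helper): run the given command in the background,
# print the first line it writes, then stop it and remove the named pipe.
first_line() {
  local dir fifo pid
  dir=$(mktemp -d) || return 1
  fifo=$dir/fifo
  mkfifo "$fifo" || { rm -rf "$dir"; return 1; }

  "$@" > "$fifo" &             # blocks on open until head opens the pipe for reading
  pid=$!

  head -n1 "$fifo"
  kill "$pid" 2>/dev/null      # stop the generator as soon as head is done
  wait "$pid" 2>/dev/null      # reap it quietly
  rm -rf "$dir"
}

first_line stream_generator
```

Using a temporary directory for the pipe avoids leaving a stale sg_pipe behind or colliding with an existing file of that name.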

Regards,
Alister

For the example code you used in your original post:

#!/bin/bash
counter() {
  seq 1000 | while read -r NUM; do
    echo "$NUM"
    echo "debug: $NUM" >&2
    sleep 0.1 # slow it down so we know when this loop really ends
  done
}

mkfifo p
counter | grep --line-buffered "[27]" > p &
head -n1 p
kill %?counter

Outputs:

$ ./tokland.sh 
debug: 1
debug: 2
2

Alister

Very nice. Named pipes are not so elegant, but now we have full control over the job (I guess that's not so easy with process substitution).

Thanks Alister!

If you make the sleep shorter or your machine busier, you will see your original problem again with the named pipe (or with any other method). A Unix pipe has a buffer of at least 4 KB, and I don't think there is a way to make it smaller. Without a way to reduce the pipe size, and without being able to modify the streaming code, I see no way to solve your problem.

If you're saying that the loop may run a few more times, sure. You are quite correct. The generator will write a few bytes into the pipe's buffer, never filling the buffer, and will loop until its timeslice is exhausted. But a few extra loop iterations are not the same as the original problem, in which the generator would run without end.
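
A rough sketch of that effect, assuming a hypothetical generator called fast_counter that has no sleep at all and records its own progress in a file. The number it reports varies from run to run, but it is finite, because the first write after "head" exits fails:

```shell
#!/bin/bash
# fast_counter (hypothetical): counts as fast as it can and stores in the file
# named by $1 the last number it managed to write to stdout.
fast_counter() {
  local n=0
  while :; do
    n=$((n + 1))
    echo "$n" || break     # leave the loop if the write fails (SIGPIPE usually ends it first)
    echo "$n" > "$1"
  done
}

progress=$(mktemp)
fast_counter "$progress" | head -n1
echo "generator wrote $(cat "$progress") line(s) before it stopped" >&2
rm -f "$progress"
```

The extra lines are the ones that fit into the pipe buffer before "head" exits; they are discarded along with the pipe.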

Regards,
Alister

Hi alister,
In your code, what does %? stand for?
Thanks for your help.
Bye

It is from bash job control; see the 'Advanced Bash-Scripting Guide' online.

It is not only a bash extension: POSIX also defines %?string as a job ID, meaning the job whose command contains string.
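
A small sketch of how a %?string job spec selects a job, assuming two throwaway background "sleep" jobs and a hypothetical wrapper function demo_jobspec. The spec is quoted here so that the shell cannot treat the ? as a filename pattern:

```shell
#!/bin/bash
# demo_jobspec (hypothetical): start two background jobs, then signal one of
# them by a substring of its command line.
demo_jobspec() {
  sleep 30 > /dev/null &
  local pid_a=$!
  sleep 31 > /dev/null &
  local pid_b=$!

  kill '%?31'                    # the job whose command line contains "31"
  wait "$pid_b" 2>/dev/null      # reap it so the check below is reliable

  kill -0 "$pid_a" 2>/dev/null && echo "sleep 30: still running"
  kill -0 "$pid_b" 2>/dev/null || echo "sleep 31: terminated"

  kill "$pid_a"                  # clean up the remaining job
  wait "$pid_a" 2>/dev/null
}

demo_jobspec
```

Only jobs started by the current shell can be addressed this way; a process started elsewhere has to be signalled by its PID.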

Thanks jim mcnamara,
When you say job, do you mean any process or only those that are displayed by the command jobs?

The latter.
