Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

GNU Parallel as job queue -- last commands not executed

I am trying to follow the "GNU Parallel as job queue with named pipes" approach with GNU parallel 20201222, but parallel does not execute the last commands piped into it via tail -n+0 -f.

To demonstrate, I have 3 terminals open:

# terminal 1
true > jobqueue
tail -n+0 -f jobqueue | parallel

# terminal 2
tail -n+0 -f jobqueue | cat

Adding a single small test command to the queue:

# terminal 3
echo "echo test" >> jobqueue

Only terminal 2 prints "echo test"; GNU parallel outputs nothing.

# terminal 3
for i in $(seq 10); do echo "echo $i" >> jobqueue; done

Only terminal 2 prints "echo 1", ..., "echo 10" (one per line); GNU parallel still outputs nothing.

# terminal 3
for i in $(seq 100); do echo "echo $i" >> jobqueue; done

Terminal 2 prints "echo 1", ..., "echo 100". Terminal 1 prints the lines "test", "1", ..., "10", "1", ..., "99"; the last line, "100", is missing.

Rerunning tail -n+0 -f jobqueue | parallel outputs everything up to "99". Rerunning it with --resume --joblog log appended outputs one more line ("100"), but it then lags behind again once new lines are added to joblog. With GNU parallel 20161222, the initial run only gets to line "84".

How can I force GNU parallel to flush its input queue on every line?



1 Answer


From man parallel:

There is a small issue when using GNU parallel as queue system/batch manager: You have to submit JobSlot number of jobs before they will start, and after that you can submit one at a time, and job will start immediately if free slots are available. Output from the running or completed jobs are held back and will only be printed when JobSlots more jobs has been started (unless you use --ungroup or --line-buffer, in which case the output from the jobs are printed immediately). E.g. if you have 10 jobslots then the output from the first completed job will only be printed when job 11 has started, and the output of second completed job will only be printed when job 12 has started.

In other words: the jobs are running; only their output is delayed. This is easier to see if, instead of echo, your example uses touch unique-file-name, so that each job leaves a file behind as soon as it runs.
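
The touch suggestion above could be tried with something like the following. This is only a sketch: the variable names QUEUE and MARKER_DIR and the two helper functions are hypothetical and do not come from the man page. The consumer line in the final comment uses --line-buffer, which the quoted passage says prints job output immediately.

```shell
#!/bin/sh
# Sketch: check that queued jobs really run, independent of delayed output.
# QUEUE and MARKER_DIR are hypothetical names chosen for this example.
QUEUE=${QUEUE:-jobqueue}
MARKER_DIR=${MARKER_DIR:-markers}

# Append n jobs to the queue; each job creates its own uniquely named file.
enqueue_touch_jobs() {
    n=$1
    mkdir -p "$MARKER_DIR"
    i=1
    while [ "$i" -le "$n" ]; do
        echo "touch $MARKER_DIR/job-$i" >> "$QUEUE"
        i=$((i + 1))
    done
}

# Count the marker files, i.e. the jobs that have actually been executed.
count_finished_jobs() {
    ls "$MARKER_DIR" 2>/dev/null | wc -l | tr -d ' '
}

# Consumer, started in another terminal (it never returns, so it is only
# shown here as a comment):
#   tail -n+0 -f "$QUEUE" | parallel --line-buffer
```

After sourcing this file, enqueue_touch_jobs 100 in one terminal and count_finished_jobs a moment later shows how many of the queued jobs have run, whether or not terminal 1 has printed anything yet.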

