shell - Is a semicolon prohibited after NAME in `for NAME do ...`?

asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

The bash manual lists the syntax for the for compound statement as

for name [ [ in [ word ... ] ] ; ] do list ; done

which implies that the semicolon before do is optional if the in clause is omitted [Note 2].

However, the Posix specification lists only the following three productions for for_clause:

for_clause       : For name linebreak                            do_group
                 | For name linebreak in          sequential_sep do_group
                 | For name linebreak in wordlist sequential_sep do_group
                 ;

For reference, linebreak is a possibly-empty sequence of NEWLINE while sequential_sep is either a semicolon or a NEWLINE, possibly followed by a sequence of NEWLINE:

newline_list     :              NEWLINE
                 | newline_list NEWLINE
                 ;
linebreak        : newline_list
                 | /* empty */
                 ;
separator        : separator_op linebreak
                 | newline_list
                 ;
sequential_sep   : ';' linebreak
                 | newline_list
                 ;

As far as I can see, that prohibits the syntax for foo; do :; done.
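For contrast, here is a minimal sketch of loops that do match the productions quoted above. The function name show_forms and the sample words are made up for illustration, and the production with in followed by an empty word list is left out.

```shell
# show_forms: each loop below matches one production of for_clause.
# (Hypothetical helper, used only for this illustration.)
show_forms() {
    set -- a b          # positional parameters, iterated when "in" is omitted

    # For name linebreak do_group          (linebreak is empty)
    for x do printf '1:%s ' "$x"; done

    # For name linebreak do_group          (linebreak is one NEWLINE)
    for x
    do printf '2:%s ' "$x"; done

    # For name linebreak in wordlist sequential_sep do_group
    for x in c d; do printf '3:%s ' "$x"; done

    echo
}

show_forms
```

None of these puts a semicolon directly after the name; that position is only covered by linebreak, which derives newlines or nothing.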

In practice, all the shells I tried (bash, dash, ksh and zsh) accept both for foo; do :; done and for foo do :; done without complaint, regardless of Posix or their own documentation [Note 3].
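The experiment can be repeated with a small script along these lines. It is only a sketch: the helper name accepts is made up, and the script assumes that a shell which fails to parse the snippet exits with a non-zero status. Results will depend on which shells and versions are installed.

```shell
# accepts SHELL SNIPPET: succeed if SHELL runs SNIPPET without error.
# With no positional parameters the loop body never runs, so the exit
# status reflects only whether the shell could parse the snippet.
accepts() {
    "$1" -c "$2" >/dev/null 2>&1
}

for shell in bash dash ksh zsh; do
    command -v "$shell" >/dev/null 2>&1 || continue   # skip shells that are not installed
    for snippet in 'for foo; do :; done' 'for foo do :; done'; do
        if accepts "$shell" "$snippet"; then
            echo "$shell accepts: $snippet"
        else
            echo "$shell rejects: $snippet"
        fi
    done
done
```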

Is this an accidental omission in the grammar in the Posix standard, or should the use of the semicolon in that syntax be considered a (commonly-implemented) extension to the standard?

Addendum

In the XCU description of the for loop, Posix seems to insist on newlines:

The format for the for loop is as follows:

for name [ in [word ... ]]
do
    compound-list
done

However, in the Rationale volume, it is made clear that the grammar is intended to be the last word:

The format is shown with generous usage of <newline> characters. See the grammar in XCU Shell Grammar for a precise description of where <newline> and <semicolon> characters can be interchanged.

Notes

1. Apparently this is the first SO question which pairs the shell and language-lawyer tags. There is no idle-curiosity tag, which might have been more appropriate.

2. The bash manual is not entirely explicit about newlines; what it says is:

In most cases a list in a command's description may be separated from the rest of the command by one or more newlines, and may be followed by a newline in place of a semicolon.

That makes it clear that the semicolon preceding done can be replaced by a newline, but does not seem to mention that the same transformation can be performed on the semicolon preceding do.

3. The documentation for both ksh and zsh seems to insist that there be either a semicolon or a newline after the name, although the implementations don't insist on it.

The ksh manpage lists the syntax as:

for vname [ in word ... ] ;do list ;done

(I believe that the semicolon in ;do and ;done represents "a semicolon or a newline". I can't find any definite statement to that effect but it is the only way to make sense of the syntax description.)

The zsh manual shows:

for name ... [ in word ... ] term do list done
where term is at least one newline or ;.

1 Answer

深蓝 · Answer 1 · 2021-10-23T17:59:14+0000

Nicely spotted! I don't have a definite answer, but here is what the source code says about it:

It's indeed not valid in the original Bourne shell from AT&T UNIX v7:

(shell has just read `for name`):
       IF skipnl()==INSYM
       THEN chkword();
        t->forlst=item(0);
        IF wdval!=NL ANDF wdval!=';'
        THEN    synbad();
        FI
        chkpr(wdval); skipnl();
       FI
       chksym(DOSYM|BRSYM);

Given this snippet, it does not appear to be a conscious design decision. It's just a side effect of the semicolon being handled as part of the in group, which is skipped entirely when there is no "in".
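To make that control flow easier to follow, here is a toy model of the check written as a shell function. It is an illustration only, not the Bourne source: the name bourne_for_ok is made up, the tokens that follow for name are passed as arguments, and the literal NL stands for a newline token.

```shell
# bourne_for_ok TOKEN...: simplified model of the v7 check on the tokens
# that follow "for name". NL is a placeholder for a newline token.
bourne_for_ok() {
    # skipnl(): drop leading newlines
    while [ "${1-}" = NL ]; do shift; done

    if [ "${1-}" = in ]; then
        shift
        # item(0): consume the word list up to the separator
        while [ $# -gt 0 ] && [ "$1" != NL ] && [ "$1" != ';' ]; do shift; done
        # the list must be terminated by a newline or a semicolon
        [ "${1-}" = NL ] || [ "${1-}" = ';' ] || return 1
        shift
        while [ "${1-}" = NL ]; do shift; done
    fi

    # chksym(DOSYM|BRSYM): the next token must be "do" or "{"
    [ "${1-}" = do ] || [ "${1-}" = '{' ]
}

bourne_for_ok do       && echo "for foo do: accepted"
bourne_for_ok ';' do   || echo "for foo; do: rejected"
```

In this model the semicolon is only ever consumed inside the in branch, so a semicolon directly after the name reaches the final check and fails it.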

Dash agrees that it's not valid in Bourne, but adds it as an extension:

        /*
         * Newline or semicolon here is optional (but note
         * that the original Bourne shell only allowed NL).
         */

Ksh93 claims that it's valid, but says nothing of the context:

/* 'for i;do cmd' is valid syntax */
else if(tok==';')
    while((tok=sh_lex(lexp))==NL);

Bash has no comment, but explicitly adds support for this case:

for_command:    FOR WORD newline_list DO compound_list DONE
            {
              $$ = make_for_command ($2, add_string_to_list ("\"$@\"", (WORD_LIST *)NULL), $5, word_lineno[word_top]);
              if (word_top > 0) word_top--;
            }
...
    |   FOR WORD ';' newline_list DO compound_list DONE
            {
              $$ = make_for_command ($2, add_string_to_list ("\"$@\"", (WORD_LIST *)NULL), $6, word_lineno[word_top]);
              if (word_top > 0) word_top--;
            }

In zsh, it's just a side effect of the parser:

while (tok == SEPER)
    zshlex();

where SEPER is ; or linefeed. Because the loop skips any number of separators, zsh happily accepts this loop:

for foo; ; 
;
; ; ; ; ;
; do echo cow; done

To me, this all points to an intentional omission in POSIX, with the semicolon widely and intentionally supported as an extension.
