Peculiar behavior due to IFS

aa=|
echo $aa

The above echo works but the below echo fails. Why please?

IFS=:
aa=|
echo $aa
echo $IFS

The latter 'echo' commands will work if the variable is put in quotes.

echo "$aa"
echo "$IFS"

In summary, when IFS is set to ':' or '|', echo with an unquoted variable doesn't print the value; the variable has to be quoted.
It took me quite some time to work this out.

All string arguments, whether literals or variables, should be quoted:

echo "$aa"
echo "$IFS"

Special characters in a literal assignment should be quoted, too:

aa="|"

Does it? I don't think so. The | char opens a pipe and makes the shell wait for the next command in the pipe.

The shell command parsing step which looks for pipes has already concluded by the time parameter expansion occurs. There's nothing special about a pipe character in a variable. You'd have to use eval to make it so.
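
A minimal sketch of that point, assuming a POSIX-style sh; the variable names plain and evaled are only for illustration. A pipe character that arrives through parameter expansion is handed to echo as an ordinary argument, and only eval makes the shell re-parse it as a pipe:

```shell
aa='|'   # quoted, so the assignment itself succeeds

# "|" comes from an expansion here, so it is just one more argument to echo.
plain=$(echo hello $aa tr a-z A-Z)

# eval re-parses the expanded text, so "|" is now treated as a pipe.
evaled=$(eval "echo hello $aa tr a-z A-Z")

echo "$plain"    # hello | tr a-z A-Z
echo "$evaled"   # HELLO
```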

What's happening is perfectly normal. It's called field splitting.

Field splitting is one of the final steps in sh command parsing. After the variable is replaced with its value, that value is split into fields using IFS. An IFS character will never be seen in a command argument unless it's quoted (since quoting tells the shell not to perform field splitting).
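
To make the effect visible, here is a small sketch, assuming a POSIX-style sh. count_args is a hypothetical helper (not part of any shell) that reports how many arguments it received:

```shell
# Hypothetical helper: prints the number of arguments it was called with.
count_args() { echo "$#"; }

old_ifs=$IFS
IFS=:
aa='a:b:c'
unquoted=$(count_args $aa)     # split on ":" into a, b and c
quoted=$(count_args "$aa")     # quoting suppresses field splitting
IFS=$old_ifs                   # restore the previous IFS

echo "unquoted: $unquoted, quoted: $quoted"   # unquoted: 3, quoted: 1
```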

Regards,
Alister

The command that was given in post#1 was

aa=|
echo $aa

Executing this on the two systems I have at hand (Linux, FreeBSD) issues the secondary prompt (asking for the next command in pipe, I assume). The variable aa is not assigned to, it is not even defined afterwards. So I think my statement holds.
Of course, assigning an escaped | is totally different and absolutely OK.
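
For reference, a short sketch of three ways to assign a literal pipe character without the shell treating it as a pipe operator (the names bb and cc are only for illustration):

```shell
aa=\|      # backslash-escaped
bb='|'     # single-quoted
cc="|"     # double-quoted

printf '%s %s %s\n' "$aa" "$bb" "$cc"   # | | |
```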

It does indeed. My mistake.

What I said is technically correct, but it does not apply here since the pipe is unquoted and never assigned to the variable.

Regards,
Alister

Yes Rudic, you are right. One mistake from my side:
In place of "|" , it should be ":".

IFS=:
aa=:
echo $aa

Now echo prints only an empty line, for the reason Alister gave.
Thanks Alister for the elaborate clarification.
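
The corrected case can be checked with a sketch like the following (assuming a POSIX-style sh; unquoted and quoted are illustrative names). The ":" disappears from the unquoted expansion because it is an IFS character, and survives when quoted:

```shell
old_ifs=$IFS
IFS=:
aa=:
unquoted=$(echo $aa)      # ":" is consumed by field splitting
quoted=$(echo "$aa")      # no field splitting inside double quotes
IFS=$old_ifs              # restore the previous IFS

printf 'unquoted=[%s] quoted=[%s]\n' "$unquoted" "$quoted"
# unquoted=[] quoted=[:]
```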

Thanks Alister for the concept. I have 3 related queries please:
1)

$ IFS=ab
$ echo `echo cabcow`
c  cow
$ echo cabcow
cabcow

This confirms that the shell splits fields based on IFS after command substitution or variable evaluation.
But how is a command line split into fields? Here IFS isn't considered!

2)

IFS=ab
echo $IFS | od -bc
040 012
       \n
echo "$IFS" | od -bc
141 142 012
  a    b    \n

Why is the space character displayed when not quoted?

3) The effect of double quotes:

echo `ls`
echo "`ls`"

The former command displays the filenames on a single line, with one space between names.
The latter displays the output as 'ls' produces it, i.e. one name per line.
May I conclude that when quoted, the newline characters were not converted to spaces, as they were when unquoted? That would mean the double quotes are interpreted by the command too, and not only by the shell.

Ad 1.
Here the field is split into three parts:

  • The "c" before the IFS character "a"
  • The empty field between "a" and "b"
  • The string "cow" after the "b"

echo then prints these 3 fields separated by single spaces.
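
The three fields can be inspected one by one with a sketch like this (assuming a POSIX-style sh; split_demo and the variables it sets are hypothetical names). The work is done inside a function so that set -- only touches the function's own positional parameters:

```shell
split_demo() {
    old_ifs=$IFS
    IFS=ab
    set -- $(echo cabcow)    # unquoted, so the result is split on "a" and "b"
    IFS=$old_ifs             # restore the previous IFS
    count=$#
    first=$1 second=$2 third=$3
}

split_demo
printf '%s fields: [%s] [%s] [%s]\n' "$count" "$first" "$second" "$third"
# 3 fields: [c] [] [cow]
```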

Ad 2.
The space is displayed because IFS contains two characters, "a" and "b". The first field is the empty string before the "a". The second field is the empty string between "a" and "b". echo joins the two fields with a single space, which accounts for the space showing up in the od -c output. Since IFS acts as a field terminator rather than a separator, there is no third empty field after "b".
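
A sketch that counts those fields, assuming a shell that treats IFS characters as terminators the way bash does (terminator_demo, ifs_fields and joined are hypothetical names; a shell with different splitting rules could report another count):

```shell
terminator_demo() {
    old_ifs=$IFS
    IFS=ab
    set -- $IFS              # "ab" splits into two empty fields
    ifs_fields=$#
    joined=$(echo $IFS)      # echo joins the two empty fields with one space
    IFS=$old_ifs             # restore the previous IFS
}

terminator_demo
printf 'fields=%s joined=[%s]\n' "$ifs_fields" "$joined"
# fields=2 joined=[ ]
```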

Ad 3. That is not correct. The output of the command substitution becomes the argument list of the echo command. The first is unquoted, therefore it is split according to $IFS, whereas the second one is quoted and thus not split.
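
A sketch of the difference, assuming the default IFS and using printf as a stand-in for ls so the result does not depend on the current directory. count_args is a hypothetical helper that reports how many arguments the shell passed:

```shell
# Hypothetical helper: prints the number of arguments it was called with.
count_args() { echo "$#"; }

# Unquoted: the shell splits the output into three separate arguments.
unquoted_count=$(count_args $(printf 'file1\nfile2\nfile3\n'))

# Quoted: the shell passes one argument that still contains the newlines.
quoted_count=$(count_args "$(printf 'file1\nfile2\nfile3\n')")

echo "unquoted: $unquoted_count, quoted: $quoted_count"   # unquoted: 3, quoted: 1
```

In both cases the quotes are handled by the shell; the command never sees them.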

What a concept Scrutinizer and especially the 2nd one!!
Thanks a lot!
By the way, my 1st question is still left!

But how is a command line split into fields? Here IFS isn't considered!

By this I meant: (for ex.)

grep us file.txt

How is the command line parsed? Or, in other words, how does grep understand which is the 1st argument and which is the 2nd? Here it isn't considering IFS!

That is correct. IFS is only used after parameter expansion, command substitution and arithmetic expansion, and by the read command.
So the shell always uses whitespace (space, tab, newline) as the separator between the command and its arguments.

IFS is only used to split the results of unquoted expansions (such as $var, $(cmd)) and by the read command. The shell does not use it during its initial scan of the command line. During that early step, whitespace always delimits words (tokens, to be more precise), regardless of the value of IFS.
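
A sketch contrasting the three cases, assuming a POSIX-style sh. count_args, line, user and shell_path are hypothetical names, and the "root:/bin/sh" input line is made up for the example:

```shell
# Hypothetical helper: prints the number of arguments it was called with.
count_args() { echo "$#"; }

old_ifs=$IFS
IFS=u
literal=$(count_args grep us file.txt)   # literal words: blanks delimit them
line='grep us file.txt'
expanded=$(count_args $line)             # expansion: split on "u" only
IFS=$old_ifs                             # restore the previous IFS

# read also splits on IFS; the assignment applies only to this read.
IFS=: read -r user shell_path <<EOF
root:/bin/sh
EOF

echo "literal=$literal expanded=$expanded user=$user shell_path=$shell_path"
# literal=3 expanded=2 user=root shell_path=/bin/sh
```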

With regard to how grep finds its first argument, that's more involved. Once the shell has finished parsing the command line, it forks (or clones) itself. If all goes well, the new subshell calls one of the functions in the exec family with the command to run and a list of arguments to pass to it. This replaces the subshell with the command that was exec'd. The command's arguments are found in the array argv, with the first argument at argv[1] (argv[0] is the command's name).

The details of creating the process and locating the list of arguments (and the environment) are overseen by the kernel and the C runtime.
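
The argv layout can be imitated from the shell itself. This is only a rough illustration, not what grep does internally: sh -c assigns the words that follow the script to $0, $1, $2 and so on, which mirrors argv[0], argv[1], argv[2]. show_argv is a hypothetical helper name:

```shell
# Hypothetical helper: prints its arguments the way a program would see argv.
show_argv() {
    sh -c '
        printf "argv[0]=%s\n" "$0"
        i=1
        for a in "$@"; do
            printf "argv[%d]=%s\n" "$i" "$a"
            i=$((i + 1))
        done
    ' "$@"
}

argv_out=$(show_argv grep us file.txt)
echo "$argv_out"
# argv[0]=grep
# argv[1]=us
# argv[2]=file.txt
```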

Also, should you need to know every last detail of how your shell interprets command lines, you should read your shell's manual page in its entirety. There is also the POSIX standard's documentation, which will give you detailed knowledge of a common UNIX baseline:

POSIX
sh
Shell Command Language
exec family

Those links aren't intended to silence you (your questions are very welcome). I provide them only in case you are not aware of them.

Regards,
Alister

Thanks Alister and Scrutinizer for the basics!

Thank you too ravisingh for your good questions!

$ cat emp.lst
Rob Mills
Jack Thompson
Steffi Blues

a=`cat emp.lst`
echo $a

The output of this echo shows the 3 names in a single line which is perfectly fine.

But my doubt is with the below:

echo "$a"

The output of this shows the 3 names in 3 lines.
I expected the same output here too, because the variable a was already defined. When I defined the variable "a" above, the command cat emp.lst was not quoted, and hence "a" should not have picked up any newline characters.
But this output shows that the variable "a" does contain the newline characters.
How does this happen?

What was IFS set to at the time?

Variable assignments bypass field splitting and pathname expansion (aka file globbing). IFS is irrelevant.

Since double quotes prevent field splitting and pathname expansion from occurring (the only sh parsing steps which can increase the number of elements in a command line), a="`cat emp.lst`" is equivalent to your unquoted version.
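
A sketch of that equivalence, using printf as a stand-in for cat emp.lst so that no file is needed (names_unquoted and names_quoted are illustrative names):

```shell
# Unquoted command substitution on the right-hand side of an assignment.
names_unquoted=$(printf 'Rob Mills\nJack Thompson\nSteffi Blues\n')

# The same assignment with double quotes around the substitution.
names_quoted="$(printf 'Rob Mills\nJack Thompson\nSteffi Blues\n')"

# Both variables hold the same three lines, embedded newlines included.
if [ "$names_unquoted" = "$names_quoted" ]; then
    same=yes
else
    same=no
fi
echo "identical: $same"   # identical: yes
```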

The newlines that you mention are in $a. The difference you observe is a result of quoting or not quoting echo's argument. The newlines are consumed by field splitting when the shell parses echo $a . The difference between the two echo commands is that in the quoted version, the shell is invoking echo with one argument, which contains three lines of names. In the unquoted version, the shell itself looks at the contents of $a, splits it on whitespace, consuming the newlines and (this is important) the spaces as well. echo is then invoked with 6 arguments, one for each word of the names. It is echo's job to then take its 6 arguments, join them with a single space between each, and print the result. In the quoted version, since echo is only passed a single argument, it does not add any space characters.
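
The argument counts described above can be confirmed with a sketch like this, assuming the default IFS. count_args is a hypothetical helper, and printf stands in for cat emp.lst:

```shell
# Hypothetical helper: prints the number of arguments it was called with.
count_args() { echo "$#"; }

a=$(printf 'Rob Mills\nJack Thompson\nSteffi Blues\n')

split_count=$(count_args $a)      # split on spaces and newlines
whole_count=$(count_args "$a")    # one argument containing three lines

echo "unquoted: $split_count, quoted: $whole_count"   # unquoted: 6, quoted: 1
```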

Note that command substitution always strips trailing newlines (not embedded). This has nothing to do with quoting, field splitting, IFS, nor variable assignment. It's how command substitution is designed. You may not have noticed that because echo, besides joining its arguments with a space character, appends a newline. If, however, your file has multiple newlines at the end, you will notice that when you echo the contents of a double-quoted $a, only a single newline is present (all trailing newlines were stripped by the command substitution but only one is added by echo).
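
A sketch of the stripping rule, measuring the stored value with ${#var} so that echo's own trailing newline does not get in the way (trailing and embedded are illustrative names):

```shell
trailing=$(printf 'x\n\n\n')     # all three trailing newlines are stripped
embedded=$(printf 'x\n\ny\n')    # embedded newlines stay, the last one goes

echo "${#trailing} ${#embedded}"   # 1 4
```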

Regards,
Alister

Interesting distinction. I've also noticed that IFS applies here:

set -- `cat file`

But not here:

set -- $(cat file)

It makes sense when you consider the effect that field splitting or pathname expansion could have. a=$1 could yield a=word1 word2 which would then execute an unintended command, word2, with a modification to its environment.

This bypassing of actions which can increase the number of tokens also occurs in the case statement: In case word in ... , word is not subject to field splitting and pathname expansion.
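
Both cases can be seen in a short sketch, assuming a POSIX-style sh (classify and matched are hypothetical names). Neither the assignment nor the case word is quoted, yet the value stays in one piece:

```shell
classify() {
    a=$1            # unquoted, but assignments are not field-split
    case $1 in      # the case word is not split or globbed either
        'word1 word2') matched=yes ;;
        *)             matched=no ;;
    esac
}

classify 'word1 word2'
echo "a=[$a] matched=$matched"   # a=[word1 word2] matched=yes
```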

I have not been able to reproduce such behavior nor have I been able to find any documentation implying that such behavior is intentional. Perhaps there's a bug in your shell? Can you specify the shell and demonstrate how you trigger such behavior?

I tried the following on bash 3.1.17 and did not observe any command substitution syntax-dependent difference in field splitting:

$ echo 1 2 3 > file

$ # With the default value of IFS, <space><tab><newline>
$ # Unquoted

$ set --; echo $#
0
$ set -- `cat file`; echo $#
3
$ set --; echo $#
0
$ set -- $(cat file); echo $#
3

$ # Quoted
$ set --; echo $#
0
$ set -- "`cat file`"; echo $#
1
$ set --; echo $#
0
$ set -- "$(cat file)"; echo $#
1

Regards,
Alister

Interesting. I have bash 4, and thought I could depend on $( ) not splitting. Apparently not so.