Questions related to if in awk context and if without awk context

I wrote this code, questions follow

#! /bin/bash -f 

 # Purpose - to show how if syntax is used within an awk
  clear;
  ls -l;

  echo "This will print out the first two columns of the inputted file in this directory";
  echo "Enter filename found in this directory";
  read input;
  awk '{if($input == echo$(ls)) 
         {
          print $1 "\t" $2;
         }
         else
          {
          echo "file " $input " not in this Directory";
          exit
          }
       }' $input

Question 1 - how would I write this better so that my commands are executed under the else statement instead of getting a fatal error when $input is not a file in the Directory?

Question 2 - how would I represent this if statement outside this script and not within an awk command? When using if [ $input == echo$(ls) ] or if [[ $input == echo$(ls) ]]
where $input is a file name in the Directory I get a "too many arguments" and false return or just a false return respectively.

Thank you in advance.

Is this a homework assignment? Homework and coursework questions can only be posted in this forum under special homework rules.

Please review the rules, which you agreed to when you registered, if you have not already done so.

If this is homework, please explain where the code you presented came from. There are so many syntax errors that it is hard to know where to start explaining the problems.

If you did post homework in the main forums, please review the guidelines for posting homework and repost.


(URLs from your quote deleted, as I have fewer than 5 posts)

This is not a homework assignment. I am learning scripting on my own.

(URL deleted from quote, for the same reason as above)

I have reviewed the rules. This is not a homework assignment. I am asking for input from people with more knowledge than me to improve my learning. 
Again, this is not a homework assignment, and I came up with the code myself. The code works for its simple purpose, but I am looking to improve it as stated in my first question, including fixing any syntax errors. I am also seeking to understand the difference in the if statement between awk and non-awk, as stated in my second question.

(urls deleted)

 This is not a homework assignment, and as I am not enrolled in any type of course or program, any posts I make in the foreseeable future will not be. 

Thank you.

Please always tell us what operating system and shell you're using when you start a new thread in the Shell Programming and Scripting forum. Many of the commands you're using in your bash script will behave differently on different operating systems. It is much easier to make suggestions that will work in your environment if we know what environment you're using.

When I run your script in a directory containing two files (one named problem and one named tester ) and enter either of those names when prompted, I get the output:

total 24
-rw-r--r--  1 dwc  staff  1077 Jan 30 22:34 problem
-rwxr-xr-x  1 dwc  staff   457 Jan 30 23:34 tester
This will print out the first two columns of the inputted file in this directory
Enter filename found in this directory
problem
awk: illegal field $(), name "input"
 input record number 1, file problem
 source line number 1

When I enter any other value at the prompt, I get the output:

total 24
-rw-r--r--  1 dwc  staff  1077 Jan 30 22:34 problem
-rwxr-xr-x  1 dwc  staff   457 Jan 30 23:34 tester
This will print out the first two columns of the inputted file in this directory
Enter filename found in this directory
unknown
awk: can't open file unknown
 source line number 10
awk: can't open file unknown
 source line number 8

where unknown is whatever string I entered when prompted.

This is what I got when using bash and awk on macOS Mojave version 10.14.2. Different versions of awk might give you different diagnostic messages or might always treat the expression in your awk if statement:

if($input == echo$(ls)) 

as if it had been written as:

if($0 == $0) 

because you have defined the shell variable input with your bash read statement, but you have not defined a variable named input in your awk script. Using an undefined variable in an awk script causes it to be treated as either an empty string (when a string is expected) or as 0 (when a number is expected). Since none of the awk variables in your awk if statement are defined, I would expect input to be treated as 0 (because a field number is expected after a dollar sign in awk ) and echo to be evaluated as an empty string (because you are concatenating two strings: whatever echo expands to and whatever $(ls) expands to). Since ls expanded to an empty string on my version of awk , I got a syntax error. If the code you showed us does print the 1st two columns from each line of a file in the current directory when you enter the name of a file in the current directory, apparently ls expanded to 0 in your version of awk . In that case, comparing the entire contents of any input line ( $0 ) to itself ( $(0) ) will always yield true and print the 1st two fields of each line of the file using <tab> as a separator in the output.
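The evaluation described above can be reproduced with a small sketch. The names unset_name and also_unset are made up for this example and are deliberately never assigned. Adding 0 forces each uninitialized variable to the number 0, so the result should not depend on how a particular awk treats a dollar sign applied to an uninitialized variable:

```shell
# Sketch: uninitialized awk variables used as field numbers.
# unset_name and also_unset are hypothetical names that are never assigned,
# just as input and ls are never assigned inside the original awk script.
compare_unset_fields() {
    printf '%s\n' 'first second third' |
    awk '{
        # Both sides evaluate to $(0), the whole input line.
        if ($(unset_name + 0) == $(also_unset + 0))
            print "equal: " $0
        else
            print "different"
    }'
}

compare_unset_fields
```

The comparison is true for every input line, whatever file names exist in the directory, so the else branch can never run.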

No matter what version of awk you use, if you invoke it with one pathname operand and that pathname does not name an existing file, you will get an error message similar to the one I specified above or the one you alluded to in question 1 in post #1 in this thread.

Since you didn't really give us a definition of what you are trying to do with your script, I can only make wild guesses. If what I am guessing is correct, there is no reason to use awk at all. It can all be done in bash (or any shell that conforms to the POSIX standards) with something like:

#!/bin/bash

# This script provides a long listing of the current directory and asks the user
# to enter the name of one of the files in the directory.  If the name of a file
# in the current directory is given, the first two fields on each line will be
# printed separated by a <tab> character.  Otherwise a diagnostic message will
# be printed.

# Clear the screen.
clear

# Provide a long listing of the files in the current directory.
ls -l

# Prompt for and read the name of a file to be processed.
echo 'This will print out the first two columns of the inputted file in this directory'
printf 'Enter the filename of a regular file found in this directory: '
read input

if [ -f "$input" ]
then	# The name of an existing regular file was given.  Print the first two fields.
	while read -r field1 field2 rest
	do	printf '%s\t%s\n' "$field1" "$field2"
	done < "$input"
else	# The name given does not name an existing file.
	echo "file \"$input\" not in this Directory"
	exit 1
fi

Note that I only tested for the presence of a regular file. Trying to read a directory or most other file types using the read utility isn't something you're ready to try handling yet.
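As a rough illustration of what testing "only for a regular file" means, the sketch below classifies a pathname with three operators of the test utility ( -f , -d , and -e ). The helper name classify_path is an assumption made for this example and is not part of the script above:

```shell
# Sketch: classify a pathname with the test utility's file operators.
# classify_path is a hypothetical helper name used only for this example.
classify_path() {
    if [ -f "$1" ]; then
        echo "regular file"
    elif [ -d "$1" ]; then
        echo "directory"
    elif [ -e "$1" ]; then
        echo "other file type"
    else
        echo "does not exist"
    fi
}

classify_path .    # the current directory is a directory, not a regular file
```

Only the first branch corresponds to the [ -f "$input" ] check in the script. Every other result ends up in its else clause.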

The shell if statement in the above code could be replaced by a printf piped into an awk script, but it would be considerably more complicated than the above shell script.

Is this approximately what you were trying to do?

 
OS: Linux Lite 4.0

 uname -a = Linux seth-desktop 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 18:02:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
 
 
  bash -version = GNU bash, version 4.4.19(1)-release (x86_64-pc-linux-gnu)
 Copyright (C) 2016 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later (link removed due to not having 5 posts yet)

 This is free software; you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.
 

-Thank you for replying, Don. My code works when $input is a file in the pwd, but as stated I get a fatal error when $input is not actually a file in the working directory, and commands under else do not execute. Finding where I went wrong is what I'm looking for in the 1st question. When $input is actually a file in pwd, the if condition returns true, and the fields specified ($1 and $2) are printed along with the tab separating them. So code works but only if $input is a file in the directory. Else doesn't work if $input isn't in the directory.

The way I have coded the script allows the if expression to take the condition as expressed and return true if the file is present, but not if I am not using awk, for instance outside the script, as in (at a prompt):

Note* AfileinDir is a file present in the pwd, not calling for any of the info in it, just trying to understand why awk/if returns true as in my script, and why if with the bracket syntax outside the script doesn't.

input=“AfileinDir”

          if [ $input == echo$(ls) ];  # or if [ $input == $(ls) ]
             then echo $input " was found via if condition"
             else echo $input " was not found via if condition"
          fi
          returns this:  
          bash: [: too many arguments
          AfileinDir was not found via if condition
 
You see how this is different from when I used awk in my script with the alternate syntax. The awk/if statement yielded a true response and no errors (if the file was in the directory), and it printed out just the fields requested with the tab separating them. Using if [ ] outside the script and without awk, as above, returns a negative response and a “too many arguments” error. Using the double brackets [[ ]] around the condition yields a negative response but no error. I am trying to understand the why of all this to better understand the commands and syntax rather than look for an alternative method. I will, however, look over your code to see what I can learn from it, thank you.

I really appreciate your solo quest to master *nix and the associated challenges, and I'd like to support you with it. Let's start with a few comments:
Don't intermix the various topics / fields, as you do in your first post by mixing shell and awk syntax. Concentrate on one first, then extend your work to the others. Don't work with non-*nix tools (windows(?) editors), so as not to introduce off-system errors (e.g. locale double quotes) that lead to syntax problems.
Let's start with some comments on your last post:

input=“AfileinDir”

Syntax error! The double quotes are NOT the required ASCII " (oct 042, dec 34, hex 22). What editor do you use? Change to a genuine *nix one, or at least switch yours to *nix mode. While we're at it, make sure your script lines are terminated correctly (\n in *nix, NO \r!)

if [ $input == echo$(ls) ];  # or if [ $input == $(ls) ] 

bash: [: too many arguments : echo$(ls) evaluates to something like "echofile1 file2 etc", i.e. the string "echo" immediately followed by the list of files in your directory. That will rarely match anything in your $input variable. Would ls $input do what you want (check whether the file exists in the current working directory)?
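To make the "too many arguments" diagnostic more concrete, here is a minimal sketch that counts the words the shell produces from the unquoted expression before [ ever sees them. The helper name count_test_words and its two arguments are assumptions made for this example:

```shell
# Sketch: count the words that [ would receive from the unquoted expression.
# count_test_words is a hypothetical helper; its first argument is the
# directory to list and its second is the name being looked for.
count_test_words() {
    dir=$1
    input=$2
    # Unquoted expansions are split into words here just as they are
    # between [ and ].
    set -- $input == echo$(ls "$dir")
    echo "$#"
}
```

In a directory holding only the files problem and tester, count_test_words . problem should print 4: the words problem , == , echoproblem , and tester . A single comparison takes three words, so every additional file in the directory adds one more word than [ can parse.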

Don Cragun already pointed out many aspects of the failure of your approach. I just want to emphasize that without a basic understanding of the operation of both shell and awk and the sometimes subtle differences between the two, and, no less important, the ways and methods they interact, it will become difficult to compose a working solution.

I explained the reasons why the awk code you supplied won't work. You can choose to ignore my comments and continue to wonder why your code doesn't work. Using shell variables in awk and assuming that shell variable expansions will work there will never work the way you have used them, because awk is not bash . After defining a variable in bash to be the name of a file with:

read input

and then using:

awk -v input="$input" 'BEGIN { print "input contains: \"" input "\""}'

shows you how you can turn a shell variable into an awk variable that can be used inside an awk script. Note that in shell:

echo $VAR

prints the value of the string stored in the variable named VAR (assuming that the string assigned to VAR does not contain any <backslash> characters and does not start with a <hyphen>, possibly preceded by leading <space> and/or <tab> characters). (If those constraints are broken, the output produced by echo varies from shell to shell and operating system to operating system.)

You get roughly the same output inside awk with the awk statement:

print VAR

if and only if VAR is also an awk variable containing the same string as the shell variable VAR .
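A minimal sketch of that equivalence, assuming a helper named show_value (a made-up name). It uses printf rather than echo in the shell half so that the constraints on hyphens and backslashes mentioned above do not apply there:

```shell
# Sketch: print the same string from the shell and from awk.
# show_value is a hypothetical helper name.
show_value() {
    VAR=$1
    # printf with a %s format prints the string as-is, even when it starts
    # with a hyphen or contains a backslash.
    printf '%s\n' "$VAR"
    # -v copies the shell string into an awk variable of the same name.
    # Note that awk processes backslash escapes in the assigned value.
    awk -v VAR="$VAR" 'BEGIN { print VAR }'
}

show_value 'hello world'
```

Both lines of output are the same string, but only because the -v option explicitly created the awk variable. Without it, the awk print statement would print an empty line.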

Using $1 in a shell script refers to the contents of the 1st command line argument passed to your shell script. Using $1 in an awk script refers to the contents of the 1st field on the current record you are processing from the current file you are reading with awk .

If the shell variable input contains the string 5 and the awk variable input contains the string 5 , then in shell code $input expands to the string 5 , but in awk code $input expands to the string that is the contents of field number 5 in the current input line in the current input file.
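The difference can be seen in a short sketch. It uses the value 2 instead of 5 and a made-up sample record so that the input stays short. The helper name show_dollar_difference is hypothetical:

```shell
# Sketch: the same variable name and value, read by the shell and by awk.
# The sample record "one two three" is made up for this example.
show_dollar_difference() {
    input=2
    echo "shell: $input"
    printf '%s\n' 'one two three' |
    awk -v input="$input" '{ print "awk: " $input }'
}

show_dollar_difference
```

The shell line shows the value 2 itself, while the awk line shows the second field of the record ( two ).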

In a directory where I have hundreds of files and the first file in the directory (sorted alphanumerically) is named 1999_08-09.sum , the shell command

echo$(ls)

produces the output:

bash: echo1999_08-09.sum: command not found

because the shell didn't find a utility named echo1999_08-09.sum after concatenating the output produced by the command substitution $(ls) with the string echo .

The same code in awk (when there is no variable in awk named echo and no variable in awk named ls ) expands, with your version of awk , to $(0) , which awk treats as the contents of the current input line from the current input file. This obviously has absolutely nothing to do with the name of a file stored in a shell variable.

You can't mix random shell statements and random awk statements and assume that that mix will magically be interpreted the way you want it to be interpreted.

The shell command language and the awk command language are not the same no matter how much you want that to be true. If you want to use shell variables in an awk script you have to create an awk variable that contains the contents of that shell variable. If you want to run shell commands inside an awk script, you have to learn the awk commands that can be used to do that (and the awk you have shown us doesn't make any attempt to do so).
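For completeness, here is one possible sketch of how an awk script could take the file name from a shell variable and notice a missing file itself. It is not code from this thread: the helper name print_two_columns is made up, and it relies on awk's getline function reading from a named file inside a BEGIN block, so that awk is never given a file operand it cannot open:

```shell
# Sketch: let awk itself detect a missing file, then print two columns.
# print_two_columns is a hypothetical helper; its argument is a file name.
print_two_columns() {
    awk -v file="$1" '
    BEGIN {
        # getline returns 1 for a record, 0 at end of file, and -1 when
        # the file cannot be read.
        status = (getline line < file)
        if (status < 0) {
            print "file \"" file "\" not in this Directory"
            exit 1
        }
        while (status > 0) {
            split(line, field)          # split on whitespace like $1, $2
            print field[1] "\t" field[2]
            status = (getline line < file)
        }
        close(file)
    }'
}
```

This is noticeably more involved than a shell test with [ -f "$input" ] , which is the point being made here about choosing the right tool.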

BSD, Linux, and UNIX systems provide you with hundreds of tools you can use to do all sorts of wondrous things. But you have to first learn that those tools only fit together in certain ways. And, there are certain tools that are extremely good at doing one thing and extremely poor at doing other things. And, using an awk if statement to determine if a string assigned to a shell variable names an existing file is an extreme case of using the wrong tool to try to do the job.

I wish you luck in your adventures, but I sincerely hope you'll take a closer look at the (entirely) bash script I gave you that seems to do what I think you were trying to do. Trying to use awk to determine whether or not there is a file of a certain name in the current working directory is enormously more difficult than learning to use the test utility that is available in all shells based on Bourne shell syntax (e.g., bash , dash , ksh , sh , and zsh ).

You could also use [[ expression ]] instead of test expression or [ expression ] , in some versions of bash and some versions of ksh , but I strongly suggest that you learn the basics before trying to use shell specific features that work well in some cases but not in others.

Uh, my awk code works fine provided $input is a file in the working directory. I am not using any special editors, and after reading RudiC's comments, I restarted my system and went to a real terminal (not a terminal emulator) and did the following at a prompt:

AfileinDir is simply a file with some data in a couple of fields.

 
 
    input=“AfileinDir”



    awk '{if($input == $(ls)) {print $1 "\t" $2;} }' $input 



code works fine. Go to a directory on your computer, define input as a file in that directory that has at least 2 fields and copy paste the awk line above.

My first question relates to the else portion of the code, which I haven't included. I know *that* part doesn't work, which is why I asked for help on improving the script.


My second question is still not being answered because so far the respondents are saying my code doesn't work when it does?? No special emulators. Straight from tty1 terminal.  


Please explain in English what you think the expression in the if statement in:

awk '{if($input == $(ls)) {print $1 "\t" $2;} }

is doing. Since you can't get the else clause to work, by definition, your code is not working. How can you say your code is working when it won't run if the filename you pass to it is not a file that exists in the current directory? Furthermore, it is my belief that what the code above is doing is not doing what you think it is. Whether or not that means it is working when the named file does exist is open for discussion. I claim that if it is not doing what you think it is doing, it is not working. You claim that since the then clause of your if statement is working, everything is fine.

Without running the following code, what output do you think it would produce:

awk '{if($input == $(ls)) {print $1 "\t" $2 "\t$input=\"" $input "\"\t$(ls)=\"" $(ls) "\"";} } $input

After you have decided what output you think it should produce, run it and compare the output you get to the output you thought it should produce. Then go back and look at how I said that expression in the if statement would be evaluated in my comments in post #4 in this thread.

Please run the above test and let us know what happens!

As I said before, I can't test your code using the version of awk that I have available on my system because the code you're using is not technically correct (and produces a syntax error in the version of awk that I'm using, but runs without producing a syntax error in the version of awk that you're using). I can force the version of awk I'm using to get the results you're seeing by changing your code to:

awk '{if($input == $(ls+0)) {print $1 "\t" $2;} }
Never said the else clause was working; that's why I asked for a different way to script it.


First, the code will return an error, since the code you provided doesn't close the awk command nor does it provide it a file to look at.  


But if those were present, and in English:  


awk will first examine the file to see if it's present. Then it will evaluate the value of variable input to see if it is equal to the value of what a list (ls) command returns. 'ls' will return every non-hidden file in the directory it's run in, so strictly speaking, all the strings returned taken as a whole will not equal the value of input - but this is done in the context of an awk command with a specific file (the value of input) to look at, and if true (the file is present) it will print out the first field, followed by a tab, followed by the second field.


If, however, the value of input is not in the directory (the file isn't listed), then awk will exit with an error. That error is
awk: fatal: cannot open file `a' for reading (No such file or directory)

where 'a' is not a file in the directory. This is the reason why else is never executed - because the error is returned before the condition and commands that form part of the if statement are ever read.

It will return a prompt because you didn't close off the awk command before providing the file.

with the awk command closed off,
Sorry, I had to run it to see what the code was doing :p. But you're changing the value of both $input and $(ls), so it's going to list the values of those as you specified. I don't see how this helps me with either of my questions in the original post.


 awk --version
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 4.0.1, GNU MP 6.1.2)
Copyright (C) 1989, 1991-2016 Free Software Foundation.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.

I sincerely apologize for not being able to test the code I suggested you try on the system I'm using (and, therefore, missing a closing single-quote in the code I provided).

I'm sorry that you believe that I should be required to download all of the software you're using on your system onto my system in order to try to help you understand the code you're using. I AM NOT GOING TO DO THAT! I am perfectly happy with the way awk works on my system even though awk on your system has some (non-standard) extensions that do some nice things in some cases. What those extensions do has nothing to do with what you're doing nor with what you're trying to do. The difference we are seeing is because you are doing something that the standards describe as producing "unspecified behavior" and your version of awk produces a different unspecified behavior than my version of awk produces.

All that I was trying to do with the awk statement I was trying to get you to run was to show that the value of $input and the value of $(ls) inside your awk script have absolutely no relationship to the value of $input or the value of $(ls) in bash outside of your awk script.

Nothing in your English explanation of what the expression in your awk if statement does is true. The awk variable input is not defined anywhere in your awk code (only the shell variable input is defined; not the awk variable input ). In bash $(ls) is a command substitution that runs the ls command and substitutes the output it produces. In awk $(ls) is an expression that expands to the contents of the field whose number is the value of the awk variable ls converted to an integer. The awk and shell == operators are a request to compare the two operands on either side of that operator. If there is more than one file in a directory and one of those files is the file you're processing, there is absolutely no way that the name of that one file can possibly be equal to the list of names of files that are present in that directory. Since you are seeing the output you're seeing, your version of awk has to be expanding both sides of the comparison to the expansion of $0 which (in awk ) is the entire contents of the current input line.

With the missing quote added AND assuming that you had already defined the shell variable input as you had shown before AND assuming that the file named AfileinDir contained the two input lines:

Line 1 in file AfileinDir
AfileinDir's 2nd line

I would expect the sequence of commands:

input="AfileinDir"
awk '{if($input == $(ls)) {print $1 "\t" $2 "\t$input=\"" $input "\"\t$(ls)=\"" $(ls) "\"";} }' "$input"

to produce output very similar to the following:

Line	1	$input="Line 1 in file AfileinDir"	$(ls)="Line 1 in file AfileinDir"
AfileinDir's	2nd	$input="AfileinDir's 2nd line"	$(ls)="AfileinDir's 2nd line"

Despite what you said, absolutely nothing in the code above assigns any value to the awk variables input and ls ; that code only displays the values that the expression in your awk if statement is comparing (i.e., the entire contents of the current line in AfileinDir on both sides of the == ).

Maybe if we rewrite that as:

input="AfileinDir"
awk '{if($input == $(ls)) {print $1 "\t" $2;print $input;print $(ls); print "End of record #" NR;} }' "$input"

which (if you set the contents of the file named AfileinDir to the contents I specified above), I expect will produce the output:

Line	1
Line 1 in file AfileinDir
Line 1 in file AfileinDir
End of record #1
AfileinDir's	2nd
AfileinDir's 2nd line
AfileinDir's 2nd line
End of record #2

you will see that $input and $(ls) in your awk script do not expand to the strings you think they will expand to.

I know that you don't want to just use the shell (i.e. bash ) to determine whether or not the variable read from your script's user (i.e. input ) is the name of a file in your current directory, but that is exactly what the simple bash command:

if [ -f "$input" ]
then	# The name of an existing regular file was given.  Print the first two fields.
	while read -r field1 field2 rest
	do	printf '%s\t%s\n' "$field1" "$field2"
	done < "$input"
else	# The name given does not name an existing file.
	echo "file \"$input\" not in this Directory"
	exit 1
fi

that I suggested in post #4 did for you. If the shell variable input contains the name of a regular file that exists in the current directory, it will perform the while loop in the then clause of the if statement. Otherwise, it will print the name of the file the user supplied and say that that file is not in this Directory and then exit with a non-zero exit status as specified in the else clause of the if statement.


I'm not as eloquent in the English language as Don Cragun is, nor am I as patient. Still, I hope you believe that I really want to help you.
That said: He's right, and you are wrong.
Your code does NOT "work fine". It may by sheer accident provide an answer that suits you, but I'm sure that's not what you want.

For the if clause in question, replace it with

if (A == A) print ...

or, even better,

if (1) print ...

and see what you get, mayhap even trying the else clause. You may also want to closely read and understand all the posts in your thread.
Then come back, and we'll continue the discussion.
