Hello experts,
I have a requirement to implement two checks on a CSV file:
- Check whether any value in the first column is duplicated; if so, the script should exit.
- Check that the value in the second column is either "yes" or "no"; if it is anything else, the script should exit.
My input file looks like:
hiring,no
system,yes
hiring,yes
quota,no
The OS is Solaris.
I have been trying to implement my first requirement using awk, but without success. I tried this, but there is no output:
awk 'x[$1]++ == 1 { print $1 " is duplicated"}' FILENAME
awk'x[$1]++FS=","
is not working either. Since the file above has "hiring" in two places, the script should exit.
Please advise.
You need to exit after the print.
$ cat input
hiring,no
system,yes
hiring,yes
quota,no
quota,maybe
$ sort input | awk 'NR == 1 {p=$1; next} p == $1 { print $1 " is duplicated"} {p=$1}' FS=","
hiring is duplicated
quota is duplicated
$ awk '$2 != "yes" && $2 != "no" { print $2 " on line " NR " is not yes/no"}' FS="," input
maybe on line 5 is not yes/no
In a shell script, save the output from each awk command to a file, and use [ -s file ] to determine whether to exit the script or not.
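As a rough sketch of that approach (the function name validate_csv and the temporary file names are made up for illustration, and it uses sort -t, -k1,1 and -F, rather than the exact commands above), a script could look something like this:

```shell
#!/bin/sh
# validate_csv FILE: return 0 if FILE passes both checks, 1 otherwise.
# The function name and the temporary file names are illustrative only.
validate_csv() {
    file=$1
    dups=/tmp/dups.$$
    bad=/tmp/bad.$$

    # Check 1: sort on the first column so that duplicate values end up on
    # adjacent lines, then compare each value with the previous one.
    sort -t, -k1,1 "$file" |
        awk -F, 'NR == 1 {p=$1; next} p == $1 {print $1 " is duplicated"} {p=$1}' > "$dups"

    # Check 2: the second column must be exactly "yes" or "no".
    awk -F, '$2 != "yes" && $2 != "no" {print $2 " on line " NR " is not yes/no"}' "$file" > "$bad"

    status=0
    # [ -s file ] is true when the file exists and is not empty.
    if [ -s "$dups" ] || [ -s "$bad" ]; then
        cat "$dups" "$bad" >&2
        status=1
    fi
    rm -f "$dups" "$bad"
    return $status
}
```

In the calling script, something like "validate_csv input || exit 1" would then stop the script when either check reports a problem.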
Yoda
Perform pre-increment and check if greater than 1 to identify duplicates:
awk -F, ' ++A[$1] > 1 { print $1 " is duplicate"; exit 1 } ' file
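If the script only needs to stop at the first problem, the exit status of awk can be tested directly instead of going through a file. The following is only a sketch: check_csv is a hypothetical helper name, and folding the yes/no check into the same awk program is an assumption about how the two requirements might be combined.

```shell
#!/bin/sh
# check_csv FILE: hypothetical helper that stops at the first offending line.
# awk's "exit 1" becomes the return status of the function.
check_csv() {
    awk -F, '
        # Pre-increment the counter: the second sighting of a key gives 2.
        ++seen[$1] > 1            { print $1 " is duplicate"; exit 1 }
        # Anything other than yes/no in column two is rejected.
        $2 != "yes" && $2 != "no" { print $2 " on line " NR " is not yes/no"; exit 1 }
    ' "$1"
}
```

A calling script could then use "check_csv file || exit 1". Unlike the sort-based version, this reports only the first problem it meets, since awk stops reading as soon as it exits.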
Thank you both, hanson44 and Yoda. I used both utilities in my script and they are working absolutely fine. Thank you again.