Cut columns with delimiter

Hi,

I have a file like the one below:

"103865","103835","Zming","","Zhu","103965","Sunnyvale","US",
"116228","116227","Morlla","","Kowalski","113228","Paese "(Treviso)""IT"

I want to validate the 7th column, which is shown below:

"Sunnyvale"
"Paese

In the 7th column above, "Paese does not end with a double quote, which is an issue, so I need to find these values. Any help would be appreciated.

Thanks

awk -v FS="," 'NF!=8' filename

will print all rows which don't have the right number of fields.
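
To see what that check does on the two sample rows, here is a small sketch that prints the field count awk sees for each row. The helper name count_fields is made up for illustration; it only assumes that every comma separates two fields.

```shell
# count_fields: print each row number with its comma-separated field count.
# (Hypothetical helper name, for illustration only.)
count_fields() {
  awk -F, '{ print NR ": " NF " fields" }' "$@"
}

# The two sample rows from the question.
count_fields <<'EOF'
"103865","103835","Zming","","Zhu","103965","Sunnyvale","US",
"116228","116227","Morlla","","Kowalski","113228","Paese "(Treviso)""IT"
EOF
```

On these two rows it reports 9 fields for the first (the trailing comma adds an empty field) and 7 for the second, so NF!=8 matches both of them.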

# awk -F, 'NR==1{print $7;next};{split($7,a," \"");{print a[1]}}' file
"Sunnyvale"
"Paese

regards
ygemici

@ygemici

Hi,

This prints all the records.

Thanks, this was helpful.

Can I also filter this output to show only the values that do not end with "?

Krish.

edit: sorry this was wrong.

# cat file
"103865","103835","Zming","","Zhu","103965","Sunnyvale","US",
"116228","116227","Morlla","","Kowalski","113228","Paese,"(Treviso)","IT"
# awk -F, '{for(i=1;i<NF;i++){if(substr($i,length($i),length($i)) !~ /"/)print $i}}' file
"Paese

regards
ygemici

This is how the file looks

     1  Identifier,Username,First_Name,Middle,Last_Name,Employee_ID,City,RS Code,Postal,Email_addr,HomePhone,WorkPhone,Internal,Ethnic_Question,Ethnic_Question,USA_EEO2_Ethnicity_Answer,USA_EEO2_Race_Question,USA_EEO2_Race_Question,USA_EEO2_Race_Answer,Gender_Question,Gender_Question,Gender,SSOID^M
     2  "67042","67042","A","","Jones","67042","Georgetown","US","78628","AJ_Jones@amat.com","+1  512-863-2043","+1  512-272-3247","TRUE","-1","-1","-3","-2","-2","-9","-3","-3","-11","67042"^M
     3  "113021","113021","Aan Wooi","","Mu","113021","Tampines","MY","520852","Aan_Wooi_Mu@amat.com","","+65  98374227","TRUE","-1","-1","-1","-2","-2","-4","-3","-3","-11","113021"^M
     4  "66872","66872","Aaron","","Hunter","66872","Santa Cruz","US","95060","Aaron_Hunter@amat.com","+1  408-458-0959","+1  408-584-0592","TRUE","-1","-1","-3","-2","-2","-9","-3","-3","-11","66872"^M
     5  "70277","70277","Aaron","","Maciej","70277","Austin","US","78729","Aaron_Maciej@amat.com","+1  512-635-9991","+1  512-272-3617","TRUE","-1","-1","-3","-2","-2","-9","-3","-3","-11","70277"^M
     6  "79145","79145","Aaron","","Wei","79145","Hsin Chu","TW","N/A","Aaron_Wei@amat.com","+886  35323094","+1  886-3-579-3464","TRUE","-1","-1","-1","-2","-2","-4","-3","-3","-11","79145"^M
     7  "103260","103260","Aaron","","Liu","103260","Lingya District","TW","802","Aaron_Liu@amat.com","+886  88677615344","+1  886-4-22172610","TRUE","-1","-1","-1","-2","-2","-4","-3","-3","-11","103260"^M
     8  "104267","104267","Aaron","","Himmler","104267","San Jose","US","95136","Aaron_Himmler@amat.com","+1 (703) 8955735","+1  408-584-0249","TRUE","-1","-1","-3","-2","-2","-9","-3","-3","-11","104267"^M

When I run your command, it gives me the output below. I think it is still not right.

Identifier
Username
First_Name
Middle
Last_Name
Employee_ID
City
RS Code
Postal
Email_addr
HomePhone
WorkPhone
Internal
Ethnic_Question
Ethnic_Question
USA_EEO2_Ethnicity_Answer
USA_EEO2_Race_Question
USA_EEO2_Race_Question
USA_EEO2_Race_Answer
Gender_Question
Gender_Question
Gender
"Taoyuan County
"Bangalore
"335001
"Wujie Township
"Chu Pei
"Chu Tung
"Frankenthal
"Zhubei City
"Taichung County 434
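
The header names show up in that output because they are not quoted at all, so their last character is never a double quote. Below is a minimal sketch of the same last-character check that skips the header row, strips the trailing carriage return (the ^M in the listing), and reports the row and field number of each hit. The function name find_unclosed is made up for illustration, and the sketch assumes every comma is a field separator; a hit may therefore also be a valid value that simply contains a comma, since splitting on every comma cuts such a value in two.

```shell
# find_unclosed: report fields whose last character is not a double quote.
# (Hypothetical helper name; assumes the first row is an unquoted header.)
find_unclosed() {
  awk -F, 'NR > 1 {
    sub(/\r$/, "")                      # drop the ^M left by CRLF line endings
    for (i = 1; i <= NF; i++)
      if ($i != "" && $i !~ /"$/)       # non-empty field without a closing quote
        print "row " NR ", field " i ": " $i
  }' "$@"
}

# Usage on a tiny made-up input with CRLF line endings:
printf '%s\r\n' 'City,Code' '"Sunnyvale","US"' '"Paese,"(Treviso)","IT"' | find_unclosed
```

For the made-up input above, the only line reported is row 3, field 1, with the value "Paese.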

@krrishv

I have removed the numbers and extra spaces from all the rows and tried the following:

cat checkit.sh
Identifier,Username,First_Name,Middle,Last_Name,Employee_ID,City,RS Code,Postal,Email_addr,HomePhone,WorkPhone,Internal,Ethnic_Question,Ethnic_Question,USA_EEO2_Ethnicity_Answer,USA_EEO2_Race_Question,USA_EEO2_Race_Question,USA_EEO2_Race_Answer,Gender_Question,Gender_Question,Gender,SSOID^M
"67042","67042","A","","Jones","67042","Georgetown","US","78628","AJ_Jones@amat.com","+1 512-863-2043","+1 512-272-3247","TRUE","-1","-1","-3","-2","-2","-9","-3","-3","-11","67042"^M
"113021","113021","Aan Wooi","","Mu","113021","Tampines","MY","520852","Aan_Wooi_Mu@amat.com","","+65 98374227","TRUE","-1","-1","-1","-2","-2","-4","-3","-3","-11","113021"^M
"66872","66872","Aaron","","Hunter","66872","Santa Cruz","US","95060","Aaron_Hunter@amat.com","+1 408-458-0959","+1 408-584-0592","TRUE","-1","-1","-3","-2","-2","-9","-3","-3","-11","66872"^M
"70277","70277","Aaron","","Maciej","70277","Austin","US","78729","Aaron_Maciej@amat.com","+1 512-635-9991","+1 512-272-3617","TRUE","-1","-1","-3","-2","-2","-9","-3","-3","-11","70277"^M
"79145","79145","Aaron","","Wei","79145","Hsin Chu","TW","N/A","Aaron_Wei@amat.com","+886 35323094","+1 886-3-579-3464","TRUE","-1","-1","-1","-2","-2","-4","-3","-3","-11","79145"^M
"103260","103260","Aaron","","Liu","103260","Lingya District","TW","802","Aaron_Liu@amat.com","+886 88677615344","+1 886-4-22172610","TRUE","-1","-1","-1","-2","-2","-4","-3","-3","-11","103260"^M
"104267","104267","Aaron","","Himmler","104267","San Jose","US","95136","Aaron_Himmler@amat.com","+1 (703) 8955735","+1 408-584-0249","TRUE","-1","-1","-3","-2","-2","-9","-3","-3","-11","104267"^M

 
cat checkit.sh | awk -F "," '{print $7}' | tail -7
"Georgetown"
"Tampines"
"Santa Cruz"
"Austin"
"Hsin Chu"
"Lingya District"
"San Jose"

Can you check whether this is useful for you?

Thanks

Still no luck.

This prints the whole 7th column. All I want are the values that do not end with "

Some of the records are like below.

"116228","116228","Moriella","","Kowalski","116228","Paese "(Treviso)"","IT","31038"

In this record, the 7th column value "Paese should end with a " but it does not.

Your command prints this:

"Paese "(Treviso)""

It treats this as the whole column, so I am still not getting the right data.

---------- Post updated at 11:19 AM ---------- Previous update was at 11:14 AM ----------

This command prints all the column values correctly. But how can I print only the ones that do not end with "

awk -F, 'NR==1{print $7;next};{split($7,a," \"");{print a[1]}}' file

output

"Paese"
"Paese"
"Castagnole di Paese"
"Paese
"Paese"
"Castagnole di Paese"
"Paese"
"Paese"
"Padernello di Paese"

I want to print only the values that do not end with ", which here is just "Paese
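
Building on the split command above, one possible way to print only the unterminated values is to test the first piece before printing it. This is only a sketch: it assumes the file has a header row and that the broken values always contain a space followed by a double quote inside the 7th field, as in "Paese "(Treviso)"". The name unclosed_col7 is hypothetical.

```shell
# unclosed_col7: print 7th-column values whose first piece lacks a closing quote.
# (Hypothetical helper name; assumes a header row and comma-separated fields.)
unclosed_col7() {
  awk -F, 'NR > 1 && NF >= 7 {
    split($7, a, " \"")          # cut the field at each space + double quote
    if (a[1] !~ /"$/)            # first piece does not end with a double quote
      print NR ": " a[1]
  }' "$@"
}

# Usage: unclosed_col7 file
```

Values such as "Paese" or "Castagnole di Paese" contain no space followed by a quote, so they stay whole, end with a quote, and are not printed.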

Hi Krrishv,
As per your sample file, the delimiter is a comma, but the fields in the second row do not follow the same pattern.
Can you confirm how you determine that "Paese is the 7th column value?

Please help me understand the exact criteria for picking out the value.

"Paese
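
One way to pin the criterion down, offered as an assumption rather than a confirmed rule for this file: treat the 7th field as well formed only if it is a double quote, then any run of characters that are not double quotes, then a closing double quote. Anything else in column 7 is reported. The name malformed_col7 is hypothetical, and the sketch again assumes a header row and that every comma separates two fields.

```shell
# malformed_col7: print rows whose 7th field is not of the form "text".
# (Hypothetical helper name; the quoting rule is an assumption.)
malformed_col7() {
  awk -F, 'NR > 1 {
    sub(/\r$/, "")                          # ignore CRLF line endings
    if ($7 !~ /^"[^"]*"$/)                  # not: quote, non-quotes, quote
      print NR ": " $7
  }' "$@"
}

# Usage: malformed_col7 file
```

Under this rule "Paese "(Treviso)"" is reported because it has quotes in the middle of the field, while "Sunnyvale" and "Castagnole di Paese" pass.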