Extract the link from a column and replace the column with it, where a link exists

Hi, I have this sample data:

a,b,c,d,e,g h http://mysite.xyx
z,b,d,f,e,s t http://123124#
a,b,c,i,m,nothing
d,i,j,e,w,nothing

The expected output is:

a,b,c,d,e,http://mysite.xyx
z,b,d,f,e,http://123124#
a,b,c,i,m,nothing
d,i,j,e,w,nothing

I can get only the links using grep -o 'http.*'

I tried something like the loop below, but it doesn't work:

for i in `cat file.csv`
do
first=$i|awk '{print $1}'
second=$i|awk '{print $2}'
third=$i|awk '{print $3}'
four=$i|awk '{print $4}'
five=$i|awk '{print $5}'
six=$i|awk '{print $6}'
 if [ $six = "nothing" ] 
 then
 six=$six
 else
    six=`grep -o 'http.*' $six`
 fi
echo "$first,$second,$third,$four,$five,$six"
 done >> output.csv
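
A note on why the loop above fails: the for loop over `cat file.csv` splits the input on whitespace rather than on lines, `first=$i|awk ...` never passes $i to awk and never stores the result, awk is not told to split on commas, and grep treats $six as a file name. A minimal sketch of a working loop follows, assuming exactly six comma-separated fields and URLs that start with "http". The function name fix_sixth is made up for illustration.

```shell
# Hypothetical helper (not from the thread): reads CSV lines on stdin and
# writes them back with the sixth field reduced to its URL, if it has one.
fix_sixth() {
    # IFS=, splits on commas only; the last variable receives the rest of the line.
    while IFS=, read -r first second third four five six
    do
        case $six in
            # Field contains "http": drop everything before its first occurrence.
            *http*) six="http${six#*http}" ;;
        esac
        printf '%s\n' "$first,$second,$third,$four,$five,$six"
    done
}

# Sample data from the question.
out=$(fix_sixth <<'EOF'
a,b,c,d,e,g h http://mysite.xyx
z,b,d,f,e,s t http://123124#
a,b,c,i,m,nothing
d,i,j,e,w,nothing
EOF
)
printf '%s\n' "$out"
```

With a real file this could be run as fix_sixth < file.csv > output.csv.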

Hello zozoo,

The output you have shown raises a few questions.
I- By what logic did you remove the lines a,b,c,i,m,nothing and d,i,j,e,w,nothing?
II- By what logic did the line z,b,d,f,e,s t http://123124# change to z,b,d,f,e,http://mysite.xyx?

Please be clear in your posts.

Thanks,
R. Singh

Hi Ravinder, sorry for the wrong output; it is corrected now.
Basically, I want to check whether the sixth column contains a URL. If it does, replace the field with the URL; otherwise leave the field's value unchanged.

Hello zozoo,

Could you please try the following and let me know if it helps?

awk -F',| ' '{print $1,$2,$3,$4,$5,$NF}' OFS=,   Input_file

Thanks,
R. Singh

Another way:

awk 'NF>1{sub(/[^,]*$/,$NF)}1' file

or

sed 's/[^,]* //' file

--
(same thing with awk: )

awk '{sub(/[^,]* /,x)}1' file

That solved it. I was just trying another version that matches the http string and then splits by space into an array to return the value, but your solution solved it.

So the solution splits on the delimiter , or <space>, and $NF gives the last field. Am I understanding it correctly?

Hi Scrutinizer, the solution you provided also works, but it is difficult to understand. Can you please explain it?

Hi zozoo,

The first approach replaces the part after the last comma with the last field ($NF).
The other ones remove the part after the last comma, up to and including the last space.

[^,] means "a character that is not a comma".
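
To see both approaches at work, here is a small sketch that runs them on one sample line from the question. The variable names line, first and second are only for illustration.

```shell
line='a,b,c,d,e,g h http://mysite.xyx'

# First approach: with awk's default whitespace splitting, $NF is the URL.
# sub() replaces everything after the last comma ([^,]*$) with it.
first=$(printf '%s\n' "$line" | awk 'NF>1{sub(/[^,]*$/,$NF)}1')

# Second approach: in this line spaces occur only after the last comma, so
# "[^,]* " matches there, and greedy matching extends it to the last space.
second=$(printf '%s\n' "$line" | sed 's/[^,]* //')

printf '%s\n' "$first" "$second"
```

Lines without a space, such as a,b,c,i,m,nothing, pass through unchanged in both cases. One caveat for the first approach: in awk's sub(), an & in the replacement stands for the matched text, so a URL containing & would likely need extra care.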

Hello zozoo,

Could you please go through the following explanation and let me know if it helps?

awk -F',| ' '{           ##Setting the field separator to comma (,) OR space for each line.
print $1,$2,$3,$4,$5,$NF ##Printing the first, second, third, fourth, and fifth fields and $NF (the last field) of the line.
}
' OFS=,   Input_file     ##Setting the output field separator to comma and naming Input_file here too.

Thanks,
R. Singh
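
To make the field numbering concrete, the sketch below lists how one sample line splits when the separator is a comma or a space, then prints the reduced line. It assumes the first five fields never contain spaces; the variable names are only for illustration.

```shell
line='z,b,d,f,e,s t http://123124#'

# List every field with its number: the words "s" and "t" become fields 6
# and 7, and the URL ends up as the last field ($NF).
fields=$(printf '%s\n' "$line" |
    awk -F',| ' '{ for (i = 1; i <= NF; i++) printf "%d=%s\n", i, $i }')

# Keep the first five fields plus the last one, joined with commas.
result=$(printf '%s\n' "$line" |
    awk -F',| ' -v OFS=, '{ print $1, $2, $3, $4, $5, $NF }')

printf '%s\n' "$fields" "$result"
```

For a line such as a,b,c,i,m,nothing there are only six fields, so $NF is simply the sixth one and the line is printed as it was.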

perl -pe 's/\w+ //g' zoo.file #removes any word followed by a space