Remove the trailing comma (and what follows it) from the end of awk output

Hello friends,

I have a file consisting of many rows, and I use a couple of commands to convert it so I can use it in a database query for filtering. I need the first column (MSISDNs) of each row, separated by commas:

9855162267,4,5,2010-11-03 17:02:07.627
9594567938f,5,5,2010-11-02 12:47:08.047
9855155486,4,5,2010-11-01 12:26:37.640
9233453445f,5,5,2010-11-02 11:20:43.327
9434326423,5,5,2010-11-01 11:02:02.217
9592416210f,4,5,2010-11-02 10:20:52.063
nawk -F, '{print $1}' FILE | sed -e 's/f$//g' -e 's/\([0-9]\{10\}\)/91\1/g' | nawk '!_[$0]++' | nawk -v RS="\n" -v ORS=","  '{}'1

The code works well, but the output ends with a comma followed by the shell prompt, so I can't save it to a file. I know it should be easy to get rid of using the print options, but I couldn't manage it. I'd appreciate any suggestion to remove the colored part. Is it also possible without adding another piped command?

919855162267,919594567938,919855155486,919233453445,919434326423,919592416210,server{root}/a/b/c>

Regards

This should do it in one nawk command:

nawk -F, '{ printf (NR==1?"":",")$1} END {printf "\n"}' FILE
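To see the no-trailing-comma trick in action, here is a runnable sketch (the file name sample.csv and its contents are made up for illustration; plain awk is used here, and the thread's nawk behaves the same for this):

```shell
# Create a small sample file (name and contents are illustrative only)
cat > sample.csv <<'EOF'
9855162267,4,5,2010-11-03 17:02:07.627
9594567938f,5,5,2010-11-02 12:47:08.047
EOF

# NR==1 prints the first field bare; every later record is prefixed
# with a comma, so no trailing comma is ever produced
awk -F, '{printf (NR==1?"":",")$1} END{printf "\n"}' sample.csv
```

The ternary chooses the separator *before* each field rather than after it, which is why the last field is never followed by a comma.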

To drop the f, force $1 into numeric context:

nawk -F, '{printf (NR>1?FS:x)91$1+0} END{print x}' infile
awk -F, '{gsub(/f/,"",$1)}NR==1{a="91" $1;b[$1]++;next}!b[$1]++ {a=a ",91" $1}END{print a}' FILE
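To see how this one-pass version builds the joined string while skipping duplicates, here is a runnable sketch with made-up sample data (sample.csv is illustrative only):

```shell
# Sample data with a trailing "f" on one row and a duplicate MSISDN
cat > sample.csv <<'EOF'
9855162267,4,5,2010-11-03
9594567938f,5,5,2010-11-02
9855162267,4,5,2010-11-01
EOF

# gsub strips the "f"; the first record seeds the string a, and each
# later record is appended only if its field has not been seen before
awk -F, '{gsub(/f/,"",$1)}NR==1{a="91" $1;b[$1]++;next}!b[$1]++{a=a ",91" $1}END{print a}' sample.csv
```

The duplicate third row is silently dropped, and the result is printed once at END.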

@Scruti

With the +0 I get a result in scientific notation:

# awk -F, '{printf (NR>1?FS:x)91$1+0} END{print x}' infile
919.85516e+09,919.59457e+09,919.85516e+09,919.23345e+09,919.43433e+09,919.59242e+09

That is probably because your awk cannot handle large integers. Try:

nawk -F, '{printf (NR>1?FS:x)"91%.0f",$1} END{print x}' infile
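The reason this helps: when a number is concatenated into a string, awk converts it using CONVFMT (default %.6g), which produces scientific notation for large values in some awks; passing the field to printf with an explicit %.0f sidesteps that conversion entirely. A runnable sketch (sample.csv is made up for illustration):

```shell
cat > sample.csv <<'EOF'
9855162267,4,5,2010-11-03 17:02:07.627
9594567938f,5,5,2010-11-02 12:47:08.047
EOF

# %.0f formats the field as a plain integer, so the trailing "f" is
# dropped by numeric conversion and no scientific notation appears
awk -F, '{printf (NR>1?FS:x)"91%.0f",$1} END{print x}' sample.csv
```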

(which is better practice anyway :wink:. Another advantage is that there is no need for +0.)


Yep! That one is better :wink:

x stands for an empty string, correct?

Correct: the variable x is uninitialized, and since it is used in string context it evaluates to the empty string.
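This is easy to verify: an unset awk variable is "" in string context and 0 in numeric context, so it works as a free empty-string placeholder in these one-liners.

```shell
# x is never assigned: it prints as "" in a string, 0 in arithmetic
awk 'BEGIN{print "[" x "]"; print x+0}'
```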

Thanks all for your responses. I have a question:

rdcwayx, your code simply covers all of mine :slight_smile:. It is good that you added "!b[$1]++" to remove duplicates. When the first field of a later line matches one seen before, it skips directly to the next line without applying the rest of the code, doesn't it?
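That is indeed how the pattern works: b[$1]++ returns the count *before* incrementing, so !b[$1]++ is true only on the first occurrence, and on later occurrences the action attached to it is simply not executed for that line. The classic standalone form dedupes whole lines:

```shell
# Each line passes the !seen[$0]++ test only the first time it appears
printf '%s\n' a b a c b | awk '!seen[$0]++'
```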

Besides, Scrutinizer, it is good to learn how "x" works there :slight_smile:

And Scrutinizer, I added the

!b[$1]++

part to your code, and it now removes duplicates as well:

nawk -F, '!b[$1]++{printf (NR>1?FS:x)91$1+0} END{print x}' FILE

Regards

Hi Eagle, I did not know that was a requirement. In that case I would add this (to exclude duplicates that differ only by a trailing "f"):

awk -F, '!b[$1+0]++{printf (NR>1?FS:x)91$1+0} END{print x}'  infile
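The point of indexing the seen-array on $1+0 is that numeric conversion strips the trailing "f" before the duplicate check, so "9855162267" and "9855162267f" count as one key. A runnable sketch (sample.csv is made up; the %.0f form from earlier in the thread is used here to keep the output portable):

```shell
cat > sample.csv <<'EOF'
9855162267,4,5,2010-11-03
9855162267f,5,5,2010-11-02
9594567938f,4,5,2010-11-01
EOF

# b[$1+0] makes "9855162267" and "9855162267f" hash to the same key,
# so the second row is treated as a duplicate and skipped
awk -F, '!b[$1+0]++{printf (NR>1?FS:x)"91%.0f",$1} END{print x}' sample.csv
```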