Help with concatenating the data of 2 files

ss_ss · February 5, 2013, 3:52am

Hi All,

I am trying to merge 2 files, 348.csv & 349.csv, using the join and awk commands, but I am not getting the proper output:

 
cat 348.csv
Timestamp              Server
                              348 
02/04/2013 09:19      100
02/04/2013 09:20      250
02/04/2013 09:21      80

 
cat 349.csv
Timestamp              Server
                              349
02/04/2013 09:19      234 
02/04/2013 09:20      13 
02/04/2013 09:21      546

 
Expected output
Timestamp                 Server      
                              348  349
02/04/2013 09:19      100  234
02/04/2013 09:20      250  13
02/04/2013 09:21      80    546

 
using below awk:
awk 'NR==FNR{i=NF<5?"__":$1$2$3$4;a=$0;next} FNR==1{print}{i=NF<5?"__":$1$2$3$4}FNR>1&&i in a{print a,$NF}' 348.csv 349.csv

 
using below join:
join -a1 -a2 348.csv 349.csv

Thanks,

pamu · February 5, 2013, 4:07am

Try

 
awk 'NR==FNR{A[NR]=$0;next}{ print A[FNR],FNR!=1?$NF:""}' file1 file2

RudiC · February 5, 2013, 4:20am

This one relies upon your files having exactly synchronized, corresponding lines:

awk 'NR==1 {print; getline < f1; next}
     {getline ln < f1; print ln, $NF}
    ' f1="file1" file2
Timestamp              Server
                              348  349
02/04/2013 09:19      100 234
02/04/2013 09:20      250 13
02/04/2013 09:21      80 546

You may want to alter the OFS char to make the output prettier...
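
If the two files cannot be guaranteed to line up row by row, one alternative is to match rows on the timestamp instead. Below is a minimal sketch, assuming the first two lines of each file are headers and the date and time are the first two whitespace-separated fields. The name merge_by_key is made up for illustration, and the tab OFS is just one choice. It prints data rows only, and rows that exist only in the first file are dropped.

```shell
# merge_by_key FILE1 FILE2  (hypothetical helper, not from the thread)
merge_by_key() {
    awk '
        # First file: remember the last field under the "date time" key.
        NR == FNR { if (FNR > 2) val[$1 " " $2] = $NF; next }
        # Second file: skip the two header lines, then print key and both values.
        FNR > 2 {
            key = $1 " " $2
            print key, (key in val ? val[key] : "-"), $NF
        }
    ' OFS='\t' "$1" "$2"
}
```

Usage would be something like: merge_by_key 348.csv 349.csv. A "-" is printed where the first file has no row for that timestamp.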

ss_ss · February 6, 2013, 4:17am

Hi Pamu/Rudi,

Thanks for your inputs, it worked well.

But I am still facing a formatting issue: both numbers appear under one server name only.

Timestamp             BRM Servers
                          348  349
02/05/2013 22:26          5 6
02/05/2013 22:36          5 6

Need the numbers below their respective server names.

Is there anything that can be done?

Thanks in advance!!!!

pamu · February 6, 2013, 4:32am

Do you mean like this?

 
$ awk 'NR==FNR{A[NR]=$0;next}{ print A[FNR],$NF}' file1 file2

Timestamp              Server Server
                              348  349
02/04/2013 09:19      100 234
02/04/2013 09:20      250 13
02/04/2013 09:21      80 546

RudiC · February 6, 2013, 4:49am

Of course, but this depends heavily on your input files. In your sample files, the server ID is far out to the right, with many spaces before it, so don't expect it to land in the desired position by itself. As I said before, you could play around with awk's FS and OFS variables to improve the output formatting.
If you post two carefully composed input files, we can have a look at the output formatting.
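
Another way to get each number under its server ID, independent of how the input is spaced, is to let awk's printf pad every column to a fixed width. This is only a sketch: it assumes the two files correspond line by line (as in the earlier replies), that the date and time are fields 1 and 2, and that the value of interest is always the last field. The widths 18 and 6 and the name merge_aligned are arbitrary choices for illustration.

```shell
# merge_aligned FILE1 FILE2  (hypothetical helper; column widths are assumptions)
merge_aligned() {
    awk '
        NR == FNR { v[FNR] = $NF; next }   # first file: keep the last field of each line
        FNR == 1  { printf "%-18s%s\n", "Timestamp", "Server"; next }
        FNR == 2  { printf "%-18s%6s%6s\n", "", v[2], $NF; next }   # server IDs
                  { printf "%-18s%6s%6s\n", $1 " " $2, v[FNR], $NF }
    ' "$1" "$2"
}
```

Because the server IDs and the counts go through the same %6s slots, they end in the same columns no matter how many spaces the input had.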

ss_ss · February 6, 2013, 9:12am

The server ID was not intentionally put to the far right; it got messed up while posting.

In the input file the numbers are also below the server name, and the same is required in the output.

I tried hard to post clean, formatted input and output here but was not able to; the server names automatically shift to the right.

Thanks

RudiC · February 7, 2013, 12:39am

Then you have to play with FS and OFS, as mentioned before. Try something like FS=" +" and OFS="\t", and combinations of the white space that you see in your input files. BTW - your input files have trailing spaces that can influence results as well. Try to get rid of them.
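
As a rough illustration of that advice, the sketch below strips trailing blanks and re-joins the fields with tabs. Note the assumption: it uses FS of two or more spaces (a variation on the FS=" +" suggested above) so that the single space inside "02/04/2013 09:19" does not split the timestamp. The name tidy_columns is made up for this example.

```shell
# tidy_columns FILE  (hypothetical helper)
tidy_columns() {
    awk -F '  +' -v OFS='\t' '
        {
            sub(/[ \t]+$/, "")   # drop trailing blanks; awk re-splits the record
            $1 = $1              # assigning a field rebuilds the record using OFS
            print
        }
    ' "$1"
}
```

With this FS, the indented server ID line has an empty first field, so the ID ends up in the second tab-separated column, the same column as the counts.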

ss_ss · February 8, 2013, 10:17am

Hi RudiC,

Below I'm pasting the script that runs on 2 different servers and generates the files 348.csv & 349.csv (which we are trying to merge):

#!/bin/bash
F=eai_js_348.csv
D=dm_eai_348.csv
cat > $F << EOF # Write a header to the file $F
Timestamp             BRM Servers
                           348
EOF
cat > $D << EOF # Write a header to the file $D
Timestamp             BRM Servers
                           348
EOF
# Write data every 1800 seconds.
while sleep 1800; do
    date +'%m/%d/%Y %H:%M' | tr -d \\n >> $F
    printf '          ' >> $F
    netstat | grep 16001 | wc -l >> $F
    date +'%m/%d/%Y %H:%M' | tr -d \\n >> $D
    printf '          ' >> $D
    ps -ef | grep -i DM_EAI | wc -l >> $D
done

So, I wanted to check whether something can be fine-tuned in this script itself, so that we can avoid reformatting the final output (after the merge), or at least get rid of the trailing spaces or the other formatting issues you mentioned in the previous post.

Thanks

RudiC · February 8, 2013, 11:02am

I can't see anything immediately jumping out at me. There are no trailing spaces in the above script...
You could try to print everything in one go to the respective files:

$ printf "%s\t\t%s\n" "$(date +'%m/%d/%Y %H:%M')" "$(netstat | grep 16001 | wc -l)" >> $F

, but I'm not sure that would really help...
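
One thing that might help more is writing the header and the data rows through the same printf format, so the server ID and the counts share a column from the start. A sketch only: the widths in FMT and the helper names write_header and write_sample are assumptions, not part of the original script. The arithmetic expansion is there to strip any padding that some wc implementations add.

```shell
FMT='%-18s%8s\n'   # assumed layout: timestamp column, then a right-aligned value

# write_header FILE SERVER_ID  (hypothetical helper; overwrites FILE)
write_header() {
    { printf "$FMT" 'Timestamp' 'Server'; printf "$FMT" '' "$2"; } > "$1"
}

# write_sample FILE COUNT  (hypothetical helper; appends one row)
write_sample() {
    printf "$FMT" "$(date +'%m/%d/%Y %H:%M')" "$(( $2 ))" >> "$1"
}
```

Inside the loop this could be called as: write_sample "$F" "$(netstat | grep 16001 | wc -l)".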