Hello,
I need help with the following.
I have a file that stores information as shown below.
It is a space-separated file, and I want to keep only unique records, based on the file name.
Also note that the file name in the last column sometimes contains a space, as in (abc_ xyz1_bc12_20140312_c.xlsx).
03/17/2014 10:35 AM 618 Admin\vick abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014 10:35 AM 618 Admin\david abc_xyz1_bc12_20140312_c.xls
03/17/2014 10:38 AM 618 Admin\samar pqr_tbd1_bc12_20140312_c.doc
03/17/2014 10:34 AM 618 Admin\titlis pqr_tbd1_bc12_20140312_c.zip
03/17/2014 10:39 AM 618 Admin\vick abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014 10:42 AM 618 Admin\titlis pqr_tbd1_bc12_20140312_c.zip
Thank you for your help.
Hello,
The following may help you.
awk '!a[$6]++' file_name
The output will be as follows.
03/17/2014 10:35 AM 618 Admin\vick abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014 10:35 AM 618 Admin\david abc_xyz1_bc12_20140312_c.xls
03/17/2014 10:38 AM 618 Admin\samar pqr_tbd1_bc12_20140312_c.doc
03/17/2014 10:34 AM 618 Admin\titlis pqr_tbd1_bc12_20140312_c.zip
Thanks,
R. Singh
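awk '{
  fname = ""
  # Join fields 6..NF so that a file name containing spaces is kept whole
  for (i = 6; i <= NF; i++)
    fname = (fname == "") ? $i : (fname OFS $i)
  # Print a line only the first time its file name is seen
  if (a[fname]++ == 0) print $0
}' file_name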
Thanks, it works fine. I need to redirect the output to a new file.
Hello,
Just use the > redirection operator at the end of the command, as follows.
awk '!a[$6]++' file_name > OUTPUT_file_name
Thanks,
R. Singh
akshay@Aix:/tmp$ cat file
03/17/2014 10:35 AM 618 Admin\vick abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014 10:35 AM 618 Admin\david abc_xyz1_bc12_20140312_c.xls
03/17/2014 10:38 AM 618 Admin\samar pqr_tbd1_bc12_20140312_c.doc
03/17/2014 10:34 AM 618 Admin\titlis pqr_tbd1_bc12_20140312_c.zip
03/17/2014 10:39 AM 618 Admin\vick abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014 10:42 AM 618 Admin\titlis pqr_tbd1_bc12_20140312_c.zip
akshay@Aix:/tmp$ awk 'match($0,/[^ ]* [^ ]*$/) && !x[substr($0,RSTART,RLENGTH)]++' file
03/17/2014 10:35 AM 618 Admin\vick abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014 10:35 AM 618 Admin\david abc_xyz1_bc12_20140312_c.xls
03/17/2014 10:38 AM 618 Admin\samar pqr_tbd1_bc12_20140312_c.doc
03/17/2014 10:34 AM 618 Admin\titlis pqr_tbd1_bc12_20140312_c.zip
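Ravinder, I think you did not look at the input carefully.
Field 6 holds only the part of the file name before the space, so two names that differ only after the space are treated as duplicates. See the example below.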
akshay@Aix:/tmp$ cat f
03/17/2014 10:35 AM 618 Admin\vick abc_ xyz2_bc12_20140312_c.xlsx
03/17/2014 10:35 AM 618 Admin\david abc_xyz1_bc12_20140312_c.xls
03/17/2014 10:38 AM 618 Admin\samar pqr_tbd1_bc12_20140312_c.doc
03/17/2014 10:34 AM 618 Admin\titlis pqr_tbd1_bc12_20140312_c.zip
03/17/2014 10:39 AM 618 Admin\vick abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014 10:42 AM 618 Admin\titlis pqr_tbd1_bc12_20140312_c.zip
akshay@Aix:/tmp$ awk '!a[$6]++' f
03/17/2014 10:35 AM 618 Admin\vick abc_ xyz2_bc12_20140312_c.xlsx
03/17/2014 10:35 AM 618 Admin\david abc_xyz1_bc12_20140312_c.xls
03/17/2014 10:38 AM 618 Admin\samar pqr_tbd1_bc12_20140312_c.doc
03/17/2014 10:34 AM 618 Admin\titlis pqr_tbd1_bc12_20140312_c.zip
akshay@Aix:/tmp$ awk 'match($0,/[^ ]* [^ ]*$/) && !x[substr($0,RSTART,RLENGTH)]++' f
03/17/2014 10:35 AM 618 Admin\vick abc_ xyz2_bc12_20140312_c.xlsx
03/17/2014 10:35 AM 618 Admin\david abc_xyz1_bc12_20140312_c.xls
03/17/2014 10:38 AM 618 Admin\samar pqr_tbd1_bc12_20140312_c.doc
03/17/2014 10:34 AM 618 Admin\titlis pqr_tbd1_bc12_20140312_c.zip
03/17/2014 10:39 AM 618 Admin\vick abc_ xyz1_bc12_20140312_c.xlsx
akshay@Aix:/tmp$ awk '{match($0,/[^ ]* [^ ]*$/);print substr($0,RSTART,RLENGTH)}' f
abc_ xyz2_bc12_20140312_c.xlsx
abc_xyz1_bc12_20140312_c.xls
pqr_tbd1_bc12_20140312_c.doc
pqr_tbd1_bc12_20140312_c.zip
abc_ xyz1_bc12_20140312_c.xlsx
pqr_tbd1_bc12_20140312_c.zip
akshay@Aix:/tmp$ awk '{print $6}' f
abc_
abc_xyz1_bc12_20140312_c.xls
pqr_tbd1_bc12_20140312_c.doc
pqr_tbd1_bc12_20140312_c.zip
abc_
pqr_tbd1_bc12_20140312_c.zip
ctsgnb
March 17, 2014, 9:24am
7
In my opinion, instead of complicating the code to deal with spaces, I would rename the files so that their names contain no spaces.
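If renaming the files at the source is not an option, the same idea could be applied to the listing itself. The following is only a sketch under assumptions: it treats the first five fields as date, time, AM/PM, size and owner, treats everything from field 6 onward as the file name, and closes up any spaces inside that name. The function name strip_name_spaces is made up for this illustration.

```shell
# strip_name_spaces: hypothetical helper, not part of any tool.
# Reads the listing from stdin or from the files given as arguments.
# Assumption: fields 1-5 are fixed columns; fields 6..NF are pieces of the file name.
strip_name_spaces() {
  awk '
  {
    name = $6
    for (i = 7; i <= NF; i++) name = name $i   # join the pieces with no space
    print $1, $2, $3, $4, $5, name
  }' "$@"
}
```

After this step the file name is always a single field, so the simple filter from earlier in the thread can be used on the result, for example: strip_name_spaces file_name | awk '!a[$6]++'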
Hello,
I am trying to understand what you explained, but I could not follow it.
If file names containing a space are missed during the sort, that is not a problem, because on the server we only receive files without spaces.
Thanks again for the help and explanation.
Please look at the answer once again.
Regards,
Akshay
One more problem: how can we keep the latest record while doing the sorting?
03/17/2014 10:35 AM 618 Admin\vick abc_ xyz2_bc12_20140312_c.xlsx
03/31/2014 10:35 AM 618 Admin\vick abc_ xyz2_bc12_20140312_c.xlsx
---------- Post updated at 05:13 AM ---------- Previous update was at 04:32 AM ----------
Hello,
Akshay,
now I understand what you were trying to explain.
But how can we keep the latest record while sorting, rather than the first of the duplicates?
03/17/2014 10:35 AM 618 Admin\vick abc_ xyz2_bc12_20140312_c.xlsx
03/31/2014 10:35 AM 618 Admin\vick abc_ xyz2_bc12_20140312_c.xlsx
I need only
03/31/2014 10:35 AM 618 Admin\vick abc_ xyz2_bc12_20140312_c.xlsx
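One possible way to keep the newest record per file name is sketched below. It is not taken from any answer in this thread, and it rests on assumptions: field 1 is a date in MM/DD/YYYY form, field 2 a time in HH:MM form, field 3 is AM or PM, and everything from field 6 onward is the file name. The function name keep_latest is made up for this illustration.

```shell
# keep_latest: hypothetical helper, not part of any tool.
# Reads the listing from stdin or from the files given as arguments.
# Assumption: fields are date (MM/DD/YYYY), time (HH:MM), AM/PM, size, owner, file name.
keep_latest() {
  awk '
  {
    # Rebuild the file name from field 6 onward so embedded spaces are kept.
    fname = $6
    for (i = 7; i <= NF; i++) fname = fname " " $i

    # Build a sortable key YYYYMMDDHHMM, converting the hour to 24-hour time.
    split($1, d, "/"); split($2, t, ":")
    hour = t[1] % 12
    if ($3 == "PM") hour += 12
    key = sprintf("%04d%02d%02d%02d%02d", d[3], d[1], d[2], hour, t[2])

    # Remember the order in which file names first appear.
    if (!(fname in best)) order[++n] = fname

    # Keep this line if it is the first or the newest seen for this name.
    if (!(fname in best) || key >= best[fname]) {
      best[fname] = key
      line[fname] = $0
    }
  }
  END { for (i = 1; i <= n; i++) print line[order[i]] }
  ' "$@"
}
```

It could be used as keep_latest file_name > OUTPUT_file_name. Because the timestamps are compared rather than the line positions, the newest record should be kept even if the input is not in chronological order; when two records carry the same timestamp, the later line is kept.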