Sort a file content with space

Hello,
I need help with the following.

I have a file that stores information as shown below.
It is a space-separated file, and I want to keep only one record per file name.
Note that the file name in the last column sometimes contains a space, for example (abc_ xyz1_bc12_20140312_c.xlsx).

03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014  10:35 AM  618 Admin\david     abc_xyz1_bc12_20140312_c.xls
03/17/2014  10:38 AM  618 Admin\samar    pqr_tbd1_bc12_20140312_c.doc
03/17/2014  10:34 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip
03/17/2014  10:39 AM  618 Admin\vick      abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014  10:42 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip

Thank you for your help.

Hello,

The following may help you.

awk '!a[$6]++' file_name

Output will be as follows.

03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014  10:35 AM  618 Admin\david     abc_xyz1_bc12_20140312_c.xls
03/17/2014  10:38 AM  618 Admin\samar    pqr_tbd1_bc12_20140312_c.doc
03/17/2014  10:34 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip

Thanks,
R. Singh

awk '{fname = "";
 # join fields 6..NF so that a file name containing spaces is used as one key
 for(i = 6; i <= NF; i++)
  {fname = (fname == "") ? $i : (fname OFS $i)};
 if(a[fname]++ == 0) {print $0}}' file_name

Thanks, it works fine. I need to redirect the output to a new file.

Hello,

Just add the > redirection operator at the end of the command, as follows.

awk '!a[$6]++' file_name > OUTPUT_file_name

Thanks,
R. Singh
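
If the goal is to keep the unique records in the same file rather than a new one, redirecting straight back onto the input would truncate it before awk reads it. A minimal sketch of one way around that, writing to a temporary file first and moving it over the original only if awk succeeds. The names records.txt and records.tmp are made up for this example, and the key is built from fields 6..NF (rather than $6 alone) so that a file name with a space is treated as one name.

```shell
# Sample data in a scratch directory; records.txt is a hypothetical name.
tmpdir=$(mktemp -d)
cat > "$tmpdir/records.txt" <<'EOF'
03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014  10:35 AM  618 Admin\david     abc_xyz1_bc12_20140312_c.xls
03/17/2014  10:38 AM  618 Admin\samar    pqr_tbd1_bc12_20140312_c.doc
03/17/2014  10:34 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip
03/17/2014  10:39 AM  618 Admin\vick      abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014  10:42 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip
EOF

# Keep the first record for each file name, write to a temporary file,
# and replace the original only if awk exited successfully.
awk '{
  fname = $6
  for (i = 7; i <= NF; i++) fname = fname " " $i   # rebuild names with spaces
  if (!seen[fname]++) print
}' "$tmpdir/records.txt" > "$tmpdir/records.tmp" &&
  mv "$tmpdir/records.tmp" "$tmpdir/records.txt"

cat "$tmpdir/records.txt"
```

With the sample above, records.txt ends up holding the first vick, david, samar and titlis lines.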

akshay@Aix:/tmp$ cat file
03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014  10:35 AM  618 Admin\david     abc_xyz1_bc12_20140312_c.xls
03/17/2014  10:38 AM  618 Admin\samar    pqr_tbd1_bc12_20140312_c.doc
03/17/2014  10:34 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip
03/17/2014  10:39 AM  618 Admin\vick      abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014  10:42 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip

akshay@Aix:/tmp$ awk  'match($0,/[^ ]* [^ ]*$/) && !x[substr($0,RSTART,RLENGTH)]++' file
03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014  10:35 AM  618 Admin\david     abc_xyz1_bc12_20140312_c.xls
03/17/2014  10:38 AM  618 Admin\samar    pqr_tbd1_bc12_20140312_c.doc
03/17/2014  10:34 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip

Ravinder, I think you did not look at the input closely: some file names contain a space, so $6 holds only the first part of the name.

See the example below carefully.

akshay@Aix:/tmp$ cat f
03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz2_bc12_20140312_c.xlsx
03/17/2014  10:35 AM  618 Admin\david     abc_xyz1_bc12_20140312_c.xls
03/17/2014  10:38 AM  618 Admin\samar    pqr_tbd1_bc12_20140312_c.doc
03/17/2014  10:34 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip
03/17/2014  10:39 AM  618 Admin\vick      abc_ xyz1_bc12_20140312_c.xlsx
03/17/2014  10:42 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip

akshay@Aix:/tmp$ awk '!a[$6]++' f
03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz2_bc12_20140312_c.xlsx
03/17/2014  10:35 AM  618 Admin\david     abc_xyz1_bc12_20140312_c.xls
03/17/2014  10:38 AM  618 Admin\samar    pqr_tbd1_bc12_20140312_c.doc
03/17/2014  10:34 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip

akshay@Aix:/tmp$ awk  'match($0,/[^ ]* [^ ]*$/) && !x[substr($0,RSTART,RLENGTH)]++' f
03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz2_bc12_20140312_c.xlsx
03/17/2014  10:35 AM  618 Admin\david     abc_xyz1_bc12_20140312_c.xls
03/17/2014  10:38 AM  618 Admin\samar    pqr_tbd1_bc12_20140312_c.doc
03/17/2014  10:34 AM  618 Admin\titlis      pqr_tbd1_bc12_20140312_c.zip
03/17/2014  10:39 AM  618 Admin\vick      abc_ xyz1_bc12_20140312_c.xlsx

akshay@Aix:/tmp$ awk  '{match($0,/[^ ]* [^ ]*$/);print substr($0,RSTART,RLENGTH)}' f
abc_ xyz2_bc12_20140312_c.xlsx
 abc_xyz1_bc12_20140312_c.xls
 pqr_tbd1_bc12_20140312_c.doc
 pqr_tbd1_bc12_20140312_c.zip
abc_ xyz1_bc12_20140312_c.xlsx
 pqr_tbd1_bc12_20140312_c.zip

akshay@Aix:/tmp$ awk '{print $6}' f
abc_
abc_xyz1_bc12_20140312_c.xls
pqr_tbd1_bc12_20140312_c.doc
pqr_tbd1_bc12_20140312_c.zip
abc_
pqr_tbd1_bc12_20140312_c.zip

In my opinion, instead of complicating the code to deal with spaces, I would rename the files so that their names contain no spaces.

Hello,
I am trying to understand what you explained, but I could not follow it.

It is not a problem if the sort misses file names that contain a space, because on the server we only receive files without spaces.

Thanks again for the help and explanation.

Please look at the answer once again :)

Regards,
Akshay


One more problem: how can we keep the latest record while sorting?

03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz2_bc12_20140312_c.xlsx
03/31/2014  10:35 AM  618 Admin\vick       abc_ xyz2_bc12_20140312_c.xlsx

---------- Post updated at 05:13 AM ---------- Previous update was at 04:32 AM ----------

Hello Akshay,
Now I understand what you were trying to explain.
But how can we keep the latest record while sorting, rather than the first of the duplicates?

03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz2_bc12_20140312_c.xlsx
03/31/2014  10:35 AM  618 Admin\vick       abc_ xyz2_bc12_20140312_c.xlsx

I need only
03/31/2014 10:35 AM 618 Admin\vick abc_ xyz2_bc12_20140312_c.xlsx
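
One possible approach, as a sketch only: if the file is already in chronological order (an assumption, the thread does not say so), then the latest record for a name is simply the last line that carries it. The awk below remembers the last line seen for each file name and prints them at the end, in the order the names first appeared. The samar line is added here only to show that other names are left alone.

```shell
# Sample input, assumed to be in chronological order (oldest first).
input='03/17/2014  10:35 AM  618 Admin\vick       abc_ xyz2_bc12_20140312_c.xlsx
03/31/2014  10:35 AM  618 Admin\vick       abc_ xyz2_bc12_20140312_c.xlsx
03/17/2014  10:38 AM  618 Admin\samar    pqr_tbd1_bc12_20140312_c.doc'

latest=$(printf '%s\n' "$input" | awk '{
  fname = $6
  for (i = 7; i <= NF; i++) fname = fname " " $i   # rebuild names with spaces
  if (!(fname in last)) order[++n] = fname          # remember first-seen order
  last[fname] = $0                                  # later lines overwrite earlier ones
}
END { for (i = 1; i <= n; i++) print last[order[i]] }')

printf '%s\n' "$latest"
```

If the input is not in date order, the last line seen is not necessarily the newest, and the date and time columns would have to be compared instead.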