Hi,
I am using the shell to do some file manipulation here.
Input - input.txt
"2006/2007", "string1a","string2v","stringf3"
"2006/2007", "string12b","string30c","string10d"
"2006/2007", "string22","string22","string11"
"2007/2008", "string1a","string2v","stringf3"
"2007/2008", "string12","string30","string10"
"2007/2008", "string22","string222","string111"
"2007/2008", "string100","string111","string444"
"2007/2008", "string134","string245","string389"
"2008/2009", "string1","string2","string3"
"2008/2009", "string12","string30","string10"
"2008/2009", "string22","string222","string111"
"2008/2009", "string1000","string1111","string4444"
"2008/2009", "string1234","string2456","string3789"
I need the input split into three (or however many) files based on the unique values in the first column.
Output
File 2006-2007.txt
"2006/2007", "string1a","string2v","stringf3"
"2006/2007", "string12b","string30c","string10d"
"2006/2007", "string22","string22","string11"
File 2007-2008.txt
"2007/2008", "string1a","string2v","stringf3"
"2007/2008", "string12","string30","string10"
"2007/2008", "string22","string222","string111"
"2007/2008", "string100","string111","string444"
"2007/2008", "string134","string245","string389"
File 2008-2009.txt
"2008/2009", "string1","string2","string3"
"2008/2009", "string12","string30","string10"
"2008/2009", "string22","string222","string111"
"2008/2009", "string1000","string1111","string4444"
"2008/2009", "string1234","string2456","string3789"
I am trying to take the unique values from the first column using cut -d, -f1 | sort | uniq to create the list of files, and then using the code below to populate the output files.
for id in $(cut -d, -f1 input.txt | sort | uniq); do
    grep "$id" input.txt > "$id".txt
done
But it is taking a very long time; the cut alone takes about 7 minutes because the input file is huge.
Please suggest any other approach which can improve the turnaround time.
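I was also wondering whether a single awk pass over the file would be faster. Something along these lines is what I have in mind (a rough, untested sketch; it assumes the first field always looks like "YYYY/YYYY" and turns the slash into a dash so it can be used as a filename):

awk -F',' '
{
    key = $1
    gsub(/"/, "", key)     # strip the surrounding quotes from the key
    gsub(/\//, "-", key)   # "2006/2007" -> "2006-2007" so it is a valid filename
    print > (key ".txt")   # awk keeps the file open, so later lines for the same key are appended
}' input.txt

Would that be the right direction?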