Separate output based on dates

Jimmy_the_tulip · September 25, 2014, 9:44am

Hi guys!

First time poster on here, was wondering if someone could help with a problem which I'm facing.

Is it possible to use awk or sed to separate the below output based on TRANSACTION_DATE? The output would then feed into DataStage.

Contents of CSV file:

TRANSACTION_DATE_TIME,TRANSACTION_DATE,TRANSACTION_ID,STATUS
4/8/2014 1:50,4/8/2014,MTS_68,COMPLETE    
4/8/2014 2:30,4/8/2014,MTS_99,COMPLETE    
5/8/2014 5:10,5/8/2014,MTS_100,COMPLETE    
5/8/2014 5:26,5/8/2014,MTS_101,COMPLETE    
5/8/2014 5:30,5/8/2014,MTS_102,COMPLETE    
6/8/2014 8:05,6/8/2014,MTS_103,COMPLETE    
6/8/2014 8:25,7/8/2014,MTS_104,COMPLETE

So
Result set 1:

TRANSACTION_DATE_TIME,TRANSACTION_DATE,TRANSACTION_ID,STATUS
4/8/2014 1:50,4/8/2014,MTS_68,COMPLETE    
4/8/2014 2:30,4/8/2014,MTS_99,COMPLETE

Result set 2:

5/8/2014 5:10,5/8/2014,MTS_100,COMPLETE    
5/8/2014 5:26,5/8/2014,MTS_101,COMPLETE    
5/8/2014 5:30,5/8/2014,MTS_102,COMPLETE

Result set 3:

6/8/2014 8:05,6/8/2014,MTS_103,COMPLETE

etc...

Thank you kindly.

jim_mcnamara · September 25, 2014, 9:52am

Do you need separate output files?- I do not know what Datastage uses.

Jimmy_the_tulip · September 25, 2014, 10:10am

Hi jim mcnamara,

No, just to be outputted into the unix command line.

RavinderSingh13 · September 25, 2014, 10:17am

jimmy\_the_tulip:

Hi guys!

First time poster on here, was wondering if someone could help with a problem which I'm facing.

Is it possible to use awk or sed to separate the below output based on TRANSACTION_DATE? The output would then feed into DataStage.

Contents of CSV file:
TRANSACTION_DATE_TIME,TRANSACTION_DATE,TRANSACTION_ID,STATUS
4/8/2014 1:50,4/8/2014,MTS_68,COMPLETE    
4/8/2014 2:30,4/8/2014,MTS_99,COMPLETE    
5/8/2014 5:10,5/8/2014,MTS_100,COMPLETE    
5/8/2014 5:26,5/8/2014,MTS_101,COMPLETE    
5/8/2014 5:30,5/8/2014,MTS_102,COMPLETE    
6/8/2014 8:05,6/8/2014,MTS_103,COMPLETE    
6/8/2014 8:25,7/8/2014,MTS_104,COMPLETE
So
Result set 1:
TRANSACTION_DATE_TIME,TRANSACTION_DATE,TRANSACTION_ID,STATUS
4/8/2014 1:50,4/8/2014,MTS_68,COMPLETE    
4/8/2014 2:30,4/8/2014,MTS_99,COMPLETE 
Result set 2:
5/8/2014 5:10,5/8/2014,MTS_100,COMPLETE    
5/8/2014 5:26,5/8/2014,MTS_101,COMPLETE    
5/8/2014 5:30,5/8/2014,MTS_102,COMPLETE    
Result set 3:
6/8/2014 8:05,6/8/2014,MTS_103,COMPLETE   
etc...

Thank you kindly.

Hello,

Could you please try following, following will help you to create files according to first column.

awk -F" |," 'BEGIN{i=1;}{if(A != $1){i++}} {print $0 >> i".txt";A=$1}' Input_file

In this case 3 files will be created as only 3 unique dates are there in given input.

cat 4.txt
6/8/2014 8:05,6/8/2014,MTS_103,COMPLETE
6/8/2014 8:25,7/8/2014,MTS_104,COMPLETE
 
cat 3.txt
5/8/2014 5:10,5/8/2014,MTS_100,COMPLETE
5/8/2014 5:26,5/8/2014,MTS_101,COMPLETE
5/8/2014 5:30,5/8/2014,MTS_102,COMPLETE
 
cat 2.txt
4/8/2014 1:50,4/8/2014,MTS_68,COMPLETE
4/8/2014 2:30,4/8/2014,MTS_99,COMPLETE

EDIT: Just saw OP wants to show on command line only so following may help.

sort -k1 Input_file | awk -F" |," 'BEGIN{i=1;}{if(A != $1 && A){print "**************************************"}} {print $0;A=$1}'

Output will be as follows.

4/8/2014 1:50,4/8/2014,MTS_68,COMPLETE
4/8/2014 2:30,4/8/2014,MTS_99,COMPLETE
**************************************
5/8/2014 5:10,5/8/2014,MTS_100,COMPLETE
5/8/2014 5:26,5/8/2014,MTS_101,COMPLETE
5/8/2014 5:30,5/8/2014,MTS_102,COMPLETE
**************************************
6/8/2014 8:05,6/8/2014,MTS_103,COMPLETE
6/8/2014 8:25,7/8/2014,MTS_104,COMPLETE

You can print new line etc in place of **** too. Hope this helps.

EDIT: Also if you need to take date as variable then following may help.

awk -F" |," -vs1="4/8/2014" '{if($1 ~ s1){print $0}}' Input_file

Output will be as follows.

4/8/2014 2:30,4/8/2014,MTS_99,COMPLETE
4/8/2014 1:50,4/8/2014,MTS_68,COMPLETE

Thanks,
R. Singh

Akshay_Hegde · September 25, 2014, 10:39am

Try :

 awk -F'[ ,]' 'FNR==1{h=$0;next}{f=sprintf("%s.txt",$1); gsub(/\//,"_",f); if(f in F){ print >> f } else { print h RS $0 > f; F[f] } close(f) }' file

---------- Post updated at 09:09 PM ---------- Previous update was at 09:07 PM ----------

If you want to try this on a Solaris/SunOS system, change awk at the start of the script to /usr/xpg4/bin/awk or /usr/xpg6/bin/awk or nawk .

Jimmy_the_tulip · September 25, 2014, 8:32pm

Hi RavinderSingh13,

This was exactly what I was after, thanks a lot!