Hi,
I am currently using the sed and awk commands to filter a file that has multiple sets of data in different columns. An example of part of the file I am filtering is as follows:
Sat Oct 2 07:42:45 2010 01:33:46 R1_CAR_12.34
Sun Oct 3 13:09:53 2010 00:02:34 R2_BUS_56.78
Sun Oct 3 21:11:39 2010 00:43:21 R3_TRAIN_COACH_90.12
Mon Oct 4 06:07:10 2010 00:01:50 R4_TRAIN_CARRAIGE_34.56X
When I filter the file, I get the following result:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,TRAIN,
Mon,Oct,4,2010,00:01:50,TRAIN,X
The sed and awk commands I am using are as follows:
sed 's/[^ \t][^ \t]*[ \t]//4;s/[^ \t_]*_//;s/_.*\(.\)$/ \1/;s/[^X]$//' | awk '{print $1","$2","$3","$4","$5","$6","$7}'
I am trying to figure out how to filter the data so that, for example, instead of getting:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,TRAIN,
Mon,Oct,4,2010,00:01:50,TRAIN,X
I would like to get:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,COACH,
Mon,Oct,4,2010,00:01:50,CARRAIGE,X
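A minimal sketch of one way to get that output with awk alone, offered as an illustration rather than a definitive answer. It assumes every record has exactly seven whitespace-separated fields, that the wanted name is always the underscore-separated piece just before the number, and that the only flag is a trailing X. The function name filter_records is made up for this example.

```shell
# Hypothetical helper name; reads records on stdin, writes CSV on stdout.
filter_records() {
  awk -v OFS=',' '{
    # Split the last field, e.g. R3_TRAIN_COACH_90.12, on underscores.
    n = split($7, parts, "_")
    # Keep a trailing X as a separate flag column, otherwise leave it empty.
    flag = ($7 ~ /X$/) ? "X" : ""
    # Skip $4 (the first time stamp); parts[n-1] is the name just before the number.
    print $1, $2, $3, $5, $6, parts[n-1], flag
  }'
}

# Example run on two of the sample records.
printf '%s\n' \
  'Sat Oct 2 07:42:45 2010 01:33:46 R1_CAR_12.34' \
  'Mon Oct 4 06:07:10 2010 00:01:50 R4_TRAIN_CARRAIGE_34.56X' | filter_records
```

Because the name is taken by position from the end, it should not matter how many pieces (such as TRAIN_) come before it.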
Could I use the sed command twice, so that I would get:
Sat Oct 2 07:42:45 2010 01:33:46 CAR
Sun Oct 3 13:09:53 2010 00:02:34 BUS
Sun Oct 3 21:11:39 2010 00:43:21 TRAIN_COACH
Mon Oct 4 06:07:10 2010 00:01:50 TRAIN_CARRAIGE X
first, and then use the sed command again to remove the "TRAIN_" part, to get:
Sat Oct 2 07:42:45 2010 01:33:46 CAR
Sun Oct 3 13:09:53 2010 00:02:34 BUS
Sun Oct 3 21:11:39 2010 00:43:21 COACH
Mon Oct 4 06:07:10 2010 00:01:50 CARRAIGE X
This is only a suggestion; there is probably a much better method.
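For what it is worth, the two-pass sed idea described above might look roughly like the sketch below. It assumes the numeric part contains only digits and dots, that the optional flag is a literal X at the end of the line, and that no other field on the line contains an underscore. The function names strip_affixes and strip_leading_parts are invented for this example.

```shell
# Pass 1 (hypothetical name): drop the leading R1_ style prefix and the
# trailing number, keeping a trailing X as a separate word.
strip_affixes() {
  sed -e 's/[^ _]*_//' -e 's/_[0-9.]*$//' -e 's/_[0-9.]*X$/ X/'
}

# Pass 2 (hypothetical name): drop every remaining piece that is followed
# by an underscore, so TRAIN_COACH becomes COACH.
strip_leading_parts() {
  sed 's/[^ _]*_//g'
}

# Example run on two of the sample records.
printf '%s\n' \
  'Sun Oct 3 21:11:39 2010 00:43:21 R3_TRAIN_COACH_90.12' \
  'Mon Oct 4 06:07:10 2010 00:01:50 R4_TRAIN_CARRAIGE_34.56X' |
  strip_affixes | strip_leading_parts
```

Both passes could be merged into a single sed call with more -e expressions; they are kept apart here only to mirror the two steps described above.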
Unfortunately, I am new to Unix, so I am only just getting used to all the commands.
If I have made anything unclear, please let me know and I will try to explain the problem better.
Any help would be greatly appreciated.
Thanks in advance.