My file (the output of an experiment) starts off looking like this:
_____________________________________________________________
Subjects incorporated to date: 001
Data file started on machine PKSHS260-05CP
**********************************************************************
Subject 1, 11/30/2017 16:07:17 on PKSHS260-05CP, DMDX 5.1.5.3, Windows 6.1.7601, refresh 16.67ms, ID ik60607
! DMDX is running in auto mode (automatically determined raster sync)
! Video Mode 1280,1024,32,60
! Item File <C:\Users\XXXXXXX\Desktop\Experiment\Version2.rtf>
Item 11213, 1372.44
1372.44,+Right Ctrl
Item 11213, 1052.90
1052.90,+CTRL
Item 114109, -1102.03
1102.03,+Right Ctrl
Item 11131, 721.06
721.06,+Right Ctrl
Item 111325, 1075.30
1075.30,+Right Ctrl
I used the following:
egrep '^(Item|Subject|!)' filename
to get it like this
Subjects incorporated to date: 001
Subject 1, 11/30/2017 16:07:17 on PKSHS260-05CP, DMDX 5.1.5.3, Windows 6.1.7601, refresh 16.67ms, ID ik60607
! DMDX is running in auto mode (automatically determined raster sync)
! Video Mode 1280,1024,32,60
! Item File <C:\Users\XXXXX\Desktop\Experiment\Version2.rtf>
Item 11213, 1372.44
Item 11213, 1052.90
Item 114109, -1102.03
Item 11131, 721.06
Now I want to use something like
awk 'NF > 5' | awk '{print $NF}'
to extract the ik60607 (which is an identifier for that participant in the experiment)
and produce something like this
ik60607,Item 11213, 1372.44
ik60607,Item 11213, 1052.90
ik60607,Item 114109, -1102.03
ik60607,Item 11131, 721.06
etc.
or ideally do something like this operation twice (using the rtf filename) to produce
Version2.rtf,ik60607,Item 11213, 1372.44
Version2.rtf,ik60607,Item 11213, 1052.90
Version2.rtf,ik60607,Item 114109, -1102.03
Version2.rtf,ik60607,Item 11131, 721.06
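For what it is worth, the output above could perhaps be produced in a single awk pass that remembers the ID from the Subject line and the file name from the "! Item File" line, then prefixes every Item line with both. This is only a minimal sketch: the function name tag_items is made up for illustration, and the patterns assume the layout shown in the sample (the ID is the last field of the Subject line, and the item file is a Windows path inside < >).

```shell
# Hypothetical helper: prefix each "Item" line with the item file name and subject ID.
tag_items() {
    awk '
        # Strip a trailing carriage return, in case the file has Windows line endings.
        { sub(/\r$/, "") }

        # Subject header, e.g. "Subject 1, ... ID ik60607": the ID is the last field.
        # "Subjects incorporated ..." does not match because of the space after "Subject".
        /^Subject [0-9]+,/ { id = $NF }

        # Item file line: reduce "<C:\...\Version2.rtf>" to "Version2.rtf".
        /^! Item File/ {
            rtf = $0
            sub(/^! Item File </, "", rtf)   # drop the label and the opening "<"
            sub(/>.*$/, "", rtf)             # drop the closing ">"
            sub(/^.*\\/, "", rtf)            # drop everything up to the last backslash
        }

        # Item lines are printed unchanged, with the two remembered values in front.
        /^Item / { print rtf "," id "," $0 }
    ' "$@"
}
```

It would be called as `tag_items filename > tagged.txt` (the output name is arbitrary). Because the ID is updated on every Subject line, a file containing several subjects should get the right ID on each block of Item lines.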
I think I might need to use 'paste': make a file with the ID printed once for every line of the original file, paste the two together, and then delete the unwanted bits. I have tried using awk but got confused by record and field separators, and I do not understand what I have read in the manual pages. I am not in any way skilled with this. I was forced to do some of it twice in my life, for two-month periods 10 and 25 years ago while working with large dictionaries, which is why I even tried. Please help a very naive non-programmer (it is pre-processing data for even more hapless students).
I have to do this on 108 (currently separate) files, all with different, irregular names. I am not sure whether using cat first to join them up would make things even worse (there would be record and field separators that way, but it seems even more complex).
Any advice gratefully received.
___________________________________________________________