Header1
Header2
*TM*
Data record 1
Data record 2
Data record n
*TM*
Trailer1
Trailer2
*TM*
*TM*
EOF
I have to read the data file names one by one and extract only the data records from each file into a single file, "complete.txt".
I have written a shell script for this, but my manager has suggested that I simplify the code.
Is there any simple logic using awk to accomplish the same?
1) Radoulov - Your code is amazing, but what is the difference between the two options you have given? The note "If the filenames contain no spaces" was not clear to me. The first option works for my requirement.
2) bobbygsk - Your code also works for my requirement, but my concern here is performance. In the real environment the data files will have millions of records, so speeding up the script is essential.
My code looks simple and easy to understand.
I am at an intermediate stage of Unix scripting.
I do not know how well my script performs.
It is better to go with AWK.
You need to try out the alternatives before picking the one that works most efficiently, especially since you need to process millions of records, which will not be an easy feat.
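A rough way to try out alternatives is to run each one over the same generated test file and compare both the elapsed time and the output. The sketch below is only an illustration: the file names (big.dat, out_awk.txt, out_sh.txt), the record count of 20000, and the plain shell read loop are assumptions made up for the test, not code from this thread, and `date +%s` only gives whole seconds, so a much larger file is needed for meaningful numbers.

```shell
#!/bin/sh
# Hypothetical timing harness; every name below is made up for the test.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate a sample data file in the layout shown at the top of the thread.
awk -v n=20000 'BEGIN {
  print "Header1"; print "*TM*"
  for (i = 1; i <= n; i++) print "Data record " i
  print "*TM*"; print "Trailer1"; print "*TM*"; print "*TM*"; print "EOF"
}' > big.dat

# Alternative 1: awk keeps only lines between the first and second *TM*.
start=$(date +%s)
awk '$0 == "*TM*" { tm++; next } tm == 1' big.dat > out_awk.txt
end=$(date +%s)
echo "awk:        $((end - start)) s"

# Alternative 2: the same rule as a plain shell read loop.
start=$(date +%s)
tm=0
while IFS= read -r line; do
  if [ "$line" = '*TM*' ]; then
    tm=$((tm + 1))
    continue
  fi
  if [ "$tm" -eq 1 ]; then
    printf '%s\n' "$line"
  fi
done < big.dat > out_sh.txt
end=$(date +%s)
echo "shell loop: $((end - start)) s"

# Both alternatives must produce identical output before timing matters.
awk_count=$(wc -l < out_awk.txt | tr -d ' ')
if cmp -s out_awk.txt out_sh.txt; then same=yes; else same=no; fi
echo "records: $awk_count, outputs identical: $same"
```

Raise n until the difference between the two timings is clearly visible, and run each alternative several times before drawing a conclusion.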
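awk '{
  cmd = sprintf("cat \"%s\"", $0)   # quote the name so spaces survive
  re = "*TM*"                       # marker line, compared literally
  a[re] = 0                         # reset the counter for each data file
  while ((cmd | getline l) > 0) {   # > 0 stops at EOF and on read errors
    if (l == re)
      a[re]++
    if (a[re] == 1 && l != re)      # between the first and second *TM*
      print l
  }
  close(cmd)                        # release the pipe before the next file
}' list_file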
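Since performance is the concern, one variant worth testing reads each data file directly with `getline < file` instead of starting a cat process per file, and stops reading a file as soon as the second *TM* is seen. This is a sketch under assumptions, not a measured improvement: the sample file names and contents below are made up so the snippet can be run on its own.

```shell
#!/bin/sh
# Sketch only: builds two tiny sample files so the awk part can be run as-is.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' Header1 '*TM*' 'rec A1' 'rec A2' '*TM*' Trailer1 '*TM*' '*TM*' EOF > 'data one.txt'
printf '%s\n' Header1 '*TM*' 'rec B1' '*TM*' Trailer1 '*TM*' '*TM*' EOF > data2.txt
printf '%s\n' 'data one.txt' data2.txt > list_file

# For each name in list_file, read that file directly (no cat process)
# and print only the lines between the first and second *TM* markers.
awk '{
  file = $0
  tm = 0
  while ((getline line < file) > 0) {
    if (line == "*TM*") {
      if (++tm > 1) break      # data section is over, skip the trailer
      continue
    }
    if (tm == 1) print line
  }
  close(file)                  # avoid running out of open files
}' list_file > complete.txt

result=$(cat complete.txt)
printf '%s\n' "$result"
```

Because the whole line from list_file is used as the file name, names containing spaces work here without extra quoting.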
The data files and the list file will not be in the same location. Also, the list file will contain only the names of the data files to be selected, not their location, so we have to declare the location explicitly.
Ex: FTPIN/ is the location of the data files, FILES/ is the location of the list file, and FTPOUT/ is the location of the complete.txt file.
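As a hedged illustration of that layout, the input directory can be passed to awk with `-v` and prepended to each name read from the list file. The awk variable name indir, the sample file data1.txt, and the use of `getline < file` in place of cat are assumptions made for this sketch; the three directories are created under a temporary directory only so that it runs on its own.

```shell
#!/bin/sh
# Sketch only: recreates the three directories under a temporary location.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir FTPIN FILES FTPOUT

printf '%s\n' Header1 '*TM*' 'Data record 1' 'Data record 2' '*TM*' Trailer1 '*TM*' '*TM*' EOF > FTPIN/data1.txt
printf '%s\n' data1.txt > FILES/list_file   # names only, no directory

# indir is a hypothetical variable holding the data file location.
awk -v indir="FTPIN/" '{
  file = indir $0              # prepend the location to the bare name
  tm = 0
  while ((getline line < file) > 0) {
    if (line == "*TM*") { tm++; continue }
    if (tm == 1) print line
  }
  close(file)
}' FILES/list_file > FTPOUT/complete.txt

result=$(cat FTPOUT/complete.txt)
printf '%s\n' "$result"
```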
I have to mention these locations in your code as below: