I want to read all similar files and extract the info; that is, the output would contain that info for each of the files in the same format. The expected output for one file is shown below.
Here is a bash solution with associative arrays; it requires bash 4.
#!/bin/bash
# Requires bash 4 for associative arrays.
cols="node tname pfile metafile"
space=20
declare -A C

# print the header
headersep=""
for col in $cols
do
    printf "${headersep}%${space}s" "$col"
    headersep=" | "
done
printf "\n"

# loop over the files
for jfile in job[0-9]*.ksh
do
    # loop over the lines, collect values in hash C[]
    while IFS="=" read key val
    do
        # skip lines without a value (testing ${#val} would never be empty,
        # since the length of an empty string is the non-empty string "0")
        [ -z "$val" ] && continue
        case $key in
            *"&node")      C[node]=$val ;;
            *"&tname")     C[tname]=$val ;;
            *"&pfile"*)    C[pfile]=${C[pfile]}${C[pfile]:+,}${val##*/} ;;
            *"&metafile"*) C[metafile]=${C[metafile]}${C[metafile]:+,}${val##*/} ;;
        esac
    done < "$jfile"
    # print and clear C[]
    sep=""
    for col in $cols
    do
        printf "${sep}%${space}s" "${C[$col]}"
        unset C[$col]
        sep=$headersep
    done
    printf "\n"
done
It is not yet complete, but once you understand how it works, it is easy to expand, for example by combining it with awk in the script above.
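For reference, roughly the same key/value extraction can be sketched in awk. The "&key=value" line format and the column names are taken from the bash version above; the sample input file generated here is hypothetical, and only two of the four columns are shown to keep it short:

```shell
# Hypothetical sample input in the same "...&key=value" style the bash
# version parses.
printf 'a&node=n1\nb&tname=t1\n' > job1.ksh

# awk analogue of the bash while/case loop: split on "=", collect values
# in an array keyed by column name, print them padded at end of file.
out=$(awk -F'=' '
    $2 == ""       { next }             # skip lines without a value
    $1 ~ /&node$/  { c["node"]  = $2 }
    $1 ~ /&tname$/ { c["tname"] = $2 }
    END            { printf "%20s | %20s\n", c["node"], c["tname"] }
' job1.ksh)
echo "$out"
```

This is a sketch, not a drop-in replacement: a real version would also handle the pfile/metafile accumulation and print one row per input file.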
There can be different files with different patterns, and I want to pick only those files (1000 files in this case) whose content starts with 'submit'.
Can the last line be modified to select only those files that have content starting with the line 'submit'?
' OFS="| " *.ksh
' OFS="| " awk '/^submit/{print FILENAME;nextfile}' *.sh # pick only those selected files
Please always open a new thread for a new question. Now, coming to your question: if you want to print only the names of those files, out of many, that contain the string 'submit' at the start of a line, you could make a slight change to your code.
awk '/^submit/{print FILENAME;nextfile}' *.sh
Since you haven't told us what your Input_files look like, I am removing the OFS part here; it is not required anyway, since you are printing only the file names. If your Input_files are | delimited, you could add -F"|" after awk in the code above, and if your string 'submit' is in a specific field, you could test only that field.
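For illustration, here is what that field-restricted variant might look like. The pipe-delimited sample files generated here are hypothetical, assuming 'submit' is expected in the first field:

```shell
# Two hypothetical pipe-delimited files; only the first has "submit" in field 1.
printf 'submit|jobA|x\n' > t1.txt
printf 'other|jobB|y\n'  > t2.txt

# -F"|" makes "|" the field separator; $1 == "submit" tests only field 1,
# and nextfile moves on as soon as one matching line is found.
matches=$(awk -F'|' '$1 == "submit" {print FILENAME; nextfile}' t1.txt t2.txt)
echo "$matches"
```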
I just tested with 3 files and it worked for me; could you please paste the exact error here? Also make sure you have at least read permission on the files in which you are trying to search for the keyword.
Are you running this awk program as a .sh script itself? If so, note that the glob *.sh will match that script too, since it also ends in .sh. If that is not the case, please post a listing of the files present in the directory so we can see what is going on.
I am shocked, as I have already told you not to mix the two codes. My first code was to get the output in your expected shape from an Input_file; the second code, which you posted, I corrected (with a fair warning not to mix them and to open a new thread for it).
I would request that you separate your requirements, as this is very confusing now.
NOTE: Your code above will NOT work in this form: you are using 2 awk commands but providing the Input_file (*.sh) only once.
Hi,
I want to pull all the files that match the pattern of the input file given. In a directory there are, say, 5000 files, of which 1000 have text starting with 'submit file',
and from those files I want to get the output I mentioned earlier.
First filter the files whose text starts with 'submit file' (out of 5000, I would have 1000 such input files), then read only those files as input and produce the output in the required format I mentioned earlier. So basically, I want to filter the files, read only those as input, and get the required output. Thanks.
This is very convoluted logic. Using awk to read all 5000 of your files to get a list of the files containing a certain string, and then using that list as arguments to another awk script, is grossly inefficient: you have to read each of the 5000 files once and then read the selected 1000 files a second time. There is very seldom a need to invoke awk twice; but if you must, you have to actually invoke awk twice, using the output of the first invocation as the file operands of the second, rather than pasting the first command onto the second command line as text.
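A sketch of what actually invoking awk twice looks like. The file names and the second script's body here are placeholders: the inner awk selects the files, and its output becomes the file-operand list of the outer awk:

```shell
# Hypothetical input files: only job1.ksh starts a line with "submit".
printf 'submit file\nnode=a\n' > job1.ksh
printf 'nothing here\n'        > job2.ksh

# First pass: print the name of each file containing a line starting
# with "submit"; nextfile stops reading a file after its first match.
selected=$(awk '/^submit/ {print FILENAME; nextfile}' job1.ksh job2.ksh)

# Second pass: run the real awk script only on the selected files.
# (Placeholder action: count lines. $selected is deliberately unquoted so
# the shell splits it into separate file operands; this assumes the file
# names contain no whitespace.)
count=$(awk 'END {print NR}' $selected)
echo "$count"
```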
Many thanks!
One problem I am seeing here is that awk is also reading the comment lines.
Is there a way to ignore all comment lines at the very beginning? Should I need to create a separate post for this? Thanks.
If you are having problems with comments in the shell scripts that are being fed into the script you specified in post #1 in this thread, you don't need to start a new thread; otherwise, you do.
Either way, you need to explain which comments need to be removed and exactly how your awk script is supposed to determine what you consider a "comment line at the very beginning". The only comment shown in your sample input file(s) is:
#!/bin/ksh
and I don't see why removing that comment from your input files would make any difference to your results. If you don't supply representative sample input files corresponding to the data you want to process, you are wasting everyone's time.
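That said, if it does turn out that comment lines need to be skipped, a minimal guard in awk is a single next rule. The sample file generated here is hypothetical:

```shell
# Hypothetical input: two comment lines followed by real content.
printf '#!/bin/ksh\n# setup\nsubmit file\nnode=a\n' > sample.ksh

# Any line whose first non-blank character is "#" is skipped before the
# rest of the script ever sees it.
kept=$(awk '/^[[:space:]]*#/ {next} {print}' sample.ksh)
echo "$kept"
```

The same guard could be added in front of the existing rules in either of the awk commands discussed above.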