Directory path variable is not being parsed in awk

Hi All,

I have to split one file into 10 equal parts. For that I wrote the awk command below.

OUTPUT_FILE=/home/sit/path/Files/file_EXPORT.lst
DIR_NM=`dirname ${OUTPUT_FILE}`

awk -v CURR_DATE="$(date +'%d-%m-%Y-%H-%M')" -v pth=$DIR_NM '{print >> pth/"tgt_file_name"CURR_DATE"_"NR%10 }' ${OUTPUT_FILE}

But pth is not being parsed, or something else is happening that I don't understand. I am getting the error below:

 fatal: division by zero attempted

But when I remove pth (the directory), it runs fine and generates the files in the current directory, like below.

awk -v CURR_DATE="$(date +'%d-%m-%Y-%H-%M')" -v pth=$DIR_NM '{print >> "tgt_file_name"CURR_DATE"_" NR%10 }' ${OUTPUT_FILE}

But I want to create the files in a specific directory, and I cannot hard-code it.
Kindly help with this.

What does $DIR_NM evaluate to? Moreover what is the dirname "command" -- A script?

dirname extracts the directory path from a complete file path. For example,

dirname /home/admin/filename

evaluates to

/home/admin

I am putting this directory path into $DIR_NM.

Hello looney,

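I'm not completely sure about your whole requirement, but could you please change pth/"tgt_file_name"CURR_DATE"_"NR%10 to the following in your code and let me know how it goes? With the slash outside the quotes, awk reads it as the division operator and divides pth by a string that evaluates to 0, hence the error.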

 pth"/tgt_file_name"CURR_DATE"_"NR%10
 

Thanks,
R. Singh
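
A minimal sketch of the corrected command on throwaway data, in case it helps to try it in isolation. The temporary directory and the five-line input are made up for the demo; the real script would pass $DIR_NM and ${OUTPUT_FILE} instead.

```shell
# Hypothetical demo directory and input file (not from the original script).
demo_dir=$(mktemp -d)
printf '%s\n' a b c d e > "$demo_dir/input.lst"

awk -v CURR_DATE="$(date +'%d-%m-%Y-%H-%M')" -v pth="$demo_dir" '
{
    # The slash is inside the string literal, so this is plain
    # concatenation; NR % 10 is evaluated before it is appended.
    out = pth "/tgt_file_name" CURR_DATE "_" NR % 10
    print >> out
}' "$demo_dir/input.lst"

ls "$demo_dir"
```

Quoting the shell variable in -v pth="$demo_dir" also keeps the command intact if the path ever contains spaces.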


One more query: there will be around 10 million records in the source file, and they must be distributed exactly across 10 files. Since I have used awk print >>, do I need to close the files as well?
Thanks

Hello looney,

Yes, you could use int((NR-1)/1000000) in place of NR%10 to make exactly 10 files as per your requirement: lines 1 to 1000000 get suffix 0, the next million get suffix 1, and so on up to 9. Please double check it, as I don't have a 10-million-line Input_file with me. The logic is simple: to make 10 files, divide the total number of lines by the number of files you need, and that gives you the number of lines per file.

Yes, we do need to close the files, for example with close(f), where f is a variable holding the file name.
I hope this helps you.

Thanks,
R. Singh
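
A small sketch of that contiguous split with close(), assuming a 25-line demo input and 10 lines per file. The chunk variable, the tgt_file_ prefix and the temporary directory are hypothetical names chosen for the illustration; for the real file, chunk would be 1000000.

```shell
# Hypothetical demo directory and input file.
demo_dir=$(mktemp -d)
seq 1 25 > "$demo_dir/input.lst"

awk -v pth="$demo_dir" -v chunk=10 '
{
    # Lines 1..chunk go to file 0, the next chunk lines to file 1, etc.
    f = pth "/tgt_file_" int((NR - 1) / chunk)
    if (f != prev) {
        # Close the finished file only when switching to the next one.
        if (prev != "") close(prev)
        prev = f
    }
    print > f
}' "$demo_dir/input.lst"

wc -l "$demo_dir"/tgt_file_*
```

Because each output file is written in one contiguous run, close() is called only once per file rather than once per line.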

If you want to split the source file into 10 equal pieces, with the files numbered for easy identification, you can use split:

split -l 1000000 --numeric-suffixes --suffix-length=3 Source_file target_file.
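
To try the same options on a small scale, here is a sketch that uses 10 lines per piece on a 25-line demo file. It assumes GNU coreutils split, since the long options are GNU extensions; the temporary directory is made up for the demo.

```shell
# Hypothetical demo directory; Source_file holds the numbers 1..25.
demo_dir=$(mktemp -d)
seq 1 25 > "$demo_dir/Source_file"

# Same options as above, with 10 lines per piece instead of 1000000.
( cd "$demo_dir" &&
  split -l 10 --numeric-suffixes --suffix-length=3 Source_file target_file. )

ls "$demo_dir"
```

The pieces are named target_file.000, target_file.001 and target_file.002, and the last one holds the 5 leftover lines.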


The NR%10 will distribute the lines as evenly as possible between 10 files, writing every single line to the next (rotating) file, independent of the total number of lines.

If the number of open files is less than the maximum allowed, close(f) is not indispensable, although it would normally be good style. In your case of 1E7 lines, closing after every line would mean 1E7 open and 1E7 close operations, slowing down processing considerably. Me, I'd forgo it.
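
To see the rotation on a small scale, here is a sketch that runs NR%10 over 25 numbered lines. The temporary directory and the part_ prefix are made up for the demo. All 10 files stay open for the whole run and are closed when awk exits.

```shell
# Hypothetical demo directory and input file.
demo_dir=$(mktemp -d)
seq 1 25 > "$demo_dir/input.lst"

# Each line goes to the file numbered NR % 10, so consecutive lines
# rotate through part_1, part_2, ..., part_9, part_0, part_1, ...
awk -v pth="$demo_dir" '{ print > (pth "/part_" NR % 10) }' "$demo_dir/input.lst"

wc -l "$demo_dir"/part_*
```

With 25 lines, part_1 to part_5 receive 3 lines each and the other five files receive 2 each; part_1, for instance, holds lines 1, 11 and 21.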