Pass input and output file as parameter to awk script

bhaskarjha178 · July 20, 2009, 5:05am

Hi,
I am new to awk. I am using the csv2pipe script (shown below):

BEGIN { FS=SUBSEP; OFS="|" }
{
    result = setcsv($0, ",")
    print
}
# setcsv(str, sep) - parse CSV (MS specification) input
# str, the string to be parsed. (Most likely $0.)
# sep, the separator between the values.
#
# After a call to setcsv the parsed fields are found in $1 to $NF.
# setcsv returns 1 on success and 0 on failure.
#
# By Peter Strömberg aka PEZ.
# Based on setcsv by Adrian Davis. Modified to handle a separator
# of choice and embedded newlines. The basic approach is to take the
# burden off of the regular expression matching by replacing ambiguous
# characters with characters unlikely to be found in the input. For
# this the character "\035" is used.
#
# Note 1. Prior to calling setcsv you must set FS to a character which
# can never be found in the input. (Consider SUBSEP.)
# Note 2. If setcsv can't find the closing double quote for the string
# in str it will consume the next line of input by calling
# getline and call itself until it finds the closing double
# quote or no more input is available (considered a failure).
# Note 3. Only the "" representation of a literal quote is supported.
# Note 4. setcsv will probably misbehave if sep, used as a regular
# expression, can match anything other than what a call to index()
# would match.
#
function setcsv(str, sep, i) {
    gsub(/""/, "\035", str)
    gsub(sep, FS, str)
    while (match(str, /"[^"]*"/)) {
        middle = substr(str, RSTART+1, RLENGTH-2)
        gsub(FS, sep, middle)
        str = sprintf("%.*s%s%s", RSTART-1, str, middle,
                      substr(str, RSTART+RLENGTH))
    }
    if (index(str, "\"")) {
        return ((getline) > 0) ? setcsv(str (RT != "" ? RT : RS) $0, sep) : !setcsv(str "\"", sep)
    } else {
        gsub(/\035/, "\"", str)
        $0 = str
        for (i = 1; i <= NF; i++)
            if (match($i, /^"+$/))
                $i = substr($i, 2)
        $1 = $1 ""
        return 1
    }
}

I run it as
nawk -f csv2pipe.awk raw.txt

It works fine but prints the output to the console. I need it to go to the $1 parameter.
How should I modify "print" in the script so that the output is written to $1 instead of the console?

I tried the following:
print > $1 ---> Doesn't work
print > "temp" --> Creates a new file temp and saves the output

Can anybody please help me out?

Thanks in advance.
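
A note on why print > $1 did not behave as hoped: inside an awk program, $1 is the first field of the current input record, not the first command-line argument (those are held in ARGV). The sketch below uses made-up file names and plain awk rather than nawk, and only illustrates that print > $1 writes to a file named after each record's first field.

```shell
# Hypothetical two-line input: field 1 is a file name, field 2 a value.
printf 'demo_a.txt,1\ndemo_b.txt,2\n' > demo_fields.csv

# $1 is evaluated for every record, so each value lands in the file
# named by the first field of its own line.
awk -F, '{ print $2 > $1 }' demo_fields.csv

cat demo_a.txt
cat demo_b.txt
```

With the csv2pipe script, the same rule would send each converted line to a file named after its first CSV value, which is presumably not what was intended.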

kshji · July 20, 2009, 5:15am

Keep your awk script in the standard form: read data from stdin (or the named input files) and write data to stdout. The caller then chooses the input and output with the usual shell redirection, like:

nawk -f csv2pipe.awk raw.txt > result.txt

If only some of the output should go to a file, you have to say so inside your awk block. For example:

nawk -v outfile="somefile.txt" -f csv2pipe.awk raw.txt > result.txt

Now stdout goes to result.txt, and you can send selected output to somefile.txt from inside awk with:

print "...."  >> outfile
printf "....." >> outfile
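
The -v approach above can be sketched end to end as follows. This is only an illustration: the file names are placeholders, plain awk stands in for nawk, and a simple comma split stands in for setcsv, so quoted fields are not handled here.

```shell
# Hypothetical sample input.
printf 'a,b,c\n1,2,3\n' > demo_in.csv

# outfile is set with -v on the command line and used as the redirect target.
awk -v outfile="demo_out.txt" '
BEGIN { FS = ","; OFS = "|" }
{
    $1 = $1              # rebuild the record so that OFS is applied
    print > outfile
}' demo_in.csv

cat demo_out.txt
```

Within a single awk run, > truncates the file only when it is first opened, and later prints to the same name append to it. >> differs only in that it keeps whatever the file contained before the run started.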

bhaskarjha178 · July 20, 2009, 5:19am

Hey kshji,
Thanks for the reply.

Please note that I have to do this inside the script.
I cannot redirect to an output file on the command line, like
nawk -f csv2pipe.awk raw.txt > result.txt

I want to use something like
nawk -f csv2pipe.awk raw.txt out.txt

and get the result in out.txt, i.e. the $1 parameter to the script.

Can you suggest a way to do this?

kshji · July 20, 2009, 6:05am

I don't see the point.
What difference does it make whether the output is set on the command line or inside awk?

You also didn't try:

awk -v outfile="somefile.txt" -f csv2pipe.awk raw.txt
and then, in awk, redirecting all your print/printf statements using the variable outfile, which you set on the command line:
print .... >> outfile

Setting a variable is more readable than using command-line arguments in awk.

See the docs, including awk: look for "Command-line arguments" in the awk section.
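
For completeness, the command-line-argument route mentioned above might look like the sketch below. It assumes the calling convention asked for earlier in the thread (input file first, output file second); the file names are placeholders, plain awk stands in for nawk, and a simple comma split stands in for setcsv.

```shell
# Hypothetical sample input.
printf 'a,b,c\n1,2,3\n' > demo_raw.txt

awk '
BEGIN {
    FS = ","; OFS = "|"
    out = ARGV[2]     # second operand: the output file name
    ARGV[2] = ""      # blank it so awk does not try to read it as input
}
{
    $1 = $1           # rebuild the record so that OFS is applied
    print > out
}' demo_raw.txt demo_piped.txt

cat demo_piped.txt
```

The BEGIN block has to blank ARGV[2] because awk otherwise treats every operand after the program as an input file. That extra step is part of why the -v form is easier to read.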

bhaskarjha178 · July 21, 2009, 3:17am

Thanks, it worked for me the first time. Sorry for not updating the thread.

---------- Post updated 07-21-09 at 12:47 PM ---------- Previous update was 07-20-09 at 03:37 PM ----------

Hi,
I am using print > out in my script and running it as

/bin/pgawk -v out="raw.txt" -f csv2pipe.awk "raw.txt"

I don't have nawk installed, so I decided to use pgawk.
I want to convert a csv file (raw.txt) into a pipe-delimited file with the same name.

This command works fine for small files, i.e. it converts the entire data from csv to pipe.
But when the data is huge, like 4000 lines, it converts only 300 lines and saves them. I am not able to get the entire data into the new file.

I tried print >> out, but that produces a different, much larger output.

Also, every time I run the above command, it creates another file named "awkprof.out". Is there a way to avoid creating this file and to get the complete output of the csv file into the pipe-delimited file in the case of huge files?

Please help me out.
Thanks in advance.

kshji · July 21, 2009, 3:38am

You can't read and write the same file at the same time. With a small file it may happen to work. Write the output to a temporary file instead. If there is a problem in csv2pipe.awk itself, you need to show it.
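
A minimal sketch of the temporary-file approach, assuming placeholder file names, plain awk, and a simple comma split in place of setcsv:

```shell
# Hypothetical sample input that should be converted in place.
printf 'a,b,c\n1,2,3\n' > demo_same.txt

# Temporary name next to the original; $$ is the PID of the current shell.
tmp="demo_same.txt.tmp.$$"

# Convert into the temporary file, and replace the original only if awk succeeded.
awk 'BEGIN { FS = ","; OFS = "|" } { $1 = $1; print }' demo_same.txt > "$tmp" &&
    mv "$tmp" demo_same.txt

cat demo_same.txt
```

The original file is only replaced after awk has read all of it, which should avoid the truncated result seen with the 4000-line file. On the awkprof.out question: as far as I know, pgawk is the profiling build of gawk and writes that file by default, so running the same script with plain gawk or awk should avoid it.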

bhaskarjha178 · July 21, 2009, 3:40am

Got it. It works fine with a tmp file. Thanks a lot for your time.