Apply 'awk' to all files in a directory or individual files from a command line

ScKaSx · November 4, 2010, 3:04pm

Hi All,

I am using awk to replace each ',' with '\t' (a tab) in a csv file. I would like to apply this to all .csv files in a directory and create a matching tab-separated .txt file for each one.

How would I do this in a script?

I have the following script called "csvtabs":

awk 'BEGIN {
                FS = ","
                OFS = "\t"
        }

        {
                $1 = $1
                for (i = 1; i <= NF; i++) {
                        if ($i == "") {
                                $i = "null"
                        }
                }
                print $0
        }' test.csv > test.txt

which writes test.txt with tabs. However, this only works for one file (test.csv), and I have to edit the script each time I want to convert a different file. Since I have many files, I want to make the process faster.

Alternatively, if that is too difficult, how can I alter the script to take the file names on the command line? In other words, run

$ csvtabs test.csv test.txt

instead of

$ csvtabs

Any help is appreciated,

Cheers,
ScKaSx

Franklin52 · November 4, 2010, 3:19pm

Try this:

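awk 'BEGIN {FS = ","; OFS = "\t"}
f != FILENAME {                # a new input file has started
  if (out != "") close(out)    # close the previous output file
  f = FILENAME
  out = f
  sub(/\.csv$/, ".txt", out)   # test.csv -> test.txt
}
{
  $1 = $1
  for (i = 1; i <= NF; i++) {
    if ($i == "") {
      $i = "null"
    }
  }
  print $0 > out
}' *.csv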

anbu23 · November 4, 2010, 3:21pm

cd /temp/temp1
for i in *.csv
do
awk 'BEGIN {
FS = ","
OFS = "\t"
}

{
$1 = $1
for (i = 1; i <= NF; i++) {
if ($i == "") {
$i = "null"
}
}
print $0
}' "$i" > "${i%.csv}.txt"
done
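
For the other question in the first post (passing the file names on the command line), here is a minimal sketch rather than a definitive version. It assumes a POSIX shell and wraps the same awk program in a shell function called csvtabs; the function form and the argument order (input first, output second) are assumptions based on the example invocation in the question.

```shell
# csvtabs INPUT OUTPUT: convert a comma-separated file to a tab-separated one,
# writing "null" into every empty field.
csvtabs() {
    if [ "$#" -ne 2 ]; then
        echo "usage: csvtabs input.csv output.txt" >&2
        return 1
    fi
    awk 'BEGIN { FS = ","; OFS = "\t" }
    {
        $1 = $1                          # rebuild the record so OFS is applied
        for (i = 1; i <= NF; i++)
            if ($i == "") $i = "null"    # fill empty fields
        print
    }' "$1" > "$2"
}
```

After defining the function, run it as: csvtabs test.csv test.txt. To make it a standalone script instead, the awk command can go in a file on its own, since "$1" and "$2" then refer to the script's arguments.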

ScKaSx · November 4, 2010, 6:21pm

Thanks guys, these work great!

Maybe you can also shed some light on another thing I am trying to do. In these files I have 3 columns, with example data such as:

1 2 3
2 3 4
3 4 5
4 3 4
5 2 3
6 3 4
7 4 5
8 5 6
9 6 7
10 5 6

Looking at column 2 or 3 (maybe specified at command line) I want to find the two maximums and write to file (in case of column 2 it would be 4 and 6). Since this is somewhat complicated I first was trying to find just the maximum from one of the columns. Therefore, I tried to add this to the awk command:

max=1
awk 'BEGIN {
      FS = ","
      OFS = "\t"
   }

   {
      $1 = $1
      for (i = 1; i <= NF; i++) {
         if ($i == "") {
            $i = "null"
         }
         if ($2 > max) {
            max = $2
         }
      }
      print $0
   }' test.csv > test.txt
echo "$max"

As you can probably tell, I have a C background. Anyway, this isn't giving me the maximum number. Is this even the right approach for what I ultimately want?

Cheers,
ScKaSx

Corona688 · November 4, 2010, 6:50pm

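awk runs as a separate process, not inside your shell, so it cannot set shell variables: the max in your awk program and the shell's $max are unrelated. If you want to get a value out of awk, you have to print it somehow. Since you're already using stdout for data, maybe stderr? Note that awk variables are written without a $; inside awk, $max would mean "field number max".

END { print "max is", max > "/dev/stderr" }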
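
Another way to get the value back into the shell is a second pass over the file with command substitution. The following is only a sketch under some assumptions: the function names colmax and colpeaks are made up for illustration, the input is assumed to be comma-separated like the .csv files above, and "the two maximums" is read as "values larger than both neighbours", which matches the 4 and 6 given for column 2 but may not be what was intended.

```shell
# colmax COL FILE: print the largest value in column COL of a comma-separated FILE.
colmax() {
    awk -F, -v col="$1" '
        NR == 1 || $col + 0 > max { max = $col + 0 }   # adding 0 forces a numeric comparison
        END { if (NR > 0) print max }
    ' "$2"
}

# colpeaks COL FILE: print every value in column COL that is larger than
# both the value before it and the value after it (assumed meaning of
# "the two maximums").
colpeaks() {
    awk -F, -v col="$1" '
        { v[NR] = $col + 0 }                           # remember the column, row by row
        END {
            for (n = 2; n < NR; n++)
                if (v[n] > v[n - 1] && v[n] > v[n + 1]) print v[n]
        }
    ' "$2"
}
```

The result can then be captured in a shell variable, for example with max=$(colmax 2 test.csv), and the column number is chosen on the command line as asked. Plateaus (equal neighbouring values) are not reported as peaks by this sketch.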