I have a csv file that I need to extract some data from depending on another field after reading info from another text file.
The text file would say have 592560 in it.
The csv file may have some data like so
Field 1 Field2 Field3 Field4 Field5 Field6
20009756 1 2 1 1 592560
The information will always be in the same position.
If the data in the text file is on line 6 then output the data in line 2 to another text file. There would be many rows and the correct line 2 would need to be copied. There could be many entires in that text file and could reside in the csv file and would need to be added to the output file.
mirni
April 27, 2011, 7:26pm
2
You mean column?
Let me try, although I'm not sure I understood you completely:
$ cat data
Field 1 Field2 Field3 Field4 Field5 Field6
20009756 1 2 1 1 592560
20007654 2 4 45brt ewfw 55552
20009756 X 2 1 1 592560
20009756 balh 2 1 1 592560
20007654 8.t 4 45brt ewfw 55552
$ cat aux
592560
55552
$ awk 'NR==FNR{a[$1]=cnt++; next}{if($6 in a){print $2 > "output."$6;} }' aux data
$ cat output.55552
2
8.t
$ cat output.592560
1
X
balh
The above awk command will print the second column of a line whose sixth field matches an entry in aux, and print it into a file.
$ $ cat data
Field 1 Field2 Field3 Field4 Field5 Field6
20009756 1 2 1 1 592560
20007654 2 4 45brt ewfw 55552
20009756 X 2 1 1 592560
20009756 balh 2 1 1 592560
20007654 8.t 4 45brt ewfw 55552
$ awk '{print >$6 ".txt"}' infile
$ cat 592560.txt
20009756 1 2 1 1 592560
20009756 X 2 1 1 592560
20009756 balh 2 1 1 592560
or only print column 2 :
$awk '{print $2 >$6 ".txt"}' infile
$ cat 592560.txt
1
X
balh
This worked
awk 'NR==FNR{a[$1]=cnt++; next}{if($6 in a){print $2 > "output.txt";} }' aux data
This put all of the data into one file called output.txt. The other way created a separate file for each entry.
In any case that was a huge help rdcwayx.
There are some other things I need to try to do but this gets me far.
Thanks again!
---------- Post updated at 08:57 PM ---------- Previous update was at 08:32 PM ----------
update
awk -F't\' 'NR==FNR{a[$1]=cnt++; next}{if($6 in a){print $2 > "output.txt";} }' aux data
That was what worked. Needed the -F'\t' so awk knew it was tab delimited.
Have to pad $2 to account for 5 characters and name the output.txt the name of the data file.
Thanks again.
---------- Post updated at 09:46 PM ---------- Previous update was at 08:57 PM ----------
and this
awk -F'\t' 'NR==FNR{a[$1]=cnt++; next}{if($12 in a){printf "%05s\n", $2 > "output.txt";} }' aux data
does the padding of the $2 field.
Anyone know how to pad the output.txt so that it has 14 characters. Padding that with 0's
awk -F'\t' 'NR==FNR{a[$1]=cnt++; next}{if($12 in a){printf "%05s\n", $2 > $1".pull";} }' /aiw/aiw1/pullfiles/*.txt *.dpf
Thanks in advance!
I figured out a way to do what I needed in padding 0's.
I do have another question
I need to search in all txt files in a location. But awk only searches the first one and stops.
How do I search *.txt in a location in awk and still have the same results.
Any help would be appreciated.
mirni
May 17, 2011, 9:48pm
7
Of course it does:
awk -F'\t' 'NR==FNR{a[$1]=cnt++; next}{if($12 in a){printf "%05s\n", $2 > $1".pull";} }' /aiw/aiw1/pullfiles/*.txt *.dpf
The 'NR==FNR' bit is what tells awk to remember pattern ('a[$1]=cnt++') only from the first file.
You could do:
awk -F'\t' 'FILENAME~".txt$"{a[$1]=cnt++; next}{if($12 in a){printf "%05s\n", $2 > $1".pull";} }' /aiw/aiw1/pullfiles/*.txt *.dpf
Which instructs awk to apply the first action to *.txt files.
That does not give me an output file.
Yes is did. Sorry.
Thanks!
grovetuckey:
I figured out a way to do what I needed in padding 0's.
I do have another question
I need to search in all txt files in a location. But awk only searches the first one and stops.
How do I search *.txt in a location in awk and still have the same results.
Any help would be appreciated.
This would pad that file name with the 0's. Sorry I did not post this before.
ls *.pull | awk -F'.pull' '{ printf "%s %014s.pull%s\n", $0, $1, $2; }' | xargs -n 2 mv
so this worked until today.
What happened was when the filed has no data in it the command pulls the data. I do not need it to do that.
Here is the line
awk -F'\t' 'FILENAME~".txt$"{a[$1]=cnt++; next}{if(($12 in a) || ($13 in a)){printf "%05s\n", $2 > $8 ".pull"} }'
Here is a sample data that is complet with no empty fields in question.
5122927 and 0 are the two fields I look at. 12 and 13
10129953 155 1 155 1 386618 1346 10129953.1 155 00:00.0 85248 5122927 0 FirstName LastName 111 Blah DRIVE SUN LAKES AZ 85248 9700485 155 1 1 155
Here is a line where the 13th field (0) is empty.
10129953 156 1 156 1 387964 1255 10129953.1 156 00:00.0 5478 5122928 FirstName LastName 111 Blah MAIN ST APT 4 SAN ALBANS VT 5478 9700486 156 1 1 156
I would like the script to just ignore the blank field.
Thanks in advance!
---------- Post updated at 10:17 PM ---------- Previous update was at 09:40 PM ----------
I would not be opposed to adding a 0 in a blank $13 field.