Delete specific lines from files based on another file

aden · January 21, 2014, 8:23am

I have some text files in a folder named ff, as shown below. I need to delete lines from these files (editing in place) based on another file, aa.txt.

32bm.txt:

249  253 A P        -     0   0    8      0, 0.0     6,-1.4     0, 0.0     2,-0.4  -0.287  25.6-102.0 -74.4 161.1   37.1   13.3   10.9
250  254 A K  B      Z  254   0E  77    -48,-2.5   -48,-0.3     4,-0.2     4,-0.3  -0.720 360.0 360.0 -93.4 135.2   38.1   11.1    8.1
252        !*             0   0    0      0, 0.0     0, 0.0     0, 0.0     0, 0.0   0.000 360.0 360.0 360.0 360.0    0.0    0.0    0.0
253  143 B R              0   0   96      0, 0.0    -2,-3.7     0, 0.0     2,-0.2   0.000 360.0 360.0 360.0 110.4   38.4   10.4    3.0
254  144 B Q  B     -Z  250   0E  62     -4,-0.3    -4,-0.2    -3,-0.1     2,-0.1  -0.347 360.0-157.5 -58.1 119.5   39.4   13.6    4.8
255  145 B T        -     0   0   22     -6,-1.4     2,-0.3    -2,-0.2    -7,-0.2  -0.396   7.8-127.4 -91.5 173.9   36.3   15.7    5.4

2fok.txt:

1  361 X G              0   0  137      0, 0.0     2,-0.2     0, 0.0     3,-0.0   0.000 360.0 360.0 360.0  97.3   25.2  -16.6   -6.6
2  362 X A        -     0   0   98      1,-0.0     0, 0.0     0, 0.0     0, 0.0  -0.649 360.0 -33.9-148.3  84.1   28.0  -18.6   -4.8
3  363 X R        -     0   0  226     -2,-0.2     2,-0.0     1,-0.1    -1,-0.0   1.000  68.7-149.8  66.4  76.9   31.1  -16.5   -4.0
1  361 B G              0   0  137      0, 0.0     2,-0.2     0, 0.0     3,-0.0   0.000 360.0 360.0 360.0  97.3   25.2  -16.6   -6.6
2  362 B A        -     0   0   98      1,-0.0     0, 0.0     0, 0.0     0, 0.0  -0.649 360.0 -33.9-148.3  84.1   28.0  -18.6   -4.8
3  363 B R        -     0   0  226     -2,-0.2     2,-0.0     1,-0.1    -1,-0.0   1.000  68.7-149.8  66.4  76.9   31.1  -16.5   -4.0

aa.txt

32bm    B   143 145
2fok    X   361 363
2moj    B   361 367
-
-
-

For example, in 32bm.txt I need to keep only the lines that have B in column 3 and a number from 143 to 145 in column 2.

Desired output:

32bm.txt

253  143 B R              0   0   96      0, 0.0    -2,-3.7     0, 0.0     2,-0.2   0.000 360.0 360.0 360.0 110.4   38.4   10.4    3.0
254  144 B Q  B     -Z  250   0E  62     -4,-0.3    -4,-0.2    -3,-0.1     2,-0.1  -0.347 360.0-157.5 -58.1 119.5   39.4   13.6    4.8
255  145 B T        -     0   0   22     -6,-1.4     2,-0.3    -2,-0.2    -7,-0.2  -0.396   7.8-127.4 -91.5 173.9   36.3   15.7    5.4

2fok.txt

1  361 X G              0   0  137      0, 0.0     2,-0.2     0, 0.0     3,-0.0   0.000 360.0 360.0 360.0  97.3   25.2  -16.6   -6.6
2  362 X A        -     0   0   98      1,-0.0     0, 0.0     0, 0.0     0, 0.0  -0.649 360.0 -33.9-148.3  84.1   28.0  -18.6   -4.8
3  363 X R        -     0   0  226     -2,-0.2     2,-0.0     1,-0.1    -1,-0.0   1.000  68.7-149.8  66.4  76.9   31.1  -16.5   -4.0

Your suggestions would be greatly appreciated!

Yoda · January 21, 2014, 8:47am

Here is one way of doing this:

#!/bin/bash

for file in *.txt
do
        [ "$file" = "aa.txt" ] && continue

        echo "Fixing file: $file"

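        awk '
                # First file (aa.txt): store each rule line keyed by its ID
                NR == FNR {
                        A[$1] = $0
                        next
                }
                # First line of the data file: derive the ID by stripping the extension
                !(F) {
                        F = FILENAME
                        sub ( /\..*/, X, F )
                }
                # Keep lines whose chain (column 3) and number (column 2) match the rule
                F in A {
                        split ( A[F], R )
                        if ( $3 == R[2] && $2 >= R[3] && $2 <= R[4] )
                                print $0
                }
        ' aa.txt "$file" > tmp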

        mv tmp "${file}.new"
done

Note that the script does not edit in place: it writes each result to a new file with a .new suffix. Once the output looks good, you can rename the .new files over the originals.
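
Once the .new files have been checked, they can be moved over the originals. The following is only a minimal sketch of that step, not part of the script above: promote_new_files is a hypothetical helper name, and keeping a .bak copy and skipping empty results are assumptions added for safety.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): replace each original
# with its checked .new file, keeping a .bak copy of the original.
promote_new_files() {
    for new in *.txt.new; do
        [ -e "$new" ] || continue        # the glob matched nothing
        orig=${new%.new}                 # e.g. 32bm.txt.new -> 32bm.txt
        # Assumption: an empty result means "no match", so leave the original alone
        [ -s "$new" ] || { echo "skipping empty $new" >&2; continue; }
        cp -p "$orig" "$orig.bak" && mv "$new" "$orig"
    done
}
```

Run it from inside the ff folder as `promote_new_files`. If something went wrong, the untouched data is still available in the .bak files.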

RudiC · January 21, 2014, 8:50am

Try

awk '{FN=$1".txt"; KR=$2; LL=$3; LH=$4; while ((getline < FN) > 0) if ($3==KR && $2 >= LL && $2 <= LH) print; close(FN)}' aa.txt

This prints all matching lines to standard output. You may need to print to individual files instead, e.g. FN".new".
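
A per-file variant might look like the sketch below. It assumes aa.txt and the data files are in the current directory; filter_by_ranges is a hypothetical wrapper name, and the lowercase variable names are chosen here for readability rather than taken from the one-liner. It reads each data line into a separate variable so that the fields of the aa.txt line are not overwritten, and it tests getline against 0 so that a missing file (such as 2moj.txt) is skipped instead of looping forever.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 is the list file (aa.txt in the thread).
# For each "ID CHAIN LOW HIGH" line, writes matching lines of ID.txt to ID.txt.new.
filter_by_ranges() {
    awk '{
        fn = $1 ".txt"; chain = $2; lo = $3; hi = $4
        out = fn ".new"
        # getline returns 1 per line read, 0 at end of file, -1 if fn cannot be opened
        while ((getline line < fn) > 0) {
            split(line, f, " ")
            # column 3 must equal the chain; column 2 must lie in [lo, hi]
            if (f[3] == chain && f[2] + 0 >= lo + 0 && f[2] + 0 <= hi + 0)
                print line > out
        }
        close(fn); close(out)
    }' "$1"
}
```

Call it as `filter_by_ranges aa.txt` from inside the ff folder. A data file with no matching lines produces no .new file at all.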