Read a column from a file and delete rows matching a condition

nex_asp · December 5, 2012, 11:59pm

Hi,

I need a script that deletes a row whenever a condition is true.

2.16    (3)    [00]    1    3    9999    0    (1)    (0)    [00]
34.42    (4)    [00]    1    3    9999    37    (2)    (3)    [00]
34.38    (4)    [00]    1    3    9999    64    (2)    (3)    [00]
34.4    (4)    [00]    1    3    1    110    (3)    (3)    [00]
34.38    (4)    [00]    1    3    12    165    (3)    (3)    [00]
34.42    (4)    [00]    1    3    13    220    (3)    (3)    [00]
34.4    (4)    [00]    1    3    9999    274    (3)    (3)    [00]
34.38    (4)    [00]    1    3    9999    348    (3)    (3)    [00]

Here the 6th column holds some values; whenever it is 9999, that row should be deleted.

Some columns also contain values inside brackets, both '()' and '[]'. Those brackets should be removed, and the values inside them kept as they are.

I have attached my sample input file; please go through it.

Thanks in advance.

pamu · December 6, 2012, 1:12am

Is this what you want?

$ cat file
2.16    (3)     [00]    1       3       9999    0       (1)     (0)     [00]
34.42   (4)     [00]    1       3       9999    37      (2)     (3)     [00]
34.38   (4)     [00]    1       3       9999    64      (2)     (3)     [00]
34.4    (4)     [00]    1       3       1       110     (3)     (3)     [00]
34.38   (4)     [00]    1       3       12      165     (3)     (3)     [00]
34.42   (4)     [00]    1       3       13      220     (3)     (3)     [00]
34.4    (4)     [00]    1       3       9999    274     (3)     (3)     [00]
34.38   (4)     [00]    1       3       9999    348     (3)     (3)     [00]

$ awk '$6 != 9999{gsub("\\(","");gsub("\\)","");gsub("\\[","");gsub("\\]","");print}' file
34.4    4       00      1       3       1       110     3       3       00
34.38   4       00      1       3       12      165     3       3       00
34.42   4       00      1       3       13      220     3       3       00

michaelrozar17 · December 6, 2012, 2:25am

An alternative awk:

$ uname -rs
SunOS 5.10
$ nawk '$6!="9999"{gsub(/[)(\]\[]/,"",$0);print}' inputfile
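
The two awk answers differ in one detail: the first compares column 6 numerically (`$6 != 9999`), while this one compares it against the string "9999". On the posted sample both give the same result. A small sketch of where they could diverge, using made-up rows that are not from the attached file:

```shell
# Hypothetical rows: column 6 holds 9999 written two ways, plus a normal value.
sample='a b c d e 9999 x
a b c d e 9999.0 x
a b c d e 12 x'

# Numeric comparison: rows with 9999 and 9999.0 are both dropped.
numeric=$(printf '%s\n' "$sample" | awk '$6 != 9999')

# String comparison: only the row whose text is exactly "9999" is dropped.
string=$(printf '%s\n' "$sample" | awk '$6 != "9999"')

printf 'numeric:\n%s\nstring:\n%s\n' "$numeric" "$string"
```

If the flag value is always written as plain 9999, either form should do.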

itkamaraj · December 6, 2012, 3:13am

 
$ perl -lane 's/[()\]\[]//g;print $_ if $F[5]!=9999' input.txt
34.4    4     00    1       3       1       110     3     3     00
34.38   4     00    1       3       12      165     3     3     00
34.42   4     00    1       3       13      220     3     3     00

nex_asp · December 6, 2012, 3:37am

I have one more problem of the same kind. Here is the file format:

9.983    68.033    1    28.25    36.42
9.983    68.033    5    28.26    36.42
9.983    68.033    10    28.23    36.43
9.983    68.033    15    28.22    36.43
9.983    68.033    20    28.2    36.42
9.983    68.033    25    28.19    36.43
9.983    68.033    30    28.18    36.43
9.983    68.033    35    28.18    36.43
9.983    68.033    40    28.18    36.44
9.983    68.033    45    28.19    36.45
9.983    68.033    50    28.19    36.44
9.983    68.033    55    28.2    36.45
9.983    68.033    60    28.2    36.469
9.983    68.033    5    28.26    36.42
9.983    68.033    10    28.23    36.43
9.983    68.033    15    28.22    36.43
9.983    68.033    20    28.2    36.42
9.983    68.033    25    28.19    36.43
9.983    68.033    30    28.18    36.43
9.983    68.033    35    28.18    36.43
9.983    68.033    40    28.18    36.44
9.983    68.033    45    28.19    36.45
9.983    68.033    50    28.19    36.44
9.983    68.033    55    28.2    36.45
9.983    68.033    60    28.2    36.46
9.983    68.033    65    28.21    36.48
9.983    68.033    70    28.22    36.47
9.983    68.033    65    28.21    36.48
9.983    68.033    70    28.22    36.47

I need the data up to 50 in the third column; everything after 50 should be ignored. I want to print only the 2nd, 3rd and 5th columns as output.

pamu · December 6, 2012, 3:47am

I am assuming you want the rows whose 3rd column is less than or equal to 50.

$ cat file
9.983    68.033    1    28.25    36.42
9.983    68.033    5    28.26    36.42
9.983    68.033    10    28.23    36.43
9.983    68.033    15    28.22    36.43
9.983    68.033    20    28.2    36.42
9.983    68.033    25    28.19    36.43
9.983    68.033    30    28.18    36.43
9.983    68.033    35    28.18    36.43
9.983    68.033    40    28.18    36.44
9.983    68.033    45    28.19    36.45
9.983    68.033    50    28.19    36.44
9.983    68.033    55    28.2    36.45
9.983    68.033    60    28.2    36.469
9.983    68.033    5    28.26    36.42
9.983    68.033    10    28.23    36.43
9.983    68.033    15    28.22    36.43
9.983    68.033    20    28.2    36.42
9.983    68.033    25    28.19    36.43
9.983    68.033    30    28.18    36.43
9.983    68.033    35    28.18    36.43
9.983    68.033    40    28.18    36.44
9.983    68.033    45    28.19    36.45
9.983    68.033    50    28.19    36.44
9.983    68.033    55    28.2    36.45
9.983    68.033    60    28.2    36.46
9.983    68.033    65    28.21    36.48
9.983    68.033    70    28.22    36.47
9.983    68.033    65    28.21    36.48
9.983    68.033    70    28.22    36.47

$ awk '$3 <= 50{print $2,$3,$5}' file
68.033 1 36.42
68.033 5 36.42
68.033 10 36.43
68.033 15 36.43
68.033 20 36.42
68.033 25 36.43
68.033 30 36.43
68.033 35 36.43
68.033 40 36.44
68.033 45 36.45
68.033 50 36.44
68.033 5 36.42
68.033 10 36.43
68.033 15 36.43
68.033 20 36.42
68.033 25 36.43
68.033 30 36.43
68.033 35 36.43
68.033 40 36.44
68.033 45 36.45
68.033 50 36.44
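
The filter above keeps every row with a 3rd column of 50 or less, so rows from the second profile are printed too. If "after 50 all data to be ignored" instead means stopping at the first row that goes past 50, a possible sketch is shown here. This is an assumption about the intent, and the four rows are a shortened, hypothetical excerpt of the sample:

```shell
# Hypothetical excerpt: end of the first profile, then the start of the second.
sample='9.983 68.033 45 28.19 36.45
9.983 68.033 50 28.19 36.44
9.983 68.033 55 28.2 36.45
9.983 68.033 5 28.26 36.42'

# Row-by-row filter, as in the answer above: the later "5" row is kept too.
all_rows=$(printf '%s\n' "$sample" | awk '$3 <= 50 {print $2, $3, $5}')

# Stop reading at the first row whose 3rd column exceeds 50.
first_run=$(printf '%s\n' "$sample" | awk '$3 > 50 {exit} {print $2, $3, $5}')

printf 'all rows <= 50:\n%s\nfirst run only:\n%s\n' "$all_rows" "$first_run"
```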

nex_asp · December 6, 2012, 4:09am

Pamu, if the data file is like this, how can I take the data up to the first column's maximum value?
Say:

1  25   35 
2  25    32 
3   25    32 
4   24    35
5   23    38 
6   17   15 
4   58    35 
3   15    36 
2   25    33 
1   25    35 
0   25   38

If I have to filter up to 6, then the output looks like this:

1  25   35 
2  25    32 
3   25    32 
4   24    35
5   23    38 
6   17   15

Here the maximum in the first column is 6; in some other file it may be 60, 70, or 200. I need the data up to the maximum value of the first profile in the 1st column. The data below that is not needed; that is, 4, 3, 2 and so on should be ignored.

pamu · December 6, 2012, 4:39am

Please use code tags.

Assuming the first column increases up to its maximum (as in your sample), try:

$ cat file1
1 25 35
2 25 32
3 25 32
4 24 35
5 23 38
6 17 15
4 58 35
3 15 36
2 25 33
1 25 35
0 25 38

$ awk '$1 > s || NR==1{print}{s=$1}' file1
1 25 35
2 25 32
3 25 32
4 24 35
5 23 38
6 17 15

$ awk '$1 > s || NR==1{s=$1;print}' file1
1 25 35
2 25 32
3 25 32
4 24 35
5 23 38
6 17 15
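
The two one-liners above are not quite the same: the first compares each row with the previous row, the second with the largest value printed so far. Both keep scanning the whole file, so a later row that rises again can still be printed. If only the first rising run is wanted, one option is to stop at the first decrease. A sketch under that assumption, with a hypothetical sample in which the column rises again after the first profile:

```shell
# Hypothetical data: first profile 1,2,6, then a second stretch 4,7.
sample='1 25 35
2 25 32
6 17 15
4 58 35
7 15 36'

# Exit as soon as column 1 drops below the previous row's value.
first_profile=$(printf '%s\n' "$sample" |
    awk 'NR > 1 && $1 < prev {exit} {print; prev = $1}')

printf '%s\n' "$first_profile"
```

With this sample, both one-liners above would also print the `7 15 36` row, while the sketch stops after `6 17 15`.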

nex_asp · December 6, 2012, 5:42am

Thank you, Pamu. I am having a problem saving the output to a file.
I tried this command:

for file in *.txt; do
    awk '{print $8 "\t" $3 "\t" $7}' "$file" > "out_$file.csv"
done

In the CSV file, the output looks like this:
1

    28.1027         33.7323                   2         
    
              
    28.1055         33.731                   3

pamu · December 6, 2012, 6:08am

CSV?

Do you want a comma-separated file, or a tab-separated one?

nex_asp · December 6, 2012, 6:20am

I tried both and neither works. Sorry, the command I pasted is the one with tabs. When the file is saved, the 1st column does not come out properly.

---------- Post updated at 06:20 AM ---------- Previous update was at 06:13 AM ----------

Hi Pamu,

Why is it like that? When printed at the command prompt it shows up properly tab-separated, but when saved to a file it does not come out right.

Did you find the reason?

pamu · December 6, 2012, 6:22am

And please use code tags for code and data samples.

I can't tell what the problem could be.

You may try it like this:

$ cat file
      1.006   5.324079    27.6452    2.40651     2.8315     0.4706    33.1640      1.000
      2.012   5.323260    27.4376    2.88395     2.9420     0.5726    33.3054      2.000
      3.018   5.319734    27.3193    3.17664     3.0671     0.7445    33.3646      3.000
      4.024   5.320370    27.3121    3.55961     2.8734     0.7843    33.3740      4.000
      5.029   5.321427    27.2701    3.27116     2.7069     0.7734    33.4111      5.000
      6.035   5.317643    27.2201    2.99257     2.5828     0.8503    33.4199      6.000
      7.041   5.307164    27.1758    4.18136     3.0051     0.9228    33.3773      7.000
      8.047   5.305160    27.1626    4.47475     3.4154     0.8651    33.3724      8.000

$ awk '{print $8,$3,$7}' OFS="\t" file > out_file.csv

$ cat out_file.csv
1.000   27.6452 33.1640
2.000   27.4376 33.3054
3.000   27.3193 33.3646
4.000   27.3121 33.3740
5.000   27.2701 33.4111
6.000   27.2201 33.4199
7.000   27.1758 33.3773
8.000   27.1626 33.3724

$ awk '{print $NF,$(NF-5),$(NF-1)}' OFS="\t" file > out_file.csv

$ cat out_file.csv
1.000   27.6452 33.1640
2.000   27.4376 33.3054
3.000   27.3193 33.3646
4.000   27.3121 33.3740
5.000   27.2701 33.4111
6.000   27.2201 33.4199
7.000   27.1758 33.3773
8.000   27.1626 33.3724

Hope this helps you.

EDIT: About tab-separated files: the columns may look unevenly spaced, but they are still tab-separated. Don't judge by the visible spacing alone.

pamu
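
The cause of the broken first column is not established in this thread. One possibility, purely an assumption here, is that the input files have Windows (CRLF) line endings, in which case the last field ($8) carries a trailing carriage return that gets printed first. A sketch that strips it before printing; the function names and the `.tsv` output name are made up for illustration:

```shell
# Print columns 8, 3 and 7 tab-separated, removing a trailing carriage
# return first (assumption: the input may have CRLF line endings).
extract_columns() {
    awk -v OFS='\t' '{ sub(/\r$/, ""); print $8, $3, $7 }' "$@"
}

# Apply it to every .txt file in the current directory, with quoted names.
convert_all_txt() {
    for file in *.txt; do
        [ -e "$file" ] || continue            # no .txt files: skip the literal pattern
        extract_columns "$file" > "out_${file%.txt}.tsv"
    done
}

# Demo on one CRLF-terminated line taken from the sample above.
demo=$(printf '  1.006 5.324079 27.6452 2.40651 2.8315 0.4706 33.1640 1.000\r\n' |
    extract_columns)
printf '%s\n' "$demo"
```

Calling `convert_all_txt` would then write one `out_NAME.tsv` per input file. If the files turn out to have plain Unix line endings, the `sub()` call simply does nothing.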

nex_asp · December 6, 2012, 6:28am

No, it still does not come out right. See the file I have attached here (with the extension changed).

pamu · December 6, 2012, 6:35am

You may want to define the spacing yourself:

$ cat file
      1.006   5.324079    27.6452    2.40651     2.8315     0.4706    33.1640      1.000
      2.012   5.323260    27.4376    2.88395     2.9420     0.5726    33.3054      2.000
      3.018   5.319734    27.3193    3.17664     3.0671     0.7445    33.3646      3.000
      4.024   5.320370    27.3121    3.55961     2.8734     0.7843    33.3740      4.000
      5.029   5.321427    27.2701    3.27116     2.7069     0.7734    33.4111      5.000
      6.035   5.317643    27.2201    2.99257     2.5828     0.8503    33.4199      6.000
      7.041   5.307164    27.1758    4.18136     3.0051     0.9228    33.3773      7.000
      8.047   5.305160    27.1626    4.47475     3.4154     0.8651    33.3724      8.000

$ awk '{printf "%-10s%-12s%-10s\n", $8,$3,$7}' file
1.000     27.6452     33.1640
2.000     27.4376     33.3054
3.000     27.3193     33.3646
4.000     27.3121     33.3740
5.000     27.2701     33.4111
6.000     27.2201     33.4199
7.000     27.1758     33.3773
8.000     27.1626     33.3724
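
Since the output files are named `.csv`, a comma-separated variant may be what a spreadsheet program expects; that is an assumption, as the thread does not say which program opens the file. A minimal sketch on the first sample row:

```shell
# First row of the sample file, with its original leading spaces.
row='      1.006   5.324079    27.6452    2.40651     2.8315     0.4706    33.1640      1.000'

# Set the output field separator to a comma and print columns 8, 3 and 7.
csv=$(printf '%s\n' "$row" | awk -v OFS=',' '{print $8, $3, $7}')

printf '%s\n' "$csv"
```

The same `-v OFS=','` setting can replace `OFS="\t"` in the earlier commands when writing to a file.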