Note: I am using only Cygwin and Notepad++ to test this, and my OS is Windows XP.
#!/bin/bash
typeset -i totalvalue=(wc -w /cygdrive/c/cygwinfiles/database.txt)
typeset -i totallines=(wc -l /cygdrive/c/cygwinfiles/database.txt)
typeset -i columnlines=`expr $totalvalue / $totallines`
awk -F' ' -v columnlines=$columnlines '{ if($1==$columnlines) {print $0} }' /cygdrive/c/cygwinfiles/database.txt
This is my first script, so please bear with me. I just need your help. In the script:
totalvalue is the number of values in the data
totallines is the number of lines in the data
These two are needed to compute the total number of columns.
(It is a very basic script, since I don't know much yet.)
If I have a data file that looks like:
aaa bbb ccc aaa
ccc eee ggg hhh
eee bbb eee eee
the script should return the rows that contain duplicate values, so the output is:
aaa bbb ccc aaa <two aaa's>
eee bbb eee eee <two eee's>
Any help would be appreciated.
Errors are returned, and I think they come from the variables, which do not seem to be recognized as integers.
I am getting an error with the message ")division by 0 (error token is "/c/cygwinfiles/database.txt)
---------- Post updated at 06:17 PM ---------- Previous update was at 06:15 PM ----------
Correction: the second returned row contains three eee's (sorry about that).
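A hedged reading of the error above, not a confirmed diagnosis: writing (wc -w ...) without a leading $ does not run wc. Because the variable is declared with typeset -i, bash appears to evaluate the text as arithmetic, where "w /cygdrive" is a division by an unset variable, and unset variables count as 0. The displaced ) and " in the message also suggest that the script file has Windows (CRLF) line endings, which Notepad++ can convert to Unix format. Below is a minimal sketch of the counting step using $(...); the function name count_columns is made up for illustration.

```shell
#!/bin/bash
# count_columns FILE: print values-per-line, i.e. the column count of a
# table in which every line has the same number of values.
# (count_columns is a hypothetical helper name, not from the original script.)
count_columns() {
    file=$1
    # $(...) actually runs wc; reading through < keeps the file name
    # out of wc's output, so only the number is captured.
    totalvalue=$(wc -w < "$file")
    totallines=$(wc -l < "$file")
    # Guard against an empty file before dividing.
    if [ "$totallines" -eq 0 ]; then
        echo "no lines in $file" >&2
        return 1
    fi
    echo $(( totalvalue / totallines ))
}
```

Called on the sample data above (12 values on 3 lines), this should print 4. awk exposes the same per-line count as NF, which is what the replies in this thread rely on.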
awk '{for (i=1;i<=NF;i++) {if ($i in a) {print;break} else {a[$i]}};delete a}' infile
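For readers new to awk, the one-liner above can be spelled out with comments. This is the same logic in a longer layout, wrapped in a shell function whose name (find_dups) and array name (seen) are chosen here only for illustration.

```shell
#!/bin/bash
# find_dups FILE: print every line in which some field value occurs
# more than once. (find_dups and seen are illustrative names.)
find_dups() {
    awk '{
        for (i = 1; i <= NF; i++) {   # walk the fields of the current line
            if ($i in seen) {         # this value was already met on the line
                print                 # print the whole line once
                break                 # and stop checking it
            }
            seen[$i]                  # merely referencing the element creates it
        }
        delete seen                   # forget everything before the next line
    }' "$1"
}
```

Because the loop runs up to NF, the number of columns never has to be computed beforehand.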
Try this awk file. You can run it with:
awk -f awkfile inputfile
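{
    # Count how often each value occurs (assumes four fields per line).
    for (i = 1; i <= 4; i++) a[$i]++
    # With four fields, the counts add up to more than 4 only if a value repeats.
    dup = (a[$1] + a[$2] + a[$3] + a[$4] > 4)
    if (dup) printf "%s <", $0
    for (i = 1; i <= 4; i++) {
        if (a[$i] > 2) {
            # A value occurring three or more times: report it and stop.
            printf "%d %s's ", a[$i], $i
            break
        } else if (a[$i] == 2 && $i != save) {
            # A value occurring twice: report it, remembering it in save.
            printf "%d %s's ", a[$i], $i
            save = $i
        }
    }
    if (dup) printf ">\n"
    # Reset the per-line state.
    delete a
    save = ""
}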
awk '{for (i=1;i<=NF;i++) {if ($i in a) {print;break} else {a[$i]}};delete a}' infile
Whoa, it worked like a charm! What I'll do now is study how this code works. Thank you very much, rdcwayx!
@homeboy
I'm also thankful to you for helping out. I just want to know why I can't run bash scripts properly in Cygwin. I'll try this at once and look for a program that lets me run it on XP.
Thank you very much, guys.
---------- Post updated at 10:40 PM ---------- Previous update was at 08:01 PM ----------
Now I have another scenario.
If the data were:
As1d Pooa1 982ah
ghqyqt1 ss92 a82ss
Bg1ja Bg1ja 13ss
how can I get this output?
Bg1ja Bg1ja 13ss
meaning that this line contains a duplicate.
This assumes that the values can contain any alphanumeric characters instead of lowercase letters only.
Would I use [A-Za-z0-9]? How would I work it into the code?
I don't quite understand. With my code, I still get the line:
Bg1ja Bg1ja 13ss
Are you asking for a case-insensitive match?
awk '{for (i=1;i<=NF;i++) {if (tolower($i) in a) {print;break} else {a[tolower($i)]}};delete a}' infile
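To spell out why no character class such as [A-Za-z0-9] is needed: the array lookup compares whole field values as strings, so letters and digits are handled alike, and tolower() only folds case before the comparison. Below is a commented sketch of the same command; the function name find_dups_nocase and the variable key are illustrative, not from the thread.

```shell
#!/bin/bash
# find_dups_nocase FILE: print every line in which some field value
# occurs more than once, ignoring upper/lower case.
# (find_dups_nocase, key and seen are illustrative names.)
find_dups_nocase() {
    awk '{
        for (i = 1; i <= NF; i++) {
            key = tolower($i)     # fold case for the comparison only
            if (key in seen) {    # same value, in any case, seen before
                print             # $0 is untouched, so the original spelling is printed
                break
            }
            seen[key]             # remember the lower-cased value
        }
        delete seen               # reset before the next line
    }' "$1"
}
```

With this version a line such as "Bg1ja bg1JA 13ss" is reported as well, and it is printed exactly as it appears in the file.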
awk '{for (i=1;i<=NF;i++) {if (tolower($i) in a) {print;break} else {a[tolower($i)]}};delete a}' infile
This is perfect!
It will help me a lot with learning databases on Unix.
So it actually compares the values in lower case but prints the line as it was. Thank you again, rdcwayx!