Gr4wk
March 16, 2014, 3:27pm
1
Hi,
I have files with this kind of format (separator is space):
A1 B1 C1 D1 E1 F1 D1 C1 G1 H1
A2 B2 C2 D2 E2 F2 D2 C2 G2 H2
A3 B3 C3 D3 E3 F3 G3 D3 C3 H3
A4 B4 C4 D4 E4 F4 G4 D4 C4 H4
I want the output to be:
A1 B1 E1 F1 G1 H1
A2 B2 E2 F2 G2 H2
A3 B3 E3 F3 G3 H3
A4 B4 E4 F4 G4 H4
Any clue? Can I use awk for this?
Try:
awk '{for (i=1;i<=NF;i++) a[$i]++;for (i=1;i<=NF;i++) if (a[$i]==1) printf $i" ";printf "\n"}' file
Try:
$ awk '{delete B;for(i=1;i<=NF;i++){if($i in B){$i=$(B[$i])=x}B[$i]=i};$0=$0;$1=$1}1' file
A1 B1 E1 F1 G1 H1
A2 B2 E2 F2 G2 H2
A3 B3 E3 F3 G3 H3
A4 B4 E4 F4 G4 H4
Gr4wk
March 16, 2014, 3:59pm
4
Hi Bartus and Akhsay,
Both scripts work, but only for the first line; the rest of the lines are not handled correctly.
Maybe they need a few modifications? The fields contain strings in different formats (characters, numbers, etc.).
Yoda
March 16, 2014, 5:06pm
6
Another approach that will work for posted data:
awk '
{
    for ( i = 1; i <= NF; i++ )
    {
        n = gsub ( "\\<"$i"\\>", "&", $0 )
        if ( n > 1 )
            gsub ( "\\<"$i"\\>", X, $0 )
    }
    $1 = $1
    print $0
}
' file
Gr4wk
March 16, 2014, 5:08pm
7
Hi Scrutinizer,
It doesn't work.
This is the input format:
SEKK101 1C23.delay multiLink=0 dtx=0 sequence=1 >>> dtx=0 multiLink=0 sequence=0 >>>done.
SEKK106 1C22.delay multiLink=0 dtx=0 sequence=1 >>> dtx=0 multiLink=0 sequence=0 >>>done.
SEKK102 1C24.delay multiLink=0 dtx=0 sequence=1 >>> dtx=0 multiLink=0 sequence=0 >>>done.
SEKK101 1C20.delay multiLink=0 dtx=0 sequence=1 >>> dtx=0 multiLink=0 sequence=0 >>>done.
SEKK104 1C10.delay multiLink=0 dtx=0 sequence=1 >>> dtx=0 multiLink=0 sequence=0 >>>done.
SEKK104 1C11.delay multiLink=0 dtx=0 sequence=1 >>> dtx=0 multiLink=0 sequence=0 >>>done.
SEKK101 1C12.delay algoRithm=0 thresHold=10 upThresh=10 >>> upThresh=11 thresHold=10 algoRithm=0 >>>done.
SEKK101 1C15.delay algoRithm=0 thresHold=10 upThresh=11 >>> upThresh=11 thresHold=11 algoRithm=0 >>>done.
SEKK106 1C16.delay algoRithm=0 thresHold=10 upThresh=10 >>> upThresh=11 thresHold=10 algoRithm=0 >>>done.
SEKK106 1C17.delay algoRithm=0 thresHold=10 upThresh=11 >>> upThresh=11 thresHold=11 algoRithm=0 >>>done.
SEKK102 1C18.delay algoRithm=0 thresHold=10 upThresh=10 >>> upThresh=11 thresHold=10 algoRithm=0 >>>done.
Hi Gr4wk, I had already deleted my post since it was not foolproof anyway. Try this one instead:
awk '{for(i=1; i<NF; i++) for(j=i+1; j<=NF; j++) if($i==$j) $i=$j=x; $0=$0; $1=$1}1' file
Gr4wk
March 17, 2014, 1:13am
9
Thanks, Scrutinizer. It works!
Can you explain what the code means?
awk '{delete a; delete b; for(i = 1; i <= NF; i++) {a[i] = $i; b[$i]++}; for(i = 1; i <= length(a); i++) {if(b[$i] == 1) {printf "%s%s", a[i], FS}}; print ""}' file
Gr4wk
March 17, 2014, 1:27am
11
Good job, SriniShoo. Your code works too. Can you explain it, please?
delete a; delete b
Clears arrays a and b at the start of each line.
for(i = 1; i <= NF; i++) {a[i] = $i; b[$i]++}
Walks through the line and stores each field value in two arrays:
a - keeps the fields in their original order for printing
b - counts how often each value occurs, to check for duplicates
for(i = 1; i <= length(a); i++) {if(b[$i] == 1) {printf "%s%s", a[i], FS}}
After the line has been read, prints each value from array a whose count in array b is 1, that is, each value that has no duplicate.
printf "%s%s", a[i], FS
Prints the value followed by the field separator, to format the output.
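For readability, the same count-then-print idea can be sketched as a multi-line script. This is only an illustration: the function name drop_repeated and the array name count are made up here, and split("", count) is used as a portable way to empty the array instead of delete.

```shell
# Sketch: keep only the field values that occur exactly once on a line.
# drop_repeated is a hypothetical helper name, not something from the thread.
drop_repeated() {
    awk '
    {
        split("", count)              # empty the array before each line
        for (i = 1; i <= NF; i++)
            count[$i]++               # first pass: count every field value
        out = ""
        for (i = 1; i <= NF; i++)     # second pass: keep values counted once, in order
            if (count[$i] == 1)
                out = (out == "" ? $i : out OFS $i)
        print out
    }' "$@"
}

# Usage with the first sample line from the original question
printf '%s\n' 'A1 B1 C1 D1 E1 F1 D1 C1 G1 H1' | drop_repeated
```

With that sample line this prints A1 B1 E1 F1 G1 H1. Building the line in out avoids the trailing separator that the printf version leaves at the end of each line.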
Small addition to my old code, which I missed yesterday
$ awk '{delete B;for(i=1;i<=NF;i++){if($i in B){$i=$(B[$i])=x}B[$i]=i}$0=$0;$1=$1}1' file
SEKK101 1C23.delay sequence=1 >>> sequence=0 >>>done.
SEKK106 1C22.delay sequence=1 >>> sequence=0 >>>done.
SEKK102 1C24.delay sequence=1 >>> sequence=0 >>>done.
SEKK101 1C20.delay sequence=1 >>> sequence=0 >>>done.
SEKK104 1C10.delay sequence=1 >>> sequence=0 >>>done.
SEKK104 1C11.delay sequence=1 >>> sequence=0 >>>done.
SEKK101 1C12.delay upThresh=10 >>> upThresh=11 >>>done.
SEKK101 1C15.delay thresHold=10 >>> thresHold=11 >>>done.
SEKK106 1C16.delay upThresh=10 >>> upThresh=11 >>>done.
SEKK106 1C17.delay thresHold=10 >>> thresHold=11 >>>done.
SEKK102 1C18.delay upThresh=10 >>> upThresh=11 >>>done.
---------- Post updated at 02:48 PM ---------- Previous update was at 02:44 PM ----------
Adding delete a to bartus11's approach makes it work. Here is the modified version of bartus11's code:
$ awk '{delete a;for (i=1;i<=NF;i++) a[$i]++;for (i=1;i<=NF;i++) if (a[$i]==1) printf $i" ";printf "\n"}' file
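The reason delete a matters is that awk arrays keep their contents from one input line to the next, so without a reset the counts keep growing and a value that appears once per line, but on several lines, is dropped from the later lines. Below is a small sketch of that effect. The function names without_reset and with_reset and the two input lines are made up for illustration, and printf "%s " is used instead of printf $i" " so that a % inside a field cannot be taken as a format.

```shell
# Two hypothetical helpers that differ only in the per-line reset of array a.
without_reset() {
    awk '{
        for (i = 1; i <= NF; i++) a[$i]++
        for (i = 1; i <= NF; i++) if (a[$i] == 1) printf "%s ", $i
        printf "\n"
    }' "$@"
}

with_reset() {
    awk '{
        delete a                      # forget the counts from the previous line
        for (i = 1; i <= NF; i++) a[$i]++
        for (i = 1; i <= NF; i++) if (a[$i] == 1) printf "%s ", $i
        printf "\n"
    }' "$@"
}

# Both input lines start with the same key, like SEKK101 in the real data
printf '%s\n' 'k1 p q p' 'k1 r s r' | without_reset
printf '%s\n' 'k1 p q p' 'k1 r s r' | with_reset
```

Without the reset the second line loses k1, because its count is already 2 by then. With the reset both lines keep k1. That matches the symptom of only the first line coming out right.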
Sure:
awk '
{                                  # For every line in file "file"
  for(i=1; i<NF; i++)              # Iterate variable "i" from 1 to the number of fields minus 1
    for(j=i+1; j<=NF; j++)         # Iterate variable "j" from i+1 to the number of fields
      if($i==$j) $i=$j=x           # If two of these fields are equal, set both to "" (x is an unset variable)
  $0=$0                            # Re-split the record into fields; fields that were set to ""
                                   # disappear, so there are now fewer fields
  $1=$1                            # Rebuild the record, so that any amount of spacing between fields
                                   # is converted to the OFS, which is a single space
}
1                                  # Print the record
' file                             # Read the file "file"
Hope this helps..
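One behaviour that may be worth knowing: the pairwise comparison blanks matching fields two at a time, so a value that occurs three times on a line leaves one copy behind, while the counting approaches drop all three. The posted data never repeats a value more than twice, so it is not affected. A small sketch of the difference follows; the function names pairwise and by_count and the sample line are made up for illustration.

```shell
# pairwise: the nested-loop comparison approach from this thread
pairwise() {
    awk '{
        for (i = 1; i < NF; i++)
            for (j = i + 1; j <= NF; j++)
                if ($i == $j) $i = $j = x   # x is unset, so both fields become ""
        $0 = $0; $1 = $1
    } 1' "$@"
}

# by_count: a counting variant that keeps only values seen exactly once
by_count() {
    awk '{
        split("", n)                        # empty the counts for this line
        for (i = 1; i <= NF; i++) n[$i]++
        for (i = 1; i <= NF; i++) if (n[$i] > 1) $i = ""
        $0 = $0; $1 = $1
    } 1' "$@"
}

printf '%s\n' 'a b a c a' | pairwise
printf '%s\n' 'a b a c a' | by_count
```

For the line a b a c a, pairwise prints b c a and by_count prints b c. Which result is wanted depends on whether a value repeated three times should disappear completely.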