I need to count consecutive identical characters in a string and produce the output shown below:
i/p= aaaabbcaa
o/p= a4b2c1a2
Any attempts / ideas / thoughts from your side?
Tried something like this, but I guess this will work only for first identical characters (a) and will not work for b.
i/p=aaaabbcaa
length=${#i/p}
for (i=0;i<$length;i++)
do
tmp=""
character=${ip:"$i"}
if [[ $character != $tmp ]]
then
o/p=$character
tmp=$character
else
o/p=$character$i
fi
done
Hmmm - there are quite a few syntax errors in your script (assuming your (unmentioned) shell is Bourne-type, e.g. bash or ksh), and obviously logic error(s) as well. Would you mind using an awk solution?
awk '
{LAST = $1
CNT = 0
for (i=1; i<=NF; i++) {if ($i == LAST) CNT++
else {printf "%s%d", LAST, CNT
CNT = 1
}
LAST = $i
}
printf "%s%d\n", LAST, CNT
}
' FS="" file
a4b2c1a2
This works provided your awk version allows a zero-length field separator, so that every single character in the input line becomes a field of its own.
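If your awk does not accept an empty FS (as far as I know, POSIX leaves that case unspecified), a rough sketch along the same lines can walk the line with substr() instead of relying on field splitting. The function name rle_awk is made up for illustration only:

```shell
# rle_awk: hypothetical helper name, not part of any posted solution.
# Reads lines on stdin and prints each run as <char><count>.
rle_awk() {
  awk '
  {
    out = ""
    n = length($0)
    if (n > 0) {
      last = substr($0, 1, 1)        # first character starts the first run
      cnt = 1
      for (i = 2; i <= n; i++) {
        c = substr($0, i, 1)
        if (c == last) cnt++         # same run, keep counting
        else { out = out last cnt; last = c; cnt = 1 }
      }
      out = out last cnt             # flush the final run
    }
    print out
  }'
}

printf '%s\n' aaaabbcaa | rle_awk
```

Because it only uses substr() and string comparison, this should behave the same in any awk, and regex characters in the input are treated as ordinary text.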
echo aaaabbcaa | fold -w 1 | uniq -c | awk '{l=l$2$1} END {print l}'
Here is a bash solution.
Note that a slash (/) is not allowed as part of a variable name, so I renamed i/p and o/p to ip and op respectively.
ip=aaaabbcaa
for((i=0; i<${#ip}; i++))
do
((found++))
character=${ip:i:1}
nextchar=${ip:i+1:1}
if [[ $character != "$nextchar" ]]
then
op=$op$character$found
found=0
fi
done
echo "$op"
ip="aaaabbcaa"
op=$(echo "$ip" | awk '{while (/./) {c=substr($0, 1, 1); match($0, c "*", a); printf c RLENGTH; sub(a[0], "")}}')
echo "$op"
Another one:
echo aaaabbcaa | sed 's/\(.\)\1*/& /g' | awk '{for(i=1; i<=NF; i++) $i=substr($i,1,1) length($i)}1' OFS=
Nice idea.
However, I would avoid using match() and sub(), as these interpret regex characters. A [ in the input causes a fatal "Unmatched [" error, and ? or . characters cause incorrect output.
$ ip="aaa[bbb"
$ op=$(echo "$ip" | awk '{while (/./) {c=substr($0, 1, 1); match($0, c "*", a); printf c RLENGTH; sub(a[0], "")}}')
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: Unmatched [, [^, [:, [., or [=: /[*/
$ ip="aaa???bbb"
$ op=$(echo "$ip" | awk '{while (/./) {c=substr($0, 1, 1); match($0, c "*", a); printf c RLENGTH; sub(a[0], "")}}')
$ echo $op
a3?3?2?1b3
$ ip="aaa.bbb"
$ op=$(echo "$ip" | awk '{while (/./) {c=substr($0, 1, 1); match($0, c "*", a); printf c RLENGTH; sub(a[0], "")}}')
$ echo "$op"
a3.4
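One way around this is to avoid pattern matching altogether and compare characters as plain quoted strings. Below is a minimal sketch in POSIX sh, assuming a single-line input; the function name rle_sh and its variable names are invented for this example. The same caution applies to bash's [[ ... != $var ]], where an unquoted right-hand side is treated as a pattern.

```shell
# rle_sh: hypothetical helper name. Variables are global, since POSIX sh has no "local".
rle_sh() {
  rest=$1
  out=
  last=
  cnt=0
  while [ -n "$rest" ]; do
    tail=${rest#?}            # everything after the first character
    c=${rest%"$tail"}         # the first character; quoting keeps $tail literal
    if [ "$c" = "$last" ]; then
      cnt=$((cnt + 1))        # same run, keep counting
    else
      [ "$cnt" -gt 0 ] && out=$out$last$cnt   # flush the previous run, if any
      last=$c
      cnt=1
    fi
    rest=$tail
  done
  [ "$cnt" -gt 0 ] && out=$out$last$cnt       # flush the final run
  printf '%s\n' "$out"
}

rle_sh 'aaa[bbb'
rle_sh 'aaa???bbb'
rle_sh 'aaa.bbb'
```

With the three problem inputs above, this should print a3[1b3, a3?3b3 and a3.1b3.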
ip=aaaabbcaa
op=
for s in `echo $ip |sed -r 's/((\w)\2*)/\1 /g'`; do op=$op${s:0:1}${#s}; done
# op= a4b2c1a2
Nice use of backreferences!
Here is a POSIX variant:
#!/bin/sh
ip=aaaabbcaa
op=
for s in `echo "$ip" | sed 's/\(\(.\)\2*\)/\1 /g'`
do
del=${s#?}
op=$op${s%"$del"}${#s}
done
echo "$op"
And a variant of Chubler's post#6:
#!/bin/bash
ip=aaaabbcaa
len=${#ip}
lchar=${ip:0:1}
for((i=1; i<=$len; i++))
do
((found++))
char=${ip:i:1}
if [[ $char != "$lchar" ]]
then
op=$op$lchar$found
found=0
lchar=$char
fi
done
echo "$op"