Help building a variable string from a keyword - character replacements!

Hello scripting geniusii! I come to kneel before the altar of your wisdom!

I am looking to take a keyword and replace characters within that keyword and add them to a string variable. I would like this to only go through however many characters the word has, which may vary in size.

Example word - dominos:

$ echo "dominos" |  sed 's/./_/1; s/./_/2'
__minos
$ echo "dominos" |  sed 's/./_/1; s/./_/3'
_o_inos
$ echo "dominos" |  sed 's/./_/1; s/./_/4'
_om_nos
$ echo "dominos" |  sed 's/./_/1; s/./_/5'
_omi_os
$ echo "dominos" |  sed 's/./_/1; s/./_/6'
_omin_s
$ echo "dominos" |  sed 's/./_/1; s/./_/7'
_omino_
$ echo "dominos" |  sed 's/./_/2; s/./_/3'
d__inos
$ echo "dominos" |  sed 's/./_/2; s/./_/4'
d_m_nos
$ echo "dominos" |  sed 's/./_/2; s/./_/5'
d_mi_os
$ echo "dominos" |  sed 's/./_/2; s/./_/6'
d_min_s
$ echo "dominos" |  sed 's/./_/2; s/./_/7'
d_mino_
   ... 

In the end state, I would like it to look like this:

KEYWORD="__minos _o_inos _om_nos _omi_os _omin_s _omino_ d__inos d_m_nos d_mi_os d_min_s ... "

I am doing this today manually, creating my "KEYWORD" variable, but I'm trying to tidy this all up and make my script pull the current working directory and use that as the keyword variable base, so that my script can be standardized over several different keywords I'm working with.

Hope this makes some sense, and Thanks so much in advance for your thoughts!

Dave aka Ghan

What OS are you on?
Do you have gawk?
Here's something to start with: echo 'dominos' | gawk -f gh.awk
where gh.awk is:

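{
  len=length($0)
  # gensub() is gawk-specific; its third argument picks which match to replace
  for (i=1;i<=len;i++)
    for(j=i+1;j<=len;j++)
      # the inner call blanks the j-th character, the outer call the i-th
      print gensub(".", "_", i, gensub(".", "_", j, $0))
}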

I'm just using a little utility Ubuntu box. Nothing special!

Thanks!

Dave aka Ghan

Ok, thanks!

Another approach using substr:

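awk '
{
  len = length($0)
  printf "KEYWORD=\""
  # i and j are the 0-based offsets of the two replaced characters (i < j).
  # awk strings are 1-based, so the head is taken with substr($0, 1, i);
  # a start index of 0 behaves differently from one awk to another.
  for (i = 0; i <= len - 2; i++)
    for (j = i + 1; j <= len - 1; j++)
      printf "%s%s", \
          substr($0, 1, i) "_"  \
          substr($0, i + 2, j - i - 1) "_"  \
          substr($0, j + 2),  \
          (i == len - 2 ? "\"" : " ")
  print ""
}'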
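To see how the three substr pieces line up, it may help to trace a single pair by hand. Below is a rough bash sketch of the same slicing, using 0-based offsets like the loop variables above; `slice_pair` is a made-up name for illustration only.

```shell
#!/bin/bash
# slice_pair WORD I J: blank out the characters at 0-based offsets I < J,
# mirroring the three pieces of the awk version: head, middle, tail.
slice_pair() {
  local w=$1 i=$2 j=$3
  local head=${w:0:i}           # the i characters before the first "_"
  local mid=${w:i+1:j-i-1}      # the characters between the two "_"
  local tail=${w:j+1}           # everything after the second "_"
  printf '%s_%s_%s\n' "$head" "$mid" "$tail"
}

slice_pair dominos 1 4   # d_mi_os
slice_pair dominos 0 1   # __minos
```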

Wow, that's amazing, exactly what I was after. Since my skills are so feeble, could I trouble you for one more enhancement? In that same KEYWORD string, I'd like to do a single-character replacement as well, with a single % marching through, so it would look like this:

KEYWORD="%ominos d%minos do%inos dom%nos domi%os domin%s domino% __minos _o_inos _om_nos _omi_os _omin_s _omino_ d__inos d_m_nos d_mi_os d_min_s d_mino_ do__nos do_i_os do_in_s do_ino_ dom__os dom_n_s dom_no_ domi__s domi_o_ domin__"

I spent a solid hour tinkering with the awk, but I'm just not skilled enough to avoid shattering the loop. HAHA!

Thanks again for your efforts, it's amazing to see how much I don't know!

Dave aka Ghan

word=dominos

KEYWORD=$(echo "$word" | awk ' {
  n = length($0)
  # single replacement: b marches through every position
  for (i=1; i<=n; i++)
    printf "%s", substr($0, 1, i-1) b substr($0, i+1) FS
  # double replacement: c at positions n-j+1 and n-j+1+i
  for (j=n; j>1; j--)
   for (i=1; i<j; i++)
    printf "%s", substr($0, 1, n - j) c substr($0, n - j + 2, i-1) c substr($0, i + n - j + 2) FS
  print ""
}' b="%" c="_"
)

echo "$KEYWORD"
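For comparison, the same ordering (all single-% variants first, then all pairs of _) can be sketched in plain bash. This is only an illustration: `make_keyword` is a made-up function name, and taking the word from the last component of `$PWD` is an assumption about how the directories are laid out.

```shell
#!/bin/bash
# make_keyword WORD: print every single-"%" variant, then every
# two-"_" variant, space-separated on one line.
make_keyword() {
  local w=$1 n=${#1} i j out=()
  # one "%" marching through each position
  for ((i = 0; i < n; i++)); do
    out+=("${w:0:i}%${w:i+1}")
  done
  # two "_" at every pair of 0-based offsets i < j
  for ((i = 0; i < n - 1; i++)); do
    for ((j = i + 1; j < n; j++)); do
      out+=("${w:0:i}_${w:i+1:j-i-1}_${w:j+1}")
    done
  done
  printf '%s\n' "${out[*]}"   # joins on the first character of IFS (a space)
}

# Assumption: the keyword is the name of the current working directory.
KEYWORD=$(make_keyword "${PWD##*/}")
echo "$KEYWORD"
```

Calling `make_keyword dominos` directly should give the 28 variants in the order requested above.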

Here is one with bash:

#!/bin/bash
word=$1
len=${#1}
KEYWORD=$(
for ((i=1; i<=len; i++))
do
  for ((j=i+1; j<=len; j++))
  do
    # the following uses sed
    # echo "$word" | sed "s/./_/$i; s/./_/$j"
    # the following uses variable modifiers
    wrd=${word:0:i-1}_${word:i}
    echo "${wrd:0:j-1}_${wrd:j}"
  done
done
)
echo "$KEYWORD"
echo "in one row:"
(
  set -f
  echo $KEYWORD
)

#!/bin/bash
sed -rn '
:3
s/\w\B/&./; T
:1
s/([.;]\w)(\B|$)/&,/; h
s/[; ]//g; s/\w\.|\w,/_/g; T2; p; g
s/,\b/;/; t1
:2
s/\./ /; s/[;,]//g; h; t3' <<<"domino"

Hello nezabudka,

Thanks a TON for providing nice solutions. Could you please add an explanation to your post as well, so that newcomers and the OP can understand it?
It also makes the post complete (solution + explanation of the solution = GREAT answer).

Keep up the GREAT work.

Thanks,
R. Singh


The algorithm of the program is as follows.
In the main loop, at label :3, we walk through the string letter by letter,
placing a dot after the next letter on each pass

sed -rn ':3;s/\w\B/&./; h; s/\w\./_./p;g; s/\./,/; t3' <<<"domino"

_.omino
d,_.mino
d,o,_.ino
d,o,m,_.no
d,o,m,i,_.o
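The dot advances because `\w\B` (GNU sed) only matches a letter that is directly followed by another letter, so letters already followed by a comma are skipped. A small sketch to check this in isolation; `mark_next` is a made-up helper name, not part of the script.

```shell
#!/bin/bash
# mark_next: put a dot after the first word character that is
# immediately followed by another word character (GNU sed: \w and \B).
mark_next() { sed -r 's/\w\B/&./' <<<"$1"; }

mark_next "domino"        # d.omino
mark_next "d,omino"       # d,o.mino
mark_next "d,o,m,i,n,o"   # unchanged: no letter is followed by a letter
```

The last call is the case where the substitution fails, which is what ends the loop.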

Next, create an inner loop that iterates over the second letter of the pair. A semicolon serves as the secondary mark

sed -rn 's/\w\B/&./; :1; s/([.;]\w)(\B|$)/&,/;h;s/\w\.|\w,/_/g;T;p;g;s/,\b/;/; t1' <<<"domino"

__mino
_o;_ino
_o;m;_no
_o;m;i;_o
_o;m;i;n;_
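The underscores come from the masking step `s/\w\.|\w,/_/g`: the letter in front of the dot and the letter in front of the comma are each replaced, marks included. A minimal sketch of just that step, assuming GNU sed; `mask` is a made-up helper name and the inputs are the intermediate pattern-space values of the first two passes.

```shell
#!/bin/bash
# mask: replace "letter + dot" and "letter + comma" with "_"
# (the two currently selected letters of the pair).
mask() { sed -r 's/\w\.|\w,/_/g' <<<"$1"; }

mask "d.o,mino"     # __mino
mask "d.o;m,ino"    # _o;_ino
```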

In the outer loop, a space serves as the secondary mark. We then nest the inner loop inside the outer loop.

sed -rn ':3; s/[a-z]\B/&./;T; :1; s/([.;]\w)(\B|$)/&,/;h;s/\w\.|\w,/_/g;T2;p;g;s/,\b/;/; t1;:2; s/\./ /; s/[;,]//g;h; t3' <<<"domino"

__mino
_o;_ino
_o;m;_no
_o;m;i;_o
_o;m;i;n;_
d __ino
d _m;_no
d _m;i;_o
d _m;i;n;_
d o __no
d o _i;_o
d o _i;n;_
d o m __o
d o m _n;_
d o m i __

All that remains is to delete the marks and lay the script out in a readable style

#!/bin/bash
sed -rn '
:3
s/\w\B/&./; T
:1
s/([.;]\w)(\B|$)/&,/; h
s/[; ]//g; s/\w\.|\w,/_/g; T2; p; g
s/,\b/;/; t1
:2
s/\./ /; s/[;,]//g; h; t3' <<<"domino"

__mino
_o_ino
_om_no
_omi_o
_omin_
d__ino
d_m_no
d_mi_o
d_min_
do__no
do_i_o
do_in_
dom__o
dom_n_
domi__

I apologize for not being able to comment in more detail; I just can't find the right words.
