Complex string operation (awk, sed, other?)

I have a file that contains RewriteRules for 200 countries (2 examples for 1 country below):

RewriteRule ^/at(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=de_AT [R=301,L]

#&

RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=en_AT [R=301,L]

I have another list of redirects for the mobile versions of these sites in the following format:

RewriteRule ^/at_engilsh(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_engilsh [R=301,L]

Bear in mind that at_english is just one of the country codes; there are many more.

So my goal is to go from

RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=en_AT [R=301,L]

#to

RewriteRule ^/at_engilsh(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_engilsh [R=301,L]

I'm supplying the awk / pseudo code for one way I've thought to do it.

awk '
{
newurl="m.website.com/www.website.com/"
one=substr($0,1,14)
two=substr($1,13,37)
rest=substr($4,1)

# The line below this comment is the section I'm having difficulty with because 
#I have country codes in multiple formats at / at_engilsh / at_french
#I want to select all characters between ^/ ---> (  
code=substr($2,1) 
     

printf ("%s%s%s%s%s %s\n", one,code,two,newurl,code, rest)
}' input

So I either need help converting the pseudo code into actual code, or suggestions on a better way to do this operation.

Thank you for any help

The term rewrite rules, to me, says sendmail.cf. They have a special syntax and nature, and their placement and exact construction depend on the version of sendmail you are writing for. Rewrite rules keep being applied until they no longer change the entity, so sometimes you have to change a+ to b and then b to a, because a is in a+.

Google helps me see you might be more likely talking about Apache URL rewriting. I wonder if there is an Apache forum?

Most of us write in our head in pseudo-code, not awk, and then translate it into the desired language.

It looks like you are short a slash in the example. If your objective is to use m., then you need to either not use HTTP_HOST or, if it is a domain name, just prefix it with 'm.'.
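To make that suggestion concrete, here is a rough sketch of the first option: building the target without %{HTTP_HOST} at all. The function name drop_http_host is made up for illustration, and whether the redirect should really point at http://m.website.com/www.website.com/... is an assumption based on the examples in the question, not something confirmed in this thread.

```shell
# Hypothetical helper: reads RewriteRule lines on stdin and writes rules
# whose target is a fixed mobile host instead of %{HTTP_HOST}.
drop_http_host() {
    awk '{
        code = $2
        sub(/^\^\//, "", code)   # strip the leading "^/"
        sub(/\(.*/, "", code)    # strip everything from the first "(" on
        printf("%s %s http://m.website.com/www.website.com/%s %s\n",
               $1, $2, code, $4)
    }'
}

printf '%s\n' 'RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=en_AT [R=301,L]' | drop_http_host
```

If the rules already behave as intended with %{HTTP_HOST} in place, this variant is not needed.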


What is "at_engilsh"?

I'm not sure I understand what you're trying to do either, but I think the following awk script does what your examples seem to request.

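awk 'BEGIN { newurl = "m.website.com/www.website.com/" }
{       # Find everything in $2 up to and including the first "(".
        match($2, /[^(]*[(]/)
        # Drop the leading "^/" and the trailing "(" to get the country code.
        code = substr($2, 3, RLENGTH - 3)
        # Find everything in $3 up to and including the first "}".
        match($3, /[^}]*}/)
        printf("%s %s %s%s%s %s\n",
                $1, $2, substr($3, 1, RLENGTH), newurl, code, $4)
}' input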

With the following in the file named input:

RewriteRule ^/at(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=de_AT [R=301,L] 
RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=en_AT [R=301,L]
RewriteRule ^/at_engilsh(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_engilsh [R=301,L]
RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=en_AT [R=301,L]
RewriteRule ^/at_french(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=fr_AT [R=301,L]

the output produced is:

RewriteRule ^/at(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at [R=301,L]
RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_english [R=301,L]
RewriteRule ^/at_engilsh(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_engilsh [R=301,L]
RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_english [R=301,L]
RewriteRule ^/at_french(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_french [R=301,L]
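For anyone wondering how the country code gets picked out of the second field (the part the original pseudo code was stuck on), the following sketch isolates just that step. The function name extract_code is hypothetical; the match() and substr() calls are the same ones used in the script above.

```shell
# Hypothetical helper: prints only the country code from each RewriteRule line.
extract_code() {
    awk '{
        # match() sets RLENGTH to the length of the text matched in $2,
        # here everything up to and including the first "(".
        if (match($2, /[^(]*[(]/))
            # Skip the 2-character "^/" prefix and leave off the final "(".
            print substr($2, 3, RLENGTH - 3)
    }'
}

printf '%s\n' 'RewriteRule ^/at_french(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=fr_AT [R=301,L]' | extract_code
```

For the at_french line the match is "^/at_french(", 12 characters long, so substr() starts at character 3 and takes 9 characters, which gives at_french.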

As always, if you're running on a Solaris/SunOS system, use /usr/xpg4/bin/awk or nawk instead of awk.
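Since the thread title also asks about sed, here is a possible sed equivalent, offered as a sketch rather than a tested drop-in replacement. It assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed do), and the function name to_mobile is made up for this example. It uses # as the s delimiter because the rules themselves contain / and |.

```shell
# Hypothetical helper: reads RewriteRule lines on stdin, writes mobile versions.
# Group 1 is everything through "%{HTTP_HOST}", group 2 is the country code;
# the rest of the old target (up to the next space) is replaced.
to_mobile() {
    sed -E 's#^(RewriteRule \^/([^(]*)\([^ ]* http://%\{HTTP_HOST\})[^ ]*#\1m.website.com/www.website.com/\2#'
}

printf '%s\n' 'RewriteRule ^/at_french(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=fr_AT [R=301,L]' | to_mobile
```

Lines that do not match the pattern pass through unchanged, so it is worth diffing the result against the original file before using it.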


Sorry, the at_engilsh was a typo. And it is an Apache rewrite rule, though my question really doesn't pertain to the rewrite rule at all. The rules function just fine; the question is just about taking one form of the string and using awk to transform it into the other.

---------- Post updated at 04:46 PM ---------- Previous update was at 04:43 PM ----------

Yes, this is exactly what I needed. Thank you for your assistance, and apologies for my less than stellar description of what I was looking for!

We have our ways of extracting requirements from the reticent! :smiley: