Using awk to split a string

Hey guys, I've been trying to find an answer to this, and I've been reading up on awk as much as possible, but I'm at a loss at the moment.

I'll start off by saying I'm trying to learn, so forgive me if I ask questions about your answers.

Here is what I'm trying to accomplish. I have a long string of text which I'll link below. I want to be able to go through the text and pull out certain keywords and place them and the text associated with them in a file.

Here is an example string:

013-01-10 16:21:59,911 SECURE POST Data (examplesite.com):
session%5Busername_or_email%5D=user1%40example.com&session%5Bpassword%5D=examplepass&scribe_log=&redirect_after_login=%2F&authenticity_token=4d27cf47496ec391e055eb78a6f4fa50a5a87e5b

So, what I'd like to pull out of that information is this:
examplesite.com (This is a changing variable depending on the output of its parent script.)
username_or_email (This is static.)
user1%40example.com (Changing.)
password (Static)
examplepass (Changing.)

The desired output looks like this:

Examplesite.com
Username = user1%40example.com
Password = examplepass

If you could at least point me in a direction to figure this out on my own, I'd be happy with that. If you'd rather provide the answer outright, please break it down so I can learn from it; I'd appreciate that even more.

As I said before, I'm not just looking for the simple answer of how to make it work. I'd like to learn why it works as well :).

Cheers,
-Shadow

awk -F'[ =]' '
 {
   for(i=1;i<=NF;i++)
   {
     if($i ~ /^\(/)
     {
       site=$i; gsub(/\(/,"",site); gsub(/\)/,"",site); gsub(":","",site);
     }
     if($i ~ /username/)
     {
       user=$(i+1); gsub(/\&.*/,"",user);
     }
     if($i ~ /password/)
     {
       pass=$(i+1); gsub(/\&.*/,"",pass);
     }
   }
 }
END {
   printf "%s\nUsername = %s\nPassword = %s\n", site, user, pass;
} ' filename

Any chance you could break down what each part of this does, or should I walk my happy ass on over to Google and research Awk and Gsub?

Thank you for the help bipinajith! I really do appreciate it :).

Condensing bipinajith's proposal slightly, here are some explanatory hints:

awk -F'[ =&]' '                               # field separator: a single " ", "=", or "&" splits the line into fields
     {for(i=1;i<=NF;i++)                      # loop over fields 1 to NF (NF is the number of fields)
       {if ($i ~ /^\(/)     {site=$i;         # a field starting with "(" holds the site name (in this sample; "(" is not necessarily an unambiguous marker)
                             gsub(/\(|\)|:/,"",site)}    # gsub strips "(", ")" and ":" from site
        if ($i ~ /username/) user=$(++i)      # if the field contains "username", the next field holds the actual user name
        if ($i ~ /password/) pass=$(++i)      # same for the password; ++i rather than i+1 also advances i, so the loop does not visit the value field on its next iteration
       }
     }
     END {printf "%s\nUsername = %s\nPassword = %s\n", site, user, pass;}
    ' file
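
To see what the field separator does to the data, a quick sketch like the one below prints every field together with its index. It is only an illustration: the `sample` variable holds a shortened copy of the POST line from the first post, and `show_fields` is a made-up helper name.

```shell
#!/bin/sh
# Shortened copy of the sample POST data from the question (illustration only).
sample='session%5Busername_or_email%5D=user1%40example.com&session%5Bpassword%5D=examplepass'

# show_fields is a hypothetical helper: it splits its argument on a single
# space, "=" or "&" and prints each resulting field with its index.
show_fields() {
    printf '%s\n' "$1" | awk -F'[ =&]' '{ for (i = 1; i <= NF; i++) printf "%d: %s\n", i, $i }'
}

show_fields "$sample"
```

Each keyword field is directly followed by its value field, which is why taking the field after a "username" or "password" match yields the value.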

Thank you Rudi! I really appreciate the breakdown. I'd much rather know how it works so I can try to do it myself next time :)!

Question about the code. Where can I find more information on the syntax used on these lines:

{if ($i ~ /^\(/)     {site=$i;
gsub(/\(|\)|:/,"",site)}

Specifically " /^\(/ " and the " /\(|\)|:/,"",site "

Where can I find out what those symbols represent and how to use them? Also, do you have any suggestions for good books to learn awk?

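In the gsub pattern we escape the opening round bracket ( and the closing round bracket ) with a backslash, because unescaped they are metacharacters with a special meaning (grouping) in regular expressions. In /^\(/ the ^ anchors the match to the start of the field, so it matches only fields that begin with a literal (.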

  • / Opening pattern
  • \( Escaping (
  • | Represents OR
  • \) Escaping )
  • | Represents OR
  • : No need to escape this sign.
  • / Closing pattern

I hope you understood.
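
Both patterns can also be tried on their own. This is a minimal sketch, assuming the field in question is the "(examplesite.com):" token from the sample log line; `strip_site` is a made-up name for this example.

```shell
#!/bin/sh
# strip_site is a hypothetical helper used only for this illustration.
strip_site() {
    printf '%s\n' "$1" | awk '{
        if ($0 ~ /^\(/) {            # ^ anchors at the start, \( is a literal "("
            gsub(/\(|\)|:/, "")      # delete every "(", ")" or ":" in $0
            print
        } else {
            print "no match"
        }
    }'
}

strip_site "(examplesite.com):"
strip_site "examplesite.com"
```

The first call starts with "(" and has the brackets and colon stripped. The second call does not start with "(", so the first pattern does not match.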

Thank you! That's exactly what I was looking for! I'll go read up on awk's metacharacters.

Just wanted to thank you guys for the help. I read up a bit, and I now understand what the regex patterns that you were using mean, and how to use them.

I edited your script a bit, so that I can use it to parse information that I pass it from the command line.

Thank you again for getting me kick started on using Awk!

#!/bin/bash
awk -F'[ =&]' '
   {
      for(i=1;i<=NF;i++)
      {
         if ($i ~ /^\(/) {site=$i; gsub (/\(|\)|:/,"",site); print "-=-=-=-=-=-=-=-\nSite: "site}
         if ($i ~ /username/) {user=$(++i); print "Username: "user }
         if ($i ~ /password/) {pass=$(++i); print "Password: "pass "\n-=-=-=-=-=-=-=-" }
      }
   }'

Then this should work as well:

.
.
.
         if ($i ~ /username/) print "Username: "$(++i)
         if ($i ~ /password/) print "Password: "$(++i)"\n-=-=-=-=-=-=-=-"
.
.
.

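Be careful: the order in which the keywords appear in the input is not guaranteed, and this version prints each value as soon as it is found. If the output order matters, use the END version, which stores the values first.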
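
One way to get a fixed output order while still handling more than one log entry is to store the values and print them only when the next entry starts, or at the end of the input. This is only a sketch: it assumes every entry begins with a header line that has the site in parentheses, as in the sample, and the names `parse_posts` and `emit` are made up for this example.

```shell
#!/bin/sh
# parse_posts is a hypothetical name; it reads log text on standard input.
parse_posts() {
    awk -F'[ =&]' '
        function emit() {                 # print one stored entry in a fixed order
            if (site != "")
                printf "%s\nUsername = %s\nPassword = %s\n", site, user, pass
            site = user = pass = ""       # reset for the next entry
        }
        {
            for (i = 1; i <= NF; i++) {
                # Assumption: a field starting with "(" begins a new entry.
                if ($i ~ /^\(/)           { emit(); site = $i; gsub(/\(|\)|:/, "", site) }
                else if ($i ~ /username/) user = $(++i)
                else if ($i ~ /password/) pass = $(++i)
            }
        }
        END { emit() }                    # print the last entry
    '
}

# Usage with the sample from the first post (POST data shortened).
parse_posts <<'EOF'
013-01-10 16:21:59,911 SECURE POST Data (examplesite.com):
session%5Busername_or_email%5D=user1%40example.com&session%5Bpassword%5D=examplepass&scribe_log=
EOF
```

Because the values are stored first, the username line is printed before the password line even if the password comes first in the input.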