Need help with awk to get values from input (newbie here)

mehungry · November 18, 2011, 11:35pm

Hello everyone!

I have a small text file called processes.txt which contains a few lines in this fashion:

ID: 35; Arrival_Time: 0; Total_Exec_Time: 4;
ID: 36; Arrival_Time: 1; Total_Exec_Time: 6;

I am trying to figure out how I can get the values from between the delimiters ';' and ':' per line.

So if I want to first get the ID, would I have to write something like this? After that I want to store the ID times as variables that I can use to figure out other things.

ids='cat processes.txt'
echo $ids | awk -f ';' '{print $1}'

It's obviously incorrect (it gives me errors), which is why I'm asking here if anyone would kindly help me figure out a way to obtain those values for ID, arrival times, and total exec times.

Thanks and regards!

agama · November 18, 2011, 11:54pm

If you are looking for output like:

35 0 4
36 1 6

Then try this:

awk '{ for( i=2; i <= NF; i += 2 ) printf( "%d ", $(i) ); printf( "\n" );  }' data-file

Awk can read your input file, so no need to put it in a variable and echo the contents to awk.
This assumes that your values are always the second, fourth and sixth blank-separated fields.
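Since the original question also asked about keeping the values in variables, here is a minimal sketch of one way to do that. It assumes a POSIX shell and a POSIX-style awk (on Solaris that would be nawk or /usr/xpg4/bin/awk rather than the default awk); the temporary sample file and the names `sample`, `summary`, `id`, `arrival` and `exec_time` are made up for illustration.

```shell
# Build a two-line sample file in the format from the first post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ID: 35; Arrival_Time: 0; Total_Exec_Time: 4;
ID: 36; Arrival_Time: 1; Total_Exec_Time: 6;
EOF

# awk prints "id arrival exec" per line; adding 0 forces a numeric
# conversion, which drops the trailing semicolon. The shell loop then
# reads each triple into three variables.
summary=$(
  awk '{ print $2 + 0, $4 + 0, $6 + 0 }' "$sample" |
  while read -r id arrival exec_time; do
    echo "process $id arrives at $arrival and runs for $exec_time"
  done
)
echo "$summary"
rm -f "$sample"
```

In bash the loop on the right side of a pipe runs in a subshell, so `id`, `arrival` and `exec_time` are only usable inside the loop body; that is why the sketch does its work there and captures the result.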

mehungry · November 19, 2011, 12:02am

Ah cool ;o

Can you please explain the following to me (im noob /cry):

%d, and \n?

agama · November 19, 2011, 12:24am

Gladly, we all started at some point.

The printf() function takes a format string as its first parameter. This is used to determine the number of parameters, and their types, that follow. The manual page for the shell version does a wonderful job of explaining the details:
Man Page for printf (OpenSolaris Section 1) - The UNIX and Linux Forums

In brief, a percent sign followed by one or more characters marks the position where the next parameter is placed in the output. %d is an integer, %s is a string, %f is a floating-point decimal, etc. So, the format string "Today is %s the %d day of %s\n" would fill in the day of the week (first %s), the day of the month (%d) and the month (last %s) using the parameters. This assumes that the values are properly assigned to variables passed to the function. The following may help:

day=22;
month="May";
wday="Tue";
printf( "Today is %s the %d day of %s\n", wday, day, month );

This is a silly example, but I think it works to show how it goes.
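To actually run the example, the assignments have to sit inside an awk program. A minimal sketch, assuming a POSIX shell and any reasonably modern awk: the BEGIN block runs once without reading any input, and the names `greeting` and `shell_greeting` are only for illustration. The second command uses the same format string with the shell's own printf.

```shell
# awk version: a BEGIN block runs before (and here, without) any input.
greeting=$(awk 'BEGIN {
    day = 22; month = "May"; wday = "Tue"
    printf("Today is %s the %d day of %s\n", wday, day, month)
}')
echo "$greeting"

# Shell version: same format string, arguments separated by spaces.
shell_greeting=$(printf 'Today is %s the %d day of %s\n' Tue 22 May)
echo "$shell_greeting"
```

Both commands should print: Today is Tue the 22 day of May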

The \n represents a newline character, causing the next output written after it to be placed at the start of a new line. In the bit of code I wrote, I took advantage of the fact that, without the newline, all three values are written on the same line; a final newline is then written after the loop is done.

I also took advantage of a property of awk's printf with integers. The fields were actually of the form nn; (a number followed by a semicolon), and by using %d to print them, awk converted each string to an integer, which dropped the trailing semicolon.
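A small sketch of that conversion, assuming a POSIX-style awk (nawk, gawk and similar); the variable name `converted` is made up for illustration. It prints the second field once as a string and once as an integer:

```shell
# $2 is the text "35;". %s prints it unchanged, while %d makes awk
# convert it to a number, keeping only the leading digits.
converted=$(echo 'ID: 35; Arrival_Time: 0; Total_Exec_Time: 4;' |
  awk '{ printf("%s -> %d\n", $2, $2) }')
echo "$converted"
```

With such an awk this should print: 35; -> 35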

The printf() function is wonderful and is supported in awk, shells and C/C++. Each functions pretty much the same, but there are subtle differences, like the automatic conversion of a string to an integer by awk. Check out the manual page above, and the one for the C function, to learn more.

mehungry · November 19, 2011, 12:44am

Awesome! That was perhaps the most well done piece of information relayed to me... EVER!

In my CS class we have to learn either bash or ksh on our own =/ I know Java and C++ but I'm finding this syntax really tough lol. >.<

Just glad there's ppl like you to help newbies like me

---------- Post updated at 12:44 AM ---------- Previous update was at 12:36 AM ----------

O just a question. Is the following supposed to return 30 0 4?

awk '{ for( i=2; i <= NF; i += 2 ) printf( "%d ", $(i) ); printf( "\n" );  }' processes.txt

agama · November 19, 2011, 1:03am

Why, thanks.

For the input line

ID: 35; Arrival_Time: 0; Total_Exec_Time: 4;

it should output
35 0 4

In my testing with your two samples it did (the other was 36 1 6). I just cut and pasted the code from your post and it generates 35 and not 30. If 30 wasn't a typo, or 35 wasn't a typo in your original message, then something isn't right, but I'm not seeing what it could be.

---------- Post updated at 01:03 ---------- Previous update was at 01:00 ----------

Awk, a programming language in itself, can be confusing at first, but is wonderful. I think this is a pretty decent tutorial:

Awk - A Tutorial and Introduction - by Bruce Barnett

mehungry · November 19, 2011, 4:43pm

O, I see!

Nice links

Hmmm, when I put that code in and replace the last line with process.txt (which is the file I'm passing it), the outputs I'm getting are 0 0 0

---------- Post updated at 04:43 PM ---------- Previous update was at 01:36 AM ----------

hey im sorry to ask again, but when I type that code in, the outputs are 0 0 0 for each line.

am I doing something wrong?

agama · November 19, 2011, 7:43pm

Sorry -- Thought I posted something earlier; must have hit preview instead of submit. I suggested a small tweak to get some extra debug info out:

awk '{ 
print;   # debug output of each line
for( i=2; i <= NF; i += 2 ) 
   printf( "%d ", $(i) ); 
printf( "\n" ); 
}' processes.txt

If you could post the first 6 or 8 lines of output it'd be interesting to see what is going on.

mehungry · November 20, 2011, 3:09am

hi!

sure here it is!

bash-3.00$ awk '{ 
> print;   # debug output of each line
> for( i=2; i <= NF; i += 2 ) 
>    printf( "%d ", $(i) ); 
> printf( "\n" ); 
> }' processes.txt
ID: 35; Arrival_Time: 0; Total_Exec_Time: 4; 
0 0 0 
ID: 65; Arrival_Time: 2; Total_Exec_Time: 6; 
0 0 0 
ID: 10; Arrival_Time: 3; Total_Exec_Time: 3; 
0 0 0 
ID: 124; Arrival_Time: 5; Total_Exec_Time: 5; 
0 0 0 
ID: 182; Arrival_Time: 6; Total_Exec_Time: 2; 
0 0 0

agama · November 20, 2011, 10:40am

Ok, I'm stumped. I cut and pasted both your code and data, and the output on both GNU awk version 3.1.6 and FreeBSD awk version 20091126 is the same and is what I expected:

ID: 35; Arrival_Time: 0; Total_Exec_Time: 4; 
35 0 4 
ID: 65; Arrival_Time: 2; Total_Exec_Time: 6; 
65 2 6 
ID: 10; Arrival_Time: 3; Total_Exec_Time: 3; 
10 3 3 
ID: 124; Arrival_Time: 5; Total_Exec_Time: 5; 
124 5 5 
ID: 182; Arrival_Time: 6; Total_Exec_Time: 2; 
182 6 2

Next question... are you running on Solaris? If so, use nawk instead of awk. If not, try this (grasping at straws now):

awk '{ 
 gsub( ";", "", $0 ); 
 for( i=2; i <= NF; i += 2 ) 
    printf( "%d ", $(i) ); 
 printf( "\n" ); 
 }'  input-file

ahamed101 · November 20, 2011, 11:43am

agama's latest one with gsub should work...
Try this with a small modification...

awk '{
print;   # debug output of each line
for( i=2; i <= NF; i += 2 ) 
   printf( "%d ", $i+0 ); 
printf( "\n" ); 
}' processes.txt

BTW, everything works for me too...
--ahamed

mehungry · November 21, 2011, 12:01am

im using secure shell client to connect to my school's afs thingy

agama · November 21, 2011, 9:05pm

What is the output of these commands:

uname -a
awk --version

mehungry · November 21, 2011, 10:29pm

hey!

the output of uname -a is:

SunOS *this part has my school's server address idk if I should leave that in here lol* 5.10 Generic_142900-02 sun4u sparc SUNW,Sun-Fire-280R

nothing happens when I type awk --version

agama · November 21, 2011, 11:04pm

You are right to mark out identifying elements; the information I was looking for was there. Try running the command using nawk rather than awk:

nawk '
{ 
   gsub( ";", "", $0 );       # this might not be needed
   for( i=2; i <= NF; i += 2 ) 
       printf( "%d ", $(i)+0 ); 
   printf( "\n" ); 
 }'  input-file

For reasons unknown to me, awk on Solaris machines is very old and doesn't support many/any of the features found in modern versions of the tool. The installation of nawk seems to do better.

ahamed101 · November 21, 2011, 11:18pm

In Solaris, you can use nawk or /usr/xpg4/bin/awk
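If a script has to run on both Solaris and other systems, one option is to pick the awk binary at the start. This is only a sketch under assumptions: it expects bash or another shell that provides `command -v`, and the variable names `AWK` and `result` are made up for illustration.

```shell
# Prefer nawk, then the XPG4 awk, then whatever "awk" is on the PATH.
if command -v nawk >/dev/null 2>&1; then
    AWK=nawk
elif [ -x /usr/xpg4/bin/awk ]; then
    AWK=/usr/xpg4/bin/awk
else
    AWK=awk
fi

# Strip the semicolons, then print the 2nd, 4th and 6th fields as numbers.
result=$(echo 'ID: 35; Arrival_Time: 0; Total_Exec_Time: 4;' |
  "$AWK" '{ gsub(";", "", $0); print $2 + 0, $4 + 0, $6 + 0 }')
echo "$result"
```

With any of those awk versions the last line should print: 35 0 4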

--ahamed

mehungry · November 21, 2011, 11:24pm

oh i see =/

Ugh I'm trying so hard to understand bash but I'm not sure how to make it easy to learn. For some reason I'm finding it really difficult lol, and as I said before I have to learn bash on my own for a project due on Dec. 12 =/

I'm getting frustrated because I have no idea how to properly use most of the features of bash =/

Thanks for the replies, I'm going to be working hardcore on this starting tomorrow night. I have a final exam tomorrow :S gotta look over notes one last time then sleep