Understanding the output of fwtmp

confusedAdmin · March 1, 2012, 7:25am

Hi all,

First time post, so please be gentle.

I'm writing a Solaris 10 ksh script to retrieve details of logins and logouts using specific user names. The details I want are quite basic - the username, the computer logged in from, and the date and time the user logged in and logged off.

I initially thought that the 'last' command would be perfect; however, its output doesn't include a year in its date information, which I need.

After further searching on this site and others I came across the fwtmp command, which I can use as follows to read the information I need:

/usr/lib/acct/fwtmp < /var/adm/wtmpx > temp_ascii_login_file.txt

This works fine and returns a plain text file that includes full year information in the date, but the problem is that I don't understand all of the fields contained in the output.

As far as I can tell, if field 5 contains a 7, the entry shows a login, and if field 5 contains an 8, the entry shows a logout. Assuming this is correct, I've written the following nawk commands to extract the information I need:

 
nawk '$1 == "username" && $5 == "8" {print $1, $3, $4, $5, $12, $13, $14, $15, $16}' temp_ascii_login_file.txt > myoutput.txt
 
nawk '$1 == "username" && $5 == "7" {print $1, $3, $4, $5, $13, $14, $15, $16, $17, $12}' temp_ascii_login_file.txt >> myoutput.txt

I appear to need different commands for the logins and logouts, as the record structure seems to be a bit different for each.

I'm then sorting the file using the following command:

sort -k 3,3 -k 9,9 -k6M myoutput.txt

This seems to sort each login record chronologically, with its corresponding logout on the following line.

What I want to know is, are my assumptions about the output format of fwtmp correct? Also, will my sort command group all the results as I've outlined?
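
On the sorting question: the sort command above keys on field 3 of myoutput.txt (the PID), then field 9 (the year), then the month name, so it groups records by PID rather than strictly by time. A possible alternative is to sort the raw fwtmp lines on field 8. This is only a sketch, assuming field 8 holds the record time in seconds since the epoch (the tv_sec value of the utmpx record); the function name sorted_records is hypothetical, and awk stands in for nawk, which should be substituted on Solaris 10.

```shell
# Sketch: list one user's login (7) and logout (8) records in time order.
# Assumption: field 8 of each fwtmp line is seconds since the epoch.
sorted_records() {
    # $1 = user name to match; fwtmp ASCII output is read from stdin
    awk -v user="$1" '$1 == user && ($5 == 7 || $5 == 8)' |
        sort -k 8,8n    # numeric sort on the timestamp field only
}
```

It could be used as: /usr/lib/acct/fwtmp < /var/adm/wtmpx | sorted_records username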

I've tried to find a reference that explains in plain English what the format of the fwtmp output is, but have not been successful. The best I've found is someone advising to run the command 'man 4 utmpx', which does seem related, but this refers to a C header file (utmpx.h), and my C is a bit rusty at this stage.

If someone could respond to my concerns, and point me in the direction of an explanation of the ASCII output of the fwtmp command, I'd be grateful.

Apologies if this has been answered before, but I don't think it has. Thanks in advance for any assistance anyone can provide.

Regards,
cA.

methyl · March 1, 2012, 8:45am

Please post a small representative data sample from temp_ascii_login_file.txt in code tags. I would expect the columns to line up.

confusedAdmin · March 1, 2012, 9:43am

Hi methyl.

Thanks for the response. Here are a couple of lines from the file. These are unedited, aside from me replacing the actual log in name they contain:

username                         s/10 pts/10                               23848  7 0000 0000 1203424585 4953 0 11 10.13.57.20 Tue Feb 19 12:36:25 2008
username                         s/10 pts/10                               23848  8 0000 0000 1203425529 334490 0 0  Tue Feb 19 12:52:09 2008

I've also spaced them out a bit, numbered the fields and labeled what I think the fields mean here:

Log in line:
$1        $2    $3      $4     $5      $6    $7    $8          $9      $10  $11  $12          $13    $14    $15       $16       $17
Username  ?     Term    Pid    Action  ?     ?     ?           ?       ?    ?    N/w host     Day    Month  Date      Time      Year
username  s/10  pts/10  23848  7       0000  0000  1203424585  4953    0    11   10.13.57.20  Tue    Feb    19        12:36:25  2008
 
Log out line:                                                                                                                
$1        $2    $3      $4     $5      $6    $7    $8          $9      $10  $11  $12          $13    $14    $15       $16
Username  ?     Term    Pid    Action  ?     ?     ?           ?       ?    ?    Day          Month  Date   Time      Year
username  s/10  pts/10  23848  8       0000  0000  1203425529  334490  0    0    Tue          Feb    19     12:52:09  2008

Space delimited field $5 in both lines identifies what I think is the action recorded by the line. I think '7' corresponds to a log in, and '8' corresponds to a log out.

$12 in the log in line contains what seems to be the IP address of the computer used to log in to Solaris. This field is missing on the log out line, which shifts the remaining fields - field $13 in the log in line corresponds to field $12 in the log out line, and so on for the remainder of the fields. This is why I need separate nawk commands for the two types of records.

Am I correct in what I've stated above? Also can you advise me what the fields I've labeled as '?' refer to?

Thanks,
cA.

methyl · March 1, 2012, 10:34am

You appear to have a known buggy version of "fwtmp" from SunOS 5.10. There should be a patch from Oracle for this.

confusedAdmin · March 1, 2012, 10:35am

Ah. Unfortunately patching it isn't an option for me as I'm not an admin on that server. I don't mind having to use two separate awk commands to retrieve the fields from the file in its current format though. Aside from the fields not lining up correctly, is there any fault with the actual data this version of fwtmp outputs?

Can you please advise me if I'm correct in my interpretation of the field contents?

methyl · March 1, 2012, 6:26pm

Sorry, I don't have access to the utmpx.h file on your system. I certainly agree with your interpretation of the essential fields (including the record type field).

Anybody got the same Solaris 10 release handy who can answer the question in full?

I cannot comment properly on your circumvention because the bug depends on whether the computer identity is available or not. Therefore it may not be consistent in every record.
I'd be tempted to detect whether $12 contains an invalid day and move an "invalid" field to the end of the record (which would then conform to the "normal" layout of a fwtmp login/logout record where the client IP address or name is the last field and has variable length).
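
The idea of testing $12 for a day name and moving the host to the end could be sketched roughly as follows. This is an illustration under assumptions, not a tested fix: the function name normalise_host and the "-" placeholder for a missing host are hypothetical, records with fewer than 16 fields are passed through untouched, and awk stands in for nawk on Solaris 10. A host that happened to be named like a weekday abbreviation would be misdetected.

```shell
# Sketch: rewrite fwtmp lines so the host (or "-") is always the last field.
normalise_host() {
    # fwtmp ASCII output is read from stdin
    awk '
        NF < 16 { print; next }      # not a layout this sketch understands
        {
            host = "-"               # placeholder when no host was recorded
            first_date = 12          # date starts at $12 when the host is absent
            if ($12 !~ /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)$/) {
                host = $12           # $12 is not a day name, so treat it as the host
                first_date = 13
            }
            line = $1
            for (i = 2; i <= 11; i++) line = line " " $i
            for (i = first_date; i <= NF; i++) line = line " " $i
            print line, host         # columns are re-joined with single spaces
        }
    '
}
```

After this step every login or logout record has 17 fields, with the date in $12 to $16 and the host in $17.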

Were it not for this awful bug, I would normally split the multi-year wtmpx file into manageable chunks (years or even year-months) and use "fwtmp" in reverse to create individual archive wtmpx files with names that include the year, each of which can then be processed with "last".
Once you have done this once, you can automate the archive switchover to suit your local login/logout rate and stop the multi-year wtmpx situation from ever occurring again.
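
A rough sketch of the splitting step, working on the ASCII form and keyed on the year in the last field. The function name split_by_year and the wtmpx_YEAR.txt file names are hypothetical, awk stands in for nawk on Solaris 10, and the conversion back to binary is shown only as a comment because, as noted above, it may not behave on a fwtmp version affected by the bug.

```shell
# Sketch: split fwtmp ASCII output into one file per year.
split_by_year() {
    # $1 = output directory; fwtmp ASCII output is read from stdin
    awk -v dir="$1" '
        $NF ~ /^[0-9][0-9][0-9][0-9]$/ {     # last field looks like a year
            out = dir "/wtmpx_" $NF ".txt"   # hypothetical naming scheme
            print > out
        }
    '
}

# Each per-year file might then be converted back and read with last, e.g.:
#   /usr/lib/acct/fwtmp -ic < wtmpx_2008.txt > wtmpx_2008
#   last -f wtmpx_2008
```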

confusedAdmin · March 2, 2012, 10:30am

Thanks for the reply methyl, you've been very helpful so far. Regarding your comment that the bug depends on whether the computer identity is available or not:

My reading of it is that if Solaris can't determine the IP address of the computer a user is logging in from, it doesn't include a hostname field in the login record at all. That in turn means the login record's field numbers after the hostname are all decremented by one, which would cause my nawk command for login records to fail. Am I correct in my interpretation?

I did think I saw some inconsistent results in my output, which was one of the reasons why I started this thread in the first place. I'll have to review my results so far.

Edit: I was thinking that an easier way for me to detect log in lines that don't contain a hostname would be to simply count the number of space delimited fields nawk detects. Would that work?

Edit 2: After analysing output of various nawk commands on temp_ascii_login_file.txt I've determined that there are inconsistent formats for both Log Out and Log In lines. Please see the following:

# Log in lines:
 
$ nawk '$5 == "7" && NF == 16' temp_ascii_login_file.txt | wc -l
      71
 
$ nawk '$5 == "7" && NF == 17' temp_ascii_login_file.txt | wc -l
   13098
 
$ nawk '$5 == "7" && NF != 17 && NF != 16' temp_ascii_login_file.txt | wc -l
       0
 
# Log In records conclusion: 71 with 16 fields, 13098 with 17 fields, no other field counts.
 
# Log out lines:
 
$ nawk '$5 == "8" && NF == 17' temp_ascii_login_file.txt | wc -l
      31
$ nawk '$5 == "8" && NF == 16' temp_ascii_login_file.txt | wc -l
   13168
$ nawk '$5 == "8" && NF != 17 && NF != 16' temp_ascii_login_file.txt | wc -l
       0
 
# Log Out records conclusion: 31 with 17 fields, 13168 with 16 fields, no other field counts.

I'll have to review some more and will post again.

---------- Post updated at 03:30 PM ---------- Previous update was at 11:14 AM ----------

Further update:

I've confirmed that both the Log In and Log Out records have the same format. The difference is that most Log In lines contain the host field, but most Log Out lines omit it.

Examples:

 
Log in records:
$ nawk '$5 == "7" && NF == 17' temp_ascii_login_file.txt | head -1
username                         xxxx xxxxx                                 2321  7 0000 0000 1157043775 324268 0 11 10.4.20.154 Thu Aug 31 18:02:55 2006
$ nawk '$5 == "7" && NF == 16' temp_ascii_login_file.txt | head -1
username                         xx   xxxxxxx                               2302  7 0000 0000 1157043698 0 0 0  Thu Aug 31 18:01:38 2006
 
Log out records:
$ nawk '$5 == "8" && NF == 17' temp_ascii_login_file.txt | head -1
username                         xxxx xxxxxxx                               4895  8 0000 0000 1162944785 0 0 13 10.30.52.112 Wed Nov  8 00:13:05 2006
$ nawk '$5 == "8" && NF == 16' temp_ascii_login_file.txt | head -1
username                         xxxx xxxxx                                 2321  8 0000 0000 1157044077 227301 0 0  Thu Aug 31 18:07:57 2006

Here they are nicely lined up:

(Log in)  username                         xxxx xxxxx                                 2321  7 0000 0000 1157043775 324268 0 11 10.4.20.154  Thu Aug 31 18:02:55 2006
(Log out) username                         xxxx xxxxxxx                               4895  8 0000 0000 1162944785 0      0 13 10.30.52.112 Wed Nov  8 00:13:05 2006
 
 
(Log in)  username                         xx   xxxxxxx                               2302  7 0000 0000 1157043698 0      0 0  Thu Aug 31 18:01:38 2006
(Log out) username                         xxxx xxxxx                                 2321  8 0000 0000 1157044077 227301 0 0  Thu Aug 31 18:07:57 2006

For each pair of log in and log out records, the format does indeed appear to be the same.

I've found a way around the variable field count in my nawk command though. Regardless of the number of fields in each line, all of the date fields are at the end, so I can count the field numbers backwards from the end. This gives a consistent result for both record formats, and can be used for both the login and logout records.
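
Counting from the end could look something like the sketch below, which handles both login and logout records in one command. It assumes, as the counts above suggest, that these records have either 16 fields (no host) or 17 fields (host just before the date). The function name login_report and the "-" placeholder are hypothetical, and awk stands in for nawk on Solaris 10.

```shell
# Sketch: report user, terminal, PID, record type, host and date for one user,
# addressing the date fields relative to NF so the optional host does not matter.
login_report() {
    # $1 = user name to match; fwtmp ASCII output is read from stdin
    awk -v user="$1" '
        $1 == user && ($5 == 7 || $5 == 8) {
            # The last five fields are always: day, month, date, time, year.
            host = (NF == 17) ? $(NF-5) : "-"   # "-" when no host was recorded
            print $1, $3, $4, $5, host, $(NF-4), $(NF-3), $(NF-2), $(NF-1), $NF
        }
    '
}
```

For example: login_report username < temp_ascii_login_file.txt > myoutput.txt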