Get lines in 5 seconds

Lestat · June 15, 2005, 12:15pm

Hello everybody, how can I find out how many lines were written to a file in the last 5 seconds?

For example, I have 'file1', which is filled automatically by a process, and I need to know how many lines containing the word 'EXACTO' were added in the last 5 seconds. Can somebody help me?

I tried:

tail -f file1 | grep EXACTO > file2

but it just copies the latest lines to 'file2', not only those from the last 5 seconds. Does that make sense? Please help.

Lestat

pixelbeat · June 15, 2005, 12:55pm

tail -f -s5 ?

Just_Ice · June 15, 2005, 1:27pm

try ...

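currcnt=0
oldcnt=0
while true
do
    # number of complete lines currently in file1
    currcnt=$(wc -l file1 | awk '{print $1}')
    if [ "$currcnt" -ne "$oldcnt" ]
    then
        # grep only the lines added since the last pass (oldcnt+1 through currcnt)
        sed -n "$((oldcnt + 1)),${currcnt}p" file1 | grep "EXACTO" >> file2
        oldcnt=$currcnt
    fi
    sleep 5
done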

Lestat · June 15, 2005, 1:28pm

pixelbeat, it still doesn't work. 'tail -f -s5 file1' runs, but it is not limited to the last 5 seconds; it behaves just like 'tail -f file1'.

jim_mcnamara · June 15, 2005, 1:44pm

File writes do not always work the way you think they do.

Unless the process writing the file calls fflush() for every line or is using aio calls, its output accumulates in a memory buffer for a while and is then written to the file a bunch at a time. The bunch of stuff that gets written may end somewhere in the middle of a line.

What this means is that you could wait for 20 seconds while nothing is written to the file. Then, during the 21st second, 8192 bytes of data are written to the file.

Just_Ice · June 15, 2005, 1:46pm

but that doesn't preclude the op from checking every 5 seconds if he wished ...

Lestat · June 15, 2005, 1:53pm

At first I tried getting the current number of lines, then the number of lines 5 seconds later, and taking the difference between them. But 'file1' has thousands of lines and 'wc -l file1' takes too much time...

For Jim Mcnamara:
file1 is filled in real time, so I don't have a problem like that (I think).

Just_Ice · June 15, 2005, 4:01pm

... just curious --- what exact command line did you use to do the "wc -l"?

Lestat · June 15, 2005, 4:16pm

The script is:


wc -l file1 | grep EXACTO > my_file
sleep 5
wc -l file1 | grep EXACTO >> my_file

So in my_file I get the line counts; the difference between them is what I need, but it takes too much time.

Any other ideas?

STiVo · June 15, 2005, 4:28pm

Do you have control of the process writing these logfiles? If so you could just timestamp the log entries....

Or you could write a script that marks the logfile every 5 seconds by adding a line that's simple to search for and contains the time...

<------- 06/15/2005 15:27:35 --------->
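If timestamping the entries is an option, counting the recent ones becomes a simple filter. A minimal sketch, assuming (hypothetically) that every log line starts with the time in epoch seconds; the name count_recent and that line format are illustrative only and are not taken from the real log:

```shell
#!/bin/sh
# count_recent FILE PATTERN WINDOW NOW   (hypothetical helper)
# Assumes each line starts with an epoch-seconds timestamp, e.g.
#   1118852855 EXACTO something
# Prints how many lines newer than NOW - WINDOW contain PATTERN.
count_recent() {
    awk -v pat="$2" -v win="$3" -v now="$4" '
        $1 > now - win && index($0, pat) > 0 { n++ }
        END { print n + 0 }
    ' "$1"
}
```

It could be called as count_recent file1 EXACTO 5 "$(date +%s)" on systems whose date supports +%s. Note that it still reads the whole file on every call, so on a large log it would need to be combined with something that skips the old part.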

Lestat · June 15, 2005, 4:36pm

The problem is that I cannot alter that file because it is used to generate statistics.

Just_Ice · June 15, 2005, 6:45pm

lestat:

The script is:
wc -l file1 | grep EXACTO > my_file
sleep 5
wc -l file1 | grep EXACTO >> my_file
So in my_file I get the line counts; the difference between them is what I need, but it takes too much time.

Any other ideas?

the problem here is that you are re-reading the file to grep out EXACTO after you read it the first time to count how many lines it has ... if the first read takes 3 seconds, the second read will probably take just as long if not longer even though you really only need to read it once ...

... just grep out EXACTO from file1 then do the rest of the code i put in ... try it out and let us know if it helps or not ...

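# snapshot the current line count, then grep the lines that already exist
oldcnt=$(wc -l file1 | awk '{print $1}')
head -n "$oldcnt" file1 | grep "EXACTO" >> file2
while true
do
    currcnt=$(wc -l file1 | awk '{print $1}')
    if [ "$currcnt" -ne "$oldcnt" ]
    then
        # grep only the lines added since the last pass (oldcnt+1 through currcnt)
        sed -n "$((oldcnt + 1)),${currcnt}p" file1 | grep "EXACTO" >> file2
        oldcnt=$currcnt
    fi
    sleep 5
done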

Lestat · June 20, 2005, 12:28pm

It works, but it is still slow... Isn't there a way to do it with 'tail whatever'?

odashe · June 20, 2005, 2:27pm

I did not quite understand how free you are to interfere with the file.
If you cannot catch it when it is written, maybe you could pipe the output of "tail -f file1" to a C program that would sleep for 5 secs, then print a timestamp and all the lines it can read, and sleep again? There may still be problems with the timing mentioned by Jim M., but you may get some idea about what's going on.

odashe · June 20, 2005, 2:30pm

Sorry, in my previous post I did not mention that you have to read without waiting for the operation to complete (i.e. with non-blocking reads).
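A rough shell-only approximation of this idea, as a sketch rather than a finished solution: instead of a C program doing non-blocking reads, let tail follow the file for a fixed window and count what arrived. It assumes the timeout command from GNU coreutils is available (it may not be on older systems), and count_window is a hypothetical name:

```shell
#!/bin/sh
# count_window FILE PATTERN SECONDS   (hypothetical helper)
# Follows FILE from its current end for SECONDS seconds and prints how many
# of the lines appended during that window contain PATTERN.
count_window() {
    # tail -n 0 -f starts at the end of the file, so old lines are never re-read.
    # timeout stops tail after the window; grep -c then sees end-of-input
    # and prints its count.
    timeout "$3" tail -n 0 -f "$1" | grep -c -- "$2"
}
```

Calling count_window file1 EXACTO 5 in a loop gives one count per 5-second window. Lines written in the short gap while tail is being restarted can be missed, and the buffering issue mentioned by Jim M. still applies.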

jim_mcnamara · June 20, 2005, 2:51pm

Try this kind of approach:

create a shell script like this (/tmp/filename is the example file name)

#!/bin/ksh
# test.sh

old="0"
while true
do
     # ksh runs the last stage of a pipeline in the current shell, so read sets old and lines here
     last_written /tmp/filename EXACTO "$old" | read old lines
     if [ "$old" -lt 0 ] ; then
         echo "Error opening file"
         exit 1
     fi
     echo "$lines lines found `date +%c` "
     sleep 5
done

compile this:

/******************************************************************
*
* last_written.c  
*  usage: last_written <filename> <search string> <last byte>
*                      
*                      
*  output  <total bytes searched> <total new lines with search string>
*  assumes:
*           searchstr occurs 0 or 1 times in a line only
*           filename is a carriage control file
*  prints -1 -1 to stdout on failure 
*******************************************************************/

#include <sys/types.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>      /* errno, EINTR */
#include <fcntl.h>
#include <unistd.h>     /* read, close */
#include <sys/stat.h>

void usage(void)
{
    fprintf(stdout,"-1 -1 \n");
    fprintf(stderr," usage: last_written <filename> <search string> <last byte>	\n");
    exit(EXIT_FAILURE);
}

/* file close routine */
void closeit(int fd)
{
    if (close(fd)== (-1))
    {
    	fprintf(stdout,"-1 -1 \n");    	
        perror("Error closing input file");
        exit(EXIT_FAILURE);
    }	
}

/* count occurrences of searchstr in a buffer */
int find(char *buf, char *searchstr)
{
    int result = 0;
    char *p=NULL;
    
    p=strstr(buf,searchstr);
    while(p!=NULL)
    {        
        result++;
        p++;
        p=strchr(p,'\n');
        if(p!=NULL)
        {
        	p=strstr(p,searchstr);
        }
    }
    return result;
}

/* read nbytes from a file */
ssize_t readall(int fd, void *buf, size_t nbyte)
{
    ssize_t nread = 0, n=0; 

    do 
    {
        if ((n = read(fd, &((char *)buf)[nread], nbyte - nread)) == -1) 
        {
            if (errno == EINTR)
            {
                continue;
            }
            else
            {
                return (-1);
            }
        }
        if (n == 0)
        {
            return nread;
        }
        nread += n;
    } while (nread < nbyte);
    return nread;
}


int main(int argc, char *argv[])
{
    int fd=0;                      /* file descriptor */
    struct stat st;                /* stat structure for file size */  
    char *p=NULL;                  /* pointer into the file */
    char *buf=NULL;                /* holds file */
    ssize_t bytes=0;               /* bytes read from the file */
    long file_pos=0;               /* starting offset into file */
    long lines=0;                  /* number of new lines to search */
    long result=0;                 /* number of lines with searchstr */
    
    if(argc!=4) usage();          /* exit on bad parameters */
    
    fd=open(argv[1],O_RDONLY);
    if( fd< 0 || fstat(fd,&st) == (-1) )
    {
        fprintf(stdout,"-1 -1\n");  /* -1 is error */
        perror("Error opening input file");
        exit(EXIT_FAILURE);
    }    
    if(st.st_size==atol(argv[3]) ) /* old file size == new file size */
    {
        fprintf(stdout,"%10s %10d\n", argv[3],0);
        exit(EXIT_SUCCESS);
    }
    buf=malloc(st.st_size+ 1);     /*create a buffer to hold the file */
    if(buf==NULL)
    {
    	fprintf(stdout,"-1 -1 \n");
        perror("Error allocating memory");
        exit(EXIT_FAILURE);
    }
    memset(buf,0x0,st.st_size+1);
    bytes=readall(fd,buf,st.st_size);
    if(bytes==(-1))
    {
    	fprintf(stdout,"-1 -1 \n");    	
        perror("Error reading file");
        exit(EXIT_FAILURE);
    }
        
    /* start looking where we left off before */
    p=buf;
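    file_pos=atol(argv[3]);
    if(file_pos<0 || file_pos>st.st_size)
    {
        file_pos=0;             /* file shrank since the last call: rescan from the start */
    }
    p+=file_pos;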
    bytes=st.st_size;
    if(*p)
    {                           
        result=find(p,argv[2]); /*count the number of lines with searchstr */
    }
    closeit(fd);
    free(buf);
    /*  output  <total bytes searched so far> <total new lines with search string> */
    fprintf(stdout,"%10ld %10ld\n",(long)bytes,result);
    return 0;
}

into an executable image named last_written. I get this output from the script running a file stream writer in the background:

kcsdev:/home/jmcnama> lastw.sh
7 lines found Mon Jun 20 12:47:06 2005
2 lines found Mon Jun 20 12:47:11 2005
2 lines found Mon Jun 20 12:47:16 2005
1 lines found Mon Jun 20 12:47:21 2005
2 lines found Mon Jun 20 12:47:26 2005
1 lines found Mon Jun 20 12:47:31 2005
3 lines found Mon Jun 20 12:47:36 2005
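The same byte-offset bookkeeping can also be sketched in plain shell, for cases where compiling a helper is not convenient. This is only a sketch under assumptions: count_new is a hypothetical name, head -c is widely available but not required by POSIX, and like the C program it assumes the file only grows by whole lines.

```shell
#!/bin/sh
# count_new FILE PATTERN OFFSET   (hypothetical helper)
# Prints "<new offset> <count>": the file size in bytes and the number of
# lines after byte OFFSET that contain PATTERN.
count_new() {
    file=$1 pattern=$2 old=$3
    size=$(wc -c < "$file") || return 1
    size=$((size))                     # strip any padding from the wc output
    [ "$size" -lt "$old" ] && old=0    # file shrank (rotated?): rescan it
    # tail -c +N starts at byte N (1-based); head -c stops at the size recorded
    # above, so bytes appended in the meantime are left for the next call.
    # grep -c exits 1 when the count is 0, hence the "|| :".
    count=$(tail -c +"$((old + 1))" "$file" | head -c "$((size - old))" | grep -c -- "$pattern") || :
    echo "$size $count"
}
```

A polling loop would start with old=0, run set -- $(count_new file1 EXACTO "$old"), store $1 back into old, report $2 and then sleep 5. Only the bytes added since the previous call are read, which is the point when the file is large.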

A452917 · June 20, 2005, 4:24pm

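Try using "tail +n" (spelled "tail -n +N" on current systems).

#!/bin/ksh
# test.sh

old=0
while true
do
    # copy everything after the first $old lines, i.e. only the new lines
    tail -n +$((old + 1)) /yourdir/yourfile > /tmp/temp_file
    lines=$(grep -c EXACTO /tmp/temp_file)
    # remember how many lines have been processed so far
    old=$((old + $(wc -l < /tmp/temp_file)))
    echo "$lines lines found `date +%c` "
    sleep 5
done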