C Code to read a word from file

Hello All,
I have to write C code that reads a file and keeps track of how many times a given word occurs on each line, as well as the total for the whole file. Maintaining the total count is easy, but maintaining the per-line count is what I am struggling with. I thought of using a linked list to hold the line number and the count for that line, but that is not very efficient for larger files. Any help would be appreciated.

Example :- Output should be

Word to search - "function"
Stats :-
line no 21 count 1 
line no 250 count 3
line no 400 count 5

Do you need to keep track of the per-line word counts until the end of processing, or can you just output them as you go?

If it's the latter then you only need to maintain your linked list while you're processing any given line, so it shouldn't be a significant performance hit.
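
To illustrate the output-as-you-go case, here is a minimal sketch. It assumes lines fit in a fixed buffer and that substring matches count as occurrences. The names countInLine and reportAsYouGo, and the 4096-byte buffer, are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Count non-overlapping occurrences of "word" in "line" (substring match). */
int countInLine( const char *line, const char *word )
{
    size_t wordLen = strlen( word );
    int count = 0;

    if ( 0 == wordLen )
    {
        return( 0 );
    }
    for ( const char *p = strstr( line, word ); NULL != p;
          p = strstr( p + wordLen, word ) )
    {
        count++;
    }
    return( count );
}

/* Read "in" line by line, write one stats line to "out" for every line
   that contains the word, and return the total for the whole file. */
long reportAsYouGo( FILE *in, FILE *out, const char *word )
{
    char line[ 4096 ];   /* assumed maximum line length */
    long lineNo = 0;
    long total = 0;

    while ( NULL != fgets( line, sizeof( line ), in ) )
    {
        int count = countInLine( line, word );

        lineNo++;
        if ( count > 0 )
        {
            fprintf( out, "line no %ld count %d\n", lineNo, count );
            total += count;
        }
    }
    return( total );
}
```

Called as reportAsYouGo( fp, stdout, "function" ), this prints the per-line stats while reading and only keeps the running total, so nothing per-line has to be stored.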

Alternatively, you could extend whatever data structure you're using to hold the total count per word to include the per-line data for each word as well. Again, that means your lists won't be too long.
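
As a rough sketch of that idea, the per-word record could carry a growable array of (line number, count) pairs next to the total. The struct and function names below are hypothetical, and the starting capacity of 16 is an arbitrary choice. Only lines with at least one match get an entry, so the array stays compact even for large files.

```c
#include <stdlib.h>

/* One entry per line that contains the word at least once. */
struct LineHit
{
    long lineNo;
    int  count;
};

/* Per-word statistics: running total plus a growable array of line hits. */
struct WordStats
{
    long            total;
    struct LineHit *hits;
    size_t          used;
    size_t          capacity;
};

/* Record "count" occurrences on line "lineNo".
   Returns 0 on success, -1 if memory runs out. */
int addHit( struct WordStats *stats, long lineNo, int count )
{
    if ( stats->used == stats->capacity )
    {
        /* double the capacity so appends are cheap on average */
        size_t newCap = ( 0 == stats->capacity ) ? 16 : stats->capacity * 2;
        struct LineHit *grown = realloc( stats->hits, newCap * sizeof( *grown ) );

        if ( NULL == grown )
        {
            return( -1 );
        }
        stats->hits = grown;
        stats->capacity = newCap;
    }
    stats->hits[ stats->used ].lineNo = lineNo;
    stats->hits[ stats->used ].count = count;
    stats->used++;
    stats->total += count;
    return( 0 );
}

/* Release the array and reset the record so it can be reused. */
void freeStats( struct WordStats *stats )
{
    free( stats->hits );
    stats->hits = NULL;
    stats->used = 0;
    stats->capacity = 0;
    stats->total = 0;
}
```

A record must start zeroed, for example struct WordStats stats = { 0 };. Because the hits sit in one contiguous block, handing the finished result to other code is a matter of passing the pointer and the "used" count.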

Also, you could use a hash table rather than a linked list.
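
For what a hash table might look like here, this is a small chained table that maps a word to its count. It is only a sketch: keying the table by word is an assumption about the use case, and BUCKETS, hashWord, bumpWord, lookupWord, and freeTable are names invented for the example.

```c
#include <stdlib.h>
#include <string.h>

#define BUCKETS 1024   /* assumed table size; tune for the expected word count */

struct Entry
{
    char         *word;
    long          count;
    struct Entry *next;   /* chain of entries that share a bucket */
};

struct Table
{
    struct Entry *buckets[ BUCKETS ];
};

/* djb2-style string hash reduced to a bucket index */
static unsigned long hashWord( const char *word )
{
    unsigned long h = 5381;

    while ( '\0' != *word )
    {
        h = h * 33 + (unsigned char) *word++;
    }
    return( h % BUCKETS );
}

/* Add 1 to the count for "word", creating the entry if needed.
   Returns the new count, or -1 if memory runs out. */
long bumpWord( struct Table *table, const char *word )
{
    unsigned long idx = hashWord( word );
    struct Entry *e;

    for ( e = table->buckets[ idx ]; NULL != e; e = e->next )
    {
        if ( 0 == strcmp( e->word, word ) )
        {
            return( ++e->count );
        }
    }
    e = malloc( sizeof( *e ) );
    if ( NULL == e )
    {
        return( -1 );
    }
    e->word = malloc( strlen( word ) + 1 );
    if ( NULL == e->word )
    {
        free( e );
        return( -1 );
    }
    strcpy( e->word, word );
    e->count = 1;
    e->next = table->buckets[ idx ];   /* push onto the front of the chain */
    table->buckets[ idx ] = e;
    return( 1 );
}

/* Return the count stored for "word", or 0 if it was never added. */
long lookupWord( const struct Table *table, const char *word )
{
    const struct Entry *e;

    for ( e = table->buckets[ hashWord( word ) ]; NULL != e; e = e->next )
    {
        if ( 0 == strcmp( e->word, word ) )
        {
            return( e->count );
        }
    }
    return( 0 );
}

/* Free every entry and leave the table empty. */
void freeTable( struct Table *table )
{
    for ( size_t i = 0; i < BUCKETS; i++ )
    {
        struct Entry *e = table->buckets[ i ];

        while ( NULL != e )
        {
            struct Entry *next = e->next;

            free( e->word );
            free( e );
            e = next;
        }
        table->buckets[ i ] = NULL;
    }
}
```

The table has to start with all buckets empty, for example by allocating it with calloc(). Lookups then cost roughly one string hash plus a short chain walk, instead of a scan over a whole list.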

Dear CarloM,
Thanks man for your input. I need to keep track of the per-line word counts until the end of processing and then pass them to another thread. I was assuming the worst case of finding the word on every line of a 10K-line file.

How does your code get the word(s) to search for? Is there only one word, or are there more than one?

That could make a big difference in how you design your code.

Also, what do you know about the file before you start processing it? The more you know up front, the better you can make your program perform, because scalability is going to be a problem for a large file.

Assuming you get the words to be searched via command line arguments, something like this could search a single line:

#include <string.h>

/* "counts" must have room for argc entries; counts[ jj ] holds the number
    of matches for argv[ jj ], so counts[ 0 ] is unused */
int searchLine( char *line, char **argv, int argc, int *counts )
{
    /* assume no matches - that means the "counts" array can be reused */
    int rc = 0;

    int ii;
    int jj;

    /* check input values to prevent SEGV */
    if ( ( NULL == line ) || ( NULL == argv ) || ( NULL == counts ) )
    {
        /* probably should emit an error here */
        return( rc );
    }

    /* zero out the counts array */
    for ( ii = 0; ii < argc; ii++ )
    {
        counts[ ii ] = 0;
    }

    /* determine how long the line is */
    int len = (int) strlen( line );

    /* check if we match any of the words passed in starting at each
        character in the line of text */
    for ( ii = 0; ii < len; ii++ )
    {
        /* start from 1 instead of 0, assuming argv is the command-line
            args passed to main(), so argv[ 0 ] is the executable file name */
        for ( jj = 1; jj < argc; jj++ )
        {
            /* compare only the first strlen( word ) characters; a plain
                strcmp() would match only if the rest of the line were
                exactly the word */
            size_t wordLen = strlen( argv[ jj ] );

            /* on a match, increment the count for that word and set return
                code to flag a match was found; note this is a substring
                match, so "function" also matches inside "functions" */
            if ( ( wordLen > 0 ) &&
                 ( 0 == strncmp( &( line[ ii ] ), argv[ jj ], wordLen ) ) )
            {
                rc = 1;
                ( counts[ jj ] )++;
            }
        }
    }

    return( rc );
}

That should work to count words in a single line of text.
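
If only whole words should count, so that "function" is not counted inside "functions", one option is to check the characters on either side of each match. A minimal sketch follows; countWholeWord and isWordChar are hypothetical helpers, and treating letters, digits, and underscore as word characters is an assumption you may want to change.

```c
#include <ctype.h>
#include <string.h>

/* Assumption: a "word character" is a letter, digit, or underscore. */
static int isWordChar( char c )
{
    return( ( '_' == c ) || ( 0 != isalnum( (unsigned char) c ) ) );
}

/* Count occurrences of "word" in "line" that are not part of a longer word. */
int countWholeWord( const char *line, const char *word )
{
    size_t wordLen = strlen( word );
    int count = 0;
    const char *p = line;

    if ( 0 == wordLen )
    {
        return( 0 );
    }
    while ( NULL != ( p = strstr( p, word ) ) )
    {
        /* the match must start at the beginning of the line or right
           after a non-word character ... */
        int startOk = ( p == line ) || !isWordChar( p[ -1 ] );
        /* ... and must be followed by a non-word character or the
           terminating '\0' */
        int endOk = !isWordChar( p[ wordLen ] );

        if ( startOk && endOk )
        {
            count++;
        }
        p++;   /* move on one character and keep searching */
    }
    return( count );
}
```

For example, in the line "function functions my_function function(" only the first and last occurrences are whole words.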

One thing that's important: having a maximum line length per file. Is that specified? If so, you can use "fgets()" and read one line at a time very simply:

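#include <stdio.h>

#define MAX_LINE_LENGTH 1024   /* assumed value; use the real limit if one is specified */
#define ERROR ( -1 )           /* assumed error code */

/* processFile is just an illustrative wrapper name */
int processFile( const char *fileName )
{
    FILE *ff;
    char line[ MAX_LINE_LENGTH ];

    ff = fopen( fileName, "r" );
    if ( NULL == ff )
    {
        return( ERROR );
    }
    /* fgets() returns NULL at end of file or on a read error */
    while ( NULL != fgets( line, sizeof( line ), ff ) )
    {
        /* process line of text */
    }
    fclose( ff );
    return( 0 );
}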
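
If no maximum line length is specified, keep in mind that fgets() returns a long line in several pieces, which would throw off the line numbers. One way to cope, sketched below, is to advance the line counter only when a piece ends in a newline. The function name countLines and the deliberately tiny 16-byte buffer are choices made for this example. A word that straddles two pieces would still be missed by a per-piece search, so this only addresses the line numbering.

```c
#include <stdio.h>
#include <string.h>

/* Count the lines in "in", even when lines are longer than the buffer. */
long countLines( FILE *in )
{
    char chunk[ 16 ];   /* tiny on purpose, to exercise long lines */
    long lines = 0;
    int midLine = 0;    /* nonzero while inside a line that has not ended */

    while ( NULL != fgets( chunk, sizeof( chunk ), in ) )
    {
        size_t len = strlen( chunk );

        if ( ( len > 0 ) && ( '\n' == chunk[ len - 1 ] ) )
        {
            /* this piece completes a line */
            lines++;
            midLine = 0;
        }
        else
        {
            /* buffer filled up (or the file ended) before a newline */
            midLine = 1;
        }
    }
    if ( midLine )
    {
        lines++;   /* last line had no trailing newline */
    }
    return( lines );
}
```

On POSIX systems, getline() is another option, since it grows its buffer to fit each line.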