Archiving or removing some data from a log file in real time

Hi,

I have a log file that gets updated every second. It has now grown to 20+ GB. I need a command/script that determines the actual size of the file and removes 50% of the data in it. I don't mind losing the data, since the file has grown so large. Please advise, as this is fairly urgent because of space on the server.

It may be hard to execute any kind of cleanup with new data being added every second.

One theory for cleanup...
determine the line count, assuming each update is on its own line
divide that number in half
use a tail command to copy the second half of the lines to a new file
then move it back to the original filename

yeah something like

var=`expr $(cat filename | wc -l) / 2`    # half the line count
tail -$var filename > newfile             # keep only the last half of the lines
mv newfile filename

It seems cat will take a long time to read the file, as it is huge now.
var=`expr $(cat filename | wc -l) / 2` is taking a long time to execute. I am still waiting, though.

try

var=`expr $(wc -l < filename) / 2`

Deleting all the data would be easy, but half? Hmm.

What is making this logfile? Many daemons allow you to send them a signal when you want to switch to a new logfile, which would at least let you deal with the file without the daemon stomping on it several times a second.
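For example (hypothetical daemon name, PID file and log path, and it only works if the daemon really does reopen its log on SIGHUP - check its documentation first):

mv /var/log/mydaemon.log /var/log/mydaemon.log.old     # the daemon keeps writing to the old inode
kill -HUP "$(cat /var/run/mydaemon.pid)"               # ask it to reopen, creating a fresh, empty log
gzip /var/log/mydaemon.log.old                         # archive or delete the old file at leisure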

Surely cat filename|wc -l adds a process and therefore considerable extra time. Would wc -l filename not be quicker?

Anyhow, is the file held open and appended to all the time, or is each write a separate open/append/close operation? Consider these two (probably not exactly what your process does, but just as an example):

for i in 1 2 3 4 5
do
   echo "Hello $i" >> filename
   sleep 5
done

versus

for i in 1 2 3 4 5
do
   echo "Hello $i"
   sleep 5
done >>filename

In the first, you have five discrete "open, append and close" operations. In the second you have one, so in the gaps between the echo statements the file remains open. If you delete the data and write the file back, where does the subsequent output go? If you rename the file, then the output follows the old file.
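If the writer does hold the file open, one rough sketch is to trim the file in place rather than rename it, so the inode the writer has open stays the same. This is not guaranteed to suit your logger: anything written between the tail and the copy is lost, it only behaves sensibly if the writer opened the file in append mode, and the temporary copy needs a filesystem with enough free space:

lines=$(wc -l < filename)
tail -n "$((lines / 2))" filename > /tmp/keep
cp /tmp/keep filename        # overwrite in place; do NOT mv, or the writer keeps the old inode
rm /tmp/keep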

Like Corona688 says, we need to know what is generating the messages. It may be that you have to stop that process while you manipulate the file and then restart it, if there is no signal you can send to get it to switch logs.

Robin

Thanks Makarand... I will try the same.
Corona688: from what I can see, the messages are generated because of some SSL handshake failures, hence the entries in the log file.

Which logfile? It may be syslog that actually writes the entries.
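A quick way to see what actually has the file open (assuming lsof is installed; fuser is an alternative):

lsof filename         # lists every process that currently has the file open
fuser -v filename     # similar, shows the PIDs and how they are accessing it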