Using sed, awk, or perl, I am trying to do something similar to
but my requirement is slightly different. What I am trying to accomplish is to reformat a logfile so that every line starts with a timestamp, with any line that does not start with a timestamp appended to the last line that did. Optionally, I would like to keep each record only up to the first semicolon.
A simplified input would be something like this:
2009-05-27 02:37:27.283 The quick
brown fox;
The quick
brown fox
2009-05-28 10:10:28.000 Mary
had a
little lamb.
2009-06-01 19:37:29.000 Jack and Jill ran up the hill;
and ideally the output would be
2009-05-27 02:37:27.283 The quick brown fox;
2009-05-28 10:10:28.000 Mary had a little lamb.
2009-06-01 19:37:29.000 Jack and Jill ran up the hill;
although this is also acceptable
2009-05-27 02:37:27.283 The quick brown fox; The quick brown fox
2009-05-28 10:10:28.000 Mary had a little lamb.
2009-06-01 19:37:29.000 Jack and Jill ran up the hill;
The log files can be up to 10MB in size and there can be a hundred lines or more between timestamps. The purpose of this is to format the file so that it can be loaded into a database.
Any suggestions/solutions would be greatly appreciated.
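#!/usr/bin/env python3
s = ""   # continuation text collected for the current record
f = 0    # flag: set once a timestamp line has been seen

def flush(s):
    # print the continuation text up to (not including) the first ";"
    if ";" in s:
        print(s[:s.index(";")])
    else:
        print(s)

with open("file") as fh:
    for items in fh:
        items = items.strip()
        if items.startswith("2009"):
            if f:
                flush(s)           # finish the previous record
            s = ""
            f = 1                  # set flag
            print(items, end=" ")  # timestamp line; stay on the same output line
        elif f:
            # join up the lines that don't start with 2009 (no separator is added)
            s = s + items

if f:
    flush(s)  # finish the last record, which the loop never prints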
output
# more file
2009-05-27 02:37:27.283 The quick
brown fox;
The quick
brown fox
2009-05-28 10:10:28.000 Mary
had a
little lamb.
2009-06-01 19:37:29.000 Jack and Jill ran up the hill;
adsf
sldkfdf
2009-05-28 10:10:28.000 Mary test
tester fmsd
2009-05-28 10:10:28.000
# ./test.py
2009-05-27 02:37:27.283 The quick brown fox
2009-05-28 10:10:28.000 Mary had alittle lamb.
2009-06-01 19:37:29.000 Jack and Jill ran up the hill; adsfsldkfdf
2009-05-28 10:10:28.000 Mary test tester fmsd
2009-05-28 10:10:28.000
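As a follow-up, here is a minimal Python 3 sketch of the same line-by-line idea that matches the timestamp with a regular expression instead of the hard-coded "2009" prefix, joins the pieces with single spaces, and optionally cuts each record after the first semicolon. The names `TIMESTAMP`, `finish`, and `merge_records` are made up for this illustration, and the timestamp pattern is an assumption based on the sample input in the question.

```python
import re

# Assumption: a record starts with "YYYY-MM-DD HH:MM:SS.mmm" at column 0.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}")


def finish(parts, stop_at_semicolon):
    """Join the pieces of one record with single spaces."""
    record = " ".join(parts)
    if stop_at_semicolon and ";" in record:
        # Keep everything up to and including the first semicolon.
        record = record[: record.index(";") + 1]
    return record


def merge_records(lines, stop_at_semicolon=False):
    """Yield one merged string per timestamped record."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if TIMESTAMP.match(line):
            if parts:
                yield finish(parts, stop_at_semicolon)
            parts = [line]
        elif parts:
            parts.append(line)
        # Lines seen before the first timestamp are silently dropped.
    if parts:
        yield finish(parts, stop_at_semicolon)


sample = """\
2009-05-27 02:37:27.283 The quick
brown fox;
The quick
brown fox
2009-05-28 10:10:28.000 Mary
had a
little lamb.
2009-06-01 19:37:29.000 Jack and Jill ran up the hill;
"""

for record in merge_records(sample.splitlines(), stop_at_semicolon=True):
    print(record)
```

Because `merge_records` is a generator, an open file object can be passed in place of `sample.splitlines()`, so only one record is held in memory at a time. Leaving `stop_at_semicolon` at its default gives the longer "also acceptable" form from the question.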
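sed -n '
# Timestamp line: print the record collected so far, then start a new one.
/^[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/ {
    1!{
        x
        s/\n/ /g
        p
        x
    }
    h
    # A timestamp on the last line of the file is a complete record.
    $ p
    d
}
# Any other line: append it to the record in the hold space.
H
# At end of input, print the record that is still pending.
$ {
    x
    s/\n/ /g
    p
}' a.txt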
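perl:
use strict;
use warnings;
local $/;            # slurp mode: read the whole input in one go
my $str = <DATA>;
$str =~ s/\n/ /g;    # join every line with a space
# put a newline back before each date, except at the very start of the text
$str =~ s/(?<=.)(?=[0-9]{4}-[0-9]{2}-[0-9]{2})/\n/g;
print $str;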
__DATA__
2009-05-27 02:37:27.283 The quick
brown fox;
The quick
brown fox
2009-05-28 10:10:28.000 Mary
had a
little lamb.
2009-06-01 19:37:29.000 Jack and Jill ran up the hill;
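For comparison, the whole-file approach can be sketched in Python 3 as well. This is a variant rather than a direct port of the perl above: instead of replacing every newline and then re-inserting some, it uses a single negative lookahead so that a newline is turned into a space only when the next line does not start with a date. The names `DATE` and `join_continuations` are invented for this example.

```python
import re

# Assumption: a record starts with a date of the form YYYY-MM-DD.
DATE = r"\d{4}-\d{2}-\d{2}"


def join_continuations(text):
    """Replace a newline with a space unless a date (or the end of text) follows."""
    return re.sub(r"\n(?!" + DATE + r"|\Z)", " ", text)


text = (
    "2009-05-27 02:37:27.283 The quick\n"
    "brown fox;\n"
    "The quick\n"
    "brown fox\n"
    "2009-05-28 10:10:28.000 Mary\n"
    "had a\n"
    "little lamb.\n"
    "2009-06-01 19:37:29.000 Jack and Jill ran up the hill;\n"
)

print(join_continuations(text), end="")
```

Reading a 10MB log into one string this way should be fine on most machines, but note that a date at the start of a continuation line would be treated as the start of a new record.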