Parsing a log file to cut out some parts

Dear all,

I would like to extract information from a SQL log file.

This file can include four different types of instruction with the number of lines involved for each of them:

-> (1) "INSERT" instruction with the number of lines inserted
-> (2) "UPDATE" instruction with the number of lines updated
-> (3) "DELETE" instruction with the number of lines deleted
-> (4) "COLLECT STATISTICS" instruction with the number of lines involved.

The number of lines involved for "UPDATE" and "COLLECT STATISTICS" has exactly the same appearance in the log file:

-> for "COLLECT STATISTICS" instruction (check "Update completed" below)

COLLECT STATISTICS           COLUMN (STD_CDH_ID)
                           , COLUMN (PARTITION)
                           , COLUMN (METDVERS_OID_ID)
         -- AXE ITEMGROUP
                           , COLUMN (ITEMGROUP_OID_CD)
                           , COLUMN (STD_CDH_ID, METDVERS_OID_ID, ITEMGROUP_OID_CD)
                           -- AXE VISIT
                           , COLUMN (VISIT_OID_CD)
                           , COLUMN (STD_CDH_ID, METDVERS_OID_ID, VISIT_OID_CD)
         -- AXE FORM
                           , COLUMN (FORM_OID_CD)
                           , COLUMN (STD_CDH_ID, METDVERS_OID_ID, FORM_OID_CD)
         -- AXE SITE
                           , COLUMN (SITE_OID_CD)
                           , COLUMN (STD_CDH_ID, METDVERS_OID_ID, SITE_OID_CD)
       -- AXE SUBJCT
                           , COLUMN (SUBJCT_KEY_ID)
                           , COLUMN (STD_CDH_ID, METDVERS_OID_ID, SUBJCT_KEY_ID)

ON                           CDH_STGN_DEV1.RECRD_SHREDDED
;

 *** Update completed. 14 rows changed.
 *** Total elapsed time was 4 seconds.

-> for "UPDATE" instruction (check "Update completed" below)

UPDATE
 TGT
FROM
 CDH_REJT_DEV1.RAVE_LNKITEMGROUPITEM TGT
, CDH_STGN_DEV1.LNKITEMGROUPITEM SRC
SET
 LNKITEMGROUPITEM_REJTDELTD_TS = CURRENT_TIMESTAMP(0)
WHERE
 TGT.STD_CDH_ID   = SRC.STD_CDH_ID
AND TGT.METDVERS_OID_ID  = SRC.METDVERS_OID_ID
AND TGT.ITEM_OID_CD   = SRC.ITEM_OID_CD
AND TGT.LNKITEMGROUPITEM_REJTDELTD_TS IS NULL
;

 *** Update completed. No rows changed.
 *** Total elapsed time was 1 second.

The problem I face is the following: I am not interested in the number of lines involved in the "COLLECT STATISTICS" instruction, and I cannot tell from the log file which instruction, "COLLECT STATISTICS" or "UPDATE", produced a given "Update completed" line!

So I thought of cutting out of the log file every multi-line record that starts with "COLLECT STATISTICS" and ends with "changed." ... but all my attempts failed!

Thanks in advance for your attention to my request; any suggestion is welcome,

Didier.

Not sure I understand your request. Adding sample input data and the relevant code from your failed attempts would help us interpret it correctly.

Thanks a lot for your reply RudiC,

for instance, below is the whole content of the log file MyLogFile:

UPDATE
 TGT
FROM
 CDH_REJT_DEV1.RAVE_LNKITEMGROUPITEM TGT
, CDH_STGN_DEV1.LNKITEMGROUPITEM SRC
SET
 LNKITEMGROUPITEM_REJTDELTD_TS = CURRENT_TIMESTAMP(0)
WHERE
 TGT.STD_CDH_ID   = SRC.STD_CDH_ID
AND TGT.METDVERS_OID_ID  = SRC.METDVERS_OID_ID
AND TGT.ITEM_OID_CD   = SRC.ITEM_OID_CD
AND TGT.LNKITEMGROUPITEM_REJTDELTD_TS IS NULL
;

 *** Update completed. No rows changed.
 *** Total elapsed time was 1 second.

COLLECT STATISTICS           COLUMN (STD_CDH_ID)
                           , COLUMN (PARTITION)
                           , COLUMN (METDVERS_OID_ID)
         -- AXE ITEMGROUP
                           , COLUMN (ITEMGROUP_OID_CD)
                           , COLUMN (STD_CDH_ID, METDVERS_OID_ID, ITEMGROUP_OID_CD)
                           -- AXE VISIT
                           , COLUMN (VISIT_OID_CD)
                           , COLUMN (STD_CDH_ID, METDVERS_OID_ID, VISIT_OID_CD)
         -- AXE FORM
                           , COLUMN (FORM_OID_CD)
                           , COLUMN (STD_CDH_ID, METDVERS_OID_ID, FORM_OID_CD)
         -- AXE SITE
                           , COLUMN (SITE_OID_CD)
                           , COLUMN (STD_CDH_ID, METDVERS_OID_ID, SITE_OID_CD)
       -- AXE SUBJCT
                           , COLUMN (SUBJCT_KEY_ID)
                           , COLUMN (STD_CDH_ID, METDVERS_OID_ID, SUBJCT_KEY_ID)

ON                           CDH_STGN_DEV1.RECRD_SHREDDED
;

 *** Update completed. 14 rows changed.
 *** Total elapsed time was 2 seconds.

In that log file, we have two kinds of statement:
(1) an UPDATE statement (the first one)
(2) a "COLLECT STATISTICS" statement (the second one).

I want to know how many rows were affected by the UPDATE statement only.

The following shell command outputs two records:

grep "Update completed" MyLogFile
 *** Update completed. No rows changed.
 *** Update completed. 14 rows changed.

But I only need the first record (it tells me that the UPDATE statement had no effect: "No rows changed"). The second one relates to the "COLLECT STATISTICS" statement, and I do not care about the impact of that statement.

The problem I face is that the output is exactly the same for both statements: I cannot link the first record from the grep command to the "UPDATE" statement and the second record to "COLLECT STATISTICS"!

So I thought of pre-processing "MyLogFile" before issuing the grep command, deleting everything related to the "COLLECT STATISTICS" statement, so that grep returns only the record linked to the UPDATE statement.

But I do not know how to perform that operation...

Something along these lines:

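awk '
# An UPDATE statement starts: remember it and skip to the next line.
/^UPDATE/ {type=1;next}
# First completion line after an UPDATE: print the row count ("No" means 0).
type==1 && /Update completed/ { print "Updated rows: " (($4=="No")?0:$4); type=0}
' MyLogFile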
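
A slightly more defensive variant of the same idea, as a minimal sketch: instead of a one-way flag, it remembers the keyword of the most recent statement, so a completion line is reported only when the last statement seen was an UPDATE. It assumes every statement starts at the beginning of a line with one of the four keywords listed in the question. The sample log is abridged from the one posted above, and the names `log` and `result` are only for illustration.

```shell
# Sample log, abridged from the one in this thread.
log=$(mktemp)
cat > "$log" <<'EOF'
UPDATE
 TGT
SET
 LNKITEMGROUPITEM_REJTDELTD_TS = CURRENT_TIMESTAMP(0)
;

 *** Update completed. No rows changed.
 *** Total elapsed time was 1 second.

COLLECT STATISTICS           COLUMN (STD_CDH_ID)
                           , COLUMN (PARTITION)
ON                           CDH_STGN_DEV1.RECRD_SHREDDED
;

 *** Update completed. 14 rows changed.
 *** Total elapsed time was 2 seconds.
EOF

result=$(awk '
# Remember the first word of the most recent statement.
/^(INSERT|UPDATE|DELETE|COLLECT STATISTICS)/ { stmt = $1 }
# Report a completion line only when it follows an UPDATE, then reset.
/Update completed/ {
    if (stmt == "UPDATE") print "Updated rows: " (($4 == "No") ? 0 : $4)
    stmt = ""
}
' "$log")
printf '%s\n' "$result"
rm -f "$log"
```

Because the state is overwritten by every new statement keyword, a "COLLECT STATISTICS" that follows an UPDATE which never logged a completion line is not mistaken for that UPDATE.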

Hi - I may be missing the point here, but if it is only the first update line that you are interested in, you could simply do the following:

grep "Update completed" MyLogFile | head -n 1
Is this any good to you?
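
One caveat: head keeps the first match whichever statement produced it, so the command above only works while the UPDATE comes first in the log. The pre-filtering idea from the question could look like the sketch below, which uses a sed address range to delete everything from a line starting with "COLLECT STATISTICS" through the next line containing "changed." before grepping. The sample log is a hypothetical reordering of the one posted above (abridged, with "COLLECT STATISTICS" placed first), and the names `log` and `result` are only for illustration.

```shell
# Hypothetical sample: the thread's log, abridged and reordered so that
# COLLECT STATISTICS comes before the UPDATE.
log=$(mktemp)
cat > "$log" <<'EOF'
COLLECT STATISTICS           COLUMN (STD_CDH_ID)
                           , COLUMN (PARTITION)
ON                           CDH_STGN_DEV1.RECRD_SHREDDED
;

 *** Update completed. 14 rows changed.
 *** Total elapsed time was 2 seconds.

UPDATE
 TGT
SET
 LNKITEMGROUPITEM_REJTDELTD_TS = CURRENT_TIMESTAMP(0)
;

 *** Update completed. No rows changed.
 *** Total elapsed time was 1 second.
EOF

# Delete each COLLECT STATISTICS block, up to and including its
# "... rows changed." line, then grep what is left.
result=$(sed '/^COLLECT STATISTICS/,/changed\./d' "$log" | grep 'Update completed')
printf '%s\n' "$result"
rm -f "$log"
```

If a "COLLECT STATISTICS" statement never logs a "changed." line, the sed range runs on to the next such line (or to the end of the file), so this relies on every such statement completing.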
