Find two words and join together in one file

Hi,

I have a huge text file like the one below. I need to select only the lines containing "Fatal" and join each one with the id from the same record.
For the sample below, I want the output line to be: Fatal Error for input record 25 is id = 543523. Waiting for your help.

-----Original Message-----
Acceptance with warnings for input record 24.
001 tag of record from input is
abc1123
003 tag of record from input is
ref
Record replaced based on bib control number match without a bib id match. for id = 523434 idtype = 101[warning]<DIAGNOSTIC CODE 1000016> for id = 523434 idtype = 101[warning] with additional error text

-------------------------------------------------------------
Fatal Error for input record 25.
001 tag of record from input is
145
003 tag of record from input is
abc
Record replaced based on bib control number match without a bib id match. for id = 543523 idtype = 101
-----------------------------------------------------------------
Acceptance with warnings for input record 24.
001 tag of record from input is
124
003 tag of record from input is
def
Record replaced based on bib control number match without a bib id match. for id = 523438 idtype = 101[warning
-----------------------------------------------------
Thanks a lot

grep -i "Fatal"  Input.file >>  Output.file

I think this is what you are looking for.

Hi ,
Sorry, that is not what I want. It will give only the Fatal line, but I also want the id,
e.g. id = 543523, which is on a separate line.

My complete output should look like this:
Fatal Error for input record 25 is id = 543523.

Please help

Try this:

awk '/^Fatal/{sub(/\.$/,"");p=$0}p&&/^Record/{print p" is id = "$17;p=""}' infile

Output:

Fatal Error for input record 25 is id = 543523

Scrutinizer, can you explain to me how this awk command works?

I mean, I know that

/^Fatal (matches "Fatal" at the beginning of the line)

but I am not sure what the inner part does, I mean the part with sub:

{sub(/\.$/,"");p=$0}p&&/

I also know that $17 is the field holding the id number,
but how did you drop everything in between? I couldn't follow that.

Please explain if you don't mind. Thanks a lot in advance.
awk '/^Fatal/{sub(/\.$/,"");p=$0}p&&/^Record/{print p" is id = "$17;p=""}' infile

Hi dazdseg,

The awk command works like this: on a line that starts with "Fatal", the sub function deletes the trailing dot and the resulting line is stored in variable p.

The second part says: if variable p is not empty and the line starts with "Record", print p followed by the text " is id = " and field 17, then clear p.
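
The explanation above can be tried out on a small excerpt. This is only a sketch: the sample text is copied from the first post, the shell variable names sample and result are made up for illustration, and the one-liner is spread over several lines so each step can carry a comment.

```shell
# Sample input copied from the thread: one Fatal record, then a warning record.
sample='Fatal Error for input record 25.
001 tag of record from input is
145
003 tag of record from input is
abc
Record replaced based on bib control number match without a bib id match. for id = 543523 idtype = 101
-----------------------------------------------------------------
Acceptance with warnings for input record 24.
001 tag of record from input is
124
003 tag of record from input is
def
Record replaced based on bib control number match without a bib id match. for id = 523438 idtype = 101[warning'

result=$(printf '%s\n' "$sample" | awk '
/^Fatal/ {                    # the line starts with "Fatal"
    sub(/\.$/, "")            # delete the trailing dot from the current line
    p = $0                    # remember the edited line in p
}
p && /^Record/ {              # a Fatal line is pending and this line starts with "Record"
    print p " is id = " $17   # field 17 is the id on this kind of line
    p = ""                    # clear p so later "Record" lines are ignored
}')

printf '%s\n' "$result"
```

Nothing is deleted between the two lines. Lines that match neither pattern are simply never printed, because awk only prints where a print statement tells it to.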

S.

Hi,

Sorry,
the output I am getting looks like this:

Fatal Error for input record 25 is 523438 (this is the id from "Acceptance with warnings for input record 24"),
but I want Fatal Error for input record 25 is 543523, the id from the same record.

-------------------------------------------------------------
Fatal Error for input record 25.
001 tag of record from input is
145
003 tag of record from input is
abc
Record replaced based on bib control number match without a bib id match. for id = 543523 idtype = 101
-----------------------------------------------------------------
Acceptance with warnings for input record 24
001 tag of record from input is
124
003 tag of record from input is
def
Record replaced based on bib control number match without a bib id match. for id = 523438 idtype = 101[warning

Please help

The file looks exactly like the sample below, only much bigger. Please check and kindly guide me.
Initially the awk command works fine, but later on it gives the line below, where 2394833 is the id of record 27890, not 27889:
Fatal Error for input record 27889 is id = 2394833

Fatal Error for input record 27889.
001 tag of record from input is
ocn430426837
003 tag of record from input is
OCoLC
search does not exist: for id = 2396539 idtype = 101[error] with additional error text
7 / cct

----------------------------------------------------------------------------
Acceptance with warnings for input record 27890.
001 tag of record from input is
sls2010021181
Record replaced based on bib control number match without a bib id match. for id = 2394833 idtype = 101[warning]<DIAGNOSTIC CODE 1000016> for id = 2394833 idtype = 101[warning] with additional error text
040 $c abc, 040 $d abc

----------------------------------------------------------------------------
Acceptance with warnings for input record 27891.
001 tag of record from input is
sls2010021182
Record replaced based on bib control number match without a bib id match. for id = 2394834 idtype = 101[warning]<DIAGNOSTIC CODE 1000016> for id = 2394834 idtype = 101[warning] with additional error text
040 $c abc, 040 $d abc

----------------------------------------------------------------------------
Fatal Error for input record 27892.
001 tag of record from input is
ocn489289719
003 tag of record from input is
OCoLC
search does not exist: for id = 2396540 idtype = 101[error] with additional error text
7 / cct

----------------------------------------------------------------------------
Fatal Error for input record 27893.
001 tag of record from input is
ocn600208119
003 tag of record from input is
OCoLC
search does not exist: for id = 2396541 idtype = 101[error] with additional error text
7 / cct

----------------------------------------------------------------------------
Fatal Error for input record 27894.
001 tag of record from input is
ocn600519310
003 tag of record from input is
OCoLC
Subfield data is duplicated: for id = 2396542 idtype = 101[warning] with additional error text
040 $d abc
search does not exist: for id = 2396542 idtype = 101[error] with additional error text
7 / cct

----------------------------------------------------------------------------
Fatal Error for input record 27895.
001 tag of record from input is
ocn600208128
003 tag of record from input is
OCoLC
Subfield data is duplicated: for id = 2396543 idtype = 101[warning] with additional error text
040 $d abc
search does not exist: for id = 2396543 idtype = 101[error] with additional error text
7 / cct

----------------------------------------------------------------------------
Fatal Error for input record 27896.
001 tag of record from input is
ocn489399391
003 tag of record from input is
OCoLC
search does not exist: for id = 2396544 idtype = 101[error] with additional error text
7 / cct

----------------------------------------------------------------------------
Acceptance with warnings for input record 27897.
001 tag of record from input is
ocm61030676
003 tag of record from input is
OCoLC
Record replaced based on bib control number match without a bib id match. for id = 2394841 idtype = 101[warning]<DIAGNOSTIC CODE 15751> for id = 2394841 idtype = 101[warning] with additional error text
020 $a 9787534929922
Subfield data is duplicated: for id = 2394841 idtype = 101[warning] with additional error text
040 $c SLY

----------------------------------------------------------------------------
Acceptance with warnings for input record 27898.
001 tag of record from input is
ocn428012248
003 tag of record from input is
OCoLC
Record replaced based on bib control number match without a bib id match. for id = 2394842 idtype = 101[warning]<DIAGNOSTIC CODE 1000016> for id = 2394842 idtype = 101[warning] with additional error text
040 $c CNPIT
Possible duplicate subject heading: for id = 2394842 idtype = 101[warning] with additional error text
650 $a Tales $z China.
Possible duplicate subject heading: for id = 2394842 idtype = 101[warning] with additional error text
650 $a Fishes $z China $x Folklore.

----------------------------------------------------------------------------
Acceptance with warnings for input record 27899.
001 tag of record from input is
sls2010021184
003 tag of record from input is
OCoLC
Record replaced based on bib control number match without a bib id match. for id = 2394856 idtype = 101[warning]<DIAGNOSTIC CODE 1000016> for id = 2394856 idtype = 101[warning] with additional error text
040 $c SIabc, 040 $d abc
Possible duplicate user heading: for id = 2394856 idtype = 101[warning] with additional error text
260 $b Zhongguo wen lian chu ban she,
Possible duplicate subject heading: for id = 2394856 idtype = 101[warning] with additional error text
650 $a Chinese poetry $y 20th century.

----------------------------------------------------------------------------
Acceptance with warnings for input record 27900.
001 tag of record from input is
sls2010021185
003 tag of record from input is
OCoLC
Record replaced based on bib control number match without a bib id match. for id = 2394857 idtype = 101[warning]<DIAGNOSTIC CODE 1000016> for id = 2394857 idtype = 101[warning] with additional error text
040 $c CGP
Possible duplicate user heading: for id = 2394857 idtype = 101[warning] with additional error text
260 $b Zhongguo wen lian chu ban she,
Possible duplicate subject heading: for id = 2394857 idtype = 101[warning] with additional error text
650 $a Chinese poetry $y 20th century.
Authority deleted: for id = 2394857 idtype = 101[warning] with additional error text
21130374 - Title: Dang dai shi tan bai jie jia zuo xuan / Zhu bian Huo Songlin.

----------------------------------------------------------------------------
Fatal Error for input record 27901.
001 tag of record from input is
ocn600208134
003 tag of record from input is
OCoLC
Subfield data is duplicated: for id = 2396545 idtype = 101[warning] with additional error text
040 $d abc
search does not exist: for id = 2396545 idtype = 101[error] with additional error text
7 / cct

Waiting for your reply

Thanks a lot

Hi umapearl,

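Your second file has a different format from the one originally specified: its Fatal records carry the id on a "search does not exist" line rather than a "Record replaced" line, so the id is now field 8. Try this: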

awk '/^Fatal/{sub(/\.$/,"");p=$0}p&&/^search does not exist/{print p" is id = "$8;p=""}' infile

output:

Fatal Error for input record 27889 is id = 2396539
Fatal Error for input record 27892 is id = 2396540
Fatal Error for input record 27893 is id = 2396541
Fatal Error for input record 27894 is id = 2396542
Fatal Error for input record 27895 is id = 2396543
Fatal Error for input record 27896 is id = 2396544
Fatal Error for input record 27901 is id = 2396545
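
For anyone wondering where the field numbers 17 and 8 come from, here is a small sketch that counts them. The two lines are copied from the samples in this thread, and the variable names record_line, search_line and positions are only illustrative. awk splits each line on whitespace, so the position of the id depends on how many words come before it.

```shell
record_line='Record replaced based on bib control number match without a bib id match. for id = 543523 idtype = 101'
search_line='search does not exist: for id = 2396539 idtype = 101[error] with additional error text'

# For each line, print the number of the field that follows the first "id =" pair.
positions=$(printf '%s\n%s\n' "$record_line" "$search_line" | awk '
{
    for (i = 1; i < NF; i++)
        if ($i == "id" && $(i + 1) == "=") {   # skips the "id" in "bib id match."
            print i + 2                        # the value sits two fields after "id"
            break
        }
}')

printf '%s\n' "$positions"
```

This prints 17 for the "Record replaced" line and 8 for the "search does not exist" line, which matches the two commands above.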

Hi, thanks a lot.

My file consists of both types, so is it possible to merge the two awk statements? Sorry for disturbing you. Please help.

Thanks

Hi, this would combine the two criteria:

awk '/for input record/          {p=""}
     /^Fatal/                    {sub(/\.$/," is id = ");p=$0}
     p&&/^Record/                {print p$17}
     p&&/^search does not exist/ {print p$8}' infile

Output:

Fatal Error for input record 25 is id = 543523
Fatal Error for input record 35 is id = 643523
Fatal Error for input record 27889 is id = 2396539
Fatal Error for input record 27892 is id = 2396540
Fatal Error for input record 27893 is id = 2396541
Fatal Error for input record 27894 is id = 2396542
Fatal Error for input record 27895 is id = 2396543
Fatal Error for input record 27896 is id = 2396544
Fatal Error for input record 27901 is id = 2396545
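
As a rough check of the combined command, the sketch below runs it on a short excerpt put together from the samples in this thread. The excerpt and the variable names mixed and result are illustrative, not the real file. The first rule is what stops a pending Fatal line from picking up an id from a later record: every "for input record" header clears p before the Fatal rule gets a chance to set it again.

```shell
# Excerpt assembled from the samples in the thread: both id line formats.
mixed='Fatal Error for input record 27889.
001 tag of record from input is
ocn430426837
search does not exist: for id = 2396539 idtype = 101[error] with additional error text
7 / cct
----------------------------------------------------------------------------
Acceptance with warnings for input record 27890.
001 tag of record from input is
sls2010021181
Record replaced based on bib control number match without a bib id match. for id = 2394833 idtype = 101[warning]
----------------------------------------------------------------------------
Fatal Error for input record 25.
001 tag of record from input is
145
Record replaced based on bib control number match without a bib id match. for id = 543523 idtype = 101'

result=$(printf '%s\n' "$mixed" | awk '
/for input record/          { p = "" }                            # any record header resets p
/^Fatal/                    { sub(/\.$/, " is id = "); p = $0 }   # keep the Fatal line, dot replaced
p && /^Record/              { print p $17 }                       # first format: id in field 17
p && /^search does not exist/ { print p $8 }                      # second format: id in field 8
')

printf '%s\n' "$result"
```

The warning record 27890 produces no output because p is empty by the time its "Record" line is read.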

Friend, your posts are a bit too long.

Just tell me what your input is and what output you want.

Does this work for you?

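awk '
# A new Fatal record starts: report the previous one if no id was found for it.
/^Fatal / {
	if (FatalMsg != "")
	    printf "%s, %s\n", FatalMsg, "No ID found"
	FatalMsg = $0
	sub(/\.$/, "", FatalMsg)	# drop the trailing dot
}
# Any line carrying an id: use it only while a Fatal message is pending.
/ id = / {
	if (FatalMsg == "")
	    next
	sub(/^.* id =  */, "")	# strip everything up to the last " id = "
	sub(/ .*/, "")		# strip everything after the id itself
	IdNum = $0
	printf "%s, id = %s\n", FatalMsg, IdNum
	FatalMsg = IdNum = ""
}
# The last Fatal record in the file may still be waiting for an id.
END {
	if (FatalMsg != "")
	    printf "%s, %s\n", FatalMsg, "No ID found"
}
' data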

Sorry for the late reply,

Thanks a lot for all your help. My problem is solved. Could anyone show me how to use regular expressions?

Thanks once again

Regular expressions are a pretty big subject. You may find this recent thread useful.
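
As a starting point, here is a minimal sketch of the regular expression pieces that the commands in this thread rely on. The test strings and variable names are made up for illustration.

```shell
# ^ anchors the match to the start of the line: only the first line is printed.
starts=$(printf '%s\n' 'Fatal Error' 'Not Fatal' | awk '/^Fatal/')

# \. is a literal dot and $ anchors to the end of the line: the trailing dot is removed.
trimmed=$(printf '%s\n' 'record 25.' | awk '{ sub(/\.$/, ""); print }')

# An unescaped dot matches any single character, so this removes the "!" as well.
anychar=$(printf '%s\n' 'record 25!' | awk '{ sub(/.$/, ""); print }')

# .* is greedy: it runs up to the last " id = " on the line, then the rest is cut at the next space.
lastid=$(printf '%s\n' 'x: for id = 2394833 idtype = 101 for id = 2394833 idtype = 101 text' |
    awk '{ sub(/^.* id = */, ""); sub(/ .*/, ""); print }')

printf '%s\n' "$starts" "$trimmed" "$anychar" "$lastid"
```

The second and third lines show why the dot is escaped in sub(/\.$/,""): without the backslash, whatever character ends the line would be removed.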