awk to print lines based on string match on another line and condition

Hi folks,

I have a text file that I need to parse, and I can't figure it out. The source is a report listing software from various companies with some basic info about each (see the source snippet below). Ultimately what I want is an Excel sheet with only the Adobe and Microsoft software names, each followed by its version number, tab- or comma-delimited. I'd like to use awk, as I'm trying to get more familiar with it, with a search pattern like /^ *Location:.*Adobe/ to determine which entries are applicable, and then print the lines 4 and 6 before the match. I can't figure out how to do that. On top of it, the main catch is that some entries will satisfy the match but won't have the Version on the expected line (it could be 3 instead of 4 lines before the matched line), so there has to be a check like "if NR-3 or NR-4 starts with /^ ?Version/, print that and then whatever is 2 lines before that." I've tried a ton of code snippets and approaches, but nothing even came close to working, so I haven't included any.

Here is a bit of the source. Notice that the 3rd entry is an example of something that satisfies the string match but has the Version and software name 3 and 5 lines before the match instead of 4 and 6. I wouldn't want the 4th entry included at all. Thanks so much for any help!

    Bridge CS3:

      Version: 2.1.1.9
      Last Modified: 11/6/08 10:27 AM
      Kind: Universal
      Get Info String: 2.1.1.9 (124992), Copyright 2003-2007, Adobe Systems, Inc.
      Location: /Applications/Adobe Bridge CS3/Bridge CS3.app

    Adobe Bridge CS4:

      Version: 3.0.0.464
      Last Modified: 9/15/09 5:34 PM
      Kind: Universal
      Get Info String: 3.0.0.464 (144651), Copyright 2003-2007, Adobe Systems, Inc.
      Location: /Applications/Adobe Bridge CS4/Adobe Bridge CS4.app

    Device Central:

      Version: 1.1.0
      Last Modified: 11/6/08 10:53 AM
      Kind: Universal
      Location: /Applications/Adobe Device Central CS3/Device Central.app

    Chess:

      Version: 2.4.1
      Last Modified: 9/16/09 10:30 AM
      Kind: Universal
      Get Info String: 2.4.1, Copyright 2003-2008 Apple Inc.
      Location: /Applications/Chess.app

The final output should look like this, so I can import it into Excel:

Bridge CS3,2.1.1.9,Adobe Bridge CS4,3.0.0.464,Device Central,1.1.0

awk '{printf (/:$/)?$0:$0"\n"}' urfile |awk 'BEGIN{RS="";FS="\n"} /Adobe/ {split($2,a," ");print $1,a[2]}'

Bridge CS3: 2.1.1.9
Adobe Bridge CS4: 3.0.0.464
Device Central: 1.1.0
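
If a single comma-separated line is the goal, a line-by-line pass that remembers the most recent name and version may be easier than counting lines back from the match. This is only a minimal sketch, assuming every entry name sits alone on a line ending in a colon. The temp file and the variable names (report, result, name, version, sep) are placeholders, not anything from the report itself:

```shell
#!/bin/sh
# Sample data copied from the snippet in the question; "report" is a throwaway temp file.
report=$(mktemp)
cat > "$report" <<'EOF'
    Bridge CS3:

      Version: 2.1.1.9
      Last Modified: 11/6/08 10:27 AM
      Kind: Universal
      Get Info String: 2.1.1.9 (124992), Copyright 2003-2007, Adobe Systems, Inc.
      Location: /Applications/Adobe Bridge CS3/Bridge CS3.app

    Device Central:

      Version: 1.1.0
      Last Modified: 11/6/08 10:53 AM
      Kind: Universal
      Location: /Applications/Adobe Device Central CS3/Device Central.app

    Chess:

      Version: 2.4.1
      Last Modified: 9/16/09 10:30 AM
      Kind: Universal
      Get Info String: 2.4.1, Copyright 2003-2008 Apple Inc.
      Location: /Applications/Chess.app
EOF

result=$(awk '
    # A line ending in ":" starts a new entry: keep its name, forget the old version.
    /:[ \t]*$/ {
        name = $0
        sub(/^[ \t]+/, "", name)
        sub(/:[ \t]*$/, "", name)
        version = ""
        next
    }
    # Remember the version wherever it appears inside the entry.
    /^[ \t]*Version:/ {
        version = $0
        sub(/^[ \t]*Version:[ \t]*/, "", version)
        next
    }
    # The Location line decides: keep the entry only for Adobe paths that had a version.
    /^[ \t]*Location:.*Adobe/ && version != "" {
        out = out sep name "," version
        sep = ","
    }
    END { print out }
' "$report")

echo "$result"
rm -f "$report"
```

On this sample it prints Bridge CS3,2.1.1.9,Device Central,1.1.0. Because the version is remembered rather than looked up by offset, it does not matter whether the Get Info String line is present.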

Hi rdcwayx,

Thanks a lot for the reply. Three things:

One is that when I run that code, the only output I get is exactly the code you wrote:

Rowie718:~/Desktop rowie$ ./parse_app_report2.sh 
awk '{printf (/:$/)?$0:$0"\n"}' ~/Desktop/parse_app_report2.sh |awk 'BEGIN{RS="";FS="\n"} /Adobe/ {split($2,a," ");print $1,a[2]}' 
Rowie718:~/Desktop rowie$

The 2nd thing is that I need to include Microsoft software as well as Adobe.

The 3rd thing is that I believe I need the output to be comma- or tab-delimited in order to import it into Excel. On top of all this, if I can throw in another twist: the source report has multiple people in it. Each person's entry starts with a line that looks like this: Rowie-G5 (192.168.0.153), the consistent part being "192.168.0.". Ideally I'd need this script to produce output that I can import into an Excel sheet, which will look like this (the pipe just signifies field separation):

Rowie-G5 | Bridge CS3 | 2.1.1.9 | Adobe Bridge CS4 | 3.0.0.464 | Microsoft Word | 12.2.4 |
User2 | Bridge CS3 | 2.1.1.9 | Adobe Bridge CS4 | 3.0.0.464 | Microsoft Word | 12.2.4 |
User3 | Bridge CS3 | 2.1.1.9 | Adobe Bridge CS4 | 3.0.0.464 | Microsoft Word | 12.2.4 |

Thanks so much for your help

Why are you running the script with the script itself as the input file?

I don't see any output for your input sample, so you need to post a real sample here.

Ah, sorry, you're right, I was half asleep. When I ran it with the proper input file, the code behaved as expected. However, there are several problems. One is that there will be many entries, not just the 3 from the sample. There are probably 60 entries for each person, but we only need the Adobe and Microsoft software. I've pasted a more representative sample input at the bottom of this post. When I run your code on it, I get this:

USER1-G5 (192.168.0.153) 09:41:55.996
    Device Central: 1.1.0
    Adobe Stock Photos CS3: Adobe
    Make Calendar: Modified:
USER2-iMac-Intel (192.168.0.241) 09:41:55.533
    Device Central: 1.1.0
    Adobe Stock Photos CS3: Adobe
    Make Calendar: Modified:

Instead of this, which is what I'd like:

USER1-G5,Adobe Acrobat Pro,9.2.0,Device Central,1.1.0,Adobe Stock Photos CS3,1.5.0.466,Word,12.2.5
USER2-iMac-Intel,Adobe Acrobat Pro,9.2.0,Device Central,1.1.0,Adobe Stock Photos CS3,1.5.0.466,Word,12.2.5

Problems with the code you supplied: we need the user as the first output field. There are spaces in the output, and it is not comma-separated, which is needed for the Excel import. Adobe Stock Photos needs the version number, not just the first word after the colon (ideally we could take any numeric string after the colon). The Make Calendar entry should not be included since it has no version. Microsoft needs to be included. Thanks so much for all your help, I really appreciate it.

Sample Input Text:

USER1-G5 (192.168.0.153)
2010-06-11 09:41:55.996 system_profiler[111:10b] CFPropertyListCreateFromXMLData(): Old-style plist parser: missing semicolon in dictionary.
Applications:

    Adobe Acrobat Pro:

      Version: 9.2.0
      Last Modified: 9/15/09 5:38 PM
      Kind: Universal
      Get Info String: Adobe® Acrobat® 9.2.0, ©1984-2009 Adobe Systems Incorporated. All rights reserved.
      Location: /Applications/Adobe Acrobat 9 Pro/Adobe Acrobat Pro.app

    Device Central:

      Version: 1.1.0
      Last Modified: 11/6/08 10:53 AM
      Kind: Universal
      Location: /Applications/Adobe Device Central CS3/Device Central.app

    Adobe Stock Photos CS3:

      Version: Adobe Stock Photos 1.5.0.466
      Last Modified: 11/5/08 1:32 PM
      Kind: Universal
      Get Info String: Adobe Stock Photos 1.5.0.466 (C) 2005 Adobe Systems, Inc. All rights reserved.
      Location: /Applications/Adobe Stock Photos CS3/Adobe Stock Photos CS3.app

    Word:

      Version: 12.2.5
      Last Modified: 6/9/10 5:18 PM
      Kind: Universal
      Get Info String: 12.2.5 (100505), © 2007 Microsoft Corporation. All rights reserved.
      Location: /Applications/Microsoft Office 2008/Microsoft Word.app

    Make Calendar:

      Last Modified: 11/5/08 1:37 PM
      Kind: Native (Preferred) or Classic
      Location: /Applications/Adobe Illustrator CS3/Scripting.localized/Sample Scripts.localized/AppleScript/Calendar.localized/Make Calendar.app

    SecureDownloadAgent:

      Version: 1.1
      Last Modified: 9/16/09 10:30 AM
      Kind: Universal
      Location: /System/Library/CoreServices/VerifiedDownloadAgent.app



USER2-iMac-Intel (192.168.0.241)
2010-06-11 09:41:55.533 system_profiler[2182:10b] CFPropertyListCreateFromXMLData(): Old-style plist parser: missing semicolon in dictionary.
Applications:

    Adobe Acrobat Pro:

      Version: 9.2.0
      Last Modified: 9/15/09 5:38 PM
      Kind: Universal
      Get Info String: Adobe® Acrobat® 9.2.0, ©1984-2009 Adobe Systems Incorporated. All rights reserved.
      Location: /Applications/Adobe Acrobat 9 Pro/Adobe Acrobat Pro.app

    Device Central:

      Version: 1.1.0
      Last Modified: 11/6/08 10:53 AM
      Kind: Universal
      Location: /Applications/Adobe Device Central CS3/Device Central.app

    Adobe Stock Photos CS3:

      Version: Adobe Stock Photos 1.5.0.466
      Last Modified: 11/5/08 1:32 PM
      Kind: Universal
      Get Info String: Adobe Stock Photos 1.5.0.466 (C) 2005 Adobe Systems, Inc. All rights reserved.
      Location: /Applications/Adobe Stock Photos CS3/Adobe Stock Photos CS3.app

    Word:

      Version: 12.2.5
      Last Modified: 6/9/10 5:18 PM
      Kind: Universal
      Get Info String: 12.2.5 (100505), © 2007 Microsoft Corporation. All rights reserved.
      Location: /Applications/Microsoft Office 2008/Microsoft Word.app

    Make Calendar:

      Last Modified: 11/5/08 1:37 PM
      Kind: Native (Preferred) or Classic
      Location: /Applications/Adobe Illustrator CS3/Scripting.localized/Sample Scripts.localized/AppleScript/Calendar.localized/Make Calendar.app

    SecureDownloadAgent:

      Version: 1.1
      Last Modified: 9/16/09 10:30 AM
      Kind: Universal
      Location: /System/Library/CoreServices/VerifiedDownloadAgent.app
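
For the "any numeric string after the colon" part, awk's match() function together with RSTART and RLENGTH can pull the first run of digits and dots out of a Version line, so "Version: Adobe Stock Photos 1.5.0.466" yields just the number. This is a rough sketch; extract_version is a made-up helper name, and the three input lines are taken from the sample above:

```shell
#!/bin/sh
# Feed three Version lines from the sample report through a small awk helper.
versions=$(printf '%s\n' \
    '      Version: 9.2.0' \
    '      Version: Adobe Stock Photos 1.5.0.466' \
    '      Version: 1.1' |
awk '
    # extract_version (hypothetical helper): return the first run of digits
    # and dots in s that starts with a digit, or "" when there is none.
    function extract_version(s) {
        if (match(s, /[0-9][0-9.]*/))
            return substr(s, RSTART, RLENGTH)
        return ""
    }
    /^[ \t]*Version:/ { print extract_version($0) }
')

echo "$versions"
```

This prints 9.2.0, 1.5.0.466 and 1.1 on separate lines. It takes the first digit it finds, so a Version line with a number inside the product name (something like "CS3 10.0") would need a stricter pattern.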

This is the code I've worked out. I don't think it's the most elegant way to do it, but it almost does what I need. The problem is that because RS is a blank line, the name of the software ends up in the previous record. I tried changing RS to ".app" and variations on that, but I couldn't get it to work. So for the code I have now, in the last section I'd like to add something that says "print the previous record". I've been looking into things like getline and trying various approaches, but I'm a bit stuck. Any help would be greatly appreciated. Thanks.

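awk 'BEGIN {
    # Paragraph mode: records are separated by blank lines, fields are lines.
    RS = ""; FS = "\n"
}

# Host header, e.g. "USER1-G5 (192.168.0.153)": print the name without the address.
/192\.168\.0\./ {
    gsub(/ \(.*/, "", $1)
    printf "\n%s,", $1
}

# Records that mention Adobe or Microsoft and also contain a Version line.
(/Adobe/ || /Microsoft/) && /Version/ {
    # Only the Version line is printed; the software name sits in the
    # previous record and is not available here.
    if ($1 ~ /Version/) {
        gsub(/^[ \t]+/, "", $1)
        printf "%s,", $1
    }
}' inFile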
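
awk has no built-in way to reach back to the previous record, but the name can be saved in a variable when its record goes by and used when the detail record arrives. Below is a sketch along those lines, assuming the blank-line layout of the sample above and a blank line before each host header. The variable names and the trimmed sample data (including the shortened Make Calendar path) are made up for illustration:

```shell
#!/bin/sh
# Trimmed, ASCII-only stand-in for the report; "report" is a throwaway temp file.
report=$(mktemp)
cat > "$report" <<'EOF'
USER1-G5 (192.168.0.153)
Applications:

    Adobe Acrobat Pro:

      Version: 9.2.0
      Kind: Universal
      Location: /Applications/Adobe Acrobat 9 Pro/Adobe Acrobat Pro.app

    Adobe Stock Photos CS3:

      Version: Adobe Stock Photos 1.5.0.466
      Location: /Applications/Adobe Stock Photos CS3/Adobe Stock Photos CS3.app

    Word:

      Version: 12.2.5
      Location: /Applications/Microsoft Office 2008/Microsoft Word.app

    Make Calendar:

      Last Modified: 11/5/08 1:37 PM
      Location: /Applications/Adobe Illustrator CS3/Make Calendar.app

    SecureDownloadAgent:

      Version: 1.1
      Location: /System/Library/CoreServices/VerifiedDownloadAgent.app


USER2-iMac-Intel (192.168.0.241)
Applications:

    Device Central:

      Version: 1.1.0
      Location: /Applications/Adobe Device Central CS3/Device Central.app
EOF

csv=$(awk '
    BEGIN { RS = ""; FS = "\n" }          # paragraph mode, one field per line

    # Host header: flush the previous host, then start a new output line.
    $1 ~ /\(192\.168\.0\.[0-9]+\)/ {
        if (line != "") print line
        line = $1
        sub(/[ \t]*\(.*/, "", line)
        next
    }

    # A one-line record ending in ":" is a software name: save it for later.
    NF == 1 && /:[ \t]*$/ {
        name = $1
        sub(/^[ \t]+/, "", name)
        sub(/:[ \t]*$/, "", name)
        next
    }

    # Detail record: scan its lines for a version number and a matching Location.
    {
        version = ""; wanted = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[ \t]*Version:/ && match($i, /[0-9][0-9.]*/))
                version = substr($i, RSTART, RLENGTH)
            else if ($i ~ /^[ \t]*Location:.*(Adobe|Microsoft)/)
                wanted = 1
        }
        if (wanted && version != "")
            line = line "," name "," version
    }

    END { if (line != "") print line }
' "$report")

echo "$csv"
rm -f "$report"
```

On the trimmed sample this prints one line per host: USER1-G5,Adobe Acrobat Pro,9.2.0,Adobe Stock Photos CS3,1.5.0.466,Word,12.2.5 and USER2-iMac-Intel,Device Central,1.1.0. Make Calendar is dropped because it has no Version line, and SecureDownloadAgent because its Location mentions neither vendor.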