Count lines separated by new line

prashant2507198 · April 6, 2013, 7:07am

Hi guys,

I have a file with records like the ones below:

emcpower28a
pci@3,03 (disk physical name)
pci@3,04

emcpower9a
pci@1,03
pci@2,03
pci@3,01
pci@4,03

There can be any number of disk names for any LUN (emc...). I want to count the disk names for each LUN, which means:

emcpower28a - 2
emcpower9a - 4

Can somebody help me with this? I can't think of a way to do it.

jim_mcnamara · April 6, 2013, 7:59am

You want to count the records that start with emc?

grep -c '^emc' inputfile

Don_Cragun · April 6, 2013, 8:50am

prashant2507198:

Hi guys,

I have a file which has random records like mentioned below
emcpower28a
pci@3,03 (disk physical name)
pci@3,04

emcpower9a
pci@1,03
pci@2,03
pci@3,01
pci@4,03
there could be any number of disk names for any LUN (emc...) So, I want a solution to count disk names for its respective LUN which means

emcpower28a - 2
emcpower9a - 4

Can somebody help me in this? I am not able to think of any idea to achieve this.

Where does the text I highlighted in red above come from?

prashant2507198 · April 6, 2013, 9:26am

Well, as I mentioned in my original post, each LUN (emcpower...) can have multiple disk names, so the numbers you marked in red are the counts of those disks. In my example emcpower28a has 2 disks, so I wrote 2 there. That's what I want to calculate. Do you see what I mean?

Yoda · April 6, 2013, 9:28am

awk '/^emc/{p=$0}!/^emc/&&NF{A[p]++}END{for(c in A) print c,A[c]}' OFS=" - " file

prashant2507198 · April 6, 2013, 9:30am

No, I want to count the number of disks associated with each emcpower. In my example, one LUN has 2 disks associated with it and another has 4. That is what I want to calculate.

---------- Post updated at 08:30 AM ---------- Previous update was at 08:29 AM ----------

What will it do?

Yoda · April 6, 2013, 9:33am

It will produce an output as per your requirement.

Input file:

$ cat file
emcpower28a
pci@3,03 (disk physical name)
pci@3,04

emcpower9a
pci@1,03
pci@2,03
pci@3,01
pci@4,03

Output:

$ awk '/^emc/{p=$0}!/^emc/&&NF{A[p]++}END{for(c in A) print c,A[c]}' OFS=" - " file
emcpower9a - 4
emcpower28a - 2
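
For readers who find the one-liner dense, here is the same logic spread over several lines with comments. This is only a sketch: the file name `luns.txt` and the function name `count_disks` are made up for illustration. One detail worth knowing is that awk does not define the order in which `for (c in A)` visits array elements, which is likely why the output above lists emcpower9a before emcpower28a; piping through `sort` gives a stable order.

```shell
# Sample input in the format from the original post (hypothetical file name).
cat > luns.txt <<'EOF'
emcpower28a
pci@3,03 (disk physical name)
pci@3,04

emcpower9a
pci@1,03
pci@2,03
pci@3,01
pci@4,03
EOF

# count_disks is an illustrative helper name, not something from the thread.
count_disks() {
    awk '
        /^emc/        { p = $0 }   # LUN header line: remember its name
        !/^emc/ && NF { A[p]++ }   # non-blank, non-header line: one more disk for that LUN
        END {
            for (c in A)           # traversal order is unspecified in awk
                print c, A[c]
        }
    ' OFS=" - " "$1"
}

# sort makes the order of the result lines predictable.
count_disks luns.txt | LC_ALL=C sort
```

On Solaris, replace `awk` with `nawk`, since the old `/usr/bin/awk` there may reject this syntax.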

prashant2507198 · April 6, 2013, 9:43am

Well, it didn't give the expected output.

Let me show you what I did.

The file has records like this:

Pseudo name=emcpower0a
3076 pci@2,600000/SUNW,emlxs@0/fp@0,0 c1t5Fd94s0 FA 13cA   active  alive      0      0
3075 pci@12,600000/SUNW,emlxs@0/fp@0,0 c3t5Dd94s0 FA  4cA   active  alive      0      0

Pseudo name=emcpower8a
3076 pci@2,600000/SUNW,emlxs@0/fp@0,0 c1t5Cd95s0 FA 13cA   active  alive      0      0
3075 pci@12,600000/SUNW,emlxs@0/fp@0,0 c3t53d95s0 FA  4cA   active  alive      0      0
3073 pci@12,600000/SUNW,emlxs@0/fp@0,0 c3t54d95s0 FA  4cA   active  alive      0      0

Pseudo name=emcpower15a
3076 pci@2,600000/SUNW,emlxs@0/fp@0,0 c1t57d165s0 FA 13cA   active  alive      0      0
3075 pci@12,600000/SUNW,emlxs@0/fp@0,0 c3t52d165s0 FA  4cA   active  alive      0      0

I ran the awk command below and got an error:

awk '/^emc/{p=$0}!/^emc/&&NF{A[p]++}END{for(c in A) print c,A[c]}' OFS=" - " /tmp/dev1
awk: syntax error near line 1
awk: bailing out near line 1

So I used nawk, which ran without errors but didn't produce the expected output.

nawk '/^emc/{p=$0}!/^emc/&&NF{A[p]++}END{for(c in A) print c,A[c]}' OFS=" - " /tmp/dev1
 - 696

---------- Post updated at 08:43 AM ---------- Previous update was at 08:40 AM ----------

Based on the above, I want output something like:

emcpower0a - 2
emcpower8a - 3
emcpower15a - 2

Yoda · April 6, 2013, 9:52am

Do you expect this code to work when you change the file format?

Read it carefully and you will understand why it didn't work.
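
To spell out the reason: the first command only treats a line as a LUN header if it starts with `emc`, but in the real file the name comes after `Pseudo name=`. No line matches `^emc`, so `p` is never set and every non-blank line is counted under the empty string, which would explain the single ` - 696` line. A reduced sketch of that behaviour, using a shortened sample (the file name `dev1.sample` is made up for illustration):

```shell
# Shortened sample in the real file's format (three non-blank lines).
cat > dev1.sample <<'EOF'
Pseudo name=emcpower0a
3076 pci@2,600000/SUNW,emlxs@0/fp@0,0 c1t5Fd94s0 FA 13cA   active  alive      0      0
3075 pci@12,600000/SUNW,emlxs@0/fp@0,0 c3t5Dd94s0 FA  4cA   active  alive      0      0
EOF

# No line begins with "emc", so p stays empty and all three
# non-blank lines are tallied under the empty key.
awk '/^emc/{p=$0}!/^emc/&&NF{A[p]++}END{for(c in A) print c,A[c]}' OFS=" - " dev1.sample
```

With this three-line sample the command prints a single line, ` - 3`: an empty name followed by the total number of non-blank lines.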

Here is something that will work for your current input:

nawk -F'=' '
    /emcpower/ {
        d = $2
    }
    !/emcpower/ && NF {
        A[d]++
    }
    END {
        for ( c in A )
            print c, A[c]
    }
' OFS=' - ' file

Please make sure that you post representative samples that are similar to your original input file data in future, otherwise it is a waste of time for you and people who are trying to help...
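
Since awk does not guarantee any particular order for `for ( c in A )`, the lines may not come out in the order the devices appear in the file. If file order matters, one possible variant is to record each name the first time it is seen. This is a sketch under assumptions: `paths.txt` and `count_paths_in_order` are illustrative names, and the sample is the three-device excerpt posted earlier in the thread.

```shell
# Excerpt in the format posted above (hypothetical file name).
cat > paths.txt <<'EOF'
Pseudo name=emcpower0a
3076 pci@2,600000/SUNW,emlxs@0/fp@0,0 c1t5Fd94s0 FA 13cA   active  alive      0      0
3075 pci@12,600000/SUNW,emlxs@0/fp@0,0 c3t5Dd94s0 FA  4cA   active  alive      0      0

Pseudo name=emcpower8a
3076 pci@2,600000/SUNW,emlxs@0/fp@0,0 c1t5Cd95s0 FA 13cA   active  alive      0      0
3075 pci@12,600000/SUNW,emlxs@0/fp@0,0 c3t53d95s0 FA  4cA   active  alive      0      0
3073 pci@12,600000/SUNW,emlxs@0/fp@0,0 c3t54d95s0 FA  4cA   active  alive      0      0

Pseudo name=emcpower15a
3076 pci@2,600000/SUNW,emlxs@0/fp@0,0 c1t57d165s0 FA 13cA   active  alive      0      0
3075 pci@12,600000/SUNW,emlxs@0/fp@0,0 c3t52d165s0 FA  4cA   active  alive      0      0
EOF

# count_paths_in_order is an illustrative helper name.
count_paths_in_order() {
    awk -F'=' '
        /emcpower/ {                            # header line such as "Pseudo name=emcpower0a"
            d = $2                              # text after "=" is the pseudo device name
            if (!seen[d]++) order[++n] = d      # remember names in first-seen order
            next
        }
        NF { A[d]++ }                           # any other non-empty line counts as one path
        END {
            for (i = 1; i <= n; i++)            # walk the names in file order
                print order[i], A[order[i]] + 0 # "+ 0" prints 0 for a device with no paths
        }
    ' OFS=' - ' "$1"
}

count_paths_in_order paths.txt
```

For this excerpt the result is `emcpower0a - 2`, `emcpower8a - 3`, `emcpower15a - 2`, one per line. On Solaris, use `nawk` in place of `awk`.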

prashant2507198 · April 6, 2013, 10:18am

WOW. It worked like butter. Can you please help me understand the components of this command, or point me to any documentation about nawk?

Yoda · April 6, 2013, 11:04am

Here is what the code does:

nawk -F'=' '                        # Use "=" as the input field separator
    /emcpower/ {                    # Line contains "emcpower" (the "Pseudo name=..." line)
        d = $2                      # Store the 2nd field, the pseudo device name, in d
    }
    !/emcpower/ && NF {             # Line has no "emcpower" and is not empty (NF >= 1)
        A[d]++                      # Count one more path for the current pseudo device
    }
    END {                           # Runs once after all input has been read
        for ( c in A )              # For each pseudo device name in the array (order unspecified)
            print c, A[c]           # Print the name and its count
    }
' OFS=' - ' file                    # Use " - " as the output field separator

Read the nawk manual for further reference: man nawk

prashant2507198 · April 6, 2013, 11:58am

AWESOME. I have found a book on awk which I will read later, but for now I needed an urgent solution, so I posted here.

Can you please help me with this too? It would be a really great help.

I have a file like this:

        /dev/rdsk/c20t638d0s2
                Total Path Count: 6
                Operational Path Count: 6
       /dev/rdsk/c20t630d0s2
                Total Path Count: 6
                Operational Path Count: 6
        /dev/rdsk/c20t639d0s2
                Total Path Count: 6
                Operational Path Count: 6

I want output only

/dev/rdsk/c20t639d0s2 - 6 (total path count numbers)

RudiC · April 6, 2013, 4:02pm

Given your file above, and hoping you're NOT changing the file format, this should do the job for you:

$ awk '/\/dev\// {TMP=$0; getline; print TMP " - " $NF}' file
        /dev/rdsk/c20t638d0s2 - 6
       /dev/rdsk/c20t630d0s2 - 6
        /dev/rdsk/c20t639d0s2 - 6
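
The output above keeps the leading whitespace from the input, and `getline` assumes the count is always on the very next line. If that is a concern, one alternative is to remember the device name and print when the "Total Path Count" line arrives. This is a sketch assuming the file format shown above; `mpath.txt` and `total_paths` are made-up names for illustration.

```shell
# Sample in the format posted above (hypothetical file name).
cat > mpath.txt <<'EOF'
        /dev/rdsk/c20t638d0s2
                Total Path Count: 6
                Operational Path Count: 6
       /dev/rdsk/c20t630d0s2
                Total Path Count: 6
                Operational Path Count: 6
        /dev/rdsk/c20t639d0s2
                Total Path Count: 6
                Operational Path Count: 6
EOF

# total_paths is an illustrative helper name.
total_paths() {
    awk '
        /\/dev\//           { dev = $1 }             # default field splitting drops leading blanks
        /Total Path Count:/ { print dev " - " $NF }  # last field on this line is the count
    ' "$1"
}

total_paths mpath.txt
```

Each device is then printed without indentation, for example `/dev/rdsk/c20t638d0s2 - 6`. On Solaris, `nawk` can be used in place of `awk`.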