Help Needed Using awk/CUT

itsme488 · January 18, 2014, 7:13pm

Hi Experts,

I am writing a script and am stuck at one part. I need your help to get this working.

I have a file generated called /tmp/testify.log

$ cat testify.log

Machine Parts                   6       DREE
Mufler Strengths        33              XYNC
Siscos                  20      09      ABSC


$ cat /tmp/testify.log | grep -v '^$'|awk 'BEGIN { FS = "[ \t]+" } {for(i=1;i<=NF;i++) print $i;} END {print " "}'
Machine                                        ###>># here column 1 of row 1 is split across 2 separate lines, but I want it on a single line; also, some columns are empty, and I just want to replace each empty value with -
Parts
6
DREE

Mufler
Strengths
33
XYNC

Siscos
20
09
ABSC

How can I achieve the required output? The anticipated output is:

Machine Parts
-
6
DREE

Mufler Strengths
33
-
XYNC

Siscos
20
09
ABSC

Scrutinizer · January 19, 2014, 1:46am

The question is what the format of the input file is. If it is TAB separated, then something like this may do:

awk -F'\t' '{for(i=1; i<=NF; i++) print ($i=="")?"-":$i; print ""}' file

(don't use [\t]+ , since it will "eat" empty fields)

But your attached file looks like it may be fixed position format (with spaces) and then you would need something else.
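To illustrate the remark about `[\t]+`, here is a small sketch using a made-up line with three tab-separated fields, the middle one empty. The helper name `count_fields` and the sample data are assumptions made only for this illustration.

```shell
# Print the number of fields awk sees on each input line.
# $1 = field separator (string or regex), stdin = input lines
count_fields() {
    awk -F"$1" '{ print NF }'
}

# "a", "", "c" separated by single tabs.
printf 'a\t\tc\n' | count_fields '\t'      # prints 3: the empty field is kept
printf 'a\t\tc\n' | count_fields '[\t]+'   # prints 2: the run of tabs is one separator
```

With a single tab as the separator, the empty middle field survives and can be replaced with "-". With `[\t]+`, two adjacent tabs collapse into one separator and the empty field disappears.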

Don_Cragun · January 19, 2014, 1:48am

The following seems to do what you want if the input file contains tabs, spaces, or a combination of spaces and tabs between fields. It assumes that the fields reside in fixed character positions and that tab stops are set after every eight characters:

awk '
function extract_field(first, last,     string) {
        string = substr($0, first, last - first + 1)
        sub(/ *$/, "", string)
        return string == "" ? "-" : string
}
/^$/ {  next
}
/\t/ {  # Convert tabs in input to spaces assuming tab stops are set in columns
        # 9 + 8x.
        while(i = index($0, "\t")) $0 = substr($0, 1, i - 1) \
                sprintf("%.*s", 8 - (i - 1) % 8, "        ") substr($0, i + 1)
}
{       # At this point, field 1 is characters 1-24, field 2 is characters
        # 25-32, field 3 is characters 33-40, and field 4 is characters 41-EOL.
        printf("%s%s\n%s\n%s\n%s\n", oc++ ? "\n" : "", extract_field(1,24),
                extract_field(25,32), extract_field(33,40),
                extract_field(41,80))
}' testify.log

If you want to run this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk , /usr/xpg6/bin/awk , or nawk .
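The tab handling above can be looked at in isolation. The following is a minimal sketch of the same idea, assuming tab stops every eight characters (columns 9, 17, 25, ...). The function name `expand_tabs` and the variable `pad` are made up for this illustration.

```shell
# Replace each tab with the number of spaces needed to reach the next
# tab stop, assuming stops every 8 characters.
expand_tabs() {
    awk '{
        while ((i = index($0, "\t")) > 0) {
            pad = 8 - (i - 1) % 8      # spaces needed to reach the next stop
            $0 = substr($0, 1, i - 1) sprintf("%" pad "s", "") substr($0, i + 1)
        }
        print
    }'
}

# A tab in position 3 becomes 6 spaces, so "c" lands in column 9.
printf 'ab\tc\n' | expand_tabs
```

Once every tab has been expanded this way, character positions in the line match what is seen on a terminal with default tab stops, which is what makes extraction by fixed positions possible.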

Akshay_Hegde · January 19, 2014, 7:06am

Could this help you?

$ cat file
Machine Parts                   6       DREE
Mufler Strengths        33              XYNC
Siscos                  20      09      ABSC

awk  '

function spr(){
                  n=split($0,A,r) 
                  for(i=1;i<=n;i++)
                      {
                        so = so ? so A[i] : A[i]
                        if((i in W) || n == i)
                           {
                             print so ~ /[[:alnum:]]|[[:punct:]]/ ? so : "-"  
                             so = ""
                           }
                      }
                  printf FNR < C ? RS : NULL
              }

       FNR==NR{
                   j=0
                   n=split($0,A,r)
                   for(i=1; i<=n; i++)
                     {
                     if(f==0 && A[i]!=" ")
                               {
                                  s=i
                                  f=1
                               }
                     if( f == 1 && A[i] != " " && A[i+1]== " " && A[i+2] == " " || i == n)
                               {
                                  f=0
                                  ++j
                                  WI[FNR,j] = s     
                               }
                     }
        
                     # Consider width of max NF
                         if(j>p){
                                   mf  = j
                                   max = FNR
                                }
                       p = j
                       C = FNR
                       next
              }

        FNR==1{
                     for(i=1; i<=mf; i++)
                     W[(WI[max,i]-1) != 0 ? WI[max,i] -1 : -1 ]
              }

              {
                       spr()
              }

    '  file file

Resulting

Machine Parts           
-
6       
DREE

Mufler Strengths        
33      
-
XYNC

Siscos                  
20      
9      
ABSC

itsme488 · January 21, 2014, 11:04pm

Don - Thanks for your awk script, but when I used it on my input file things still did not work as expected.

This is my input file. Since the spacing after each field is different, it's tough to handle.

Any better way would be helpful.

AB SISCO Transport SEED           2,189,675       4,308      2   52.3 USO
AB XPU                                            2,854          34.6
Dreed pinch Feed                    308,043         811      3    9.8 UIO
PX MNC: AB Load Del Info                950         189    199    2.3 Other
MN SISCO Transport FEED             215,085         165           2.0 ION
AB-TRANS SISCO Transport REED       105,321         127      1    8.1 SIO NB

Don_Cragun · January 22, 2014, 2:38am

I guess that I'm not surprised that code that was written to extract 4 left-justified fields from an input line doesn't correctly extract 6 fields (some left-justified and some right-justified) with completely different field alignments. If you give us sample input that is not representative of your real input, you're wasting all of our time. With your new sample input, the following updated awk script does something that I am guessing is closer to what you want, but (since you didn't show what the output should be), it is just a guess:

awk '
function extract_field(first, last,     string) {
        string = substr($0, first, last - first + 1)
        sub(/^ */, "", string)
        sub(/ *$/, "", string)
        return string == "" ? "-" : string
}
/^$/ {  next
}
/\t/ {  # Convert tabs in input to spaces assuming tab stops are set in columns
        # 9 + 8x.
        while(i = index($0, "\t")) $0 = substr($0, 1, i - 1) \
                sprintf("%.*s", 8 - (i - 1) % 8, "        ") substr($0, i + 1)
}
{       # At this point, field 1 is characters 1-32, field 2 is characters
        # 33-43, field 3 is characters 44-55, and field 4 is characters 56-62,
        # field 5 is characters 63-69, and field 6 is characters 70-EOL.  All
        # of these field offsets are guesses based on provided sample input.
        printf("%s%s\n%s\n%s\n%s\n%s\n%s\n",
                oc++ ? "\n" : "",
                extract_field(1,32),
                extract_field(33,43),
                extract_field(44,55),
                extract_field(56,62),
                extract_field(63,69),
                extract_field(70,100))
}' testify.log2

When your new sample input is saved in a file named testify.log2, the above script produces the output:

db file scattered read
2,189,675
4,308
2
52.3
User I/O

DB CPU
-
2,854
-
34.6
-

direct path read
308,043
811
3
9.8
User I/O

PX Nsq: PQ load info query
950
189
199
2.3
Other

db file sequential read
215,085
165
1
2.0
User I/O

control file sequential read
105,321
127
1
8.1
System I/O

Of course, there are no tabs in the sample input you've shown us. And, if there is a tab at the end of each field but the last, your tab stops are not set at multiples of eight character boundaries.

Is that what you want?
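Since the column layout changed between the two samples, one option is to pass the character ranges in as a parameter instead of hard-coding them. This is only a sketch: the function name `split_fixed` and the space-separated "first-last" range format are made up for this illustration, and it assumes any tabs have already been expanded to spaces.

```shell
# Print each fixed-position field on its own line, "-" for blank fields.
# $1 = space-separated "first-last" character ranges (hypothetical format)
split_fixed() {
    awk -v ranges="$1" '
    BEGIN { n = split(ranges, r, " ") }
    /^$/  { next }                            # skip empty input lines
    {
        if (seen++) print ""                  # blank line between records
        for (k = 1; k <= n; k++) {
            split(r[k], b, "-")               # b[1] = first, b[2] = last
            s = substr($0, b[1], b[2] - b[1] + 1)
            gsub(/^ +| +$/, "", s)            # trim leading and trailing spaces
            print (s == "" ? "-" : s)
        }
    }'
}

# Possible usage with the two layouts guessed in this thread:
#   split_fixed '1-24 25-32 33-40 41-80' < testify.log
#   split_fixed '1-32 33-43 44-55 56-62 63-69 70-100' < testify.log2
```

If the real file uses different offsets, only the range list needs to change.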

Akshay_Hegde · January 22, 2014, 3:16am

Don: I don't think the new sample input provided in #5 is the real one. I noticed the input changed 2-3 times in the post, and I too felt it was just a waste of time.

I think we should start asking "What have you tried so far?" before answering. I have seen many members post their input and expected output and log out (very few post any attempt of their own in the thread), come back after some time, copy and paste the answer, and log out again. If nobody answers, or the answer is delayed, they send private messages, bump the thread, cross-post, etc. I think that for this type of bumping, infractions alone are not enough; we should have something like denying login for some hours. That is my opinion.

itsme488 · January 23, 2014, 9:56pm

Thanks Don, it worked. Thanks for your help.