Splitting file based on pattern and first character

I have a file, pema.txt, with the contents below:

s2dhshfu dshfkdjh dshfd 
rjhfjhflhflhvflxhvlxhvx vlvhx
sfjhldhfdjhldjhjhjdhjhjxhjhxjxh
sjfdhdhfldhlghldhflhflhfhldfhlsh
rjsdjh#error occured#
skjfhhfdkhfkdhbvfkdhvkjhfvkhf
sjkdfhdjfh#error occured#  

My requirement is to create 3 files from the above file:

1) pema.junk
should contain all the records where the text #error occured# is present
2) pema.s
should contain all records starting with s, except the ones in pema.junk
3) pema.r
should contain all records starting with r, except the ones in pema.junk

The first line of the file doesn't contain any data, so it should be ignored.

And if possible, all data after position 135 should be truncated when copying
into the s and r files.

Like this..?

awk '/#error occured#/ {printf ("%.135s\n",$0) > "pema.junk"} /^r/ && ! /#error occured#/ { printf ("%.135s\n",$0) > "pema.r"} /^s/ && ! /#error occured#/ {printf ("%.135s\n",$0) > "pema.s"}' pema.txt

I know it looks dirty :wink:


It doesn't have to. Here's the same code formatted:

awk '
  /#error occured#/ {
    printf ("%.135s\n",$0) > "pema.junk"
  }

  /^r/ && ! /#error occured#/ { 
    printf ("%.135s\n",$0) > "pema.r"
  } 

  /^s/ && ! /#error occured#/ {
    printf ("%.135s\n",$0) > "pema.s"
  }
' pema.txt
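
Neither version above skips the first line, which the original post says carries no data. Here is a minimal sketch of one way to add that, assuming the header is always record 1. The sample file is invented for the demo and written to a scratch directory. Junk records are written untruncated here, which is an assumption, since the request only mentions truncation for the s and r files.

```shell
cd "$(mktemp -d)" || exit 1    # scratch directory so nothing real is overwritten

# Invented sample: a header line, then s/r records, two of them flagged as errors
printf '%s\n' \
  'header with no data' \
  's2dhshfu dshfkdjh dshfd' \
  'rjhfjhflhflhvflxhvlxhvx vlvhx' \
  'rjsdjh#error occured#' \
  'sjkdfhdjfh#error occured#' > pema.txt

awk '
  NR == 1 { next }                                  # skip the header line
  /#error occured#/ { print > "pema.junk"; next }   # error records go to junk only
  /^r/ { printf("%.135s\n", $0) > "pema.r" }        # keep at most 135 characters
  /^s/ { printf("%.135s\n", $0) > "pema.s" }
' pema.txt
```

The `next` after the junk rule means the r and s rules no longer need the `! /#error occured#/` test.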

Or:

awk '{f="pema." (/#error occured#/?"junk":substr($0,1,1)); print substr($0,1,135)>f}' infile

thank you guys.. will try and let you know


i tried it but i'm getting an error
syntax error The source line is 13.
The error context is
>>> pema. <<< txt


apologies, the code works perfectly... the error was due to me..thank you very much


One more request, a slight modification to the code below:

awk '
  /#error occured#/ {
    printf ("%.135s\n",$0) > "pema.junk"
  }

  /^r/ && ! /#error occured#/ { 
    printf ("%.135s\n",$0) > "pema.r"
  } 

  /^s/ && ! /#error occured#/ {
    printf ("%.135s\n",$0) > "pema.s"
  }
' pema.txt

Can we make it so that the S and R are ignored, and if a record is shorter than 135 characters, can we pad it with spaces so that it is 135 in length?

Do you mean ignore the case of r or s, so that lines starting with s or S go to the same file, and likewise for r and R? Something like this?

awk '
  function pr(f){
    printf ("%-135s\n",substr($0,1,135))>f
  }     
                  
  /#error occured#/ {
    pr("pema.junk")
    next
  }

  /^(r|R)/ { 
    pr("pema.r")
  } 

  /^(s|S)/ {
    pr("pema.s")
  }
' infile
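
The pr() function truncates with substr and pads with the left-justified %-135s format. Here is a small sketch of the same idea with a width of 10, so the effect is easy to see. The width and the sample strings are made up, and `out` is just a name chosen for the demo. Giving printf both a width and a precision (%-10.10s) pads and truncates in one step.

```shell
# "|" marks the end of each formatted field
out=$(printf '%s\n' 'short' 'much longer than ten' |
      awk '{ printf("%-10.10s|\n", $0) }')
printf '%s\n' "$out"
```

The first line comes out padded with five spaces before the bar, and the second is cut to "much longe" followed by the bar.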

--
A shorter version that pads and truncates in a single printf, and should work in any POSIX awk:

awk '{f="pema." (/#error occured#/?"junk":tolower(substr($0,1,1))); printf("%-135.135s\n",$0)>f}' infile

Sorry for the confusion. I meant to say: can I drop the leading 's' and 'r'
so that the record looks like

2dhshfu dshfkdjh dshfd 

instead of

s2dhshfu dshfkdjh dshfd

Like so?

awk '
  /#error occured#/ {
    print > "pema.junk"
    next
  }

  /^r/{ 
    printf "%-135s\n",substr($0,2,135) > "pema.r"
  } 

  /^s/{
    printf "%-135s\n",substr($0,2,135) > "pema.s"
  }
' pema.txt
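
To double-check the result, here is a sketch that runs the same rules on invented data in a scratch directory and counts any output record that is not exactly 135 characters long. `bad` is a hypothetical variable name used only for this check.

```shell
cd "$(mktemp -d)" || exit 1    # scratch directory so nothing real is overwritten

# Invented sample data, one s record, one r record, one error record
printf '%s\n' 's2dhshfu dshfkdjh dshfd' 'rjhfjhflhflhvflxhvlxhvx vlvhx' \
  'rjsdjh#error occured#' > pema.txt

awk '
  /#error occured#/ { print > "pema.junk"; next }
  /^r/ { printf "%-135s\n", substr($0,2,135) > "pema.r" }
  /^s/ { printf "%-135s\n", substr($0,2,135) > "pema.s" }
' pema.txt

# Count records in the r and s files whose length is not 135; expect none
bad=$(awk 'length($0) != 135' pema.r pema.s | wc -l | tr -d ' ')
echo "records with wrong length: $bad"
```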

thanks works perfect