How to delete all lines before a particular pattern when the pattern is defined in a variable?

I have a file

Line 1 a
Line 22
Line 33
Line 1 b
Line 22
Line 1 c
Line 4
Line 5

I want to delete all lines up to and including the last occurrence of a line which contains something that is defined in a variable. Say a variable var contains 'Line 1'; then I need the following in the output.

Line 4
Line 5
cat /dev/null >outfile
var="line 1"
while read line
do
   echo "$line" >>outfile
   if [ "$var" = "$line" ]
       then
       cat /dev/null >outfile
   fi
done <inputfile

Thx.. But this will not work. The lines are not just 'Line 1'; they contain some more text also. And I need to delete all lines before the last occurrence.

I could not construct a simple sed or awk script.

Hi,
with your example file:

$ cat /tmp/bob2 
Line 1 a
Line 22
Line 33
Line 1 b
Line 22
Line 1 c
Line 4
Line 5
$ XX="Line 1"
$ awk -vRS="$XX(\n| [^\n]+\n)" -vORS="" 'END{print}' /tmp/bob2
Line 4
Line 5
$ XX="Line 2"
$ awk -vRS="$XX(\n| [^\n]+\n)" -vORS="" 'END{print}' /tmp/bob2
Line 1 a
Line 22
Line 33
Line 1 b
Line 22
Line 1 c
Line 4
Line 5
$ XX="Line 22"
$ awk -vRS="$XX(\n| [^\n]+\n)" -vORS="" 'END{print}' /tmp/bob2
Line 1 c
Line 4
Line 5

Regards.

I tried this but I am not getting any output!! FYI, I am using ksh. I tried sh also, with the same result.

There are far better ways to put variables in awk than that, and cramming the pattern into RS is liable to produce gigantic records that will be truncated.

PAT="Line 1"
awk 'NR==FNR { if(match($0, PAT)) P=NR ; next } FNR > P' PAT="Line 1" inputfile inputfile

Note that the input file is given twice, once to find the last pattern, the second time to print everything after it.

If this doesn't work for you, please show exactly how you used it, word for word, letter for letter, keystroke for keystroke.

Now I get some compilation error:

ksh: cat e
Line 1
Line 22
Line 33
Line 1
Line 22
Line 1
Line 4
Line 5
ksh:
ksh: PAT="Line 1"
ksh: awk 'NR==FNR { if(match($0, PAT)) P=NR ; next } FNR > P' PAT="Line 1" e e
awk: syntax error near line 1
awk: illegal statement near line 1
ksh:

Ok,
What's your operating system?

SunOS

Could you try the awk solutions with /usr/xpg4/bin/awk?
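For reference, the old /usr/bin/awk on Solaris predates builtins such as match(), which would explain the syntax error above. A sketch of the same two-pass command with the XPG4 awk (the path is the usual Solaris location, an assumption here; it falls back to plain awk elsewhere, and the sample file is recreated for illustration):

```shell
# Old /usr/bin/awk has no match() builtin; prefer the XPG4 awk on Solaris.
AWK=/usr/xpg4/bin/awk
[ -x "$AWK" ] || AWK=awk        # fall back where that path does not exist

# Sample data from the thread
printf '%s\n' 'Line 1 a' 'Line 22' 'Line 33' 'Line 1 b' 'Line 22' \
              'Line 1 c' 'Line 4' 'Line 5' > e

# Pass 1 records the line number of the last match; pass 2 prints what follows.
"$AWK" 'NR==FNR { if (match($0, PAT)) P = NR; next } FNR > P' PAT="Line 1" e e
```

Because the second pass prints only FNR > P, the matching line itself is excluded, as the expected output requires.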

Hi.

Here we restart saving at each pattern match, so discarding everything saved before the last match, then print what remains at the end.

#!/usr/bin/env bash

# @(#) s1       Demonstrate deleting all lines before the last matching line, awk, gawk.

PATTERN=${1-"Line 1"}

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C awk

pl " Pattern: \"$PATTERN\""

FILE=data1
N=${FILE//[A-Za-z]/}
E=expected-output$N

pl " Input data file $FILE:"
cat $FILE

pl " Expected output:"
cat $E

pl " Results:"
awk -vPATTERN="$PATTERN" '
BEGIN   { i = 0 }
$0 ~ PATTERN    { i = 0; next }
                { i++ ; a[i] = $0 }
END     { for (j=1; j<=i; j++) print a[j] }
' $FILE |
tee f1

pl " Verify results if possible:"
C=$HOME/bin/pass-fail
[ -f $C ] && $C f1 $E || ( pe; pe " Results cannot be verified." ) >&2

exit 0

producing:

$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.9 (jessie) 
bash GNU bash 4.3.30
awk GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p3, GNU MP 6.0.0)

-----
 Pattern: "Line 1"

-----
 Input data file data1:
Line 1 a
Line 22
Line 33
Line 1 b
Line 22
Line 1 c
Line 4
Line 5

-----
 Expected output:
Line 4
Line 5

-----
 Results:
Line 4
Line 5

-----
 Verify results if possible:

-----
 Comparison of 2 created lines with 2 lines of desired results:
 Succeeded -- files (computed) f1 and (standard) expected-output1 have same content.

Although this was run in Linux, our Solaris has gawk available, so I would expect similar results:

OS, ker|rel, machine: SunOS, 5.11, i86pc
Distribution        : Solaris 11.3 X86
gawk GNU Awk 3.1.8

Best wishes ... cheers, drl

This was exactly jgt's solution (see post #2), and it got disregarded, probably without any test. Your solution will perhaps meet the same fate.

Note also that the data shown is not the real data, the expression is not a real expression, and something tells me that everything else is different too, including the clock spinning backwards. Good luck writing scripts to solve unknown requirements on unknown data with unknown restrictions.

bakunin


Hi.

Yes, same idea, but using memory rather than files -- not a big difference for small files, but for very large files awk will probably be faster, up to a memory limit beyond which the shell solution could be a winner just for not using too much memory. That would be a big file.

The sample suggests not-huge data.

Apologies to jgt for not recognizing the same solution, thanks to bakunin for pointing it out... cheers, drl

jgt's solution has a UUOC (useless use of cat): a plain >outfile truncates the file just as well as cat /dev/null >outfile.
The following should be more efficient, and uses a partial *glob* match.
Yet untested; I hope that exec works like that in all shells:

var="Line 1"
exec 3>outfile
while read line
do
  case $line in
  *$var*) exec 3>outfile
  ;;
  *) echo "$line" >&3
  ;;
  esac
done <inputfile

May not be the most efficient:

tac file | sed -n "/$var/q; p;" | tac
Line 4
Line 5
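A portability note on the above: tac is a GNU coreutils tool and may well be absent on SunOS. A hypothetical stand-in reverses the lines with awk (available everywhere), keeping the same quit-at-last-match idea; the helper name and sample file here are illustrative:

```shell
var="Line 1"
printf '%s\n' 'Line 1 a' 'Line 22' 'Line 33' 'Line 1 b' 'Line 22' \
              'Line 1 c' 'Line 4' 'Line 5' > file

# Reverse, print until the first (i.e. originally last) match, reverse back.
rev_lines() { awk '{ a[NR] = $0 } END { for (i = NR; i >= 1; i--) print a[i] }' ; }
rev_lines < file | sed -n "/$var/q; p" | rev_lines
```

Unlike tac, this buffers the whole file in awk's memory, so it loses the seek-from-the-end advantage discussed later in the thread; it is only a fallback where tac is missing.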

Just for fun (with gnu awk) :

$ XX="Line 1"
$ awk  'BEGIN{X=0;ARGV[ARGC++]=ARGV[ARGC-1]}FNR==NR && /'"$XX"'/ {X=FNR} FNR !=NR && FNR > X' file
Line 4
Line 5

Works like Corona688's idea.
Regards.

As a very different approach, how about:-

Req_Line="^Line 1 "                   # Note the leading caret to anchor to the beginning of the line (if that's what you want)
                                      # and the trailing space to avoid matching Line 11.
                                      # This is used as an Extended Regular Expression, so you can adjust it to suit your needs.

IFS=":" read lastline rest < <(grep -En "$Req_Line" filename |tail -1)
((lastline=$lastline+1))

sed -n "$lastline,\$p" filename

For a large file, this has the overhead of perhaps reading the file twice, so you would have to trial it for performance.

It would be sensible to add some error checking, such as what to do if the expression does not match. As it is, this would display the whole file, which might not be what you want.
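One way to sketch that check, assuming the same grep/sed approach (the file name and sample data here are illustrative, and `${last%%:*}` replaces the bash-only `read < <(...)` so it also runs in plain sh):

```shell
Req_Line="^Line 1 "
printf '%s\n' 'Line 1 a' 'Line 22' 'Line 33' 'Line 1 b' 'Line 22' \
              'Line 1 c' 'Line 4' 'Line 5' > filename     # illustrative data

# "N:matched line" for the last match, or empty when nothing matched
last=$(grep -En "$Req_Line" filename | tail -1)
if [ -z "$last" ]; then
    echo "Pattern '$Req_Line' not matched; not printing anything" >&2
    exit 1
fi
lastline=$(( ${last%%:*} + 1 ))     # first line after the last match
sed -n "$lastline,\$p" filename
```

With the sample data the last match is on line 6, so the sed prints lines 7 onwards; with a non-matching pattern the script now stops instead of printing the whole file.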

I hope that this helps, or at least gives an alternate.
Robin


Hi.

Perhaps not the most efficient, but, in the absence of requirements from OP, it certainly is simple. I like simple ... cheers, drl

Thanks all.

Disedorgue's 'just for fun' solution worked for me with nawk.

This approach may be more efficient than the others for large files with the pattern found towards the end of the file, as tac opens the file and reads from the end (output of strace):

.
.
.
open("TMPFILE", O_RDONLY)               = 3
lseek(3, 0, SEEK_END)                   = 37790
.
.
.

and, if sed finds the pattern and exits, tac quits due to a broken pipe,

.
.
.
write(1, "sr/bin/gpgsplit\n26696\t/usr/bin/g"..., 4096) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=14645, si_uid=1000} ---

NOT reading the entire input file.