Line 1 a
Line 22
Line 33
Line 1 b
Line 22
Line 1 c
Line 4
Line 5
I want to delete all lines before the last occurrence of a line that contains a string held in a variable. Say the variable var contains 'Line 1'; then I need the following in the output.
Thanks, but this will not work. The lines are not just 'Line 1'; they contain some more text as well. And I need to delete all lines before the last occurrence.
$ cat /tmp/bob2
Line 1 a
Line 22
Line 33
Line 1 b
Line 22
Line 1 c
Line 4
Line 5
$ XX="Line 1"
$ awk -vRS="$XX(\n| [^\n]+\n)" -vORS="" 'END{print}' /tmp/bob2
Line 4
Line 5
$ XX="Line 2"
$ awk -vRS="$XX(\n| [^\n]+\n)" -vORS="" 'END{print}' /tmp/bob2
Line 1 a
Line 22
Line 33
Line 1 b
Line 22
Line 1 c
Line 4
Line 5
$ XX="Line 22"
$ awk -vRS="$XX(\n| [^\n]+\n)" -vORS="" 'END{print}' /tmp/bob2
Line 1 c
Line 4
Line 5
ksh: cat e
Line 1
Line 22
Line 33
Line 1
Line 22
Line 1
Line 4
Line 5
ksh:
ksh: PAT="Line 1"
ksh: awk 'NR==FNR { if(match($0, PAT)) P=NR ; next } FNR > P' PAT="Line 1" e e
awk: syntax error near line 1
awk: illegal statement near line 1
ksh:
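The syntax errors above are most likely raised by a legacy awk that does not know match(); that is an assumption, since the transcript does not say which awk was run. A minimal sketch of the same two-pass idea, using only the ~ operator and -v assignment and assuming a POSIX awk (nawk, gawk or /usr/xpg4/bin/awk), might look like this. The function name after_last_match is hypothetical.

```shell
# Two-pass sketch: the file is named twice on the awk command line.
# Pass 1 (NR == FNR) remembers the line number of the last match;
# pass 2 prints only the lines that come after that line number.
# after_last_match is a hypothetical helper name, not from the thread.
after_last_match() {
    # $1 = pattern (extended regular expression), $2 = input file
    awk -v pat="$1" '
        NR == FNR { if ($0 ~ pat) last = NR; next }
        FNR > last
    ' "$2" "$2"
}
```

Called as after_last_match "Line 1" e, it reads the file twice. If nothing matches, last stays unset and the whole file is printed.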
Here we start saving lines afresh after each pattern match, overwriting everything saved before the last match, and then print what remains.
#!/usr/bin/env bash
# @(#) s1 Demonstrate delete all previous line before last matching line, awk, gawk.
PATTERN=${1-"Line 1"}
# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C awk
pl " Pattern: \"$PATTERN\""
FILE=data1
N=${FILE//[A-Za-z]/}
E=expected-output$N
pl " Input data file $FILE:"
cat $FILE
pl " Expected output:"
cat $E
pl " Results:"
awk -vPATTERN="$PATTERN" '
BEGIN { i = 0 }
$0 ~ PATTERN { i = 0; next }   # match: discard what was saved so far
{ i++ ; a[i] = $0 }            # save lines seen since the last match
END { for (j = 1; j <= i; j++) { print a[j] } }
' $FILE |
tee f1
pl " Verify results if possible:"
C=$HOME/bin/pass-fail
[ -f $C ] && $C f1 $E || ( pe; pe " Results cannot be verified." ) >&2
exit 0
producing:
$ ./s1
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution : Debian 8.9 (jessie)
bash GNU bash 4.3.30
awk GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p3, GNU MP 6.0.0)
-----
Pattern: "Line 1"
-----
Input data file data1:
Line 1 a
Line 22
Line 33
Line 1 b
Line 22
Line 1 c
Line 4
Line 5
-----
Expected output:
Line 4
Line 5
-----
Results:
Line 4
Line 5
-----
Verify results if possible:
-----
Comparison of 2 created lines with 2 lines of desired results:
Succeeded -- files (computed) f1 and (standard) expected-output1 have same content.
Although this was run in Linux, our Solaris has gawk available, so I would expect similar results:
OS, ker|rel, machine: SunOS, 5.11, i86pc
Distribution : Solaris 11.3 X86
gawk GNU Awk 3.1.8
Exactly this was jgt's solution (see post #2) and it got disregarded, probably without any test. Your solution will perhaps meet the same fate.
Note also, that the data shown is not the real data, the expression is not a real expression and something tells me that everything is different including the clock spinning backwards. Good luck writing scripts to solve unknown requirements on unknown data with unknown restrictions.
Yes, same idea, but using memory rather than files. That is not a big difference for small files; for very large files, awk will probably be faster up to a memory limit, after which the shell solution could win simply by not using too much memory. That would have to be a big file.
The sample suggests not-huge data.
Apologies to jgt for not recognizing the same solution, thanks to bakunin for pointing it out... cheers, drl
jgt's solution has a UUOC (useless use of cat): >outfile on its own truncates a file.
The following should be more efficient, and uses a partial *glob* match.
Not yet tested; I hope that exec works like that in all shells.
var="Line 1"
exec 3>outfile
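while IFS= read -r line   # keep leading blanks and backslashes intact
do
case $line in
*$var*) exec 3>outfile    # match: reopening fd 3 truncates outfile
;;
*) printf '%s\n' "$line" >&3
;;
esac
done <inputfile
exec 3>&-                 # close the descriptor when done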
Req_Line="^Line 1 " # Note the leading caret to anchor to the beginning of the line (if that's what you want) and the trailing space to avoid matching Line 11
# This is used as an Extended Regular Expression, so you can adjust this to suit your needs
IFS=":" read lastline rest < <(grep -En "$Req_Line" filename |tail -1)
((lastline=$lastline+1))
sed -n "$lastline,\$p" filename
For a large file, this has the overhead of perhaps reading the file twice, so you would have to trial it for performance.
It would be sensible to add some error checking, such as what to do if the expression does not match. As it is, this would display the whole file, which might not be what you want.
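One way the error checking could look, as a rough sketch rather than a finished script: the function name print_after_last and the variable pal_last are hypothetical, and the choice to print a message and return 1 on no match is an assumption about what is wanted.

```shell
# Sketch: print the lines after the last line matching an ERE,
# and fail instead of printing the whole file when nothing matches.
# print_after_last and pal_last are hypothetical names.
print_after_last() {
    # $1 = extended regular expression, $2 = input file
    # grep -n prefixes each match with "N:"; keep the last line number.
    pal_last=$(grep -En -e "$1" "$2" | tail -n 1 | cut -d: -f1)
    if [ -z "$pal_last" ]; then
        echo "pattern not found: $1" >&2
        return 1
    fi
    # Print from the line after the last match to the end of the file.
    sed -n "$((pal_last + 1)),\$p" "$2"
}
```

For example, print_after_last "^Line 1 " filename prints only the tail of the file, and the exit status tells the caller whether the pattern was found at all.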
I hope that this helps, or at least gives an alternative.
Robin
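A reverse-reading variant built on tac can be sketched as follows. This is only one possible form, assuming GNU coreutils tac is installed; the function name tail_after_last is hypothetical.

```shell
# Sketch: read the file backwards, stop at the first match seen
# (which is the last match in the file), then restore the order.
# tail_after_last is a hypothetical helper name; assumes GNU tac.
tail_after_last() {
    # $1 = extended regular expression, $2 = input file
    tac "$2" | awk -v pat="$1" '$0 ~ pat { exit } { print }' | tac
}
```

Called as tail_after_last "Line 1" /tmp/bob2, awk exits as soon as it meets the match, so only the end of the file has to be processed. If nothing matches, the whole file is printed.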
This approach may be more efficient than the others for large files when the pattern is found towards the end of the file, as tac opens the file from the end (output of strace):