KSH Output/Strip portion of file in HP-UX

austin881 · October 7, 2009, 3:45pm

I have a file, we'll call it file.txt. It has thousands of lines of all kinds of output at any given time (e.g. foo bar foo bar).

I need to copy out just a portion of the file from Point-A to Point-B. I'd like to save off just that portion to a file called test123xyz.txt.

How do I do that?

1. foo
2. bar
3. foo
4. bar
5. <<<BEGIN test123xyz (<--Point-A)
6. foo
7. bar
8. foo
9. bar
~
~
~
~
~
~
991. foo
992. bar
993. foo
994. bar
995. >>>END test123xyz (<--Point-B)
996. foo
997. bar
998. foo

Now the good thing is, there are markers where I need to make the copy happen (i.e. <<<BEGIN test123xyz and >>>END test123xyz).

I'm sure this is a simple grep, sed, awk or something that I can do but I've tried everything I know of with no success. Any ideas?

HP-UX 11.31, korn shell scripting.

steadyonabix · October 7, 2009, 4:15pm

nawk ' ( $0 ~ /<<<BEGIN test123xyz/ ) || ( />>>END test123xyz/ ) { limits = limits" "NR } END { print limits } ' infile | 
xargs -i nawk -v limits={} ' BEGIN{ split( limits, n) } ( NR > n[1] ) && ( NR < n[2] ) ' infile > test123xyz.txt

Scott · October 7, 2009, 4:30pm

C="Point-A"
awk '
/'"$C"'/ { P = 1; next }
/END/ && P { exit }
P
' file1 | tee test123xyz.txt
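
For reference, here is a minimal sketch of the same flag technique run against the marker text from the question. The file name demo.txt and its contents are made up for illustration, and it assumes the markers appear in the file exactly as `<<<BEGIN test123xyz` and `>>>END test123xyz`.

```shell
# Build a small sample file (hypothetical data standing in for file.txt).
printf '%s\n' foo bar '<<<BEGIN test123xyz' one two '>>>END test123xyz' foo > demo.txt

# Print only the lines strictly between the two markers.
awk '
/<<<BEGIN test123xyz/ { P = 1; next }   # start marker: turn printing on, skip the marker line
/>>>END test123xyz/ && P { exit }       # end marker seen while printing: stop reading
P                                       # pattern only: print the line while P is set
' demo.txt > test123xyz.txt
```

Drop the `next` and move the `P` line above the END rule if the marker lines themselves should be kept in the output.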

steadyonabix · October 7, 2009, 5:04pm

Nice

Two questions: -

Why not: -

awk '
 /BEGIN/ { P = 1; next }
 /END/ { exit }
P
 ' infile

instead of '$C'?

and why does the P on its own line make this work?

Scrutinizer · October 7, 2009, 5:19pm

What's wrong with a good ol' shell script?
while read lineno line; do
case $line in
"<<<"*) exec 4>&1 >"${line#* }" ;; # save old handle, redirect output to the file named in the marker
">>>"*) exec >&4 ;; # restore old handle
*) echo "$lineno $line" ;;
esac
done < infile > outfile

It wasn't clear to me if the line numbers are part of your input files. If not, the script becomes:
while read line; do
case $line in
"<<<"*) exec 4>&1 >"${line#* }" ;; # save old handle, redirect output to the file named in the marker
">>>"*) exec >&4 ;; # restore old handle
*) echo "$line" ;;
esac
done < infile > outfile

Scott · October 7, 2009, 5:21pm

Hi.

$C was there to show how a shell variable could be used inside the awk script, for example if it was an argument to the script:

C=${1:-Point-A}

etc.

It could have been done in other ways:

/Point-A/ { ... }

awk '
...
'  C=Point-A

(and then refer to it as C instead of $C)

awk -v C=Point-A '
...
' ...

Matching "BEGIN" isn't enough anyway, as there are many BEGINs, but hopefully only one Point-A.

P makes it work because a pattern with no action gets the default action (print): whenever P is true (i.e. 1) the current line is printed. P is set when $C is matched.
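
A quick way to see the bare-pattern behaviour in isolation is the sketch below. The words "start" and "stop" are made-up markers for this demo only, not the ones from the original file.

```shell
# A rule with a pattern but no action prints the record when the pattern is true.
# P starts out unset (false), becomes 1 at "start" and 0 again at "stop".
result=$(printf '%s\n' a start b c stop d |
  awk '/start/ { P = 1; next } /stop/ { P = 0 } P')
echo "$result"
```

Only the lines read while P is non-zero reach the output; writing `P { print }` would be the long form of the same rule.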

Hoi, Scrutinizer, there is absolutely nothing wrong with the good ol' shell script! AWK is just faster if you have a fair amount of data.

steadyonabix · October 8, 2009, 2:11am

Thanks for the explanation.

It finally dawned on me last night that it must be because P evaluates to true. Nice, compact code, thanks.

I followed the alternatives you gave except this one: -

awk '
...
'  C=Point-A

How does setting C after the awk work?

Scott · October 8, 2009, 5:11am

Hi.

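It works in the same way as setting it with -v C=... except that it is processed after the BEGIN clause: awk handles the assignment when it reaches it in the argument list, just before it opens the next file named after it. If you have a BEGIN clause that needs the value, use the -v option.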
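
The difference can be seen with a small sketch like this one. The file name demo_in.txt and the messages are made up for illustration.

```shell
# One-line input file (hypothetical) so the main rule runs exactly once.
printf 'line\n' > demo_in.txt

# -v: C is assigned before BEGIN runs, so BEGIN already sees the value.
with_v=$(awk -v C=Point-A '
BEGIN { print "BEGIN sees [" C "]" }
      { print "main sees [" C "]" }' demo_in.txt)

# Assignment as an operand: C is set only when awk reaches it in the
# argument list, after BEGIN has run but before demo_in.txt is read.
as_operand=$(awk '
BEGIN { print "BEGIN sees [" C "]" }
      { print "main sees [" C "]" }' C=Point-A demo_in.txt)

echo "$with_v"
echo "$as_operand"
```

In the second form C is still empty inside BEGIN, and it only takes effect for files that come after it on the command line.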