searching and editing file contents

jayana · February 5, 2008, 8:39pm

Can you please help me edit parts of a file and write them into a new file?

Suppose I have a huge data dump in a file. I need to search for a tag in it and cut a few lines around that tag. Is there a way to keep track of line numbers and operate on the file?

I can explain with an example.
I have this file, AAA:
-----AAA------
Start:
name:1111
date:222
id:3333
address:12444
end
Start:
name:5555
date:3312
id:6666
address:qwds
end
Start:
name:7777
date:9090
id:4571
address:abc444
end
----------------
The ID is unique for each Start-end block in the above file.
I need to search for an ID, say 4571, and extract just the Start-end block that contains id:4571.
The expected output is:
name:7777
date:9090
id:4571
address:abc444

Can you please suggest how I can get the necessary data from AAA, given an ID as input?

rikxik · February 5, 2008, 9:55pm

$ cat AAA
Start:
name:1111
date:222
id:3333
address:12444
end
Start:
name:5555
date:3312
id:6666
address:qwds
end
Start:
name:7777
date:9090
id:4571
address:abc444
end
$
$
$
$ cat srch.sh
#!/usr/bin/ksh

idval=$1
fn=$2
[ -z "${idval}" ] && exit 1
[ ! -s "${fn}" ] && exit 2
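# line number of the matching id line
ln=`grep -n "^id:${idval}$" "${fn}" | cut -d":" -f1`
# no such id in the file
[ -z "${ln}" ] && exit 3
st=$(( ln - 2 ))
to=$(( ln + 1 ))
# print from 2 lines before the id to 1 line after it
sed -ne "${st},${to}p" "${fn}"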
$
$
$ ./srch.sh 4571 AAA
name:7777
date:9090
id:4571
address:abc444

HTH

mirusnet · February 5, 2008, 9:57pm

grep -A 1 -B 2 'id:4571' file.txt

rikxik · February 5, 2008, 10:00pm

Context options in grep are not universal, e.g. the -A / -B options won't work with Solaris grep.
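
Where those options are missing, something like the following awk wrapper can stand in for grep -B 2 -A 1. This is only a rough sketch: the function name ctx_grep is made up for illustration, it assumes an awk that accepts -v (on older Solaris systems that likely means nawk or /usr/xpg4/bin/awk), and it does not merge overlapping contexts the way GNU grep does.

```shell
# ctx_grep PATTERN FILE
# Print 2 lines before and 1 line after each matching line,
# roughly like GNU "grep -B 2 -A 1" (hypothetical helper, not a standard tool).
ctx_grep() {
    awk -v pat="$1" '
        $0 ~ pat {
            # print the two remembered lines before the match, if they exist
            if (NR > 2) print prev2
            if (NR > 1) print prev1
            print
            after = 1                 # one trailing line still to print
            prev2 = prev1; prev1 = $0
            next
        }
        after > 0 { print; after-- }  # trailing context line
        { prev2 = prev1; prev1 = $0 } # remember the last two lines seen
    ' "$2"
}
```

Usage would be along the lines of: ctx_grep '^id:4571$' AAA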

mirusnet · February 5, 2008, 10:01pm

There is no comment regarding the OS.

rikxik · February 5, 2008, 10:03pm

Well, doesn't hurt to have a "more-compatible" solution!

mirusnet · February 5, 2008, 10:06pm

.

vgersh99 · February 6, 2008, 4:00am

Please read the rules of these forums: Rule 9
And btw: -

Klashxx · February 6, 2008, 4:32am

Hi jayana, try this one:

> ID=4571
> awk '/end/{f=0;next}/id:'"${ID}"'/{f=1}f' file

...umm, actually this is the better one:

> ID=4571
> awk '/Start/{i=0}/id:'"${ID}"'/{f=1}{a[i]=$0;i++}/end/&&f{exit}END{for (j=1;j<i-1;++j)print a[j]}' file

manas_ranjan · February 6, 2008, 7:18am

Dear jayana,

please try the following:

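awk -F":" '
/^Start:/ {
    # new block: forget the previous output file and the buffered lines
    fn = ""
    buf = ""
    next
}
/^end$/ {
    # block finished: close its output file
    if (fn) close(fn)
    fn = ""
    next
}
/^id:/ {
    # the id names the output file; flush the lines seen before the id
    fn = $2 ".txt"
    printf "%s", buf > fn
}
fn {
    print > fn
}
!fn {
    # id not seen yet: keep the line for later
    buf = buf $0 RS
}' FileName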

LAKSHMI_NARAYAN · February 6, 2008, 7:22am

idval=$1
filename=$2
x=`grep -n "^id:${idval}$" "$filename" | cut -d ":" -f1`
echo $x
start=`echo ${x}-2 | bc`
end=`echo ${x}+1 | bc`
echo $start
echo $end
sed -ne "${start},${end}p" "$filename"

jayana · February 6, 2008, 5:08pm

Hi friends, thanks for all your responses. My concern is that the end marker is not always exactly one line after the ID: there may be more information after the ID, and "end" just marks the end of the block.
Once I get the line number for the ID (say line 35) using grep -n, is there a way to search for the first instance of "end" after it (say line 40)? I can take the start as 2 lines before the ID (line 33).
Once I have lines 33 and 40, can I cut the lines in between and write those 8 lines into a new file?
Awk seems a little complex to understand, so I am trying to explore grep solutions further. When I copied and pasted this awk command at the shell prompt:
awk '/Start/{i=0}/id:'"${ID}"'/{f=1}{a[i]=$0;i++}/end/&&f{exit}END{for (j=1;j<i-1;++j)print a[j]}' file
it gave me a syntax error at line 1, which I could not debug.

Thanks for all your support

Klashxx · February 6, 2008, 5:52pm

Try this:

> ID=4571
>  awk '/Start/{i=0}/^id:'"${ID}"'$/{f=1}{a[i]=$0;i++}/end/&&f{exit}END{if (f)for (j=1;j<i-1;++j)print a[j]}' file
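
Since the one-liner is dense, here is the same buffering idea spread out with comments, as a sketch rather than a drop-in replacement. The function name extract_block and the awk variable names are made up for illustration. It assumes that blocks begin with a line starting with Start:, end with a line that is exactly end, and that your awk accepts -v (nawk on Solaris).

```shell
# extract_block ID FILE
# Print the lines between "Start:" and "end" of the block whose id line
# is exactly "id:ID". Works however many lines follow the id.
# (extract_block is a hypothetical helper name.)
extract_block() {
    awk -v id="$1" '
        /^Start:/ { n = 0; found = 0; next }   # new block: reset the buffer
        /^end$/ {                              # block finished
            if (found) {
                for (j = 1; j <= n; j++) print buf[j]
                exit                           # ids are unique, so stop here
            }
            next
        }
        { buf[++n] = $0 }                      # remember every line of the block
        $0 == "id:" id { found = 1 }           # exact match, so 457 will not hit 4571
    ' "$2"
}
```

Something like extract_block 4571 AAA > newfile would then write the block into a new file. Nothing is printed if the id is not found.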

drl · February 6, 2008, 6:10pm

Hi.

If you are going to do this a lot, it may make sense to put some work into it. This kind of task is well-suited to cgrep, context-grep, a utility available from Bell-Labs. Here is an example using some of your data:

#!/usr/bin/env sh

# @(#) s1       Demonstrate cgrep.

#  ____
# /
# |   Infrastructure BEGIN

echo
set -o nounset

debug=":"
debug="echo"

## The shebang using "env" line is designed for portability. For
#  higher security, use:
#
#  #!/bin/sh -

## Use local command version for the commands in this demonstration.

set +o nounset
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) cgrep
set -o nounset

echo

FILE=${1-data1}
echo " Input file $FILE:"
cat $FILE

# |   Infrastructure END
# \
#  ---

echo
echo " Results from processing:"
cgrep -I2 -w "Start:" +I2 +w "end" "id:4571" $FILE

echo
echo " Results from processing:"
cgrep -I2 -w "Start:" +I2 +w "end" "id:4600" $FILE

echo
echo " Results from processing:"
cgrep -I2 -w "Start:" +I2 +w "end" "id:4700" $FILE

exit 0

Producing:

% ./s1

(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu)
cgrep (local) - no version provided.

 Input file data1:
Start:
name:1111
date:222
id:3333
address:12444
end
Start:
name:5555
date:3312
id:6666
address:qwds
end
Start:
name:7777
date:9090
id:4571
address:abc444
end
Start:
id:4600
end
Start:
stuff1
stuff2
stuff3
id:4700
stuff4
stuff5
stuff6
stuff7
stuff8
end

 Results from processing:
========================================
name:7777
date:9090
id:4571
address:abc444

 Results from processing:
========================================
id:4600

 Results from processing:
========================================
stuff1
stuff2
stuff3
id:4700
stuff4
stuff5
stuff6
stuff7
stuff8

The "+-w" indicate the windowing patterns, the "+-I2" cause omission of the window bracket lines.

You need to get and compile the program. It comes with a man page. The web page is the cgrep home page.

It is a very useful (but non-standard) member of the grep family ... cheers, drl

jayana · February 6, 2008, 7:18pm

Thanks for the solution with cgrep.
I tried downloading the tar file from the link you specified, but it gives me a checksum error when I try to untar it, so I could not install cgrep on my system.

drl · February 6, 2008, 7:27pm

Hi.

I just downloaded it again, and tar didn't complain.

You noticed that it was a compressed tar archive, cgrepsrc.tar.gz, yes? ... cheers, drl

jayana · February 7, 2008, 11:47am

Continuing with this problem: using the tips you all gave for grep, I am able to find the line numbers for the start and end. Any hints on how to cut part of a file and write it into another file?

Klashxx · February 7, 2008, 1:19pm

> cat kk
line 1
line 2
line 3
line 4
line 5
line 6
> sed -n "3,5p" kk
line 3
line 4
line 5
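
Putting grep -n and sed -n together, cutting one block into a new file might look roughly like the sketch below. The function name cut_block and its variable names are made up for illustration. It looks for the nearest Start: line before the id instead of assuming a fixed offset, and it assumes the block is closed by a line that is exactly end.

```shell
# cut_block ID INFILE OUTFILE
# Copy the lines between "Start:" and "end" of the block containing
# "id:ID" into OUTFILE. Returns 1 if the block cannot be located.
# (cut_block is a hypothetical helper name.)
cut_block() {
    id=$1 src=$2 dst=$3
    # line number of the id line (first match only)
    idln=$(grep -n "^id:${id}\$" "$src" | head -n 1 | cut -d: -f1)
    [ -n "$idln" ] || return 1
    # last "Start:" line at or before the id line
    st=$(sed -n "1,${idln}p" "$src" | grep -n '^Start:' | tail -n 1 | cut -d: -f1)
    # first "end" line after the id line, counted from the id line
    en=$(sed -n "${idln},\$p" "$src" | grep -n '^end$' | head -n 1 | cut -d: -f1)
    [ -n "$st" ] && [ -n "$en" ] || return 1
    # convert the relative position into an absolute line number
    en=$(( idln + en - 1 ))
    # copy the lines strictly between Start: and end into the new file
    sed -n "$(( st + 1 )),$(( en - 1 ))p" "$src" > "$dst"
}
```

For example, cut_block 4571 AAA block.txt should leave the four lines of that block in block.txt.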

summer_cherry · February 12, 2008, 3:45am

Hi,
I think the one below should work for you; just try it.

echo "input id"
read id
cat filename | paste - - - - - - | sed 's/	/,/g' | nawk -v var="$id" 'BEGIN{FS=","}
{
temp=sprintf("id:%s",var)
if($4==temp)
{
	print $2
	print $3
	print $4
	print $5
}
}'