Delete values between 2 patterns

redse171 · May 23, 2014, 1:42pm

Hi,

How can I delete values between two patterns, as in the example below?

input.txt

192.1.1.2.22      blablabala
23.1.A.1.2        blablabalbl
5.4.1.1.12        blablaba

I need to delete everything from the third "." up to the second column. The output should be:

192.1.1      blablabala
23.1.A       blablabalbl
5.4.1        blablaba

I tried using sed like this, but I get an error:

sed '/\./3,/\t/d' input.txt

I feel it should not be that difficult, but I can't find what is wrong with it. Can anyone help me with my code? Thanks.

bartus11 · May 23, 2014, 1:49pm

Try:

perl -pe 's/(([^.]+\.){2}[^.]+).*?(\s+.*)/$1$3/' input.txt

redse171 · May 23, 2014, 2:33pm

Hi bartus11,

Yes, it worked perfectly. Thanks so much. However, I am trying to learn awk and sed. I don't know Perl, and I just ran your code to get the result, but I need to know how to do it in sed or awk. I used the following commands to get my result. It takes two stages:

1) I changed the third "." to another unique character (#) and saved the result in a temporary file (out1):

sed 's/\./#/3' inputfile.txt > out1

2) Then I deleted the values between the patterns:

sed 's/#.*[\t]/\t/' out1 > out2

where out2 is the final output. I got the right output, but I want a sed or awk one-liner like your Perl one. Thanks.

Akshay_Hegde · May 23, 2014, 2:35pm

$ awk '{split($1,A,/\./);$1=A[1]"."A[2]"."A[3]}1' OFS='\t'  file
192.1.1	blablabala
23.1.A	blablabalbl
5.4.1	blablaba

alister · May 23, 2014, 2:47pm

If the file uses a tab delimiter and there is only one tab per line:

sed 's/\./\t/3; s/\t.*\t/\t/' file

That isn't portable sed, because \t is primarily a GNU extension. However, if it worked for you, and you only care about that platform, it should suffice.
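For a sed that does not understand \t, one possible workaround (a sketch, not something tested in this thread) is to let the shell supply a literal tab through printf. The variable names tab and result are made up for illustration, and the data line is taken from the first post with a single tab assumed between the two columns.

```shell
# printf '\t' emits a real tab character, so sed never has to interpret \t itself.
tab=$(printf '\t')

# Same two substitutions as above, with the literal tab interpolated by the shell:
# 1) turn the third "." into a tab, 2) collapse "tab ... tab" into one tab.
result=$(printf '192.1.1.2.22\tblablabala\n' |
    sed "s/\./$tab/3; s/$tab.*$tab/$tab/")

printf '%s\n' "$result"
```

Like the original, this assumes the input line contains exactly one tab; a tab inside the second column would be swallowed by the greedy .* in the second substitution.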

Regards,
Alister

redse171 · May 23, 2014, 3:06pm

Hi Akshay Hegde,

With your code, I got the right output for the first column. However, I have more information in column 2. As a result, I got this output:

192.1.1	blablabala       kdlfkll    dhdkskdks
23.1.A	blablabalbl      mdkskd   
5.4.1	blablaba         blablaba        blablaba

All the whitespace between the words becomes tab delimited. That is my fault, though, as I did not give you a better sample. Thanks.
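The tab conversion described here follows from how awk treats field assignment: changing any field makes awk rebuild the whole record, joining the fields with OFS. A minimal sketch of the difference, using a made-up line (the words foo and bar and the variable names are illustrative only):

```shell
# Hypothetical sample line: two spaces, then three spaces, between the words.
line='5.4.1.1.12  foo   bar'

# Assigning to $1 rebuilds $0, so every gap becomes OFS (a tab here).
rebuilt=$(printf '%s\n' "$line" | awk -v OFS='\t' '{ $1 = "5.4.1" } 1')

# Editing $0 as a string leaves the original spacing untouched.
kept=$(printf '%s\n' "$line" | awk '{
    # Match the first three dot-separated parts at the start of the line.
    if (match($0, /^[^. \t]*\.[^. \t]*\.[^. \t]*/)) {
        head = substr($0, 1, RLENGTH)
        rest = substr($0, RLENGTH + 1)
        sub(/^[^ \t]*/, "", rest)   # drop the remainder of the first column
        $0 = head rest
    }
    print
}')

printf '%s\n%s\n' "$rebuilt" "$kept"
```

The second form avoids POSIX character classes such as [[:space:]] on purpose, since older mawk releases may not support them.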

protocomm · May 23, 2014, 3:10pm

sed -n 's|\([0-9]*\.[0-9]*\.[A-Z0-9]*\)\(\.[0-9]*\.[0-9]*\)\(.*\)|\1\3|p'

redse171 · May 23, 2014, 3:14pm

Hi Alister,

Yes, the file contains only 2 columns and it is tab delimited. And yes, your code worked great. Thanks.

---------- Post updated at 03:14 PM ---------- Previous update was at 03:12 PM ----------

Hi protocomm,

I tried your code, but unfortunately it didn't give me any output. Thanks.
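The thread never establishes why there was no output. One thing worth checking is that sed -n combined with the p flag prints only the lines where the substitution actually matched, so a line that does not fit the pattern disappears silently. As posted, the command also has no file operand, so it reads standard input. The sketch below illustrates the first point; the lowercase input line is invented for the demonstration and is not from the original data.

```shell
# protocomm's expression from the thread, stored unchanged for reuse.
script='s|\([0-9]*\.[0-9]*\.[A-Z0-9]*\)\(\.[0-9]*\.[0-9]*\)\(.*\)|\1\3|p'

# A line from the posted sample fits the pattern and is printed.
matched=$(printf '23.1.A.1.2\tblablabalbl\n' | sed -n "$script")

# Hypothetical line with a lowercase letter: [A-Z0-9] does not cover it,
# the substitution fails, and -n suppresses the line entirely.
unmatched=$(printf '23.1.a.1.2\tblablabalbl\n' | sed -n "$script")

printf 'matched:   [%s]\nunmatched: [%s]\n' "$matched" "$unmatched"
```

Dropping -n and the p flag would pass unmatched lines through unchanged, which makes this kind of mismatch easier to spot.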

protocomm · May 23, 2014, 3:36pm

My code works on my Mac; I use bash.

Sorry.

Akshay_Hegde · May 23, 2014, 3:47pm

Hi!

If you have GNU awk, you may also try this:

$ awk --re-interval '{match($0,/[^.]+(\.[^.]+){2}/);$0 = substr($0,RSTART,RLENGTH) substr($0,length($1)+1)}1'   file 

192.1.1      blablabala
23.1.A        blablabalbl
5.4.1        blablaba

---------- Post updated at 02:17 AM ---------- Previous update was at 02:11 AM ----------

Otherwise, it can be modified like this:

$ awk '{split($1,A,/\./);$0= A[1]"."A[2]"."A[3] substr($0,length($1)+1)}1' file

redse171 · May 23, 2014, 3:50pm

Hi Akshay Hegde,

I don't have GNU awk. I tried your second command and it worked perfectly. Thanks.

Don_Cragun · May 23, 2014, 4:29pm

If you would tell us what system you're using, we'd have a better chance of giving you working code. There are significant differences between how sed behaves with back references depending on which sed you're using.

Apparently you are not using Mac OS X and you are not using any version of Linux. What OS are you using?

redse171 · May 23, 2014, 5:29pm

I am using Linux (Ubuntu).

alister · May 23, 2014, 5:56pm

Hi, Don. Can you elaborate?

Regards,
Alister

---------- Post updated at 05:56 PM ---------- Previous update was at 05:52 PM ----------

Makes sense. Last I heard, default sed on Ubuntu is GNU sed and default AWK is mawk.

Regards,
Alister

Don_Cragun · May 23, 2014, 6:41pm

I should have said BREs rather than back references, but the times I most often see a sed substitute command fail is when I'm using back references. On the Linux sed man page provided in these forums, you'll find:

I don't currently have access to a Linux system and I can't give a clear statement of what POSIX.2 BRE features are not supported by the Linux sed utility, but I have seen several cases in these forums where standards-conforming sed scripts (such as the one protocomm posted in this thread:

sed -n 's|\([0-9]*\.[0-9]*\.[A-Z0-9]*\)\(\.[0-9]*\.[0-9]*\)\(.*\)|\1\3|p'

or a similar suggestion I would have made until I saw protocomm's suggestion and the statement that it didn't work:

sed 's/^\([[:alnum:]]*[.][[:alnum:]]*[.][[:alnum:]]*\)[.[:alnum:]]*/\1/' input.txt

) fail on the GNU utilities version of sed but work as specified by the standards with an AIX, HP-UX, OS X, or Solaris system sed utility.
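A quick way to check how a particular sed handles the second script is to feed it lines from the first post and compare the result with the requested output. This is only a sketch: the variable names are illustrative, and the six-space column gap is an assumption about the real file.

```shell
# Two lines from the first post; the width of the gap is assumed.
sample='192.1.1.2.22      blablabala
23.1.A.1.2      blablabalbl'

# Don Cragun's expression, unchanged: keep the first three dot-separated
# parts and drop whatever dots and alphanumerics follow them.
result=$(printf '%s\n' "$sample" |
    sed 's/^\([[:alnum:]]*[.][[:alnum:]]*[.][[:alnum:]]*\)[.[:alnum:]]*/\1/')

printf '%s\n' "$result"
```

Because the expression touches only the start of the line, whatever whitespace separates the columns is left as it was.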

protocomm · May 24, 2014, 2:59am

Don Cragun, your line of code works fine on my MacBook.