Hi
I have the input file as below
***TEST10067
00567GROSZ 099
00567CTCTSDS90
***TEST20081
08233GROZWEWE
00782GWERW899
***TEST30088
08233GROZWEWE
00782GWERW899
I am finding the lines starting with *** and outputting them as below
TEST10067
TEST20081
TEST30088
I need a space between TEST1 and 0067 and similarly for the other records.
grep ^\* 100109.C|cut -c 4-8,' ',9-12
I tried the above, but I am not getting the desired results.
Is there any way to format the output with a space?
Regards
Dhana
jaduks
May 17, 2008, 12:49am
2
$ cat dh.txt
***TEST10067
00567GROSZ 099
00567CTCTSDS90
***TEST20081
08233GROZWEWE
00782GWERW899
***TEST30088
08233GROZWEWE
00782GWERW899
$ sed -n '/^\*/p' dh.txt
***TEST10067
***TEST20081
***TEST30088
$ sed -n '/^\*/p' dh.txt | sed 's/\*\*\*\([A-Z]*\)\([0-9]*\)/\1 \2/'
TEST 10067
TEST 20081
TEST 30088
//Jadu
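Note that the output above puts the space after TEST rather than after TEST1, because [A-Z]* stops at the first digit. A minimal sketch of the difference follows; the function names are made up for illustration, and the second pattern assumes the name is always letters followed by exactly one digit.

```shell
# As posted: group 1 is letters only, so the space lands before the first digit.
split_letters() {
    sed -n 's/^\*\*\*\([A-Z]*\)\([0-9]*\)/\1 \2/p'
}

# Variant (assumption: the name ends in exactly one digit): group 1 takes that digit too.
split_letters_plus_digit() {
    sed -n 's/^\*\*\*\([A-Z]*[0-9]\)\([0-9]*\)/\1 \2/p'
}

printf '%s\n' '***TEST10067' | split_letters             # TEST 10067
printf '%s\n' '***TEST10067' | split_letters_plus_digit  # TEST1 0067
```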
era
May 17, 2008, 3:50am
3
Might as well condense it into a single sed script.
sed -n '/^*/!d;s/\*\*\*\([A-Z]*\)\([0-9]*\)/\1 \2/p' dh.txt
Hi
I got the output as below
$ sed -n '/^*/!d;s/\*\*\*\([A-Z]*\)\([0-9]*\)/\1 \2/p' filename
TEST1 0067*01
TEST2 0081*02
TEST3 0088*03
I need only
TEST1 0067
TEST2 0081
TEST3 0088
and not the trailing *01, *02 and *03. I am trying it out.
Meanwhile, is there a simpler way of doing it?
Regards
Dhana
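If the header layout really is fixed width, one possible way to drop the trailing characters is to match by position and discard the rest of the line. This is only a sketch: it assumes three asterisks, a five-character name and four digits, the function name is hypothetical, and the sample lines are made up.

```shell
# Hypothetical helper: print "NAME NNNN" for each header line read from stdin.
# Assumed layout: "***", 5 name characters, 4 digits, then anything (discarded).
format_headers() {
    sed -n 's/^\*\*\*\(.\{5\}\)\([0-9]\{4\}\).*/\1 \2/p'
}

printf '%s\n' '***TEST10067*01' '00567GROSZ 099' '***TEST20081*02' | format_headers
```

With the sample lines above this prints TEST1 0067 and TEST2 0081; lines that do not start with *** are skipped because of -n and the p flag.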
rubin
May 18, 2008, 10:01pm
5
awk -F"TEST" '/^\*/{split($2,a,"");print FS a[1], a[2]a[3]a[4]a[5]}' filename
qneill
May 18, 2008, 10:08pm
6
If your input is indeed fixed width, here's a one-liner in ruby:
$ ruby -e 'STDIN.readlines.each { |l| puts "#{l[3..7]} #{l[8..-1]}" if l[0..2] == "***" }' <filename
And the same sort of thing in python:
python -c 'import sys; print("\n".join([ "%s %s" % (l[3:8], l[8:-1]) for l in sys.stdin.readlines() if l[0:3] == "***" ]))' <filename
Or in awk
awk '/^\*\*\*/ { print(substr($1,4,5), substr($1,9,10)); }' <filename
--
Q
Hi
The awk looks good to me, and I tried changing it to:
awk '/^\*\*\*/ { printf "%s %s\n",substr($1,4,5), substr($1,9,4) }' < filename
If my file has the following input:
***BRRAA0067**
TESTSS
sdfasdf
***SIZZ 0081
sdfas
sdfasd
***TYPEE0078 *
dsfas
asdfasdf
I am getting the below output
BRRAA 0067
SIZZ
TYPEE 0078
Note that SIZZ has only four characters followed by a space, and that gives the output shown above. But I need 0081 to come up after SIZZ as well.
Does anyone have an idea what needs to be changed?
Regards
Dhana
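Use $0 instead of $1. $1 is only the first whitespace-separated field, so it stops at the space after SIZZ, while $0 is the whole line.
Try this: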
awk '/^\*\*\*/ { printf "%s %s\n",substr($0,4,5), substr($0,9,4) }' file
Thanks
Penchal
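To see the difference between $1 and $0, here is a small sketch. The two function names are hypothetical, the brackets are only there to make spaces visible, and the sample line is assumed to carry the *** prefix.

```shell
# Hypothetical helpers for comparison; brackets make the spaces visible.
cols_from_field1() {
    awk '{ printf "[%s] [%s]\n", substr($1,4,5), substr($1,9,4) }'
}
cols_from_record() {
    awk '{ printf "[%s] [%s]\n", substr($0,4,5), substr($0,9,4) }'
}

# $1 is "***SIZZ" (7 characters), so columns 9-12 do not exist in it.
printf '%s\n' '***SIZZ 0081' | cols_from_field1   # [SIZZ] []

# $0 is the whole line, so columns 9-12 still hold the digits.
printf '%s\n' '***SIZZ 0081' | cols_from_record   # [SIZZ ] [0081]
```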
sumeet
May 19, 2008, 12:36am
9
$cat sum
TEST10067
TEST20081
TEST30088
$ sed -e 's/00/ 00/g' sum
TEST1 0067
TEST2 0081
TEST3 0088
Hi
This works fine
awk '/^\*\*\*/ { printf "%s %s\n",substr($0,4,5), substr($0,9,4) }' file
Thanks Penchal
Also, the last post, which uses sed, will not help much: it hardcodes the 0s in the search string, so it is not general.
Anyway, thanks for your ideas.
Regards
Dhana
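A quick sketch of why matching on the zeros is not general, next to a position-based alternative. The function names and the extra sample values (TEST41234, TEST51000) are made up for illustration, and the position-based version assumes the name is always five characters wide.

```shell
# Pattern-based: inserts a space before every "00", wherever it occurs.
split_on_zeros() {
    sed 's/00/ 00/g'
}

# Position-based: inserts a space after the fifth character, whatever the digits are.
split_on_width() {
    sed 's/^\(.\{5\}\)/\1 /'
}

printf '%s\n' 'TEST10067' 'TEST41234' 'TEST51000' | split_on_zeros
printf '%s\n' 'TEST10067' 'TEST41234' 'TEST51000' | split_on_width
```

The first command leaves TEST41234 unchanged and turns TEST51000 into TEST51 000, while the second gives TEST1 0067, TEST4 1234 and TEST5 1000.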