sdf
November 23, 2011, 6:21am
1
I have a file with columns of digits and strings, like this:
input.txt
3840 3841 3842 Dav Thun Tax
Cahn 146; Dav.
3855 3853 3861 3862 Dav Thun Tax
2780 Karl VI.,
3873 3872 3872 Dav Thun Tax
3894 3893 3897 3899 Dav Thun Tax
403; Thun 282.
3958 3959 3960 Dav Thun Tax
3972 3972 3972 3975 Dav Thun Tax
Rom. Dav. 145;
4006 4005 4007 Dav Thun Tax
output.txt
3842 Dav Thun Tax
Cahn 146; Dav.
3862 Dav Thun Tax
2780 Karl VI.,
3872 Dav Thun Tax
3899 Dav Thun Tax
403; Thun 282.
3960 Dav Thun Tax
3975 Dav Thun Tax
Rom. Dav. 145;
4007 Dav Thun Tax
On lines that start with several numbers, only the last number should be kept. Can anybody help me with the code?
ygemici
November 23, 2011, 7:05am
2
If you have GNU sed, maybe you can try this:
$ sed '1~2s/.* \([^ ]* [^ ]* [^ ]* [^ ]*\)/\1/'
1 Like
sdf
November 23, 2011, 7:08am
3
Thanks, but I use gawk on Windows, so I can't use sed.
Hi ygemici,
Nice
$ sed '1~2s/.* \([^ ]* [^ ]* [^ ]* [^ ]*\)/\1/'
Can you please explain how it works?
sdf
November 23, 2011, 8:51am
5
OK, I got GNU sed to run, but it erases a lot of strings. So I want to handle part of the file with code and do the rest by hand. I came up with this code:
awk '{if(length($1)==4 && $1=="[0-9]" && length($2)==4 && $2=="[0-9]" && length($3)==4 && $3=="[0-9]" ) print $1,$2,$3}' input.txt to_correct_ouput.txt
The code doesn't work. Can anybody help me correct it?
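For reference, a minimal sketch of what the one-liner above appears to be aiming for. Two things stand out: `$1=="[0-9]"` compares the field with the literal string `[0-9]`, so it is never true for a number (a regex match needs `~`), and the second file name is read as another input file unless it follows a `>` redirection. The file names `sample.txt` and `to_correct.txt` are assumptions for illustration, and the length tests are folded into a four-digit pattern.

```shell
# Sample lines copied from input.txt in the first post.
printf '%s\n' \
  '3840 3841 3842 Dav Thun Tax' \
  'Cahn 146; Dav.' \
  '2780 Karl VI.,' > sample.txt

# "~" performs a regex match; "==" would compare against the literal text "[0-9]".
# Print the first three fields only when each one is exactly four digits.
awk '$1 ~ /^[0-9][0-9][0-9][0-9]$/ &&
     $2 ~ /^[0-9][0-9][0-9][0-9]$/ &&
     $3 ~ /^[0-9][0-9][0-9][0-9]$/ { print $1, $2, $3 }' sample.txt > to_correct.txt

cat to_correct.txt
```

With these three sample lines, only the first one passes all three tests, so `to_correct.txt` contains `3840 3841 3842`.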
ygemici
November 23, 2011, 8:54am
6
You can use sed on Windows:
sed for Windows
CarloM
November 23, 2011, 9:12am
7
1~2s
for line 1 and every 2nd line after that, substitute
/.*
any number of any characters, followed by a space and
\([^ ]* [^ ]* [^ ]* [^ ]*\)
(stored sub-pattern) any number of non-space characters followed by a space, three times over, followed by any number of non-space characters (i.e. the last 4 space-separated fields in the line).
/\1/
replace with the text that matched stored sub-pattern 1
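As a quick check of the breakdown above, the command can be run on a few lines from input.txt. This is only a sketch: it assumes GNU sed (the `first~step` address is a GNU extension), and `sample.txt` and `sed_result.txt` are hypothetical scratch files.

```shell
# Three lines copied from input.txt in the first post.
printf '%s\n' \
  '3840 3841 3842 Dav Thun Tax' \
  'Cahn 146; Dav.' \
  '3855 3853 3861 3862 Dav Thun Tax' > sample.txt

# On lines 1, 3, 5, ...: the greedy ".* " swallows the leading fields,
# and \1 puts back only the last four space-separated fields.
sed '1~2s/.* \([^ ]* [^ ]* [^ ]* [^ ]*\)/\1/' sample.txt > sed_result.txt

cat sed_result.txt
# 3842 Dav Thun Tax
# Cahn 146; Dav.
# 3862 Dav Thun Tax
```

Line 2 is even-numbered, so the address skips it and it passes through unchanged.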
1 Like
Try this...
awk '/^[0-9]/{match($0,"([0-9]*[; ]*[a-zA-Z].*)",a); $0=a[1]}1' input_file
--ahamed
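The three-argument form of `match()` used above is a gawk extension. A rough portable equivalent, assuming only POSIX awk and its standard `RSTART`/`RLENGTH` variables, might look like the sketch below; it is not ahamed's original, and `sample_input.txt` and `awk_result.txt` are hypothetical scratch files.

```shell
# Scratch copy of input.txt from the first post.
cat > sample_input.txt <<'EOF'
3840 3841 3842 Dav Thun Tax
Cahn 146; Dav.
3855 3853 3861 3862 Dav Thun Tax
2780 Karl VI.,
3873 3872 3872 Dav Thun Tax
3894 3893 3897 3899 Dav Thun Tax
403; Thun 282.
3958 3959 3960 Dav Thun Tax
3972 3972 3972 3975 Dav Thun Tax
Rom. Dav. 145;
4006 4005 4007 Dav Thun Tax
EOF

# On lines starting with a digit, find the leftmost spot where
# "digits, optional ';' or spaces, then a letter" matches, and keep
# everything from there to the end of the line. The trailing 1 prints every line.
awk '/^[0-9]/ && match($0, /[0-9]*[; ]*[a-zA-Z].*/) {
       $0 = substr($0, RSTART, RLENGTH)
     } 1' sample_input.txt > awk_result.txt

cat awk_result.txt
```

Because the lines are chosen by content rather than by position, this reproduces output.txt for the sample shown in the first post. A line that starts with a digit but contains no letter is left unchanged, since `match()` returns 0 for it.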
1 Like
ygemici
November 23, 2011, 9:42am
9
The 1~2 address tells sed to work on every 2nd line, starting from line 1,
so it skips lines 2, 4, 6, ... and processes the other lines.
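To see which lines a `1~2` address actually selects, the lines can be printed with `-n` and `p` instead of being substituted. A minimal sketch, assuming GNU sed and a scratch copy of the sample from the first post (`sample_input.txt` and `selected.txt` are hypothetical names):

```shell
# Scratch copy of input.txt from the first post.
cat > sample_input.txt <<'EOF'
3840 3841 3842 Dav Thun Tax
Cahn 146; Dav.
3855 3853 3861 3862 Dav Thun Tax
2780 Karl VI.,
3873 3872 3872 Dav Thun Tax
3894 3893 3897 3899 Dav Thun Tax
403; Thun 282.
3958 3959 3960 Dav Thun Tax
3972 3972 3972 3975 Dav Thun Tax
Rom. Dav. 145;
4006 4005 4007 Dav Thun Tax
EOF

# -n suppresses the default output; 1~2p prints line 1 and every 2nd line after it.
sed -n '1~2p' sample_input.txt > selected.txt

cat selected.txt
```

On this sample the lines starting with 3894 and 3958 are lines 6 and 8, so a position-based address does not select them. Choosing lines by content, for example those that start with a digit, avoids depending on strict alternation.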