I have a data file, a.txt:
I want to reformat the file to look like this:
Basically, the third column should have its leading zeros removed.
My code, a.awk:
awk '{ v=substr($0, 48,10); print substr($0,1,7)"|"substr($0,8,30)"|"gsub("0*","",v)"|"substr($0,59,4)substr($0,64,2)substr($0,67,2)"|"}' a.txt
It always returns '2' in the third column. I'm wondering why, and how I can strip out the leading zeros while awk'ing:
I'm trying to avoid another "while read" loop over the file just to make that substitution.
I would use 'bc' if that is possible.
Thanks.
Gianni
gsub doesn't return the resulting string. It returns the number of substitutions made and updates the variable in place, so run gsub on your variable before the print.
awk '{ v=substr($0, 48,10); gsub("^0*","",v);
print substr($0,1,7)"|"substr($0,8,30)"|"v"|"substr($0,59,4)substr($0,64,2)substr($0,67,2)"|"}' a.txt
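To see the difference between what gsub returns and what it leaves in the variable, here is a minimal sketch. The function name show_gsub is made up for illustration and is not part of the original script; it reads one field value and prints both the return value and the modified variable.

```shell
# Hypothetical helper: show gsub's return value next to the modified variable.
show_gsub() {
  printf '%s\n' "$1" | awk '{
    v = $0
    n = gsub(/^0*/, "", v)   # n = number of substitutions, v is changed in place
    print "returned=" n " value=" v
  }'
}

show_gsub 0000000050
```

The "returned=" part is what ended up in the third column of the original command; the "value=" part is what was actually wanted.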
anbu23
November 17, 2010, 2:37pm
3
$ awk '{ v=substr($0, 48,10); gsub("^0*","",v); print substr($0,1,7)"|"substr($0,8,30)"|" v "|"substr($0,59,4)substr($0,64,2)substr($0,67,2)"|"}' file
1234567|01234567890abcdefghijklmnopqrs|1|ccyymmdd|
1234567|01234567890abcdefghijklmnopqrs|9|ccyymmdd|
1234567|01234567890abcdefghijklmnopqrs|50|ccyymmdd|
$ awk '{ print substr($0,1,7)"|"substr($0,8,30)"|" int(substr($0, 48,10)) "|"substr($0,59,4)substr($0,64,2)substr($0,67,2)"|"}' file
1234567|01234567890abcdefghijklmnopqrs|1|ccyymmdd|
1234567|01234567890abcdefghijklmnopqrs|9|ccyymmdd|
1234567|01234567890abcdefghijklmnopqrs|50|ccyymmdd|
ctsgnb
November 17, 2010, 2:38pm
4
sed 's:0:|0:;s:ccyy-mm-dd0*:|:;s:P:|:;s:-::g;s:[A-Z]*$:|:' a.txt
$ cat a.txt
123456701234567890abcdefghijklmnopqrsccyy-mm-dd0000000001Pccyy-mm-ddABCDEFGH
123456701234567890abcdefghijklmnopqrsccyy-mm-dd0000000009Pccyy-mm-ddABCDEFGH
123456701234567890abcdefghijklmnopqrsccyy-mm-dd0000000050Pccyy-mm-ddABCDEFGH
$ sed 's:0:|0:;s:ccyy-mm-dd0*:|:;s:P:|:;s:-::g;s:[A-Z]*$:|:' a.txt
1234567|01234567890abcdefghijklmnopqrs|1|ccyymmdd|
1234567|01234567890abcdefghijklmnopqrs|9|ccyymmdd|
1234567|01234567890abcdefghijklmnopqrs|50|ccyymmdd|
$
awk '{ v=substr($0, 48,10)+0; print substr($0,1,7),substr($0,8,30),v,substr($0,59,4)substr($0,64,2)substr($0,67,2)OFS}' OFS='|' a.txt
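One difference between the numeric approach (+0 or int()) and the regex approach is how an all-zero field is treated. A small sketch, assuming the field is passed in as a plain string; compare_strip is a made-up name for illustration:

```shell
# Hypothetical helper: compare textual stripping with numeric coercion.
compare_strip() {
  printf '%s\n' "$1" | awk '{
    s = $0; sub(/^0+/, "", s)   # textual strip: an all-zero field becomes empty
    n = $0 + 0                  # numeric coercion: an all-zero field becomes 0
    print s "|" n
  }'
}

compare_strip 0000000050   # prints 50|50
compare_strip 0000000000   # prints |0
```

If an all-zero amount should show up as 0 rather than as an empty column, the numeric version is probably the one you want.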
Wow. Thank you so much everyone!! I like the +0 solution for its simplicity.
-Gianni
sed 's/\(.\{7\}\)\(.\{30\}\).\{10\}0*\(.*\)P\(....\)-\(..\)-\(..\).*/\1|\2|\3|\4\5\6|/' file
GNU sed:
sed -r 's/(.{7})(.{30}).{10}0*(.*)P(....)-(..)-(..).*/\1|\2|\3|\4\5\6|/' file
ctsgnb
November 17, 2010, 3:50pm
8
I intended to avoid the \<n> back-references (\1, \2, ...).
Could you run a timing test and post the results with and without the \<n> notation?
Yes, it is good to avoid grouping, but in this case the fields are selected by position, so I don't think your solution will work with the actual input file: it relies on specific values of certain characters, which will probably be different in the real data. You could combine positional matching with pattern anchoring, for example:
sed 's/.\{7\}/&|/;s/|.\{30\}/&|/;s/....-..-..0*0//;s/P/|/;s/-//g;s/.\{8\}$/|/'
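To illustrate the point about positional matching, here is a sketch that runs the command above on a made-up record. The record is an assumption, not real data from the thread: it keeps the same column layout as a.txt, but the 7-character id contains a 0 and the dates are digits instead of the literal "ccyy-mm-dd". The function name by_position is also just for illustration.

```shell
# Hypothetical record with the same fixed-width layout as a.txt.
rec='103456701234567890abcdefghijklmnopqrs2010-11-170000000050P2010-11-17ABCDEFGH'

# The positional sed command, wrapped in a function that reads stdin.
by_position() {
  sed 's/.\{7\}/&|/;s/|.\{30\}/&|/;s/....-..-..0*0//;s/P/|/;s/-//g;s/.\{8\}$/|/'
}

printf '%s\n' "$rec" | by_position
# prints 1034567|01234567890abcdefghijklmnopqrs|50|20101117|
```

A command that splits on the first 0 or on the literal text ccyy-mm-dd would not handle this record. Note that this version still relies on the literal P, on an 8-character trailing field, and on the amount having at least one leading zero (because of the 0*0).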