How to extract columns from a text file

ihot · May 4, 2008, 2:18am

Hi,
In ksh, I have a file with rows like the following:
Department = 1234 G/L Asset Acct No = 12.0000. 2/29/2008
Department = 1234 G/L Asset Acct No = 13.0000. 3/29/2008

I want to create a new text file that contains only the numbers and date:
1234 12.0000. 2/29/2008
1234 13.0000. 3/29/2008

Should I use "cut" to cut out what I need?
Also, how do I make sure blank lines in the original file do not get copied over to the new file?
Thanks in advance.

swamymns · May 4, 2008, 2:29am

Hi,
try this:

awk '{print $9$10}' test > new_file

where the content of test is:
Department = 1234 G/L Asset Acct No = 12.0000. 2/29/2008
Department = 1234 G/L Asset Acct No = 13.0000. 3/29/2008

tjmannonline · May 4, 2008, 6:06am

You forgot $3.

learnbash · May 4, 2008, 6:16am

To get spaces between the fields, print them with explicit separators like this:

awk '{print $3 " " $9 " " $10}' afile

1234 12.0000. 2/29/2008
1234 13.0000. 3/29/2008

Regards,
Bash
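
On the blank-line question from the first post: awk's NF variable holds the number of fields on the current line and is 0 for an empty or whitespace-only line, so using NF as a pattern skips those lines. A minimal sketch, assuming the single-spaced layout posted at the top of the thread; the variable names input and result are only for illustration:

```shell
# Sample input with a blank line between the two rows (illustrative data).
input='Department = 1234 G/L Asset Acct No = 12.0000. 2/29/2008

Department = 1234 G/L Asset Acct No = 13.0000. 3/29/2008'

# The bare pattern "NF" is true only when the line has at least one field,
# so blank lines never reach the print action.
# "print $3, $9, $10" joins the fields with a single space (the default OFS).
result=$(printf '%s\n' "$input" | awk 'NF { print $3, $9, $10 }')

printf '%s\n' "$result"
# 1234 12.0000. 2/29/2008
# 1234 13.0000. 3/29/2008
```

Redirecting the awk output with "> new_file" instead of capturing it in a variable would give the file asked for in the first post.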

learnbash · May 4, 2008, 6:36am

This can also help.

cut -f 3,9,10 -d ' ' afile

1234 12.0000. 2/29/2008
1234 13.0000. 3/29/2008

Regards,
Bash

ihot · May 4, 2008, 11:05am

Thanks a million, gentlemen... You guys are the greatest. You just helped out a department in a high-tech company in Silicon Valley.

ihot · May 4, 2008, 7:31pm

Oops, as it turns out I only have access to the "cut" command in UnixDos. My text file actually has a lot of spaces, so the rows look like this:
Department = 1234 __________G/L Asset Acct No = 12.0000._____2/29/2008
Department = 1234___________G/L Asset Acct No = 13.0000._____3/29/2008
I had to use underscore instead of spaces above because this website removes extra spaces from the rows above.

I tried to use cut -c14-18 -c37-39 textfile but it would not cut multiple columns.
My final output file should look like this:
1234 12.0000. 2/29/2008
1234 13.0000. 3/29/2008

Is there a way to cut multiple columns? UnixDos does not have "awk".
Thanks again.

danmero · May 4, 2008, 8:48pm

Check @learnbash's cut and awk solutions!

gnom · May 4, 2008, 8:54pm

The cut from learnbash works perfectly:

cut -f 3,9,10 -d ' ' exampldata.txt
1234 12.0000. 2/29/2008
1234 13.0000. 3/29/2008

What is your problem with it?

Cheers
gnom

ihot · May 4, 2008, 10:25pm

Terribly sorry to bug you all again, wizards. There was an additional extraction I needed to perform. If I use the cut -f suggested above, I will get this interim text file:

1234 12.0000. 2/29/2008
1234 13.0000. 3/29/2008

However, I really want to create a text file that looks like this:
1234 12 0000 2/29/2008
1234 13 0000 3/29/2008

so that I can load into my Hyperion database. So if I can do one cut using -c (character/column) at multiple columns, that would give me the final result using only one "cut".

By the way, does anyone know of software similar to UnixDos5.1a that provides "awk" as well as "cut" among its Unix tools? The deficiency of UnixDos5.1a is that it does not have the "awk" utility.

Annihilannic · May 4, 2008, 10:44pm

Cygwin includes all of the GNU utilities.

ihot · May 4, 2008, 11:44pm

Hi again,
the following will not work:
cut -f 3,9,10 -d ' ' exampldata.txt

The reason you think it works is that this website removes spaces from my rows. So let me try to post my rows again, but this time I will use underscores instead of spaces.

My text file actually has a lot of spaces, so the rows look like this:
Department = 1234 __________G/L Asset Acct No = 12.0000._____2/29/2008
Department = 1234___________G/L Asset Acct No = 13.0000._____3/29/2008
(again, I had to use underscore instead of spaces above because this website removes extra spaces from the rows above)

I tried to use cut -c14-18 -c37-39 textfile but "cut" would not cut multiple columns.
My final output file should look like this:
1234 12 0000 2/29/2008
1234 13 0000 3/29/2008

To rephrase my question: Is there a way to cut multiple sections from a row (so that I can paste it into another row in another file)? "cut" does not seem to allow -c of multiple columns.
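
One possible workaround when the padding is a run of spaces, sketched under the assumption that tr is available alongside cut (the thread does not confirm this for UnixDos): squeeze each run of spaces into a single space first, so that the field numbers 3, 9 and 10 line up again. The sample row is rebuilt with printf so the padding survives; the 11-space and 5-space gaps are taken from the underscore counts in the rows above, and the names row and result are only for illustration.

```shell
# Rebuild one padded row: %11s and %5s with empty arguments emit
# 11 and 5 spaces, matching the underscore runs in the posted data.
row=$(printf 'Department = 1234%11sG/L Asset Acct No = 12.0000.%5s2/29/2008' '' '')

# cut -d ' ' treats every single space as a delimiter, so a run of
# spaces produces empty fields. tr -s ' ' collapses each run to one space.
result=$(printf '%s\n' "$row" | tr -s ' ' | cut -d ' ' -f 3,9,10)

printf '%s\n' "$result"
# 1234 12.0000. 2/29/2008
```

This still leaves the dots in place, so it reproduces the interim file rather than the final Hyperion layout.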

Annihilannic · May 5, 2008, 12:53am

How about:

awk -F '[ .]+' '{print $3,$9,$10,$11}'
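
To see why those field numbers work: -F '[ .]+' tells awk to treat any run of spaces and dots as one separator, so the padding disappears and 12.0000. splits into two fields. A small sketch that lists the fields for one padded row; the row is rebuilt with printf from the underscore counts posted earlier, and the names row, fields and result are only for illustration:

```shell
# One padded row: 11 spaces after the department, 5 before the date.
row=$(printf 'Department = 1234%11sG/L Asset Acct No = 12.0000.%5s2/29/2008' '' '')

# Number every field so the positions are visible.
fields=$(printf '%s\n' "$row" |
    awk -F '[ .]+' '{ for (i = 1; i <= NF; i++) print i ": " $i }')
printf '%s\n' "$fields"
# Field 3 is 1234, field 9 is 12, field 10 is 0000, field 11 is the date.

# The command from the post above, applied to the same row.
result=$(printf '%s\n' "$row" | awk -F '[ .]+' '{print $3,$9,$10,$11}')
printf '%s\n' "$result"
# 1234 12 0000 2/29/2008
```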

era · May 5, 2008, 8:34am

cut -c14-17,48-55,61-70 appears to work for the data you posted, but it's not clear whether the data you posted is represented exactly. Please use code tags and post the data exactly as it should be if you still need help.
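
The difference from the earlier attempt is that cut takes a single -c option with a comma-separated list of ranges, rather than -c repeated. A sketch of that command on one row, assuming the padding really is 11 and 5 spaces as the underscore counts suggest; the follow-up sed step that turns the remaining dot into a space is an addition for the Hyperion layout, not part of the command above:

```shell
# One padded row, rebuilt so the character positions are exact.
row=$(printf 'Department = 1234%11sG/L Asset Acct No = 12.0000.%5s2/29/2008' '' '')

# Characters 14-17 are the department, 48-55 are a space plus 12.0000,
# and 61-70 are a space plus the date.
picked=$(printf '%s\n' "$row" | cut -c14-17,48-55,61-70)
printf '%s\n' "$picked"
# 1234 12.0000 2/29/2008

# Replace the one remaining dot with a space.
result=$(printf '%s\n' "$picked" | sed 's/\./ /')
printf '%s\n' "$result"
# 1234 12 0000 2/29/2008
```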

inquirer · May 5, 2008, 10:21am

The real problem is that the number of spaces between the columns is not constant. Without awk you will need a workaround using cut, paste and sed. These three are available in UnixDos.

Instead of thinking of the data as space delimited, note that your sample data is also "=" delimited.

Try this:

cut -f2 -d"=" exampledata.txt | sed 's/ //' | cut -f1 -d" " > output1.txt
cut -f3 -d"=" exampledata.txt | sed 's/ //g' | sed 's/\./ /g' > output2.txt

paste -d" " output1.txt output2.txt > output.txt

hope this helps
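
The same idea can probably be done in one pass with sed alone, which avoids the two intermediate files and also drops blank lines. This is a sketch rather than a tested UnixDos recipe: it assumes the sed there supports basic regular expressions with \( \) groups, and that every data row has the shape "... = digits ... = digits.digits. date". The names input and result are only for illustration.

```shell
# Two padded rows with a blank line between them (illustrative data).
input=$(printf 'Department = 1234%11sG/L Asset Acct No = 12.0000.%5s2/29/2008\n\nDepartment = 1234%11sG/L Asset Acct No = 13.0000.%5s3/29/2008' '' '' '' '')

# First expression: delete empty or space-only lines.
# Second expression: capture the department after the first "=", then
# the two halves of the account number and the date after the last "=",
# and print them separated by single spaces.
result=$(printf '%s\n' "$input" | sed \
    -e '/^ *$/d' \
    -e 's/^[^=]*= *\([0-9]*\).*= *\([0-9]*\)\.\([0-9]*\)\. *\(.*\)$/\1 \2 \3 \4/')

printf '%s\n' "$result"
# 1234 12 0000 2/29/2008
# 1234 13 0000 3/29/2008
```

Because the pattern keys on the "=" signs and the dots rather than on character positions, it does not depend on how much padding each row has.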

ihot · May 5, 2008, 6:40pm

Hi,
cut -c14-17,48-55,61-70 works the way I wanted.
I will try some of the others as well - just to educate myself.
Thanks again wizards. It is great to know that there are champs out there taking their valuable time to help others. This encourages me to do likewise.

inquirer · May 5, 2008, 11:33pm

cut with "-c" is only reliable if your data has a consistent length, so that each value always sits at the same character positions. Otherwise, you will have problems with future data.
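
One rough way to guard against that is to check that every non-blank line has the same length before trusting fixed character positions. The function below is a hypothetical helper, not something from the thread, and equal line lengths are only a heuristic: they do not prove that the columns are aligned. It relies on shell features that ksh provides (functions, read -r, and ${#var}).

```shell
# check_fixed_width: read lines on standard input and succeed only if
# every non-blank line has the same number of characters.
check_fixed_width() {
    expected=""
    while IFS= read -r line || [ -n "$line" ]; do
        # Blank lines are ignored, since they carry no data.
        [ -z "$line" ] && continue
        len=${#line}
        if [ -z "$expected" ]; then
            expected=$len
        elif [ "$len" -ne "$expected" ]; then
            echo "line length $len differs from $expected" >&2
            return 1
        fi
    done
    return 0
}

# Usage: two rows of equal length pass the check.
printf 'Department = 1234\n\nDepartment = 5678\n' |
    check_fixed_width && echo "line lengths are consistent"
```

Running it as "check_fixed_width < textfile" before the cut -c step would flag a file where, for example, a five-digit department number has shifted everything one column to the right.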