Need to undo hyphenation in columns

jongudm · December 15, 2011, 8:21pm

I have a file with two columns (output from Tivoli Storage Manager). Each column is 13 characters wide and the columns are separated by 5 spaces. The columns hold schedule names and node names, and many of the names are longer than 13 characters, so TSM hyphenates them in the output. I want to undo the hyphenation so that I don't lose part of a name when I grep the file later in the process. A small (somewhat fictionalised) sample of the data:

FILE_DAILY_20         THOISEDI01   
FILE_DAILY_20         UMSISHUB01   
FILE_DAILY_20         SKOISWEB03   
FILE_DAILY_20                     APPDK-IDES60-
                                                              SOLS        
FILE_DAILY_20         NYHREKEDI02  
FILE_DAILY_20          LANDSBJ-EXPR-
ESS

This does not display properly but both short lines (SOLS and ESS) are parts of the second column, not the first, and start at character 20.

My approach to this (for the second column) has been to find lines where the 31st character is a hyphen and try to replace the hyphen with the 20th to 30th characters from the next line. I've been trying to do this with sed (which I've never really used before) and what I've got so far looks like this:

sed 's/^\(.\{30\}\)-/\1placeholder/'

with 'placeholder' standing in for 'something that finds the 20th to 30th characters from the next line and puts them where the hyphen is'. How do I do that part? Is this a workable approach at all? If not, then what should I be using instead?

Any help or suggestions would be much appreciated.

balajesuri · December 15, 2011, 11:19pm

perl -ne 's/-\n$/-/g; print' inputfile.txt
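
For reference, the one-liner above deletes the newline after any line that ends in a hyphen, so the continuation line is appended to the hyphenated one. The hyphen itself and the continuation's leading blanks are kept. A rough shell equivalent using awk is sketched below. The function name join_keep_hyphen and the sample rows are made up for illustration, and the sketch assumes there are no trailing blanks after the hyphen.

```shell
#!/bin/sh
# Hypothetical helper: same idea as the Perl one-liner, written with awk.
# Lines ending in "-" are printed without a newline, so the next line is
# appended to them. The hyphen and the continuation's leading blanks stay.
join_keep_hyphen() {
    awk '/-$/ { printf "%s", $0; next } { print }' "$@"
}

# Made-up sample rows in the two-column layout described in the question.
printf '%s\n' \
    'FILE_DAILY_20     THOISEDI01' \
    'FILE_DAILY_20     APPDK-IDES60-' \
    '                  SOLS' |
    join_keep_hyphen
```

With a real file, the call would be: join_keep_hyphen inputfile.txt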

michaelrozar17 · December 16, 2011, 3:14am

With sed:

sed -n '1{h;n};H;${x;s/-\n/-/gp}' inputfile
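
The command above collects the whole file in sed's hold space and rewrites it with one substitution at the last line. Because of the p flag on s, it prints only when at least one substitution was made. A streaming variant is sketched below: it pulls in the next line with N only when the current line ends in a hyphen, and it also drops the hyphen and the continuation's leading blanks. That assumes the trailing hyphen was added by TSM and is not part of the name, which the sample (EXPR- / ESS) suggests but does not prove. It only handles hyphenation in the second column. The function name unhyphenate and the sample rows are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: join hyphenated continuation lines, one record at a time.
#   :a                  label for the loop
#   /-[ ]*$/            current line ends in "-" (optionally followed by blanks)
#   N                   append the next input line to the pattern space
#   s/-[ ]*\n[ ]*//     drop the hyphen, the newline, and the blanks around it
#   ba                  loop, in case the joined line is hyphenated again
# Assumption: the trailing hyphen was inserted by TSM, so it is removed.
unhyphenate() {
    sed -e ':a' -e '/-[ ]*$/{' -e 'N' -e 's/-[ ]*\n[ ]*//' -e 'ba' -e '}' "$@"
}

# Made-up sample rows in the two-column layout described in the question.
printf '%s\n' \
    'FILE_DAILY_20     THOISEDI01' \
    'FILE_DAILY_20     LANDSBJ-EXPR-' \
    '                  ESS' |
    unhyphenate
```

With a real file, the call would be: unhyphenate inputfile > joined.txt (joined.txt being a placeholder name). If some names genuinely end in a hyphen at the column boundary, the substitution would need to keep it instead.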

jongudm · December 16, 2011, 4:38am

Thanks very much, that solves it!