Preserving file format and spacing in output file

sudeep.id · April 27, 2012, 9:37am

Hi

I have a file with the following structure

"VATTENFALL GLOBAL"                        "Vattenfall Tray"  
"BARCLAYS BANK LONDON"                   "Capula"                      
"P1 AGEAS GLOBAL COMPANY"              "AAC - Optiver"

The requirement is like this

1) Take 2 inputs from the user: first the string to replace, and second the new string.
2) Search for the first string in the file and replace it with the second string.
3) The alignment and spacing between the columns should not be altered if the old string and the new string differ in length.

How can I preserve the alignment of the columns? Any help is welcome.

Thanks,
Sudeep

radoulov · April 27, 2012, 9:46am

What have you tried so far?

panyam · April 27, 2012, 9:47am

 
SCRIPTS>cat input_file
"VATTENFALL GLOBAL" "Vattenfall Tray"
"BARCLAYS BANK LONDON" "Capula"
"P1 AGEAS GLOBAL COMPANY" "AAC - Optiver"

SCRIPTS>v1="Hello man"

SCRIPTS>v2="GLOBAL"

SCRIPTS>sed "s/$v2/$v1/" input_file
"VATTENFALL Hello man" "Vattenfall Tray"
"BARCLAYS BANK LONDON" "Capula"
"P1 AGEAS Hello man COMPANY" "AAC - Optiver"

sudeep.id · April 27, 2012, 9:52am

With the above code the string is replaced, but the alignment is not preserved. What I require is this: suppose column 1 is 45 characters wide and column 2 starts at the 46th character. Even after the string is substituted, column 2 should still start at the 46th character.

Scrutinizer · April 27, 2012, 10:24am

What have you tried so far?

panyam · April 27, 2012, 10:36am

 
What if the new string that replaces column 1 is more than 45 characters long?

Post a sample output on what exactly you are expecting.

sudeep.id · April 27, 2012, 10:41am

I have tried sed and awk but the alignment is getting compromised.

Scrutinizer · April 27, 2012, 10:44am

Can you show an example of the desired output? Can you post the awk script that was not working?

sudeep.id · April 27, 2012, 10:48am

Hi

No, the new string will always be less than 44 characters long. It will not exceed the 45-character limit.

Sample text
_______________

"PROD GLOBAL"                        "BPROD Tray"  
"ABC LONDON"                         "Capla"                      
"ARAS GLOBAL COMPANY"          "AAC - Optiver"

Suppose I replace ARAS GLOBAL COMPANY with ARAS LIMITED; column 2 should still start at the 46th character, not earlier.

Thanks,
Sudeep

---------- Post updated at 08:18 PM ---------- Previous update was at 08:16 PM ----------

I am not able to preserve the alignment in the above post but it is like...
column A (length 45 chars)
column B (length 50 chars) starts from 46th char from left

Scrutinizer · April 27, 2012, 10:49am

Please just show a sample output file.

sudeep.id · April 27, 2012, 10:55am

 
sample.txt
__________
"PROD GLOBAL"                           "BPROD Tray" 
"ABC LONDON"                            "Capla" 
"ARAS GLOBAL COMPANY"                   "AAC - Optiver"
 
> sed 's/PROD GLOBAL/PROD/g' sample.txt> new_sample.txt 
 
new_sample.txt
__________
"PROD"                         "BPROD Tray" 
"ABC LONDON"                            "Capla" 
"ARAS GLOBAL COMPANY"                   "AAC - Optiver"
 
The required o/p file is 
 
new_sample.txt
__________
"PROD"                                  "BPROD Tray" 
"ABC LONDON"                            "Capla" 
"ARAS GLOBAL COMPANY"                   "AAC - Optiver"

Please note that the string to replace and the replacement string are both user input. The replacement string has a variable length, but it is less than 45 characters in all cases.

Scrutinizer · April 27, 2012, 11:06am

There is a difference in format between the input file and the required output file if you look at the line with Capla. Is that line supposed to remain unchanged?
I gather the user input is

string1="PROD GLOBAL"

string2="PROD"

Is that assumption correct?
Does it need to be replaced at every occurrence, or only the first occurrence, or only the last occurrence, for example?

sudeep.id · April 27, 2012, 11:29am

Sorry for typo... yes the assumption is correct. Only the first instance has to be replaced. I wrongly copied the command. It is

sed 's/PROD GLOBAL/PROD/' sample.txt > new_sample.txt

I am not able to retain the proper formatting while posting here. For clarity: the first column starts at character 1 and the second column starts at character 46. The spaces get trimmed while posting, so I inserted spaces manually to give a rough idea of the spacing.

Thanks,
Sudeep

Scrutinizer · April 27, 2012, 12:01pm

Hi, no worries, but it is important that you pay proper attention to the correctness of your post by using preview first. Go over it a couple of times. People are putting in time and effort to help, and it is good if they waste as little time as possible...
You can retain proper formatting by using code tags.

The format of your sample is not 45/50 characters..
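One way to check where the second column really starts is to let awk report the position for every line. This is only a sketch: the function name col2_start is made up for illustration, and it assumes each line begins with a quoted first field and that the columns are padded with spaces, not tabs.

```shell
# Hypothetical helper: print, for each line of the file, the character
# position at which the second column starts.
col2_start() {
    awk '{
        # Position of the closing quote of the first field.
        q = index(substr($0, 2), "\"") + 1
        # Everything after the first field: padding plus second column.
        rest = substr($0, q + 1)
        # RSTART is the offset of the first non-space character in rest.
        match(rest, /[^ ]/)
        print q + RSTART
    }' "$1"
}
```

For a file that follows the 45-character layout, col2_start sample.txt should print 46 for every line.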

sudeep.id · April 27, 2012, 12:19pm

Ok.. will keep that in mind... Is there any solution for my issue?

Scrutinizer · April 27, 2012, 1:02pm

Try this:

$ cat infile
"PROD GLOBAL"                                "BPROD Tray" 
"ABC LONDON"                                 "Capla" 
"ARAS GLOBAL COMPANY"                        "AAC - Optiver"
123456789012345678901234567890123456789012345

awk -F\" 'sub(from,to){sub($2FS$3,sprintf("%-44s",$2FS))}1' from="PROD GLOBAL" to="PROD" infile
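Note that the one-liner treats both the search string and the contents of the first field as regular expressions, so user input containing characters such as . or * may match more than intended. A possible variant that searches literally is sketched below. The function name replace_keep_width and the width variable are assumptions for illustration, and it assumes every line starts with a quoted first field that fits within 45 characters.

```shell
# Hypothetical helper: replace the first literal occurrence of $1 with $2
# inside the first quoted field of each line of file $3, then pad that
# field again so the second column still starts at character 46.
replace_keep_width() {
    awk -v from="$1" -v to="$2" -v width=45 '
    {
        # The first field ends at the second double quote on the line.
        close_q = index(substr($0, 2), "\"") + 1
        col1 = substr($0, 1, close_q)
        rest = substr($0, close_q + 1)
        pos = index(col1, from)              # literal search, no regex
        if (pos > 0) {
            col1 = substr(col1, 1, pos - 1) to substr(col1, pos + length(from))
            sub(/^ +/, "", rest)             # drop the old padding
            $0 = sprintf("%-" width "s%s", col1, rest)
        }
        print
    }' "$3"
}
```

It would be called as replace_keep_width "PROD GLOBAL" "PROD" infile. Lines without a match in the first field pass through unchanged. Because the strings are passed with -v, awk still interprets backslash escapes in them.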

bakunin · April 27, 2012, 1:23pm

First, I think your requirements are not well defined:

Suppose you enter "GLOBAL" as your search string and "x" as replacement. Now suppose you have the following file contents (exact formatting not preserved):

"GLOBAL"                "something"
"BARCLAY GLOBAL"        "something else"
"SOMETHING"             "GLOBAL whatever"

Should only the first line be changed? The first and the second? All three? Depending on your answer, the necessary sed script will look different.

Second, here is how to preserve the formatting of tabular data with spaces with sed:

First identify the lines that have to be changed. See above, it is not clear yet how this should be done. All the other lines can pass unchanged.

Then, identify the lines where the "second field" has to be changed (if this is part of the requirement, i.e. if the third line in the above example has to be changed). These lines are the easiest, because there is no following field whose formatting has to be preserved. They are simply changed by a substitution directive, s/pattern/replacement/, the way you did it already. With these lines you are done now.

Then you tackle the lines which need replacement in the first field. Apply the following procedure to these:

Do your pattern replacement as usual.

Replace the spaces between the fields with exactly 45 spaces (the width of your first column, as I recall).

Do a substitution on the line with the following pattern: the first 45 characters, then any spaces that follow, then all remaining characters beginning at the first non-whitespace. Keep the first and the last part and drop the spaces in between. This effectively cuts out the excess spaces and reformats the line.

You might ask why you have to first delete all the spaces, then enter some and lastly cut some of them out again. This is necessary because the replacement string could be shorter than the original. This way you make sure you have excess spaces in the line, so that the last step always "trims down" the line.

In fact, if you follow the above procedure, you will see that every paragraph I wrote matches exactly one sed statement. So, get your sed man page and start trying. If you have trouble writing the sed script, show what you did and we will gladly help you.
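For illustration, the procedure for lines that need a replacement in the first field could be sketched like this. It is only one possible reading of the steps: the function name replace_sed and its variables are made up, it assumes both fields are enclosed in double quotes, and the two strings are used as a sed pattern and replacement as-is, so characters like /, & or . in them are not handled.

```shell
# Hypothetical helper: $1 = search string, $2 = replacement, $3 = file.
replace_sed() {
    from=$1 to=$2 file=$3
    pad=$(printf '%45s' '')    # a string of exactly 45 spaces
    # Address: only lines where the search string occurs in the first field.
    # 1st s: do the pattern replacement as usual.
    # 2nd s: replace the spaces between the fields with exactly 45 spaces.
    # 3rd s: keep the first 45 characters, drop the spaces after them,
    #        and keep the rest starting at the first non-space character.
    sed -e "/^\"[^\"]*$from/{" \
        -e "s/$from/$to/" \
        -e "s/\"  *\"/\"$pad\"/" \
        -e 's/^\(.\{45\}\) *\(.*\)$/\1\2/' \
        -e '}' "$file"
}
```

Called as replace_sed "PROD GLOBAL" "PROD" sample.txt, this should leave the second column starting at character 46 on the changed line and pass every other line through untouched.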

I hope this helps.

bakunin