Not able to remove special character ^I in file in Linux system

Shruthi_GM · September 18, 2016, 12:32pm

Hi All,

We are fetching data from a database and writing it to a file on a Linux system.
The spaces between the data in one of the columns are being converted into ^I, which is causing an issue.

Column data.

710060     123CH
723000     CH786
813300     123RC
6HR001     045RU

There should be 5 spaces between the two values ( 710060 123CH ).
This spacing shows up as below when we view the file using the cat -vet filename command.

710060  ^I   123CH
723000^I     CH786
813300     123RC
6HR001   ^I045RU

To remove this special character, I used the command below.

<filename LC_ALL=C sed 's/[^-~]/     /g'> filename


The above command removes the special character ^I, but the gap between the data in the column ends up wider than 5 spaces because ^I occurs at different positions.

Could someone please guide me on how else the special character can be removed while retaining 5 spaces between the data 710060 123CH?

RudiC · September 18, 2016, 1:04pm

As the <TAB> (0x09) character can be typed with the two keys <CTRL> and I, it is sometimes displayed as ^I (read: control-I). It belongs to the [[:blank:]] character class and is frequently used as a field/column separator in DB output when columns need to be aligned regardless of how long the previous one was and where it ended. For this reason it is quite pointless to replace every ^I with a fixed 5 spaces, as a <TAB> can stand for any width between 1 and (usually, by default) 8 characters.

It would be surprising if a DB system mixed <TAB> and space characters for column separation, or used <TAB>s inconsistently in different positions.
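
To illustrate the variable width, here is a small sketch using three lines modelled on the data in the first post. The sample variable and its contents are made up for the demonstration, and expand is assumed to use its default tab stops of every 8 columns.

```shell
# Three sample lines modelled on the data in the first post; \t is a real tab.
sample=$(printf '710060  \t   123CH\n723000\t     CH786\n6HR001   \t045RU')

# Show the tab the way the thread does: cat -vet prints it as ^I
# and marks each line end with $.
printf '%s\n' "$sample" | cat -vet

# expand turns each tab into however many spaces are needed to reach the
# next tab stop, so the same ^I yields a different width on each line.
expanded=$(printf '%s\n' "$sample" | expand)
printf '%s\n' "$expanded"
```

In these three lines the single tab occupies 8, 2 and 7 columns respectively, which is why a fixed-width replacement of the tab alone cannot give a constant gap.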

Try finding the original query/script and modify that.

Shruthi_GM · September 18, 2016, 1:29pm

Hi Rudic,

Thanks for your suggestion.

But on the database side I can see that there is no issue; the columns are separated by the | symbol.
The issue I'm facing is with only one column, where 5 spaces are required between the data 71000 CH678.

Is there any other way to remove this ^I?

Don_Cragun · September 18, 2016, 3:19pm

If you use the following sed command, replacing each occurrence of <space> with a single space character and replacing <tab> with a single tab character:

sed 's/<space>*<tab>[[:space:]]*/<space><space><space><space><space>/g' input_file > new_file

it will replace every sequence of zero or more spaces, followed by a tab character, followed by zero or more whitespace characters (spaces and tabs), in the file named input_file with exactly five space characters in the output file named new_file.

If someone else wants to try this on a Solaris/SunOS system, change sed to /usr/xpg4/bin/sed .
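
For anyone who wants to try the command above without typing a literal tab, here is a sketch that keeps the tab in a shell variable. The tab and workdir variables and the sample data are assumptions added for the demonstration; input_file and new_file are the names used above.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hold a literal tab in a variable so it can be spliced into the sed script.
tab=$(printf '\t')

# Recreate the four lines from the first post, tabs included.
printf '710060  \t   123CH\n723000\t     CH786\n813300     123RC\n6HR001   \t045RU\n' > input_file

# Replace "zero or more spaces, one tab, any trailing whitespace"
# with exactly five spaces. Lines without a tab are left untouched.
sed "s/ *${tab}[[:space:]]*/     /g" input_file > new_file

# Verify: no ^I should remain in the output.
cat -vet new_file
```

Note that this only normalises gaps that contain a tab. A line whose gap is made of spaces alone, like the 813300 line, passes through unchanged, so it must already have the right width.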