Removing \r and \n during reading file through while loop

Pulkit_Lall · August 29, 2014, 2:56am

Hi,
I am writing to a file with the cat command. This file contains file paths including the file name, e.g. /home/user/folder1/folder2/filename.txt
There may be a large number of these paths in the same file, say 140.

When I run the while loop:

while read -r file
do
 //command
done

Here I can see the file paths that have '\n' as EOL, but those that have '\r' are skipped.

I also tried using IFS='\r\n', but this split the file path at the letter 'r'.

I need to display in the below format:

/home/user/folder1/folder2/filename.txt
/home/user/folder1/folder2/filename1.txt
/home/user/folder1/folder2/filename2.txt
/home/user/folder1/folder2/filename3.txt
/home/user/folder1/folder2/filename4.txt
/home/user/folder1/folder2/filename5.txt
/home/user/folder1/folder2/filename6.txt
/home/user/folder1/folder2/filename7.txt

RavinderSingh13 · August 29, 2014, 3:04am

Hello Pulkit,

Please use code tags for any commands and code, as per the forum rules. You can preview your post before clicking the submit button. Could you please state your requirement more clearly, and let us know the input and the expected output?

Thanks,
R. Singh

Pulkit_Lall · August 29, 2014, 3:33am

Hi Ravinder,
I have one executable in two versions, i.e. 1st version - 114, 2nd version - 120.
I am converting files from one format to another, say bmp to pdf, with these 2 versions.
Once a file is converted, an output file in pdf format is created in the folder of each version.

Their paths with filenames are stored in a text file using the cat command. I am now reading this text file with a while loop to get the file paths.
The problem is that the while loop, when reading the text file, accepts only the file paths which have '\n' (New Line) at the end of the line.
Those which have '\r' (Carriage Return) are not accepted and are skipped.

bakunin · August 29, 2014, 3:34am

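You were on the right track, but the shell doesn't work like printf . Inside single quotes the shell does not turn "\r" into a control code: it stays a literal backslash followed by a literal "r". So IFS contained the letter "r" (and the letter "n"), and this is why it split at the letter "r".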
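
To see this for yourself, here is a minimal sketch. The sample path is just a placeholder, and ifs_value, first and rest are names made up for the illustration. It shows that the single-quoted string holds four ordinary characters and that read then splits at the first "r":

```shell
# Inside single quotes nothing is interpreted, so this is 4 characters:
# backslash, r, backslash, n.
ifs_value='\r\n'
printf 'length of IFS value: %s\n' "${#ifs_value}"

# Hypothetical sample path, used only for the demonstration.
path='/home/user/folder1/folder2/filename.txt'

# With that IFS, read treats backslash, "r" and "n" as field separators.
# The assignment only applies to this one read command.
IFS=$ifs_value read -r first rest <<EOF
$path
EOF

printf 'first field: %s\n' "$first"
printf 'remainder:   %s\n' "$rest"
```

The first field ends right before the "r" of "user", which matches the splitting described in the question.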

To use control codes you need to enter them as they are, (single-) quoted like you did, but otherwise "uncooked". Depending on the editor you use there are different ways to accomplish this, here is how it is done in vi :

Enter a CTRL-V in input mode. The next character you enter is then interpreted literally.

Press "<ENTER>", for instance, and a "^M" will appear. Notice that this is one character, not two! You can see that when you switch back to command mode and move the cursor over the character. Pressing "<TAB>" instead will give you a "^I", which is the display equivalent of the TAB character (you will also see these control characters when you set your vi to ":set list"; switch back to normal via ":set nolist").

I hope this helps.

bakunin

/PS: this works on the command line too if you use Korn shell and switch to vi-mode ("set -o vi").

EDIT: you posted while I was writing my answer. There is probably a much easier solution for your problem: your file was perhaps transferred from some DOS/Windows system, and this is what causes your problem.

Either: ftp your file with the "ASCII"-mode set instead of the default binary next time;
or: change the file by running it through one of these "dos2unix"-commands or something similar;
or: do that yourself with a small sed-script. Notice, you will need the above-mentioned method for this, the "^M" is a literally entered "<ENTER>".

sed 's/^M$//' /your/input/file > /some/output/file
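
If entering a literal control character is awkward, one alternative is to let printf produce the carriage return and pass it to sed through a variable. This is only a sketch: cr and strip_cr are names made up here, and the sample input is invented.

```shell
# Let printf generate the carriage return; command substitution only
# strips trailing newlines, so the CR survives in the variable.
cr=$(printf '\r')

# Hypothetical helper: delete one CR at the end of each line of stdin.
strip_cr() {
  sed "s/${cr}\$//"
}

# Invented sample: the first line ends in CR LF, the second in LF only.
printf 'one\r\ntwo\n' | strip_cr
```

To convert a file, the same helper could be used as strip_cr < /your/input/file > /some/output/file.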

RavinderSingh13 · August 29, 2014, 3:38am

Hello Pulkit,

You can use awk's gsub function with \r in it. A very good example is in the following thread; please take a look at it.

http://www.unix.com/shell-programming-and-scripting/179559-awk-remove-carriage-return-65th-field.html
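
A minimal sketch of the gsub idea, not taken from the linked thread. The function name strip_cr_awk and the sample input are assumptions for illustration only:

```shell
# Hypothetical helper: remove every carriage return from each input line.
# gsub with two arguments works on the whole record ($0).
strip_cr_awk() {
  awk '{ gsub(/\r/, ""); print }'
}

# Invented sample with DOS-style (CR LF) line endings.
printf '/home/user/a.txt\r\n/home/user/b.txt\r\n' | strip_cr_awk
```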

Thanks,
R. Singh

Scrutinizer · August 29, 2014, 5:06am

They are not so much skipped; rather, the output gets printed and then overwritten, since the Carriage Return makes the next write start from the beginning of the line...
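
One way to check whether carriage returns are really present is to look at the raw bytes instead of the terminal output. The following is a rough sketch; the temporary file and its contents are invented, and sample, cr and cr_lines are made-up names:

```shell
# Build a small invented sample: one CR LF line and one plain LF line.
sample=$(mktemp)
printf '/home/user/a.txt\r\n/home/user/b.txt\n' > "$sample"

# od -c prints each byte; a carriage return shows up as \r.
od -c "$sample"

# Count the lines that contain a carriage return anywhere.
cr=$(printf '\r')
cr_lines=$(grep -c "$cr" "$sample")
printf 'lines containing CR: %s\n' "$cr_lines"

rm -f "$sample"
```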

There is no need to specify '\n' for IFS (it won't hurt either) since the newlines will be automatically stripped by the read command...

Alternative methods of specifying a CR as IFS:

CR=$(printf '\r')
while IFS=$CR read file; do
  printf "%s\n" "$file"
done < infile

or bash/ksh93/zsh:

while IFS=$'\r' read file; do
  printf "%s\n" "$file"
done  < infile

These methods only change IFS locally for the read command, so IFS retains its old value. The old IFS does not need to be saved first and can remain at the default of $' \t\n'.
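
A small sketch to check that claim, assuming a throwaway sample file. The names infile, ifs_before, ifs_after and count are made up, and -r is added to read so that backslashes are left alone:

```shell
# Invented sample file with DOS-style line endings.
infile=$(mktemp)
printf '/home/user/a.txt\r\n/home/user/b.txt\r\n' > "$infile"

cr=$(printf '\r')
ifs_before=$IFS
count=0

# The IFS assignment is a prefix to read, so it applies to read only.
while IFS=$cr read -r file; do
  count=$((count + 1))
done < "$infile"

ifs_after=$IFS
if [ "$ifs_before" = "$ifs_after" ]; then
  printf 'IFS unchanged after reading %s lines\n' "$count"
fi

rm -f "$infile"
```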

Of course one can always remove it from the file first:

tr -d '\r' < infile |
while read file; do
  printf "%s\n" "$file"
done

or bash/ksh93/zsh:

while read file; do
  printf "%s\n" "$file"
done < <(tr -d '\r' < infile)
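
If some lines in the file end with a lone CR and no newline at all (this is an assumption, since no sample of the input has been posted), deleting the CR would glue two paths together. A possible variation is to turn every CR into a newline and skip the empty lines that result. The file contents below are invented, and result is a made-up name:

```shell
# Invented sample mixing CR LF, lone CR, and plain LF line endings.
infile=$(mktemp)
printf '/a/one.txt\r\n/a/two.txt\r/a/three.txt\n' > "$infile"

result=$(
  # Turn each CR into a newline; CR LF then yields one empty line.
  tr '\r' '\n' < "$infile" |
  while IFS= read -r file || [ -n "$file" ]; do
    # Skip the empty lines produced by CR LF pairs.
    [ -n "$file" ] || continue
    printf '%s\n' "$file"
  done
)

printf '%s\n' "$result"
rm -f "$infile"
```

The extra test after read is there so that a last line without a trailing newline is still processed.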

Pulkit_Lall · August 29, 2014, 6:40am

None of them worked.

Scrutinizer · August 29, 2014, 6:42am

What does not work? What are you getting? What does the input file look like? Can you post a sample?

gull04 · August 29, 2014, 7:02am

Hi,

You could try setting the IFS variable in the script to:

IFS='
'

The above has worked for similar problems for me in the past, particularly when using /bin/ksh.

It would give us a better chance of helping you if you'd provide the samples that Scrutinizer has requested.

Regards

Dave