Formatting file data to another file (control character related)

I have to write a program that reads data from files and writes it, reformatted, into another file. However, I have run into a strange problem related to control characters that I can neither understand nor solve.

Each line of the source file has this format:

T_NAME|P_NAME|P_CODE|DOCUMENT_PATH|REG_DATE

Expected formatted output is:

T_NAME<tab>[T_NAME]
P_NAME<tab>[P_NAME]
P_CODE<tab>[P_CODE]
PATH<tab>[DOCUMENT_PATH]
DATE<tab>[REG_DATE]
<empty line>

Here is my script:

while read line
do
T_NAME=`echo $line| cut -d "|" -f1` 
P_NAME=`echo $line| cut -d "|" -f2` 
P_CODE=`echo $line| cut -d "|" -f3` 
PATH=`echo $line| cut -d "|" -f4` 
DATE=`echo $line| cut -d "|" -f5` 
 
echo "T_NAME\t$T_NAME" >> $target_file
echo "P_NAME\t$P_NAME" >> $target_file
echo "P_CODE\t$P_CODE" >> $target_file
echo "PATH\t$PATH" >> $target_file
echo "DATE\t$DATE" >> $target_file
echo " " >> $target_file
 
done <$source_file

The problem appears when the source file contains a test case like this:

SPARK|CYBER_DEF|XRS001|abcabc\\nababa\\tc|2012-09-09 12:34:54.005

The output file then contains:

T_NAME<tab>SPARK
ababa<tab>c
P_NAME<tab>CYBER_DEF
2012-09-09 12:34:54.005
P_CODE<tab>XRS001
PATH<tab>abcabc
DATE

I know something goes wrong when cutting the fields, but I have no idea how to fix it. Can anybody help with this?
I am using the Korn shell (ksh).

I suspect the problem is the "echo" statement. It is fed the content of the variable "$line" and interprets it, and "\n" and "\t" in your example are escape sequences for control characters: "\n" is a newline, "\t" is a tab.
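
To illustrate the point, here is a minimal sketch, assuming a POSIX-style shell (the variable name "safe" is only for illustration). Unlike the "echo" of ksh, "printf" with a "%s" format copies its argument verbatim, so the backslash sequences stay literal:

```shell
# Field content as it looks after "read" without -r:
# a literal backslash followed by n, and a literal backslash followed by t
line='abcabc\nababa\tc'

# "%s" prints the argument as-is; only the format string is scanned for escapes
safe=$(printf '%s\n' "$line")
printf '%s\n' "$safe"
```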

Apart from that, your implementation is not very efficient, because you use several commands and a pipeline to do what shell expansion could do too, at considerably lower computing cost:

x="abc|def|ghi|jkl"
echo "${x%%|*}"
echo "${x#*|}"
y="${x#*|}"
x="${x#${y}|}"
echo "${x%%|*}"
echo "${x#*|}"

Change your code accordingly and it should not only work but run faster too.
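
As a rough sketch of how the same idea might look as a loop over all fields (the function name "split_fields" and the variable "rest" are hypothetical, not part of the original script), assuming a shell with POSIX parameter expansion:

```shell
# Print each '|'-separated field of $1 on its own line,
# using only parameter expansion (no cut, no pipeline).
split_fields() {
    rest=$1
    while :; do
        printf '%s\n' "${rest%%|*}"       # text before the first '|'
        case $rest in
            *'|'*) rest=${rest#*|} ;;     # drop the first field and its '|'
            *)     break ;;               # no '|' left: that was the last field
        esac
    done
}

split_fields 'abc|def|ghi|jkl'
```

Each pass prints the leading field and then shortens the string, so no external command is started per field.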

An independent observation: DO NOT use backticks any more. They are legacy syntax, and the shell only understands them for backward compatibility. Use the modern "$(...)" instead, which is a lot more flexible, can be nested, can be quoted, ...
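
A small example of the nesting, as a sketch (the path and the variable name "parent" are made up for illustration):

```shell
# The inner $(...) runs first; its result becomes the argument of the outer one.
# With backticks the inner pair would have to be escaped.
parent=$(basename "$(dirname "/tmp/data/infile")")
printf '%s\n' "$parent"
```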

Another hint: if you write for the Korn shell, you might consider opening a file descriptor once instead of repeating the redirection for every command. Instead of:

command1 > "$file"
command2 >> "$file"
command3 >> "$file"
...etc.

you can write:

exec 3> "$file"     # file descriptor 3 to write into $file
# exec 3>> "$file"    # alternatively open it in append mode

print -u3 - "first line"        # -u3 is FD 3
print -u3 - "second line"
print -u3 - "third line"

exec 3>&-                 # close FD 3
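
"print" is a Korn shell builtin. In a shell that lacks it, the same descriptor can presumably be used with "printf" and ">&3"; the following is a minimal sketch under that assumption, with "target_file" as a hypothetical scratch file:

```shell
target_file=$(mktemp)                  # hypothetical scratch file for this demo

exec 3> "$target_file"                 # open FD 3 for writing, once
printf '%s\n' "first line" >&3         # each write reuses the already open descriptor
printf 'P_NAME\t%s\n' "CYBER_DEF" >&3  # \t in the format string becomes a real tab
exec 3>&-                              # close FD 3

cat "$target_file"
```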

I hope this helps.

bakunin


Uhm.
What's wrong with:

lem@biggy:/tmp$ while IFS="|" read -r T_NAME P_NAME P_CODE _PATH _DATE; do
cat <<END >>outfile
T_NAME  $T_NAME
P_NAME  $P_NAME
P_CODE  $P_CODE
PATH    $_PATH 
DATE    $_DATE

END
    done <infile
lem@biggy:/tmp$ cat outfile
T_NAME  SPARK
P_NAME  CYBER_DEF
P_CODE  XRS001
PATH    abcabc\\nababa\\tc
DATE    2012-09-09 12:34:54.005

lem@biggy:/tmp$
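
If real tab characters are wanted between label and value, as in the expected output, the same loop could be written with "printf". This is only a sketch: the scratch files and the lowercase variable names are assumptions for the demo. Note that the names avoid the shell's own PATH variable, which the original script overwrote, so that "cut" could no longer be found on the following line:

```shell
source_file=$(mktemp)      # hypothetical scratch files for this demo
target_file=$(mktemp)
printf '%s\n' 'SPARK|CYBER_DEF|XRS001|abcabc\\nababa\\tc|2012-09-09 12:34:54.005' > "$source_file"

# IFS='|' applies to this read only; -r keeps backslashes exactly as they are in the file
while IFS='|' read -r t_name p_name p_code doc_path reg_date; do
    printf 'T_NAME\t%s\n' "$t_name"
    printf 'P_NAME\t%s\n' "$p_name"
    printf 'P_CODE\t%s\n' "$p_code"
    printf 'PATH\t%s\n'   "$doc_path"
    printf 'DATE\t%s\n\n' "$reg_date"     # the extra \n gives the empty separator line
done < "$source_file" > "$target_file"

cat "$target_file"
```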

Or, depending on your preference, without "-r", so that "read" itself treats the backslash as an escape character:

lem@biggy:/tmp$ while IFS="|" read T_NAME P_NAME P_CODE _PATH _DATE; do
cat <<END >>outfile
T_NAME  $T_NAME
P_NAME  $P_NAME
P_CODE  $P_CODE
PATH    $_PATH
DATE    $_DATE

END
     done <infile
lem@biggy:/tmp$ cat outfile
T_NAME  SPARK
P_NAME  CYBER_DEF
P_CODE  XRS001
PATH    abcabc\nababa\tc
DATE    2012-09-09 12:34:54.005

lem@biggy:/tmp$
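
The only difference between the two sessions is the "-r" option of "read". A minimal sketch of its effect, with "infile", "raw" and "cooked" as hypothetical names:

```shell
infile=$(mktemp)                              # hypothetical scratch file
printf '%s\n' 'abcabc\\nababa\\tc' > "$infile"   # file holds doubled backslashes

IFS= read -r raw    < "$infile"   # -r: backslashes are kept as they are
IFS= read    cooked < "$infile"   # no -r: each \\ pair collapses to a single \

printf '%s\n' "$raw" "$cooked"
rm -f "$infile"
```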

--
Bye

x="abc|def|ghi|jkl"
echo "${x%%|*}"
echo "${x#*|}"
y="${x#*|}"
x="${x#${y}|}"
echo "${x%%|*}"
echo "${x#*|}"

The above code gives this output:


abc|def|ghi|jkl

abc|def|ghi|jkl

I don't know whether the code is incorrect or my working environment does not support shell expansion.
Anyway, using file descriptors instead of redirection is new to me.

Thanks to Lem, the second variant is what I want.
Thanks to bakunin and Lem for the quick replies.