How to create a tilde-delimited file in Unix

I have a file in Unix in which the data looks like this:

07 01 abc data entry Z3 data entry ASSISTANT Z3 39
08 01 POD peadiatrist Z4 POD PeDIATRY Z4 67
01 operator specialist 00 operator UNSPECIFIED A0 00
02 OLD NPA SPECIALTY 01 GP GENERAL PRACTICE C5 01

I want the data to be tilde-delimited, like this:
07~01~abc~data entry~Z3~data entry ASSISTANT~Z3~39
~01~~operator specialist~00~operator UNSPECIFIED~A0~00

Can anybody help me write code for this?

sed 's/ /~/g' file

or

awk '{ for(i=1; i<=NF; i++) { if( i==NF ) { printf "%s", $i } else { printf "%s~", $i } } printf "\n" }' file
tr ' ' '~' < file
awk '{ gsub(" ","~"); print }' file

or
In ksh

awk ' { OFS="~" ; $1=$1; print } ' file

or
In bash

while IFS= read -r str
do
   printf '%s\n' "${str// /~}"
done < file

I don't think these produce the desired output.
Not all blank spaces need to be converted to tildes, as shown in the sample output the OP provided.
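
To make the point concrete, here is a quick check, only a sketch, that runs the blanket substitution on the first sample line from the question and prints it next to the output the OP asked for. The variable names are mine.

```shell
# First sample line from the question, as posted.
line='07 01 abc data entry Z3 data entry ASSISTANT Z3 39'

# Blanket replacement: every space becomes a tilde, including the
# spaces inside multi-word fields such as "data entry".
actual=$(printf '%s\n' "$line" | sed 's/ /~/g')

# What the OP asked for (copied from the question).
wanted='07~01~abc~data entry~Z3~data entry ASSISTANT~Z3~39'

printf 'actual: %s\nwanted: %s\n' "$actual" "$wanted"
```

The two lines differ wherever a field contains more than one word, which is why the one-liners above are not enough on their own.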

You are right.

trichyselva,
Can you be more specific about where you want the tildes added?

try this:

awk '{
    # collapse runs of spaces down to a single space
    while ( $0 ~ "  " )
        sub("  ", " ", $0)
    print
}' test | sed 's/ /~/g'

In which fields do you want to avoid the ~ sign?

try this code:

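# Pass 1: squeeze runs of spaces, then turn every space into a tilde.
awk '{
    while ( $0 ~ "  " )
        sub("  ", " ", $0)
    print
}' test | sed "s/ /\~/g" > result1

# Pass 2: keep only the text that sits between a two-character column
# and the final two columns of each line.
sed "s/\(^.*\~..\~.*\)\(\~..\~\)\(.*\)\(\~.....$\)/\3/g" result1 > result2

# Pass 3: the same text with its original spaces restored.
sed 's/~/ /g' result2 > result3

# Put the spaced text back into the matching line of result1.
count=0
while read -r line; do
    count=$((count + 1))
    # command substitution instead of "sed ... | read val", which loses
    # the value in shells that run the pipeline in a subshell
    val=$(sed -n "${count}p" result2)
    res1=$(sed -n "${count}p" result1)
    echo "$res1" | sed "s/$val/$line/g"
done < result3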

Well, forum members, this might be a bit clumsy, but that's the best I could get. If there is a better way, please let me know.

Even if there are multiple spaces between fields, I just want one tilde.

sed 's/  */~/g' file

Note the two spaces between / and *

This will match a single space as well, since
* = zero or more
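
A quick way to convince yourself, as a sketch: `squeeze` below is just a hypothetical wrapper name around the sed command above, fed one line with a single space and one line with a run of spaces.

```shell
# squeeze: hypothetical helper name; wraps the sed command from the post.
# The pattern is <space><space>*, i.e. one space followed by zero or more
# spaces, so it matches any run of one or more spaces.
squeeze() {
    sed 's/  */~/g'
}

printf 'a b\n'    | squeeze   # single space      -> a~b
printf 'a    b\n' | squeeze   # run of four spaces -> a~b
```

Both inputs come out as `a~b`, so the two-space pattern covers the single-space case and the multiple-space case alike.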

I'd want more specifications before I could whip up a good script/program. For example, I see it could be lined up. If everything's fixed width, you could just use a cut command to break each line up into manageable bits and then just output the tildes. Are there always the same number of fields? Are fields 1, 2, 5, 7, and 8 always two-digit fields? Can other fields besides 1 and 3 be blank?

I'd do something like read the entire line into an array. I'd check each of the array elements to see if they match rules (like the two digit fields) and put everything in the right place. The only wrench is that third field and what might be in it.
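
If the file really is fixed width, the idea could look roughly like the sketch below. I've used awk's substr rather than cut so the padding can be trimmed in the same pass. The column widths are made up for the demo (the posted sample has lost its original spacing, so the real widths are unknown), and `fixed_to_tilde` is a hypothetical name.

```shell
# fixed_to_tilde WIDTHS: read fixed-width lines on stdin, write
# tilde-delimited lines. WIDTHS is a space-separated list of column
# widths. Hypothetical helper; real widths must come from the file layout.
fixed_to_tilde() {
    awk -v widths="$1" '
    BEGIN { n = split(widths, w, " ") }
    {
        pos = 1; out = ""
        for (i = 1; i <= n; i++) {
            field = substr($0, pos, w[i])
            pos += w[i]
            # trim the padding around the field, keep inner spaces
            gsub(/^ +| +$/, "", field)
            if (i > 1) out = out "~"
            out = out field
        }
        print out
    }'
}

# Demo with assumed widths 3, 3, 4 and 20. The first and third
# columns of the second line are blank.
printf '%-3s%-3s%-4s%-20s\n' 07 01 abc 'data entry' '' 01 '' 'operator specialist' |
    fixed_to_tilde '3 3 4 20'
# prints:
# 07~01~abc~data entry
# ~01~~operator specialist
```

Blank columns come out as empty fields between tildes, and spaces inside a field survive, which is the shape of output the OP showed. It only works if every line follows the same layout.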

So the basic answer here is: insufficient specifications. Sorry :)

Carl

This was the requirement note from the OP

That is the reason I had

<whitespace><whitespace>*

to match both single and multiple spaces.