Manipulating Data

Hi. I haven't had to write bash scripts in a long time and have a simple task to do, but need some help:

Input:

chrY:22627291-22651542
chrY:23045932-23070172
chrY:23684890-23696359
chrY:25318610-25330083
chrY:25451096-25462570
chr10:1054847-1061799
chr10:1058606-1080131
chr10:1075964-1085061

Desired Output:

chrY 22627291 22651542
chrY 23045932 23070172
chrY 23684890 23696359
chrY 25318610 25330083
chrY 25451096 25462570
chr10 1054847 1061799
chr10 1058606 1080131
chr10 1075964 1085061

Also, the input is in an input.txt file, and the output should also be a text file. So basically the colon and the dash each need to be replaced by one space. I was thinking of using "cut -c", but I'm not sure, because the number of characters to be separated in the second column varies. Any help would be great and really appreciated. Thanks!

Hi.

The sed utility is one way:

#!/usr/bin/env bash

# @(#) s1	Demonstrate substitution and comparison.

# Infrastructure details, environment, commands for forum posts. 
set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo ; echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
c=$( ps | grep $$ | awk '{print $NF}' )
version >/dev/null 2>&1 && s=$(_eat $0 $1) || s=""
[ "$c" = "$s" ] && p="$s" || p="$c"
version >/dev/null 2>&1 && version "=o" $p sed cmp
set -o nounset
echo

FILE=${1-data1}

echo " Data file $FILE:"
cat $FILE

echo
echo " Expected output:"
cat expected-output.txt

echo
echo " Results:"
sed "s/[-:]/ /g" $FILE |
tee t1

echo 
echo " Comparison:"
cmp t1 expected-output.txt && echo OK || echo KO

exit 0

producing:

% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0 
GNU bash 3.2.39
GNU sed version 4.1.5
cmp (GNU diffutils) 2.8.1

 Data file data1:
chrY:22627291-22651542
chrY:23045932-23070172
chrY:23684890-23696359
chrY:25318610-25330083
chrY:25451096-25462570
chr10:1054847-1061799
chr10:1058606-1080131
chr10:1075964-1085061

 Expected output:
chrY 22627291 22651542
chrY 23045932 23070172
chrY 23684890 23696359
chrY 25318610 25330083
chrY 25451096 25462570
chr10 1054847 1061799
chr10 1058606 1080131
chr10 1075964 1085061

 Results:
chrY 22627291 22651542
chrY 23045932 23070172
chrY 23684890 23696359
chrY 25318610 25330083
chrY 25451096 25462570
chr10 1054847 1061799
chr10 1058606 1080131
chr10 1075964 1085061

 Comparison:
OK

cheers, drl

tr is another:

tr ':-' '  ' < input.txt > output.txt
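
One detail of the tr approach is easy to miss: the dash sits last in the first set. A minimal sketch, assuming the data arrives on standard input; the function name colon_dash_to_space is invented for illustration and is not part of the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name invented for this sketch): translate every
# ':' and '-' on standard input into a single space.
colon_dash_to_space() {
  # Keep '-' last in the set. Written first ('-:'), the argument would
  # start with a dash and tr would try to parse it as an option.
  tr ':-' '  '
}

# Demo with two of the sample lines from the question.
printf '%s\n' 'chrY:22627291-22651542' 'chr10:1054847-1061799' | colon_dash_to_space
```

With the real files this would be called as colon_dash_to_space < input.txt > output.txt. Because tr maps character to character, the second set needs one space per character in the first set.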

Use the Perl code below:

perl -wnl -e 's/:/ / and s/-/ / and print ; ' input.txt > output.txt

awk -F'[:-]' ' { print $1,$2,$3 } ' file
awk ' gsub("[:-]"," ") ' file
sed "y/:-/  /" file

Shorter code:

perl -wpl -e 's/(-|:)/ /g ;'  input.txt 

awk -F'[:-]' '$1=$1' file
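
The terse form relies on two things: assigning to $1 makes awk rebuild the record with the output separator (a space by default), and the assignment doubles as the pattern, so a line is printed only when $1 is non-empty and non-zero. A more explicit sketch of the same idea follows; the wrapper name split_region is invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (name invented for this sketch): split each line on
# ':' or '-', force awk to rebuild the record, then print unconditionally.
split_region() {
  # The bracket expression is quoted so the shell never treats it as a glob.
  awk -F'[:-]' '{ $1 = $1; print }'
}

# Demo: a sample line, an empty line, and a line whose first field is 0.
printf '%s\n' 'chrY:22627291-22651542' '' '0:1-2' | split_region
```

Here the empty line and the line starting with 0 are printed as well, which the pattern-only form would silently drop. For the sample data in this thread both forms give the same result.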

Hi.

In lua:

#!/usr/bin/env bash

# @(#) s1	Demonstrate string substitution, lua.

# Infrastructure details, environment, commands for forum posts. 
set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo ; echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
c=$( ps | grep $$ | awk '{print $NF}' )
version >/dev/null 2>&1 && s=$(_eat $0 $1) || s=""
[ "$c" = "$s" ] && p="$s" || p="$c"
version >/dev/null 2>&1 && version "=o" $p lua
set -o nounset
echo

FILE=${1-data1}

echo " Data file $FILE:"
cat $FILE

echo
echo " lua program file:"
cat l1

echo
echo " Results:"
./l1 < "$FILE"

exit 0

producing:

% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0 
GNU bash 3.2.39
Lua 5.0.3

 Data file data1:
chrY:22627291-22651542
chrY:23045932-23070172
chrY:23684890-23696359
chrY:25318610-25330083
chrY:25451096-25462570
chr10:1054847-1061799
chr10:1058606-1080131
chr10:1075964-1085061

x:y-z

 lua program file:
#!/usr/bin/env lua

-- @(#) l1	Demonstrate character substitution, lua.

while true do
  local line = io.read()
  if line == nil then break end
  line = string.gsub(line,"[-:]"," ")
  print(line)
end

 Results:
chrY 22627291 22651542
chrY 23045932 23070172
chrY 23684890 23696359
chrY 25318610 25330083
chrY 25451096 25462570
chr10 1054847 1061799
chr10 1058606 1080131
chr10 1075964 1085061

x y z

cheers, drl

Hi.

With shell constructs:

#!/usr/bin/env bash

# @(#) s1	Demonstrate substitution with parameter expansion, bash

# Infrastructure details, environment, commands for forum posts. 
set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo ; echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
c=$( ps | grep $$ | awk '{print $NF}' )
version >/dev/null 2>&1 && s=$(_eat $0 $1) || s=""
[ "$c" = "$s" ] && p="$s" || p="$c"
version >/dev/null 2>&1 && version "=o" $p
set -o nounset
echo

FILE=${1-data1}

echo " Data file $FILE:"
cat $FILE

echo
echo " Results:"
# IFS= (empty) preserves leading/trailing whitespace; -r keeps backslashes literal.
while IFS= read -r line
do
  echo "${line//[-:]/ }"
done <"$FILE" |
tee t1

echo 
echo -n " Comparison is ... "
cmp t1 expected-output.txt && echo OK || echo KO

exit 0

producing:

% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0 
GNU bash 3.2.39

 Data file data1:
chrY:22627291-22651542
chrY:23045932-23070172
chrY:23684890-23696359
chrY:25318610-25330083
chrY:25451096-25462570
chr10:1054847-1061799
chr10:1058606-1080131
chr10:1075964-1085061

x:y-z

 Results:
chrY 22627291 22651542
chrY 23045932 23070172
chrY 23684890 23696359
chrY 25318610 25330083
chrY 25451096 25462570
chr10 1054847 1061799
chr10 1058606 1080131
chr10 1075964 1085061

x y z

 Comparison is ... OK

cheers, drl
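
A related shell-only variant lets read do the splitting, which also answers the worry about variable-width columns: fields are found by delimiter, not by character position. This is a minimal sketch, assuming every line has the form chrom:start-end; the function name regions_to_columns is invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name invented for this sketch): read "chrom:start-end"
# lines from standard input and print the three fields separated by spaces.
regions_to_columns() {
  local chrom start end
  # IFS=':-' applies only to this read; -r keeps backslashes literal.
  while IFS=':-' read -r chrom start end; do
    printf '%s %s %s\n' "$chrom" "$start" "$end"
  done
}

# Demo with two of the sample lines from the question.
printf '%s\n' 'chrY:22627291-22651542' 'chr10:1054847-1061799' | regions_to_columns
```

With the real files this would be run as regions_to_columns < input.txt > output.txt. Having the three values in named variables makes it easy to reorder or validate them, at the cost of being slower than sed, tr, or awk on large files.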