Hi Hans,
You say that the # characters are unprintable characters, but the output you say you want from your input ($astring = "xxxxxx ABC+10+\x39\x55\x12\x84\xA7\x9F\x2C\xB1\xFF\x12+DEF xxxx") shows that some of these bytes represent printable characters (assuming you're using a codeset with ASCII underpinnings). The hexadecimal escape codes \x39, \x55, and \x2C are the characters '9', 'U', and ',', respectively. This isn't necessarily bad, but none of the scripts presented here so far will work correctly if one of the characters represented by a "#" is a newline character, and those scripts may fail if the "x"s or "#"s contain a sequence that matches the form "ABC+<digits>+". As has already been stated, there is nothing we can do for you in a shell script if any of the bytes represented by a "#" is a null byte ('\x00').
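If you want to check which of those bytes are printable, here is a minimal sketch. It assumes an ASCII-based codeset, and it uses octal escapes because those are the escapes the standard printf utility is required to accept; the variable name is just for illustration.

```shell
# \071, \125, and \054 are the octal forms of \x39, \x55, and \x2C.
printable=$(printf '\071\125\054')
printf '%s\n' "$printable"    # prints: 9U,
```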
As long as you can guarantee that there won't be any null bytes in the string except for the terminating null byte at the end of every string, and can guarantee that exactly one substring of the form "ABC+<digits>+" will appear in each string, the following script does what you have requested:
#!/bin/ksh
### Functions:
# Usage: hexit bytes
# Convert the string ("bytes") into printable hexadecimal escape sequences
# corresponding to the values of the bytes in the string. This function will
# not work correctly if a null byte appears in the string other than as the
# string terminator. It will correctly handle newline characters in the bytes
# operand.
hexit() {
    # -v keeps od from collapsing repeated lines of output into a "*".
    printf "%s" "$1" | od -An -v -tx1 | while read x
    do  set -- $x
        while [ $# -gt 0 ]
        do  printf '\\x%s' "$1"
            shift
        done
    done
}
### Main program:
# Usage: hexstring string...
# This utility will process each string operand, which must be of the form:
#     <front><hex-head><hex-bytes><tail>
# where <front> is any sequence of zero or more printable characters not
#         containing any substring that matches the format specified for
#         <hex-head>.
#     <hex-head> is composed of three parts in sequence:
#             <hex-head-start><count><hex-head-end>
#         where <hex-head-start> is the characters "ABC+",
#             <count> is one or more characters from the current locale's
#                 digit character class that will be interpreted as a
#                 decimal digit string specifying the number of bytes
#                 included in <hex-bytes> (see below), and
#             <hex-head-end> is a "+" character.
#     <hex-bytes> is a string of <count> bytes. These bytes can contain any
#         value except the null byte as long as no substring of these
#         bytes constitutes a string that can be interpreted as a valid
#         <hex-head> string either by itself or when combined with the
#         following <tail>.
#     <tail> is zero or more printable characters not containing any
#         substring that matches the format specified above for
#         <hex-head>.
# When processing is complete, a string will be written to stdout containing
# <front> (unchanged), <hex-head> (unchanged), <hex-bytes> (converted to the
# four-character hexadecimal escape sequence representing each byte in the
# <hex-bytes>), and <tail> (unchanged).
#
# Example: (Assuming this script is invoked by a recent ksh running on a system
# with the ASCII codeset underlying the current locale):
# hexstring $'start ABC+5+a\tb\nc+end'
# would produce the following output:
# start ABC+5+\x61\x09\x62\x0a\x63+end
ec=0    # Exit code (0 unless an error is detected)
while [ $# -gt 0 ]
do
    # Extract the <count> field.
    count=$(expr "$1" : ".*ABC+\([0-9]\{1,\}\)+")
    if [ "$count" = "" ]
    then
        # Report the problem on stderr and go on to the next operand.
        printf "%s: \"ABC+<digits>+\" not found in \"%s\"\n" \
            "$(basename "$0")" "$1" >&2
        shift
        ec=1
        continue
    fi
    # Calculate the offset to the start of <hex-bytes>.
    off=$(expr "$1" : ".*ABC+[0-9]\{1,\}+")
    # Print <front> and <hex-head>
    printf "%s" "${1:0:off}"
    # Print <hex-bytes> as hexadecimal escape sequences.
    hexit "${1:off:count}"
    # And, print <tail>
    printf "%s\n" "${1:off + count}"
    shift
done
exit $ec
I realize this is a long script, but it is mostly comments. Note that some features used in the above script are only available in versions of ksh newer than the November 16, 1988 version, and some of the od utility's options used here weren't defined by the standards until 1992.
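In case the two expr calls and the od options look cryptic, here is a rough sketch of the same stages run on a made-up sample string. Note that it swaps in dd for the ${1:off:count} substring expansion so that it doesn't depend on a recent ksh; that substitution, the sample string, and the variable names are my assumptions for illustration, not part of the script above.

```shell
# Sample: 7 bytes of <front>, the <hex-head> "ABC+3+", 3 data bytes, a <tail>.
s='xxxxxx ABC+3+abc+DEF'

# With \( \) in the pattern, expr prints the text matched by the group,
# which here is the <count> field.
count=$(expr "$s" : '.*ABC+\([0-9]\{1,\}\)+')

# Without a group, expr prints the number of characters matched, which is
# the offset of the first data byte.
off=$(expr "$s" : '.*ABC+[0-9]\{1,\}+')

# Pull out <count> bytes starting at that offset (dd stands in for the
# ksh substring expansion) and dump them with od:
#   -An   no address column
#   -tx1  one two-digit hexadecimal value per byte
#   -v    write every line instead of replacing repeated lines with "*"
hex=$(printf '%s' "$s" | dd bs=1 skip="$off" count="$count" 2>/dev/null |
    od -An -v -tx1 | tr -d ' \n')

printf 'count=%s off=%s hex=%s\n' "$count" "$off" "$hex"
# prints: count=3 off=13 hex=616263
```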
Presumably, you have a source that creates strings containing binary data, so I won't worry about it here. It is easy to create strings like this with $'...' in recent versions of ksh, in a C or C++ program, or with the printf utility and hex escape sequences (but I assume that if you're creating hex escape sequences to generate these strings, you don't need to convert them back to hex).
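If you just need a test string to experiment with, something like the following sketch should do. It uses octal escapes rather than hex ones because the standards only require printf to accept the octal form; the make_sample name is made up for this example.

```shell
# Hypothetical helper: writes the sample string from the script's comments,
# with a tab (\011) and a newline (\012) among the data bytes.
make_sample() {
    printf 'start ABC+5+a\011b\012c+end'
}

# Dump the bytes to confirm what was generated.
make_sample | od -An -v -tx1
```

Piping through od rather than storing the result in a variable first also lets you inspect output from sources that might produce null bytes, which a shell variable can't hold.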