This seems like it should be an easy problem, but for some reason I am struggling with the solution.
I simply want to replace all characters after the first 3 characters with another character, preferably with sed.
Thanks in advance.
Like this, but producing the proper number of *'s:
Thanks. Although this works fine for the case I presented, I noticed that there could be an issue if one of the first 3 characters was a * to begin with.
@sea: note that ${line:0:3} is bash syntax (also supported by ksh93 and zsh), so the script should be called with bash rather than sh. On some systems sh is a link to bash, in which case bash runs in POSIX mode (as if started with the --posix option), but on many systems sh is a different shell, and there the script would fail if called this way.
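For a script that has to run under a plain POSIX sh, the substring expansion can be avoided. What follows is only a sketch, and the function name mask_posix is made up for illustration; it relies on the standard ${var#pattern} and ${var%pattern} expansions instead of ${line:0:3}:

```shell
# Hypothetical helper: keep the first 3 characters of $1, mask the rest with '*'.
mask_posix() {
    line=$1
    rest=${line#???}                 # drop the first three characters, if present
    if [ "$rest" = "$line" ]; then   # fewer than three characters: nothing was dropped
        printf '%s\n' "$line"
        return
    fi
    first=${line%"$rest"}            # what precedes $rest, i.e. the first three characters
    masked=$(printf '%s\n' "$rest" | sed 's/./*/g')
    printf '%s%s\n' "$first" "$masked"
}

mask_posix 'A*b*C***Klom'
```

Because the head and the tail are split before any masking happens, a * among the first three characters is left alone.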
To preserve the content of a file, including spaces, while processing it line by line in the shell:
Set IFS to "" locally for the read operation, so that leading and trailing whitespace is kept.
Use -r so that read does not interpret backslashes.
Use printf rather than echo, so that the output does not depend on how a particular version of echo interprets the content of the variable.
Put double quotes around the variable expansion to prevent field splitting and globbing.
So:
while IFS= read -r line
do
printf "%s\n" "$line"
done < file
Since IFS is set only for the read command, it retains its original value afterwards.
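To see what IFS= and -r actually change, here is a small comparison. It is just a sketch; the sample string and the variable names plain and safe are invented for the demonstration:

```shell
sample='  a\tb  '    # leading/trailing spaces plus a literal backslash

# Plain read: IFS whitespace is trimmed and the backslash is consumed.
plain=$(printf '%s\n' "$sample" | { read line; printf '[%s]' "$line"; })

# IFS= read -r: the line arrives unchanged.
safe=$(printf '%s\n' "$sample" | { IFS= read -r line; printf '[%s]' "$line"; })

printf '%s\n%s\n' "$plain" "$safe"
```

The brackets are only there to make the surrounding whitespace visible in the output.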
Unfortunately, the standards say the results are unspecified if you have both a number and a g flag for a sed substitute command. On some systems, you will get what you showed above. On others, you'll get something like:
sed: 1: "s/./*/4g": more than one number or 'g' in substitute flags
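If the number-plus-g flag combination cannot be relied on, a loop that uses only standard sed features is one alternative. This is a sketch rather than a tuned solution, and the wrapper name mask_sed is hypothetical. Each pass replaces the first character after position 3 that is not yet a *, and ta repeats until nothing changes:

```shell
# Hypothetical wrapper around a sed loop that avoids the "4g" flag combination.
mask_sed() {
    # ^\(...\**\)  first three characters, then any stars already written
    # [^*]         the next character that still needs masking
    sed -e :a -e 's/^\(...\**\)[^*]/\1*/' -e ta "$@"
}

printf '%s\n' 'abcdef' 'A*b*C***Klom' 'ab' | mask_sed
```

It does one substitution per masked character, so it is likely to be slow on long lines.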
I did not forget IFS="" to account for leading spaces; I just worked from the OP's post, assuming lines that start with alphabetic characters but allowing for spaces inside any line, and that worked.
Yes, there are spaces at the end of one line.
However, with the extra line added (note that IFS is not saved in this demo):
AMIGA:barrywalker~> IFS=""
AMIGA:barrywalker~> echo 'kajhd(*&&#$%^ASDFGHJ{}][!\\|\00
123131,.,xv.,c.,.?><lksdlfk
barry
A*b*C***Klom
abnm
n n n n n
n n n n n' > /tmp/txt
AMIGA:barrywalker~> while read -r line; do pad="${line:3}"; echo $line; echo -E "${line:0:3}${pad//[' '-~]/*}"; done < /tmp/txt
kajhd(*&&#$%^ASDFGHJ{}][!\\|\00
kaj****************************
123131,.,xv.,c.,.?><lksdlfk
123************************
barry
*******
A*b*C***Klom
A*b*********
abnm
abn*
n n n n n
n n******
n n n n n
n **********
AMIGA:barrywalker~> _
I did a small test and indeed, those one-character loops are expensive.
It turns out the awk solutions appear to be fastest.
$ time sed -e :a -e 's/\(...\)[^*]/\1*/;ta' file > /dev/null
real 0m0.114s
user 0m0.110s
sys 0m0.003s
$ time { sed '
s/^.\{0,3\}/&\
/
' file | sed '
$q
N
P
s/.*\n//
s/./*/g
'|sed '
$q
N
s/\n//
' ;} > /dev/null
real 0m0.015s
user 0m0.015s
sys 0m0.006s
$ time sed 'h;s/\(...\).*/\1/;x;s/^...//;s/./*/g;H;x;s/\n//' file > /dev/null
real 0m0.018s
user 0m0.013s
sys 0m0.002s
$ time perl -pe 's/(?<=...)./*/g' file >/dev/null
real 0m0.030s
user 0m0.022s
sys 0m0.005s
$ time awk '{x=substr($0,N+1); gsub(".","*",x); print substr($0,1,N) x}' N=3 file > /dev/null
real 0m0.010s
user 0m0.007s
sys 0m0.002s
$ time gsed "s/./*/4g" file > /dev/null
real 0m0.022s
user 0m0.018s
sys 0m0.002s
$ time awk -F'^...' '{a=$2; gsub(/./,"*",a); print substr($0,1,3) a}' file > /dev/null
real 0m0.009s
user 0m0.006s
sys 0m0.003s
The '\(...\)' and '\1' feature has been known to be a bit slow relative to simpler choices.
Maybe awk could take the length of the line minus 3 and use substr on a string of "***************" to produce the rest of the line, which might be faster? Or do you need perl/python/ruby for that?
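That idea can be sketched in plain awk, no perl/python/ruby needed. The function name mask_awk_pad and the variable names are made up here, and whether it is actually faster than the gsub versions has not been measured:

```shell
# Hypothetical helper: mask everything after the first n characters by
# cutting the needed number of stars out of a pre-built string.
mask_awk_pad() {
    awk -v n=3 '
    BEGIN { stars = "****************" }
    {
        need = length($0) - n
        if (need <= 0) { print; next }                    # short line: print unchanged
        while (length(stars) < need) stars = stars stars  # grow the pool by doubling
        print substr($0, 1, n) substr(stars, 1, need)
    }' "$@"
}

printf '%s\n' 'abcdef' 'ab' | mask_awk_pad
```

The star string is only extended when a longer line shows up, so most lines cost two substr calls and no regex work.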
+ ~/tmp $ cat ./leolson
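while IFS= read -r line
do
# number of characters to mask: everything after the first 3, or none
len=${#line} ; if [ "$len" -gt 3 ] ; then len=$(( len - 3 )) ; else len=0 ; fi
tmp="$(printf '%*s' "$len" '' | sed 's/ /*/g')"
printf '%s\n' "${line:0:3}${tmp}"
done<file1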
+ ~/tmp $ time bash ./leolson
aba*****
cdc*******
efe******
a*b****
real 0m0.007s
user 0m0.003s
sys 0m0.006s
I did a few runs, and the tendency felt more like 6-11 ms than 12+ ms.
(I got 0.006 and 0.007 more often than everything else together.)
Though with MadeInGermany's 0.010 sec awk code I get 0.001:
time awk '{x=substr($0,N+1); gsub(".","*",x); print substr($0,1,N) x}' N=3 file1 > /dev/null
real 0m0.001s
user 0m0.000s
sys 0m0.001s
And your perl code behaves quite irrationally:
+ ~/tmp $ time perl -pe 's/(?<=...)./*/g' file1 >/dev/null
real 0m0.119s
user 0m0.003s
sys 0m0.002s
+ ~/tmp $ time perl -pe 's/(?<=...)./*/g' file1 >/dev/null
real 0m0.006s
user 0m0.002s
sys 0m0.004s
I find the time difference quite immense and confusing.
Sure, a difference of a few milliseconds can happen, but by a factor of 19.8:1?
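Single runs of a few milliseconds are mostly startup noise, so one outlier may say little about the command itself. One possible explanation (a guess, nothing in this thread confirms it) is that the first run had to load perl from disk, which would fit the high real time next to near-zero user and sys times. Repeating the command many times and timing the whole batch smooths this out. A sketch, with the helper name time_runs invented for the purpose:

```shell
# Hypothetical helper: run a command N times, discarding its output.
time_runs() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" > /dev/null
        i=$((i + 1))
    done
}

# Example (in bash, where time is a keyword and can time a function):
#   time time_runs 100 perl -pe 's/(?<=...)./*/g' file1
```

Dividing the reported real time by the number of runs gives a per-run figure that is less sensitive to a slow first start.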