Core dump in a simple shell script!

Hi

I have a very simple shell script that is dumping core -

testScript -

#!/bin/ksh
PROG=${0##*/}   # basename
if [ $# -ne 1 ]; then
        print -u2 "Usage: $PROG filename"
        exit 1
fi
MDY=$(date '+%m%d%y')
if [ -f ${1}.${MDY} ]; then
        cat ${1}.out >>${1}.${MDY}
else
        cp -p ${1}.out ${1}.${MDY}
fi

gdb output -

gdb testScript core.11632
Core was generated by `/bin/ksh /bin/testScript /data/SREF'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004c16e7 in ?? ()

Here we pass another file, /data/SREF, as the only argument.

I have tested the code several times and it works without any issue. So why is it dumping core on the customer's machine?

I read somewhere that when a shell script dumps core, the issue lies in the shell binary itself and not in the script: the shell binary is corrupted, and we should install Linux patches to solve the problem.

Is the above statement true? Why is this simple shell script dumping core?

Thanks

Try a run with some quoting,
like around "${1}.${MDY}".
And add a space and quotes after the redirection: >> "${1}.${MDY}"

Well, not sure if that helps, but it might be worth a try.
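
For illustration, a minimal sketch of the script with every expansion quoted. It is wrapped in a function (archive_out is a made-up name, not part of the original script) and uses printf instead of ksh's print so it also runs under plain sh. Quoting only protects against word splitting and globbing in the file name; whether it avoids the crash is untested.

```shell
#!/bin/sh
# Quoted variant of testScript. archive_out is a hypothetical wrapper name.
archive_out() {
    if [ "$#" -ne 1 ]; then
        printf 'Usage: archive_out filename\n' >&2
        return 1
    fi
    mdy=$(date '+%m%d%y')
    if [ -f "${1}.${mdy}" ]; then
        # Dated file already exists: append the current output to it.
        cat "${1}.out" >> "${1}.${mdy}"
    else
        # No dated file yet: copy, preserving mode and timestamps.
        cp -p "${1}.out" "${1}.${mdy}"
    fi
}
```

Called as archive_out /data/SREF, it behaves like the original, but a name containing spaces or glob characters is passed through as a single word.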

Just recently I experienced a core dump too while running a shell script.
It only occurred when the binary was executed in the background (a subshell of the script); it worked well when called regularly.

hth

Is this running as root? What does a backtrace in gdb reveal (the calling sequence), or bt full for more detail?
Did you try gdb /bin/ksh <corefile>? The core was generated by /bin/ksh, so gdb needs that executable rather than the script; the ?? indicates it can't resolve the address to a symbol (function or library).
At a guess, it's a resource difference (ulimit/environment) between the test environment and the customer's.
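
A rough sketch of both checks, assuming gdb is installed and the core file is in the current directory. env_snapshot is a made-up helper name; the idea is to run it on the test box and on the customer's box and diff the two files. The core file name is the one from the first post.

```shell
#!/bin/sh
# Hypothetical helper: record the limits and environment a script runs with,
# so the test machine and the customer's machine can be compared with diff.
env_snapshot() {
    out=$1
    {
        echo "== ulimit -a =="
        ulimit -a
        echo "== environment =="
        env | sort
    } > "$out"
}

# Backtrace against the shell binary, not the script.
# Skipped silently if gdb or the core file is not present.
core=core.11632
if command -v gdb >/dev/null 2>&1 && [ -f "$core" ]; then
    gdb -batch -ex 'bt full' /bin/ksh "$core"
fi
```

For example, env_snapshot /tmp/env.test on one machine and env_snapshot /tmp/env.customer on the other, then diff the two. Without debug symbols for ksh the backtrace may still show only ??.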

That's a bug in the shell all right. What version and variety of ksh is this?

Hi All

Sorry for the late update to this post. I don't have direct access to the customer's system, and the customer support folks went on vacation.

I edited the code as "sea" suggested and placed it on the customer's machine. Time will tell whether the changes work.

GDB "bt full" does not give anything -

Program terminated with signal 11, Segmentation fault.
#0  0x00000000004c16e7 in ?? ()
(gdb) bt full
#0  0x00000000004c16e7 in ?? ()
No symbol table info available.

Corona688 suggested that it's a bug in ksh and asked for more information; here it is -

$ksh --version
sh (AT&T Research) 93t+ 2010-06-21

OS version -

$uname -a
2.6.18-348.16.1.el5 #1 SMP x86_64 x86_64 x86_64 GNU/Linux 

How can I confirm that this is indeed a bug in ksh?

Many thanks in advance.

Yes, this looks like the shell is at fault, but maybe not because of a bug. I suggest reinstalling ksh first, just to make sure the binary is unaltered.
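
Before reinstalling, the installed binary can be checked against the package database. This is only a sketch and assumes an RPM-based system (the el5 kernel string suggests one); verify_ksh is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: verify the ksh binary against the RPM database.
verify_ksh() {
    bin=${1:-/bin/ksh}
    if ! command -v rpm >/dev/null 2>&1; then
        echo "rpm not available; cannot verify $bin"
        return 0
    fi
    if [ ! -e "$bin" ]; then
        echo "$bin not found"
        return 0
    fi
    # Find the package that owns the binary.
    pkg=$(rpm -qf "$bin") || { echo "$bin is not owned by any package"; return 0; }
    echo "package: $pkg"
    # rpm -V prints nothing when size, checksum, mode, etc. all match.
    if rpm -V "$pkg"; then
        echo "verify: clean"
    else
        echo "verify: differences listed above"
    fi
}

verify_ksh /bin/ksh
```

A clean result means the files match what the package installed, which would point away from a corrupted binary and toward a bug in that ksh build or an environment difference.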

I hope this helps.

bakunin
