Why does this example C code run and yet SHOULD either not compile or give a segmentation fault?

Apologies for any typos...

Well guys, been researching 'goto' in C and they say that you can't 'goto' labels in another function as a segmentation fault will occur.
However I have found a way to 'goto' a label in another function that is NOT main() using the asm() function.
As you know I love doing things with languages that they were not designed to do.

This works on gcc 2.95.3, gcc 4.2.1 and gcc 7.3.0. AMIGA OS 3.0.x using ADE, OSX 10.14.3 and Linux Mint 19.

Although I am aware of what is happening, what I don't understand is why gcc and/or its assembler ('as'?), up to at least version 7.3.0, does not give a warning or an error and refuse to compile this.
I don't have the current gcc, which I think is version 8.2.0, so this might have been caught by now.

#include <stdio.h>

/* NO segmentation fault? */

void test1()
{
    /* This never sees 'return' nor the second 'nop'. */
    asm(
        "nop;"
        "jmp    jump_test2;"
        "nop;"
    );
    printf("This will never be seen!\n");
    return;
}

void test2()
{
    /* The 'jump_test2:' label sits in here. */
    asm(
        "nop;"
        "jump_test2:"
        "nop;"
    );
    printf("This will be printed.\n");
    return;
}

int main()
{
    test1();
    printf("Hello World!\n");
    return(0);
}

Results on OSX 10.14.3, default bash terminal, gcc 4.2.1.

Last login: Wed Mar 27 20:28:14 on ttys000
AMIGA:amiga~> cd Desktop/Code/C
AMIGA:amiga~/Desktop/Code/C> gcc cross_function_jump.c
AMIGA:amiga~/Desktop/Code/C> ./a.out
This will be printed.
Hello World!
AMIGA:amiga~/Desktop/Code/C> hexdump -C a.out
00000000  cf fa ed fe 07 00 00 01  03 00 00 80 02 00 00 00  |................|
00000010  0f 00 00 00 c0 04 00 00  85 00 20 00 00 00 00 00  |.......... .....|
00000020  19 00 00 00 48 00 00 00  5f 5f 50 41 47 45 5a 45  |....H...__PAGEZE|
00000030  52 4f 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |RO..............|
00000040  00 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 00 00 00 00 00 00  19 00 00 00 d8 01 00 00  |................|
00000070  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
00000080  00 00 00 00 01 00 00 00  00 10 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
000000a0  07 00 00 00 05 00 00 00  05 00 00 00 00 00 00 00  |................|
000000b0  5f 5f 74 65 78 74 00 00  00 00 00 00 00 00 00 00  |__text..........|
000000c0  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
000000d0  b0 0e 00 00 01 00 00 00  8f 00 00 00 00 00 00 00  |................|
000000e0  b0 0e 00 00 04 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 04 00 80 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000100  5f 5f 73 74 75 62 73 00  00 00 00 00 00 00 00 00  |__stubs.........|
00000110  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
00000120  40 0f 00 00 01 00 00 00  06 00 00 00 00 00 00 00  |@...............|
00000130  40 0f 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |@...............|
00000140  08 04 00 80 00 00 00 00  06 00 00 00 00 00 00 00  |................|
00000150  5f 5f 73 74 75 62 5f 68  65 6c 70 65 72 00 00 00  |__stub_helper...|
00000160  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
00000170  48 0f 00 00 01 00 00 00  1a 00 00 00 00 00 00 00  |H...............|
00000180  48 0f 00 00 02 00 00 00  00 00 00 00 00 00 00 00  |H...............|
00000190  00 04 00 80 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001a0  5f 5f 63 73 74 72 69 6e  67 00 00 00 00 00 00 00  |__cstring.......|
000001b0  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
000001c0  62 0f 00 00 01 00 00 00  3f 00 00 00 00 00 00 00  |b.......?.......|
000001d0  62 0f 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |b...............|
000001e0  02 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001f0  5f 5f 75 6e 77 69 6e 64  5f 69 6e 66 6f 00 00 00  |__unwind_info...|
00000200  5f 5f 54 45 58 54 00 00  00 00 00 00 00 00 00 00  |__TEXT..........|
00000210  a4 0f 00 00 01 00 00 00  54 00 00 00 00 00 00 00  |........T.......|
00000220  a4 0f 00 00 02 00 00 00  00 00 00 00 00 00 00 00  |................|
00000230  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000240  19 00 00 00 e8 00 00 00  5f 5f 44 41 54 41 00 00  |........__DATA..|
00000250  00 00 00 00 00 00 00 00  00 10 00 00 01 00 00 00  |................|
00000260  00 10 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
00000270  00 10 00 00 00 00 00 00  07 00 00 00 03 00 00 00  |................|
00000280  02 00 00 00 00 00 00 00  5f 5f 6e 6c 5f 73 79 6d  |........__nl_sym|
00000290  62 6f 6c 5f 70 74 72 00  5f 5f 44 41 54 41 00 00  |bol_ptr.__DATA..|
000002a0  00 00 00 00 00 00 00 00  00 10 00 00 01 00 00 00  |................|
000002b0  10 00 00 00 00 00 00 00  00 10 00 00 03 00 00 00  |................|
000002c0  00 00 00 00 00 00 00 00  06 00 00 00 01 00 00 00  |................|
000002d0  00 00 00 00 00 00 00 00  5f 5f 6c 61 5f 73 79 6d  |........__la_sym|
000002e0  62 6f 6c 5f 70 74 72 00  5f 5f 44 41 54 41 00 00  |bol_ptr.__DATA..|
000002f0  00 00 00 00 00 00 00 00  10 10 00 00 01 00 00 00  |................|
00000300  08 00 00 00 00 00 00 00  10 10 00 00 03 00 00 00  |................|
00000310  00 00 00 00 00 00 00 00  07 00 00 00 03 00 00 00  |................|
00000320  00 00 00 00 00 00 00 00  19 00 00 00 48 00 00 00  |............H...|
00000330  5f 5f 4c 49 4e 4b 45 44  49 54 00 00 00 00 00 00  |__LINKEDIT......|
00000340  00 20 00 00 01 00 00 00  00 10 00 00 00 00 00 00  |. ..............|
00000350  00 20 00 00 00 00 00 00  50 01 00 00 00 00 00 00  |. ......P.......|
00000360  07 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
00000370  22 00 00 80 30 00 00 00  00 20 00 00 08 00 00 00  |"...0.... ......|
00000380  08 20 00 00 18 00 00 00  00 00 00 00 00 00 00 00  |. ..............|
00000390  20 20 00 00 10 00 00 00  30 20 00 00 48 00 00 00  |  ......0 ..H...|
000003a0  02 00 00 00 18 00 00 00  80 20 00 00 07 00 00 00  |......... ......|
000003b0  00 21 00 00 50 00 00 00  0b 00 00 00 50 00 00 00  |.!..P.......P...|
000003c0  00 00 00 00 01 00 00 00  01 00 00 00 04 00 00 00  |................|
000003d0  05 00 00 00 02 00 00 00  00 00 00 00 00 00 00 00  |................|
000003e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000003f0  f0 20 00 00 04 00 00 00  00 00 00 00 00 00 00 00  |. ..............|
00000400  00 00 00 00 00 00 00 00  0e 00 00 00 20 00 00 00  |............ ...|
00000410  0c 00 00 00 2f 75 73 72  2f 6c 69 62 2f 64 79 6c  |..../usr/lib/dyl|
00000420  64 00 00 00 00 00 00 00  1b 00 00 00 18 00 00 00  |d...............|
00000430  4f 70 c6 4a 81 dc 38 76  93 62 3b d6 09 bd 94 37  |Op.J..8v.b;....7|
00000440  32 00 00 00 20 00 00 00  01 00 00 00 00 0e 0a 00  |2... ...........|
00000450  00 0e 0a 00 01 00 00 00  03 00 00 00 00 0c 99 01  |................|
00000460  2a 00 00 00 10 00 00 00  00 00 00 00 00 00 00 00  |*...............|
00000470  28 00 00 80 18 00 00 00  10 0f 00 00 00 00 00 00  |(...............|
00000480  00 00 00 00 00 00 00 00  0c 00 00 00 38 00 00 00  |............8...|
00000490  18 00 00 00 02 00 00 00  05 c8 e4 04 00 00 01 00  |................|
000004a0  2f 75 73 72 2f 6c 69 62  2f 6c 69 62 53 79 73 74  |/usr/lib/libSyst|
000004b0  65 6d 2e 42 2e 64 79 6c  69 62 00 00 00 00 00 00  |em.B.dylib......|
000004c0  26 00 00 00 10 00 00 00  78 20 00 00 08 00 00 00  |&.......x ......|
000004d0  29 00 00 00 10 00 00 00  80 20 00 00 00 00 00 00  |)........ ......|
000004e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000eb0  55 48 89 e5 48 83 ec 10  90 e9 2b 00 00 00 90 48  |UH..H.....+....H|
00000ec0  8d 3d 9c 00 00 00 b0 00  e8 73 00 00 00 89 45 fc  |.=.......s....E.|
00000ed0  48 83 c4 10 5d c3 66 2e  0f 1f 84 00 00 00 00 00  |H...].f.........|
00000ee0  55 48 89 e5 48 83 ec 10  90 90 48 8d 3d 8b 00 00  |UH..H.....H.=...|
00000ef0  00 b0 00 e8 48 00 00 00  89 45 fc 48 83 c4 10 5d  |....H....E.H...]|
00000f00  c3 66 66 66 66 66 66 2e  0f 1f 84 00 00 00 00 00  |.ffffff.........|
00000f10  55 48 89 e5 48 83 ec 10  c7 45 fc 00 00 00 00 e8  |UH..H....E......|
00000f20  8c ff ff ff 48 8d 3d 68  00 00 00 b0 00 e8 0e 00  |....H.=h........|
00000f30  00 00 31 c9 89 45 f8 89  c8 48 83 c4 10 5d c3 90  |..1..E...H...]..|
00000f40  ff 25 ca 00 00 00 00 00  4c 8d 1d b9 00 00 00 41  |.%......L......A|
00000f50  53 ff 25 a9 00 00 00 90  68 00 00 00 00 e9 e6 ff  |S.%.....h.......|
00000f60  ff ff 54 68 69 73 20 77  69 6c 6c 20 6e 65 76 65  |..This will neve|
00000f70  72 20 62 65 20 73 65 65  6e 21 0a 00 54 68 69 73  |r be seen!..This|
00000f80  20 77 69 6c 6c 20 62 65  20 70 72 69 6e 74 65 64  | will be printed|
00000f90  2e 0a 00 48 65 6c 6c 6f  20 57 6f 72 6c 64 21 0a  |...Hello World!.|
00000fa0  00 00 00 00 01 00 00 00  1c 00 00 00 01 00 00 00  |................|
00000fb0  20 00 00 00 00 00 00 00  20 00 00 00 02 00 00 00  | ....... .......|
00000fc0  00 00 00 01 b0 0e 00 00  38 00 00 00 38 00 00 00  |........8...8...|
00000fd0  40 0f 00 00 00 00 00 00  38 00 00 00 03 00 00 00  |@.......8.......|
00000fe0  0c 00 03 00 18 00 01 00  00 00 00 00 39 00 00 01  |............9...|
00000ff0  60 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |`...............|
00001000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001010  58 0f 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |X...............|
00001020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  11 22 10 51 00 00 00 00  11 40 64 79 6c 64 5f 73  |.".Q.....@dyld_s|
00002010  74 75 62 5f 62 69 6e 64  65 72 00 51 72 00 90 00  |tub_binder.Qr...|
00002020  72 10 11 40 5f 70 72 69  6e 74 66 00 90 00 00 00  |r..@_printf.....|
00002030  00 01 5f 00 05 00 03 5f  6d 68 5f 65 78 65 63 75  |.._...._mh_execu|
00002040  74 65 5f 68 65 61 64 65  72 00 27 74 65 73 74 00  |te_header.'test.|
00002050  2b 6d 61 69 6e 00 3d 02  00 00 00 00 02 31 00 33  |+main.=......1.3|
00002060  32 00 38 03 00 b0 1d 00  03 00 e0 1d 00 03 00 90  |2.8.............|
00002070  1e 00 00 00 00 00 00 00  b0 1d 30 09 27 00 00 00  |..........0.'...|
00002080  43 00 00 00 0e 01 00 00  e9 0e 00 00 01 00 00 00  |C...............|
00002090  02 00 00 00 0f 01 10 00  00 00 00 00 01 00 00 00  |................|
000020a0  16 00 00 00 0f 01 00 00  10 0f 00 00 01 00 00 00  |................|
000020b0  1c 00 00 00 0f 01 00 00  b0 0e 00 00 01 00 00 00  |................|
000020c0  23 00 00 00 0f 01 00 00  e0 0e 00 00 01 00 00 00  |#...............|
000020d0  2a 00 00 00 01 00 00 01  00 00 00 00 00 00 00 00  |*...............|
000020e0  32 00 00 00 01 00 00 01  00 00 00 00 00 00 00 00  |2...............|
000020f0  05 00 00 00 06 00 00 00  00 00 00 40 05 00 00 00  |...........@....|
00002100  20 00 5f 5f 6d 68 5f 65  78 65 63 75 74 65 5f 68  | .__mh_execute_h|
00002110  65 61 64 65 72 00 5f 6d  61 69 6e 00 5f 74 65 73  |eader._main._tes|
00002120  74 31 00 5f 74 65 73 74  32 00 5f 70 72 69 6e 74  |t1._test2._print|
00002130  66 00 64 79 6c 64 5f 73  74 75 62 5f 62 69 6e 64  |f.dyld_stub_bind|
00002140  65 72 00 6a 75 6d 70 5f  74 65 73 74 32 00 00 00  |er.jump_test2...|
00002150
AMIGA:amiga~/Desktop/Code/C> _

The important bit is this:

00000eb0  55 48 89 e5 48 83 ec 10  90 e9 2b 00 00 00 90 48  |UH..H.....+....H|
00000ec0  8d 3d 9c 00 00 00 b0 00  e8 73 00 00 00 89 45 fc  |.=.......s....E.|
00000ed0  48 83 c4 10 5d c3 66 2e  0f 1f 84 00 00 00 00 00  |H...].f.........|
00000ee0  55 48 89 e5 48 83 ec 10  90 90 48 8d 3d 8b 00 00  |UH..H.....H.=...|
00000ef0  00 b0 00 e8 48 00 00 00  89 45 fc 48 83 c4 10 5d  |....H....E.H...]|
00000f00  c3 66 66 66 66 66 66 2e  0f 1f 84 00 00 00 00 00  |.ffffff.........|
00000f10  55 48 89 e5 48 83 ec 10  c7 45 fc 00 00 00 00 e8  |UH..H....E......|
00000f20  8c ff ff ff 48 8d 3d 68  00 00 00 b0 00 e8 0e 00  |....H.=h........|
00000f30  00 00 31 c9 89 45 f8 89  c8 48 83 c4 10 5d c3 90  |..1..E...H...]..|
00000f40  ff 25 ca 00 00 00 00 00  4c 8d 1d b9 00 00 00 41  |.%......L......A|
00000f50  53 ff 25 a9 00 00 00 90  68 00 00 00 00 e9 e6 ff  |S.%.....h.......|
00000f60  ff ff 54 68 69 73 20 77  69 6c 6c 20 6e 65 76 65  |..This will neve|
00000f70  72 20 62 65 20 73 65 65  6e 21 0a 00 54 68 69 73  |r be seen!..This|
00000f80  20 77 69 6c 6c 20 62 65  20 70 72 69 6e 74 65 64  | will be printed|
00000f90  2e 0a 00 48 65 6c 6c 6f  20 57 6f 72 6c 64 21 0a  |...Hello World!.|
00000fa0  00 00 00 00 01 00 00 00  1c 00 00 00 01 00 00 00  |................|

From main(), test1() is called which returns via the test2() function and from reading the hexdump...
The "nop"s, ([0x]90), are only there for easy detection inside the hexdump...
So starting at the end of the first 'nop' in function test1() the first instruction is a 32 bit jump of length 43, ([0x]2b), bytes.
So at byte position '00000ec9' we get:
e9 2b 00 00 00 which brings you inside the second function 'nop' located at byte position '00000ee8' ready to execute the next 'nop' instruction.
From there the second function runs and prints its string; the string in test1() is skipped.

My C days are long gone, and assembler even longer, so I don't have any authority to speak up, but to me it seems clear and logical that no strange behaviour e.g. "segmentation fault" comes up with your above code. One reason amongst others for segmentation faults is stack corruption, which may occur if a function is not left (and tidied up) correctly. But, in above example, the two functions have the same parameter / argument structure (none, to be specific), and identical local variable definitions (namely none), so the (quite complex, generated internally by the compiler) return operation includes the same stack tidying up, resulting in test2() 's return statement leaving behind a clean stack although geared up by test1() .

What if you specify a large argument list for one of the functions, and define several local variables? Pls try and report back.

Aside: I'm afraid you're slightly off with your hex locations. The first jump takes off from location 0xEBE, and it lands on 0xEE9, right between the two NOPs, which is exactly where the label definition occurred.

This is basically another way to do the undefined operation thing.

asm :
Not part of standard C, so whatever asm does is implementation-defined, i.e., decided by the people who wrote gcc.
It's not in the ISO C standard (n1570 draft, C2011) as such, but it is mentioned in Annex J (common extensions, J.5.10 "The asm keyword").

Annex J is informative, not normative, so an implementation need not provide inline assembly, and if it does it's not prescribed in which form. But it's a widespread extension, though not portable since compilers do indeed implement it differently.

In the C++ standard (n3376 draft of the C++11 standard), it IS mentioned in the body of the standard (section 7.4, the asm declaration), but its meaning is left implementation-defined.

I think Rudi is correct. Try using return statements that return values used by the calling code. Other than learning what not to do for reasonable code, what does this do for you? If you had not asked here you might have accidentally created a horrible bug in a piece of code that you thought was okay. It's okay with me, but it does not seem all that helpful....

Hi,
I do not quite understand what the problem is, but a walk through GDB would not go amiss.

gcc -O0 -fno-asynchronous-unwind-tables -S test.c
gcc -g test.s

I always create a file
cat sc

file a.out
b main
r
tui enable
la src 
la r
fs p

Open next terminal

tty
/dev/pts/2

Return to the first and run GDB

gdb -x sc -tty /dev/pts/2

And go through the program

(gdb) next

Maybe this will be helpful.

Thanks Jim...
I only attempted it to see if it was possible, and it DOES give a segmentation fault IF and only IF the 'jmp' goes directly into main() .
BUT it still compiles...

I created this absolutely meaningless garbage and it compiles without warnings or errors. Look what happens:

#include <stdio.h>
#include <stdlib.h>

int NUMBER1 = 255;
float NUMBER2 = 3.1415926;
char CHARACTER = 'X';

int test1(int NUMBER1)
{
    __asm__ volatile(
        "nop;"
        "jmp    jump_test2;"
        "nop;");
    NUMBER1 = 123;
    NUMBER2 = 1.414;
    CHARACTER = '!';
    printf("This will never be seen!\n");
    return NUMBER1;
}

float test2(float NUMBER2)
{
    /* The 'jump_test2:' label sits in here. */
    asm(
        "nop;"
        "jump_test2:"
        "nop;");
    NUMBER1 = -65;
    NUMBER2 = 2.718;
    CHARACTER = '?';
    printf("This will be printed.\n");
    printf("%.03f%c\n", NUMBER2, CHARACTER);
    return NUMBER2;
}

int main(int NUMBER1, char **CHARACTER)
{
    NUMBER1 = test1(NUMBER1);
    printf("%d %s\n", NUMBER1, *CHARACTER);
    printf("Hello World!\n");
    return(0);
}

Results on OSX 10.14.3, default bash terminal, gcc 4.2.1.
(IMPORTANT! NOT checked on gcc 2.95.3 or 7.3.0.)

Last login: Thu Mar 28 08:36:48 on ttys000
AMIGA:amiga~> cd Desktop/Code/C
AMIGA:amiga~/Desktop/Code/C> gcc cross_function_jump.c
AMIGA:amiga~/Desktop/Code/C> cp a.out cross_function_jump
AMIGA:amiga~/Desktop/Code/C> ./cross_function_jump
This will be printed.
2.718?
7 ./cross_function_jump
Hello World!
AMIGA:amiga~/Desktop/Code/C> _

7 ./cross_function_jump is obviously wrong but I have successfully got 'argv[0]'.
I don't care what is happening, but compiling AND running without a segmentation fault is not a fault of the programmer but of the compiler.

Any 'asm()' code, whether standards-compliant or not, should never be allowed to jump out of its own function, and this was my point entirely.

What I have done I would never use in practice, but I would use inline assembly for mission critical stuff inside its own function.

All I wanted to know is why these compile and run, garbage results or not.

Why should your first program crash? You're not touching any memory you shouldn't, and one "ret" is as good as another as long as you've got the same size stack context, which you do to the last byte.

Because argv[0] actually exists, even in a program with no arguments - it's the name of the calling program.

Again, why should it segfault? What exact fault should it be catching here? Segmentation fault means "touched memory I don't have permission to use", and if you don't do that, you don't get a segfault, even if you leap around like a flea on a hot griddle.

Apologies if this attaches itself to my previous post, and for any typos!

Well the story so far:
This is working code and compiles on gcc versions 2.95.3, 4.2.1 and 7.3.0, AMIGA OS 3.0.x inside ADE, OSX 10.14.3 and Linux Mint 19.

I have called it 'obfuscate_asm.c'.
Although jumping INTO main() compiles BUT causes a segmentation fault, jumping OUT of it doesn't.
Just read the code and see the possibilities. Makes me wonder if this is done by commercial coders.

/* Real obfuscation scenario. */

#include <stdio.h>

int addition(int NUM1, int NUM2);
int obfuscate(int NUM1, int NUM2);

int main()
{
    int NUMBER1;
    int NUMBER2;
    int SUM;

    asm("nop;"
    "jmp    out;"
    "in:"
    "nop;");

    printf("Enter two integer numbers:- ");
    scanf("%d %d", &NUMBER1, &NUMBER2);

    SUM = addition(NUMBER1, NUMBER2);

    printf("SUM = %d.\n", SUM);

    return(0);
}

int addition(int NUM1,int NUM2)
{
    int RESULT;
    RESULT = NUM1 + NUM2;
    return RESULT;
}

int obfuscate(int NUM1,int NUM2)
{
    int RESULT;

    asm("jmp    getout;"
    "nop;"
    "out:"
    "nop;"
    "jmp    in;"
    "getout:"
    "nop;");
    /* Any number of these jumps could be used to hide stuff. */

    RESULT = NUM1 / NUM2;
    return RESULT;
}

Results; OSX 10.14.3, default bash terminal, gcc 4.2.1.

Last login: Thu Mar 28 14:55:00 on ttys000
AMIGA:amiga~> cd Desktop/Code/C
AMIGA:amiga~/Desktop/Code/C> gcc obfuscate_asm.c
AMIGA:amiga~/Desktop/Code/C> ./a.out
Enter two integer numbers:- 123 456
SUM = 579.
AMIGA:amiga~/Desktop/Code/C> hexdump -C a.out
........
00000e90  55 48 89 e5 48 83 ec 20  c7 45 fc 00 00 00 00 90  |UH..H.. .E......|
00000ea0  e9 8b 00 00 00 90 48 8d  3d cb 00 00 00 b0 00 e8  |......H.=.......|
00000eb0  92 00 00 00 48 8d 3d da  00 00 00 48 8d 75 f8 48  |....H.=....H.u.H|
00000ec0  8d 55 f4 89 45 ec b0 00  e8 7f 00 00 00 8b 7d f8  |.U..E.........}.|
00000ed0  8b 75 f4 89 45 e8 e8 25  00 00 00 48 8d 3d b9 00  |.u..E..%...H.=..|
00000ee0  00 00 89 45 f0 8b 75 f0  b0 00 e8 57 00 00 00 31  |...E..u....W...1|
00000ef0  f6 89 45 e4 89 f0 48 83  c4 20 5d c3 0f 1f 40 00  |..E...H.. ]...@.|
00000f00  55 48 89 e5 89 7d fc 89  75 f8 8b 75 fc 03 75 f8  |UH...}..u..u..u.|
00000f10  89 75 f4 8b 45 f4 5d c3  0f 1f 84 00 00 00 00 00  |.u..E.].........|
00000f20  55 48 89 e5 89 7d fc 89  75 f8 e9 07 00 00 00 90  |UH...}..u.......|
00000f30  90 e9 6f ff ff ff 90 8b  45 fc 99 f7 7d f8 89 45  |..o.....E...}..E|
00000f40  f4 8b 45 f4 5d c3 ff 25  c4 00 00 00 ff 25 c6 00  |..E.]..%.....%..|
00000f50  00 00 00 00 4c 8d 1d ad  00 00 00 41 53 ff 25 9d  |....L......AS.%.|
00000f60  00 00 00 90 68 00 00 00  00 e9 e6 ff ff ff 68 0e  |....h.........h.|
00000f70  00 00 00 e9 dc ff ff ff  45 6e 74 65 72 20 74 77  |........Enter tw|
00000f80  6f 20 69 6e 74 65 67 65  72 20 6e 75 6d 62 65 72  |o integer number|
00000f90  73 3a 2d 20 00 25 64 20  25 64 00 53 55 4d 20 3d  |s:- .%d %d.SUM =|
00000fa0  20 25 64 2e 0a 00 00 00  01 00 00 00 1c 00 00 00  | %d.............|
........
AMIGA:amiga~/Desktop/Code/C> _

As you can see by checking the 32 bit jumps, in this case e9 xx xx xx xx, you can obfuscate your code by jumping all over the place through unused functions that would actually do something if called; obfuscate() would actually divide two numbers.
So IMHO gcc is not absolutely foolproof.
I have always quoted, even on here, "IF there is a back door I will find it!"
When I am really interested in finding something I will do my best to find it.

How do you compile ? Try gcc -Wall -ansi mycode.c -o mycode

I'm guessing, completely. You have to read your manpage for gcc , look for the options for your hardware e.g., SPARC
Compile with the strictest settings you can find. This will eliminate some of the problems you see: gcc compiling stuff that should fail.
Try to use -std=c99 if your compiler supports it, for example. Since you run on OSX and Amiga (I think), I have to punt on what the exact command should be.

PS: a good clean compile means zero warnings/errors

Ah, now you're starting to jump between functions with different amounts of local variables. That means these local variables may not be allocated properly when you use them, or freed properly when you return, causing corruption on the stack (i.e. important pointer values on the stack get overwritten by your local variables, since stack space was never made for them) and potential crashes on return when RET jumps into la-la land. This is not recommended.

Also, main() is somewhat special, to the point newer compilers have stopped letting you take the address of it.

Further, doing things you didn't ask the compiler to do is begging for trouble. The compiler loves to remove things you "don't use", to the point that if you never touch a variable in your program, it might optimize it away completely and use hardcoded values instead. You have to make these variables 'volatile' to force the compiler to not do anything smart and helpful for you.

IMO gcc should check for any assembly jumps outside the bounds of any individual function, irrespective of how one compiles it. That the first code uses two near identical functions is irrelevant; the 'asm()' function is taking functions out of their boundaries.
And, I don't intend, ever, to use this bad coding method at all. It was an exercise in finding out as I was researching about 'goto' inside a function.

--- Post updated at 03:56 PM ---

Hi Jim...
Absolutely.
However I shouldn't have to type all that just to find out, gcc filename.c should be sufficient.

Last login: Thu Mar 28 14:56:34 on ttys000
AMIGA:amiga~> cd Desktop/Code/C
AMIGA:amiga~/Desktop/Code/C> gcc -Wall -ansi obfuscate_asm.c -o obfuscate_asm
obfuscate_asm.c:14:2: warning: implicit declaration of function 'asm'
      [-Wimplicit-function-declaration]
        asm("nop;"
        ^
obfuscate_asm.c:40:2: warning: implicit declaration of function 'asm'
      [-Wimplicit-function-declaration]
        asm("jmp    getout;"
        ^
2 warnings generated.
Undefined symbols for architecture x86_64:
  "_asm", referenced from:
      _main in obfuscate_asm-09a969.o
      _obfuscate in obfuscate_asm-09a969.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
AMIGA:amiga~/Desktop/Code/C> _

It should have been __asm__ volatile() but I used the generic 'asm()' as it seems 'universal' even on my AMIGA C compilers.

Thanks for your help...

EDIT:
I didn't add this as an edit, it did so itself.

Apologies if this attaches to the previous post.
As a finale to this thread I decided to compile using the correct form of 'asm()', __asm__ volatile().

/* Real obfuscation scenario. */

#include <stdio.h>

int addition(int NUM1, int NUM2);
int obfuscate(int NUM1, int NUM2);

int main()
{
    int NUMBER1;
    int NUMBER2;
    int SUM;

    __asm__ volatile("nop;"
    "jmp    out;"
    "in:"
    "nop;");

    printf("Enter two integer numbers:- ");
    scanf("%d %d", &NUMBER1, &NUMBER2);

    SUM = addition(NUMBER1, NUMBER2);

    printf("SUM = %d.\n", SUM);

    return(0);
}

int addition(int NUM1,int NUM2)
{
    int RESULT;
    RESULT = NUM1 + NUM2;
    return RESULT;
}

int obfuscate(int NUM1,int NUM2)
{
    int RESULT;

    __asm__ volatile("jmp    getout;"
    "nop;"
    "out:"
    "nop;"
    "jmp    in;"
    "getout:"
    "nop;");
    /* Any number of these jumps could be used to hide stuff. */

    RESULT = NUM1 / NUM2;
    return RESULT;
}

Results, OSX 10.14.3, default bash terminal, gcc 4.2.1.
The results are similar for AMIGA OS 3.0.x gcc 2.95.3 and Linux Mint 19 gcc 7.3.0.

Last login: Thu Mar 28 18:58:36 on console
AMIGA:amiga~> cd Desktop/Code/C
AMIGA:amiga~/Desktop/Code/C> gcc -Wall -ansi obfuscate_asm.c -o obfuscate_asm
AMIGA:amiga~/Desktop/Code/C> ./obfuscate_asm
Enter two integer numbers:- 123 456
SUM = 579.
AMIGA:amiga~/Desktop/Code/C> hexdump -C obfuscate_asm
........
00000e90  55 48 89 e5 48 83 ec 20  c7 45 fc 00 00 00 00 90  |UH..H.. .E......|
00000ea0  e9 8b 00 00 00 90 48 8d  3d cb 00 00 00 b0 00 e8  |......H.=.......|
00000eb0  92 00 00 00 48 8d 3d da  00 00 00 48 8d 75 f8 48  |....H.=....H.u.H|
00000ec0  8d 55 f4 89 45 ec b0 00  e8 7f 00 00 00 8b 7d f8  |.U..E.........}.|
00000ed0  8b 75 f4 89 45 e8 e8 25  00 00 00 48 8d 3d b9 00  |.u..E..%...H.=..|
00000ee0  00 00 89 45 f0 8b 75 f0  b0 00 e8 57 00 00 00 31  |...E..u....W...1|
00000ef0  f6 89 45 e4 89 f0 48 83  c4 20 5d c3 0f 1f 40 00  |..E...H.. ]...@.|
00000f00  55 48 89 e5 89 7d fc 89  75 f8 8b 75 fc 03 75 f8  |UH...}..u..u..u.|
00000f10  89 75 f4 8b 45 f4 5d c3  0f 1f 84 00 00 00 00 00  |.u..E.].........|
00000f20  55 48 89 e5 89 7d fc 89  75 f8 e9 07 00 00 00 90  |UH...}..u.......|
00000f30  90 e9 6f ff ff ff 90 8b  45 fc 99 f7 7d f8 89 45  |..o.....E...}..E|
00000f40  f4 8b 45 f4 5d c3 ff 25  c4 00 00 00 ff 25 c6 00  |..E.]..%.....%..|
00000f50  00 00 00 00 4c 8d 1d ad  00 00 00 41 53 ff 25 9d  |....L......AS.%.|
00000f60  00 00 00 90 68 00 00 00  00 e9 e6 ff ff ff 68 0e  |....h.........h.|
00000f70  00 00 00 e9 dc ff ff ff  45 6e 74 65 72 20 74 77  |........Enter tw|
00000f80  6f 20 69 6e 74 65 67 65  72 20 6e 75 6d 62 65 72  |o integer number|
00000f90  73 3a 2d 20 00 25 64 20  25 64 00 53 55 4d 20 3d  |s:- .%d %d.SUM =|
00000fa0  20 25 64 2e 0a 00 00 00  01 00 00 00 1c 00 00 00  | %d.............|
........
AMIGA:amiga~/Desktop/Code/C> _

Voila! No error but the jumps are still there.

It's necessary sometimes, if you're building an operating system for example, to insert special instructions here and there without the compiler's interference. That's the kind of thing asm() is for. gcc will insert raw assembly if you ask, but you really have to know what you're doing since it can't protect you (though some more advanced syntax lets you warn gcc about side-effects instead). Plain, non-ASM goto (yes, it exists, though it is very rarely used) wouldn't let you jump out of bounds.

As I have said in the distant past, I have coded assembly on 16 and 32 bit Intel architecture but have no experience in 64 bit, although I suspect there is not much difference.
And I would only use it for mission critical stuff, for which there is no need these days, as HW interface APIs are usually far more than good enough for this purpose, especially as the UNIX ethos is that everything is a file.

One thing for sure I am getting to know how gcc __thinks__ and compared to say Dice-C or VBCC for the AMIGA it is mega-powerful.
And finally I know that 'goto' is local to the function that uses it, and a good thing it is too.