Need more info on internals of C compilers

Hello Gurus,

I am OK with the concepts of the C language, but I would like to know
more about the internals of C with respect to compilers.

What happens when we say
gcc filename.c

a.out will get created (what does the compiler actually do to the code, in addition to generating object code?)

gcc -c filename.c ( )

creates only .o files; we need to link them to get a.out (I need more detail here)

gcc -o fn filename.c ( )

Please provide more info on the following:
symbol table (how the linker actually uses this info), virtual address (what is it and why is it needed), physical address (what is it and why is it needed), linker, loading, locating.
I am expecting the above with respect to the gcc compiler.

Thanks in advance.
I am expecting either links or some description.

Homework?

This is not homework.
I am not even a student. I recently joined a company but am struggling to understand the internals of compilers.

Everything you asked is outside the work the compiler does.

The compiler's main objective is to take (preprocessed) high-level code, and translate it to assembly code. Example:

$ cat hello.c
#include <stdio.h>

int main(int argc, char **argv)
{
        printf("Hello, World!\n");
        return 0;
}
$ gcc -S -o hello.s hello.c
$ cat hello.s
        .file   "hello.c"
        .section        .rodata
.LC0:
        .string "Hello, World!"
        .text
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        subl    $4, %esp
        movl    $.LC0, (%esp)
        call    puts
        movl    $0, %eax
        addl    $4, %esp
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
        .ident  "GCC: (GNU) 4.2.1 (SUSE Linux)"
        .section        .note.GNU-stack,"",@progbits

The assembler then uses that to create an object file (the output of 'gcc -c').
The linker takes the object file and connects it with various libraries to create an executable.
The loader reads the executable, and loads the referenced libraries, if available, and maps them into virtual memory.

The 'gcc' command, by default, hides these steps from the user and invokes the complete toolchain (for a C program: cpp [C preprocessor], cc1 [the C compiler proper], as [assembler], and ld [linker]).

Further reading: Compiler, Assembler, Linker, Virtual memory, and a series of blog posts on Linkers

---------- Post updated at 11:37 ---------- Previous update was at 11:18 ----------

Addendum: if you compile the program above (hello.c) using

gcc -v -save-temps -o hello hello.c

you'll see all the "compilation" steps, and the intermediate results will be saved to hello.i (preprocessed), hello.s (compiled), and hello.o (assembled), with the final executable in hello (linked).

For your understanding at a higher, non-programmer level:

cc myfile.c -o myfile -lm
# or maybe gcc
gcc myfile.c -o myfile -lm

The above command does the following:

  1. preprocess - invokes a program to honor the #include and #define directives in myfile.c. The name of the program that does this is usually cpp.

  2. compile - transforms the preprocessed code into assembly language (see pludi's
    nice description above); the assembler then turns that into an object file.

  3. link edit - this is done by the ld program. The -lm option brings in a library (a precompiled library of code) called libm, the math library. The ld program looks for all of the external symbols in your code (like printf()) and finds each one in a library. Each external function you call has to be tracked down and then resolved by ld. It also finds external variables for you. It packs all of this into a specially formatted file: "a.out" format, ELF format, or whatever other file format your machine uses.

The final product is a compiled binary, which the operating system can load and run directly.

The command above means "compile myfile.c, write it out to a binary file called myfile and link against libm". BTW, the compiler driver also tells ld to link against the standard C library, libc, by default, so you do not have to "ask" for it.

Thank you for your support.

I will come back soon with some other queries.