Compilation, Interpretation, Virtualisation etc. How Computer Programming Works - PART 3
How computer programs run
The OS manufacturers didn't just hand out their source code (you wish). Instead, that code (or rather, a bunch of code) is itself shipped in object form, meaning it has already been compiled.
But why do we even need this in the first place?
Let's recall our simple C program from PART 2:
/*Context: C */
int add(int i, int j)
{
    int p = i + j;
    return p;
}
Suppose we want to print the result to the output device (specifically the screen). We add another line to do so, as below.
/*Context: C */
#include <stdio.h> //notice this new code
int add(int i, int j)
{
    int p = i + j;
    printf("%d", p); //and this one
    return p;
}
But hold on! We never said anything about how to print. The printf line in the snippet above doesn't mention which device to use (an actual printer or the monitor), where that device is located, or what character encoding (type of text) we expect the result in.
Well, we could work out how to talk to the monitor ourselves: study where the display device's framebuffer is mapped in memory, get a copy of the Intel Processor Developer's Manual, contact the manufacturers of the display device for a copy of their developer's manual, and then spend a year or more trying and failing.
You can imagine the amount of work that would be needed.
So the OS manufacturers have written these routines and compiled them into object form. A set of object files made available by a third party (someone other than us) to make certain tasks easier to perform is called a Library.
But what if I don't need to print, do I still need these sets of external object code?
Yes darling.
Printing and device-specific stuff (like reading input from a keyboard) aren't the only things we need libraries for. Remember I mentioned in PART 2 that your instructions (let's start calling them programs) can't just be consumed by the processor anyhow; they must go through the OS. To maintain the sanity of the whole system (for example, by ensuring your program is not intruding on other programs' space in memory), the OS is the only one with the authority to package your code and hand it to the processor for execution. How it manages to do this is beyond the scope of this series and belongs to another domain of study entirely.
So the OS manufacturers have also created some libraries that every program needs if it aims to get to the processor, just to make sure it follows the rules.
That being said, we can create a model of what we expect the final result (what will actually get to run) to be made of.
your_own_object_code + optional_libraries (eg for printing and input) + those_mandatory_libraries = final result
This process of wiring all the object code together, ladies and gentlemen, is called Linking, and the final result is called the Executable.
Since our printf operation has been condensed into a function (a block of task-specific code) that lives outside our own code, we need to refer to where it is declared, and that's why we needed the #include directive. In C, #include <stdio.h> brings in the declaration of printf, while the compiled code itself stays in the library until linking. Although this directive is specific to the C language, other languages have their own ways of referring to code that is expected to be found in an external library.
Since even the simplest source file (a file containing the original code you wrote) needs these mandatory libraries, a form of mandatory linking is done on every program after compilation. In fact, most toolchains (compilers with linkers and other stuff) will just link your object file to these libraries without you having to specify them, and generate your executable in the format your target OS recognises. These kinds of executables (the ones generated after code passes through these stages) are called Binary Executables.
Now don't get me wrong: the final executable is still in the 1s and 0s the machine understands, and that's the beauty of it. We get to write code in an almost-English-like form, and our code gets to the machine in the native language it understands. These streams of 1s and 0s are not just any binary instructions; they are called the machine binary or native binary.
Are there other kinds of binaries?
Yes, we'll get to that later in the series.
Whenever a program is executed in such a way that each instruction in its original source now exists in its corresponding machine binary form, the program is said to be running natively.
Let's face it; we have been having a good time ever since we left the era of machine code - or at least that was what we thought.