Compilation, Interpretation, Virtualisation etc. How Computer Programming Works - PART 2

How computer programs run

If you have not read PART 1, we are still selling tickets.

The programmers had two options:

  1. Come up with yet another programming language to solve the challenge of "manual addressing".
  2. Ditch their boring jobs and form a karaoke band.

But no, they chose to face the challenge head-on, and you will never guess what they came up with - high-level languages.

A set of languages with close-to-human-language syntax that abstracted away what the actual instructions were doing to the underlying metal - meaning you could add two numbers without the fear of applying the wrong addressing mode. A tremendous achievement.

Not only that: remember when every architecture had its own instruction set and you had to write separate code for each one? Well, bye-bye ISA-specific code, because programs written in these new saviors didn't have to be completely rewritten to work on other machines.

/*Context: C */
int add(int i, int j)
{
    int p = i + j; // don't mind me, just adding two numbers
    return p;
}

Source

Now here is the assembly language equivalent (32-bit x86, AT&T syntax):

.globl add
add:
    pushl %ebp                 # save the caller's frame pointer
    movl %esp, %ebp            # set up this function's stack frame
    subl $4, %esp              # create space for integer p
    movl 8(%ebp), %edx         # 8(%ebp) refers to i
    addl 12(%ebp), %edx        # 12(%ebp) refers to j
    movl %edx, -4(%ebp)        # -4(%ebp) refers to p
    movl -4(%ebp), %eax        # store return value in eax
    leave                      # i.e. movl %ebp, %esp; popl %ebp
    ret                        # return to the caller

# in other words, somebody HELP me!!!

Source

It is indeed true that one line of C may be equivalent to 10 lines of assembly.

Other than full abstraction (hiding) of the underlying machine's modes of operation, they provided many more features out of the box: cleaner code structuring through functions, grouping data into various data types, and more human-friendly conditional constructs like if, else if and else. The way all of these are mapped to their respective machine constructs is way beyond the scope of this series, but it's not hard to understand either. All of this you write once, and it can be made to run on other machines. But how?
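To make those features a little more concrete, here is a minimal sketch in C. The `account` struct and the `withdraw` function are made-up names for illustration only; the point is simply to show a data type, a function and an if / else if / else chain living together in a few readable lines.

```c
#include <assert.h>
#include <stddef.h>

/* A data type: groups related values under one name (hypothetical example). */
struct account {
    int  id;
    long balance_cents;
};

/*
 * A function: tries to take amount_cents out of the account.
 * Returns  1 if the withdrawal succeeded,
 *          0 if there were not enough funds,
 *         -1 if the amount itself was invalid.
 */
int withdraw(struct account *acct, long amount_cents)
{
    assert(acct != NULL); /* the caller must pass a real account */

    if (amount_cents <= 0) {
        return -1;                          /* nonsense request */
    } else if (amount_cents > acct->balance_cents) {
        return 0;                           /* not enough money */
    } else {
        acct->balance_cents -= amount_cents;
        return 1;                           /* all good */
    }
}
```

Notice that nothing here mentions registers, stack offsets or addressing modes; the compiler works all of that out for whichever machine you target.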

You see, when you write code in a high-level language like C with the intention of porting (moving) it to a machine that runs an operating system like Windows, there is more to the story than it seems. An operating system, by the way, is a behemoth of code that manages the CPU (the processor, but with more stuff) and the other devices connected to it so you don't have to.

The earliest programs written in HLLs (high-level languages) had to go through a process called compilation. Here, all the fancy human-friendly code we write goes through a series of steps before it turns into something you can double-click and complain about.

At this stage, the operating system to which you are porting your program (the code that will be run) doesn't really matter; what matters first is the processor on which the program will run. In other words, if the target operating system is Windows, the first concern is the CPU architecture of the machine that Windows is running on.
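One way to see that the compiler always has a specific processor in mind is to ask it. The sketch below relies on predefined macros that GCC, Clang and MSVC set for the architecture being targeted (for example `__x86_64__` or `_M_X64`); the function names `target_arch` and `describe_target` are my own, and the list of architectures is deliberately short rather than complete.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Names the architecture this file is being compiled FOR,
 * based on macros the compiler defines by itself. */
static const char *target_arch(void)
{
#if defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86 (32-bit)";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "ARM64";
#elif defined(__arm__) || defined(_M_ARM)
    return "ARM (32-bit)";
#else
    return "unknown";
#endif
}

/* Writes a short description of the target into buf.
 * Returns what snprintf returns: the length of the full description. */
int describe_target(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s, %d-bit pointers",
                    target_arch(), (int)(sizeof(void *) * CHAR_BIT));
}
```

Compile the same file for a different processor and the description changes, even though you did not touch a single line of the source.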

Let's say it is running on an Intel processor (Intel is a company that makes processors). Then what matters is the code needed to make your program compatible with the Intel CPU. This code, my friends, is called the object code. Object code can come in several forms: part assembly and part machine code, not-yet-complete machine code, or even full machine code.

Now that we know that the target architecture is taken care of, we don't expect the end-users to figure out how to wire the code into the processor because that is what the operating system is for.

But how do you get the program to run on the target OS? Why don't we just dump the object file (the file containing the object code) on Windows and then double-click it?

Well, you could - except it won't run, most of the time.

You see, every operating system already has its own way of talking to the machine. The way Ubuntu (another operating system) running on an Intel machine passes code (instructions) on to the machine is different from how Windows does it.
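Here is a rough sketch of what that difference looks like from a programmer's chair. It prints a message using each system's own interface: `WriteFile` from the Windows API on Windows, and the POSIX `write` call on systems like Ubuntu. The wrapper name `os_write_stdout` is made up for this example, and real programs would normally let the C standard library hide all of this.

```c
#include <assert.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>   /* Windows API: GetStdHandle, WriteFile */
#else
#include <unistd.h>    /* POSIX: write, STDOUT_FILENO */
#endif

/* Sends msg to standard output through the OS's own interface.
 * Returns the number of bytes written, or -1 on failure. */
long os_write_stdout(const char *msg)
{
    assert(msg != NULL);
    size_t len = strlen(msg);

#ifdef _WIN32
    DWORD written = 0;
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    if (out == INVALID_HANDLE_VALUE ||
        !WriteFile(out, msg, (DWORD)len, &written, NULL)) {
        return -1;
    }
    return (long)written;
#else
    ssize_t written = write(STDOUT_FILENO, msg, len);
    return (long)written;
#endif
}
```

Same processor, same message, two completely different conversations with the operating system. The `#ifdef` picks one of them at compile time, which is a hint that the finished program ends up tied to one OS.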

But they promised us cross-platform 🥺

Well, since you didn't create the target OS (coz if you did you would be stinking rich like a certain individual), it is up to the OS manufacturers to fill that in. So how do they do it?

An operating system is just a set of instructions (a large volume of instructions) running in memory, nothing special (except that you have to pay licensing fees). As such, the OS manufacturers have done a lot of work and figured out some cool things like connecting to the output device (the screen), writing a set of characters to the screen, connecting to an input device like the keyboard, playing sounds and so on. They have written all the code for that, and that's why you don't have to write code for buffering bytes in order to display them on the screen.

So we expect them to be nice enough to make that code available so we don't have to go through the same stress. And yes they did - or did they?