A Very Gentle Introduction to WebAssembly

The road to near-native execution in the browser

The Story

The web has evolved from a platform for merely reading headlines to one that hosts the lifestyles of millions of users: from groceries and entertainment to housing and productivity. Building on the web is a win for everyone: developers don't have to learn native technologies to make their ideas universally available, and users are spared the chore of manually installing native applications.

With the modern web, developers now have access to native features like Device Location, Device Orientation, Notification and many more, and can have their web applications possess some key attributes of their native counterparts across all devices, all from a single codebase.

So it makes sense that users expect the web to be the default platform for delivering any kind of technological solution, and indeed -- it is.

A recent achievement is how seamless it now is to move from your regular life to any product on the web: you can install a web app directly on your device and have it function as though it were a mobile or desktop application, with a native user experience (UX). These kinds of web applications are referred to as progressive web apps (PWAs). The experiences they provide are getting closer to their native counterparts -- but the truth is -- native UX is not the same as native performance.

You can have an app pretend to be native by giving it access to native functionality coupled with clever UI design, but there is still one culprit that always breaks the illusion -- speed. Hitting native or near-native speed was out of reach because of how web clients (mostly browsers) are designed -- that is, until WebAssembly came onto the scene.

So What's WebAssembly?

"WebAssembly is a binary instruction format for a stack-based virtual machine". To understand this, we need to understand how code runs in the browser. You see, JavaScript was for a long time the only Turing-complete programming language the browser could run. A Turing-complete language is one that can be used to model any computational problem; in simple terms, it can solve (or simulate) any programming task. But JavaScript does not talk to the machine directly. It is instead converted into a different format known as bytecode, which is then run in a special kind of environment called a virtual machine (VM).

A virtual machine (VM) is a computer program that aims to model the architecture of a real or hypothetical computer. So, instead of binaries running directly on your actual machine, the bytecode runs in this hypothetical environment. A major advantage of this approach is that the VM guarantees that programs execute similarly (at least in theory) on different platforms, and since they run in the hypothetical VM anyway, developers don't need to learn platform-specific routines; the VM has already abstracted them away. Another advantage is that since the bytecode only knows the VM as its environment, it can be optimised for the VM, and that optimisation carries over to any platform that hosts the VM.

As cool as this sounds, it takes a performance toll on JavaScript. This is because the code (or its end result) runs in a (somewhat) simulated environment, an extra layer sitting between it and the hardware. Ultimately, the actual machine is only aware of the VM and therefore interacts with it -- and not your code 😔.

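The WebAssembly stack-based VM is another VM that the browser now supports. Its superpower is converting bytecode to its native binary equivalent (the 1s and 0s your machine actually understands) at runtime; this, my friends, is known as Just-in-Time (JIT) compilation. Modern JavaScript engines JIT-compile hot code too, but they first have to parse the source and guess at types while it runs; Wasm bytecode arrives compact and statically typed, so the engine can compile it quickly and predictably.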

If you want to learn more about JIT, VMs and the program runtime in general, I have a series dedicated to that.

The Wasm VM understands a different kind of bytecode, the Wasm bytecode, which is the binary instruction format we mentioned earlier, and it allows this bytecode to be converted to native binaries at runtime (when your program is running). This JIT-ing gives almost native (near-native) execution, since your code is converted into what the machine understands and will ultimately execute.

This opens up a whole new world of possibilities, free of many shortcomings of the past. Since your program ends up as native machine code, the CPU executes it directly rather than through an interpreter. Applications can thus run faster, and developers can finally bridge the speed gap that has been a barrier for far too long.

Enough talk, let's write some Wasm

(module
  (func (export "addTwo") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add))

Source

The above code snippet is in the WebAssembly Text Format, or wat. WebAssembly comes in two formats: the WebAssembly Text Format (wat) and the WebAssembly Binary Format (wasm). The wat is a human-readable format designed for us humans to read (and maybe write 🤷). Still, the Wasm VM does not understand this format, and any code written in it needs to be compiled to the binary format. This makes Wasm a compilation target. I will touch on the concept of compilation targets in a few minutes, but for now, let's go through the above code.

Every WebAssembly program lives in a module, which is written as a tree of nodes. So in the above snippet we create a module, which is always the root node, and then a func node. Our func node is a function which we wish to present to the browser as "addTwo" in case we need to refer to it in the future; for that, we apply the export keyword.

The param keyword declares two integer parameters that are 32 bits in size, and the result keyword indicates that we are expecting this function to return a 32-bit integer to its caller.

Inside our function body (notice we are still within the open parenthesis that precedes func), we invoke the local.get instruction. It reads the value of a local variable of this function (in other words, a variable that only has scope within this function) and pushes it onto the stack. Parameters count as the first locals, so local.get 0 pushes the 0th local, which is our first argument, and local.get 1 pushes the second. The i32.add instruction then pops two i32 values from the stack, adds them, and pushes the result back onto the stack as an i32. So when this function is called, it returns the value at the top of the stack, which in our case is the sum of the two numbers.
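To make the stack dance a bit more concrete, here is a rough sketch of those three instructions simulated in plain JavaScript. This is not how a real Wasm engine is implemented; the run function and the addTwoBody array are names I made up purely for illustration.

```javascript
// A toy interpreter for the three instructions in our function body.
// `run` and `addTwoBody` are hypothetical names, not part of any Wasm API.
function run(instructions, locals) {
  const stack = [];
  for (const [op, arg] of instructions) {
    switch (op) {
      case "local.get":
        // Push the value of the local at index `arg` onto the stack.
        stack.push(locals[arg]);
        break;
      case "i32.add": {
        // Pop two values, add them, push the sum back.
        const b = stack.pop();
        const a = stack.pop();
        stack.push((a + b) | 0); // "| 0" wraps to a signed 32-bit integer, like i32
        break;
      }
      default:
        throw new Error(`unknown instruction: ${op}`);
    }
    console.log(op, arg === undefined ? "" : arg, "stack:", JSON.stringify(stack));
  }
  // Whatever is left on top of the stack is the return value.
  return stack.pop();
}

const addTwoBody = [["local.get", 0], ["local.get", 1], ["i32.add"]];
const result = run(addTwoBody, [2, 3]);
console.log(result); // 5
```

Running it with node prints the stack after every instruction, so you can watch the two arguments go in and the sum come out.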

Using our WebAssembly Module(s) in the browser

Note: I am using Chrome Version 101.0.4951.54 on Windows 11

Now let's try to use our code in a real project:

  1. Begin by creating a folder and call it wasmtest.
  2. Now create a file in that folder and call it add.wat.
  3. Then copy our above code snippet into the add.wat file.

Before we can make our little addition function available in the browser, we need a tool that can help us compile our WAT expressions to the wasm binary format that the Wasm VM understands. So let's get that from the release directory of the official WebAssembly Binary Toolkit project. Extract the downloaded compressed file and you should have something like the below.

extract wabt files.png

Now we will use the wat2wasm tool to convert our wat file. I did this in a Windows-specific way by already adding the wasm tools folder to the PATH variable on Windows and then navigating to my wasmtest directory to run:

$ wat2wasm add.wat

The steps should be similar on whatever platform you are running, and you should now have an add.wasm file in your wasmtest directory.
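If you are curious what is inside that binary, the sketch below spells out the bytes I would expect for our module, assuming no optional debug or name sections are included. I assembled them by hand from the binary format rules, so treat the layout comments as my reading rather than official output. It runs in Node.js, where WebAssembly is available as a global.

```javascript
// Hand-assembled bytes for the addTwo module (an assumption about what the
// compiled file contains when no debug/name sections are emitted).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic number: "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function using type 0
  0x07, 0x0a, 0x01, 0x06, 0x61, 0x64, 0x64, 0x54, 0x77, 0x6f, 0x00, 0x00, // export "addTwo" as function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code: local.get 0, local.get 1, i32.add, end
]);

// Compile and instantiate synchronously; fine for a module this tiny.
const wasmModule = new WebAssembly.Module(wasmBytes);
const instance = new WebAssembly.Instance(wasmModule);
const { addTwo } = instance.exports;

console.log(addTwo(2, 3)); // 5
```

Notice how local.get and i32.add each shrink to a single opcode byte (0x20 and 0x6a); that compactness is a big part of why the binary format loads fast.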

now have wasm file.png

Here comes the cool part! 😊 We want a page with two input fields; we extract the user's entries with JavaScript and prepare them as integers. We then write a tiny routine that loads our wasm file and makes it available as a module that can be invoked. Finally, we invoke the addTwo function from our module and print the result to the console. Lobatan!

Create an index.html file and fill it with this:

<!DOCTYPE html>
<html>
<body>
    <input type="text" id="arg1" />
    <input type="text" id="arg2" />
    <button onclick="add()">Add</button>
    <script>
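        function add() {
            // Read both inputs as base-10 integers.
            var arg1 = parseInt(document.getElementById("arg1").value, 10);
            var arg2 = parseInt(document.getElementById("arg2").value, 10);

            // Fetch, compile and instantiate add.wasm in one streaming step.
            WebAssembly.instantiateStreaming(
                fetch('./add.wasm'))
                .then(result => {
                    // result holds both the compiled module and its instance.
                    const addTwo = result.instance.exports.addTwo;
                    console.log(`web assembly thinks the sum is ${addTwo(arg1, arg2)}`);
                })
                .catch(err => console.error('could not load add.wasm', err));
        }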
    </script>
</body>
</html>

Browsers won't let fetch load files for a page opened straight from disk (file://), and instantiateStreaming expects the server to send add.wasm with the application/wasm content type, so we need to serve our files locally over HTTP. For this, I used an Apache server I have installed on my PC, invoking it through the command line to serve our index.html file at localhost:8000. You can also use the static-server npm package. The final result on hitting that Add button is:

final result.png

And that -- friends, is your first WebAssembly project 🎉🎉🎉🥳🥳🥳.
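If you don't have Apache or an npm package handy, a few lines of Node.js can do the serving too. The sketch below is only one way to go about it, using nothing but Node's standard library; createStaticServer and contentTypeFor are helper names I invented for this example, and the important detail is that .wasm files go out as application/wasm.

```javascript
// Minimal static file server sketch (Node.js standard library only).
// `contentTypeFor` and `createStaticServer` are hypothetical helper names.
const http = require("node:http");
const fs = require("node:fs");
const path = require("node:path");

// instantiateStreaming rejects responses that are not application/wasm.
const MIME_TYPES = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".wasm": "application/wasm",
};

function contentTypeFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_TYPES[ext] || "application/octet-stream";
}

function createStaticServer(rootDir) {
  const root = path.resolve(rootDir);
  return http.createServer((req, res) => {
    const urlPath = new URL(req.url, "http://localhost").pathname;
    const filePath = path.join(root, urlPath === "/" ? "index.html" : urlPath);

    // Refuse any path that escapes the served folder.
    if (!filePath.startsWith(root + path.sep)) {
      res.writeHead(403);
      res.end("Forbidden");
      return;
    }

    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end("Not found");
        return;
      }
      res.writeHead(200, { "Content-Type": contentTypeFor(filePath) });
      res.end(data);
    });
  });
}
```

To use it, save this as a file inside wasmtest, add createStaticServer(".").listen(8000) at the bottom, run it with node, and open localhost:8000 in your browser.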

Is that all?

Not even close. Writing WebAssembly by hand, even with wat, is not a dream come true for anyone, and basic tasks can quickly become painful to implement. For example, Wasm has no built-in string type, so you can imagine how hard it is to implement anything that requires strings: you have to manage them yourself as raw bytes in memory.

I mentioned Wasm earlier as a compilation target. This means other languages are intended to compile to Wasm, and you can think of wat as just the first language to do so. Many languages that are easier to write can now compile to Wasm binaries, the most popular being (but not limited to) Rust and C/C++. But if you still want to take advantage of Wasm in the browser without learning something entirely new like Rust or C/C++, then I recommend checking out the AssemblyScript project. AssemblyScript is a TypeScript-like language that compiles to Wasm. This means you write in familiar JavaScript-like syntax and compile your code to WebAssembly, plus -- it comes with build tools out of the box.
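To get a feel for the string problem, here is a small sketch of what the JavaScript side typically has to do: encode the text into bytes, copy those bytes into the module's linear memory, and hand over just two numbers (where the bytes start and how many there are). No Wasm module is loaded here; I only create a standalone memory to show the idea, and writeString and readString are made-up helper names.

```javascript
// One page (64 KiB) of linear memory, the kind a Wasm module imports or exports.
const memory = new WebAssembly.Memory({ initial: 1 });
const bytes = new Uint8Array(memory.buffer);

// Hypothetical helper: copy a string into memory as UTF-8, return its byte length.
function writeString(text, offset) {
  const encoded = new TextEncoder().encode(text);
  bytes.set(encoded, offset);
  return encoded.length;
}

// Hypothetical helper: read `length` bytes starting at `offset` back into a string.
function readString(offset, length) {
  return new TextDecoder().decode(bytes.subarray(offset, offset + length));
}

const offset = 0;
const length = writeString("hello wasm", offset);

// A Wasm function would only ever receive these two integers.
console.log(offset, length); // 0 10
console.log(readString(offset, length)); // hello wasm
```

That bookkeeping is exactly the kind of thing higher-level languages and their toolchains take off your hands.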

Here is our addTwo function re-implemented in AssemblyScript:

export function addTwo(arg1: i32, arg2: i32): i32 {
    return arg1 + arg2
}

This post is already turning into a book so I will end with some practical use-cases of Wasm and some amazing achievements so far.

Wasm in real life

  • Image/video editing: You can boost the performance of any browser app that generally involves dragging and dropping things, like an image or video editor. Figma has moved the performance-critical parts of its browser-based editor to WebAssembly.
  • Video games: Obviously! Here is the famous Unreal Engine 3 running a scene in real-time at 30fps, thanks to a 3-day port from C++ to Wasm 😮:

unreal in browser.gif

Source.

daedalos.jpg

And wait... is that [Doom](en.wikipedia.org/wiki/Doom_(1993_video_game)) I am seeing?! 😱😱😱

The same goes for AR, VR, PWAs, simulation and virtually anything the browser was not made for. So get your hands on Wasm and see what you can build.

Thanks for reading. I hope you found this intro to WebAssembly very gentle. If you found it helpful, please like and share. Thanks.
