How the Heck Does JavaScript Work Behind the Scenes? What Is a JavaScript Engine?

javascript

July. 29. 2023

Have you ever wondered what goes on behind the scenes when you run your JavaScript code?

We often write JavaScript code that makes our web pages interactive and dynamic, but it's the JavaScript engine that brings that code to life. It's like the engine of a car: without it, the car won't move, and without the JavaScript engine, our code won't run.

Whether you're an experienced developer or just starting out, this article aims to help you become a more capable developer.

Let's dive into how a JavaScript engine works behind the scenes.

JavaScript Engine

The browser does not understand JavaScript on its own. It uses a JavaScript engine that reads and executes the source code so that the computer knows what to do. Each major browser ships with one.

The most famous JavaScript engines are Google's V8 and SpiderMonkey, which was originally written by none other than Brendan Eich, the creator of JavaScript, at Netscape Communications. SpiderMonkey is currently maintained by the Mozilla Foundation.

In simple words, a JavaScript Engine is responsible for interpreting and executing JavaScript code. It includes components like the parser, interpreter, compiler (in some cases), memory heap, and call stack. Its primary role is to handle the execution of JavaScript code and manage the program's flow.

By understanding how the engine works, we can write code that runs faster and works better.

In this article, we will specifically talk about Google's V8 Engine.

JavaScript Engine Visual Overview

Let's go deep into each step:

1- Tokenization

Tokenization, also known as Lexical Analysis, is the first phase in the JavaScript engine's process of interpreting and executing code. During this step, the source code is broken down into individual tokens, which are the smallest units of JavaScript syntax. These tokens are then used to build the Abstract Syntax Tree (AST), which we'll talk about later.

The source code comes from various sources, such as a script file, an inline script within an HTML page, or user input in a JavaScript console.
The JavaScript engine reads the source code character by character, treating it as a stream of characters.
Whitespace and comments are discarded during tokenization. They exist to make the code readable for developers and produce no tokens of their own.
Once whitespace and comments are set aside, the real tokenization work starts. During tokenization, the stream of characters is divided into meaningful language elements, such as keywords (e.g. if, else, function, var, while), identifiers (e.g. myVariable, calculateResult), operators (e.g. '+', '-', '*', '/', '&&', '||', '==='), literals (e.g. 7, "Hello, world!", false), and punctuators (special characters like '{' or ';').
The engine then generates a sequence of tokens based on the recognized language elements. These tokens form a list that represents the syntactical structure of the code.
If the engine encounters any characters that do not match the expected syntax, it will report a syntax error. The error message indicates the position in the code where the issue occurred, helping developers identify and fix the problem.

Let's look at a diagram to understand better:

Tokenization - Lexical Analysis

As you can see, it broke down the source code into individual tokens.
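
To make the steps above more concrete, here is a minimal sketch of a tokenizer written in JavaScript itself. It is a toy, not how V8's scanner is actually implemented: the names KEYWORDS, RULES and tokenize are made up for this illustration, and it only recognizes a small subset of the language.

```javascript
// Toy tokenizer: a sketch for illustration only, not V8's real scanner.
const KEYWORDS = new Set(["if", "else", "function", "var", "let", "const", "while", "return"]);

// Each rule is tried in order at the current position (sticky "y" flag).
const RULES = [
  { type: "whitespace", regex: /\s+/y },
  { type: "comment", regex: /\/\/[^\n]*|\/\*[\s\S]*?\*\//y },
  { type: "number", regex: /\d+(?:\.\d+)?/y },
  { type: "string", regex: /"(?:[^"\\\n]|\\.)*"|'(?:[^'\\\n]|\\.)*'/y },
  { type: "identifier", regex: /[A-Za-z_$][A-Za-z0-9_$]*/y },
  { type: "operator", regex: /===|!==|==|!=|<=|>=|&&|\|\||[-+*\/=<>!]/y },
  { type: "punctuator", regex: /[{}()\[\];,.]/y },
];

function tokenize(source) {
  const tokens = [];
  let pos = 0;
  while (pos < source.length) {
    let matched = false;
    for (const { type, regex } of RULES) {
      regex.lastIndex = pos; // only match exactly at the current position
      const match = regex.exec(source);
      if (match === null) continue;
      matched = true;
      pos = regex.lastIndex;
      // Whitespace and comments are consumed but produce no token.
      if (type !== "whitespace" && type !== "comment") {
        const value = match[0];
        const isKeyword = type === "identifier" && KEYWORDS.has(value);
        tokens.push({ type: isKeyword ? "keyword" : type, value });
      }
      break;
    }
    if (!matched) {
      // No rule fits: report where the problem is, like a real engine would.
      throw new SyntaxError(`Unexpected character '${source[pos]}' at position ${pos}`);
    }
  }
  return tokens;
}

console.log(tokenize("const total = 7 + 35; // answer"));
```

Calling tokenize on a line containing a character the rules don't know (for example '#') throws a SyntaxError that names the offending position, which mirrors the error reporting described above.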

2- Parsing (Abstract Syntax Tree - AST)

This step is called Syntax Analysis, also known as Parsing. It is the second step in the process of interpreting and executing the code. It takes the stream of tokens generated during lexical analysis and organizes them into a tree structure called the Abstract Syntax Tree (AST).

The AST represents the hierarchical structure of the code and provides the foundation for further interpretation and execution.

In simple words, the AST is the tree representation of your source code.

Let's take a look at the diagram to understand better:

Syntax Analysis Parsing

AST

The nodes in the AST represent expressions, statements, function declarations, variable declarations, and more.

So, what's the role of the AST in execution? The AST serves as an intermediate representation of the code, which makes it easier for the JavaScript engine to analyze and optimize the code before actually executing it. The tree hierarchy allows the engine to navigate through the code's structure systematically.

Note that an AST doesn't store every detail of the source text. For example, it doesn't store punctuation such as semicolons and braces, or whitespace.
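
To get a feel for what such a tree looks like, here is a hand-written AST for the statement const answer = 7 + 35; as a plain JavaScript object. The node shapes follow the ESTree convention that many JavaScript tools use, in simplified form. V8's internal AST is a different data structure, so treat this as an illustration only. The helper collectNodeTypes is a made-up name for this sketch.

```javascript
// Hand-written, simplified ESTree-style AST for: const answer = 7 + 35;
// Notice there is no node for the semicolon or for the spaces.
const ast = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "answer" },
          init: {
            type: "BinaryExpression",
            operator: "+",
            left: { type: "Literal", value: 7 },
            right: { type: "Literal", value: 35 },
          },
        },
      ],
    },
  ],
};

// Depth-first walk from the root, collecting node types in visiting order.
function collectNodeTypes(node, types = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectNodeTypes(child, types));
  } else if (node !== null && typeof node === "object") {
    if (typeof node.type === "string") types.push(node.type);
    Object.values(node).forEach((child) => collectNodeTypes(child, types));
  }
  return types;
}

console.log(collectNodeTypes(ast));
```

The walk starts at the Program root and follows the hierarchy down to the two Literal leaves, which is the kind of systematic navigation the tree structure makes possible.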

There is an online tool called AST Explorer. It is a great tool to learn and work with AST.

3- Intermediate Code (Bytecode)

After creating the AST, the Interpreter comes into action and starts converting the AST into the intermediate code (commonly referred to as bytecode). This is not yet the machine code that the computer can directly execute, but a lower-level representation of the code that is closer to machine code than the original source code.

The Interpreter visits each node of the AST in a well-defined order. It starts from the root node and follows the tree hierarchy.

So now, you must be thinking, "Why do we need the Bytecode? Why not just convert the code directly to machine code?"

Bytecode is a compact, low-level representation of the code that does not depend on any particular CPU. The engine can generate it quickly on any platform it supports, without first doing the slower work of platform-specific compilation to machine code.

This portability allows JavaScript code to run consistently across various devices and environments, making it suitable for web browsers and other JavaScript runtime environments.
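
Here is a minimal sketch of the idea: a tiny compiler that walks an expression tree and emits instructions, plus a tiny interpreter that runs them. The opcodes PUSH, ADD and MUL, and the functions compile and run, are invented for this example. V8's real bytecode has a much richer instruction set and is not a simple stack machine like this one.

```javascript
// Toy bytecode: invented opcodes, not V8's real instruction set.
const OPCODES = { "+": "ADD", "*": "MUL" };

// Visit each node starting from the root and emit instructions for it.
function compile(node, code = []) {
  switch (node.type) {
    case "Literal":
      code.push(["PUSH", node.value]);
      break;
    case "BinaryExpression": {
      const opcode = OPCODES[node.operator];
      if (opcode === undefined) throw new Error(`Unsupported operator: ${node.operator}`);
      compile(node.left, code); // operands first...
      compile(node.right, code);
      code.push([opcode]); // ...then the operation that combines them
      break;
    }
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
  return code;
}

// A tiny stack-based interpreter for the instructions above.
function run(code) {
  const stack = [];
  for (const [op, arg] of code) {
    if (op === "PUSH") {
      stack.push(arg);
    } else {
      const right = stack.pop();
      const left = stack.pop();
      stack.push(op === "ADD" ? left + right : left * right);
    }
  }
  return stack.pop();
}

// AST for the expression 2 + 3 * 4
const expression = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Literal", value: 2 },
  right: {
    type: "BinaryExpression",
    operator: "*",
    left: { type: "Literal", value: 3 },
    right: { type: "Literal", value: 4 },
  },
};

const bytecode = compile(expression);
console.log(bytecode.map((instruction) => instruction.join(" ")).join(", "));
console.log(run(bytecode)); // 14
```

Nothing in the instruction list depends on the machine it runs on. Only the run function would need to exist on each platform.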

Still with me? Bravo! 🚀

4- Call Stack, Execution Context and Memory Heap

The Call Stack is a critical data structure that the JavaScript engine uses to keep track of the currently executing tasks. It plays a crucial role in managing the flow of function calls and their corresponding Execution Contexts.

Throughout the interpretation phase, the engine creates an Execution Context for the global code and for each function call, and pushes it onto the call stack. When a function returns, its context is popped off again.

The execution context represents the environment in which the code is executed and includes information such as variables, function arguments, the scope chain, and the value of the this keyword.

Global Execution Context:

The first Execution Context that is created is the Global Execution Context. It represents the context in which the entire script runs and includes the global variables and functions. The Global Execution Context is created before any code is executed.

Function Execution Context:

As the JavaScript engine encounters function calls during the AST interpretation, it creates Function Execution Contexts. These Function Execution Contexts contain information specific to the function call, such as local variables and arguments.
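
The pushing and popping can be imitated with an ordinary array. In the sketch below, callStack, snapshots and tracked are hypothetical helpers written for this article. The engine's real call stack is internal and cannot be accessed like this, so the array is only a model of it.

```javascript
// Our own model of the call stack; the real one is internal to the engine.
const callStack = ["global"];
const snapshots = [];

// Record what the modeled stack looks like right now.
const snapshot = () => snapshots.push(callStack.join(" > "));

// Wrap a function so that entering it pushes a frame and leaving it pops one.
function tracked(name, fn) {
  return (...args) => {
    callStack.push(name);
    snapshot();
    try {
      return fn(...args);
    } finally {
      callStack.pop();
      snapshot();
    }
  };
}

const multiply = tracked("multiply", (a, b) => a * b);
const square = tracked("square", (n) => multiply(n, n));
const printSquare = tracked("printSquare", (n) => `Result: ${square(n)}`);

const output = printSquare(4);
console.log(output); // Result: 16
snapshots.forEach((line) => console.log(line));
```

The snapshots grow from "global > printSquare" up to "global > printSquare > square > multiply" and then shrink back to "global", which is the last-in, first-out behaviour of the call stack.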

Memory Heap

The engine manages memory allocation for variables and objects in the memory heap. Objects, arrays, and functions are allocated there, and variables hold references to them. As the program runs, the engine updates the values of variables, manipulates data, and performs various operations based on the bytecode instructions.

I'm going to write a detailed article on the Call Stack and Memory Heap in the future. 🤞
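
One visible consequence of objects living in the heap is that two variables can point to the same object. The short example below shows this. Where exactly an engine stores primitive values is an implementation detail, so the example only relies on behaviour you can observe from JavaScript itself. The variable names are arbitrary.

```javascript
// Primitives behave as independent values: changing b does not affect a.
let a = 10;
let b = a;
b = 20;

// Objects are shared by reference: alias and original name the same object.
const original = { name: "Ada" };
const alias = original;
alias.name = "Grace";

// A spread copy creates a second, separate object.
const copy = { ...original };
copy.name = "Linus";

console.log(a, b); // 10 20
console.log(original.name); // Grace
console.log(alias === original); // true
console.log(copy === original); // false
```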

IMPORTANT NOTE ⚠️: The creation of Execution Contexts is intertwined with execution itself. The JavaScript engine creates the Global Execution Context when it starts running the script's global code, and a new Function Execution Context each time it reaches a function call.

5- Profiling

During bytecode execution, the JavaScript engine employs a profiler to gather runtime information about the code's execution patterns.

Based on the profiling information, the JavaScript engine identifies the hot paths, which are sections of code that are executed most frequently and are potential candidates for optimization.

For example, I'm calling the sum function 100 times in the code below, so the engine may come to consider it a hot function (real engines use their own internal thresholds). Cool name for a function btw :)

// A small function that is called repeatedly inside a loop
const sum = (a, b) => a + b;
for (let i = 0; i < 100; i++) {
  console.log(sum(1, 2)); // logs 3 on every iteration
}
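
The counting idea behind profiling can be sketched in a few lines. This is only a model: HOT_THRESHOLD, withProfiling and isHot are names invented for this example, the threshold of 50 is arbitrary, and the heuristics a real engine uses are internal and far more sophisticated.

```javascript
// Toy profiler: counts calls and flags functions that cross a threshold.
const HOT_THRESHOLD = 50; // made-up number, real engines use their own heuristics
const callCounts = new Map();

// Wrap a function so every call increments its counter.
function withProfiling(name, fn) {
  callCounts.set(name, 0);
  return (...args) => {
    callCounts.set(name, callCounts.get(name) + 1);
    return fn(...args);
  };
}

const isHot = (name) => (callCounts.get(name) || 0) >= HOT_THRESHOLD;

const sum = withProfiling("sum", (a, b) => a + b);
const rarelyUsed = withProfiling("rarelyUsed", (x) => x * 2);

for (let i = 0; i < 100; i++) {
  sum(1, 2);
}
rarelyUsed(21);

console.log(isHot("sum")); // true
console.log(isHot("rarelyUsed")); // false
```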

6- Optimizing Compiler

Once the JavaScript engine finds the parts of your code that are used most often (hot paths), it starts the compilation process. The optimizing compiler takes those hot paths and transforms them into super-fast machine code, specially tailored for your computer's hardware.

This optimized machine code is way quicker than reading and interpreting the original bytecode. It's like a turbo boost for your code.

The CPU directly executes this optimized machine code. Because it's fine-tuned for your computer's specific abilities, it runs much faster and gives your code a significant performance boost.

7- Deoptimize (Optional)

In some cases, the engine might encounter situations where the assumptions made during compilation no longer hold true. For example, a variable that always held numbers may suddenly hold a string. When this happens, the engine can deoptimize and fall back from the machine code to the bytecode, allowing it to reapply optimizations with updated information.
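
The "assume, check, fall back" pattern can be modeled in plain JavaScript. The sketch below is an analogy, not a view into the engine: genericAdd, makeSpeculativeAdd and getDeoptCount are hypothetical names, and whether a real engine actually deoptimizes in a given situation depends on its internal heuristics.

```javascript
// Generic version: works for any operand types, like interpreted bytecode.
function genericAdd(a, b) {
  return a + b;
}

// Model of an "optimized" function that assumes both arguments are numbers.
function makeSpeculativeAdd() {
  let deoptCount = 0;
  const add = (a, b) => {
    // Guard: check that the assumption made at "compile time" still holds.
    if (typeof a === "number" && typeof b === "number") {
      return a + b; // fast path, specialized for numbers
    }
    // Assumption broken: fall back to the generic version.
    deoptCount++;
    return genericAdd(a, b);
  };
  return { add, getDeoptCount: () => deoptCount };
}

const { add, getDeoptCount } = makeSpeculativeAdd();

console.log(add(1, 2)); // 3
console.log(getDeoptCount()); // 0
console.log(add("1", "2")); // 12
console.log(getDeoptCount()); // 1
```

A practical takeaway from this model is that calling a function with consistent argument types keeps it on the fast path.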

So, this was my attempt to explain the JavaScript Engine. If you didn't understand everything, don't worry! It often takes a few attempts before these concepts fully sink in, and that's perfectly normal. It took me a few weeks to grasp these ideas myself.

Understanding the JavaScript Engine and the inner workings of the language empowers you to write efficient and reliable code. It enables you to make better decisions, troubleshoot issues effectively, and see the bigger picture of how JavaScript truly works.

Happy hacking!