
Processes and Threads in Node.js

Understanding the difference between a process and a thread is a fundamental requirement for backend engineering. This knowledge dictates how you design scalable systems, manage memory, and prevent application crashes.

What is a Process?

By definition, a process is a program in execution.

When you write Node.js code, it is just a text file resting on your hard drive. When you type node server.js in your terminal, the Operating System loads the node executable into RAM and starts it, and node then reads and runs your script. That running instance is a Process.

Components of a Process in the OS

When the OS creates a process, it assigns several critical components to it:

  1. PID (Process Identifier): Every process gets a numerical ID from the OS that is unique among the processes currently running (a PID can be reused after its process exits).
    • Example: PID 1024 might be your Google Chrome browser, while PID 2201 is your active Node.js server.
    • Why it matters: The Operating System strictly uses the PID to:
      • Track the process: Monitoring how much RAM and CPU it is actively consuming.
      • Schedule the CPU: Deciding exactly when this specific PID gets its turn on the CPU core (Context Switching).
      • Kill the process: When you run a command like kill -9 2201 in your terminal, the OS uses the PID to forcefully terminate it.
      • Allocate resources: Recording which resources the process holds, such as a bound network port (like Port 3000) or open files.
  2. Memory Allocation: Every process is assigned its own isolated memory space. Besides the program code and static data, it has two main working areas:
    • Stack Memory: Stores function call frames and their local variables. It is highly organized (last in, first out) and fast. Each thread in the process gets its own stack.
    • Heap Memory: A large, unorganized pool of memory used for dynamic allocation (e.g., objects, arrays, and complex data structures).
  3. Program Counter (PC): A CPU register that holds the memory address of the next instruction to execute. The OS saves and restores it on every context switch, so execution resumes exactly where it left off.
  4. File Descriptors: When a process opens a file or a network socket, the OS tracks it using an integer index called a File Descriptor.
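
Several of these components can be inspected from inside a running Node.js process. The following is a minimal sketch, not a definitive tool: the helper name describeProcess is hypothetical, and opening the node binary itself (process.execPath) is just a convenient assumption for a file that is known to exist.

```javascript
const fs = require('fs');

// Hypothetical helper: collects a few of the per-process values described above.
function describeProcess() {
  // rss = total memory resident in RAM; heapUsed = V8 heap currently in use.
  const { rss, heapUsed } = process.memoryUsage();

  // Opening a file returns an integer file descriptor.
  // It is usually above 2, because 0, 1 and 2 are stdin, stdout and stderr.
  const fd = fs.openSync(process.execPath, 'r');
  fs.closeSync(fd); // release the descriptor once we are done with it

  return { pid: process.pid, rssBytes: rss, heapUsedBytes: heapUsed, fd };
}

console.log(describeProcess());
```

Running the script twice will normally print a different pid each time, because every run is a new process.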

[!NOTE] Analogy: Think of a process as a physical factory building. It has its own walls, its own materials (Stack/Heap memory), a manager telling workers what to do next (Program Counter), and its own dedicated address (PID). If the factory next door burns down, your factory is safe because the memory is isolated.


What is a Thread?

A thread is the smallest unit of execution that the OS can schedule, and it always lives inside a process. Threads are often described as lightweight processes.

A single process can contain many threads.

  • 1 Core = 1 Active Thread at a time (but it serves many threads and processes via context switching).
  • 1 Process = Can have Many Threads (sharing the process's resources).

Key Characteristics of a Thread:

  1. Shared Memory: All threads within the same process share the same memory space (the Heap) and File Descriptors, so they can read and modify the same variables. Each thread still has its own stack and program counter.
  2. Lightweight: Creating a new thread is much faster and requires significantly fewer resources than creating an entirely new process.
  3. Risky: Because memory is shared, two threads might try to modify the same variable at the same time, leading to unpredictable bugs known as Race Conditions.

[!NOTE] Analogy: If the process is the factory building, the threads are the individual workers inside the factory. They all share the same tools and workspace (shared memory).
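
The shared-memory risk can be sketched in Node.js itself using the worker_threads module, which provides real threads. In this sketch several workers increment one counter stored in a SharedArrayBuffer, and Atomics.add keeps each update indivisible. The function name countWithWorkers and the inline worker source are illustrative assumptions, not part of any library.

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline (eval: true) so the sketch fits in one file.
const workerSource = `
  const { workerData } = require('worker_threads');
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    // Atomic read-modify-write. A plain counter[0]++ here could lose
    // updates when two threads touch the value at the same time.
    Atomics.add(counter, 0, 1);
  }
`;

// Hypothetical helper: runs `workers` threads that each add `increments`.
function countWithWorkers(workers, increments) {
  const shared = new SharedArrayBuffer(4); // one 32-bit integer, shared by all threads
  const exits = [];
  for (let w = 0; w < workers; w++) {
    exits.push(new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { shared, increments },
      });
      worker.once('error', reject);
      worker.once('exit', resolve);
    }));
  }
  // Read the counter only after every worker has exited.
  return Promise.all(exits).then(() => Atomics.load(new Int32Array(shared), 0));
}

countWithWorkers(4, 10000).then((total) => console.log('final count:', total));
```

With the atomic update, the final count equals workers × increments. The design point is that shared memory is only safe when every writer goes through a synchronized operation.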

The Node.js Architecture: Single-Threaded

You will often hear: "Node.js is single-threaded." This means that your JavaScript code is executed by a single main thread.

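If 1,000 users hit your API simultaneously, Node.js does not create 1,000 threads to handle them (unlike the thread-per-request model traditionally used by servers such as Apache or Tomcat). Instead, the single main thread handles all 1,000 users concurrently using the Event Loop and asynchronous I/O: while one request waits on the database or the network, the thread moves on to the next.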
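
A rough way to see this is to simulate 1,000 requests that each wait on slow I/O. The sketch below uses a timer as a stand-in for a database or network call; fakeIo, handleRequest, and the 100 ms delay are assumptions chosen for illustration.

```javascript
// Stand-in for non-blocking I/O (e.g. a database query) that takes `ms`.
function fakeIo(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical request handler: waits on I/O, then builds a response.
async function handleRequest(id) {
  await fakeIo(100); // the main thread is free to start other requests here
  return `response ${id}`;
}

async function main() {
  const started = Date.now();
  const ids = Array.from({ length: 1000 }, (_, i) => i);
  // All 1,000 handlers are started on the same thread before any finishes.
  const responses = await Promise.all(ids.map(handleRequest));
  const elapsed = Date.now() - started;
  console.log(`${responses.length} responses in about ${elapsed} ms`);
  return { count: responses.length, elapsed };
}

const done = main();
```

Because the waits overlap, the total time stays close to a single 100 ms delay instead of 1,000 delays back to back. The same thread would stall, however, if a handler did heavy synchronous computation instead of waiting.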

Why Single-Threaded?

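  1. No Data Races: Since only one thread executes your JavaScript, two pieces of code never touch the same variable at the same instant, so you do not need locks around ordinary variables. Logical race conditions can still occur between asynchronous operations, for example when two requests read and then update the same record across an await.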
  2. Low Memory Footprint: Thread-per-connection servers reserve stack memory for every thread, often 1MB or more each. At 2MB-4MB per thread, 10,000 concurrent connections would need roughly 20GB-40GB for thread overhead alone. Node.js keeps only a socket and a few small objects per connection, so the same load fits in a small fraction of that memory.

Is Node.js truly single-threaded?

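No. While your JS code runs on a single thread, Node.js relies on a C library called libuv under the hood. Libuv maintains a hidden Thread Pool (4 threads by default, configurable through the UV_THREADPOOL_SIZE environment variable) for work that would otherwise block, such as File System operations (fs), dns.lookup, and CPU-heavy cryptography (crypto) and compression (zlib) calls. Network sockets do not use the pool; they rely on the OS's non-blocking I/O notifications.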
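
One way to observe the pool is to start several expensive crypto.pbkdf2 calls at once. This is a sketch under assumed parameters (the password, salt, and iteration count are arbitrary), and the timings it prints depend on the machine.

```javascript
const crypto = require('crypto');

// Hypothetical helper: runs one key derivation on the libuv thread pool
// and reports how long it took.
function hashAsync(label) {
  return new Promise((resolve, reject) => {
    const started = Date.now();
    crypto.pbkdf2('password', 'salt', 100000, 64, 'sha512', (err, key) => {
      if (err) return reject(err);
      resolve({ label, ms: Date.now() - started, bytes: key.length });
    });
  });
}

// Four tasks match the default pool size, so they can run in parallel
// while the main thread stays free.
const results = Promise.all([1, 2, 3, 4].map(hashAsync));

results.then((rows) => {
  for (const row of rows) {
    console.log(`task ${row.label}: ${row.ms} ms`);
  }
});
```

On a machine with at least four cores, the four tasks usually finish in roughly the same time. Starting a fifth task typically makes it wait for a free pool thread unless UV_THREADPOOL_SIZE is raised.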


Multi-threading and Multi-processing in Node.js

To scale a Node.js application, you have two primary built-in tools to bypass the single-thread limitation:

1. Worker Threads (worker_threads module)

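Available since Node.js 10.5 (experimental) and stable since Node.js 12, Worker Threads allow you to execute JavaScript in parallel across multiple threads within the same process. Each worker runs its own V8 instance and event loop, so workers exchange data through messages and share memory only when you explicitly pass a SharedArrayBuffer. They are primarily used for offloading heavy CPU-bound tasks (like image processing) so they don't block the main Event Loop.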
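
As a minimal sketch of offloading, the example below moves a CPU-bound loop into a worker and receives the result as a message. The summing task stands in for real work such as image processing, and the helper name sumInWorker and the inline worker source are assumptions made to keep the example in one file.

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline (eval: true); real projects usually
// put this in its own file and pass the file path instead.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // CPU-bound work: sum the integers from 1 to workerData.
  let sum = 0;
  for (let i = 1; i <= workerData; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Hypothetical helper: runs the loop on a separate thread.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

sumInWorker(1e7).then((sum) => console.log('sum from worker:', sum));
console.log('main thread is not blocked while the worker computes');
```

The second console.log runs before the worker's result arrives, which shows that the Event Loop stays responsive during the computation.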

2. The Cluster Module (cluster)

The cluster module allows you to create child processes (workers) that run simultaneously and share the same server port. This is the standard way to utilize multi-core systems in Node.js.

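  • Master Process (called the primary since Node.js 16): Accepts connections on the port (e.g., 3000) and hands them to the workers, round-robin by default on every platform except Windows.
  • Worker Processes: Handle the connections the master passes to them (usually one worker per CPU core). If a worker process crashes, it doesn't affect the others, and the master can spawn a new one by listening for the exit event and calling cluster.fork() again.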
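
The master/worker split can be sketched as follows. This is an illustrative outline, assuming Node.js 16 or later (for cluster.isPrimary); the function name startCluster, the response text, and the restart policy are assumptions, and a production setup would normally limit how fast crashed workers are restarted.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: forks one worker per CPU core by default.
function startCluster(port, workerCount = os.cpus().length) {
  if (cluster.isPrimary) {
    for (let i = 0; i < workerCount; i++) cluster.fork();

    // Respawning is not automatic: the primary has to react to 'exit'.
    cluster.on('exit', (worker, code, signal) => {
      console.log(`worker ${worker.process.pid} exited (${signal || code}), forking a replacement`);
      cluster.fork();
    });
  } else {
    // Each worker creates its own server; the cluster module lets them share the port.
    http.createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    }).listen(port);
  }
}

// Every process (primary and workers) runs this file, so the call below
// must be reachable in both. Uncomment it to start the server.
// startCluster(3000);
```

With the call enabled, repeated requests to port 3000 are answered by different worker PIDs, and killing one worker with kill makes the primary fork a replacement.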