# Introduction to Windows Internals

<figure><img src="/files/kKzCFF98C9m4Bhnt8Qjd" alt=""><figcaption><p>Source: Windows Internals, 7th Edition, Part 1</p></figcaption></figure>

### **The Core Architecture**

The Windows operating system is a complex, layered environment. For most users, its inner workings are a black box: you click an icon, and a program runs. For the developer, security researcher, or system administrator, however, understanding the underlying mechanisms is essential for building robust software, analyzing malicious code, and troubleshooting complex system issues. This series serves as a guide to that world, peeling back the layers of the OS to reveal its core components and behaviors.

In this first installment, we establish a foundational map of the Windows architecture. We begin with the single most critical concept: the separation of the system into user mode and kernel mode. From there, we explore the key architectural components, define the fundamental **actors** of the OS, such as processes and threads, and finally look at the interfaces applications use to communicate with the kernel.

### Kernel mode vs User mode

* The fundamental design of Windows, essential for system stability and security, is its separation into two processor access modes: **user mode** and **kernel mode**. This division dictates what code can do and what system resources it can access.
* User application code runs in user mode, whereas OS code (such as system services and device drivers) runs in kernel mode. Kernel mode refers to a mode of execution in a processor that grants access to all system memory and all CPU instructions. Some processors differentiate between such modes by using the term code privilege level or ring level, while others use terms such as supervisor mode and application mode. Regardless of what it’s called, by providing the operating system kernel with a higher privilege level than user mode applications have, the processor provides a necessary foundation for OS designers to ensure that a misbehaving application can’t disrupt the stability of the system as a whole.
* Windows uses only two levels because some hardware architectures, such as ARM today and MIPS/Alpha in the past, implemented only two privilege levels. Settling on the lowest common denominator allowed for a more efficient and portable architecture, especially as the other x86/x64 ring levels do not provide the same guarantees as the ring 0/ring 3 divide.

### System Architecture

The Windows architecture can be broadly divided into two main modes: **User Mode** and **Kernel Mode**. This separation is crucial for system stability and security.

#### User Mode

User mode is a restricted environment where most applications execute. It provides a safe and isolated space for programs to run, preventing them from directly accessing critical system resources or interfering with other applications.

* **Applications:** These are the programs that users interact with, such as web browsers, word processors, and games. They run in their own virtual address space, isolated from each other.
* **Subsystems and support libraries:** These provide a bridge between user-mode applications and the kernel-mode services. Key pieces include:
  * **Windows (Win32) Subsystem (csrss.exe):** Supports window management, graphics, and input events together with its kernel-mode counterpart (win32k.sys) and the subsystem DLLs. It's the primary subsystem for most desktop applications.
  * **NTDLL (ntdll.dll):** Not a subsystem in its own right, but a system support library that provides the interface between user-mode applications and the Windows kernel. It contains system call stubs that transition execution to kernel mode.
  * **Other Subsystems:** Older versions of Windows also shipped a POSIX subsystem for compatibility with other operating systems; it has since been discontinued.

Applications reach these services through libraries such as kernel32.dll and ntdll.dll, which are files ending with the .dll extension. A **Dynamic Link Library (DLL)** is a file containing a collection of code and data that can be used by multiple programs at the same time. This is a core concept in Windows.

Instead of compiling the same common functions (like those for creating a window or opening a file) into every single application, Windows places them in DLLs. When an application needs one of these functions, it dynamically loads the DLL into its address space at runtime. The primary advantage of this model is memory efficiency; if ten different programs all need to use kernel32.dll, Windows ensures that only one copy of that DLL's code resides in physical RAM, and it is shared among all ten processes.

#### Kernel Mode

Kernel mode is the privileged execution mode where the core operating system components reside. It has direct access to the system's hardware and memory. This mode is responsible for managing system resources, enforcing security policies, and providing services to user-mode applications.

* **Executive:** The core of the Windows kernel, containing fundamental services. Key components include:
  * **Object Manager:** Manages all system resources as objects, providing a consistent way to access and control them. Examples include files, processes, threads, and synchronization primitives.
  * **Virtual Memory Manager (VMM):** Manages the system's virtual memory, providing each process with its own private address space. It handles memory allocation, paging, and protection.
  * **Process Manager:** Creates, manages, and terminates processes and threads.
  * **Security Reference Monitor (SRM):** Enforces security policies and access control. It checks whether a process has the necessary permissions to access a particular object.
  * **I/O Manager:** Handles all input/output operations, providing a consistent interface for device drivers.
  * **Cache Manager:** Improves performance by caching frequently accessed data in memory.
  * **Plug and Play (PnP) Manager:** Detects and configures hardware devices.
  * **Power Manager:** Manages the system's power consumption.
* **Kernel:** The lowest level of the kernel mode, responsible for basic system functions such as thread scheduling, interrupt handling, and exception dispatching. It provides the foundation upon which the Executive is built.
* **Device Drivers:** Software components that allow the operating system to communicate with hardware devices. They operate in kernel mode and are responsible for translating generic I/O requests into device-specific commands.
* **Hardware Abstraction Layer (HAL):** Provides an abstraction layer between the kernel and the underlying hardware. This allows Windows to run on different hardware platforms without requiring significant modifications to the kernel.

A critical characteristic of kernel mode is that its memory space is not process-relative. After all, it's the same system kernel and drivers servicing every process. This "system space" is where the kernel itself, the Hardware Abstraction Layer (HAL), and all kernel drivers reside once loaded. This architecture provides automatic protection from direct user-mode access and also means that kernel components have a system-wide impact—a memory leak in a driver, for instance, will not be freed until the system reboots. User-mode processes, by contrast, can never leak anything beyond their own lifetime, as the kernel is responsible for closing all handles and freeing all private memory for a terminated process.

The relationship between these modes and their core components is best visualized through the Windows architecture flowchart.

![](/files/XhQvrfTg9V7yDzv1KPwv)

The architecture is a clear hierarchy. The Executive components, running in kernel mode, provide the core services like object and memory management that are foundational to everything else in the system. User-mode applications and their subsystems sit on top, interacting with the kernel through a well-defined boundary.

In modern versions of Windows, there is another architectural layer that can exist even below the kernel, providing a foundational trust boundary for the entire system. This is **the hypervisor layer**.

#### The hypervisor layer

This layer is composed of a single component: the hypervisor itself. There are no drivers or other modules in this environment. That said, the hypervisor is itself composed of multiple internal layers and services, such as its own memory manager, virtual processor scheduler, interrupt and timer management, synchronization routines, partition (virtual machine instance) management, inter-partition communication (IPC), and more.

With the division between user and kernel mode established, the next logical question is how a user-mode application requests services from the protected kernel. This is accomplished through a mechanism known as a **system call**.

### System Calls

User-mode applications cannot directly access kernel-mode resources. Instead, they must use **system calls** (also known as *syscalls* or *service calls*) to request services from the kernel.

1. An application calls a function in a DLL. Often this is a high-level function in a library such as `kernel32.dll`, which in turn calls the corresponding internal function in `ntdll.dll` to prepare for the kernel transition.
2. This function prepares the necessary parameters and executes a special instruction (e.g., `SYSENTER` or `SYSCALL`) that causes a transition to kernel mode.
3. The kernel receives the system call, validates the parameters, and performs the requested operation.
4. The kernel returns the results to the user-mode application, and execution resumes in user mode.
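
The four steps above can be sketched as a toy model in Python. This is only an illustration of the control flow, not how `ntdll.dll` or the kernel are implemented: the service numbers, status values, and every function name below are hypothetical, and an ordinary function call stands in for the `SYSCALL`/`SYSENTER` instruction.

```python
# Toy model of system call dispatch. Every name and number here is
# hypothetical and chosen only for illustration.

STATUS_SUCCESS = 0
STATUS_INVALID_SYSTEM_SERVICE = -1
STATUS_INVALID_PARAMETER = -2


# --- "Kernel side" -----------------------------------------------------
def _kernel_close(handle):
    return STATUS_SUCCESS


def _kernel_query_time():
    return 123456789  # fixed fake value keeps the sketch deterministic


# Service number -> (kernel routine, expected argument count)
SERVICE_TABLE = {
    0: (_kernel_close, 1),
    1: (_kernel_query_time, 0),
}


def kernel_dispatch(service_number, args):
    """Step 3: validate the request, then run the requested service."""
    entry = SERVICE_TABLE.get(service_number)
    if entry is None:
        return STATUS_INVALID_SYSTEM_SERVICE
    routine, expected_argc = entry
    if len(args) != expected_argc:
        return STATUS_INVALID_PARAMETER
    return routine(*args)  # Step 4: the result flows back to the caller


def syscall(service_number, *args):
    """Step 2: stand-in for the instruction that enters kernel mode."""
    return kernel_dispatch(service_number, args)


# --- "User side" -------------------------------------------------------
def nt_close_stub(handle):
    """Low-level stub: loads the service number and makes the transition."""
    return syscall(0, handle)


def close_handle(handle):
    """Step 1: high-level wrapper an application would normally call."""
    return nt_close_stub(handle) == STATUS_SUCCESS
```

Calling `close_handle(4)` walks the whole chain, while `syscall(99)` shows the kernel rejecting an unknown service number instead of trusting its caller.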

### **Core Operating System Objects**

#### **The Object-Based Architecture: Objects and Handles**

Windows uses an object-based architecture, where all system resources are represented as objects. Each object has a type, attributes, and methods. This provides a consistent and secure way to manage resources.

* **Object Types:** Examples include files, processes, threads, mutexes, semaphores, and events.
* **Object Handles:** User-mode applications access objects through handles, which are opaque identifiers that represent a specific object.
* **Access Control:** The Security Reference Monitor (SRM) uses Access Control Lists (ACLs) to determine which users or groups have access to specific objects and what operations they are allowed to perform.
* An ***object*** is a **data structure** that represents a **system resource**, such as a **file**, **thread**, or **graphic image**. Your application can't directly access object data or the system resource that an object represents. Instead, it must obtain an **object *handle***, which it can use to examine or modify the system resource. Each handle has an entry in an internally maintained table. Those entries contain the addresses of the resources and the means to identify the resource type.
* A kernel object is a single, run-time instance of a statically defined object type. An object type comprises a system-defined data type, functions that operate on instances of the data type, and a set of object attributes.
* Kernel code can use direct pointers to objects.
* User-mode code can only obtain handles to objects.
* Objects are reference counted.

The most fundamental difference between an object and an ordinary data structure is that the internal structure of an object is opaque. You must call an object service to get data out of or put data into an object. You can’t directly read or change data inside an object. This difference separates the underlying implementation of the object from code that merely uses it, a technique that allows object implementations to be changed easily over time.

Not all data structures in the Windows OS are objects. Only data that needs to be shared, protected, named, or made visible to user-mode programs (via system services) is placed in objects. Structures used by only one component of the OS to implement internal functions are not objects.
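
To make the handle idea concrete, here is a minimal sketch of a per-process handle table with reference-counted objects. It is a simplified model under stated assumptions: the class names, the type check in `lookup`, and the way handle values are generated are all illustrative and do not mirror the real Object Manager's data structures.

```python
class KernelObject:
    """Stand-in for an object owned by the kernel (hypothetical layout)."""

    def __init__(self, type_name, name=None):
        self.type_name = type_name
        self.name = name
        self.ref_count = 0


class HandleTable:
    """Per-process table mapping opaque handle values to objects."""

    def __init__(self):
        self._entries = {}
        self._next_handle = 4  # illustrative numbering scheme

    def open(self, obj):
        """Create a handle to obj and take a reference on it."""
        handle = self._next_handle
        self._next_handle += 4
        self._entries[handle] = obj
        obj.ref_count += 1
        return handle

    def lookup(self, handle, expected_type):
        """Resolve a handle the way a service would before using it."""
        obj = self._entries.get(handle)
        if obj is None:
            raise ValueError("invalid handle")
        if obj.type_name != expected_type:
            raise TypeError("object type mismatch")
        return obj

    def close(self, handle):
        """Drop the handle; return True if the object can now be destroyed."""
        obj = self._entries.pop(handle, None)
        if obj is None:
            raise ValueError("invalid handle")
        obj.ref_count -= 1
        return obj.ref_count == 0


# Two "processes" open the same event object through their own tables.
event = KernelObject("Event", "MyEvent")
table_a, table_b = HandleTable(), HandleTable()
handle_a = table_a.open(event)
handle_b = table_b.open(event)
```

Because each table is private, a handle value only has meaning inside the process that owns the table, and the object stays alive until the last handle to it is closed.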

#### Processes

* A **program** is a static sequence of instructions, whereas a **process** is a container for a set of resources used when executing an instance of the program.

A **process** is the program *plus* its **execution context**. The execution context includes the state of the processor (processor context), which is the value of its program counter and all of its registers. It also includes the process's **memory map**, which identifies the various regions of memory that have been allocated to the process.

The **memory map** of a process includes:

* ***Text***: the machine instructions (the compiled program).
* ***Data***: initialized static and global data.
* ***Bss***: uninitialized static data (e.g., global uninitialized strings, numbers, structures). The size of this region is contained within the program.
* ***Heap***: dynamically allocated memory (obtained through memory allocation requests, such as ***malloc*** or ***new***).
* ***Stack***: the call stack, which holds not just return addresses but also local variables, temporary data, and saved registers.

A **process** is a containment and management object that represents a running instance of a program. The frequently used phrase “a process runs” is inaccurate: processes don’t run, they manage. Threads are the ones that execute code and technically run. From a high-level perspective, a process owns the following:

* An executable program, which contains the initial code and data used to execute code within the process. This is true for most processes, but some special ones don’t have an executable image (created directly by the kernel).
* A private virtual address space, used for allocating memory for whatever purposes the code within the process needs it.
* An ***access token*** (called primary token), which is an object that stores the security context of the process, used by threads executing in the process (unless a thread assumes a different token by using ***impersonation***), and is used for any kind of access checks by default.
* A private handle table to executive objects, such as events, semaphores, and files.
* One or more threads of execution. A normal user-mode process is created with one thread (executing the classic main/WinMain function). A user mode process without threads is mostly useless, and under normal circumstances will be destroyed by the kernel.

#### Threads

The actual entities that execute code are threads. A thread is contained within a process and uses the resources exposed by the process (such as virtual memory and handles to kernel objects) to do work.

* A thread includes the following important information:
  * Current access mode, either user or kernel.
  * Execution context, including processor registers and execution state.
  * One or two stacks, used for local variable allocations and call management.
  * Thread Local Storage (TLS) array, which provides a way to store thread-private data with uniform access semantics.
  * Base priority and a current (dynamic) priority.
  * Processor affinity, indicating which processors the thread is allowed to run on.
* The most common states a thread can be in are:

  * **Running** – currently executing code on a (logical) processor.

  * **Ready** – waiting to be scheduled for execution because all relevant processors are busy or unavailable.

  * **Waiting** – waiting for some event to occur before proceeding. Once the event occurs, the thread goes to the Ready state.

  > A thread has at least one stack residing in system (kernel) space, and it’s pretty small (default is 12 KB on 32-bit systems and 24 KB on 64-bit systems). A user-mode thread has a second stack in its process user-space address range and is considerably larger (by default can grow up to 1 MB).
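
The three states above can be modeled as a small state machine. This is a sketch under simplifying assumptions: the event names (`dispatch`, `preempt`, `wait`, `signal`) are hypothetical labels, and the model covers only the three states described here, not the full set the scheduler uses.

```python
from enum import Enum


class ThreadState(Enum):
    READY = "Ready"
    RUNNING = "Running"
    WAITING = "Waiting"


# (current state, event) -> next state; event names are illustrative.
TRANSITIONS = {
    (ThreadState.READY, "dispatch"): ThreadState.RUNNING,  # scheduler picks it
    (ThreadState.RUNNING, "preempt"): ThreadState.READY,   # processor taken away
    (ThreadState.RUNNING, "wait"): ThreadState.WAITING,    # blocks on an event
    (ThreadState.WAITING, "signal"): ThreadState.READY,    # awaited event occurred
}


class ToyThread:
    def __init__(self):
        self.state = ThreadState.READY

    def apply(self, event):
        """Move to the next state, rejecting transitions the model lacks."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not valid in state {self.state.value}")
        self.state = TRANSITIONS[key]
        return self.state
```

Note that in this model a thread cannot move from Ready straight to Waiting: it has to be running in order to issue a wait.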

### Virtual Memory

Every process has its own virtual, private, linear address space. This address space starts out empty (or close to empty, since the executable image and NtDll.Dll are the first to be mapped, followed by more subsystem DLLs). Once execution of the main (first) thread begins, memory is likely to be allocated, more DLLs loaded, etc. This address space is private, which means other processes cannot access it directly. The address space range starts at zero (technically the first and last 64KB of the address space cannot be committed), and goes all the way to a maximum which depends on the process “bitness” (32 or 64 bit) and the operating system “bitness”.

* Each process has its own address space, which makes any process address relative, rather than absolute. For example, when trying to determine what lies at address 0x20000, the address itself is not enough; the process to which this address relates must be specified.
* The memory itself is called virtual, which means there is an indirect relationship between an address range and the exact location where it’s found in physical memory (RAM). The term virtual refers to the fact that, from an execution perspective, there is no need to know whether the memory about to be accessed is in RAM or not. If the memory is mapped to RAM, the CPU accesses the data directly. If the memory is not resident (as indicated by a flag in the translation table entry), the CPU raises a page fault exception, which causes the memory manager’s page fault handler to fetch the data from the appropriate file, copy it to RAM, update the page table entries that map the buffer, and instruct the CPU to try again.
  * *A buffer within a process may be mapped to physical memory, or it may temporarily reside in a file (such as a page file).*
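
The translate, fault, fetch, and retry sequence described above can be sketched with a toy address space. Everything in this model is an assumption made for illustration: the single-level dictionary used as a page table, the `backing` dictionary standing in for a page file, and all class and method names.

```python
PAGE_SIZE = 4096  # 4 KB pages


class PageFault(Exception):
    """Raised by the toy 'CPU' when a valid page is not resident."""


class ToyAddressSpace:
    def __init__(self):
        self.page_table = {}  # virtual page number -> {"frame", "resident"}
        self.backing = {}     # virtual page number -> bytes "on disk"
        self.ram = {}         # frame number -> bytearray
        self._next_frame = 0
        self.fault_count = 0

    def map_paged_out(self, vpn, data):
        """Create a valid mapping whose contents currently live on disk."""
        self.backing[vpn] = bytes(data).ljust(PAGE_SIZE, b"\0")
        self.page_table[vpn] = {"frame": None, "resident": False}

    def _cpu_translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        entry = self.page_table.get(vpn)
        if entry is None:
            raise MemoryError("access violation: address is not mapped")
        if not entry["resident"]:
            raise PageFault(vpn)
        return entry["frame"], offset

    def _handle_page_fault(self, vpn):
        """Memory manager role: bring the page into RAM and fix the entry."""
        frame = self._next_frame
        self._next_frame += 1
        self.ram[frame] = bytearray(self.backing[vpn])
        self.page_table[vpn] = {"frame": frame, "resident": True}
        self.fault_count += 1

    def read_byte(self, vaddr):
        try:
            frame, offset = self._cpu_translate(vaddr)
        except PageFault as fault:
            self._handle_page_fault(fault.args[0])
            frame, offset = self._cpu_translate(vaddr)  # retry the access
        return self.ram[frame][offset]
```

The code doing the read never learns whether a page fault happened. Creating two `ToyAddressSpace` instances also shows why an address alone is ambiguous: the same virtual address can resolve to different data in each one.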

#### Page State

* **Free** – the page is not allocated in any way; there is nothing there. Any attempt to access that page would cause an access violation exception. Most pages in a newly created process are free.
* **Committed** – the reverse of free; an allocated page that can be accessed successfully, subject to its protection attributes (for example, writing to a read-only page causes an access violation).
  * *Committed pages are usually mapped to RAM or to a file (such as a page file).*
* **Reserved** – the page is not committed, but the address range is reserved for possible future commitment. From the CPU’s perspective, it’s the same as Free – any access attempt raises an access violation exception. However, new allocation attempts using the **VirtualAlloc** function (or **NtAllocateVirtualMemory**) that do not specify a specific address will not allocate in the reserved region.
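
The three page states can be captured in a small model. This sketch deliberately simplifies: it tracks individual pages rather than regions, it requires a page to be reserved before it is committed, and `find_free` is a hypothetical stand-in for how an allocation without an explicit address skips reserved ranges. None of these names belong to the Windows API.

```python
from enum import Enum


class PageState(Enum):
    FREE = "Free"
    RESERVED = "Reserved"
    COMMITTED = "Committed"


class AccessViolation(Exception):
    pass


class ToyPages:
    def __init__(self, page_count):
        self.state = [PageState.FREE] * page_count
        self.writable = [False] * page_count

    def reserve(self, page):
        if self.state[page] is not PageState.FREE:
            raise ValueError("page is not free")
        self.state[page] = PageState.RESERVED

    def commit(self, page, writable=True):
        # Simplification: this toy model only commits reserved pages.
        if self.state[page] is not PageState.RESERVED:
            raise ValueError("page must be reserved first in this model")
        self.state[page] = PageState.COMMITTED
        self.writable[page] = writable

    def find_free(self):
        """Pick a page for an allocation that names no address."""
        for page, state in enumerate(self.state):
            if state is PageState.FREE:
                return page
        return None

    def write(self, page):
        """Free and reserved pages fault alike; read-only pages reject writes."""
        if self.state[page] is not PageState.COMMITTED:
            raise AccessViolation("page is not committed")
        if not self.writable[page]:
            raise AccessViolation("page is read-only")
        return True
```

Reserving page 0 and then calling `find_free()` returns page 1, which mirrors the point that a reserved range is off limits to unrelated allocations even though touching it still faults.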

### **The Programming Interfaces**

To allow developers to interact with the core operating system objects and services, Windows provides several layers of application programming interfaces (APIs). These have evolved over decades, reflecting changes in programming paradigms and the need for more robust, component-based architectures. The journey begins with the foundational C-style API.

#### Windows API

**The Windows application programming interface (API)** is the user-mode system programming interface to the Windows OS family.

* The term **Windows API** (or **Win32 API**) refers to both the **32-bit and 64-bit** programming interfaces to Windows.

#### **Component Object Model (COM)**

To address the limitations of the **C-style API**, Microsoft introduced the **Component Object Model (COM)** in 1993. Initially developed to allow Microsoft Office applications to share data (think of embedding an Excel chart in a Word document), COM evolved into a foundational technology for much of Windows. This capability, known as **Object Linking and Embedding (OLE)**, was originally built on an older mechanism called **Dynamic Data Exchange (DDE)**, but DDE's limitations spurred the creation of COM. In fact, COM was initially released as **OLE 2**.

COM is built on two core principles:

* **Interfaces**: Client applications interact with **COM objects** (sometimes called **COM servers**) through well-defined contracts called interfaces. These interfaces group related functions, creating a binary-compatible standard that sidesteps issues like compiler-specific name mangling (a process where compilers encode function names with extra information about their parameters, which can differ between compilers). This allows developers to use COM objects from a multitude of languages, including C, C++, Visual Basic, and .NET languages.
* **Dynamic Loading:** Instead of being statically linked to a client application, COM components are loaded dynamically at runtime. These components typically reside in Dynamic Link Libraries (DLLs) or executable files (EXEs).

COM also introduced important features for managing security, data transfer between processes (marshalling), and threading models. Many familiar Windows technologies are built on COM, including DirectX, Windows Media Foundation, and the Windows Shell.

#### **The Windows Runtime (WinRT)**

With the release of Windows 8, Microsoft unveiled a new API and runtime environment called the **Windows Runtime, or WinRT.** It's important not to confuse **WinRT** with **Windows RT**, the now-discontinued version of Windows for ARM-based devices.

**WinRT** was designed to be the foundation for a new class of applications, initially known as Metro Apps and now called **Universal Windows Platform (UWP)** apps. These apps are designed to run across a wide range of devices, from IoT and phones to tablets, desktops, and even the Xbox and HoloLens.

From a technical standpoint, WinRT is built on top of COM, extending its core infrastructure. It introduces features like comprehensive type metadata stored in .winmd files, which is an enhancement of a similar concept in COM known as type libraries. This results in a more cohesive and consistently designed API with clear namespace hierarchies.

It's crucial to understand that WinRT is not a complete replacement for the traditional Windows API. Instead, it sits alongside it. Desktop applications can access a subset of WinRT APIs, and conversely, UWP apps can use a limited set of the classic Win32 and COM APIs. At its core, the WinRT API still relies on the legacy Windows binaries and APIs.

To make it easier for developers to use WinRT, Microsoft provides "language projections" for various programming languages, including C++, C#, and JavaScript. For C++ developers, there's C++/CX, a set of language extensions, and the more modern, standard-compliant C++/WinRT. For .NET languages, the standard COM interop layer allows seamless access to WinRT APIs. JavaScript developers can utilize WinJS to interact with WinRT, using HTML for their app's user interface.

#### **The .NET Framework**

The .NET Framework is an integral part of the Windows operating system. It consists of two main components:

* **The Common Language Runtime (CLR):** The CLR is the execution engine for .NET applications. It provides a virtual machine that manages services like memory allocation (through a garbage collector), security, and thread management. Code that runs within the CLR is referred to as "managed code." The CLR itself is implemented as a COM server.
* **The Framework Class Library (FCL):** This is an extensive collection of pre-written code (types) that developers can use to build applications. The FCL provides a vast array of functionalities, including user interface elements, networking capabilities, and database access.

First introduced in 2002, the .NET Framework was designed to enhance developer productivity and improve the security and reliability of applications. It supports multiple programming languages, with C# and Visual Basic being the most prominent. While the original .NET Framework is Windows-only, its open-source, cross-platform successor, simply called .NET (formerly .NET Core), allows for development across Windows, macOS, and Linux. The .NET Framework will continue to be a part of Windows, but new development is encouraged on the modern .NET platform.

### **Core System Files**

The architectural components and subsystems we have discussed are not merely abstract concepts; they are implemented within a set of critical files on the system disk. Understanding these core files helps connect the theoretical architecture to a tangible reality. Tampering with or deleting these files can lead to system instability or failure.

#### Core Windows System Files

Windows relies on a multitude of files to function correctly. These files are located in various directories, but some are more critical than others. The "core" files are essential for booting, running the kernel, managing hardware, and providing basic system services.

Here's a breakdown of some of the most important core system files:

* **ntoskrnl.exe (NT Operating System Kernel):** This is the heart of the Windows operating system. It contains the kernel, which is responsible for managing the system's resources, including memory, processes, and threads. It also includes the executive, which provides a set of services that the kernel and other system components use.
* **hal.dll (Hardware Abstraction Layer):** The HAL provides a layer of abstraction between the operating system and the hardware. This allows Windows to run on different hardware platforms without requiring significant modifications to the kernel. It translates generic operating system commands into hardware-specific instructions.
* **ntdll.dll (NT Layer DLL):** This DLL provides the interface between user-mode applications and the Windows kernel. It contains the system call stubs (the Native API, whose functions are prefixed with `Nt`) that user-mode code uses to request services from the kernel, such as creating processes, allocating memory, and accessing files.
* **kernel32.dll (Windows Kernel API Client DLL):** This DLL provides a subset of the Windows API that is used by most user-mode applications. It contains functions for managing processes, threads, memory, and other system resources. It acts as a bridge between user-mode applications and `ntdll.dll`.
* **user32.dll (Windows User API Client DLL):** This DLL provides functions for managing the user interface, such as creating windows, handling messages, and drawing graphics.
* **gdi32.dll (Windows Graphics Device Interface API Client DLL):** This DLL provides functions for drawing graphics on the screen. Applications use it to draw lines and shapes and to render text.
* **advapi32.dll (Advanced Windows 32 Base API Client DLL):** This DLL provides a set of advanced functions for managing security, the registry, and services.
* **msvcrt.dll (Microsoft Visual C++ Runtime Library):** This DLL contains the runtime library for the Microsoft Visual C++ compiler. It provides functions for memory management, string manipulation, and other common tasks. Many applications depend on this DLL.
* **winload.exe (Windows Boot Loader):** This executable is responsible for loading the Windows kernel into memory during the boot process. It is invoked by the Windows Boot Manager.
* **winresume.exe (Windows Resume Loader):** This executable is responsible for resuming Windows from hibernation. It loads the contents of the hibernation file into memory and restores the system to its previous state.
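* **bootmgr (Windows Boot Manager):** This file is responsible for displaying the boot menu and loading the operating system. On BIOS-based systems it is located in the root directory of the system partition; on UEFI systems the boot manager is an EFI application (bootmgfw.efi) stored on the EFI System Partition.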
* **BCD (Boot Configuration Data):** This file contains the boot configuration data, which specifies the operating systems that are installed on the system and the options for booting them. It replaces the `boot.ini` file used in older versions of Windows.

Many of the processes listed below are foundational, representing some of the very first user-mode processes launched during the Windows boot sequence to bring the system to a usable state.

* **smss.exe (Session Manager Subsystem):** This process is responsible for creating the user sessions and starting the Windows subsystem (csrss.exe) and the logon process (winlogon.exe).
* **csrss.exe (Client Server Runtime Subsystem):** This process is a user-mode subsystem that is responsible for managing the Windows console and providing support for Win32 applications.
* **wininit.exe (Windows Initialization Process):** This process is responsible for initializing the Windows environment in session 0, including starting the Service Control Manager (services.exe) and the Local Security Authority process (lsass.exe).
* **services.exe (Service Control Manager):** This process is responsible for managing Windows services. It starts, stops, and monitors services.
* **lsass.exe (Local Security Authority Subsystem Service):** This process is responsible for enforcing the security policy of the system. It authenticates users and manages access to resources.

The main kernel files can be summarized as follows:

| File Name                                  | Components                      |
| ------------------------------------------ | ------------------------------- |
| **Ntoskrnl.exe**                           | **Executive and Kernel**        |
| **Hal.dll**                                | **HAL**                         |
| **Win32k.sys**                             | **Kernel mode part of the GUI** |
| \***.sys in \SystemRoot\System32\Drivers** | **Core driver files**           |

## Conclusion

We have now established a foundational map of the Windows operating system. The key takeaway is the architectural division between a privileged **kernel mode** and a restricted **user mode**, a design that is paramount for system security and stability. We have seen that the kernel manages core system resources as **objects**, which are manipulated by **processes** and **threads**. Applications in user mode interact with these components through a well-defined boundary using **system calls**, which are exposed through layers of programming interfaces, from the classic Win32 API to the modern .NET Framework.

With this high-level architecture in mind, we are now prepared to dive deeper. In the next part of this series, we will focus exclusively on the lifecycle and inner workings of **Processes, Jobs, and Threads**, exploring how they are created, managed by the scheduler, and represented within the kernel.

