A buffer overflow occurs when a program writes more data to a buffer than the buffer can hold. It's like pouring 12 ounces of milk into an 8-ounce glass.
A buffer overflow is an anomaly that occurs when software writes more data to a buffer than the buffer can hold, resulting in adjacent memory locations being overwritten. In other words, too much information is being passed into a container that does not have enough space, and that information ends up replacing data in adjacent containers.
Buffer overflows can be exploited by attackers with a goal of modifying a computer’s memory in order to undermine or take control of program execution.
A buffer, or data buffer, is an area of physical memory used to temporarily store data while it is being moved from one place to another. These buffers typically live in RAM. Computers frequently use buffers to help improve performance; most modern hard drives take advantage of buffering to efficiently access data, and many online services also use buffers. For example, buffers are frequently used in online video streaming to prevent interruption. When a video is streamed, the video player downloads and stores perhaps 20% of the video at a time in a buffer and then plays from that buffer. This way, minor drops in connection speed or brief service disruptions won’t affect playback.
Buffers are designed to contain specific amounts of data. Unless the program utilizing the buffer has built-in instructions to discard data when too much is sent to the buffer, the program will overwrite data in memory adjacent to the buffer.
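The difference between a write that discards excess data and one that does not can be sketched in C. This is a minimal model rather than a real exploit: a single 16-byte array stands in for a slice of memory, with the first 8 bytes acting as the buffer and the last 8 as unrelated neighboring data. The names `region`, `write_unchecked`, and `write_checked` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: one 16-byte region stands in for a slice of memory.
   Bytes 0-7 are "the buffer"; bytes 8-15 hold unrelated neighboring data. */
enum { BUF_SIZE = 8, REGION_SIZE = 16 };

struct region {
    unsigned char bytes[REGION_SIZE];
};

/* Zero the buffer and fill the neighbor with 'N' so changes are easy to spot. */
void region_init(struct region *r) {
    assert(r != NULL);
    memset(r->bytes, 0, BUF_SIZE);
    memset(r->bytes + BUF_SIZE, 'N', REGION_SIZE - BUF_SIZE);
}

/* Unchecked write: never compares len with BUF_SIZE, so it can spill into
   the neighbor. The model caps len at REGION_SIZE only to keep this demo
   itself within well-defined behavior. */
void write_unchecked(struct region *r, const unsigned char *src, size_t len) {
    assert(r != NULL && src != NULL && len <= REGION_SIZE);
    memcpy(r->bytes, src, len);
}

/* Checked write: stores at most BUF_SIZE bytes and discards the rest.
   Returns the number of bytes actually stored. */
size_t write_checked(struct region *r, const unsigned char *src, size_t len) {
    assert(r != NULL && src != NULL);
    size_t n = len < BUF_SIZE ? len : BUF_SIZE;
    memcpy(r->bytes, src, n);
    return n;
}
```

Writing 12 bytes with `write_unchecked` replaces the first four bytes of the neighboring data, while `write_checked` stores 8 bytes and leaves the neighbor untouched.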
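Buffer overflows can be exploited by attackers to corrupt software. Despite being well understood, buffer overflow attacks are still a major security problem that torments cybersecurity teams. In 2014, a vulnerability known as Heartbleed exposed hundreds of millions of users to attack because of a missing bounds check in OpenSSL, a widely used SSL/TLS library. Strictly speaking, Heartbleed was a buffer over-read, a closely related flaw in which a program reads past the end of a buffer rather than writing past it.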
An attacker can deliberately feed a carefully crafted input into a program, causing the program to try to store that input in a buffer that isn’t large enough and to overwrite portions of memory adjacent to the buffer. If the memory layout of the program is well-defined, the attacker can deliberately overwrite areas known to contain executable code. The attacker can then replace this code with their own executable code, which can drastically change how the program is intended to work.
For example, if the overwritten part of memory contains a pointer (an object that points to another place in memory), the attacker’s input could replace that pointer with another pointer that points to an exploit payload. This can transfer control of the whole program over to the attacker’s code.
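The pointer-overwrite idea can be illustrated with a small, safe model in C. Everything here is an assumption made for the sketch: byte 8 of a 9-byte array acts as a selector that stands in for a real pointer, and `attacker_handler` stands in for an exploit payload. A real attack overwrites an actual address, such as a return address or function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: bytes 0-7 are an input buffer, and byte 8 selects
   which handler runs next. The selector stands in for a real pointer. */
enum { INPUT_SIZE = 8, SELECTOR_AT = 8, MODEL_SIZE = 9 };

static int normal_handler(void)   { return 0; }  /* intended behavior */
static int attacker_handler(void) { return 1; }  /* stands in for a payload */

typedef int (*handler_fn)(void);
static const handler_fn handlers[2] = { normal_handler, attacker_handler };

struct model {
    unsigned char bytes[MODEL_SIZE];
};

/* Start with an empty buffer and the selector set to normal_handler. */
void model_init(struct model *m) {
    assert(m != NULL);
    memset(m->bytes, 0, MODEL_SIZE);
}

/* Deliberately missing a check against INPUT_SIZE: input longer than
   8 bytes reaches the selector. */
void model_receive(struct model *m, const unsigned char *src, size_t len) {
    assert(m != NULL && src != NULL && len <= MODEL_SIZE);
    memcpy(m->bytes, src, len);
}

/* Dispatch through the selector, much as a program follows a pointer. */
int model_dispatch(const struct model *m) {
    assert(m != NULL);
    return handlers[m->bytes[SELECTOR_AT] % 2]();
}
```

With a 4-byte input, `model_dispatch` runs `normal_handler`. With a 9-byte input whose last byte is 1, the selector is overwritten and `attacker_handler` runs instead.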
Certain programming languages are more susceptible to buffer overflow than others. C and C++ are two popular languages with high vulnerability, since they contain no built-in protections against accessing or overwriting data in their memory. Windows, macOS, and Linux all contain code written in one or both of these languages.
More modern languages like Java, Perl, and C# have built-in features that help reduce the chances of buffer overflow, but cannot prevent it altogether.
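Luckily, modern operating systems have runtime protections that help mitigate buffer overflow attacks. Two common protections are address space layout randomization (ASLR), which randomizes where regions of a program’s memory are placed so that attackers cannot easily predict the location of executable code, and data execution prevention (DEP), which marks certain areas of memory as non-executable so that code injected into them cannot run.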
Software developers can also take precautions against buffer overflow vulnerabilities by writing in languages that have built-in protections or using special security procedures in their code.
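One such precaution in C is to check the size of the input against the size of the destination before copying. The sketch below shows two hedged options, assuming the input is a NUL-terminated string: a hypothetical helper `copy_if_fits` that rejects oversized input, and a hypothetical helper `copy_truncating` that relies on the standard `snprintf` function to truncate it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, including the
   terminating NUL. Returns 0 on success, -1 if the input was rejected. */
int copy_if_fits(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL);
    if (dst_size == 0) {
        return -1;
    }
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';           /* leave dst as a valid empty string */
        return -1;
    }
    memcpy(dst, src, len + 1);   /* +1 copies the terminating NUL */
    return 0;
}

/* Hypothetical helper: truncate instead of rejecting. snprintf writes at
   most dst_size bytes and always NUL-terminates when dst_size > 0.
   Returns 1 if the input was truncated (or formatting failed), else 0. */
int copy_truncating(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed < 0 || (size_t)needed >= dst_size;
}
```

Whether to reject or truncate depends on the application. Rejecting is usually safer when a partial value could be misinterpreted, for example a file path or a command.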
Despite precautions, new buffer overflow vulnerabilities continue to be discovered by developers, sometimes in the wake of a successful exploitation. When new vulnerabilities are discovered, engineers need to patch the affected software and ensure that users of the software get access to the patch.
There are a number of different buffer overflow attacks, which employ different strategies and target different pieces of code. Among the most well-known are stack-based and heap-based overflows, named for the region of memory they target.*
*Computers rely on two different memory allocation models, known as the stack and the heap; both live in the computer’s RAM. The stack is neatly organized and holds data in a Last-In, First-Out model. Whatever piece of data was most recently placed in the stack will be the first to come out, much as the last bullet inserted into an ammunition magazine is the first to be fired. The heap is a disorganized pool of extra memory; data does not enter or leave the heap in any particular order. Since accessing memory from the stack is much faster than accessing it from the heap, the heap is generally reserved for larger pieces of data or data that a programmer wants to manage explicitly.
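In C, the two models look roughly like this. The sketch is a minimal illustration with hypothetical function names: a local array lives on the stack and is released automatically when its function returns, while memory obtained from `malloc` lives on the heap until the programmer explicitly frees it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stack: `local` is created when the function is called and released
   automatically when it returns, so only the computed total escapes. */
int sum_on_stack(void) {
    int local[4] = { 1, 2, 3, 4 };
    int total = 0;
    for (size_t i = 0; i < 4; i++) {
        total += local[i];
    }
    return total;
}

/* Heap: memory from malloc stays valid after this function returns and
   must be released by the caller with free(). Returns NULL on failure. */
char *copy_on_heap(const char *text) {
    assert(text != NULL);
    size_t size = strlen(text) + 1;   /* +1 for the terminating NUL */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, text, size);
    }
    return copy;
}
```

A caller might write `char *p = copy_on_heap("heap data");`, use `p` for as long as needed, and then call `free(p)`. Forgetting the `free` leaks the memory, which is part of what managing heap data explicitly means.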