Buffer Overflow
Do an Internet search on the term buffer overflow, and you'll come up with hundreds of thousands of links, most related to security. In the National Institute of Standards and Technology's ICAT index of computer vulnerabilities, six of the top 10 involve buffer overflows. In 1999, buffer overflow was named the No. 1 computer vulnerability. Five years later, it's still a major problem.
If you've ever poured a gallon of water into a pint-size pot, you know what overflow means: water spills all around.
Inside a computer, something similar happens if you try to store too much data in a space designed for less. Input normally goes into a temporary storage area, called a buffer, whose length is defined in the program or the operating system.
Ideally, programs check data length and won't let you input an overlong data string. But most programs assume that data will always fit into the space assigned to it. Operating systems use buffers called stacks, where data is stored temporarily between operations. These, too, can overflow.
When a too-long data string goes into the buffer, any excess is written into the area of memory immediately following that reserved for the buffer -- which might be another data storage buffer, a pointer to the next instruction or another program's output area. Whatever is there is overwritten and destroyed.
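The overwrite described above can be illustrated with a small, deliberately safe simulation. This is only a sketch: the names Region, region_init, unchecked_store and neighbor are hypothetical, and a single 16-byte array stands in for "the buffer plus whatever follows it in memory" so that the demonstration itself never leaves its own storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 8       /* bytes reserved for the buffer              */
#define NEIGHBOR_SIZE 8  /* bytes that happen to follow it in "memory" */

/* One flat block standing in for a stretch of memory:
   bytes 0..7 are the buffer, bytes 8..15 are the neighboring data. */
typedef struct {
    unsigned char mem[BUF_SIZE + NEIGHBOR_SIZE];
} Region;

/* Clear the region and place the string "SAFE" in the neighboring area. */
void region_init(Region *r)
{
    memset(r->mem, 0, sizeof r->mem);
    memcpy(r->mem + BUF_SIZE, "SAFE", 5);
}

/* Models a program that assumes the input fits: it copies len bytes to
   the start of the buffer without comparing len against BUF_SIZE. */
void unchecked_store(Region *r, const char *input, size_t len)
{
    assert(len <= sizeof r->mem); /* keeps the simulation inside its array */
    memcpy(r->mem, input, len);
}

/* View the neighboring area as a string. */
const char *neighbor(const Region *r)
{
    return (const char *)(r->mem + BUF_SIZE);
}
```

Storing a short input such as "hi" leaves the neighboring "SAFE" untouched. Storing the 11 bytes of "AAAAAAAAXX" (ten characters plus the terminator) fills the 8-byte buffer and spills the rest into the neighbor, which then reads "XX". In a real C program there is no surrounding array to contain the damage, and the excess lands on whatever actually follows the buffer.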
That in itself is a problem. Just trashing a piece of data or set of instructions might cause a program or the operating system to crash. But much worse could happen. The extra bits might be interpreted as instructions and executed; they could do almost anything and would execute at the privilege level of the program that was attacked (which could be root, the highest level).
Bad Programming
Buffer overflow results from a well-known, easily understood programming error. If a program doesn't check for overflow on each character and stop accepting data when its buffer is filled, a potential buffer overflow is waiting to happen. However, such checking has long been regarded as unproductive overhead. When computers were less powerful and had less memory, there was some justification for not making such checks. Moore's Law has removed that excuse, but we're still running a lot of code written 10 or 20 years ago, even inside current releases of major applications.
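A minimal sketch of the per-character check described above might look like the following. The function name bounded_read is hypothetical, and for simplicity the input is taken from a string rather than from a keyboard or network connection; the point is only that the loop tests the remaining space before every store and stops accepting data once the buffer is full.

```c
#include <assert.h>
#include <stddef.h>

/* Copy characters from input into buf, checking the remaining space
   before each one. Stops when the input ends or when only the slot
   reserved for the terminating '\0' is left. Returns the number of
   characters stored, not counting the terminator. */
size_t bounded_read(char *buf, size_t cap, const char *input)
{
    assert(buf != NULL && input != NULL && cap > 0);

    size_t n = 0;
    while (input[n] != '\0' && n + 1 < cap) {
        buf[n] = input[n];
        n++;
    }
    buf[n] = '\0'; /* always fits: n is at most cap - 1 */
    return n;
}
```

With a 4-byte buffer, bounded_read(buf, sizeof buf, "hello") stores "hel" and returns 3; the excess characters are simply not accepted. The cost is one comparison per character, which is the overhead the text refers to.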
Some programming languages are immune to buffer overflow: Perl automatically resizes arrays, and Ada95 detects and prevents buffer overflows. However, C -- the most widely used programming language today -- has no built-in bounds checking, and C programs often write past the end of a character array.
Also, the standard C library has many functions for copying or appending strings that do no boundary checking. C++ is slightly better but can still create buffer overflows.
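Standard functions such as strcpy and strcat copy until they reach the end of the source string, whatever the size of the destination. One common way to add the missing check, sketched here under the assumption that silently truncating and reporting an error is acceptable, is to go through snprintf, which never writes more than the size it is given and returns the length the full result would have needed. The wrapper names copy_checked and append_checked are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst, which holds cap bytes. Returns 0 if the whole
   string fit, -1 if it was truncated or an error occurred. */
int copy_checked(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL && cap > 0);

    int needed = snprintf(dst, cap, "%s", src);
    return (needed >= 0 && (size_t)needed < cap) ? 0 : -1;
}

/* Append src to the string already in dst, which holds cap bytes in
   total. dst must already contain a terminated string. Returns 0 if
   everything fit, -1 if the result was truncated or an error occurred. */
int append_checked(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL && cap > 0);

    size_t used = strlen(dst);
    assert(used < cap);

    size_t room = cap - used; /* includes the byte for '\0' */
    int needed = snprintf(dst + used, room, "%s", src);
    return (needed >= 0 && (size_t)needed < room) ? 0 : -1;
}
```

With an 8-byte destination, copying "abc" and appending "def" both succeed and leave "abcdef". Appending "ghi" after that returns -1 and leaves "abcdefg", because only one more character and the terminator fit. Callers still have to check the return value; the wrappers only guarantee that nothing is written past the end of the destination.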
Buffer overflow has become one of the preferred attack methods for writers of viruses and Trojan horse programs. Crackers are adept at finding programs where they can overfill buffers and trigger specific actions running under root privilege -- say, telling the computer to damage files, change data, disclose sensitive information or create a trapdoor access point.