Powerful gdb extension commands to investigate memory issues

gdbplus

gdbplus is “gdb + heap memory commands”

The GNU debugger, gdb, is widely used in the developer community for its rich and powerful features. Gdbplus is an extension to the official releases of gdb (http://www.gnu.org/software/gdb/). It builds on gdb’s existing functions and adds commands to analyze heap data, object relationships, and more. These features may shed light on tough problems such as memory corruption or debugging highly optimized code.

Many program bugs, especially those in C/C++, are memory related. When a program fails, with either a crash or an error, we often face a suspicious memory address or an invalid data object that becomes the key to solving the problem and demands further investigation. This is usually not an easy task, given the ever-increasing number of data objects and the growing complexity of the execution context, and it is especially hard for heap memory. Though gdb has many powerful functions, it offers little help with memory corruption issues. With built-in knowledge of the heap data structures of the program’s underlying memory manager, gdbplus scans the target process’s heap to check its consistency and points out corrupted spots, if any. For those familiar with development on Windows, this is similar to windbg’s extension command !heap.
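The idea of a heap consistency scan can be illustrated with a toy model. The sketch below is not gdbplus’s implementation and does not match any real memory manager: it assumes a made-up layout in which every block starts with an 8-byte header holding the block size (a multiple of 16) plus an in-use flag in bit 0. The names build_heap and walk_heap are hypothetical. The walker hops from header to header and reports the first offset where a header no longer makes sense.

```python
import struct

# Hypothetical layout: 8-byte little-endian header = block size | in-use flag.
HEADER = struct.Struct("<Q")
ALIGN = 16
IN_USE = 0x1


def build_heap(blocks):
    """Build a toy heap from a list of (payload_size, in_use) pairs."""
    heap = bytearray()
    for payload_size, in_use in blocks:
        # Round header + payload up to the alignment.
        size = (HEADER.size + payload_size + ALIGN - 1) // ALIGN * ALIGN
        heap += HEADER.pack(size | (IN_USE if in_use else 0))
        heap += bytes(size - HEADER.size)
    return heap


def walk_heap(heap):
    """Walk block headers; return (blocks, corrupt_offset).

    blocks is a list of (offset, size, in_use) for every block that passed
    the checks; corrupt_offset is None when the whole heap is consistent.
    """
    blocks = []
    offset = 0
    while offset < len(heap):
        if offset + HEADER.size > len(heap):
            return blocks, offset
        (word,) = HEADER.unpack_from(heap, offset)
        size = word & ~(ALIGN - 1)
        stray_flags = word & (ALIGN - 1) & ~IN_USE
        # A header is suspicious if it sets unknown flag bits, is too small,
        # or claims to extend past the end of the heap.
        if stray_flags or size < ALIGN or offset + size > len(heap):
            return blocks, offset
        blocks.append((offset, size, bool(word & IN_USE)))
        offset += size
    return blocks, None


if __name__ == "__main__":
    heap = build_heap([(24, True), (40, False), (8, True)])
    print(walk_heap(heap))
    # Simulate an overrun: write 32 bytes into a 24-byte payload.
    heap[8:40] = b"A" * 32
    print(walk_heap(heap))
```

In this model the overrun from the first block tramples the header of the second one, so the walk stops there and names that offset as the corrupted spot. A real heap walker needs the actual header format, arenas, and free lists of the allocator in use.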

A nontrivial program has many data objects, which may be as simple as primitives such as char, integer, or float, or as complex as aggregates like a C++ object with multiple inheritance. Data objects are related by design through direct or indirect references. One object may be shared or referenced by multiple other objects, and application code usually goes through several levels of indirection to reach a memory target. This makes it difficult to figure out the root cause when something goes wrong. In a typical debugging session, we may have one or more suspected data objects at hand. The challenge is to find out which other objects hold references to the suspects and may potentially access them incorrectly. This is more or less like reverse engineering and understandably very difficult. A debugger like gdb can resolve global and local variables with the help of debug symbols (some local variables may not be found if the code is optimized). Heap data objects have no debug symbols associated with them, so gdb cannot do more than display their raw content. Yet heap objects are often the focus of investigation. Take a memory overrun as an example: the key to tracking down this type of bug is to figure out which memory object precedes the victim, who owns it, and how it is read and written.
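Once block boundaries are known, finding the object that precedes an overrun victim is a simple lookup. The following sketch assumes the heap has already been walked into a sorted list of (start, size) pairs; the function names find_block and preceding_block are hypothetical and only illustrate the lookup, not how gdbplus does it.

```python
import bisect


def find_block(blocks, addr):
    """Return the index of the block containing addr, or None.

    blocks is a list of (start, size) pairs sorted by start address.
    """
    starts = [start for start, _ in blocks]
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0 and addr < blocks[i][0] + blocks[i][1]:
        return i
    return None


def preceding_block(blocks, victim_addr):
    """Return the (start, size) of the block just before the victim's block.

    Returns None if the address is not inside any block or the victim is
    the first block in the list.
    """
    i = find_block(blocks, victim_addr)
    if i is None or i == 0:
        return None
    return blocks[i - 1]


if __name__ == "__main__":
    blocks = [(0x1000, 0x20), (0x1020, 0x30), (0x1050, 0x10)]
    prev = preceding_block(blocks, 0x1028)
    print(prev and (hex(prev[0]), hex(prev[1])))
```

The block returned here is the prime suspect for an overrun only if it is truly adjacent to the victim; with gaps between blocks, or an allocator that places metadata elsewhere, further checks are needed.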

Gdbplus is designed to meet this challenge. With knowledge of the heap data structures, which are specific to the memory manager’s implementation, it figures out the boundaries and status of each memory block. By searching the target’s whole address space, it can find all references to any suspected memory object. If a heap object is referenced, either directly or through multiple levels of indirection, by a global or local variable, we may deduce the heap object’s type by walking down the chain of references. Any heap object should have at least one such reference chain; otherwise it is not accessible from the code, in other words, it is leaked.
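The reference search and the walk along a reference chain can be sketched on a toy memory image. Everything below is an illustrative assumption rather than gdbplus code: memory is a single flat buffer of 8-byte little-endian words, regions maps made-up names to [start, end) address ranges, and roots is the set of regions that stand for global or local variables. The names find_references, owner_of, and reference_chain are hypothetical.

```python
import struct
from collections import deque

WORD = struct.Struct("<Q")  # assume 64-bit little-endian pointers


def find_references(memory, base, start, end):
    """Return addresses of aligned words whose value points into [start, end)."""
    refs = []
    for off in range(0, len(memory) - WORD.size + 1, WORD.size):
        (value,) = WORD.unpack_from(memory, off)
        if start <= value < end:
            refs.append(base + off)
    return refs


def owner_of(addr, regions):
    """Return the name of the region containing addr, or None."""
    for name, (start, end) in regions.items():
        if start <= addr < end:
            return name
    return None


def reference_chain(memory, base, regions, roots, target):
    """Search backward from target until a root variable is reached.

    Returns the chain of region names from the root down to target, or
    None when no root reaches the target (a leak candidate).
    """
    parent = {target: None}  # holder -> the region it points to
    queue = deque([target])
    while queue:
        name = queue.popleft()
        if name in roots:
            chain = []
            while name is not None:
                chain.append(name)
                name = parent[name]
            return chain
        start, end = regions[name]
        for ref_addr in find_references(memory, base, start, end):
            holder = owner_of(ref_addr, regions)
            if holder is not None and holder not in parent:
                parent[holder] = name
                queue.append(holder)
    return None


if __name__ == "__main__":
    base = 0x1000
    memory = bytearray(0x80)
    regions = {
        "g_list": (0x1000, 0x1008),   # a global pointer variable
        "node_a": (0x1020, 0x1040),   # heap block
        "node_b": (0x1040, 0x1060),   # heap block
        "orphan": (0x1060, 0x1080),   # heap block nobody points to
    }
    WORD.pack_into(memory, 0x1000 - base, 0x1020)  # g_list -> node_a
    WORD.pack_into(memory, 0x1028 - base, 0x1048)  # node_a -> inside node_b
    print(reference_chain(memory, base, regions, {"g_list"}, "node_b"))
    print(reference_chain(memory, base, regions, {"g_list"}, "orphan"))
```

A scan like this is conservative: any word whose value happens to fall inside a block counts as a reference, so false positives are possible, and a real tool also has to cover registers, thread stacks, and every mapped segment of the process.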