Saturday, July 17, 2010

The woes of structure packing - #pragma pack

We were developing an application on Red Hat Enterprise Linux 5.3 that, among other things, needed to use a PCI interface card for acquiring IRIG-B time codes. The Qt 4.5-based application was first developed without this capability, and the device driver API was integrated into it later. The API had a header file with a class declaration and a corresponding source file containing the definitions of the class functions. Integrating the API involved adding these two files to the list of the project's source files and making calls to the API functions.

After this integration, the application began to crash abruptly with a SIGABRT signal. The reason reported was along the lines of: “*** glibc detected *** application_name: malloc(): memory corruption: 0x092a51c8 ***”.
Since this was a memory error, we used Valgrind to locate any memory access violations.

We found that the crash always happened at a single statement that dynamically allocated memory using operator new. The statement was inside the constructor for a class object. The allocation statement was something like:
   try
   {
      const unsigned int SIZE = 1024;
      char *p = new char[SIZE]; //< SIGABRT here
   }
   catch(...)
   {
      //report memory allocation error
   }
There was only one allocation statement before this code, which just created an object of a class, and it was guaranteed to be clean. The program was receiving a SIGABRT and no exception was being thrown. When we debugged it with gdb, gdb claimed that the variable SIZE was missing (even after we made it volatile and compiled with -O0).

When the application was run under Valgrind's memory check tool, Valgrind reported many invalid reads and writes, i.e. accesses beyond the memory allocated by operator new.
For example, for instances like these:
    class ct;
    ...
    ct *p = new ct;

Valgrind was reporting that operator new had allocated only 103 bytes for *p, although sizeof(ct) reported 112 bytes (as obtained by printing sizeof(ct) in code and also in debugger).
Further, it was observed that the mere inclusion of the API source files in the project caused a crash, even if none of the API functions were actually called.
This made us examine the driver source files, and at the very beginning of the third-party device driver header file was this line:
    #pragma pack(1)
And sure enough, this line was found to be causing all the problems we were observing.

The pragma pack directive

Pragmas are special directives that are used to communicate additional information to the compiler. The #pragma mechanism itself is part of the C and C++ standards, but the individual pragmas are implementation-defined and therefore highly compiler-specific. In our case, '#pragma pack' (for the gcc compiler) changes the maximum alignment of members of structures, unions, and classes subsequently defined.

Data alignment and structure padding

Data objects are generally aligned at specific word boundaries so that read/write operations can be performed efficiently. For example, on the x86 platform, 32-bit integers are generally aligned at 4-byte boundaries, whereas a 16-bit 'short int' is generally aligned at a 2-byte boundary, and so on. The alignment rules are platform-specific, and might even change from compiler to compiler for the same platform.
For a struct (or union or class) having member variables, the members themselves need to be aligned to such boundaries. This requires the compiler to insert unnamed padding bytes between members so that proper alignment is maintained for each member object. Check out this example for x86 from Wikipedia.
The size of a struct/union/class object, as reported by operator sizeof, includes the size of these padding bytes. The size of a struct/class object is therefore at least (and not necessarily equal to) the sum of the sizes of its members.
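The effect of padding can be checked directly with sizeof and offsetof. The following is a minimal sketch: the struct Sample and the function print_layout are hypothetical names used only for illustration, and the sizes mentioned in the comments assume a typical x86 gcc build where int is 4 bytes and short is 2 bytes.

```cpp
#include <cstddef>  // offsetof
#include <cstdio>   // std::printf

// Hypothetical struct, not taken from the application or the driver API.
struct Sample {
    char  tag;    // 1 byte, usually followed by padding
    int   value;  // usually 4 bytes, aligned at a 4-byte boundary
    short count;  // usually 2 bytes, usually followed by tail padding
};

// Prints where the compiler actually placed each member.
void print_layout() {
    std::printf("sum of member sizes: %zu\n",
                sizeof(char) + sizeof(int) + sizeof(short));
    std::printf("sizeof(Sample):      %zu\n", sizeof(Sample));
    std::printf("offset of tag:       %zu\n", offsetof(Sample, tag));
    std::printf("offset of value:     %zu\n", offsetof(Sample, value));
    std::printf("offset of count:     %zu\n", offsetof(Sample, count));
}
```

Calling print_layout() from main on such a platform would typically show the struct occupying more bytes than the sum of its members, with value placed at a multiple of 4.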
More on structure padding here: Data alignment: Straighten up and fly right

Changing the default packing

With the 'pragma pack' directive, it is possible to change the default alignment rule and force a particular maximum alignment boundary. For example,
    #pragma pack(2)
will force all structure (and union and class) members to be aligned at boundaries no larger than 2 bytes, i.e. an 'int' member, which would otherwise have been aligned at a 4-byte boundary, will now be aligned at a 2-byte boundary.
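As a small sketch of this, the two hypothetical structs below have identical members and differ only in the packing that is active when they are defined. The names NaturalRecord and PackedRecord are made up for illustration, and the offsets in the comments assume a 4-byte int, as with gcc on x86. The push/pop form of the pragma is used here only so that the setting does not affect any later code.

```cpp
#include <cstddef>  // offsetof

// Hypothetical struct laid out with the default (natural) alignment.
struct NaturalRecord {
    char tag;
    int  value;  // usually placed at offset 4
};

// Same members, but with the maximum member alignment capped at 2 bytes.
#pragma pack(push, 2)
struct PackedRecord {
    char tag;
    int  value;  // placed at offset 2 instead of 4
};
#pragma pack(pop)

// Compile-time checks (C++11) of the layout described above.
static_assert(offsetof(PackedRecord, value) == 2,
              "int member should sit at a 2-byte boundary");
static_assert(sizeof(PackedRecord) < sizeof(NaturalRecord),
              "tighter packing should shrink the struct");
```

If the compiler does not honour the pragma, or if int is not 4 bytes on the target, the static_assert lines fail at compile time instead of letting a layout mismatch go unnoticed.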

This tighter packing can reduce the size of structure objects, but at the cost of performance. Most processors can fetch an aligned word from memory in a single access, which is faster than fetching a word that crosses an alignment boundary. A misaligned memory request may need multiple memory access cycles (and is therefore not an atomic operation), and this additional work adversely affects the performance of the application.

Also, note that while the x86 architecture tolerates misaligned memory access (with a performance penalty, of course), some other processors will terminate the application with a 'Bus error' (SIGBUS).

Forcing a tight alignment might still be useful when dealing with hardware drivers. The struct objects might be directly used for interacting with hardware and the padding bytes might be undesirable here. That may be the reason for using this pragma in the device driver header file.

When we included this file in our project, the pragma directive stayed active for every struct/class/union definition the compiler encountered after it, including those in header files included later. This created very confusing problems (such as gdb's weird reports). The problems we observed with our code were caused by alignment issues in some of the classes for which the pragma directive had become active. Searching the internet, I find that others have also reported problems with #pragma pack and Qt classes.
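A minimal single-file sketch of how the directive leaks, assuming gcc on x86 with a 4-byte int and an 8-byte double. The region marked as a vendor header stands in for the driver header, and all struct names are hypothetical.

```cpp
#include <cstddef>  // offsetof

// Defined before the vendor header: natural alignment applies.
struct BeforeHeader {
    char   flag;
    double value;
    int    count;
};

// ---- start of simulated vendor header ----
#pragma pack(1)
struct DeviceRegister {
    char status;
    int  data;
};
// The header ends here without restoring the previous packing.
// ---- end of simulated vendor header ----

// Same members as BeforeHeader, but silently packed to 1 byte.
struct AfterHeader {
    char   flag;
    double value;
    int    count;
};

#pragma pack()  // reset to the default so the rest of this file is unaffected

static_assert(sizeof(AfterHeader) == 13, "1 + 8 + 4 bytes, no padding");
static_assert(sizeof(BeforeHeader) > sizeof(AfterHeader),
              "the naturally aligned version contains padding");
```

If one source file sees a class definition before such a header and another sees it after, the two files disagree about the size and member offsets of the same class. That would be one plausible explanation for a mismatch like the 103 bytes allocated versus the 112 bytes reported by sizeof.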

The solution

The ideal solution is to not change the default natural alignment rules. Pragma directives themselves are generally considered problematic because they lead to non-portable code.

If the 'pragma pack' directive cannot be avoided at all (as in the case of a device driver), then the original packing scheme must be restored after the definition of the structures that require tight packing. That is, the header file must be modified to something similar to:

    //push current alignment rules to internal stack
    #pragma pack(push)
    
    //force 1-byte alignment boundary
    #pragma pack(1)
    
    //the above two lines can be merged to:
    //#pragma pack(push,1)
    
    /*
        definition of structures requiring
        tight packing
    */
    
    //restore original alignment rules from stack
    #pragma pack(pop)
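The above solution works for GNU, Microsoft and Borland compilers. This should have been implemented in the header file supplied by the manufacturer. For gcc, instead of the push-pop technique, you can also use #pragma pack() to restore the original packing rule. Note that #pragma pack() resets to the packing that was in effect when compilation started rather than to whatever was active just before the header, so the two approaches are equivalent only when no other pack directive is in effect.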
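The push-pop pattern can also be guarded with compile-time checks, so that a leaked packing rule is caught by the compiler rather than by a crash. This is only a sketch, assuming a C++11 compiler (for static_assert and alignof). TimeCodeFrame and its fields are a hypothetical layout, not the actual structures of the driver API, and Probe is a throwaway struct used only to test the alignment in effect.

```cpp
#include <cstddef>  // offsetof
#include <cstdint>  // fixed-width integer types

#pragma pack(push, 1)
// Hypothetical tightly packed structure, as a driver header might define.
struct TimeCodeFrame {
    std::uint8_t  status;
    std::uint32_t seconds;
    std::uint16_t millis;
};
#pragma pack(pop)

// The packed struct must have exactly the size of its members.
static_assert(sizeof(TimeCodeFrame) == 7, "unexpected padding in TimeCodeFrame");

// Throwaway struct defined after the pop. With natural alignment the int
// sits at offset alignof(int); with a leaked pack(1) it would sit at 1.
struct Probe {
    char c;
    int  i;
};
static_assert(offsetof(Probe, i) == alignof(int),
              "a pack directive is still active after the header");
```

A check like the one on Probe could be placed in the application right after including a third-party header, so that the build fails as soon as a header forgets to restore the packing.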

11 comments:

  1. Thanks much for taking the time to document this "feature" and share your solution. It saved a lot of debugging time.

  2. Nice article. It's interesting to know that some platform will terminate the application for incorrect alignment!

  3. Very clearly written article - explains packing nicely.

    I'm having issues with the portability of bit fields - another headache altogether! :-p

  4. Good explanation, it helped me a lot.

  5. Had a similar problem in a qt project. In my case the change in alignment by the included header only affected the files sequentially added later in my project file. So only the new code was unstable, prior code worked fine. Spent 2 weeks tracking this down; what a headache...

    Thanks for the post!

    Replies
    1. "..only affected the files sequentially added later in my project file.."
      Exactly. That makes the problem even more difficult to trace. If the header happens to be included last, then there might not be any issues. Then later, somebody adds a new include line below it, and poof!

  6. Thanks, this solved my problem! I was using __attribute__((__packed__)) with GCC but it seems to work only with Clang.
