Tuesday, November 3, 2009

Fedora-ARM: alignment errors

Alignment Errors:
We run into alignment errors a lot while building packages for Fedora-ARM.

Here is a quick example of the problem:


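#include <stdio.h>

#define SIZE 20
#define CHARVAL 'a'
#define INTVAL 0x0a0b0c0d

/* Assumed helper: dumps all SIZE bytes of buf as hex */
static void print_hex_string(const char *buf)
{
    int i;

    printf("0x");
    for (i = 0; i < SIZE; i++)
        printf("%02x", (unsigned char)buf[i]);
    printf("\n");
}

int main(void)
{
    char buffer[SIZE] = { 0, };
    char *p = buffer;

    printf("c value = 0x%x\n", CHARVAL);
    printf("i value = 0x%.8x %d\n", INTVAL, INTVAL);

    /* OK, let's write them into the buffer one after the other */
    *p = CHARVAL;
    p++;
    *(int *)p = INTVAL; /* unaligned store: this is the problem */

    /* So the buffer should start with 0x610d0c0b0a */
    print_hex_string(buffer);

    return 0;
}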


The output of this program on an x86 machine:

c value = 0x61
i value = 0x0a0b0c0d 168496141
0x610d0c0b0a000000000000000000000000000000
^^^^----------------here

As expected :-)

And on ARM?

c value = 0x61
i value = 0x0a0b0c0d 168496141
0x0d0c0b0a00000000000000000000000000000000
^^^^----------------here
And there goes your value of "c" stored in the array.

These alignment errors are common on the ARM architecture. Basically, it expects a word to sit on a word boundary (the last 2 bits of the address should be zero). If it doesn't, well, expect the unexpected. As can be seen above, the int landed at the start of the buffer instead of at offset 1, so the value of c has been completely overwritten.
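
The word-boundary rule is easy to check in code: an address is word aligned when its two low-order bits are zero. A minimal sketch, assuming a 4-byte word; is_word_aligned is a name made up here for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when addr sits on a 4-byte boundary,
 * i.e. the two low-order bits of the address are zero. */
int is_word_aligned(const void *addr)
{
    return ((uintptr_t)addr & 0x3u) == 0;
}
```

In the example above, whenever buffer itself starts on a word boundary, buffer + 1 fails this check.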

The same goes for pointer values: they are word sized, so they need a word boundary too.
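
One way to sidestep the unaligned store in the example above is to copy the bytes with memcpy() instead of casting the pointer. A minimal sketch; store_int_at and load_int_at are hypothetical names, not from any library:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy an int into buf at any byte offset.
 * memcpy() moves the bytes without assuming any alignment. */
void store_int_at(char *buf, size_t size, size_t offset, int value)
{
    assert(offset <= size && sizeof value <= size - offset);
    memcpy(buf + offset, &value, sizeof value);
}

/* Hypothetical helper: read the int back the same way. */
int load_int_at(const char *buf, size_t size, size_t offset)
{
    int value;

    assert(offset <= size && sizeof value <= size - offset);
    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

With these, the example's store becomes store_int_at(buffer, sizeof buffer, 1, INTVAL), and buffer[0] should keep its 'a' on both x86 and ARM.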

/proc/cpu/alignment:
This file displays the list of alignment errors that have been encountered thus far.

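You can echo a number into this file to change the kernel's behavior when an alignment error is encountered. The number is a bit mask (bit 0 = warn, bit 1 = fixup, bit 2 = signal), which gives:
  • 0: ignore
  • 1: warn (/var/log/messages)
  • 2: fixup
  • 3: fixup + warn
  • 4: signal (SIGBUS, core dump for analysis)
  • 5: signal + warn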
For a detailed overview: http://lecs.cs.ucla.edu/wiki/index.php/XScale_alignment
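
The same thing can be done from a program. A rough sketch, assuming the file takes a plain decimal number just as it does with echo; set_alignment_mode and dump_alignment_stats are hypothetical helpers, and writing to the real file needs root:

```c
#include <stdio.h>

/* Hypothetical helper: write a mode number to the alignment control file.
 * Returns 0 on success, -1 on failure (for example when not root). */
int set_alignment_mode(const char *path, int mode)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;
    ok = fprintf(f, "%d\n", mode) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: copy the file's contents to out.
 * Returns 0 on success, -1 if the file cannot be opened. */
int dump_alignment_stats(const char *path, FILE *out)
{
    char line[256];
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL)
        fputs(line, out);
    fclose(f);
    return 0;
}
```

For example, set_alignment_mode("/proc/cpu/alignment", 2) is meant to do the same as echoing 2 into the file, and dump_alignment_stats("/proc/cpu/alignment", stdout) prints what cat would show.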

2 comments:

  1. This is precisely why the ISO C language standard says:
    "A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined."

    So doing (int*)p itself is bad on ARM. One should use memcpy() in such cases.

    And, I suggest dropping the implicit int return type for main(). ;-)

  2. BTW, that code is also an aliasing violation in addition to the alignment violation.

    Can't the ARM be configured to trap on alignment errors instead of silently doing the wrong thing? At least an Address Error (68k) or Bus Error (SPARC) is clear.
