+ /* cdest points to the pixel data but is typed char* so that
+ adding 1 to it advances it by exactly one byte
+
+ lenbytes is the number of bytes that we will be copying each
+ iteration. this doubles each time through the loop.
+
+ x is the number of bytes left to copy into. lenbytes will always
+ be bounded by x
+
+ this loop will run O(log n) times (n is the number of bytes we
+ need to copy into), since the size of the copy is doubled each
+ iteration. it seems that gcc does some nice optimizations to make
+ this memcpy very fast on hardware with support for vector operations
+ such as mmx or sse. here is an idea of the kind of speedup we are
+ getting by doing this (splitvertical3 switches from doing
+ "*(data++) = color" n times to doing this memcpy thing log n times):
+
+   %   cumulative   self              self     total
+  time   seconds   seconds    calls  ms/call  ms/call  name
+  49.44      0.88     0.88     1063     0.83     0.83  splitvertical1
+  47.19      1.72     0.84     1063     0.79     0.79  splitvertical2
+   2.81      1.77     0.05     1063     0.05     0.05  splitvertical3
+ */
+ cdest = (gchar*)dest;
+ lenbytes = 4 * sizeof(RrPixel32);
+ for (x = (w - 4) * sizeof(RrPixel32); x > 0;) {
+ memcpy(cdest, start, lenbytes);
+ x -= lenbytes;
+ cdest += lenbytes;