# Better Living Through Clang-istry

Recently I came across this article, explaining how to use clang to dump the memory layout of a C++ object. Running

    clang -cc1 -fdump-record-layouts ppfile.cpp

on a preprocessed C++ file, produced using, e.g.,

    clang -E -I/probably/lots/of/include/paths file.cpp > ppfile.cpp

gives output like:

    *** Dumping AST Record Layout
       0 | class StarObject
       0 |   class SkyObject (primary base)
       0 |     class SkyPoint (primary base)
       0 |       (SkyPoint vtable pointer)
       0 |       (SkyPoint vftable pointer)
      16 |       long double lastPrecessJD
      32 |       class dms RA0
      32 |         double D
         |       [sizeof=8, dsize=8, align=8
         |        nvsize=8, nvalign=8]
    ...(snipped)...
     184 |   float B
     188 |   float V
         | [sizeof=192, dsize=192, align=16
         |  nvsize=192, nvalign=16]

Notice that the lastPrecessJD variable is stored as a long double, which on x86 typically gives 64 bits of significand precision instead of the usual 53 bits given by a double. In practice, on x86-64 Linux, long double has 16-byte storage and alignment. Since the vtable pointer takes up only 8 bytes (on 64-bit), we waste 8 bytes on padding before lastPrecessJD. Moreover, we then take up 16 bytes to store lastPrecessJD, but using a program like the following:

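    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        /* Julian date of the J2000 epoch, in days */
        double jd2000 = 2451545.0;
        /* gap to the next representable double above jd2000, in days */
        double delta = nextafter(jd2000, jd2000 + 1) - jd2000;
        printf("delta: %.30f days\n", delta);
        /* the same gap in microseconds (86400 seconds per day) */
        printf("delta: %.3f microseconds\n", delta * 86400.0 * 1e6);
        return 0;
    }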

we can compute that for Julian dates around the year 2000, the minimum time step at (64-bit) double precision is approximately 40 microseconds (the spacing between adjacent doubles there is 2^-31 days), so it’s not clear that we gain anything by using 80-bit long doubles instead of 64-bit doubles. Changing the long double to double (and placing it last, though this isn’t strictly necessary) results in a memory layout for the SkyPoint class like so:

    *** Dumping AST Record Layout
       0 | class SkyPoint
       0 |   (SkyPoint vtable pointer)
       0 |   (SkyPoint vftable pointer)
       8 |   class dms RA0
       8 |     double D
         |   [sizeof=8, dsize=8, align=8
         |    nvsize=8, nvalign=8]
    ...(snipped)...
      48 |   class dms Az
      48 |     double D
         |   [sizeof=8, dsize=8, align=8
         |    nvsize=8, nvalign=8]
      56 |   double lastPrecessJD
         | [sizeof=64, dsize=64, align=8
         |  nvsize=64, nvalign=8]

This saves 16 bytes, cutting the size of a SkyPoint from 80 bytes to 64 (see note 1). Since KStars suffers from abuse of complex inheritance hierarchies and everything-is-an-object, this is 16 bytes saved for every single object in the sky.

Doing some simple rearrangements of the data in other classes means we can also save 8 bytes per StarObject and DeepSkyObject. Overall, these changes give approximately a 10% reduction in memory usage, just from removing padding.

1. This also has the benefit that the SkyPoint data fits in a single cache line (64 bytes on typical x86 hardware), provided the object happens to start on a cache-line boundary. I don’t think this really makes a difference, given the inefficiencies in the rest of the code and the fact that no thought has been put into the alignment of any of our data, but it’s nice to have.