We have discussed how to implement type checking for object-oriented languages, by viewing objects as records containing data fields and methods. We now look at how objects are represented in memory, and how method calls are implemented while supporting inheritance and overriding (late binding).
A class defines a namespace that maps names of fields and methods to corresponding information about them, such as their types and implementations. As discussed earlier, this means classes are a module mechanism, and when building a compiler for a language with modules, it makes sense for the compiler to build a symbol table for each class so that names can be resolved efficiently.
Fields and methods typically come with visibility annotations that describe which parts of the code are allowed to see each such component. Implementing visibility correctly can be challenging in languages with nested classes.
Since objects are essentially souped-up versions of records (C: structs), it is useful to first consider how to implement records. In most languages, the order and size of the fields in a record is fixed and known, so that the record can be laid out sequentially in memory, and location of each field is at a fixed offset from the start of the record in memory.
An important consideration for field layout is that many processors have worse performance on misaligned accesses, or disallow them entirely. A misaligned access is one in which a read or write of \(n\) bytes occurs at an address that is not a multiple of \(n\). For example, most RISC processors make misaligned accesses illegal; x86 processors generally allow misaligned mov instructions (while sometimes imposing a performance penalty), but disallow misaligned push and pop instructions.
To ensure aligned accesses, extra space is often inserted between fields in record types. For example, consider the following C struct declaration on a typical 64-bit machine:
struct str {
    char c;          // 1 byte
    short s;         // 2 bytes
    int i;           // 4 bytes
    struct str *n;   // 8 bytes
};
On typical platforms, each field is aligned on a byte address that is a multiple of its size (the C standard permits such padding but leaves the exact rules to the ABI), so the resulting structure leaves a padding byte unused between c and s.
The fields can be accessed using memory operands indexed from the base address. If we are targeting x86-64 code and t is a temporary holding the base address, then field c can be accessed with the operand byte ptr [t], field s can be accessed as word ptr [t+2], and field i can be accessed as dword ptr [t+4]. The C language leaves it up to the programmer to order fields so that space is not wasted; other languages, like ML, statically reorder fields.
In languages that do not provide static information about the order of fields (e.g., JavaScript), or where permutation subtyping is supported, the compiler cannot know at what offset to find a field, and a more expensive mechanism must be used, such as looking up the field name in a hash table.
Since methods are shared across all instances of a class, we would like to avoid taking up space in objects to store methods or pointers to methods. At the IR level, a method is essentially a function with one extra, implicit argument: the receiver object, usually named this or self in the method code.
To see how objects and methods are implemented, consider the following classic OO example:
class Point {
    double x, y;
    void setX(double x) { this.x = x; }
    void moveX(double dx) { setX(x + dx); }
}
class ColoredPoint extends Point {
    Color c;
    Color getCol() { return c; }
    @Override void setX(double x) {
        c = Color.redder(c);
        this.x = x;
    }
}
Now, suppose we execute the following lines of code:
Point p = new ColoredPoint();
p.setX(42);
p.moveX(1);
What happens? When we call the method setX directly, it should run the ColoredPoint version of setX, because the actual class of the receiver object is ColoredPoint. The point's color will change to be redder. That the static type of p is Point is immaterial; what matters for an ordinary method call is the actual run-time class of the object. Now consider the call p.moveX(1). Does this make the point even redder? It might seem that it does not, since moveX is defined in Point, but it gets its work done by calling setX. The actual class of the receiver is still ColoredPoint, so the same overridden method will be called: the point does become redder. This ability to inherit code from a superclass but customize its behavior is called late binding, and it is an essential feature of OO programming.
The behavior we want is that for ordinary method calls, the method definition that should be used is the one from the actual run-time class, or the nearest ancestor that defines that method. On the other hand, for methods that are declared static, final, or nonvirtual (depending on the language), the compiler can select the correct method call at compile time.
Given a method call o.m(...), in which method m is invoked on receiver object o, how should it be implemented? The problem of picking the right method definition to run is called method dispatch. We have three main goals in mind for this mechanism: dispatch should be fast, inherited code should be shared rather than duplicated, and subclass objects should be usable wherever superclass objects are expected.
A straightforward approach that has been used in dynamically typed OO languages like Smalltalk is to simply search for the right method definition. Each object has a hidden field that points to its class object; a method call o.m() is implemented by walking up the class hierarchy, starting from o's class, and stopping once an implementation for method m is found. This implementation allows sharing inherited code, so it avoids code bloat, but it is much slower than a simple function call.
For faster method dispatch, we need to precompute a data structure—a dispatch table. The key idea is to map different method names onto small integer indices, and to use those indices to look up the appropriate method code pointer in an array of code pointers. These arrays of code pointers go by various names. In C++, this array is called a vtbl (pronounced “v-table”, standing for “virtual table”). In other literature it is known as a selector table or dispatch vector.
The figure below shows how dispatch vectors are constructed for the example classes Point and ColoredPoint. The orange ovals on the right side represent the actual machine code of the various method implementations. As shown, objects of class Point and ColoredPoint have a hidden field DV that points to the dispatch vectors of their respective classes (blue). These dispatch vectors are shared by all objects of the class.
Notice that the object layout for ColoredPoint is completely compatible with the layout for Point: the only difference between the layouts of the objects and their dispatch vectors is that the layouts are extended with new fields. Therefore, code expecting to receive a pointer to a Point will be perfectly happy to get a ColoredPoint instead. This compatibility is important, since ColoredPoint is a subtype of Point: ColoredPoint ≤ Point.
The methods setX, moveX, and getCol are associated with indices into the dispatch vectors: 0, 1, and 2, respectively. The two dispatch vectors agree on the indices of the shared methods, which is again crucial so that a ColoredPoint can be used where a Point is expected.
Notice that there is only one copy of the code for method Point.moveX, which is pointed to by both dispatch vectors. Thus, this object layout supports code-sharing inheritance as well as subtype polymorphism.
With the object layout described above, we can generate efficient method dispatch code. For example,
let us suppose we are targeting the x86-64 architecture and want to compile an expression like
p.moveX(...)
using our running example. An (abstract assembly) instruction sequence
that calls the appropriate method code would be the following:
mov rdi, p          // pass receiver as first argument
mov tDV, [p]        // immutable memory location
call [tDV + 8]      // note: moveX is at index 1
This is a reasonably fast instruction sequence, though not as fast as a direct call. Note that the second instruction accesses an immutable memory location, so common subexpression elimination can be used to avoid loading the dispatch vector address more than once for a given object.
Accesses to fields are even faster, since fields are at known offsets. For example, if we want to compile the expression p.y, it simply requires an indexed load from memory:
mov t, [p + 16]
Interfaces and abstract classes have an object layout, in some sense, since code needs to be able to handle arguments with these types. However, they do not have instances and so they do not need dispatch vectors. One exception is C++ abstract classes, which do need dispatch information, because C++ objects effectively change class as they are being constructed. The dispatch information in a C++ object is overwritten as its constructors are called.
Static methods do not require any dispatching: the compiler can determine which code to call
at compile time. Therefore, they do not need to be placed in the dispatch table. Similarly,
final methods, constructors, and calls via super
do not require method dispatch.
One wrinkle is that we want to be able to compile classes separately from each other.
Unfortunately, we can't generate code for subclasses without knowing details of their
superclasses' implementations that we would otherwise prefer to keep hidden. For example,
suppose that cp is a variable of type ColoredPoint and we'd like to compile the field access cp.c. By looking at the above diagrams, we can see that the desired abstract assembly memory operand is [cp + 24]. But if the compiler can't see the code of Point, how can it know that 24 is the correct offset? It needs to know the size of Point; that is, it needs to know about possibly private fields that the implementation of ColoredPoint should not depend on.
For this reason, C++ demands that programmers declare private fields, typically in .h
files, so that subclass implementations can know how big their superclasses are.
This requirement is a bit of an abstraction violation.
Java uses a different strategy for separate compilation that preserves abstraction better. The Java Virtual Machine computes field and method offsets as the program is executing, and caches them. The advantage of this lazy approach is that new fields and methods can be added to libraries without needing to recompile client code that inherits from library classes.
For example, a source-language call to ColoredPoint.getCol() would result in a JVM bytecode instruction of the form invokevirtual "ColoredPoint.getCol". (The actual string representing the method is uglier and includes an encoding of the parameter types.) When the JVM interpreter first encounters this bytecode instruction, it resolves the offset of the method getCol() and rewrites the instruction to a “quick” version: invokevirtual_quick 2.
If the code is run enough times by the interpreter, the JIT compiler is invoked to convert this bytecode instruction to the standard method dispatch code above:
mov tDV, [rdi]      // load the receiver's dispatch vector
call [tDV + 16]     // getCol is at index 2
If the JIT compiler can determine that only one possible implementation of the method could be called at this point in the program, it can avoid method dispatch. It may insert a direct function call or even inline the code of the called method. This optimization is crucial for good performance in Java.