[Page 728 (continued)]|
13.7. (Optional) Polymorphism, Virtual Functions and Dynamic Binding "Under the Hood"
C++ makes polymorphism easy to program. It is certainly possible to program for polymorphism in non-object-oriented languages such as C, but doing so requires complex and potentially dangerous pointer manipulations. This section discusses how C++ can implement polymorphism, virtual functions and dynamic binding internally. This will give you a solid understanding of how these capabilities really work. More importantly, it will help you appreciate the overhead of polymorphismin terms of additional memory consumption and processor time. This will help you determine when to use polymorphism and when to avoid it. As you will see in Chapter 23, Standard Template Library (STL), the STL components were implemented without polymorphism and virtual functionsthis was done to avoid the associated execution-time overhead and achieve optimal performance to meet the unique requirements of the STL.
First, we will explain the data structures that the C++ compiler builds at compile time to support polymorphism at execution time. You will see that polymorphism is accomplished through three levels of pointers (i.e., "triple indirection"). Then we will show how an executing program uses these data structures to execute virtual functions and achieve the dynamic binding associated with polymorphism. Note that our discussion explains one possible implementation; this is not a language requirement.
When C++ compiles a class that has one or more virtual functions, it builds a virtual function table (vtable) for that class. An executing program uses the vtable to select the proper function implementation each time a virtual function of that class is called. The leftmost column of Fig. 13.24 illustrates the vtables for classes Employee, SalariedEmployee, HourlyEmployee, CommissionEmployee and BasePlusCommissionEmployee.
Figure 13.24. How virtual function calls work.
(This item is displayed on page 729 in the print version)
In the vtable for class Employee, the first function pointer is set to 0 (i.e., the null pointer). This is done because function earnings is a pure virtual function and therefore lacks an implementation. The second function pointer points to function print, which displays the employee's full name and social security number. [Note: We have abbreviated the output of each print function in this figure to conserve space.] Any class that has one or more null pointers in its vtable is an abstract class. Classes without any null vtable pointers (such as SalariedEmployee, HourlyEmployee, CommissionEmployee and BasePlusCommissionEmployee) are concrete classes.
Class SalariedEmployee overrides function earnings to return the employee's weekly salary, so the function pointer points to the earnings function of class SalariedEmployee. SalariedEmployee also overrides print, so the corresponding function pointer points to the SalariedEmployee member function that prints "salariedemployee:" followed by the employee's name, social security number and weekly salary.
The earnings function pointer in the vtable for class HourlyEmployee points to HourlyEmployee's earnings function that returns the employee's wage multiplied by the number of hours worked. Note that to conserve space, we have omitted the fact that hourly employees receive time-and-a-half pay for overtime hours worked. The print function pointer points to the HourlyEmployee version of the function, which prints "hourly employee:", the employee's name, social security number, hourly wage and hours worked. Both functions override the functions in class Employee.
The earnings function pointer in the vtable for class CommissionEmployee points to CommissionEmployee's earnings function that returns the employee's gross sales multiplied by commission rate. The print function pointer points to the CommissionEmployee version of the function, which prints the employee's type, name, social security number, commission rate and gross sales. As in class HourlyEmployee, both functions override the functions in class Employee.
The earnings function pointer in the vtable for class BasePlusCommissionEmployee points to BasePlusCommissionEmployee's earnings function that returns the employee's base salary plus gross sales multiplied by commission rate. The print function pointer points to the BasePlusCommissionEmployee version of the function, which prints the employee's base salary plus the type, name, social security number, commission rate and gross sales. Both functions override the functions in class CommissionEmployee.
Notice that in our Employee case study, each concrete class provides its own implementation for virtual functions earnings and print. You have already learned that each class which inherits directly from abstract base class Employee must implement earnings in order to be a concrete class, because earnings is a pure virtual function. These classes do not need to implement function print, however, to be considered concreteprint is not a pure virtual function and derived classes can inherit class Employee's implementation of print. Furthermore, class BasePlusCommissionEmployee does not have to implement either function print or earningsboth function implementations can be inherited from class CommissionEmployee. If a class in our hierarchy were to inherit function implementations in this manner, the vtable pointers for these functions would simply point to the function implementation that was being inherited. For example, if BasePlusCommissionEmployee did not override earnings, the earnings function pointer in the vtable for class BasePlusCommissionEmployee would point to the same earnings function as the vtable for class CommissionEmployee points to.
Polymorphism is accomplished through an elegant data structure involving three levels of pointers. We have discussed one levelthe function pointers in the vtable. These point to the actual functions that execute when a virtual function is invoked.
Now we consider the second level of pointers. Whenever an object of a class with one or more virtual functions is instantiated, the compiler attaches to the object a pointer to the vtable for that class. This pointer is normally at the front of the object, but it is not required to be implemented that way. In Fig. 13.24, these pointers are associated with the objects created in Fig. 13.23 (one object for each of the types SalariedEmployee, HourlyEmployee, CommissionEmployee and BasePlusCommissionEmployee). Notice that the diagram displays each of the object's data member values. For example, the salariedEmployee object contains a pointer to the SalariedEmployee vtable; the object also contains the values John Smith, 111-11-1111 and $800.00.
The third level of pointers simply contains the handles to the objects that receive the virtual function calls. The handles in this level may also be references. Note that Fig. 13.24 depicts the vector employees that contains Employee pointers.
Now let us see how a typical virtual function call executes. Consider the call baseClassPtr->print() in function virtualViaPointer (line 85 of Fig. 13.23). Assume that baseClassPtr contains employees[ 1 ] (i.e., the address of object hourlyEmployee in employees). When the compiler compiles this statement, it determines that the call is indeed being made via a base-class pointer and that print is a virtual function.
The compiler determines that print is the second entry in each of the vtables. To locate this entry, the compiler notes that it will need to skip the first entry. Thus, the compiler compiles an offset or displacement of four bytes (four bytes for each pointer on today's popular 32-bit machines, and only one pointer needs to be skipped) into the table of machine-language object-code pointers to find the code that will execute the virtual function call.
The compiler generates code that performs the following operations [Note: The numbers in the list correspond to the circled numbers in Fig. 13.24]:
Select the ith entry of employees (in this case, the address of object hourlyEmployee), and pass it as an argument to function virtualViaPointer. This sets parameter baseClassPtr to point to hourlyEmployee.
Dereference that pointer to get to the hourlyEmployee objectwhich, as you recall, begins with a pointer to the HourlyEmployee vtable.
Dereference hourlyEmployee's vtable pointer to get to the HourlyEmployee vtable.
Skip the offset of four bytes to select the print function pointer.
Dereference the print function pointer to form the "name" of the actual function to execute, and use the function call operator () to execute the appropriate print function, which in this case prints the employee's type, name, social security number, hourly wage and hours worked.
The data structures of Fig. 13.24 may appear to be complex, but this complexity is managed by the compiler and hidden from you, making polymorphic programming straightforward. The pointer dereferencing operations and memory accesses that occur on every virtual function call require some additional execution time. The vtables and the vtable pointers added to the objects require some additional memory. You now have enough information to determine whether virtual functions are appropriate for your programs.
Performance Tip 13.1
Polymorphism, as typically implemented with virtual functions and dynamic binding in C++, is efficient. Programmers may use these capabilities with nominal impact on performance.
Performance Tip 13.2
Virtual functions and dynamic binding enable polymorphic programming as an alternative to switch logic programming. Optimizing compilers normally generate polymorphic code that runs as efficiently as hand-coded switch-based logic. The overhead of polymorphism is acceptable for most applications. But in some situationsreal-time applications with stringent performance requirements, for examplethe overhead of polymorphism may be too high.
Software Engineering Observation 13.11
Dynamic binding enables independent software vendors (ISVs) to distribute software without revealing proprietary secrets. Software distributions can consist of only header files and object filesno source code needs to be revealed. Software developers can then use inheritance to derive new classes from those provided by the ISVs. Other software that worked with the classes the ISVs provided will still work with the derived classes and will use the overridden virtual functions provided in these classes (via dynamic binding).