Identity Crisis (1996)

Introduction

According to the "Hammer" diagram in Grady Booch's book, an object has a state, exhibits some well-defined behavior, and has a unique identity. In this article we will consider the issue of object identity for Eiffel objects and its implications for Eiffel programmers and library designers. I will argue that a limited form of object ID is needed in the Eiffel standard to make it possible to develop portable interface libraries between Eiffel and other software systems.

Object ID: The View from the Ivory Tower

What is an "object_id"? From a purely theoretical point of view, "object_id" is a unique number that is associated with an object when the object is created and remains "stuck" to the object forever. In particular, if the object is stored and retrieved later, or if it is passed between processes and systems, its object_id should always remain the same.

Generating unique numbers is relatively easy. In DCE, for example, a Universal Unique Identifier (UUID) is a 128-bit number guaranteed to be unique, no matter who generates it. The association between each object and its ID, however, is tricky and expensive to implement. It is not surprising, then, that the compiler vendors are against adding a universally unique object ID to the standard Eiffel Kernel.

Fortunately, an object ID this strong is not required.

Object ID: The View from the Trenches

The object ID actually needed by Eiffel programmers is a number assigned to each object, a number which remains constant and unique during a single execution of a program. In Eiffel: The Language (1st Printing), Bertrand Meyer defines the object ID as "A positive value associated with the current object; if it is a complex object, the value is different from the value for any other complex object in the system." Such a weak object ID is a slightly more abstract version of the object's memory address. Note, however, that the memory address cannot always be used as the ID, because the garbage collector may move the object, thereby changing its address.
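
As a small illustration of this definition, the sketch below states the properties a weak object ID is expected to have. It assumes that a query "object_id" is available on every reference object, as in the definition quoted above. The class name OBJECT_ID_DEMO is hypothetical, and the syntax follows the Eiffel of the time.

```eiffel
class OBJECT_ID_DEMO

creation
    make

feature

    make is
            -- Compare the IDs of two distinct objects and of two
            -- references to the same object.
            -- Assumption: `object_id' is available on every reference object.
        local
            a, b, c : ARRAY [INTEGER]
        do
            !!a.make (1, 10)
            !!b.make (1, 10)
            c := a
            check
                positive: a.object_id > 0
                distinct_objects_differ: a.object_id /= b.object_id
                same_object_agrees: a.object_id = c.object_id
            end
        end

end -- class OBJECT_ID_DEMO
```

Nothing here depends on where the objects live in memory, so the assertions should continue to hold even if the garbage collector moves "a" or "b".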

Eiffel and OODBMS

To see how an object ID might be used, we consider the problem of writing an Eiffel interface to an object-oriented database system (OODBMS). In such a database system each persistent object is assigned a unique ID (sometimes called a "handle"). This ID can be used as a pointer between persistent objects; that is, if persistent object A refers to persistent object B, the actual value stored in some attribute of A will be the unique ID of B.

In the Eiffel interface we need to create Eiffel objects that represent database objects, but we must follow these restrictions:

At most one Eiffel object can exist for a given persistent object.
The database references between persistent objects must be replaced by Eiffel references, so that the object structure is mirrored in Eiffel.

To make sure that only one Eiffel object gets created for each persistent object retrieved, we can keep a table - typically a hash table - of Eiffel objects indexed by their database IDs. This way, whenever we have a database ID, as during a retrieval operation, we can look up the appropriate Eiffel object in the table, and if the object is not found we can create one.

However, there is a serious problem with the above solution. An object placed in our table will never be garbage collected, because even if the entire application loses all its references to this object, the table itself will still hold one active reference.
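
The table just described, and the reason it defeats the garbage collector, can be sketched as follows. This is only an illustration: the class name NAIVE_OBJECT_TABLE and the feature "register" are hypothetical, and an EiffelBase-style HASH_TABLE (with "put", "has" and "item") is assumed.

```eiffel
class NAIVE_OBJECT_TABLE

creation
    make

feature

    make is
            -- Create an empty table.
        do
            !!objects.make (100)
        end

    objects : HASH_TABLE [ANY, INTEGER]
            -- Eiffel objects indexed by database ID.

    register (obj : ANY; dbid : INTEGER) is
            -- Record `obj' as the Eiffel counterpart of `dbid'.
            -- From now on `objects' holds a reference to `obj',
            -- so the garbage collector can never reclaim it.
        require
            obj_exists: obj /= Void
        do
            objects.put (obj, dbid)
        end

    get_eiffel_object (dbid : INTEGER) : ANY is
            -- Object registered for `dbid'; Void if there is none.
        do
            if objects.has (dbid) then
                Result := objects.item (dbid)
            end
        end

end -- class NAIVE_OBJECT_TABLE
```

The lookup itself is fine. The trouble is the reference stored by "register", which outlives every other reference the application may have had.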

The only solution is to move the table out of Eiffel and into C - not a pleasant prospect. The C table will contain addresses of Eiffel objects and again it will be indexed by the database ID. Implementing such a table in C is possible, but the result is guaranteed not to be portable between Eiffel compilers.

We could avoid going to C and write this code in Eiffel, but only if each Eiffel object had an object ID and if we had a function, "id_object", which would translate the ID into a reference to the corresponding object. Here is how this code would look in Eiffel:

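     get_eiffel_object (dbid : INTEGER) : ANY is
             -- Eiffel object representing database object `dbid';
             -- Void if no object ID is recorded for `dbid'.
         local
             oid : INTEGER;
         do
             if object_id_table.has (dbid) then
                 oid := object_id_table.item (dbid);
                 Result := id_object (oid)
             end
         end;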

Above, "object_id_table" is a hash table of integers - object IDs - indexed by the database ID. If our database interface were to allow any object to become persistent, without being forced to inherit from some PERSISTENT_OBJECT class, then we would also need a table to provide the reverse mapping - from Eiffel object IDs to database IDs.
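
The two tables mentioned above might be maintained along the following lines. This is a minimal sketch, not a definitive implementation. It assumes that "object_id" is available on every reference object, as in the proposed kernel standards, and that an EiffelBase-style HASH_TABLE (with "force", "has" and "item") is at hand. The class name OBJECT_REGISTRY and the features "register", "has_database_id" and "database_id" are hypothetical.

```eiffel
class OBJECT_REGISTRY

creation
    make

feature

    make is
            -- Create both tables, initially empty.
        do
            !!object_id_table.make (100)
            !!database_id_table.make (100)
        end

    object_id_table : HASH_TABLE [INTEGER, INTEGER]
            -- Eiffel object IDs indexed by database ID.

    database_id_table : HASH_TABLE [INTEGER, INTEGER]
            -- Database IDs indexed by Eiffel object ID (reverse mapping).

    register (obj : ANY; dbid : INTEGER) is
            -- Record that `obj' represents database object `dbid'.
            -- Only integers are stored, so neither table holds
            -- a reference to `obj'.
        require
            obj_exists: obj /= Void
        local
            oid : INTEGER
        do
            oid := obj.object_id
            object_id_table.force (oid, dbid)
            database_id_table.force (dbid, oid)
        end

    has_database_id (obj : ANY) : BOOLEAN is
            -- Has `obj' been registered as a database object?
        require
            obj_exists: obj /= Void
        do
            Result := database_id_table.has (obj.object_id)
        end

    database_id (obj : ANY) : INTEGER is
            -- Database ID recorded for `obj'.
        require
            registered: has_database_id (obj)
        do
            Result := database_id_table.item (obj.object_id)
        end

end -- class OBJECT_REGISTRY
```

Because the tables contain nothing but integers, an object that the application no longer references can be collected. Its entries then go stale, so a real interface would also need some way to remove them, which this sketch leaves out.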

Eiffel and UI

A developer of an interface between Eiffel and a User Interface (UI) library (e.g. NextStep) faces problems similar to the ones we found in the OODBMS interface. Again, we need to keep a mapping between UI objects and the corresponding Eiffel objects, and again a table holding this mapping cannot be in Eiffel, because that would prevent the garbage collector from collecting any objects inserted into the table.

Problems

Although the definition of the weak object ID may seem straightforward, there are still some questions that need to be answered. For example, should expanded objects also have object IDs? At first glance we may want to answer that all objects should have an ID, regardless of whether they are expanded or not, but then we would have to assign object IDs to constants. What should the object ID be for the constant integer 5? What about the string "Hello"?

A better choice is not to have object IDs for expanded objects. This is in fact what is proposed in PELKS (The Proposed Eiffel Kernel Library Standard), and as it turns out, in practice we really need object IDs for reference types only.

Another interesting question is whether the object ID should be "hashable", that is, whether an object ID can be used as a key into a hash table. Again, at first glance it seems perfectly reasonable to allow objects to be inserted into hash tables using the object ID as the index. However, remember that the object ID remains meaningful only during a single execution of a program. If a hash table indexed by object IDs is stored in a file, what state will the table be in when it is restored later by another program? The restored objects will be assigned new IDs and the table will be incorrect.

Provided that the function "id_object" exists, to translate object IDs into object references, the object ID itself need not be hashable.

Conclusions

Although both proposed Eiffel Kernel library standards, Gustave and PELKS, include "object_id" and "id_object" features, in a recent discussion of the NICE Library committee some vendors favored dropping these features from the standard. This is not acceptable for Eiffel users. As I attempted to show above, the "object_id" and "id_object" features are essential for enabling us to implement interface libraries that can be ported between various compilers.

Finally, I'd like to thank Paul Murphy and Jason Schroeder for their Net postings and lunch discussions, which helped to clarify the main ideas of this article.

Richie Bielak (1996)

This work is licensed under a Creative Commons License.
