Identity Crisis (1996)
Introduction
According to the "Hammer" diagram in Grady Booch's book, an object has a state, exhibits some well-defined behavior,
and has a unique identity. In this article we will consider the issue of object identity for Eiffel objects and
its implications for Eiffel programmers and library designers. I will argue that a limited form of
object ID is needed in the Eiffel standard to make it possible to develop portable interface libraries between Eiffel and other software systems.
Object ID: The View from the Ivory Tower
What is an "object_id"? From a purely theoretical point of view, an "object_id" is a unique number that
is associated with an object when the object is created and remains "stuck" to the object forever.
In particular, if the object is stored and retrieved later, or if it is passed between processes and systems,
its object_id should always remain the same.
Although generating unique numbers is relatively easy (for example, DCE does it with the Universal Unique
Identifier (UUID), a 128-bit number guaranteed to be unique no matter who generates it),
the association between each object and its ID is tricky and expensive to implement. It is not surprising,
then, that the compiler vendors are against adding a universally unique object ID to the standard Eiffel Kernel.
Fortunately, an object ID this strong is not required.
Object ID: The View from the Trenches
The object ID actually needed by Eiffel programmers is a number assigned to each object, a number which remains constant
and unique during a single execution of a program. In Eiffel: The Language (1st Printing), Bertrand Meyer
defines the object ID as "A positive value associated with the current object; if it is a complex object, the value
is different from the value for any other complex object in the system." Such a weak object ID is a slightly more
abstract version of the object's memory address. Note, however, that the memory address cannot always be used as
the ID, because the garbage collector may move the object, thereby changing its address.
Eiffel and OODBMS
To see how an object ID might be used, consider the problem of writing an Eiffel interface to an object-oriented database
system (OODBMS). In such a database system each persistent object is assigned a unique ID (sometimes called a "handle").
This ID can be used as a pointer between persistent objects; that is, if persistent object A refers to persistent
object B, the actual value stored in some attribute of A will be the unique ID of B.
In the Eiffel interface we need to create Eiffel objects that represent database objects, but we must follow
these restrictions:
- At most one Eiffel object can exist for a given persistent object.
- The database references between persistent objects must be replaced by Eiffel references,
so that the object structure is mirrored in Eiffel.
To make sure that only one Eiffel object gets created for each persistent object retrieved, we can keep a table -
typically a hash table - of Eiffel objects indexed by their database IDs. This way, whenever we have a database ID,
as during a retrieval operation, we can look up the corresponding Eiffel object in the table, and if no such
object is found we can create one.
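The table just described can be sketched in Eiffel roughly as follows. This is only an illustration, written in the same Eiffel 3 style as the rest of this article: the class name DB_OBJECT_TABLE and the features "objects", "eiffel_object" and "new_eiffel_object" are hypothetical, and HASH_TABLE is assumed to offer "make", "has", "item" and "put" as it does in the EiffelBase library.

```eiffel
deferred class DB_OBJECT_TABLE
    -- Hypothetical helper: at most one Eiffel object per database ID.

feature

    objects : HASH_TABLE [ANY, INTEGER] is
            -- Eiffel objects indexed by database ID.
            -- A once function, so every instance shares the same table.
        once
            !!Result.make (100)
        end;

    eiffel_object (dbid : INTEGER) : ANY is
            -- The Eiffel object that represents persistent object `dbid';
            -- created and remembered on the first request.
        do
            if objects.has (dbid) then
                Result := objects.item (dbid)
            else
                Result := new_eiffel_object (dbid);
                objects.put (Result, dbid)
            end
        ensure
            remembered : objects.item (dbid) = Result
        end;

    new_eiffel_object (dbid : INTEGER) : ANY is
            -- Fresh Eiffel object built from the database record `dbid'.
            -- Assumption: a descendant supplies the actual retrieval code.
        deferred
        end;

end -- class DB_OBJECT_TABLE
```

Note that "objects" holds ordinary Eiffel references to every object it has ever handed out.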
However, there is a serious problem with the above solution. An object placed in our
table will never be garbage collected, because even if the rest of the application loses all its references to
this object, the table itself will still hold one active reference.
The only solution is to move the table out of Eiffel and into C - not a pleasant prospect. The C table will
contain addresses of Eiffel objects and again it will be indexed by the database ID. Implementing such
a table in C is possible, but the result is guaranteed not to be portable between Eiffel compilers.
We could avoid going to C and write this code in Eiffel only if each Eiffel object had an object ID and if
we had a function, "id_object", which would translate the ID into a reference to the corresponding object. Here
is how this code would look in Eiffel:
    get_eiffel_object (dbid : INTEGER) : ANY is
            -- Eiffel object representing the persistent object `dbid'
        local
            oid : INTEGER;
        do
            oid := object_id_table.item (dbid);
            Result := id_object (oid)
        end;
Above, "object_id_table" is a hash table of integers - object IDs - indexed by the database ID.
If our database interface were to allow any object to become persistent, without being forced to
inherit from some PERSISTENT_OBJECT class, then we would also
need a table to provide the reverse mapping - from Eiffel object IDs to database IDs.
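A pair of tables giving both mappings might look like the sketch below. It assumes, as the rest of this article does, that every reference object offers an "object_id" feature returning an INTEGER; the class name DB_ID_MAP and the features "register", "database_id" and "database_id_table" are hypothetical names chosen for the illustration. HASH_TABLE is again assumed to behave as in EiffelBase, where "item" yields the default value (0 for INTEGER) when the key is absent.

```eiffel
class DB_ID_MAP
    -- Hypothetical helper: two-way mapping between database IDs
    -- and Eiffel object IDs. Only integers are stored, so the
    -- tables hold no references to the objects themselves.

feature

    register (obj : ANY; dbid : INTEGER) is
            -- Record that `obj' represents persistent object `dbid'.
        require
            obj_exists : obj /= Void
        do
            object_id_table.put (obj.object_id, dbid);
            database_id_table.put (dbid, obj.object_id)
        end;

    database_id (obj : ANY) : INTEGER is
            -- Database ID recorded for `obj'; 0 if none was recorded.
        require
            obj_exists : obj /= Void
        do
            Result := database_id_table.item (obj.object_id)
        end;

    object_id_table : HASH_TABLE [INTEGER, INTEGER] is
            -- Eiffel object IDs indexed by database ID.
        once
            !!Result.make (100)
        end;

    database_id_table : HASH_TABLE [INTEGER, INTEGER] is
            -- Database IDs indexed by Eiffel object ID.
        once
            !!Result.make (100)
        end;

end -- class DB_ID_MAP
```

Entries for objects that have since been collected would still have to be removed at some point; the sketch does not address that.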
Eiffel and UI
A developer of an interface between Eiffel and a User Interface (UI) library (e.g. NextStep) faces
problems similar to the ones we found in the OODBMS interface. Again, we need to keep a mapping between
UI objects and the corresponding Eiffel objects, and again a table holding this mapping cannot be in Eiffel,
because that would prevent the garbage collector from collecting any objects inserted into the table.
Problems
Although the definition of the weak object ID may seem straightforward, there are still some questions
that need to be answered. For example, should expanded objects also have object IDs? At first glance we
may want to answer that all objects should have an ID, regardless of whether they are expanded or not, but then
we would have to assign object IDs to constants. What should the object ID be for the constant integer 5?
What about the string "Hello"?
A better choice is not to have object IDs for expanded objects. This is in fact what is proposed in PELKS
(The Proposed Eiffel Kernel Library Standard), and, as it turns out, in practice we really need object IDs for
reference types only.
Another interesting question is whether the object ID should be "hashable", that is, whether an object ID can be used as a
key into a hash table. Again, at first glance it seems perfectly reasonable to allow objects to be inserted
into hash tables using the object ID as the index. However, remember that the object ID remains meaningful only during
a single execution of a program. If a hash table indexed by object IDs is stored in a file, what state
will the table be in when it is restored later by another program? The restored objects will be assigned new
IDs and the table will be incorrect.
Provided that the function "id_object" exists, to translate object IDs into object references, the object ID itself need not be hashable.
Conclusions
Although both proposed Eiffel Kernel library standards, Gustave and PELKS, include "object_id" and "id_object" features,
in a recent discussion of the NICE Library committee some vendors favored dropping these features from the
standard. This is not acceptable for Eiffel users. As I attempted to show above, the "object_id" and "id_object"
features are essential for enabling us to implement interface libraries that can be ported between various compilers.
Finally, I'd like to thank Paul Murphy and Jason Schroeder for their Net postings and lunch discussions, which helped
clarify the main ideas of this article.
Richie Bielak (1996)
This work is licensed under a Creative Commons License.