Rules for Designing Chunks
You must follow these rules for designing and implementing your chunk, or your chunk
will not be consistent with the EDM and may break the system. These rules were
established by the EDM group to ensure a self-consistent and modular event data model. Not
all of these rules can be enforced by the compiler, so designers must be aware of these
rules. Component tests should test for compliance with these rules. See the section
earlier in this document for a description of how chunks are used.
Rules for implementing chunks
- Chunks may not contain smaller units of persistence: no d0_Refs
to any object are allowed within a chunk.
The chunk is the smallest unit of persistent event data. There is no need to design in
some other method of controlling the input or output of event data. If you think some item
is of sufficient importance, and of appropriate size, to be an atomic object with regard to
persistence, then make that object into a chunk in its own right.
- Each chunk must contain a closely related set of data,
typically the output of one step of reconstruction.
Each chunk must be cohesive. It is poor design practice to group unrelated
data (or behavior) into a single class.
- Chunks may not directly refer (by
value, reference or pointer) to other chunks or objects within other chunks.
Each chunk must be self-contained, to help ensure the self-consistency of the event and to
prevent excessive physical coupling in the extended EDM. Instead, make use of ChunkIDs
and "dumb data" to replace the pointers. See below for the constructs to use in
place of pointers. Starting with v00-01-02n-br-16 of the package edm and
v00-02-03-br-03 of the package identifiers, the EDM provides
"link" classes LinkIndex<>, LinkPtr<T>, etc.,
which provide a mechanism by which one can point from objects within one chunk to objects
within another chunk, without inducing any physical coupling.
- Each chunk must record bookkeeping information telling how
it was made and on which chunks it depends.
It is up to the designer of each chunk to determine what record keeping information is
important. For brevity, make use of RCPIDs to encode information about
reconstructors, EnvIDs to encode information about "reconstruction
environment" data (such as calibration data sets), and ChunkIDs to refer to
parent chunks. Please note that as of this writing, the Calibration and Alignment group
has not yet decided how to define and use the EnvID class. The version currently
in the EDM is just a placeholder.
- The access methods of each chunk
must be declared const. Setter functions are discouraged and must not be declared const.
It is preferable for all chunk data to be set via a constructor. "Set" functions
are discouraged. This is to help ensure that "parent" chunks are not modified
after "children" are made from them. Such a modification could make it
impossible to accurately trace the genesis of some part of the reconstructed event. Since
the data model allows any number of instances of any chunk class to be in the same event,
rather than modifying an existing chunk, it is always possible to add a new chunk, which
differs from the "old" chunk in whatever way is required for the task at hand. For
the same purpose, it is not allowed to declare member data mutable, except when
that data is merely a cache which stores the result of a calculation that can be performed
using only the const member functions of the chunk. It is most important to
understand the intent of this rule, rather than the details: if one is given access to a const
version of your chunk, one should not be able to call any function which modifies the
externally observable state of that chunk. Caching the result of a time-consuming
calculation is acceptable; such a cache variable would have to be declared mutable.
Modifying any variable that corresponds to the physics parameters or bookkeeping
parameters of the chunk is forbidden, for the reason expressed in the paragraph above.
- Each chunk must record the appropriate bookkeeping information.
This is so that users can determine how each chunk was created by querying the chunk
itself. Note that the definition of "appropriate information" might be different
for each chunk class, and the EDM can provide only the most general information. Each
designer should think carefully about his design, and make sure that the information that
would be interesting to users is recorded.
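The rule on const access methods and mutable caches can be illustrated with a minimal sketch. The class ToyTowerColl and its members are hypothetical names invented for this illustration; a real chunk would also inherit from AbsChunk and use CHUNK_SETUP, both of which are left out here. The sketch uses modern C++ syntax for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical chunk-like class (not part of the EDM). It only illustrates
// the const/mutable rule; AbsChunk and CHUNK_SETUP are deliberately omitted.
class ToyTowerColl {
public:
    // All data is supplied through the constructor; there are no setters.
    explicit ToyTowerColl(std::vector<double> energies)
        : energies_(std::move(energies)) {}

    // Access methods are declared const.
    std::size_t size() const { return energies_.size(); }

    double energy(std::size_t i) const {
        assert(i < energies_.size());
        return energies_[i];
    }

    // The total is derived only from data reachable through const members,
    // so caching it does not change the externally observable state.
    double totalEnergy() const {
        if (!cacheValid_) {
            double sum = 0.0;
            for (double e : energies_) sum += e;
            cachedTotal_ = sum;
            cacheValid_ = true;
        }
        return cachedTotal_;
    }

private:
    std::vector<double> energies_;        // physics data: never modified
    mutable bool cacheValid_ = false;     // cache bookkeeping only
    mutable double cachedTotal_ = 0.0;    // cached result of totalEnergy()
};
```

A holder of a const ToyTowerColl can call every function shown, and none of them changes what any other call returns. That is the intent of the rule; the mutable members exist only to avoid repeating the sum.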
Requirements of the AbsChunk interface
In order to meet the AbsChunk interface, you must implement the following
member functions:
- std::list<ChunkID> parents( ) const
This member function returns the ChunkIDs of the "parents" of your
chunk. The "parents" of a chunk are those chunks which were used in the creation
of the "child" chunk. For example, ToyClusterColl chunks might be
created using a specific ToyCalorimeter chunk, and a specific ToyVertex
in a specific ToyVertexColl chunk. The ToyCalorimeter chunk has the
energy deposit information, and the ToyVertex gives the z-coordinate
from which the transverse energies are calculated. The parents of a specific ToyClusterColl
object are the ToyCalorimeter object and the ToyVertexColl object which
were used in its creation. The member function std::list<ChunkID> ToyClusterColl::parents() const
should return the ChunkIDs of these two objects.
- std::list<RCPID> rcps( ) const
This member function returns the RCPIDs of the RCP objects
which describe the reconstructor which made the chunk. Each chunk is created by a single
reconstructor object. If that reconstructor object had any parameters which were
configurable at run time, it got those parameters from an RCP object, supplied to
it by the framework. This reconstructor may have required, as input, other types of
chunks, which were in turn created by other reconstructors. In each case, the concise but
complete description of a reconstructor is given by the unique RCPID assigned to
the RCP object used in its instantiation. This function must return all the RCPIDs
which are relevant to the creation of the chunk. It is up to the designer of each chunk
class to decide which information is relevant, and which information is not. To continue
with the example above, we might decide that the parameters of the ToyClusterReco
object which created the ToyClusterColl are important to describing the ToyClusterColl,
but that the details of the ToyVertexReco object which created the ToyVertexColl
are not important. In that case, we would have std::list<RCPID> ToyClusterColl::rcps() const
return the RCPID of the RCP object used in the instantiation of the ToyClusterReco
object, but not that used for the ToyVertexReco object.
It is important that the designers (and review committees) pay careful attention to
this member function, because this list of RCPIDs will often be used by others.
For example, if several clustering algorithms have been run on a single event, a user
selects the output of a particular algorithm by writing a selector that inspects each
chunk's RCPIDs and matches the chunk whose RCPIDs specify the algorithm
the user wanted.
- std::list<EnvID> environment( ) const
This member function returns the EnvIDs of the calibration and alignment
objects (or any other similar objects) used in the creation of the given chunk. Again, it
is at the discretion of the developer to decide what information is relevant. The
Calibration and Alignment group has not yet determined the way in which EnvIDs
are to be used, so this class is currently just a placeholder.
- std::string type( ) const and static std::string classType( )
These functions are used by the EDM to ensure type-safety at run time, both when chunks
are inserted into the event, and when chunks are accessed. The CHUNK_SETUP macro,
defined in the AbsChunk header file, will generate both of these functions.
The same macro also invokes the macro required by DØOM for all persistent classes.
- void printChunk (std::ostream& os) const
This member function is useful for debugging reconstruction code. It prints an ASCII
representation of the chunk to the ostream os. It is called by the stream
insertion operator ( friend operator<<( ) ) which is defined in AbsChunk.
It is not necessary (and may even be counter-productive) to define operator<<( )
for classes that inherit from AbsChunk.
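The member functions listed above can be sketched as follows. This is only an illustration under stated assumptions: ChunkID, RCPID, EnvID and AbsChunk are replaced by simplified stand-ins written for this example, type() and classType() are written by hand where the real EDM generates them with CHUNK_SETUP, and the helper madeWith() is a hypothetical stand-in for a selector, not an EDM function.

```cpp
#include <cassert>
#include <list>
#include <ostream>
#include <sstream>
#include <string>

// Simplified stand-ins for the EDM identifier classes (assumptions).
struct ChunkID { int value; };
struct RCPID   { int value; };
struct EnvID   { int value; };

inline bool operator==(const RCPID& a, const RCPID& b) { return a.value == b.value; }

// Simplified stand-in for the AbsChunk interface described above.
class AbsChunk {
public:
    virtual ~AbsChunk() {}
    virtual std::list<ChunkID> parents() const = 0;
    virtual std::list<RCPID> rcps() const = 0;
    virtual std::list<EnvID> environment() const = 0;
    virtual std::string type() const = 0;
    virtual void printChunk(std::ostream& os) const = 0;
};

// Defined once for the base class; derived chunks only supply printChunk().
inline std::ostream& operator<<(std::ostream& os, const AbsChunk& chunk) {
    chunk.printChunk(os);
    return os;
}

class ToyClusterColl : public AbsChunk {
public:
    ToyClusterColl(ChunkID calorimeter, ChunkID vertexColl,
                   RCPID clusterRecoRcp, int nClusters)
        : calorimeter_(calorimeter), vertexColl_(vertexColl),
          clusterRecoRcp_(clusterRecoRcp), nClusters_(nClusters) {
        assert(nClusters >= 0);
    }

    // The two chunks used to build this one.
    std::list<ChunkID> parents() const override { return {calorimeter_, vertexColl_}; }

    // Only the clustering reconstructor's RCP is judged relevant here.
    std::list<RCPID> rcps() const override { return {clusterRecoRcp_}; }

    // No calibration objects are recorded in this toy example.
    std::list<EnvID> environment() const override { return std::list<EnvID>(); }

    // Hand-written here; generated by CHUNK_SETUP in the real EDM.
    std::string type() const override { return classType(); }
    static std::string classType() { return "ToyClusterColl"; }

    void printChunk(std::ostream& os) const override {
        os << "ToyClusterColl with " << nClusters_ << " clusters";
    }

private:
    ChunkID calorimeter_;
    ChunkID vertexColl_;
    RCPID clusterRecoRcp_;
    int nClusters_;
};

// Hypothetical selector-like helper: true if the chunk lists the wanted RCPID.
inline bool madeWith(const AbsChunk& chunk, const RCPID& wanted) {
    const std::list<RCPID> ids = chunk.rcps();
    for (const RCPID& id : ids) {
        if (id == wanted) return true;
    }
    return false;
}
```

The helper shows why the choice of RCPIDs returned by rcps() matters: a user who wants the output of one particular clustering algorithm can only find it if the chunk reports the RCPID that identifies that algorithm.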
Requirements for persistence
The EDM will allow saving of events, and thus the chunks in the event, to permanent
storage. Your chunks must therefore be designed within the strictures of DØOM, the DØ
Object Model. The requirements of DØOM are documented in the DØOM
User Guide. Recall that the D0_OBJECT_SETUP macro required by DØOM is
implemented in CHUNK_SETUP, and must not be repeated.
References Between Chunks
The key concept behind the EDM rules for implementing references between chunks is to
use "dumb data" instead of pointers, in order to prevent excessive physical
coupling and in order to ensure consistency in the event, even when some chunks are
deleted.
- To refer to a specific chunk, use the ChunkID of the chunk to which you refer.
- To refer to data within another chunk class, use an integer index or other small dumb
data type. Starting with v00-01-02n-br-16 of the package edm, the classes
LinkIndex<>, LinkPtr<>, and associated classes are provided
to allow a chunk (or an object within a chunk) to refer to an object within another chunk.
- Use an RCPID to refer to an RCP object used in the generation of chunk
data.
- Use an EnvID to refer to a specific database object used in the generation of
chunk data.
In designing a chunk or set of related chunks, you may find it useful to first plan
(and perhaps even implement) the classes without reference to the EDM. To then make the
classes you've designed meet the requirements of the EDM, break the inter-chunk physical
coupling by replacing pointers from one chunk to another by "dumb data" indices.
A fragmentary example is given below.
Example: References between chunks
Consider starting from the following design, which shows three types of chunks, each of
which contains zero or more "physics objects".

To make this design consonant with the EDM, we implement the associations between the ElectronColl
and the other chunks by having the ElectronColl contain two ChunkIDs,
one for its associated ClusterColl, and one for its associated TrackColl.
We implement the containment of Electrons by the ElectronColl by giving
the ElectronColl a member datum that is a std::vector<Electron>,
and similarly for Clusters in the ClusterColl and Tracks in the
TrackColl. Finally, we implement the association between an Electron and
its associated Cluster and Track by giving the Electron two
member data, one of which is the integer index of the associated Cluster in std::vector<Cluster>
contained in the ClusterColl, and the other of which is the integer index of the
associated Track in the std::vector<Track> contained in the TrackColl.
(Note that starting with v00-01-02n-br-16 of the package edm, one can use
a LinkIndex<T> instead, providing greater convenience for users of your
chunk).
In code, the implementation (in part; much required for real chunks is left out for
brevity) looks as follows:
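A minimal sketch of the design described above is given here, under several assumptions: ChunkID is a simplified stand-in for the EDM class, the physics contents of Cluster and Track are invented placeholders, and the helpers clusterOf() and trackOf() are hypothetical. The AbsChunk base class, CHUNK_SETUP, and all bookkeeping are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-in for the EDM ChunkID class (assumption).
struct ChunkID { int value; };

// Placeholder physics objects; the members are invented for illustration.
struct Cluster { double energy; };
struct Track   { double pt; };

// An Electron refers to its Cluster and Track by index, not by pointer.
struct Electron {
    std::size_t clusterIndex;  // position in the ClusterColl's vector
    std::size_t trackIndex;    // position in the TrackColl's vector
};

class ClusterColl {
public:
    explicit ClusterColl(std::vector<Cluster> clusters) : clusters_(std::move(clusters)) {}
    std::size_t size() const { return clusters_.size(); }
    const Cluster& cluster(std::size_t i) const {
        assert(i < clusters_.size());
        return clusters_[i];
    }
private:
    std::vector<Cluster> clusters_;
};

class TrackColl {
public:
    explicit TrackColl(std::vector<Track> tracks) : tracks_(std::move(tracks)) {}
    std::size_t size() const { return tracks_.size(); }
    const Track& track(std::size_t i) const {
        assert(i < tracks_.size());
        return tracks_[i];
    }
private:
    std::vector<Track> tracks_;
};

class ElectronColl {
public:
    ElectronColl(ChunkID clusterColl, ChunkID trackColl, std::vector<Electron> electrons)
        : clusterCollID_(clusterColl), trackCollID_(trackColl),
          electrons_(std::move(electrons)) {}

    // ChunkIDs identify the associated chunks without holding pointers to them.
    ChunkID clusterCollID() const { return clusterCollID_; }
    ChunkID trackCollID() const { return trackCollID_; }

    std::size_t size() const { return electrons_.size(); }
    const Electron& electron(std::size_t i) const {
        assert(i < electrons_.size());
        return electrons_[i];
    }

private:
    ChunkID clusterCollID_;
    ChunkID trackCollID_;
    std::vector<Electron> electrons_;
};

// Hypothetical helpers: the caller has already fetched the chunk identified
// by clusterCollID() or trackCollID() from the event (lookup not shown).
inline const Cluster& clusterOf(const ElectronColl& electrons, std::size_t i,
                                const ClusterColl& clusters) {
    return clusters.cluster(electrons.electron(i).clusterIndex);
}

inline const Track& trackOf(const ElectronColl& electrons, std::size_t i,
                            const TrackColl& tracks) {
    return tracks.track(electrons.electron(i).trackIndex);
}
```

Because the ElectronColl stores only ChunkIDs and integer indices, it can be written, read, or kept in an event even when the ClusterColl or TrackColl it refers to is absent. The association is resolved only when the user asks for it.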
Example chunk
The class ToyClusterColl (in the reconstructor
package) is an example that illustrates how to satisfy all the requirements given above.
This page last modified January 16, 2001 08:54 AM