RED Project
RED stands for reverse engineering of UML sequence diagrams. The goal of this project is to build a
state-of-the-art tool for extracting sequence diagrams from Java
code. Eventually we will make the tool freely available for use by
programmers and researchers, in the form of a public Eclipse plug-in.
Reverse-Engineered UML Sequence Diagrams
Sequence diagrams play a central role in
UML modeling of object interactions. Reverse engineering of sequence
diagrams allows the automatic extraction of such diagrams from existing
code. This is often necessary during iterative development. A
typical scenario is to perform design recovery through reverse
engineering of class diagrams and sequence diagrams in the beginning
of the current iteration, based on the last iteration's code. The
resulting design documents serve as the starting point for subsequent
design work. Additional reverse engineering is also usually necessary
during an iteration.
Software maintenance can also benefit from
reverse-engineered sequence diagrams. These diagrams are particularly
well-suited for representing object interactions in object-oriented
software. The growing body of such software, even for newer languages
like Java, will create challenging maintenance problems for many years
into the future. Automatic design recovery of object interactions for
the purposes of software understanding and maintenance can be made
possible by tools for reverse engineering of sequence diagrams.
Object interactions are an essential consideration for testing
of object-oriented software. Various testing approaches consider
the interactions represented by sequence diagrams (or the similar
collaboration diagrams) as part of their coverage requirements. These
coverage goals can be defined with respect to different elements of
statically-constructed sequence diagrams which are extracted from the
code under test. Subsequent run-time analysis during test execution
can be used to determine the coverage of these diagram elements and to
highlight potential test weaknesses.
Static Analyses for Reverse-Engineered Sequence Diagrams
The work on RED revealed various challenging research problems that we
are currently solving. The difficulty of these problems becomes clear
when one considers commercial tools that perform reverse engineering
of UML sequence diagrams from Java code. UML modeling tools such as
Together ControlCenter by Borland and EclipseUML by Omondo can produce
reverse-engineered diagrams that are incorrect or incomplete. These
are not low-quality tools; in fact, they are mature,
well-designed, high-quality commercial tools that are very useful
during software development. However, the conceptual problems related
to correct reverse-engineering of sequence diagrams are complicated
and require advanced static analysis techniques. We have solved some
of these problems (either partially or fully), and we believe that at
this moment RED truly represents the state of the art in terms of
static analyses for reverse engineering of UML sequence diagrams.
The following questions had to be addressed as part of this project:
- Given some call site in the analyzed code, how should the
receiver object(s) at this site be represented in the diagram? Our
efforts to answer this question produced a novel object naming
analysis that provides a precise answer for the majority of call
sites [ICSE05]. We are currently working on analysis extensions that
focus on some challenging cases in which a call site has multiple
possible receiver objects.
- How should the intraprocedural flow of control in the code be
represented in the reverse-engineered sequence diagrams? To
answer this question, we defined a new control-flow analysis
for mapping a method's control-flow graph to UML 2.0 interaction
fragments [PASTE05]. This work systematically revealed the
limited expressive power of UML control-flow primitives, and the
tradeoffs that tool builders need to face when handling the
intraprocedural flow of control.
- How should infeasible call chains be eliminated from the
diagrams? Our work on call chain analysis showed that
infeasible chains in the call graph occur even when using precise call
graph construction algorithms [ISSTA04, PASTE04]. Since such chains
should not be shown in the sequence diagrams, we are currently working
on static analyses for detecting infeasible call chains.
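To make the notion of an infeasible call chain concrete, here is a minimal Java sketch. It is an illustration only, not the analysis from [ISSTA04, PASTE04]: the class and method names (CallChainSketch, Handler, run, m1, m2) and the hand-written call graph are hypothetical. The graph is what a context-insensitive construction would plausibly report for this code, since both receiver types reach the call site inside run. Enumerating paths in that graph yields the chain m1 -> run -> B.handle, which can never occur at run time because m1 only ever passes an A.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical example program plus a hand-written call graph for it.
final class CallChainSketch {
    interface Handler { String handle(); }
    static final class A implements Handler {
        public String handle() { return "A.handle"; }
    }
    static final class B implements Handler {
        public String handle() { return "B.handle"; }
    }

    // One call site whose receiver may be an A or a B, depending on the caller.
    static String run(Handler h) { return h.handle(); }
    static String m1() { return run(new A()); }
    static String m2() { return run(new B()); }

    // Assumed context-insensitive call graph: both handle() targets are
    // attached to run, regardless of which method called run.
    static final Map<String, List<String>> CALL_GRAPH = Map.of(
            "m1", List.of("run"),
            "m2", List.of("run"),
            "run", List.of("A.handle", "B.handle"));

    // Enumerates every call chain starting at the given method.
    // Terminates only because this example graph is acyclic.
    static List<String> chainsFrom(String method) {
        List<String> callees = CALL_GRAPH.getOrDefault(method, List.of());
        if (callees.isEmpty()) {
            return List.of(method);
        }
        List<String> chains = new ArrayList<>();
        for (String callee : callees) {
            for (String tail : chainsFrom(callee)) {
                chains.add(method + " -> " + tail);
            }
        }
        return chains;
    }
}
```

Calling CallChainSketch.chainsFrom("m1") returns two chains, while executing m1() only ever reaches A.handle. A diagram built directly from the graph would therefore show one message that cannot happen.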
Visualization of Reverse-Engineered Sequence Diagrams
The problem of effective visualization of large-scale complex
reverse-engineered sequence diagrams is still open. Due to their
large size and inefficient spatial design, the diagrams could easily
become useless to software engineers. We investigated the visual
limitations of UML sequence diagrams and developed a set of techniques
for overcoming these limitations [VISSOFT05]. These techniques allow
a programmer to interactively explore various aspects of large
real-world sequence diagrams in order to gain insights about the
behavior of the underlying software. We have implemented a prototype
visualization tool based on these techniques, and we have used it to
enhance our comprehension of sequence diagrams that were
reverse-engineered from the code in the standard Java libraries. This
part of the RED project is a significant step towards realizing
effective exploration of complex sequence diagrams in modern UML
tools.
Here are several examples from the prototype implementation of the
visualization techniques. For best results, view at 100% size.
- Sample screenshot
- Simple diagram derived from library
class java.math.BigDecimal
- In a large diagram, the user can filter out parts of the diagram
by graying them out. The grayed-out parts can be removed completely,
leaving a slice of the original diagram.
- The user can select the depth of the "interesting" call chains;
all messages beyond this depth are filtered out. The filtered
messages and objects can be grayed out
or removed completely. If the user changes the depth value, the
diagram is immediately modified to reflect this change.
- We have started work on visualization for the interaction
fragments introduced in UML 2.0. In this
example the diagram contains two opt fragments representing
optional behavior. If the user is interested in the high-level flow of
control, the fragments can be collapsed.
Alternatively, the user can zoom in on the fragment (filtering everything
outside the fragment) to examine its contents in detail.
This work is still in the initial stages, and the visualization
component of the tool will become increasingly sophisticated as the
project progresses.
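The depth-based filtering described in the list above can be sketched in a few lines of Java. This is a simplified illustration under stated assumptions, not the prototype's implementation: the Message class, its fields, and the method names are hypothetical, and each message is assumed to carry the depth of its call chain, with 1 for messages sent directly by the starting method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a diagram as a flat list of messages with call depths.
final class DepthFilterSketch {
    static final class Message {
        final String label;
        final int depth; // 1 = sent directly by the diagram's starting method

        Message(String label, int depth) {
            this.label = label;
            this.depth = depth;
        }
    }

    // Messages the user considers "interesting": call depth at most maxDepth.
    static List<Message> withinDepth(List<Message> diagram, int maxDepth) {
        List<Message> kept = new ArrayList<>();
        for (Message m : diagram) {
            if (m.depth <= maxDepth) {
                kept.add(m);
            }
        }
        return kept;
    }

    // Messages beyond maxDepth: candidates for graying out or removal.
    static List<Message> beyondDepth(List<Message> diagram, int maxDepth) {
        List<Message> filtered = new ArrayList<>();
        for (Message m : diagram) {
            if (m.depth > maxDepth) {
                filtered.add(m);
            }
        }
        return filtered;
    }
}
```

Because both methods recompute their result from the full diagram, a viewer built this way could simply call them again whenever the user changes the depth value.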
Testing Based on Reverse-Engineered Sequence Diagrams
Based on the reverse-engineered diagrams, we defined several
control-flow coverage criteria for testing the interactions among a
set of collaborating objects. The sequences of messages in the
diagrams were used to define the coverage goals for the family of
criteria, in a manner that generalizes traditional testing techniques
such as branch coverage and path coverage. We also defined a run-time
analysis that gathers coverage measurements for the criteria. The
results of this work compared different approaches for testing of
object interactions and provided new insights for testers and for
builders of test coverage tools [FASE05].
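As a rough illustration of how such a coverage measurement might be computed, the sketch below treats each coverage goal as a sequence of message names taken from a diagram and counts a goal as covered when exactly that sequence was observed during test execution. This is a deliberately simplified, path-style stand-in and not one of the criteria defined in [FASE05]; the class name, method name, and the representation of messages as strings are assumptions made for the example.

```java
import java.util.List;
import java.util.Set;

// Hypothetical coverage computation over message sequences.
final class SequenceCoverageSketch {
    // Fraction of goal sequences that appear among the observed sequences.
    // An empty goal set is reported as fully covered.
    static double coverage(Set<List<String>> goals, Set<List<String>> observed) {
        if (goals.isEmpty()) {
            return 1.0;
        }
        long covered = goals.stream().filter(observed::contains).count();
        return (double) covered / goals.size();
    }
}
```

Goals that remain uncovered after a test run point to interactions in the diagram that no test has exercised.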