Abstract
This paper analyzes six open source projects in order to assess
software repositories, such as those managed by Subversion, as a
source for uncovering traceability links between
different types of software artifacts. Our findings suggest that
software repositories store a variety of artifacts that are central to
open source development and use. Furthermore, a heuristic-based
approach that uses sequential-pattern mining is presented. This
approach analyzes commits in a version history to mine for highly
frequent co-occurring changes to different artifacts (e.g., source
code and documentation). The hypothesis is that if different types
of artifacts are frequently committed together, then there is a high
probability that a traceability link exists between them.
Examples of traceability links mined in our preliminary
experiments on the KDE (K Desktop Environment)
repositories are presented.
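
To make the co-change hypothesis concrete, the following is a minimal sketch, not the approach evaluated in the paper. It counts unordered pairs of files that are committed together and keeps the pairs that span two artifact types and meet a support threshold. This is a simplification: it does no sequential-pattern mining and ignores the order of commits. The names `ARTIFACT_TYPES`, `artifact_type`, `candidate_links`, and `min_support`, the extension-to-type mapping, and the file paths are all assumptions made for illustration. They are not taken from the KDE repositories.

```python
from collections import Counter
from itertools import combinations
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to artifact type (an assumption).
ARTIFACT_TYPES = {
    ".cpp": "source",
    ".h": "source",
    ".docbook": "documentation",
    ".txt": "documentation",
}


def artifact_type(path):
    """Classify a path by its extension; unknown extensions map to 'other'."""
    return ARTIFACT_TYPES.get(PurePosixPath(path).suffix, "other")


def candidate_links(commits, min_support):
    """Return cross-type file pairs committed together at least min_support times.

    commits is an iterable of commits, each given as a list of changed paths.
    """
    counts = Counter()
    for files in commits:
        # Sort so that each unordered pair is always counted under one key.
        for a, b in combinations(sorted(set(files)), 2):
            if artifact_type(a) != artifact_type(b):
                counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Made-up commit history, for illustration only.
commits = [
    ["kmail/composer.cpp", "doc/kmail/using.docbook"],
    ["kmail/composer.cpp", "doc/kmail/using.docbook", "kmail/composer.h"],
    ["kmail/composer.h"],
    ["kmail/composer.cpp", "doc/kmail/using.docbook"],
]
links = candidate_links(commits, min_support=3)
print(links)
```

In this toy history only the source file and the documentation file that change together in three commits survive the threshold. The header co-occurs with the documentation once and is filtered out. A real study would read the commits from the version history itself and would have to choose the support threshold empirically.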