Papers
We include the papers on this page to ensure timely dissemination on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by the copyrights. These works may not be reposted without the explicit permission of the copyright holder.
2011
-
Matthias Hert, Gerald Reif, Harald C. Gall, A Comparison of RDB-to-RDF Mapping Languages, Proceedings of the 7th International Conference on Semantic Systems (I-Semantics) 2011. (inproceedings)
Mapping Relational Databases (RDB) to RDF is an active field of research. The majority of data on the current Web is stored in RDBs. Therefore, bridging the conceptual gap between the relational model and RDF is needed to make the data available on the Semantic Web. In addition, recent research has shown that Semantic Web technologies are useful beyond the Web, especially if data from different sources has to be exchanged or integrated. Many mapping languages and approaches have been explored, leading to the ongoing standardization effort of the World Wide Web Consortium (W3C) carried out in the RDB2RDF Working Group (WG). The goal and contribution of this paper is to provide a feature-based comparison of the state-of-the-art RDB-to-RDF mapping languages. It should act as a guide in selecting an RDB-to-RDF mapping language for a given application scenario and its requirements w.r.t. mapping features. Our comparison framework is based on use cases and requirements for mapping RDBs to RDF as identified by the RDB2RDF WG. We apply this comparison framework to the state-of-the-art RDB-to-RDF mapping languages and report the findings in this paper. As a result, our classification proposes four categories of mapping languages: direct mapping, read-only general-purpose mapping, read-write general-purpose mapping, and special-purpose mapping. We further provide recommendations for selecting a mapping language.
-
Matthias Hert, Giacomo Ghezzi, Michael Würsch, Harald C. Gall, How to 'Make a Bridge to the new Town' using OntoAccess, Proceedings of the 10th International Semantic Web Conference (ISWC) 2011. (inproceedings)
Business-critical legacy applications often rely on relational databases to sustain daily operations. Introducing Semantic Web technology in newly developed systems is often difficult, as these systems need to run in tandem with their predecessors and cooperatively read and update existing data.
A common pattern is to incrementally migrate data from a legacy system to its successor by running the new system in parallel, with a data bridge in between. Existing approaches that could in theory be deployed as a data bridge restrict Semantic Web-enabled applications to reading legacy data in practice, disallowing update operations completely.
This paper explains how our RDB-to-RDF platform OntoAccess can be used to transition legacy systems into Semantic Web-enabled applications. By means of a case study, we exemplify how we successfully made a bridge between one of our own large-scale legacy systems and its long-term replacement. We elaborate on challenges we faced during the migration process and how we were able to overcome them.
-
Giacomo Ghezzi, Harald C. Gall, SOFAS: A Lightweight Architecture for Software Analysis as a Service, Working IEEE/IFIP Conference on Software Architecture (WICSA 2011), 20-24 June 2011, Boulder, Colorado, USA 2011, IEEE Computer Society. (inproceedings)
Access to data stored in software repositories by systems such as version control, bug and issue tracking, or mailing lists is essential for assessing the quality of a software system. A myriad of analyses exploiting that data have been proposed throughout the years: source code analysis, code duplication analysis, co-change analysis, bug prediction, or detection of bug fixing patterns. However, easy and straightforward synergies between these analyses rarely exist. To tackle this problem we have developed SOFAS, a distributed and collaborative software analysis platform to enable a seamless interoperation of such analyses. In particular, software analyses are offered as RESTful web services that can be accessed and composed over the Internet. SOFAS services are accessible through a software analysis catalog where any project stakeholder can, depending on the needs or interests, pick specific analyses, combine them, let them run remotely and then fetch the final results. That way, software developers, testers, architects, or quality assurance experts are given access to quality analysis services. They are shielded from many peculiarities of tool installations and configurations, but SOFAS offers them sophisticated and easy-to-use analyses. This paper describes in detail our SOFAS architecture, its considerations and implementation aspects, and the current set of implemented and offered RESTful analysis services.
-
Matthias Hert, Sergio Marsella, Gerald Reif, Harald C. Gall, UpLink - A Linked Data Editor for RDB-to-RDF Data, Proceedings of the 7th International Conference on Semantic Systems (I-Semantics) 2011. (inproceedings/Short Paper)
Linked Data builds a machine-processable Web of Data based on a large and growing number of RDF datasets and typed links among them. For the human user, Web-based interfaces were developed to enable browsing and editing Linked Data that is stored as native RDF. However, the majority of data on the current Web is stored in Relational Databases (RDB). This is a challenge for Linked Data browsers and especially for Linked Data editors. In this paper, we present UpLink, which is, to the best of our knowledge, the first Linked Data editor for RDB-to-RDF data, i.e., RDF data that is mapped on demand from an RDB. We further present usage scenarios to demonstrate that UpLink supports the basic CRUD operations for editing Linked Data.
2010
-
Matthias Hert, Gerald Reif, Harald C. Gall, 'Semantic Web 2.0' - Write-enabling the Web of Data, Proceedings of the 6th Workshop on Semantic Web Applications and Perspectives (SWAP), September 2010. (inproceedings)
The Semantic Web today is mainly a read-only Web of Data. Many of the data sets that contribute to the Semantic Web are not stored as native RDF, but generated on demand via wrappers. Despite the fact that user contribution is the key success factor in the Web 2.0, current wrapper approaches and standardization efforts still focus on read-only data access. In this paper, we argue that the Semantic Web should learn from the evolution of the Web 2.0 and consider write-enabled semantic data wrappers.
-
Roger Wolfer, BibViz, 09 2010. (misc/Facharbeit)
BibViz is a software tool developed by Amancio Bouza for visualizing citation and reference relationships. It helps a user with an existing bibliography for a specific topic to explore further publications that are relevant to the topic. A publication counts as relevant to the topic if it cites publications that are already in the bibliography or if it has been cited by a publication in the bibliography. The existing solution had some issues. First, the process of manually adding citations to the bibliography was very time consuming. Second, to visualize a bibliography it was necessary to adjust the file path in the source code. The first issue is solved by a web crawler which automatically adds citations and further publication information to the bibliography. The second issue is solved by a GUI which allows the user to choose an existing bibliography, to extend this bibliography with information the web crawler collects, and to save the extended bibliography.
-
Giacomo Ghezzi, Harald C. Gall, Distributed and Collaborative Software Analysis, Collaborative Software Engineering, Editor(s): Ivan Mistrik, John Grundy, Jim Whitehead, André van der Hoek, January 2010, Springer-Verlag. (incollection)
-
Michael Würsch, Gerald Reif, Serge Demeyer, Harald C. Gall, Fostering Synergies - How Semantic Web Technology could influence Software Repositories, Proceedings of the 2nd Intl. Workshop on Search-driven development: Users, Infrastructure, Tools and Evaluation (SUITE), May 2010. (inproceedings/Workshop paper)
The state-of-the-art in mining software repositories mirrors software artifacts from various sources into monolithic relational databases. This puts a lot of querying power in the hands of the software miners; however, it comes at the cost of enclosing the data and hampering cross-application reuse. In this paper we discuss four problem scenarios to illustrate that Semantic Web technology is able to overcome these limitations. However, it requires that the software engineering research community agrees on two prerequisites: (a) a common vocabulary to talk about software repositories, i.e., an ontology; (b) a strategy for generating unique and stable references to all software artifacts inside such a repository, i.e., a Uniform Resource Identifier (URI).
-
Sandro Boccuzzo, Harald C. Gall, Multi-Touch Collaboration for Software Exploration, Proceedings of the International Conference on Program Comprehension (ICPC'10) 2010. (inproceedings)
Software systems have grown so complex and their design is so intricate that no individual can grasp the whole picture. Touch screen technology combined with 3D software visualization offers a promising way for the software engineers involved in a project to share knowledge about a software system in an intuitive way. In this paper we present first results on how such emerging technologies can be combined to support software exploration tasks, such as identifying high-impact changes or revealing problematic parts of the design. As demonstrated with a scenario, this turns the collaborative environment into a vehicle usable during software reviews.
-
Emanuel Giger, Martin Pinzger, Harald C. Gall, Predicting the fix time of bugs, Proceedings of the 2nd International Workshop on Recommendation Systems for Software Engineering, May 2010. (inproceedings)
Two important questions concerning the coordination of development effort are which bugs to fix first and how long it takes to fix them. In this paper we investigate empirically the relationships between bug report attributes and the time to fix. The objective is to compute prediction models that can be used to recommend whether a new bug should and will be fixed fast or will take more time for resolution. We examine in detail if attributes of a bug report can be used to build such a recommender system. We use decision tree analysis to compute prediction models and 10-fold cross validation to test them. We explore prediction models in a series of empirical studies with bug report data of six systems of the three open source projects Eclipse, Mozilla, and Gnome. Results show that our models perform significantly better than random classification. For example, fast-fixed Eclipse Platform bugs were classified correctly with a precision of 0.654 and a recall of 0.692. We also show that the inclusion of post-submission bug report data of up to one month can further improve prediction models.
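The evaluation protocol named in this abstract (10-fold cross validation, reported as precision and recall) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy labels are assumptions made for illustration, and the decision tree learner itself is omitted.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split the indices 0..n-1 into k (train, test) pairs.

    Each index appears in exactly one test fold; fold sizes differ by at most one.
    """
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    # Fold i takes every k-th index starting at i.
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def precision_recall(
    actual: Sequence[str], predicted: Sequence[str], positive: str
) -> Tuple[float, float]:
    """Precision and recall for one class, e.g. the hypothetical label 'fast'."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In such a setup a model would be trained on each `train` index list and scored on the matching `test` list, with the per-fold precision and recall then aggregated.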
-
Ahmed Lamkanfi, Serge Demeyer, Emanuel Giger, Bart Goethals, Predicting the severity of a reported bug, Proceedings of the 7th Working Conference on Mining Software Repositories 2010. (inproceedings)
The severity of a reported bug is a critical factor in deciding how soon it needs to be fixed. Unfortunately, while clear guidelines exist on how to assign the severity of a bug, it remains an inherently manual process left to the person reporting the bug. In this paper we investigate whether we can accurately predict the severity of a reported bug by analyzing its textual description using text mining algorithms. Based on three cases drawn from the open-source community (Mozilla, Eclipse and GNOME), we conclude that given a training set of sufficient size (approximately 500 reports per severity), it is possible to predict the severity with a reasonable accuracy (both precision and recall vary between 0.65-0.75 with Mozilla and Eclipse; 0.70-0.85 in the case of GNOME).
-
Giacomo Ghezzi, Harald C. Gall, SOFAS Architecture, University of Zurich, Department of Informatics, Software Evolution and Architecture Lab, 01 2010. (techreport)
-
Giacomo Ghezzi, SOFAS: software analysis services, Editor(s): Jeff Kramer, Judith Bishop, Premkumar Devanbu, Sebastian Uchitel, May 2010, ACM. (inproceedings)
-
Michael Würsch, Giacomo Ghezzi, Gerald Reif, Harald C. Gall, Supporting Developers with Natural Language Queries, Proceedings of the 32nd International Conference on Software Engineering, May 2010, IEEE Computer Society. (inproceedings)
The feature list of modern IDEs is growing steadily and mastering these tools becomes more and more demanding, especially for novice programmers. Despite their remarkable capabilities, IDEs often still cannot directly answer the questions that arise during program comprehension tasks. Instead, developers have to map their questions to multiple concrete queries that can be answered only by combining several tools and examining the output of each of them manually to distill an appropriate answer. Existing approaches have in common that they are either limited to a set of predefined, hardcoded questions, or that they require learning a specific query language only suitable for that limited purpose. We present a framework to query for information about a software system using guided-input natural language resembling plain English. For that, we model data extracted by classical software analysis tools with an OWL ontology and use knowledge processing technologies from the Semantic Web to query it. We also present a case study that demonstrates how our framework can be used to answer queries about static source code information for program comprehension purposes.
-
Matthias Hert, Gerald Reif, Harald C. Gall, Updating Relational Data via SPARQL/Update, EDBT Workshop Proceedings, March 2010. (inproceedings)
Relational Databases (RDBs) are used in most current enterprise environments to store and manage data. The semantics of the data is not explicitly encoded in the relational model, but implicitly at the application level. Ontologies and Semantic Web technologies provide explicit semantics that allows data to be shared and reused across application, enterprise, and community boundaries. Converting all relational data to RDF is often not feasible; therefore, we adopt a mediation approach for ontology-based access to RDBs. Existing mapping approaches focus on read-only access via SPARQL or as Linked Data, but other data access interfaces exist, including approaches for updating RDF data. In this paper we present OntoAccess, an extensible platform for ontology-based read and write access to existing relational data. It encapsulates the translation logic in the core layer that provides the foundation of an extensible set of data access interfaces in the interface layer. We further present the formal definition of our RDB-to-RDF mapping, the architecture of our mediator platform, and a performance evaluation of the prototype implementation.
-
Patrick Knab, Martin Pinzger, Harald C. Gall, Visual Patterns in Issue Tracking Data, New Modeling Concepts for Today's Software Processes 2010, Springer. (inproceedings)
Software development teams gather valuable data about features and bugs in issue tracking systems. This information can be used to measure and improve the efficiency and effectiveness of the development process. In this paper we present an approach that harnesses the extraordinary capability of the human brain to detect visual patterns.
We specify generic visual process patterns that can be found in issue tracking data. With these patterns we can analyze information about effort estimation and about the length and sequence of problem resolution activities.
In an industrial case study we apply our interactive tool to identify instances of these patterns and discuss our observations.
Our approach was validated through extensive discussions with multiple project managers and developers, as well as feedback from the project review board.
2009
-
Beat Fluri, Michael Würsch, Emanuel Giger, Harald C. Gall, Analyzing the co-evolution of comments and source code, Software Quality Journal Vol. 17 (4), September 2009. (article)
Source code comments are a valuable instrument to preserve design decisions and to communicate the intent of the code to programmers and maintainers. Nevertheless, commenting source code and keeping the comments up to date is often neglected for reasons of time or because of programmers' obliviousness. In this paper, we investigate whether developers comment their code and to what extent they add comments or adapt them when they evolve the code. We present an approach to associate comments with source code entities to track their co-evolution over multiple versions. A set of heuristics is used to decide whether a comment is associated with its preceding or its succeeding source code entity. We analyzed the co-evolution of code and comments in eight different open source and closed source software systems. We found with statistical significance that (1) the relative amount of comments and source code grows at about the same rate; (2) the type of a source code entity, such as a method declaration or an if-statement, has a significant influence on whether or not it gets commented; (3) in six out of the eight systems, code and comments co-evolve in 90 percent of the cases; and (4) surprisingly, API changes and comments do not co-evolve but they are re-documented in a later revision. As a result, our approach enables a quantitative assessment of the commenting process in a software system. We can, therefore, leverage the results to provide feedback during development to increase the awareness of when to add comments or when to adapt comments because of source code changes.
-
Sandro Boccuzzo, Harald C. Gall, Automated Comprehension Tasks in Software Exploration, ASE '09: Proceedings of the 2009 International Conference on Automated Software Engineering 2009. (inproceedings/Short Paper)
Finding issues in software usually requires a series of comprehension tasks. After every task, an engineer explores the results and decides whether further tasks are required. Software comprehension therefore is a combination of tasks and a supported exploration of the results, typically in an adequate visualization. In this paper, we describe how we simplify the combination of existing automated procedures to sequentially solve common software comprehension tasks. Beyond that, we improve the understanding of the outcomes with interactive and explorative visualization concepts in a time-efficient workflow. We validate the presented concept with basic comprehension tasks in an extended CocoViz tool implementation.
-
Harald C. Gall, Beat Fluri, Martin Pinzger, Change Analysis with Evolizer and ChangeDistiller, IEEE Software Vol. 26 (1), January/February 2009. (article)
-
Sandro Boccuzzo, CocoViz with ambient audio software exploration, ICSE '09: Proceedings of the 2009 IEEE 31st International Conference on Software Engineering 2009. (inproceedings)
For ages we have used our ears side by side with our visual stimuli to gather additional information, leading and supporting us in our visualization. Nowadays numerous software visualization techniques exist that aim to facilitate program comprehension. In this paper we discuss how we can support such software comprehension visualization with environmental audio and lead users to identify relevant aspects. We use cognitive visualization techniques and audio concepts described in our previous work to create an ambient audio software exploration (AASE) out of program entities (packages, classes ...) and their mapped properties. The concepts were implemented in an extended version of our tool called CocoViz. Our first results with the prototype show that with this combination of visual and aural means we can provide additional information to lead users during program comprehension tasks.
-
Thomas Zimmermann, Nachiappan Nagappan, Harald C. Gall, Emanuel Giger, Brendan Murphy, Cross-project defect prediction: a large scale experiment on data vs. domain vs. process, ESEC/FSE '09: Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering 2009, ACM. (inproceedings)
-
Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald C. Gall, Brendan Murphy, Does distributed development affect software quality? An empirical case study of Windows Vista, ICSE '09: Proceedings of the 2009 IEEE 31st International Conference on Software Engineering 2009, IEEE Computer Society. (inproceedings)
-
Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald C. Gall, Brendan Murphy, Does distributed development affect software quality?: an empirical case study of Windows Vista, Communications of the ACM Vol. 52 (8), August 2009. (article)
-
Sandro Boccuzzo, Richard Wettel, Sazzadul Alam, Philippe Dugerdil, Harald C. Gall, Michele Lanza, EvoSpaces - Multi-dimensional Navigation Spaces for Software Evolution Vol. LNCS 5440, Springer 2009. (inbook)
In software development, a major difficulty comes from the intrinsic complexity of software systems, whose size can easily reach millions of lines of code. But software is an intangible artifact that does not have any natural visual representation. While many software visualization techniques have been proposed in the literature, they are often difficult to interpret. In fact, the user of such representations is confronted with an artificial world that contains and represents intangible objects. The goal of our EVOSPACES project was to investigate effective visual metaphors (i.e., analogies) between natural objects and software objects so that we can exploit the cognitive understanding of the user. The difficulty of the approach is that the common sense expectations about the displayed world should also apply to the world of software objects. To solve this common sense representation problem for software objects our project addressed both the small-scale (i.e., the level of individual objects) and the large-scale (i.e., the level of groups of objects). After many experiments we decided on a "city" metaphor: at the small scale we included different houses and their shapes as visual objects to cover size, structure and history. At the large-scale level we arrange the different types of houses in districts and include their history in diverse layouts. The user then is able to use the EVOSPACES virtual software city to navigate and explore all kinds of aspects of a city and its houses: size, age, historical evolution, changes, growth, restructuring, and evolution patterns such as code smells or architectural decay. For that we have developed a software environment named EVOSPACES as a plug-in to Eclipse so that visual metaphors can quickly be implemented in an easily navigable virtual space. Due to the large amount of information we complemented the flat 2D world with a full-fledged immersive 3D representation.
In this virtual software city, the dimensions and appearance of the buildings can be set according to software metrics. The user of the EVOSPACES environment can then explore a given software system by navigating through the corresponding virtual city.
-
Harald C. Gall, Gerald Reif, ICSE 2009 Tutorial - Semantic Web Technologies in Software Engineering, 31st International Conference on Software Engineering (ICSE 2009), May 18 2009. (inproceedings/tutorial)
Over the years, the software engineering community has developed various tools to support the specification, development, and maintenance of software. Many of these tools use proprietary data formats to store artifacts, which hampers interoperability. On the other hand, the Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. Ontologies are used to define the concepts in the domain of discourse and their relationships and as such provide the formal vocabulary applications use to exchange data. Besides the Web, the technologies developed for the Semantic Web have proven to be useful also in other domains, especially when data is exchanged between applications from different parties. Software engineering is one of these domains in which recent research shows that Semantic Web technologies are able to reduce the barriers of proprietary data formats and enable interoperability.
In this tutorial, we present Semantic Web technologies and their application in software engineering. We discuss the current status of ontologies for software entities, bug reports, or change requests, as well as semantic representations for software and its documentation. This way, architecture, design, code, or test models can be shared across application boundaries enabling a seamless integration of engineering results.
-
Patrick Knab, Martin Pinzger, Beat Fluri, Harald C. Gall, Interactive Views for Analyzing Problem Reports, ICSM '09 Proceedings of the 25th International Conference on Software Maintenance 2009. (inproceedings)
Issue tracking repositories contain a wealth of information for reasoning about various aspects of software development processes. In this paper, we focus on bug triaging and provide visual means to explore the effort estimation quality and the bug life-cycle of reported problems.
Our approach follows the Micro/Macro reading technique and uses a combination of graphical views to investigate details of individual problem reports while maintaining the context provided by the surrounding data population. This enables the detection and detailed analysis of hidden patterns and facilitates the analysis of problem report outliers.
In an industrial study, we use our approach in various problem report analysis scenarios and answer questions related to effort estimation and resource planning.
-
Matthias Hert, Gerald Reif, Harald C. Gall, Personal Knowledge Mapping with Semantic Web Technologies, Proceedings of the 1st International Workshop on Personal Knowledge Management at the 5th Conference on Professional Knowledge Management, March 2009. (inproceedings)
Semantic Web technologies promise great benefits for Personal Knowledge Management (PKM) and Knowledge Management (KM) in general when data needs to be exchanged or integrated. However, the Semantic Web also introduces new issues rooted in its distributed nature as multiple ontologies exist to encode data in the Personal Information Management (PIM) domain. This poses problems for applications processing this data as they would need to support all current and future PIM ontologies. In this paper, we introduce an approach that decouples applications from the data representation by providing a mapping service which translates Semantic Web data between different vocabularies. Our approach consists of the RDF Data Transformation Language (RDTL) to define mappings between different but related ontologies and the prototype implementation RDFTransformer to apply mappings. This allows the definition of mappings that are more complex than simple one-to-one matches.
-
Tobias Bannwart, Amancio Bouza, Gerald Reif, Abraham Bernstein, Private Cross-page Movie Recommendations with the Firefox add-on OMORE, 8th International Semantic Web Conference (ISWC 2009), October 2009. (inproceedings/Semantic Web Challenge)
Online stores and Web portals bring information about a myriad of items such as books, CDs, restaurants or movies to the user's fingertips. Although the Web reduces the barrier to the information, the user is overwhelmed by the number of available items. Therefore, recommender systems aim to guide the user to relevant items. Current recommender systems store user ratings on the server side. This way the scope of the recommendations is limited to this server only. In addition, the user entrusts the operator of the server with valuable information about his preferences.
Thus, we introduce the private, personal movie recommender OMORE, which learns the user model based on the user's movie ratings. To preserve privacy, OMORE is implemented as a Firefox add-on which stores the user ratings and the learned user model locally at the client side. Although OMORE uses the features from the movie pages on the IMDb site, it is not restricted to IMDb only. To enable cross-referencing between various movie sites such as IMDb, Amazon.com, Blockbuster, Netflix, Jinni, or Rotten Tomatoes we introduce the movie cross-reference database LiMo which contributes to the Linked Data cloud.
-
Amancio Bouza, Gerald Reif, Abraham Bernstein, Probabilistic Partial User Model Similarity for Collaborative Filtering, Proceedings of the 1st International Workshop on Inductive Reasoning and Machine Learning on the Semantic Web (IRMLeS2009) at the 6th European Semantic Web Conference (ESWC2009), June 2009. (inproceedings)
Recommender systems play an important role in helping people find items they like. One type of recommender system is user-based collaborative filtering. The fundamental assumption of user-based collaborative filtering is that people who share similar preferences for common items behave similarly in the future. The similarity of user preferences is computed globally on common rated items such that partial preference similarities might be missed. Consequently, valuable ratings of partially similar users are ignored. Furthermore, two users may even have similar preferences but the set of common rated items is too small to infer preference similarity. We propose, first, an approach that computes user preference similarities based on learned user preference models and, second, a method to compute partial user preference similarities based on partial user model similarities. For users with few common rated items, we show that user similarity based on preferences significantly outperforms user similarity based on common rated items.
-
Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald C. Gall, Brendan Murphy, Putting it All Together: Using Socio-Technical Networks to Predict Failures, ISSRE '09: Proceedings of the 20th International Symposium on Software Reliability, November 2009, IEEE Computer Society. (inproceedings)
-
Matthias Hert, Relational Databases as Semantic Web Endpoints, Proceedings of the 6th European Semantic Web Conference (ESWC), June 2009, Springer. (inproceedings/PhD Symposium)
This proposal explores the promotion of existing relational databases to Semantic Web Endpoints. It presents the benefits of ontology-based read and write access to existing relational data as well as the need for specialized, scalable reasoning over that data. We introduce our approach for translating SPARQL/Update operations to SQL, describe how scalable reasoning can be realized by using the power of the database system, and outline two case studies for evaluating our approach.
-
Patrick Knab, Martin Pinzger, Harald C. Gall, Smart views for analyzing problem reports: tool demo, ESEC/FSE '09: Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering 2009, ACM. (inproceedings)
Issue tracking repositories contain a wealth of information for reasoning about various aspects of software development processes. In this paper, we focus on bug triaging and provide visual means to explore the effort estimation quality and the bug life-cycle of reported problems.
Our approach uses a combination of graphical views to investigate details of individual problem reports while maintaining the context provided by the surrounding data population. This enables the detection and detailed analysis of hidden patterns and facilitates the analysis of problem report outliers.
-
Michael Jehle, Kevin Leopold, Linard Moll, Anthony Lymer, Software Evolution Recognition and Visualization Information Service, University of Zurich, Department of Informatics, December 2009. (techreport)
-
Matthias Hert, Gerald Reif, Harald C. Gall, SPARQL/Update for Relational Databases, Proceedings of the 6th European Semantic Web Conference (ESWC), June 2009. (inproceedings/Poster)
We present an approach for ontology-based read and write access to existing Relational Databases (RDBs). SPARQL/Update serves as the data manipulation language that is translated to equivalent SQL commands according to mappings between the RDBs and the Semantic Web. This addition of write support enables a full integration of existing relational data into Semantic Web applications.
-
Jayalath Ekanayake, Jonas Tappolet, Harald C. Gall, Abraham Bernstein, Tracking Concept Drift of Software Projects Using Defect Prediction Quality, Proceedings of the 6th IEEE Working Conference on Mining Software Repositories, May 2009, IEEE Computer Society. (inproceedings)
Defect prediction is an important task in the mining of software repositories, but the quality of predictions varies strongly within and across software projects. In this paper we investigate the reasons why the prediction quality fluctuates so much due to the changing nature of the bug (or defect) fixing process. To this end, we adopt the notion of concept drift, which denotes that the defect prediction model has become unsuitable because the set of influencing features has changed, usually due to a change in the underlying bug generation process (i.e., the concept). We explore four open source projects (Eclipse, OpenOffice, Netbeans and Mozilla) and construct file-level and project-level features for each of them from their respective CVS and Bugzilla repositories. We then use this data to build defect prediction models and visualize the prediction quality along the time axis. These visualizations allow us to identify concept drifts and, as a consequence, phases of stability and instability expressed in the level of defect prediction quality. Further, we identify those project features that influence the defect prediction quality using both a tree-induction algorithm and a linear regression model. Our experiments uncover that software systems are subject to considerable concept drifts in their evolution history. Specifically, we observe that the change in the number of authors editing a file and the number of defects fixed by them contribute to a project's concept drift and therefore influence the defect prediction quality. Our findings suggest that project managers using defect prediction models for decision making should be aware of the actual phase of stability or instability due to a potential concept drift.
2008
-
Yu Zhou, Michael Würsch, Emanuel Giger, Harald C. Gall, Jian Lue, A Bayesian Network Based Approach for Change Coupling Prediction, WCRE '08: Proceedings of the 2008 15th Working Conference on Reverse Engineering 2008, IEEE Computer Society. (inproceedings)
Source code coupling and change history are two important data
sources for change coupling analysis. The popularity of public open
source projects in recent years makes both sources available. Based
on our previous research, in this paper, we inspect different
dimensions of software changes including change significance or source code
dependency levels, extract a set of features from the two
sources and propose a Bayesian network-based approach for
change coupling prediction. By combining the features from the co-changed entities
and their dependency relation, the approach can model the underlying
uncertainty. The empirical case study on two medium-sized
open source projects demonstrates the feasibility and effectiveness
of our approach compared to previous work.
-
Yu Zhou, A Runtime Architecture-Based Approach for the Dynamic Evolution of Distributed Component-Based Systems, Proc. of International Conference on Software Engineering, Doctoral Symposium 2008. (inproceedings)
Dynamic evolution of distributed component-based systems (DCS) is an important task in software engineering. This process poses several challenges, for example, how to preserve consistency during evolution and how to reflect the abstract evolution specification in the concrete reconfiguration implementation. Having observed the generality of software architecture, researchers have proposed various architectural description languages (ADLs), enabling evolution techniques, etc. to investigate the problem. These approaches typically employ the formal semantics of dynamic ADLs at the incremental levels of refinement in the design phase or the explicit maintenance of software architecture at runtime. However, different ADLs usually address different concerns, and the lack of runtime support for the causal relation between ADLs and the running system easily leads to a mismatch between them, which inevitably sacrifices their usability. We propose an approach based on a runtime architecture which is visually generated from an attributed type graph meta-model, exists through the lifecycle of DCS, establishes the causal relation between architectural topology and system configuration, and directs the dynamic evolution.
-
Martin Pinzger, Katja Gräfenhain, Patrick Knab, Harald C. Gall, A Tool for Visual Understanding of Source Code Dependencies, Proceedings of the International Conference on Program Comprehension (ICPC'08) 2008, IEEE Computer Society. (inproceedings)
Many program comprehension tools use graphs to visualize
and analyze source code. The main issue is that existing
approaches create graphs overloaded with too much
information. Graphs contain hundreds of nodes and even
more edges that cross each other. Understanding these
graphs and using them for a given program comprehension
task is tedious, and in the worst case developers stop using
the tools. In this paper we present DA4Java, a graph-based
approach for visualizing and analyzing static dependencies
between Java source code entities. The main contribution
of DA4Java is a set of features to incrementally
compose graphs and remove irrelevant nodes and edges
from graphs. This leads to graphs that contain significantly
fewer nodes and edges and need less effort to understand.
-
Yi Guo, Adrian Schwaninger, Harald C. Gall, An Architecture for an Adaptive and Collaborative Learning Management System in Aviation Security, 17th IEEE International Workshop in Enabling Technologies: Infrastructures for Collaborative Enterprises, Workshop for Distributed and Mobile Collaboration (DMC 2008), June 2008, IEEE Computer Society. (inproceedings)
The importance of aviation security has increased dramatically in recent years. Frequently changing regulations and the need to adapt quickly to new and emerging threats are challenges that need to be addressed by airports, security companies and appropriate authorities across the world. Learning Management Systems (LMS) have been developed as effective tools for enhancing the management, integration and application of knowledge in organizations. In the aviation security domain, we need mechanisms to quickly adapt to new learning content, to different roles ranging from screeners to supervisors, to flexible training scenarios and solid job assessments. For that, a learning system has to be flexible and adaptive in the knowledge, organizational, and collaboration dimensions. Current LMS do not meet these requirements. In this paper we present a software architecture that is apt to support the adaptability and collaboration needs for such a system in aviation security. We discuss the requirements, roles, learning objects and course configuration in terms of adaptive and collaborative learning. We present a six-layer architecture and discuss some of its application scenarios. Our aim is to improve the quality and usefulness of LMS in aviation security by utilizing knowledge-based analysis for data analysis and integrating a process engine for collaborative learning. We briefly report on our prototype and the first feedback gained from the users.
-
Martin Pinzger, Nachiappan Nagappan, Brendan Murphy, Can Developer-Module Networks Predict Failures?, Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering 2008, ACM. (inproceedings)
Software teams should follow a well defined goal and keep their
work focused. Work fragmentation is bad for efficiency and
quality. In this paper we empirically investigate the relationship
between the fragmentation of developer contributions and the
number of post-release failures. Our approach is to represent
developer contributions with a developer-module network that we
call contribution network. We use network centrality measures to
measure the degree of fragmentation of developer contributions.
Fragmentation is determined by the centrality of software modules
in the contribution network. Our claim is that central software
modules are more likely to be failure-prone than modules located
in surrounding areas of the network. We analyze this hypothesis
by exploring the network centrality of Microsoft Windows Vista
binaries using several network centrality measures as well as
linear and logistic regression analysis. In particular, we investigate
which centrality measures are significant to predict the probability
and number of post-release failures. Results of our experiments
show that central modules are more failure-prone than modules
located in surrounding areas of the network. Results further
confirm that number of authors and number of commits are
significant predictors for the probability of post-release failures.
For predicting the number of post-release failures the closeness
centrality measure is most significant.
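As an illustration of the closeness centrality measure mentioned above, the following minimal sketch builds a small developer-module network and computes closeness with a breadth-first search. The commit data, the node naming, and the normalization (reachable nodes divided by the sum of shortest-path lengths) are assumptions made for this example; the paper's exact measures and data are not reproduced here.

```python
from collections import deque


def build_contribution_network(commits):
    """Undirected graph linking each developer to the modules they changed."""
    graph = {}
    for developer, module in commits:
        graph.setdefault(developer, set()).add(module)
        graph.setdefault(module, set()).add(developer)
    return graph


def closeness_centrality(graph, node):
    """Number of reachable nodes divided by the sum of their distances."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # plain breadth-first search over unweighted edges
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


# Hypothetical (developer, module) pairs, invented for this example.
commits = [
    ("dev:ann", "mod:core"),
    ("dev:bob", "mod:core"),
    ("dev:cat", "mod:core"),
    ("dev:cat", "mod:ui"),
]
network = build_contribution_network(commits)

for module in ("mod:core", "mod:ui"):
    print(module, round(closeness_centrality(network, module), 3))
```

Here the module touched by all three developers ends up more central than the module touched by one, which matches the intuition that central modules receive fragmented contributions.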
-
Beat Fluri, Emanuel Giger, Harald C. Gall, Discovering Patterns of Change Types, Proceedings of the 23rd International Conference on Automated Software Engineering, September 2008, IEEE Computer Society. (inproceedings)
The reasons why software is changed are manifold: new features are added, bugs have to be fixed, or the consistency of coding rules has to be re-established. Since there are many types of source code changes we want to explore whether they appear frequently together in time and whether they describe specific development activities. We describe a semi-automated approach to discover patterns of such change types using agglomerative hierarchical clustering. We extracted source code changes of one commercial and two open-source software systems and applied the clustering. We found that change type patterns do describe development activities and affect the control flow, the exception flow, or change the API.
-
Martin Pinzger, Harald C. Gall, Michael Fischer, Emerging Methods, Technologies and Process Management in Software Engineering, John Wiley 2008. (inbook)
-
Cathrin Weiss, Abraham Bernstein, Sandro Boccuzzo, i-MoCo: Mobile Conference Guide - Storing and querying huge amounts of Semantic Web data on the iPhone/iPod Touch, October 2008. (misc)
Querying and storing huge amounts of Semantic Web data has usually required a lot of computational power. This is no longer true if one makes use of recent research outcomes like modern RDF indexing strategies. We present a mobile conference guide application that combines several different RDF data sets to present interlinked information about publications, conferences, authors, locations, and others to the user. With our application we show that it is possible to store a large amount of indexed data on an iPhone/iPod Touch device. That querying is also efficient is demonstrated by creating the application's actual content out of real-time queries on the data.
-
Martin Pinzger, Katja Gräfenhain, Patrick Knab, Harald C. Gall, Incremental Visual Understanding of Java Source Code, Department of Informatics, University of Zurich 2008. (techreport)
-
Jacek Ratzinger, Thomas Sigmund, Harald C. Gall, On the Relation of Refactorings and Software Defect Prediction, MSR 2008. (inproceedings)
This paper analyzes the influence of evolution activities such as refactoring on software defects. In a case study of five open source projects we used attributes of software evolution to predict defects in time periods of six months. We use versioning and issue tracking systems to extract 110 data mining features, which are separated into refactoring and non-refactoring related features. These features are used as input into classification algorithms that create prediction models for software defects. We found that refactoring related features as well as non-refactoring related features lead to high quality prediction models. Additionally, we discovered that refactorings and defects have an inverse correlation: the number of software defects decreases if the number of refactorings increased in the preceding time period. As a result, refactoring should be a significant part of both bug fixes and other evolutionary changes to reduce software defects.
-
Beat Fluri, Jonas Zuberbuehler, Harald C. Gall, Recommending Method Invocation Context Changes, Proceedings of the 1st International Workshop on Recommender Systems for Software Engineering, November 2008, ACM. (inproceedings)
Our investigations of bug fixes in Eclipse showed that a significant number of bugs were fixed by moving invocations of certain methods into the then- or else-part of if-statements with similar conditions. Based on this finding, we leverage such context changes applied in the past to support developers while adding invocations of the same method. In this paper we present ChangeCommander, an Eclipse plugin that implements our approach to recommend insertions of particular if-statements before calling a method. ChangeCommander presents context change suggestions by highlighting affected method invocations in the source code and provides automated code adaptation support.
-
Amancio Bouza, Gerald Reif, Abraham Bernstein, Harald C. Gall, SemTree: Ontology-Based Decision Tree Algorithm for Recommender Systems, In Proceedings of the 7th International Semantic Web Conference, October 2008. (inproceedings/Poster)
Recommender systems play an important role in supporting people when choosing items from an overwhelmingly large number of choices. So far, no recommender system makes use of domain knowledge. We model user preferences with a machine learning approach to recommend items to people by predicting the item ratings. Specifically, we propose SemTree, an ontology-based decision tree learner that uses a reasoner and an ontology to semantically generalize item features to improve the effectiveness of the decision tree built. We show that SemTree outperforms comparable approaches by producing more accurate recommendations that take domain knowledge into account.
-
Marco D'Ambros, Harald C. Gall, Michele Lanza, Martin Pinzger, Software Evolution, SpringerLink 2008. (inbook)
Software repositories such as versioning systems, defect tracking systems, and
archived communication between project personnel are used to help manage the progress of
software projects. Software practitioners and researchers increasingly recognize the potential
benefit of mining this information to support the maintenance of software systems, improve
software design or reuse, and empirically validate novel ideas and techniques. Research is
now proceeding to uncover ways in which mining these repositories can help to understand
software development, to support predictions about software development, and to plan various
evolutionary aspects of software projects.
This chapter presents several analysis and visualization techniques to understand software
evolution by exploiting the rich sources of artifacts that are available. Based on the data models
that need to be developed to cover sources such as modification and bug reports we describe
how to use a Release History Database for evolution analysis. For that we present approaches
to analyze developer effort for particular software entities. Further we present change coupling
analyses that can reveal hidden change dependencies among software entities. Finally, we
show how to investigate architectural shortcomings over many releases and to identify trends
in the evolution. Kiviat graphs can be effectively used to visualize such analysis results.
-
Sandro Boccuzzo, Harald C. Gall, Software Visualization with Audio Supported Cognitive Glyphs, 24th IEEE International Conference on Software Maintenance (ICSM 2008) 2008, IEEE Computer Society. (inproceedings)
There exist numerous software visualization techniques that aim to facilitate program comprehension. One of the main concerns in every such software visualization is to identify relevant aspects fast and provide information in an effective way. In previous work, we developed a cognitive visualization technique and tool called CocoViz that uses commonplace metaphors for an intuitive understanding of software structures and evolution. In this paper, we address software comprehension by a combination of visualization and audio. Evolution and structural aspects are annotated with different audio to represent concepts such as design erosion, code smells or evolution metrics. We use audio concepts such as loudness, sharpness, tone pitch, roughness or oscillation and map those to properties of classes and packages. As such we provide an audio annotation of software entities along their version history for software analysis and software browsing. Our first results with the prototype and a small user study show that with this combination of visual and aural means we can facilitate program comprehension and provide additional information that usually is not provided by current visualization approaches.
-
Ansgar Bernardi, Stefan Decker, Ludger van Elst, Gunnar Grimnes, Tudor Groza, Siegfried Handschuh, Mehdi Jazayeri, Cedric Mesnage, Knud Möller, Gerald Reif, Michael Sintek, The Social Semantic Desktop - A New Paradigm Towards Deploying the Semantic Web on the Desktop, Semantic Web Engineering in the Knowledge Society, Editor(s): Jorge Cardoso, Miltiadis D. Lytras; 2008, IGI Global. (incollection)
This chapter introduces the general vision of the Social Semantic Desktop (SSD) and details it in the context of the NEPOMUK project. It outlines the typical SSD requirements and functionalities that were identified from real world scenarios. In addition, it provides the design of the standard SSD architecture together with the ontology pyramid developed to support it. Finally, the chapter gives an overview of some of the technical challenges that arise from the actual development process of the SSD.
-
Giacomo Ghezzi, Harald C. Gall, Towards Software Analysis as a Service, Proceedings of Evol'08, the 4th Intl. ERCIM Workshop on Software Evolution and Evolvability at the 23rd IEEE/ACM Intl. Conf. on Automated Software Engineering, September 2008. (inproceedings)
Throughout the years software engineers have come up with a myriad of specialized tools and techniques that focus on a certain type of analysis, such as metrics extraction, evolution tracking, co-change detection, bug prediction, all the way up to social network analysis of team dynamics.
However, easy and straightforward synergies between these analyses/tools rarely exist because of their stand-alone nature, their platform dependence, their different input and output formats and the variety of systems to analyze. This significantly hampers their usage and reduces their acceptance by other researchers and software companies.
To overcome this problem we propose a distributed and collaborative software analysis platform to enable a seamless interoperability of software analysis tools across platform, geographical and organizational boundaries. In particular, we devise software analysis tools as services that can be accessed and composed over the Internet. These distributed services shall be widely accessible through a software analysis broker where organizations and research groups can register and share their tools.
To enable (semi)-automatic use and composition of these tools, they are classified and mapped into a software analysis taxonomy and adhere to specific meta-models and ontologies for their category of analysis.
-
Harald C. Gall, Gerald Reif, Tutorial - Semantic Web Technologies in Software Engineering, 30th International Conference on Software Engineering (ICSE 2008), May 12 2008. (inproceedings/tutorial)
Over the years, the software engineering community has developed various tools to support the specification, development, and maintenance of software. Many of these tools use proprietary data formats to store artifacts, which hampers interoperability. However, the Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. Ontologies are used to define the concepts in the domain of discourse and their relationships and as such provide the formal vocabulary applications use to exchange data. Besides the Web, the technologies developed for the Semantic Web have proven to be useful also in other domains, especially when data is exchanged between applications from different parties. Software engineering is one of these domains in which recent research shows that Semantic Web technologies are able to reduce the barriers of proprietary data formats and enable interoperability.
In this tutorial, we present Semantic Web technologies and their application in software engineering. We discuss the current status of ontologies for software entities, bug reports, or change requests, as well as semantic representations for software and its documentation. This way, architecture, design, code, or test models can be shared across application boundaries enabling a seamless integration of engineering results.
2007
-
Marco D'Ambros, Michele Lanza, Martin Pinzger, "A Bug's Life" Visualizing a Bug Database, Proceedings of IEEE International Workshop on Visualizing Software for Understanding and Analysis (VisSoft 2007) 2007, IEEE Computer Society. (inproceedings)
-
Gian Marco Laube, Gerald Reif, Harald C. Gall, Architectural Issues of the Semantic Clipboard as Ontology Mediation Service, 1st Workshop on Architecture, Design, and Implementation of the Semantic Desktop (SemDeskDesign2007) at the European Semantic Web Conference ESWC2007, June 2007. (inproceedings)
When copying and pasting data between applications using the operating system clipboard, the semantics of the transferred information is usually lost. Using Semantic Web technologies these semantics can be explicitly defined in a machine-processable way. In previous research we developed a prototype to show the feasibility and benefits of a semantically enriched clipboard, which was limited in the number of ontologies it could handle and the applications that could access it. In this paper we introduce an advanced architecture for the Semantic Clipboard that incorporates the standard communication paradigm of operating system clipboards and is able to handle RDF graphs of arbitrary domains of interest. This architecture includes a data mediation service that overcomes vocabulary heterogeneities between source and target applications.
-
Beat Fluri, Assessing Changeability by Investigating the Propagation of Change Types, Proceedings of the 29th International Conference on Software Engineering, May 2007, IEEE Computer Society. (inproceedings)
We propose an approach to build a changeability assessment model for source code entities. Based on this model, we will assess the changeability of evolving software systems.
The changeability assessment is based on a taxonomy of more than 30 change types and a classification of these in terms of change significance levels for consecutive versions of software entities. We consider change type propagation on different levels of granularity ranging from method changes to interface and class changes.
We claim that this kind of assessment is effective in pointing to potential causes of maintainability problems in evolving software systems.
-
Beat Fluri, Michael Würsch, Martin Pinzger, Harald C. Gall, Change Distilling: Tree Differencing for Fine-Grained Source Code Change Extraction, IEEE Transactions on Software Engineering Vol. 33 (11), November 2007. (article)
A key issue in software evolution analysis is the identification of particular changes that occur across several versions of a program. We present change distilling, a tree differencing algorithm for fine-grained source code change extraction. For that, we have improved the existing algorithm of Chawathe et al. for extracting changes in hierarchically structured data. Our algorithm extracts changes by finding both a match between the nodes of the two compared abstract syntax trees and a minimum edit script that can transform one tree into the other given the computed matching. As a result, we can identify fine-grained change types between program versions according to our taxonomy of source code changes. We evaluated our change distilling algorithm with a benchmark we developed that consists of 1,064 manually classified changes in 219 revisions of eight methods from three different open source projects. We achieved significant improvements in extracting types of source code changes: our algorithm approximates the minimum edit script 45% better than the original change extraction approach by Chawathe et al. We are able to find all occurring changes and almost reach the minimum conforming edit script, i.e., we reach a mean absolute percentage error of 34%, compared to 79% reached by the original algorithm. The paper describes both our change distilling algorithm and the results of our evaluation.
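For readers unfamiliar with the error measure quoted above, the following minimal sketch computes a mean absolute percentage error (MAPE). Applying it to edit script lengths is an assumption about how the measure is used, and the numbers below are invented for illustration; they are not taken from the paper's benchmark.

```python
def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent: the mean of |actual - predicted| / |actual|."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    relative_errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical edit script lengths: the manually determined minimum
# versus what an extraction algorithm produced for three revisions.
minimum_lengths = [4, 10, 5]
extracted_lengths = [5, 12, 5]

print(mean_absolute_percentage_error(minimum_lengths, extracted_lengths))
```

With these made-up values the relative errors are 25%, 20% and 0%, so the result is about 15%. A lower value means the extracted edit scripts are closer to the minimum ones.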
-
Sandro Boccuzzo, Harald C. Gall, CocoViz: Supported Cognitive Software Visualization, Proceedings of 14th Working Conference on Reverse Engineering (WCRE 2007) 2007, IEEE Computer Society. (inproceedings)
As software evolves and becomes more and more complex, program comprehension arises as a major concern in software projects. The amount of data and the complexity of relationships between the entities are unmanageable for engineers without effective tool support. In this paper, we demonstrate how CocoViz can help understanding software in a quick and intuitive manner. Some of the implemented approaches have been presented independently before. However, in CocoViz we combine them in an intuitive and easy to use manner.
-
Sandro Boccuzzo, Harald C. Gall, CocoViz: Towards Cognitive Software Visualizations, Proceedings of IEEE International Workshop on Visualizing Software for Understanding and Analysis (VisSoft 2007) 2007, IEEE Computer Society. (inproceedings)
Understanding software projects is a complex task. There is an increasing need for visualizations that improve comprehensiveness of the evolution of a software system. This paper discusses our recent work in software visualization with respect to metaphors. Our goal is to use simple and well-known graphical elements known from daily life such as houses, spears, or tables to allow a user a quick and intuitive understanding of a given visualization via their proportions. We present a software metrics configurator that handles different metaphors and allows optimizations to their graphical representation. The results so far show that large systems can be visualized effectively with metaphor glyphs, yet more case studies and more metaphor glyphs are required to offer a simple and cognitive visual understanding of a software system.
-
Gerald Reif, Tudor Groza, Siegfried Handschuh, Cedric Mesnage, Mehdi Jazayeri, Rosa Gudjonsdottir, Collaboration on the Social Semantic Desktop, Workshop on Ubiquitous Mobile Information and Collaboration Systems (UMICS 2007) at CAiSE 2007, June 2007, Springer. (inproceedings)
To accomplish the daily work people use several desktop applications to collaborate with co-workers. Each application is specialized on a specific domain, such as document management, email, or time planning. Although the data is distributed over several applications the data is highly interlinked from the user's point of view. The Social Semantic Desktop aims to take advantage of Semantic Web technologies on the computer's desktop to better support the user's mental working model and to enable collaboration over enterprise boundaries. In this paper we present our ongoing work on the Social Semantic Desktop as collaboration environment. We present the intended usage scenarios, discuss the required services and give an outlook on the architecture we envision for the Social Semantic Desktop.
-
Ksenia Ryndina, Jochen M. Küster, Harald C. Gall, Consistency of Business Process Models and Object Life Cycles, Models in Software Engineering 2007, Springer. (inproceedings)
-
Katharina Reinecke, Gerald Reif, Abraham Bernstein, Cultural User Modeling With CUMO: An Approach to Overcome the Personalization Bootstrapping Problem, First International Workshop on Cultural Heritage on the Semantic Web at the 6th International Semantic Web Conference (ISWC 2007), November 12 2007. (inproceedings)
The increasing interest in personalizable applications for heterogeneous user populations has heightened the need for a more efficient acquisition of start-up information about the user. We argue that the user's cultural background is suitable for predicting various adaptation preferences at once. With these as a basis, we can accelerate the initial acquisition process. The paper presents an approach to factoring culture into user models. We introduce the cultural user model ontology CUMO, describing how and to what extent it can accurately represent the user's cultural background. Furthermore, we outline its use as a re-usable and shared knowledge base in a personalization process, before presenting a plan of our future work towards cultural personalization.
-
Beat Fluri, Michael Würsch, Harald C. Gall, Do Code and Comments Co-Evolve? On the Relation Between Source Code and Comment Changes, Proceedings of the 14th Working Conference on Reverse Engineering, October 2007, IEEE Computer Society. (inproceedings)
Comments are valuable especially for program understanding and maintenance, but do developers comment their code? To what extent do they add comments or adapt them when they evolve the code? We examine the question whether source code and associated comments are really changed together along the evolutionary history of a software system.
In this paper, we describe an approach to map code and comments to observe their co-evolution over multiple versions. We investigated three open source systems (i.e., ArgoUML, Azureus, and JDT Core) and describe how comments and code co-evolved over time. Some of our findings show that: 1) newly added code, despite its growth rate, barely gets commented; 2) class and method declarations are commented most frequently, but method calls, for example, far less; and 3) 97% of comment changes are done in the same revision as the associated source code change.
-
Jacek Ratzinger, Martin Pinzger, Harald C. Gall, EQ-Mine: Predicting Short-Term Defects for Software Evolution, Proceedings of the 10th International Conference of Fundamental Approaches to Software Engineering (FASE), April 2007, Springer. (inproceedings)
-
Abraham Bernstein, Jayalath Ekanayake, Martin Pinzger, Improving Defect Prediction Using Temporal Features and Non Linear Models, Proceedings of the International Workshop on Principles of Software Evolution, September 2007, IEEE Computer Society. (inproceedings)
Predicting the defects in the next release of a large software system is a very valuable asset for the project manager to plan her resources. In this paper we argue that temporal features (or aspects) of the data are central to prediction performance. We also argue that the use of non-linear models, as opposed to traditional regression, is necessary to uncover some of the hidden interrelationships between the features and the defects and maintain the accuracy of the prediction in some cases.
Using data obtained from the CVS and Bugzilla repositories of the Eclipse project, we extract a number of temporal features, such as the number of revisions and number of reported issues within the last three months. We then use these data to predict both the location of defects (i.e., the classes in which defects will occur) as well as the number of reported bugs in the next month of the project. To that end we use standard tree-based induction algorithms in comparison with the traditional regression.
Our non-linear models uncover the hidden relationships between features and defects, and present them in easy to understand form. Results also show that using the temporal features our prediction model can predict whether a source file will have a defect with an accuracy of 99% (area under ROC curve 0.9251) and the number of defects with a mean absolute error of 0.019 (Spearman's correlation of 0.96).
-
Jacek Ratzinger, Thomas Sigmund, Peter Vorburger, Harald C. Gall, Mining Software Evolution to Predict Refactoring, Proceedings of the International Symposium on Empirical Software Engineering and Measurement (ESEM 2007) 2007, IEEE Computer Society. (inproceedings)
Can we predict locations of future refactoring based on the development history? In an empirical study of open source projects we found that attributes of software evolution data can be used to predict the need for refactoring in the following two months of development. Information systems utilized in software projects provide a broad range of data for decision support. Versioning systems log each activity during the development, which we use to extract data mining features such as growth measures, relationships between classes, the number of authors working on a particular piece of code, etc. We use this information as input into classification algorithms to create prediction models for future refactoring activities. Different state-of-the-art classifiers are investigated such as decision trees, logistic model trees, propositional rule learners, and nearest neighbor algorithms. With both high precision and high recall we can assess the refactoring proneness of object-oriented systems. Although we investigate different domains, we discovered critical factors within the development life cycle leading to refactoring, which are common among all studied projects.
-
Knud Möller, Gerald Reif, Siegfried Handschuh, Moving Stuff - Linking Desktops with semiBlog, the Semantic Clipboard and RDFa, 16th International World Wide Web Conference (WWW2007), Developers Track, May 8-12 2007. (inproceedings/Demo)
In this short paper we will demonstrate how embedded RDFa in Weblogs can be used as a medium for data-transfer between desktops. A combination of two existing Semantic Web tools - the desktop-based Semantic Blog authoring tool semiBlog and the Semantic Clipboard application - allows one user to export and blog data from various desktop applications such as electronic addressbooks, calendars or bibliographic databases, and another user to import the same data back into their own applications. http://sw.deri.org/~knud/papers/MetadataRoundtripWWW2007/
-
Jacek Ratzinger, Martin Pinzger, Harald C. Gall, Quality Assessment based on Attribute Series of Software Evolution, Proceedings of the 14th Working Conference on Reverse Engineering (WCRE), October 2007, IEEE Computer Society. (inproceedings)
-
Gerald Reif, Gian Marco Laube, Knud Möller, Harald C. Gall, SemClip - Overcoming the Semantic Gap Between Desktop Applications, 5th Semantic Web Challenge at the 6th International Semantic Web Conference (ISWC 2007), November 11-15 2007. (inproceedings/Semantic Web Challenge)
When copying and pasting data between applications using the operating system clipboard, the semantics of the transferred information is usually lost. Using Semantic Web technologies these semantics can be explicitly defined in a machine-processable way and therefore be preserved during the data transfer. In this paper we introduce SemClip, our implementation of a Semantic Clipboard that enables the exchange of semantically enriched data between desktop applications and show how such a clipboard can be used to copy and paste semantic annotations from Web pages to desktop applications.
-
Tudor Groza, Siegfried Handschuh, Knud Möller, Gunnar Grimnes, Leo Sauermann, Enrico Minack, Gerald Reif, Rosa Gudjonsdottir, The NEPOMUK Project - On the Way to the Social Semantic Desktop, Proceedings of the Third International Conference on Semantic Technologies (I-SEMANTICS 2007) 2007. (inproceedings)
This paper introduces the NEPOMUK project which aims to create a standard and reference implementation for the Social Semantic Desktop. We outline the requirements and functionalities that were identified for a useful Semantic Desktop system and present an architecture that fulfills these requirements, which was acquired by incremental refinement of the architecture of existing Semantic Desktop prototypes. The NEPOMUK project is primarily motivated by three real-life industrial use cases. We briefly outline these and the processes used to extract required functionalities from the people working in these areas today, and we present a selection of typical tasks where the Semantic Desktop could be of benefit.
2006
-
Gerald Reif, Harald C. Gall, An Architecture for a Semantic Portal, International Workshop on Data Integration and Semantic Web (DISWeb'06) at the 18th Conference on Advanced Information Systems Engineering (CAiSE 2006), June 2006, Springer. (inproceedings)
Current Web applications provide their information and functionalities to human users only. To make Web applications also accessible for machines, the Semantic Web proposes an extension of the current Web, that describes the semantics of the content and the services explicitly with machine-processable meta-data. In this paper we introduce an architecture of a Semantic Portal that provides a unique front-end to the information and functionalities of individual Semantic Web applications. To realize the portal we use WEESA to semantically annotate Web applications and provide the annotations in a knowledge base (KB) for download and querying. Based on that, the Semantic Harvester collects the KBs from individual Semantic Web applications to build the global KB of the Semantic Portal. Finally, we use Semantic Web services to make the portal a unique interface to the services of the Web applications.
-
Beat Fluri, Harald C. Gall, Classifying Change Types for Qualifying Change Couplings, Proceedings of the 9th International Conference on Program Comprehension, June 2006, IEEE Computer Society. (inproceedings)
Current change history analysis approaches rely on information provided by versioning systems such as CVS. Therefore, changes are not related to particular source code entities such as classes or methods but rather to text lines added and/or removed. For analyzing whether some change coupling between source code entities is significant or only minor textual adjustments have been checked in, it is essential to reflect the changes to the source code entities.
We have developed an approach for analyzing and classifying change types based on code revisions. We can differentiate between several types of changes on the method or class level and assess their significance in terms of the impact of the change types on other source code entities and whether a change may be functionality-modifying or functionality-preserving.
We applied our change taxonomy to a case study and found out that in many cases large numbers of lines added and/or deleted are not accompanied by significant changes but small textual adaptations (such as indentation, etc.). Furthermore, our approach allows us to relate all change couplings to the significance of the identified change types. As a result, change couplings between code entities can be qualified and less relevant couplings can be filtered out.
-
Tobias Sager, Abraham Bernstein, Martin Pinzger, Christoph Kiefer, Detecting Similar Java Classes Using Tree Algorithms, Proceedings of the International Workshop on Mining Software Repositories, May 2006, ACM. (inproceedings)
Similarity analysis of source code is helpful during development to provide, for instance, better support for code reuse. Consider a development environment that analyzes code while typing and that suggests similar code examples or existing implementations from a source code repository. Mining software repositories by means of similarity measures enables and enforces reusing existing code and reduces the development effort needed by creating a shared knowledge base of code fragments. In information retrieval, similarity measures are often used to find documents similar to a given query document. This paper extends this idea to source code repositories. It introduces our approach to detect similar Java classes in software projects using tree similarity algorithms. We show how our approach makes it possible to find similar Java classes based on an evaluation of three tree-based similarity measures in the context of five user-defined test cases as well as a preliminary software evolution analysis of a medium-sized Java project. Initial results of our technique indicate that it (1) is indeed useful to identify similar Java classes, (2) successfully identifies the ex ante and ex post versions of refactored classes, and (3) provides some interesting insights into within-version and between-version dependencies of classes within a Java project.
-
Michael Fischer, Harald C. Gall, EvoGraph: A Lightweight Approach to Evolutionary and Structural Analysis of Large Software Systems, 13th Working Conference on Reverse Engineering (WCRE), October 2006, IEEE Computer Society. (inproceedings)
Structural analyses frequently fall short in an adequate representation of historical changes for retrospective analysis. By compounding the two underlying information spaces in a single approach, the comprehension about the interaction between evolving requirements and system development can be improved significantly. We therefore propose a lightweight approach based on release history data and source code changes, which first selects entities with evolutionary outstanding characteristics and then indicates their structural dependencies via commonly used source code entities. The resulting data sets and visualizations aim at a holistic view to point out and assess structural stability, recurring modifications, or changes in the dependencies of the file-sets under inspection. In this paper we describe our approach and its results in terms of the Mozilla case study. Our approach completes typical release history mining and source code analysis approaches, therefore past restructuring events, new, shifted, and removed dependencies can be spotted easily.
-
Patrick Knab, Martin Pinzger, Abraham Bernstein, Predicting Defect Densities in Source Code Files with Decision Tree Learners, MSR '06: Proceedings of the 2006 International Workshop on Mining Software Repositories, May 2006, ACM. (inproceedings)
With the advent of open source software repositories the data available for defect prediction in source files increased tremendously. Although traditional statistics turned out to derive reasonable results, the sheer amount of data and the problem context of defect prediction demand sophisticated analysis such as provided by current data mining and machine learning techniques.
In this work we focus on defect density prediction and present an approach that applies a decision tree learner on evolution data extracted from the Mozilla open source web browser project. The evolution data includes different source code, modification, and defect measures computed from seven recent Mozilla releases. Among the modification measures we also take into account the change coupling, a measure for the number of change-dependencies between source files. The main reason for choosing decision tree learners, instead of for example neural nets, was the goal of finding underlying rules which can be easily interpreted by humans. To find these rules, we set up a number of experiments to test common hypotheses regarding defects in software entities. Our experiments showed that a simple tree learner can produce good results with various sets of input data.
-
Reto Geiger, Beat Fluri, Harald C. Gall, Martin Pinzger, Relation of Code Clones and Change Couplings, Proceedings of the 9th International Conference of Fundamental Approaches to Software Engineering, March 2006, Springer. (inproceedings)
Code clones have long been recognized as bad smells in software systems and are considered to cause maintenance problems during evolution. It is broadly assumed that the more clones two files share, the more often they have to be changed together. This relation between clones and change couplings has been postulated but neither demonstrated nor quantified yet. However, establishing such a relation would simplify the identification of restructuring candidates and reduce change couplings.
In this paper, we examine this relation and discuss if a correlation between code clones and change couplings can be verified. For that, we propose a framework to examine code clones and relate them to change couplings taken from release history analysis.
We validated our framework with the open source project Mozilla and the results of the validation show that although the relation is statistically unverifiable it derives a reasonable amount of cases where the relation exists.
Therefore, to discover clone candidates for restructuring we additionally propose a set of metrics and a visualization technique. This allows one to spot where a correlation between cloning and change coupling exists and, as a result, which files should be restructured to ease further evolution.
-
Gerald Reif, Semantic Annotation, Semantic Web - Wege zur vernetzten Wissensgesellschaft, Editor(s): Tassilo Pellegrini, Andreas Blumauer; 2006, Springer. (incollection)
This chapter first introduces the term semantic annotation and discusses techniques for linking annotations to the original document. It then addresses the problems that arise when creating annotations. Next, software tools that support a user in the annotation process are presented. Finally, methods that integrate the annotation step into the development process of a Web application are discussed.
-
Gerald Reif, Martin Morger, Harald C. Gall, Semantic Clipboard - Semantically Enriched Data Exchange Between Desktop Applications, Semantic Desktop and Social Semantic Collaboration Workshop at the 5th International Semantic Web Conference ISWC06, November 2006. (inproceedings)
The operating system clipboard is used to copy and paste data between applications even if the applications are from different vendors. Current clipboards only support the transfer of data or formatted data between applications. The semantics of the data, however, is lost in the transfer. The Semantic Web, on the other hand, provides a common framework that allows data to be shared across application boundaries while preserving the semantics of the data. In this paper we introduce the concept of a Semantic Clipboard and present a prototype implementation that can be used to copy and paste RDF meta-data between desktop applications. The Semantic Clipboard is based on a flexible plugin architecture that enables the easy extension of the clipboard to new ontology vocabularies and target applications. Furthermore, we show how the Semantic Clipboard is used to copy and paste the meta-data from semantically annotated Web pages to a user's desktop application.
-
Sunghun Kim, Thomas Zimmermann, Miryung Kim, Ahmed Hassan, Audris Mockus, Tudor Girba, Martin Pinzger, E. James Whitehead Jr., Andreas Zeller, TA-RE: An Exchange Language for Mining Software Repositories, Proceedings of the International Workshop on Mining Software Repositories, May 2006, ACM. (inproceedings)
Software repositories have been getting a lot of attention from researchers in recent years. In order to analyze software repositories, it is necessary to first extract raw data from the version control and problem tracking systems. This poses two challenges: (1) extraction requires a non-trivial effort, and (2) the results depend on the heuristics used during extraction. These challenges burden researchers who are new to the community and make it difficult to benchmark software repository mining, since it is almost impossible to reproduce experiments done by another team. In this paper we present the TA-RE corpus. TA-RE collects extracted data from software repositories in order to build a collection of projects that will simplify the extraction process. Additionally, the collection can be used for benchmarking. As the first step we propose an exchange language capable of making sharing and reusing data as simple as possible.
-
Gerald Reif, Harald C. Gall, Using WEESA to Semantically Annotate Cocoon Web Applications, 1st Semantic Authoring and Annotation Workshop 2006 at the 5th International Semantic Web Conference ISWC2006, November 2006. (inproceedings)
The Semantic Web is based on the idea that Web applications provide semantically annotated Web pages. This meta-data is typically added in the semantic annotation process which is currently not part of the Web engineering process. Web engineering, however, proposes methodologies to design, implement and maintain Web applications but lacks semantic annotation. In this paper we show how WEESA, a mapping from XML documents to ontologies, can be used in Apache Cocoon Web applications to semantically annotate Web pages. We introduce Cocoon transformer components that use the WEESA mapping definition to automatically generate RDF meta-data from XML documents. We further show how existing Cocoon Web applications can be extended to Semantic Web applications and discuss the experiences gained in an industry case study.
2005
-
Jens Knodel, Isabel John, Dharmalingam Ganesan, Martin Pinzger, Fernando Usero, Jose L. Arciniegas, Claudio Riva, Asset Recovery and Incorporation into Product Lines, Proceedings of the 12th IEEE Working Conference on Reverse Engineering, November 2005, IEEE Computer Society. (inproceedings)
Software product lines aim at having a common platform from which several similar products can be derived. The elements of the platform are called assets and they are managed in an asset base being part of the product line infrastructure. The products are then built on top of the assets. Assets can include own developments, open source or third-party software modules, as well as design and project documents. In the context of the European-wide project FAMILIES we concentrated on techniques used to build the platform with focus on the recovery of these assets from existing systems. We present an approach on how to incorporate existing assets into the product line infrastructure. Thereby we explicitly distinguish the asset origins and the different information sources available. The incorporation is a quality-driven process that is backed up by a set of reverse engineering techniques to evaluate the asset's internal quality. The quality assessment of an asset is the critical measurement for industrial development organizations in order to incorporate assets into their product line infrastructure.
-
Michele Lanza, Stephane Ducasse, Harald C. Gall, Martin Pinzger, CodeCrawler: An Information Visualization Tool for Program Comprehension, Proceedings of the 27th International Conference on Software Engineering 2005, ACM. (inproceedings)
CODECRAWLER is a language independent, interactive, software visualization tool. It is mainly targeted at visualizing object-oriented software, and in its newest implementation has become a general information visualization tool. It has been successfully validated in several industrial case studies over the past few years. CODECRAWLER strongly adheres to lightweight principles: it implements and visualizes polymetric views, visualizations of software enriched with information such as software metrics and other source code semantics. CODECRAWLER is built on top of Moose, an extensible language independent reengineering environment that implements the FAMIX metamodel. In its last implementation, CODECRAWLER has become a general-purpose information visualization tool.
-
Stefania Leone, Thomas Hodel, Harald C. Gall, Concept and architecture of a pervasive document editing and managing system, SIGDOC '05: Proceedings of the 23rd annual international conference on Design of communication, September 21-23 2005. (inproceedings)
Collaborative document processing has been addressed by many approaches so far, most of which focus on document versioning and collaborative editing. We address this issue from a different angle and describe the concept and architecture of a pervasive document editing and managing system. It exploits database techniques and real-time updating for sophisticated collaboration scenarios on multiple devices. Each user is always served with up-to-date documents and can organize his work based on document meta data. For this, we present our conceptual architecture for such a system and discuss it with an example.
-
Jacek Ratzinger, Michael Fischer, Harald C. Gall, EvoLens: Lens-View Visualizations of Evolution Data, Proceedings of the 8th International Workshop on Principles of Software Evolution 2005. (inproceedings)
Observing the evolution of very large software systems is difficult because of the sheer amount of information that needs to be analyzed and because the changes performed in the system are at a very low granularity level. In recent approaches software metrics have been used to compute condensed graphical visualizations of these data also reflecting metrics. However, most techniques concentrate on visualizing data of one particular release providing only insufficient support for visualizing data of several selected releases. In this paper we present the RelVis visualization approach that provides integrated condensed graphical views on source code and release history data of up to n releases of a software system. Measurements of metrics of n releases are composed to views that facilitate spectators to spot trends of metrics of source code entities and relationships. Critical trends are highlighted: This allows the user to direct perfective maintenance activities to source code entities involved. The paper provides needed background information and evaluation of the approach with a large open source software project.
-
Beat Fluri, Harald C. Gall, Martin Pinzger, Fine-Grained Analysis of Change Couplings, Proceedings of the 5th International Workshop on Source Code Analysis and Manipulation, October 2005, IEEE Computer Society. (inproceedings)
In software evolution analysis, many approaches analyze release history data available through versioning systems. Recent investigations of CVS data have shown that commonly committed files highlight their change couplings. However, CVS stores modifications on the basis of text but does not track structural changes, such as the insertion, removal, or modification of methods or classes. A detailed analysis of whether change couplings are caused by source code couplings or by other textual modifications, such as updates in license terms, is not performed by current approaches.
The focus of this paper is on adding structural change information to existing release history data. We present an approach that uses the structure compare services shipped with the Eclipse IDE to obtain the corresponding fine-grained changes between two subsequent versions of any Java class. This information supports filtering those change couplings which result from structural changes. So we can distill the causes for change couplings along releases and filter out those that are structurally relevant. The first validation of our approach with a medium-sized open source software system showed that a reasonable amount of change couplings are not caused by source code changes.
-
Marco D'Ambros, Michele Lanza, Harald C. Gall, Fractal Figures: Visualizing Development Effort for CVS Entities, VISSOFT '05: Proceedings of the 3rd IEEE International Workshop on Visualizing Software for Understanding and Analysis 2005, IEEE Computer Society. (inproceedings)
Versioning systems such as CVS or Subversion exhibit a large potential to investigate the evolution of software systems. They are used to record the development steps of software systems as they make it possible to reconstruct the whole evolution of single files. However, they provide no good means to understand how much a certain file has been changed over time and by whom. In this paper we present an approach to visualize files using fractal figures, which (1) convey the overall development effort, (2) illustrate the distribution of the effort among various developers, and (3) allow files to be categorized in terms of the distribution of the effort following gestalt principles. Our approach allows us to discover files of high development efforts in terms of team size and effort intensity of individual developers. The visualizations allow an analyst or a project manager to get first insights into team structures and code ownership principles. We have analyzed Mozilla as a case study and we show some of the recovered team development patterns in this paper as a validation of our approach.
-
Jacek Ratzinger, Michael Fischer, Harald C. Gall, Improving Evolvability through Refactoring, Proceedings of the International Workshop on Mining Software Repositories 2005. (inproceedings)
Refactoring is one means of improving the structure of existing software. Locations where to apply refactoring are often based on subjective perceptions such as "bad smells", which are vague suspicions of design shortcomings. We exploit historical data extracted from repositories such as CVS and focus on change couplings: if some software parts change at the same time very often over several releases, this data can be used to point to candidates for refactoring. We adopt the concept of bad smells and provide additional change smells. Such a smell is hardly visible in the code, but easy to spot when viewing the change history. Our approach enables the detection of such smells allowing an engineer to apply refactoring on these parts of the source code to improve the evolvability of the software. For that, we analyzed the history of a large industrial system for a period of 15 months, proposed spots for refactorings based on change couplings, and performed them with the developers. After observing the system for another 15 months we finally analyzed the effectiveness of our approach. Our results support our hypothesis that the combination of change dependency analysis and refactoring is applicable and effective.
-
Michael Fischer, Johann Oberleitner, Jacek Ratzinger, Harald C. Gall, Mining Evolution Data of a Product Family, Proceedings of the International Workshop on Mining Software Repositories 2005. (inproceedings)
Diversification of software assets through evolving requirements imposes a constant challenge on the developers and maintainers of large software systems. Recent research has addressed the mining for data in software repositories of single products ranging from fine- to coarse-grained analyses. But so far, little attention has been paid to mining data about the evolution of product families. In this work, we study the evolution and commonalities of three variants of the BSD, a large open source operating system. The research questions we tackle are concerned with how to generate high level views of the system discovering and indicating evolutionary highlights. To process the large amount of data, we extended our previously developed approach for storing release history information to support the analysis of product families. In a case study we apply our approach on data from three different code repositories representing about 8.5GB of data and 10 years of active development.
-
Michael Fischer, Johann Oberleitner, Harald C. Gall, System Evolution Tracking through Execution Trace Analysis, Proceedings of the 13th International Workshop on Program Comprehension 2005. (inproceedings)
Execution traces produced from instrumented code reflect a system's actual implementation. This information can be used to recover interaction patterns between different entities such as methods, files, or modules. Some solutions for the detection of patterns and their visualization exist, but are limited to small amounts of data and are incapable of comparing data from different versions of a large software system. In this paper, we propose a methodology to analyze and compare the execution traces of different versions of a software system to provide insights into its evolution. We recover high-level module views that facilitate the comprehension of each module's evolution. Our methodology allows us to track the evolution of particular modules and present the findings in three different kinds of visualizations. Based on these graphical representations, the evolution of the concerned modules can be tracked and comprehended much more effectively. Our EvoTrace approach uses standard database technology and instrumentation facilities of development tools, so exchanging data with other analysis tools is facilitated. Further, we show the applicability of our approach using the Mozilla open source system consisting of about 2 million lines of C/C++ code.
-
Martin Pinzger, Michael Fischer, Harald C. Gall, Towards an Integrated View on Architecture and its Evolution, Electronic Notes in Theoretical Computer Science Vol. 127 (3), April 2005. (article)
Information about the evolution of a software architecture can be found in the source basis of a project and in the release history data such as modification and problem reports. Existing approaches deal with these two data sources separately and do not exploit the integration of their analyses. In this paper, we present an architecture analysis approach that provides an integration of both kinds of evolution data. The analysis applies fact extraction and generates specific directed attributed graphs; nodes represent source code entities and edges represent relationships such as accesses, includes, inherits, invokes, and coupling between certain architectural elements. The integration of data is then performed on a meta-model level to enable the generation of architectural views using binary relational algebra. These integrated architectural views show intended and unintended couplings between architectural elements, hence pointing software engineers to locations in the system that may be critical for on-going and future maintenance activities. We demonstrate our analysis approach using a large open source software system.
-
Martin Pinzger, Harald C. Gall, Michael Fischer, Michele Lanza, Visualizing multiple evolution metrics, Proceedings of the ACM Symposium on Software Visualization (SoftVis'2005) 2005, ACM. (inproceedings)
Observing the evolution of very large software systems needs the analysis of large complex data models and visualization of condensed views on the system. For visualization software metrics have been used to compute such condensed views. However, current techniques concentrate on visualizing data of one particular release providing only insufficient support for visualizing data of several releases. In this paper we present the RelVis visualization approach that concentrates on providing integrated condensed graphical views on source code and release history data of up to n releases. Measures of metrics of source code entities and relationships are composed in Kiviat diagrams as annual rings. Diagrams highlight the good and bad times of an entity and facilitate the identification of entities and relationships with critical trends. They represent potential refactoring candidates that should be addressed first before further evolving the system. The paper provides needed background information and evaluation of the approach with a large open source software project.
-
Gerald Reif, Harald C. Gall, Mehdi Jazayeri, WEESA - Web Engineering for Semantic Web Applications, Proceedings of the 14th International World Wide Web Conference, May 2005. (inproceedings)
The success of the Semantic Web crucially depends on the existence of Web pages that provide machine-understandable meta-data. This meta-data is typically added in the semantic annotation process, which is currently not part of the Web engineering process. Web engineering, however, proposes methodologies to design, implement and maintain Web applications but lacks the generation of meta-data. In this paper we introduce a technique to extend existing Web engineering methodologies to develop semantically annotated Web pages. The novelty of this approach is the definition of a mapping from XML Schema to ontologies, called WEESA, that can be used to automatically generate RDF meta-data from XML content documents. We further show how we integrated the WEESA mapping into an Apache Cocoon transformer to easily extend XML-based Web applications to semantically annotated Web applications.
2004
-
Thomas Hodel, Harald C. Gall, Klaus R. Dittrich, Dynamic Collaborative Business Processes within Documents, In Proceedings of the 22nd Annual International Conference of Communication 2004. (inproceedings)
Effective collaborative business process support is essential in today's business. In this paper, we address this aspect within documents. Often, such text documents are stored unsystematically in a rather confusing file structure with an inscrutable hierarchy and little access control. Business data, on the other hand, are stored in a systematic way in databases allowing multi-user, multi-site, user-/role-specific controlled access. We store text documents in databases and exploit these database capabilities: collaborative business processes then can be defined per document or any part of a document. In this paper, we present this dynamic collaborative business process concept and the prototype within documents for our database-based collaborative editor. We evaluate the potential of such business processes for the quality of communication and documentation.
-
Michael Fischer, Harald C. Gall, Visualizing Feature Evolution of Large-Scale Software based on Problem and Modification Report Data, Journal of Software Maintenance and Evolution: Research and Practice Vol. 16 (6) 2004. (article)
Gaining higher-level evolutionary information about large software systems is a key challenge in dealing with increasing complexity and architectural deterioration. Modification reports and problem reports (PRs) taken from systems such as the concurrent versions system (CVS) and Bugzilla contain an overwhelming amount of information about the reasons and effects of particular changes. Such reports can be analyzed to provide a clearer picture about the problems concerning a particular feature or a set of features. Hidden dependencies of structurally unrelated but over time logically coupled files exhibit a good potential to illustrate feature evolution and possible architectural deterioration. In this paper, we describe the visualization of feature evolution by taking advantage of this logical coupling introduced by changes required to fix a reported problem. We compute the proximity of PRs by applying a standard technique called multidimensional scaling (MDS). The visualization of these data enables us to depict feature evolution by projecting PR dependence onto (a) feature-connected files and (b) the project directory structure of the software system. These two different views show how PRs, features and the directory tree structure relate. As a result, our approach uncovers hidden dependencies between features and presents them in an easy to assess visual form. A visualization of interwoven features can indicate locations of design erosion in the architectural evolution of a software system. As a case study, we used Mozilla and its CVS and Bugzilla data to show the applicability and effectiveness of our approach.
RDF for all publications
BibTeX for all publications
Statistics
| Reference type | Number of references |
| article | 6 |
| inbook | 3 |
| incollection | 3 |
| inproceedings | 76 |
| misc | 2 |
| techreport | 3 |
| Total | 93 |
©2004-2012 University of Zurich, s.e.a.l.