Volume 6, Number 2 - Software Quality Assurance

Characterizing Software Dependability from Multiple Stakeholders’ Perspective^*

by Patricia Costa and Ioana Rus, Fraunhofer Center for Experimental Software Engineering, Maryland

Introduction

To illustrate a method that we have developed for characterizing the dependability of a software application, we present a case study where we used this method for a web-based software tool that supports on-line mediated and structured discussions ("electronic workshops").

Dependability has long been a critical requirement for space and defense systems. Since software is embedded in more and more systems, dependability becomes a vital necessity in many other sectors of society, including national infrastructure and health care, as well as mainstream systems ranging from electronic commerce to desktops. Dependability cannot be achieved only by testing at the end of development; it has to be engineered, managed, and built in to the product throughout the entire lifecycle.

Dependability is defined by IFIP WG-10.4 as "The trustworthiness of a computing system which allows reliance to be justifiably placed on the service it delivers" [www.dependability.org]. For software there is no dependability definition universally accepted and employed. Dependability is often regarded as a set of properties such as reliability, availability, safety, fault tolerance, robustness, and security. After investigating what dependability means for different products, we concluded that dependability is a multi-attribute property, defined and measured by a set of different indicators, usually specific to a system or a class of functionally related systems. Moreover, even for the same system, different stakeholders commonly possess different definitions and requirements for dependability.

Along these lines, we developed a dependability methodology to characterize a software system with respect to its dependability properties. This methodology takes into account the human factor, more specifically human requirements, expectations, and perceptions of dependability. The resulting characterization can be used to establish a baseline. This baseline may then be used to:

Compare the dependability of the system at the time of the characterization with previous snapshots of the same system in order to analyze the evolution over the time of its dependability

Identify dependability attributes to be improved, and support the selection of strategies (technologies or process improvements) to achieve this goal

Assess the effect of a specific strategy, by comparing the dependability of the system before and after its application.

Dependability Characterization Methodology

Figure 1 shows a summary of the activities and artifacts of our dependability characterization methodology. The circles indicate activities and the squares indicate artifacts. We applied this methodology to one of the systems developed and used by Fraunhofer Center Maryland named "eWorkshop". The eWorkshop is a web-based chat tool for recorded, on-line, and synchronous discussions that has been in use and evolution for a few years. Different stakeholders involved in this electronic workshop process play various roles. In the rest of this paper we will use the example of the eWorkshop to describe each of the activities in Figure 1. The activity outside the main box (dependability planning, building-in and improvement) is supported, but not performed by our team.

Figure 1. Software Dependability Characterization Methodology Activities and Artifacts

Preliminary Understanding of the System and Dependability Goals

In this activity, we identified the stakeholders and the dependability attributes that might be relevant to each of them. We met with the development team to characterize the eWorkshop process and to identify the different roles and needs of the eWorkshop’s stakeholders. We identified the following stakeholders of the system: moderator, lead discussant, scribe, participant, observer, and technical support. A core of features of the eWorkshop tool are used by all software stakeholders, but other features are only used by a subset of stakeholders.

Goal Refinement Using the Goal-Question-Metric Approach [1]

In order to identify dependability goals, we first defined the high-level goals (G1-G3) for our dependability characterization and measurement:

G1: Understand definition of dependability and needs or expectations for the operational eWorkshop, with respect to dependability, from its stakeholders’ perspective.

G2: Characterize the operational eWorkshop software with respect to its dependability, from each stakeholder’s perspective, in the context of its most recent version.

G3: Characterize the satisfaction of stakeholders with the eWorkshop, with respect to dependability, in the context of its most recent version.

Then, we refined these goals, taking into account the stakeholders and specific metrics for the eWorkshop tool. We were guided by a collection of definitions, measures, and models of dependability attributes that we gathered by performing a literature survey. The relevant dependability properties and their measurement indicators that we identified for the eWorkshop were:

Availability:

Av1: Total time the system was available during a session
Av2: Continuous time that the system was available during a session
Av3: Recovery time from failure

Reliability:

Re1: Availability between failures
Re2: Total number of crashes during a session

Starting with the generic goals, we derived specific goals for the various viewpoints, quality focus and purpose. Once specific goals were identified, questions and metrics for each were also identified using the Goal-Question-Metric (GQM) technique. Table 1 presents samples of questions for specific goals. We assigned identifiers to goals and questions (e.g., GPR = goal of participant for reliability, or QPR1 = question for participant related to reliability) to trace goals down to questions and metrics for the metrics identification phase, and also to trace back metrics to questions and goals in the metrics analysis and processing phase.

Table 1. Sample of Questions for Specific Goals
GPR
QPR1	What is the mean time between failure? ^** [Minutes]
QPR2	What is the failure rate? [Failures/Session]
QPR3	Percent of number of times the participant tried to use a feature and it was not accessible, out of the total number of times he/she accessed the feature?

Operational System Data Collection

In this step we conducted a gap analysis of the metrics being collected and the ones identified as needed in the GQMs. With the collaboration of the development team, we identified metrics collected by the eWorkshop that could be used to analyze the system’s dependability. An analysis of the metrics indicated that there was no need for new metrics to be collected by the system. Besides, the interview would be able to capture the perception and satisfaction of the stakeholders in order to complete the metrics being collected.

Interviews with Stakeholders

Based on the GQMs and the metrics being collected from the operational system, we created questionnaires for each type of stakeholder. Each questionnaire had two parts. In Part 1, the interviewees were asked about their experience and satisfaction with the system’s reliability and availability. The interviewees were asked the desired, acceptable and experienced values of each indicator of each dependability attribute (e.g reliability, availability).

Since dependability is related to fulfilling or failing to fulfill the requirements of the system, Part 2 of the questionnaire contains similar questions, tailored for each specific feature offered by the eWorkshop for each group of stakeholder. The interviewees were asked the desired, acceptable and experienced values for the number of times a feature failed and for the duration of failures. They were also asked the importance of each feature and how many times they used it during a session. This information will be used later in the data processing step for weighting the features and aggregating individual answers into higher-level indicators.

There were also open-ended questions, allowing the interviewees to state other dependability definitions and attributes of interest, problems with the system, or needs and expectations. If interviewees are concerned with more dependability attributes than those identified in the previous step, then these attributes will be added, and corresponding questions and metrics will be generated. In this specific case study, this did not occur.

We conducted interviews with eleven individuals within five classes of stakeholders.

Data Processing

We processed and aggregated the data obtained from the interviews with stakeholders and the operational system data collection. To gather and process the data collected from the interviews, we used techniques such as the multi-attribute utility function (MAUF)[2]. We analyzed the data collected by the system for metrics that would indicate reliability or availability in the following sources: log of the web server, log generated by a monitoring tool that runs during the meeting, and the log of the meeting.

Data Visualization and Analysis

We analyzed the data collected automatically by the system and found out that the system did not fail during the session, since we did not find any indication of problems in any of the data sources. We then proceeded to analyze the results from the interviews.

In general, the stakeholders were satisfied with the eWorkshop, since no major failures were identified for the system or for each individual features. Figure 2 shows the satisfaction of each class of stakeholders (aggregated per type of stakeholder from individuals’ responses) with dependability attributes of the eWorkshop, expressed in "utility" values ranging from 1 to 10, where 1 is the minimum utility and 10 is the maximum.

Figure 2. Satisfaction with dependability of the eWorkshop software for some classes of stakeholders.

The "expert" was the only stakeholder that was not completely satisfied. Once the problem that caused this dissatisfaction was detected, it took the expert only 30 seconds to restart the application and continue the meeting. However, the expert believes that the maximum it should take for the system to recover should be 10 seconds. Also, he believes that the system shouldn’t crash at all during a meeting.

We also analyzed and compared the perceived, accepted and desired value for certain dependability indicators. The actual and perceived value for a system’s attributes is better than, or equal to the acceptable value. In general, the values are also in accordance with the desire of the stakeholders, with the single exception of the expert.

Dependability Planning, Building-in and Improvement

We presented the results of applying our methodology to the development team. When we discussed the expert’s concerns, the team lead mentioned that he was aware of the problem. Since no other problems were detected during the meeting, he believes there is a possibility that the problem was outside the context of the eWorkshop tool due to other components of the environment.

Since the system was considered to be dependable, no plans to improve the system’s dependability were made. During the interviews, some of the interviewees suggested improvements that could be made to the system. Despite these comments not being related to the dependability aspects that we were analyzing, we recorded and presented them to the development team. The development team was very satisfied with the results of our study.

Conclusions and Future Work

We developed and presented a methodology for characterizing the dependability of a system, as well as an application of this methodology. The methodology can be tailored for application to different systems. The application of the methodology to the eWorkshop tool made it possible for us to test the feasibility of the methodology and to improve it according to the feedback received during its application.

From the eWorkshop team point of view, the methodology was very valuable and confirmed the application’s dependability level. The service provider believed that the stakeholders were satisfied, but the methodology application results gave him more confidence, confirming his hypotheses. The feedback from the interviews is very useful for indicating areas of improvement. Once the improvements are made, the results of this analysis can be used as a baseline to assure that the addition of new features does not compromise the system’s availability and dependability. This process (or a subset of the steps) can be repeated to characterize the dependability values of the new versions of the system and verify its improvement.

About the Authors

Patricia Costa is a scientist at the Fraunhofer Center for Experimental Software Engineering, Maryland. Her research interests include software architecture, agile methods, knowledge management and software measurement and experimentation. She has a B.Sc. and a M.Sc. in Computer Science and a M.Sc. in Telecommunications Management.

Ioana Rus is a scientist with the Fraunhofer Center for Empirical Software Engineering, Maryland. Her research interests include software reliability and dependability, knowledge management, software process modeling and simulation, process improvement, and empirical methods in software engineering. She has a Ph.D. in Computer Science and Engineering.

Author Contact Information

Patricia Costa
Fraunhofer Center for Experimental Software Engineering, Maryland
4321 Hartwick Rd, suite 500
College Park, MD 20742
[email protected]

Ioana Rus
Fraunhofer Center for Experimental Software Engineering, Maryland
4321 Hartwick Rd, suite 500
College Park, MD 20742
[email protected]
http://fc-md.umd.edu

Characterizing Software Dependability from Multiple Stakeholders’ Perspective*