Migration of Legacy Components to Service-Oriented Architectures

By Grace Lewis, Edwin Morris, and Dennis Smith, Software Engineering Institute

1. Introduction

Software archaeology investigates and rehabilitates legacy systems so that their architecture can be discovered and their code reused. An increasingly popular approach to software archaeology has been to leverage the value of legacy systems by exposing all or parts of them as services within a service-oriented architecture (SOA). A service-oriented architecture is a collection of services with well-defined interfaces and a shared communications model. A service is a coarse-grained, discoverable, and self-contained software entity that interacts with applications and other services through this loosely coupled, often asynchronous, message-based communication model [2, 4]. Systems or applications that are called "service-based" use the functionality provided by these services as part of their mission. For example, when a person makes a reservation through Travelocity, what appears to be a single web-based application actually involves the complex orchestration of a set of "services" from a number of sources. These services may include authentication, flight schedules, reservations, hotel searches, and credit card validation.

SOAs offer the promise of enabling legacy systems to work together, presumably without making significant changes. In theory, a developer would simply have to create a well-defined common interface to the legacy system so that it could offer its services to other systems or applications.

In practice, constructing services from existing systems is neither easy nor automatic.

This migration is a complex task, particularly when services will execute within a tightly constrained environment. SOA migration tasks can be considered from a number of perspectives including that of the user of the services, the SOA architect, or the service provider. This paper focuses on the service provider. Section 2 briefly discusses what it means to create services from legacy components. Section 3 summarizes a recent engagement where the SEI helped a program office make decisions about migrating legacy components as services within an SOA. Section 4 outlines the Service-Oriented Migration and Reuse Technique (SMART) method for evaluating legacy components for their potential to become services in an SOA. Section 5 provides conclusions and discusses next steps.

2. Creation of Services From Legacy Components

Enabling a legacy system to work within Web services is sometimes relatively straightforward. Web service interfaces are based on well-known standards and are set up to receive messages, parse their content, invoke legacy code, and optionally wrap the results as a message to be returned to the sender. Many modern development environments provide tools to help in this process, and commercial organizations are employing these environments to expose their business processes to the world. However, characteristics of legacy systems, such as age, language, and architecture, as well as of the target SOA can complicate the task. This is particularly the case when migrating to highly constrained SOAs such as those being proposed for some DoD systems. DoD migrations to SOAs will likely rely less on automation, and more on careful analysis of the feasibility and magnitude of the effort involved that goes beyond just the technical aspects. Such an analysis needs to consider:

1. Requirements from potential service users. It is important to know what applications would use the services and how they would be used. For example, what information will be exchanged? In what format?
2. Technical characteristics of the target environment, such as messaging technologies, communication protocols, service description languages, and service discovery mechanisms.
3. Characteristics of the architecture of the legacy system such as dependencies on commercial products or specific operating systems.
4. The effort involved in writing the service interface. Even if it is expected that the legacy system will remain intact, code must receive the service request, translate it into legacy system calls, and produce a response.
5. The effort involved in the translation of data types, which may be small for basic data types and newer legacy systems, but very large for complex data types such as audio, video, and graphics, or for legacy programming languages that do not provide capabilities for building XML documents.
6. The effort required to describe the services so that they can be discovered and used appropriately. This may require information about qualities of service (e.g., performance, reliability, and security) or service level agreements (SLAs) [7].
7. The effort involved in writing initialization code that prepares the service to take requests, and operational procedures that must be followed for deployment of services.
8. Estimates of cost, difficulty, and risk. The information gathered in the previous points should provide for more realistic estimates.
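The service interface work described in items 4 and 5 can be pictured with a minimal Python sketch. It assumes a plain XML message format and a stand-in legacy routine; `legacy_distance`, `handle_request`, and the element names are all hypothetical and are not taken from the system discussed in this article. The wrapper receives a request, translates its fields into native types, calls the legacy code, and wraps the result as a response message.

```python
import xml.etree.ElementTree as ET


def legacy_distance(x1: float, y1: float, x2: float, y2: float) -> float:
    """Stand-in for an existing legacy routine (hypothetical)."""
    return ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5


def handle_request(xml_text: str) -> str:
    """Receive an XML request, call the legacy routine, and wrap the result."""
    try:
        request = ET.fromstring(xml_text)
        # Translate message fields into the native types the legacy call expects.
        # A missing field yields None, which float() rejects with TypeError.
        args = [float(request.findtext(name)) for name in ("x1", "y1", "x2", "y2")]
    except (ET.ParseError, TypeError, ValueError) as exc:
        # Malformed or incomplete requests get a fault message, not a crash.
        fault = ET.Element("fault")
        fault.text = f"bad request: {exc}"
        return ET.tostring(fault, encoding="unicode")

    response = ET.Element("distanceResponse")
    ET.SubElement(response, "distance").text = repr(legacy_distance(*args))
    return ET.tostring(response, encoding="unicode")
```

Even in this toy case, most of the code deals with message parsing, type translation, and error handling rather than with the legacy function itself, which is one reason the interface effort deserves its own estimate.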

3. Support for Program Office Plans for Migration to SOA

The SEI recently worked with a DoD program office to analyze the potential for migrating a set of legacy components from a command and control (C2) system to form services for an SOA. The program office recognized that if a selected set of components from their C2 system were converted to application domain services (ADS), they may have applicability for a broad variety of purposes. Our role was to perform a preliminary evaluation of the feasibility of this conversion. We initially met with the government program office and the contractors who had developed the system. At this meeting we were given an overview and history of the systems, migration plans, and the drivers for the migration. We were also given a brief orientation to the target SOA along with system documentation. This initial meeting was followed up by in-depth interviews and architectural analysis of portions of the system. The following paragraphs summarize what we learned about the target SOA, the current system, the gap between the current state and that required for the SOA, and our suggestions to the program office.

Understanding the Target SOA

The target SOA was studied through an analysis of available documentation and through a meeting with the developers. The target SOA is currently under development using a variety of commercial products and standards, along with significant custom code. The effort is focused on satisfying a number of specific quality attributes important to the DoD, such as performance, security, and availability. In order to meet these needs, the SOA will impose constraints on potential services. Because the SOA is still under development, the specifications for how to deploy and write services had not yet been fully defined. The target SOA is illustrated in Figure 1.

Figure 1. Physical View of the Target SOA.

Figure 1 shows that the SOA includes common services (CS) that are to be used by user applications and ADSs. The SOA owns the common services. The environment allows for a set of application domain services (ADS) which derive their requirements from user applications. Program offices are invited to submit proposals for services to meet these requirements, either by building them from scratch or by migrating them from legacy components. These requirements then need to be analyzed in detail and matched to existing functionality to determine what can be used as-is, what has to be modified, and what has to be new development. Even though the full details of compliant services for the target SOA have not yet been worked out, the target SOA imposes a number of constraints on organizations that are developing ADSs from existing legacy components. Some of the constraints and requirements for developers of ADSs include:

1. An ADS needs to be self-contained, that is, it should be able to be deployed as a single unit. In this specific target SOA, services need to be stand-alone and of small granularity so that they can be deployed as needed on standardized and often limited-resource platforms such as handheld devices. In a legacy component, functionality that has been identified as part of a service needs to be fully extracted from the system, including code that corresponds to shared libraries or the core of a product line.
2. In the target SOA, an ADS has to be able to be deployed on a Linux operating system. As a result, Windows-based legacy components could be a problem, especially if there are dependencies on the operating system through direct system calls or if there is a dependency on commercial products that are only available for Windows.
3. All services will share a common data model and all data will be accessed through a Data Store common service. Services will no longer define internal data and all data will be defined as part of the common data model. An ADS will use the Discovery common service to find and connect to other services.
4. If the ADS will rely on other services, code to discover and connect to these services will have to be written. Once the service is developed it needs to be advertised. Other applications that wish to use this service will perform a discovery on the available services and choose which service(s) they desire to use.
5. An ADS will use the Communications common service for communicating with other services. The target SOA provides tools for generating data readers and data writers that will take incoming and outgoing data and format it accordingly.
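The advertise-and-discover interaction in items 3 and 4 can be sketched as follows. This is only an in-memory illustration in Python: the `DiscoveryRegistry` class, the capability name, and the endpoint strings are hypothetical, and the interface of the real Discovery common service had not been specified at the time of the engagement.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DiscoveryRegistry:
    """In-memory stand-in for a discovery common service (hypothetical)."""

    _entries: Dict[str, List[str]] = field(default_factory=dict)

    def advertise(self, capability: str, endpoint: str) -> None:
        """Record that `endpoint` offers `capability`, ignoring repeats."""
        endpoints = self._entries.setdefault(capability, [])
        if endpoint not in endpoints:
            endpoints.append(endpoint)

    def discover(self, capability: str) -> List[str]:
        """Return a copy of the endpoints advertising `capability`."""
        return list(self._entries.get(capability, []))


# A newly developed service advertises itself; a client later looks it up.
registry = DiscoveryRegistry()
registry.advertise("track-correlation", "ads://host-a/track")
registry.advertise("track-correlation", "ads://host-b/track")
candidates = registry.discover("track-correlation")
```

The point for migration planning is that both halves, advertising the new service and discovering the services it relies on, are code that does not exist in the legacy component and has to be written.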

Understanding the Existing Capabilities

We met with the contractor and program office representatives to learn about the system, to focus our investigation on a limited number of legacy components, and to select criteria for further screening. The current system, written in C++ on a Windows operating system, had a total of about 800,000 lines of code and 2500 classes (the rough equivalent of modules in the object-oriented paradigm). In addition, the system had dependencies on a commercial database and a second product for visualizing, creating, and managing maps. Both commercial products have only Windows versions. The program office had completed a preliminary identification of potential services that could be built from components of the legacy system. Using the screening criteria, we selected seven of these services, containing 29 classes, as our focus since they offered potentially high payoff for our effort. Together with the contractor and program office representatives, we developed characteristics for analyzing the potentially reusable components. These characteristics reflected both the base characteristics provided by the Options Analysis for Reengineering (OAR) technique developed at the SEI and our knowledge of the necessary characteristics of services operating within the target SOA [1]. The characteristics included:

Size
Complexity
Level of documentation
Coupling
Cohesion
Number of base classes
Programming standards compliance
Black box vs. white box suitability
Scale of changes required
Commercial mapping software dependency
Microsoft dependency
Support software required

The characteristics formed the basis for the more detailed analysis discussed below.
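One way to keep such characteristics comparable across candidates is to record them in a uniform structure. The Python sketch below is an assumption about how that could be done, not a description of the worksheets used in the engagement; the field names cover only a subset of the list above, and the example values are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CandidateProfile:
    """A subset of the screening characteristics for one candidate service."""

    name: str
    lines_of_code: int          # size
    base_classes: int           # number of base classes
    coupled_classes: int        # coupling
    mapping_dependency: bool    # commercial mapping software dependency
    microsoft_dependency: bool  # Microsoft dependency


def platform_blockers(profile: CandidateProfile) -> List[str]:
    """List the dependencies that conflict with a Linux-only deployment target."""
    blockers = []
    if profile.mapping_dependency:
        blockers.append("commercial mapping software")
    if profile.microsoft_dependency:
        blockers.append("Microsoft platform")
    return blockers


# Invented example values, for illustration only.
example = CandidateProfile(
    name="route-planning",
    lines_of_code=12000,
    base_classes=4,
    coupled_classes=9,
    mapping_dependency=True,
    microsoft_dependency=False,
)
print(platform_blockers(example))
```

Recording every candidate in the same shape makes it easier to see which ones carry platform dependencies before any effort estimate is attempted.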

Analyze the Gap

Given the known and projected constraints of the target SOA, we performed a preliminary analysis of the legacy components to determine their suitability for reuse as services and the amount of effort and risk that would be involved. This analysis consisted of three parts: 1) a gap analysis of the changes to the legacy components that would be necessary for migration to the SOA, 2) an informal evaluation of code quality, and 3) an architecture reconstruction to obtain a better understanding of the set of undocumented dependencies between different parts of the legacy system. The results of these analyses allowed us to define a service migration strategy based on the risks due to the unknown future state of the target SOA. These analyses are outlined below.

During the gap analysis we considered the candidate legacy components in terms of the characteristics that were developed earlier. The characteristics provided input on level of difficulty and risk factors. We identified dependencies of the selected classes on the commercial mapping software, the commercial database, and Windows. The contractor provided estimates for converting the components into services, based on a set of simplifying assumptions about the actual make-up of the target SOA and the final set of user requirements. A summary of this initial analysis of converting the selected components to services is illustrated in Figure 3.

Figure 3. Results of Initial Analysis

If the existing contractor made these changes, the level of difficulty would be low to medium, and the risk would be low because of the contractor's familiarity with the systems. We reached these determinations through detailed discussions with the contractor, in which we explained the target SOA requirements and constraints and the contractor analyzed the specific changes to be made to the code. However, the architecture documentation was inadequate, which left a number of gaps in our understanding of the system. In addition, the contractor underestimated the amount of code used by the potential services. To get a better understanding of these issues, we performed a code analysis and an architecture reconstruction. We first analyzed the code with a code analyzer, "Understand for C++". This analysis provided:

Data dictionary
Metrics at the project, file, class, and function level
Invocation tree
Cross reference for include files, functions, classes/types, macros and objects
Unused functions and objects

The code analysis enabled us to validate the input from the contractor and to produce input for an architecture reconstruction tool that would identify dependencies between portions of the code. From the code analysis, we found that the code was better organized and documented at the code level than most code that we have seen. However, as mentioned earlier, there were inconsistencies in the quality and documentation between different parts of the code that made the analysis complicated. Because we needed to get a better understanding of dependencies between different parts of the legacy code, we performed an architecture reconstruction with a tool called ARMIN. This analysis, a form of software archaeology, provided a look at the software architecture of the "as-built" system [3, 6]. In our analysis, we were interested in:

Dependencies between services and user interface code--user interface code needs to be removed from services given that the users of these services will have their own user interfaces
Dependencies between services and the commercial mapping software--this code has to be clearly identified so that decisions can be made as to how to replace this functionality if Windows products are not allowed in this environment
Dependencies between services--there were two migration projects taking place at the same time; if these projects shared code it would make no sense to treat them as separate projects
Dependencies between the services and the rest of the code that mainly represented the data model--this would prove the importance of the data model as well as the underestimation of the code used by the services

The analysis was able to identify a substantial number of undocumented dependencies between portions of the code. These enabled a more realistic understanding of the scope of the migration effort. It also documented dependencies with the commercial mapping software and database. These are a potential concern in the target environment. The architecture reconstruction also provided evidence that the system data model is potentially a valuable reusable component that had not been identified during the initial analysis. However, this finding was tempered by the fact that in the target SOA environment, the eventual common data model will preclude the use of the current data model. The common data model will likely be the result of negotiation among many interested parties.
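The reason undocumented dependencies change the scope estimate can be shown with a small reachability computation. The Python sketch below is not how ARMIN works; it simply follows dependency edges from the classes assigned to a service to find everything that would have to migrate with them. The class names and the dependency graph are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Set


def reachable_classes(
    depends_on: Dict[str, Iterable[str]], roots: Iterable[str]
) -> Set[str]:
    """Return every class reachable from `roots` by following dependencies."""
    seen = set(roots)
    queue = deque(seen)
    while queue:
        current = queue.popleft()
        for dep in depends_on.get(current, ()):
            if dep not in seen:  # also guards against dependency cycles
                seen.add(dep)
                queue.append(dep)
    return seen


# Hypothetical dependency graph: class name -> classes it uses.
depends_on = {
    "RoutePlanner": ["Track", "MapView"],
    "Track": ["DataModel"],
    "MapView": ["MappingSdk", "DataModel"],
    "Report": ["DataModel"],
}

needed = reachable_classes(depends_on, ["RoutePlanner"])
hidden = needed - {"RoutePlanner"}  # code pulled in beyond the named class
```

In this made-up graph, a service nominally consisting of one class pulls in four more, including the data model and the mapping product wrapper, which mirrors the kind of underestimation described above.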

Developing a Strategy for Service Migration

In general, we found that the legacy code represents a set of components with significant reuse potential. However, because the current legacy system does not have sufficient architecture or other high level documentation, it was difficult to understand both the "big picture" and the lower level dependencies between portions of the code. If the migration to service effort moves forward, the results of the architecture reconstruction can provide a starting point for understanding how to disentangle these dependencies. The largest risk in reusing the legacy components concerns the fact that the target SOA has not been fully developed. While its overall structure has been defined, many of the specific mechanisms for interacting with it are still pending. Thus, it is not yet clear what the ultimate requirements will be for a service. Based on these general observations, the recommended migration strategy can be summarized in the following steps:

1. Require the contractor to update the software architecture documentation and standardize comments in the code.
2. Work with the developers of the target SOA to define what is meant by a compliant service.
3. Work closely with the team within the target SOA group that is defining the data model to understand its contents and influence it as necessary.
4. Find out if the vendor has plans for a Linux version of the mapping software or if the target SOA group has plans for a mapping common service to replace the current Windows mapping software.
5. Interact with potential application developers that will be using the services to understand their requirements and develop appropriate service interfaces.
6. Recalculate cost and effort of migration based on complete set of code dependencies and new understanding of user requirements and SOA constraints.
7. Understand the commonality between the current service migration effort and a second forthcoming similar migration project to a different target SOA.

We recommended that the program office take a proactive approach in working with the developers of the target SOA to understand implications of the current and evolving SOA plans. The program office should also work closely with the developers of the applications that will be using these services to obtain requirements. An immediate step is to actively participate in negotiations regarding the common data model to make sure that it contains all the information they need.

4. Service-Oriented Migration and Reuse Technique (SMART)

We had previously employed this general approach and these techniques on other architecture and reuse investigations, but not as an integrated approach for making decisions on migration to services. Because of interest from other customers, we are in the process of formalizing the approach and techniques into the Service-Oriented Migration and Reuse Technique (SMART). A Technical Note will be coming out soon that explains the technique. The C2 example provides a general outline of SMART, which consists of the following steps:

1. Establish stakeholder context.
2. Describe existing capabilities.
3. Describe the future service-based state.
4. Analyze the gap between service-based state and existing capabilities.
5. Develop strategy for service migration.

SMART is not intended to replace system engineering activity. It provides a preliminary analysis of the viability of migrating to services, migration strategies available, and the costs, risks, and associated confidence ranges for each strategy. The organization must still pursue an appropriate engineering strategy [5].

5. Conclusions and Next Steps

Determining how to expose functionality as services can be substantially complex. Software archaeology, in this context, has to go beyond understanding the architecture of a system and reusing code. Other concerns, such as the stakeholder context and SOA development plans, will greatly influence the way in which code is reused.

Our report to the C2 client, while not definitive, did point out a number of issues that they had not previously considered. The type of disciplined analysis that we performed supported our recommendations and gave the client a substantial amount of information. For example, a by-product of this work was a better understanding of the target SOA on the part of the client. The developer of the target SOA also benefited from the process because it provided a better understanding of the potential needs and challenges for the services that would use the target SOA. Given our recommendations, and mainly the fact that the target SOA is still under development, the client decided to defer the migration decision, a decision that could have been costly two years from now. The type of disciplined analysis that we performed appears to be applicable to other organizations that are considering migrations to SOAs.

An early version of SMART was applied in the C2 example just described. This early version differed from the current structure of SMART because various guides and outputs had not yet been formalized. However, similar concepts were applied informally, and the early version served to direct and provide discipline to our analysis. We are currently updating the prototype process with the following goals:

Develop tools that expose SOA concerns that need to be addressed when exposing functionality as services. A Service Migration Interview Guide to be used in SMART is the first of such tools.
Incorporate decision rules to determine when it is most useful to include the code analysis and architecture reconstruction steps as part of the process.
Make the process repeatable so that it can be used by the wider community. The tools and decision rules being developed are a first step in developing a repeatable process.
Improve the breadth and consistency of information gathered about the engineering effort necessary to change the legacy artifact into a service. The Service Migration Interview Guide (SMIG) is the first tool intended for this purpose. By incorporating significant technical "know how" into the SMIG, we also further an ultimate goal of transitioning the technique to other users.
Develop machine support for capturing and analyzing data gathered during the SMART process. This will entail building templates for major artifacts.
Develop techniques and criteria for determining when a SMART team has captured sufficient information to complete the analysis process.
Establish a mechanism to capture the net effect of SMART on migration efforts. This information is essential for continued evolution and improvement of SMART.

References

[1] Bergey, J.; O'Brien, L.; and Smith, D. "Using the Options Analysis for Reengineering (OAR) Method for Mining Components for a Product Line," 316-327. Software Product Lines: Proceedings of the Second Software Product Line Conference (SPLC2). San Diego, CA, August 19-22, 2002. Berlin, Germany: Springer, 2002.
[2] Brown, A.; Johnston, S.; and Kelly, K. Using Service-Oriented Architecture and Component-Based Development to Build Web Service Applications. Rational Software Corporation. 2002.
[3] Kazman, R.; O'Brien, L.; and Verhoef, C. Architecture Reconstruction Guidelines, 2nd Edition (CMU/SEI-2002-TR-034). Software Engineering Institute. November 2003.
[4] Lewis, G. and Wrage, L. Approaches to Constructive Interoperability (CMU/SEI-2004-TR-020). Software Engineering Institute. January 2005.
[5] Morisio, M.; Ezran, M.; and Tully, C. Success and failure factors in software reuse. IEEE Transactions on Software Engineering 28, no. 4, (April 2002): 340-57.
[6] O'Brien, L.; Stoermer, C.; and Verhoef, C. Software Architecture Reconstruction: Practice Needs and Current Approaches (CMU/SEI-2002-TR-024). Software Engineering Institute. August 2002.
[7] The Service Level Management Learning Community web site: http://www.nextslm.org/

About the Authors

Grace Lewis

is a senior member of the technical staff at the Software Engineering Institute (SEI) of Carnegie Mellon University (CMU), where she is a part of the Integration of Software-Intensive Systems (ISIS) Initiative. Grace is currently working in the areas of constructive interoperability, service-oriented architectures, Web services, modernization of legacy systems, and model-driven architecture. Her latest publications include several reports published by Carnegie Mellon on these subjects and a book in the SEI Software Engineering Series. Grace has over fifteen years of experience in software engineering. She is also a member of the technical faculty for the Master in Software Engineering program at CMU. Grace holds a B.Sc. in Systems Engineering and an Executive MBA from Icesi University in Cali, Colombia, as well as a Master in Software Engineering from Carnegie Mellon University.

Edwin Morris

is a senior member of the technical staff at the Software Engineering Institute, assigned to the Integration of Software-Intensive Systems (ISIS) Initiative. He is currently investigating approaches to achieving technical interoperability between complex systems and programmatic interoperability between the organizations that build and maintain them. Previous activities involved improving processes and techniques for the evaluation and selection of COTS products, and the development of the COTS Usage Risk Evaluation (CURE) technology. Before coming to the SEI, Ed developed custom operating systems for embedded microprocessors along with support tools to predict and monitor the performance of real-time systems. Ed holds a B.A. and an M.A. in Psychology from the University of Connecticut, as well as an M.S. in Computer Science from Bowling Green State University.

Dennis Smith, PhD

is a senior member of the technical staff and lead of the Integration of Software-Intensive Systems (ISIS) Initiative. This initiative was launched in October 2003 and focuses on developing and applying methods, tools, and other technologies that enhance the effectiveness of complex networked systems and systems of systems. Previously, he was a member of the Product Line Systems Program and technical lead in the effort for migrating legacy systems to product lines. In this role his team developed the Options Analysis for Reengineering (OAR) method to support reuse decision-making. He has published a variety of books, articles, and technical reports, and has given talks and keynotes at conferences and workshops. Dennis was the co-editor of the IEEE and ISO recommended practice on CASE Adoption, and has been general chair of two international conferences, IWPC99 and STEP99. Dennis holds an M.A. and PhD from Princeton University, and a B.A. from Columbia University.

¹ The most common (but not only) form of SOA is that of Web services, in which (1) service interfaces are described using Web Services Description Language (WSDL), (2) payload is transmitted using Simple Object Access Protocol (SOAP) over Hypertext Transfer Protocol (HTTP), and optionally (3) Universal Description, Discovery and Integration (UDDI) is used as the directory service [4].

² XML (Extensible Markup Language) is currently the most common format for message payload within SOAs.

³ Base classes are those from which the classes in the service are inheriting properties. "Coupled" classes are those that contain code that is used by the classes in the service. It is important to account for these as they represent code that needs to be migrated.

October 2005
Vol. 8, Number 3

Software Archaeology
