Software Archaeology
By Andy Schneider, Lead Integration Architect, Oil Trading & Supply Systems, BP Plc., and Pete
Windle, Consultant, BJSS Ltd.
Introduction
For us to discuss software archaeology productively we must first define it. A naive definition can be derived thus:
soft·ware ar·chae·ol·o·gy (sôft-wâr är-kē-ol-ə-jē)
The systematic study of past software systems by the recovery and examination of remaining material evidence, such
as code, tests and design documentation.
Using archaeology as a metaphor allows us to reason about how we recover and examine material evidence relating
to software projects. As useful as this is, it is by no means the entire story. Often there are individuals
to interview, and primary and secondary evidence to investigate. These activities owe more to historical
practice than they do to archaeology. One of the reasons for this is that existing communities in organizations ensure
that the investigation of systems is as much about people, existing practice and precedent as it is about tools and
artifacts 'dug out of the ground'. This emphasis on people and historical practice is less evident in existing software
archaeology literature.
As an illustration, consider the UK constitution. The UK constitution defines the form, structure, activities, character,
and fundamental principles by which UK society, law making institutions and government institutions operate.
Famously, though, it is not written down. Instead it consists of a number of bodies, namely the monarchy, the
executive, parliament and the civil service, supported by both precedent and law. A student of the UK constitution is
rather like someone trying to understand a software system. The student has to talk to experts, study paperwork,
identify 'urban myths', analyze past behaviours and cope with the fact that, whilst they investigate it, the constitution is
changing. The authors find this process, the process of historical study, to be as compelling an analogy as
archaeology when considering practical techniques for studying software systems.
With this in mind, this paper will take a more historical and people-oriented perspective. First, we will outline some
of the limitations of the archaeology and historical metaphors. We will then go on to outline certain principles of
history, as applicable to software systems, and then use these principles to drive out key techniques for understanding
systems. Hopefully we will leave you, the reader, with practical techniques and another way of thinking about
understanding systems.
The Trouble with a Metaphor
It is important that any metaphor is seen for what it is - a starting point or a framework for describing the properties of
the object of interest. It is not to be used as a reference or to define truth. One can say that a hierarchical software
structure is like the branches of a tree; one cannot then infer that software is made of wood. Fowler goes into this in
some detail[3].
So then, what are the limitations of our archaeological and historical metaphors?
Software is unlike archaeology in many ways. One of the authors' main contentions is that archaeology
is essentially an academic subject: its practical techniques exist to divine information from the past using the
artifacts left behind, for no reason other than the discovery of knowledge. In contrast, the authors believe
that the role of software archaeologist is useful simply as part of a software practitioner's everyday skill set. As Booch
notes in [1], software archaeology provides the ability to "[recover] essential details about an existing system sufficient
to reason about, fix, adapt, modify, harvest, and use that system itself or its parts."
Also, a large part of archaeology is dedicated to maintaining the condition of a site or dig. Software is inherently easy
to protect from external damage - even if you don't have a source control system, you can always burn a copy onto a
CD.
We contend, therefore, that historical techniques are just part of the toolkit to be applied to software problems. The
main difference here is one of timescale - the information to be gleaned from software historical studies in our context
is usually much closer to the present, giving more access to primary sources.
The differentiation between archaeology and history is no doubt one that the respective disciplines have debated
extensively. For our purposes, we will define history as "the study of the past". At that point, archaeology becomes
that subset of history that is largely concerned with the study of the concrete artifacts left behind by a subject of
interest. History encompasses archaeology then, and adds more of a bias towards considering the human aspects of
the subject in hand.
Applying this to software, then, we afford a system's surrounding culture equal weight with its artifacts.
Types of Evidence
Historians subdivide evidence into primary and secondary sources.
Primary sources can be material things such as artifacts, tools, constructions or remains (although in software,
"material" can include intangibles - source and object code, log files, etc). Alternatively, they could be
contemporaneous written records - documents, specifications, e-mails.
Finally, interviews conducted with the protagonists or direct witnesses to events are themselves primary sources. The
oral history of software teams tends to be a very rich source of information (albeit some of it apocryphal or cargo
cult[2] invocation). This is for several reasons, but mainly because most software projects are delivered with a
number of undocumented deltas to the original requirements and/or design.
Cargo cult science is a term used by Richard Feynman in his 1974 Caltech commencement address to
describe work that has the semblance of being scientific, but is missing "a kind of scientific integrity, a principle of scientific thought
that corresponds to a kind of utter honesty". Feynman cautioned that, to avoid becoming cargo cult scientists, researchers must first of all avoid
fooling themselves, be willing to question and doubt their own results, and investigate possible flaws in a theory or an experiment
(from http://en.wikipedia.org/wiki/Cargo_cult_science).
Secondary sources are texts and documents which are derivative of the actual event - not actually part of the
phenomenon but later analysis or interpretation of it. As we will discuss later, the production of secondary sources is
an important part of the software historian's craft.
When considering these sources it is often useful to evaluate their quality before spending significant time reading
them. Some questions a good historian may ask when judging software related sources would be:
- Why was the source created?
- Does it look like the source was ad-hoc and rushed, or well thought out and pre-planned?
- Did the author actually work on the system?
- Who was the target of the document?
- Was the source intended to be for public consumption or private to the team?
- Was the document in current use?
- How long after the system/module was written was the source produced?
Exploring the Past
A good historian recognises that the past is different from the present. This means not researching a system
with the aim of reinforcing existing expectations, but rather allowing the system to teach you how it really is. For
example, when encountering a method called createInstance you may assume from the name that it creates instances
of some structure or object and therefore pass it by. However, the method could be doing anything. There are
examples of poor naming all over the IT world. Worse, by passing over the method you may have missed a lesson;
that 'instance' in this system refers to some specific concept (such as a point in time) rather than the meaning you
have ascribed to it.
Be conscious of signifiers
In semiotics [5], people distinguish between the signified, the concept being
communicated and the signifier, the token that represents the concept. The
signifier and the signified together are known as a sign. The much used example
of this is the White Hat in old cowboy movies. Viewers of these movies know that
the White Hat stands for the Good Guy. If you are not familiar with this sign, or
apply it outside of the genre then you may miss something or worse,
misunderstand the communication.
Pattern names are signifiers and so are software terms such as 'instance'. So, be
conscious that sets of signs used in the past may not be the same as those you
are using. Or, put more bluntly, 'assumption is the mother of all screw-ups'.
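To make the point concrete, here is a deliberately contrived sketch in which the signifier misleads; the class, the method and the system's private meaning of 'instance' are all invented for illustration:

```python
# Hypothetical illustration: the method name signifies "factory",
# but in this invented system an "instance" is a point in time.
import datetime


class AuditTrail:
    """Records timestamps ("instances") for audit purposes."""

    def __init__(self):
        self.instances = []

    def createInstance(self):
        # Despite the name, no new AuditTrail object is created:
        # the method records the current moment in time.
        stamp = datetime.datetime.now()
        self.instances.append(stamp)
        return stamp
```

A reader who assumed createInstance was a factory method and passed it by would miss exactly the lesson described above: that 'instance' carries a private meaning in this codebase.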
With the best will in the world, you cannot hope to know all there is to know about a system and its past, so it is worth
remembering that the past you see is only part of the picture. Consider the functionality someone developed in the
past. Ask yourself "was that the functionality that was intended?" - moreover, ask yourself "was that the functionality
that was actually wanted?" If you assume that the code correctly implements the required functionality then you may
be missing a bug or a requirements gap.
It doesn't always do what it says on the tin
On encountering some code written in the past, seek to both understand the code and determine
if the code correctly implements the functionality. If you cannot determine the latter, be sure to
explicitly recognise that fact. Sources of functional definition (apart from a requirements
document) are often the support team or the users.
The functionality of the system you are studying is and has been part of a wider system, combining software,
users and support (at a minimum). Good historians always consider multiple perspectives and contexts before
drawing conclusions.
It isn't just the code
When you are examining functionality in the system, and applying 'It doesn't
always do what it says on the tin', you may feel the urge to fix the problem. Before
you do this, determine if the wider system relies on the 'error' discovered in the
software system for correct operation.
The past is not static. When seeking to understand a system, recognise that you can examine it both as it is now
and as it was at a series of points in time. A historian will always look at what came before
and after when assessing a particular point in time.
Consider collating data such as:
Points in time
Gather metrics that vary over time to obtain an understanding of the evolution of
the system. Correlate specific events in the results with any spoken or written
history of the project to enrich your understanding of why things have evolved the
way they did.
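As a minimal sketch of this kind of collation, assuming you can extract a metric per release and keep a log of project events (the releases, figures and event below are all invented):

```python
# Correlate a metric sampled over time (here, lines of code per release)
# with known project events, flagging releases with unusual growth.

def flag_unusual_growth(samples, events, threshold=0.25):
    """samples: list of (label, metric) in chronological order.
    events: dict mapping label -> event description.
    Returns labels whose relative growth exceeds the threshold,
    paired with any recorded event for that release."""
    flagged = []
    for (prev_label, prev_val), (label, val) in zip(samples, samples[1:]):
        growth = (val - prev_val) / prev_val
        if growth > threshold:
            flagged.append((label, events.get(label, "no recorded event")))
    return flagged


releases = [("v1.0", 10000), ("v1.1", 11000), ("v2.0", 18000), ("v2.1", 18500)]
history = {"v2.0": "merger with billing system team"}
# Flags v2.0, whose ~64% growth correlates with the recorded merger.
print(flag_unusual_growth(releases, history))
```

The interesting output is not the numbers themselves but the correlation: an unexplained jump is a prompt for another conversation with the team.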
Cultures
Examining a system is inextricably linked with gaining familiarity with the group of people who create (or created), use
and support the system. There is much to be gained from examining some of the popular ethnography literature (e.g.
[6][7][8]). There isn't space in this article to spend much time on the subject, but we have
included a limited number of practices to ensure a reasonable breadth of coverage.
Many cultures deny that an outsider can ever understand them. They maintain that their treasure is only for
members; this is particularly the case in some tribes. Recognise that groups that form around legacy systems can
exhibit many tribal characteristics. Understand that whilst the tribe may be the final authority on their perceptions and
the terms they use, the past and its evolution are open for you to look into. Because of the tribal nature, it is important
to use their language, to be initiated and to adopt their customs - without becoming pickled (i.e. 'going native') or
failing to question the customs.
Speak the lingo
Paying attention to the terminology used by existing legacy teams allows you to
communicate with them on their terms. This helps communication, improves
rapport and shows you are actively listening.
Each group of people will have their own interpretation of history; listening to only one (e.g. the development
team) can result in inaccurate bias creeping in to your research process.
Seek Multiple Sources
Acquire multiple perspectives on the system to reduce the sensitivity of your findings to bias.
You can extend this practice to observing how conversations with individuals vary over time. For example, one of the
authors was interviewing someone on a legacy team about the deployment process. The process sounded incredibly
complex and the reasons for the complexity seemed valid, but somewhat tortuous. The author went back for repeated
conversations and noticed as time went on that the explanations changed or simplified. This indicated that there was
more to the story than was being told. The differences pointed to where to explore and the author changed his tack as
a result.
Question Customs
Responses to questions of the form 'well, it just works that way' are often clues of
where to spend your time when examining a system. These responses may point
to cargo cult behaviour[2]. Identifying these misconceptions provides useful pointers to
fragile or complex areas of the system that need to be understood well.
Cultures have a strong interest in their own pedigree. This can result in cultural chauvinism, where certain groups
compete to show they are the 'oldest', 'fastest', 'youngest' etc. To do this they will often, sometimes without knowing
it, manipulate history - to show things in a certain light. It is important therefore to rely not only on artifacts from a
team, but also actively engage with external stakeholders to gather data. For example, if you are interested in
performance, ask the support team; measure the response times, do not just rely on anecdotes from the team. See
Multiple Sources above.
Gather Objective Data
Where possible, gather quantifiable data to validate hearsay, e.g. use profiling tools to
determine where the bottleneck really is.
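A hedged sketch of gathering such data with Python's standard profiler; the 'hotspot' being measured here is invented, but the technique applies to any suspected bottleneck:

```python
# Use the standard library profiler to find where time actually goes,
# instead of relying on the team's recollection.
import cProfile
import io
import pstats


def format_row(row):
    return ",".join(str(x) for x in row)


def suspected_hotspot(rows):
    # Quadratic string concatenation: a plausible, measurable bottleneck.
    out = ""
    for row in rows:
        out += format_row(row) + "\n"
    return out


profiler = cProfile.Profile()
profiler.enable()
suspected_hotspot([(i, i * 2) for i in range(2000)])
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())  # per-function call counts and times
```

Ten minutes with a profiler will often settle an argument that anecdote has kept alive for years.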
Documenting the Past
There are several different techniques for documenting the past. This section details some of the forms of output and
methods of production that the authors have found useful.
Update and Extend the Existing Literature
This is generally not a valid option in historical circles - a historian who merely
rewrites other people's books to "correct" interpretations would not be seen as
contributing much.
When understanding software systems, however, there is usually the potential to at
least use existing documentation to bootstrap your work.
Catalogue existing design and requirements documentation. Do not treat it as
gospel truth - consider potential levels of accuracy, bias and how up-to-date it has
been kept. Consider which parts are still useful and which approaches are worth
extending.
It may be that a slash and burn strategy is required should the existing
documentation be woefully inadequate or glaringly inaccurate. Not every document
contains enough value to be worth recovering.
Conversely, it may be that although inaccurate in places, some existing
documentation is "good enough" - it is still worth briefly recording where it is not,
however.
Start With A Clean Sheet
In a way, the mirror image of the previous point. When approaching a large and
unknown architecture with many moving parts, it is generally best to start with a blank
sheet of paper.
This can then be iteratively filled in as your understanding expands - the blanks within
the diagram then become a map of where further investigation is required.
On an old map of the world, these places would be marked "Here Be Dragons", perhaps.
Good diagrams for forming these high-level views include:
- Activity diagrams - especially for core transaction pipelines.
- Sequence diagrams - for understanding interactions between object graphs.
- Package and/or class diagrams - for establishing an idea of structure.
Don't Trust Their Focus
The subjects that the original authors of sources considered interesting may not be
the events that you are interested in now. A corollary of "The Past Is Not The
Present".
The low-level concerns of the parish council rarely make front page news, but may
well generate screeds of text.
In context, then, the original documentation may wax lyrical at some length about
their optimistic locking model. Providing that it works, you may never need to touch it
and therefore understanding the intimate detail may be a waste of your time. Of
course, this is providing that it works.
The key has to be to understand your own requirements at that moment, and
search out the documentation that meets them.
Make a Timeline
Draw a history timeline showing significant events that occurred during the lifetime of
the development and use this to remind you of the context within which people
worked.
This helps because it forces you to think about why things happened and when they
happened - it's a framework for thinking about the past in a dynamic fashion.
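A timeline need not be elaborate; a minimal sketch, with invented dates and events, might be:

```python
# Render significant project events as a simple chronological timeline.

def render_timeline(events):
    """events: list of (iso_date, description) pairs, in any order.
    Returns one text line per event, sorted chronologically."""
    lines = []
    for date, description in sorted(events):
        lines.append(f"{date}  {description}")
    return "\n".join(lines)


events = [
    ("2001-03-15", "first production release"),
    ("2000-06-01", "project started"),
    ("2002-11-20", "original architect leaves"),
]
print(render_timeline(events))
```

Even a plain sorted list like this makes gaps and clusters of activity visible, prompting the question "what was happening here?".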
Know What You Don't Know
Documenting what you don't know allows people who follow you to determine where
they need to spend their time when investigating the system, and to distinguish
between validated fact and hypothesis.
Conclusion
The authors have looked at applying some of the principles of history to the study of existing software systems, and
have tried to outline some practices to assist the student of those systems.
Whilst the archaeological metaphor looks initially compelling, we believe that the essentially human nature of software
and its comparatively contemporaneous nature make the history metaphor a more interesting fit.
Finally, it should be stressed that both the software archaeologist and software historian are merely roles that are
adopted at one time or another by the software professional throughout the course of their everyday work, and should
not be taken as anything more than useful metaphors for examining best practices in the field.
References
[1] Booch, Grady. Software Archeology.
http://www.booch.com/architecture/blog/artifacts/Software%20Archeology.ppt
[2] Olson, Don et al. The Manager Pool. Addison Wesley, 2001.
[3] Fowler, Martin. Metaphoric Questioning. http://www.martinfowler.com/bliki/MetaphoricQuestioning.html
[4] http://www.math.utah.edu/~alfeld/math/polya.html
[5] Chandler, Daniel. Semiotics for Beginners. http://www.aber.ac.uk/media/Documents/S4B/semiotic.html
[6] Barley, Nigel. The Innocent Anthropologist: Notes from a Mud Hut. Penguin, 1986.
[7] Hammersley, Martyn & Paul Atkinson. Ethnography: Principles in Practice. Routledge, 1994.
[8] Agar, Michael. The Professional Stranger: An Informal Introduction to Ethnography. Academic Press,
1996.
Further Reading
[9] Hunt, Andy & Dave Thomas. Software Archaeology.
http://www.pragmaticprogrammer.com/articles/mar_02_archeology.pdf
[10] Seacord, Robert, Daniel Plakosh & Grace Lewis. Modernizing Legacy Systems. Addison Wesley, 2003.
[11] Demeyer, Serge, Stéphane Ducasse & Oscar Nierstrasz. Object-Oriented Reengineering Patterns. Morgan
Kaufmann, 2002.
[12] Jackson, Michael A. Problem Frames: Analysing and Structuring Software Development Problems. Addison
Wesley, 2001. (See http://en.wikipedia.org/wiki/Problem_Frames_Approach for an overview.)
About the Authors
Andy Schneider is the Lead Integration Architect for BP's Oil Trading & Supply systems. Andy is an industry exponent
of agile development techniques with over 15 years of relevant experience in the IT industry. He has extensive
experience in application and systems architecture, project management and software delivery. Andy regularly
publishes papers and presents on subjects such as technical leadership and systems development at conferences
such as OOPSLA and SPA.
Email: Andy Schneider [[email protected]]
Pete Windle is a technical project delivery specialist, consulting for BJSS (http://www.bjss.co.uk). He lives
in a software development commune in Islington, London.
Email: Pete Windle [[email protected]]