A Study Guide for Refactoring

Version: 1.3
Author: Nik Boyd
Started: July, 2001
Updated: August, 2002

This guide provides a structured introduction to the book:

Martin Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley Publishing Co., Inc., 1999. ISBN 0-201-48567-2.

Study Group Goals

For an introduction to study groups, please see A Learning Guide to Design Patterns, by Joshua Kerievsky.

For this study group on refactoring, we want to:

Proposed Discussions

The front inside cover of Refactoring provides an indexed list of the refactorings. The back inside cover provides a list of the refactorings organized by Bad Smell. The proposed discussion order is intended to quickly introduce participants to the primary concepts and techniques. After the initial introductory sessions, we will systematically examine the most commonly used refactorings as organized by their associated Bad Smells. Some of the refactorings appear in multiple Smells, but revisiting and reviewing them in another context will help to reinforce their usefulness. The sessions are generally organized to cover the material in a single meeting. However, some sessions may spill over across more than one week if the discussions get lively.

Principles (53-73) Session 1
Motivates the discussions. It's all about product quality and developer productivity. This initial session will be used primarily to discuss the several topics in this chapter and answer questions about them. So, bring questions!
Testing (89-102) Session 2
Software engineering entails testing, especially regression testing. Refactoring is founded on regression testing to ensure that externally observable behavior is maintained during and after refactoring. This session will be used to discuss testing in general, but will focus on regression testing, the role it plays in refactoring, and regression test frameworks like JUnit.
Example (1-52) Sessions 3 + 4
During these two sessions we consider how the following refactorings apply to the example in Chapter 1: Extract Method (110), Move Method (142), Replace Temp with Query (120), Self Encapsulate Field (171), Replace Type Code with State/Strategy (227), Replace Conditional with Polymorphism (255).
Duplicated Code (76) Session 5
Duplicated code is likely the single most prevalent "sin" in software. Extract Method (110), Extract Class (149), Pull Up Method (322), Form Template Method (345), Substitute Algorithm (139)
Long Method (76) Session 6
Smaller methods enhance reuse, understandability and maintainability. Extract Method (110), Replace Temp with Query (120), Replace Method with Method Object (135), Decompose Conditional (238)
Long Parameter List (78) Session 7
Object-oriented method signatures are generally simple. Long parameter lists should usually be simplified by using existing objects or introducing new ones. Replace Parameter with Method (292), Introduce Parameter Object (295), Preserve Whole Object (288)
Large Class (78) Session 8
Classes should generally focus on only a few responsibilities. Classes that know or do too much should be decomposed into simpler classes. Extract Class (149), Extract Subclass (330), Extract Superclass (336), Extract Interface (341), Replace Data Value with Object (175)
Alternative Classes (85) Session 8
After a rapid construction phase, some method names may not adequately reveal their intention, or a method may not have an optimal placement within a collaboration. Rename Method (273), Move Method (142), Extract Superclass (336)
Inappropriate Intimacy (85) Session 9
Familiarity breeds contempt. Too much intimacy results in strong coupling and unwarranted dependency. Move Method (142), Move Field (146), Change Bidirectional Association to Unidirectional (200), Replace Inheritance with Delegation (352), Hide Delegate (157)
Switch Statements (82) Sessions 10 + 11
Polymorphism is an elegant mechanism for separating cases. Replace Conditional with Polymorphism (255), Replace Type Code with Subclasses (223), Replace Type Code with State/Strategy (227), Replace Parameter with Explicit Methods (285), Introduce Null Object (260)
Primitive Obsession (81) Sessions 12 + 13
Coherent groups of primitive values and functions that operate on them should be modeled as objects. Replace Data Value with Object (175), Extract Class (149), Introduce Parameter Object (295), Replace Array with Object (186), Replace Type Code with Class (218), Replace Type Code with Subclasses (223), Replace Type Code with State/Strategy (227)

Study Questions

Disclaimer: These questions can and should be augmented and / or replaced by questions raised by the participating study group members.

Principles (53-73) Session 1
  1. What is the relationship between refactoring and factoring? What do these concepts have to do with the mathematical concepts of factoring and expression simplification?
  2. What standards can be applied to evaluate the quality of a software design? Are there any heuristics for determining the quality of a design? Are there any quantitative metrics for measuring design quality?
  3. Can essential complexity be reduced or merely displaced? Is essential complexity always conserved globally?
  4. Are there always trade-offs between refactoring and performance?
  5. Does the absence of source code limit the boundaries of what can be refactored?
  6. Does software development have a natural cycle? (hint: alternation between expansion and consolidation)
  7. Is incremental refactoring better than wholesale refactoring? Are there situations where only wholesale refactoring will avail?
  8. Should software code reviews be conducted based on written quality standards and guidelines? (hint: yes)
  9. Is there any way to persuade a manager to support refactoring when development is driven by schedule over quality? (hint: What are their points of pain? How can they be addressed by refactoring?)
 
Testing (89-102) Session 2
  1. How are unit tests related to interface contracts?
  2. What distinguishes unit testing from functional testing?
  3. How can we regression test functionality, e.g., of a human-computer interface? What tools support this?
  4. How does one force exceptions to occur during regression testing?
  5. How are regression test suites updated and managed as the code base changes?
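
One possible starting point for question 4 is to substitute a stub collaborator that always fails, so the error path runs on every test. The sketch below is hand-rolled rather than tied to JUnit, and the names PriceSource, FailingPriceSource, and PriceReporter are hypothetical, invented only for illustration.

```java
import java.io.IOException;

// Hypothetical collaborator interface; not taken from the book.
interface PriceSource {
    double currentPrice(String symbol) throws IOException;
}

// Test stub that forces the failure path on every call.
class FailingPriceSource implements PriceSource {
    @Override
    public double currentPrice(String symbol) throws IOException {
        throw new IOException("simulated outage for " + symbol);
    }
}

// Code under test: reports a fallback message when the source fails.
class PriceReporter {
    private final PriceSource source;

    PriceReporter(PriceSource source) {
        this.source = source;
    }

    String report(String symbol) {
        try {
            return symbol + ": " + source.currentPrice(symbol);
        } catch (IOException e) {
            return symbol + ": unavailable";
        }
    }
}
```

A regression test would construct PriceReporter with the failing stub and check that report returns the fallback text. Because the collaborator is passed in through the constructor, no real outage is needed to exercise the catch block.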
 
Example (1-52) Sessions 3 + 4
  1. Were the proposed refactorings (number and / or type) surprising?
  2. Does replacing a temporary variable with a query ever obscure rather than clarify the code?
  3. Names (should) reveal intent. Are naming conventions important? How are naming conventions related to refactoring?
  4. What if a method uses information from multiple objects? Which class should own the method?
  5. The pricing strategy resulted in an abstraction and a few simple concrete classes. In Java, would using anonymous inner classes be a better (simpler) choice than the concrete (named) classes?
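
To make question 5 concrete, the sketch below places a named concrete class next to an anonymous inner class behind the same abstraction. It is not the book's listing: the class names, the Prices holder, and the rates are assumptions chosen for illustration.

```java
// Illustrative pricing abstraction; names and rates are assumptions.
abstract class Price {
    abstract double charge(int daysRented);
}

// Named concrete class: easy to find and reuse, at the cost of one more type.
class RegularPrice extends Price {
    @Override
    double charge(int daysRented) {
        double result = 2.0;
        if (daysRented > 2) {
            result += (daysRented - 2) * 1.5;
        }
        return result;
    }
}

// Hypothetical holder for strategies defined in place.
class Prices {
    // Anonymous inner class: compact, but the class itself has no name;
    // only the constant that refers to it reveals the intent.
    static final Price NEW_RELEASE = new Price() {
        @Override
        double charge(int daysRented) {
            return daysRented * 3.0;
        }
    };
}
```

Both forms behave the same to callers, so the discussion can focus on readability: whether the strategy needs a name, state, or subclasses of its own.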
 
Duplicated Code (76) Session 5
  1. Sometimes performance optimizations involve code duplication, e.g., inlining and loop unrolling. Are there any other occasions when it makes sense to duplicate code? In any case, how can one reduce the impact of code that must be duplicated?
  2. What is the relationship between factoring and refactoring? Are method extraction and template method formation examples of factoring or refactoring?
  3. Many algorithms have a similar lifecycle: initialization, execution, cleanup. Given such code patterns, how can duplications of the initialization and cleanup code be eliminated? When does it make sense to eliminate such duplication? (see, e.g., the Resource Manager pattern.)
  4. Sometimes the standard Collection classes can be used to simplify algorithms. What are the advantages of the standard Collection classes? While they can help simplify algorithms, what performance impact do they have?
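
For question 3, one way to remove repeated initialization and cleanup is to write the lifecycle once and pass in only the step that varies. The sketch below is a minimal illustration of that idea under stated assumptions; TracedResource and ResourceRunner are hypothetical names, and it is not claimed to be the Resource Manager pattern as published.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical resource that records its lifecycle so the order can be inspected.
class TracedResource {
    final List<String> events = new ArrayList<>();

    void open() {
        events.add("open");
    }

    void close() {
        events.add("close");
    }
}

class ResourceRunner {
    // Initialization and cleanup live here once; callers supply only the varying step.
    static <T> T with(TracedResource resource, Function<TracedResource, T> body) {
        resource.open();
        try {
            return body.apply(resource);
        } finally {
            resource.close(); // runs even when the body throws
        }
    }
}
```

A caller writes something like ResourceRunner.with(resource, r -> compute(r)), where compute is whatever work the algorithm performs. The same shape can be achieved with Form Template Method (345) by making the varying step an abstract method instead of a function argument.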
 
Long Method (76) Session 6
  1. Is there any time when it makes sense to have a long method (e.g., loop unrolling)?
  2. Is it thread-safe to eliminate temporary variables?
  3. Object-oriented developers often enjoy finding reusable objects and classes. Method Objects are one way to discover such objects from local variables in an existing system. What are some of the other ways to discover reusable objects and classes in existing code? (hint: instance variables, parameters, states).
  4. Object-oriented designers must become good at naming chunks of behavior. Decomposition and extraction refactorings require names for newly identified chunks of behavior. Are there any useful guidelines and conventions for naming new chunks of behavior (see 295)?
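
As a reference point for question 2, the sketch below shows Replace Temp with Query (120) in before and after form on a hypothetical Order class. The field names and discount figures are illustrative assumptions.

```java
// Hypothetical order class used only to illustrate the refactoring.
class Order {
    private final int quantity;
    private final double itemPrice;

    Order(int quantity, double itemPrice) {
        this.quantity = quantity;
        this.itemPrice = itemPrice;
    }

    // Before: the temp basePrice is visible only inside this method.
    double priceWithTemp() {
        double basePrice = quantity * itemPrice;
        if (basePrice > 1000) {
            return basePrice * 0.95;
        }
        return basePrice * 0.98;
    }

    // After: the temp becomes a query that other methods can share.
    double price() {
        if (basePrice() > 1000) {
            return basePrice() * 0.95;
        }
        return basePrice() * 0.98;
    }

    private double basePrice() {
        return quantity * itemPrice;
    }
}
```

Because the fields here are final, calling basePrice() twice always yields the same value. With mutable state shared between threads, the two calls could observe different values, whereas the temp captured a single snapshot; that difference is worth raising in the discussion.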
 
Long Parameter List (78) Session 7
  1. The Law of Demeter recommends delegation over navigation. What are the tradeoffs between these two design approaches?
  2. Object-oriented interfaces typically have object parameters rather than lists of primitive values. What are the benefits of passing objects rather than the values to which they provide access? Are there any tradeoffs to this design approach?
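
To ground question 2, here is a minimal sketch of Introduce Parameter Object (295): a start and end date that travel together are replaced by one object, which can then absorb behavior. DateRange and Ledger are hypothetical names chosen for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parameter object replacing a (start, end) pair of arguments.
final class DateRange {
    private final LocalDate start;
    private final LocalDate end;

    DateRange(LocalDate start, LocalDate end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end precedes start");
        }
        this.start = start;
        this.end = end;
    }

    // Behavior that used to be repeated at call sites can move here.
    boolean includes(LocalDate date) {
        return !date.isBefore(start) && !date.isAfter(end);
    }
}

class Ledger {
    private final List<LocalDate> entryDates = new ArrayList<>();

    void record(LocalDate date) {
        entryDates.add(date);
    }

    // Before: long countBetween(LocalDate start, LocalDate end)
    // After: the range arrives as one validated object.
    long countIn(DateRange range) {
        return entryDates.stream().filter(range::includes).count();
    }
}
```

One tradeoff to discuss: callers that hold only two loose dates must now construct a DateRange first, but the validation and the inclusion test exist in exactly one place.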
 
Large Class (78) and Alternative Classes (85) Session 8
  1. How does one decide whether to extract a class as a superclass or a subclass? What guidelines are there for class placement within a hierarchy?
  2. What are responsibilities? What benefits does a responsibility-driven design approach have over a data-driven design approach?
  3. What are roles? How are roles related to contracts? How are these concepts used in object-oriented design?
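
As a small illustration of question 2, the sketch below applies Extract Class (149) to move one responsibility out of a class that held two. The Person and TelephoneNumber classes here are simplified assumptions, not a reproduction of the book's code.

```java
// Extracted class: owns the telephone-number responsibility.
class TelephoneNumber {
    private final String areaCode;
    private final String number;

    TelephoneNumber(String areaCode, String number) {
        this.areaCode = areaCode;
        this.number = number;
    }

    String formatted() {
        return "(" + areaCode + ") " + number;
    }
}

// Person keeps its own responsibility (identity) and delegates the rest.
class Person {
    private final String name;
    private final TelephoneNumber officePhone;

    Person(String name, TelephoneNumber officePhone) {
        this.name = name;
        this.officePhone = officePhone;
    }

    String getName() {
        return name;
    }

    String getOfficePhone() {
        return officePhone.formatted();
    }
}
```

Before the extraction, Person would have carried the area code, the number, and the formatting rule itself. Afterward each class can be described by a single responsibility.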
 
Inappropriate Intimacy (85) Session 9
  1. The Law of Demeter recommends delegation over navigation. How does this Law reduce or eliminate inappropriate intimacy?
  2. How can the Association as Class concept be used to decompose bidirectional associations?
  3. Some garbage collectors have difficulty detecting cyclical garbage. Should bidirectional associations be severed at the end of the useful lifetime of the participating objects? How will this help a garbage collector?
  4. Inheritance is often overused. When and why should inheritance be used? When and why should it be avoided?
  5. Inheritance is the tightest kind of coupling. Collaboration provides close coupling. Mediation provides loose coupling. What are the tradeoffs of these forms of coupling? What is the relationship between coupling and dependency? (hint: implementations depend on interfaces)
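
For question 1, the contrast between navigation and delegation can be sketched with Hide Delegate (157). The Employee and Department classes below are hypothetical and kept deliberately small.

```java
class Department {
    private final String manager;

    Department(String manager) {
        this.manager = manager;
    }

    String getManager() {
        return manager;
    }
}

class Employee {
    private final Department department;

    Employee(Department department) {
        this.department = department;
    }

    // Navigation style would expose a getDepartment() accessor and let clients
    // call employee.getDepartment().getManager(), coupling them to Department.
    // Delegation style: clients ask the Employee and never see Department.
    String getManager() {
        return department.getManager();
    }
}
```

With the delegating method in place, a change to how departments track managers affects Employee only. The cost is one forwarding method per hidden feature, which is the tradeoff the question asks about.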
 
Switch Statements (82) Sessions 10 + 11
  1. Why is using polymorphism preferable to using switch statements? (hint: responsibilities)
  2. Are there other techniques for dealing with behavioral variations? Are these techniques directly supported by mechanisms in programming languages or class libraries?
  3. What are the similarities and differences between the State and Strategy design patterns?
  4. What are the relationships between the Null Object pattern and the Exceptional Value and Meaningless Behavior patterns in the Checks pattern language?
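
As background for question 4, the sketch below shows the basic shape of Introduce Null Object (260): a stand-in that supplies default behavior so callers need no null checks. The Customer, NullCustomer, and Site types and the "occupant" default are illustrative assumptions.

```java
interface Customer {
    String name();

    boolean isNull();
}

class RealCustomer implements Customer {
    private final String name;

    RealCustomer(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public boolean isNull() {
        return false;
    }
}

// Stand-in that answers every request with a harmless default.
class NullCustomer implements Customer {
    @Override
    public String name() {
        return "occupant";
    }

    @Override
    public boolean isNull() {
        return true;
    }
}

class Site {
    private final Customer customer;

    Site(Customer customer) {
        // The one remaining null check; callers never repeat it.
        this.customer = (customer == null) ? new NullCustomer() : customer;
    }

    Customer getCustomer() {
        return customer;
    }
}
```

Callers can write site.getCustomer().name() unconditionally. The variation between "has a customer" and "has none" is handled by polymorphism rather than by a conditional at each call site.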
 
Primitive Obsession (81) Sessions 12 + 13
  1. When should primitive values be wrapped with classes? (hint: model distinct data types with classes or interfaces+classes)
  2. Heterogeneous collections are candidates for new classes, but how can one detect a heterogeneous collection of values? What are the tell-tale signs? (hint: if the collected values have different types and / or different purposes)
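
To illustrate the tell-tale signs in question 2, the sketch below applies Replace Array with Object (186) to an array whose elements mean different things. The Performance class, its fromRow helper, and the sample data are assumptions made for illustration.

```java
// Before: String[] row = {"Liverpool", "15"};
// row[0] is a name and row[1] is a count: different types and purposes.
class Performance {
    private final String name;
    private final int wins;

    Performance(String name, int wins) {
        this.name = name;
        this.wins = wins;
    }

    // Hypothetical bridge from the old array form, useful while migrating callers.
    static Performance fromRow(String[] row) {
        return new Performance(row[0], Integer.parseInt(row[1]));
    }

    String getName() {
        return name;
    }

    int getWins() {
        return wins;
    }
}
```

Once the positions have names and types, the compiler can catch mistakes that the array form deferred to runtime, such as treating the count as text.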
 

Supplemental Reading

The following books are recommended as supplemental reading for Refactoring.

E. Gamma, R. Helm, R. Johnson, J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley Publishing Co., Inc., 1995. ISBN 0-201-63361-2. Provides an excellent introduction to design patterns and a catalog of solutions to common design problems.

J. Kerievsky. Refactoring to Patterns. Industrial Logic, Inc., 2001. Supplements the works on design patterns and refactoring by connecting these two activities (design and design improvement).

K.J. Lieberherr. Adaptive Object-Oriented Software: The Demeter Method with Propagation Patterns. PWS Publishing Co., 1996. ISBN 0-534-94602-X. Adaptive programming specifies the connections between objects as loosely as possible and defers the binding of algorithms to data structures until as late as possible. The Demeter system provides tools that support adaptive programming.

A.J. Riel. Object-Oriented Design Heuristics. Addison-Wesley Publishing Co., Inc., 1996. ISBN 0-201-63385-X.
Riel provides a fairly comprehensive set of 61 heuristics for object-oriented design. Unfortunately, the examples are all written in C++. Still, many of the heuristics are worth considering and easily transferred to Java development.

S.A. Whitmire. Object-Oriented Design Measurement. John Wiley & Sons, Inc., 1997. ISBN 0-471-13417-1.
While the title of this book indicates a focus on object-oriented designs, it also provides metrics applicable to software designs in general. It provides an excellent introduction to the elements of measurement theory, enough to generate a wide range of measurements tailored to specific needs. If you have a metrics-oriented manager or simply want more objective ways of evaluating your designs, this book is highly recommended.

M.H. Halstead. Elements of Software Science. Elsevier North-Holland, Inc., 1977. ISBN 0-444-00205-7.
Halstead pioneered the field of software metrics. This treatise still serves as a useful guide for thinking about and working with fundamental software elements such as operands, operators, program length, program volume, vocabulary size, language level, effort, difficulty, and predictive error rates.

The IEEE published several papers regarding design metrics.

S.R. Chidamber, C.F. Kemerer. A Metrics Suite for Object Oriented Design. IEEE Transactions on Software Engineering v20 #6, June 1994.
N. Fenton. Software Measurement: A Necessary Scientific Basis. IEEE Transactions on Software Engineering v20 #3, March 1994.
L.A. Laranjeira. Software Size Estimation for Object-Oriented Systems. IEEE Transactions on Software Engineering v16 #5, May 1990.
E.J. Weyuker. Evaluating Software Complexity Measures. IEEE Transactions on Software Engineering v14 #9, September 1988.
L.H. Putnam. A General Empirical Solution to the Macro Software Sizing and Estimating Problem. IEEE Transactions on Software Engineering vSE-4 #4, July 1978.