Automation for the people: Continual refactoring
来源:互联网 发布:js div内容 实体 编辑:程序博客网 时间:2024/04/30 11:27
Using static analysis tools to identify code smells
Over the years, I've seen lots of source code from many projects,ranging from elegant designs to code that appeared to be bundledtogether with duct tape. I've written new code and maintained otherdevelopers' source code. I'd rather write new code, but I enjoy takingsome existing code and reducing complexity in a method or extractingduplicate code into a common class. Earlier in my career, many believedthat if you weren't writing new code, you weren't being productive.Fortunately, in the late 1990s, Martin Fowler's book Refactoring (see Resources) helped make the practice of improving existing code without changing external behavior — well — cool.
Much of what I cover in this series is about efficiency:how to reduce the redundancy of time-consuming processes and performthem more quickly. I'll do the same for the tasks I cover in thisarticle and discuss how to perform them more effectively.
Atypical approach to refactoring is to make small changes to existingcode while you're introducing new code or changing a method. Thechallenges of this technique are that it's often inconsistently appliedamong developers on a team, and that it can be easy to missopportunities for refactoring. This is why I advocate using staticanalysis tools to identify coding violations. With them, you gainvisibility into the code base as a whole, and at the class or methodlevel. Fortunately, in Java™ programming, you can choose from anabundance of freely available open source static analysis tools:CheckStyle, PMD, FindBugs, JavaNCSS, JDepend, and many others.
In this article, you'll learn how to:
- Reduce conditional complexity code smells by measuring cyclomatic complexity using CheckStyle and providing refactorings such as Replace Conditional with Polymorphism
- Remove duplicated code code smells by assessing code duplication using CheckStyle and providing refactorings such as Pull Up Method
- Thin large class code smells by counting source lines of code using PMD (or JavaNCSS) and providing refactorings such as Extract Method
- Wipe out too many imports code smells by determining a class's efferent coupling using CheckStyle (or JDepend) and providing refactorings such as Move Method
I'll use the following general format as we examine each code smell:
- Describe the smell that can indicate a problem in the code
- Define the measure that can find that smell
- Show the tool that measures the code smell
- Present the refactoring and, in some cases, the patterns to fix the code smell
Inessence, this approach provides a framework for finding and fixing codeproblems throughout the code base. And you'll be better informed aboutthe riskier parts of the code base prior to making changes. Better yet,I'll show you how you integrate this approach into automated builds.
Conditional complexity
Smell: Conditional complexity
Measure: Cyclomatic complexity
Tool: CheckStyle, JavaNCSS, and PMD
Refactorings: Replace Conditional with Polymorphism, Extract Method
Smell
Conditional complexitycan manifest in several different ways in source code. One example ofthis code smell is to have multiple conditional statements such as if
, while
, or for
statements. Another type of conditional complexity can be in the form of switch
statements, as demonstrated in Listing 1:
Listing 1. Using a switch
statement to perform conditional behavior
...
switch (beerType) {
case LAGER:
System.out.println("Ingredients are...");
...
break;
case BROWNALE:
System.out.println("Ingredients are...");
...
break;
case PORTER
System.out.println("Ingredients are...");
...
break;
case STOUT:
System.out.println("Ingredients are...");
...
break;
case PALELAGER:
System.out.println("Ingredients are...");
...
break;
...
default:
System.out.println("INVALID.");
...
break;
}
...
switch
statementsthemselves are not a bad thing. But when each statement includes toomany choices and too much code, it could be an indication of code thatneeds to refactored.
Measure
To determine the conditional-complexity code smell, you determine a method's cyclomatic complexity.Cyclomatic complexity is a measure that dates back to 1975 as definedby Thomas McCabe. The Cyclomatic Complexity Number (CCN) measures thenumber of unique paths in a method. Each method starts with a CCN of 1regardless of how many paths are in the method. Conditional constructssuch as if
, switch
, while
, and for
statements are each given a value of 1, along with exception paths. Theoverall CCN for a method is an indication of its complexity. Manyconsider a CCN of 10 or more to indicate an overly complex method.
Tool
CheckStyle, JavaNCSS, and PMD are open source tools that measurecyclomatic complexity. Listing 2 shows a snippet of a CheckStyle rulesfile defined in XML. The CyclomaticComplexity
module defines the maximum CCN for a method.
Listing 2. Configuring CheckStyle to find methods with a cyclomatic complexity of 10 or greater
<module name="CyclomaticComplexity">
<property name="max" value="10"/>
</module>
Using the CheckStyle rules filefrom Listing 2, the Gant example in Listing 3 demonstrates how to runCheckStyle as part of an automated build. Gant is an automated buildtool that provides an expressive programming language with support forbuild dependencies. Developers write Gant scripts using the power ofthe Groovy programming language. Because Gant provides full access toAnt's API, anything you can run in Ant can be run from a Gant script.(See the "Build software with Gant" tutorial to learn about Gant.)
Listing 3. Using a Gant script to execute CheckStyle checks
target(findHighCcn:"Finds method with a high cyclomatic complexity number"){
Ant.mkdir(dir:"target/reports")
Ant.taskdef(name:"checkstyle",
classname:"com.puppycrawl.tools.checkstyle.CheckStyleTask",
classpathref:"build.classpath")
Ant.checkstyle(shortFilenames:"true", config:"config/checkstyle/cs_checks.xml",
failOnViolation:"false", failureProperty:"checks.failed", classpathref:"libdir") {
formatter(type:"xml", tofile:"target/reports/checkstyle_report.xml")
formatter(type:"html", tofile:"target/reports/checkstyle_report.html")
fileset(dir:"src"){
include(name:"**/*.java")
}
}
}
The Gant script in Listing 3creates the CheckStyle report shown in Figure 1. The bottom of thefigure indicates the CheckStyle cyclomatic-complexity violation for amethod.
Figure 1. CheckStyle report indicating a method failure based on a high CCN
Refactoring
Figure 2 is a UML representation of the Replace Conditional with Polymorphism refactoring:
Figure 2. Replacing conditional statements with polymorphism
See the full figure here.
In Figure 2, I:
- Create a Java interface called
BeerType
- Define a generic
showIngredients()
method - Create implementing classes for each of the
BeerType
s
Forbrevity, I provided one method implementation per class. Ostensibly,you would have more than one method when creating an interface.Refactorings such as Replace Conditional with Polymorphism and ExtractMethod (which I'll explain later in the article) help make code mucheasier to maintain.
Back to top
Duplicated code
Smell: Duplicated code
Measure: Code duplication
Tool: CheckStyle, PMD
Refactorings: Extract Method, Pull Up Method, Form Template Method, Substitute Algorithm
Smell
Duplicatecode can creep into a code base unbeknownst to anyone. Sometimes, it'seasier to copy and paste some code than to generalize the behavior intoanother class. One problem with copy-and-paste is that it forces anawareness of multiple copies of the code and the need to maintain them.A more insidious problem occurs when slight variations to the copiedcode cause inconsistent behavior, depending on which method isexecuting the behavior. Listing 4 is an example of code that closes adatabase connection, with the same code present in two methods:
Listing 4. Duplicate code
public Collection findAllStates(String sql) {
...
try {
if (resultSet != null) {
resultSet.close();
}
if (stmt != null) {
stmt.close();
}
if (conn != null) {
conn.close();
}
catch (SQLException se) {
throw new RuntimeException(se);
}
}
...
}
...
public int create(String sql, Beer beer) {
...
try {
if (resultSet != null) {
resultSet.close();
}
if (stmt != null) {
stmt.close();
}
if (conn != null) {
conn.close();
}
catch (SQLException se) {
throw new RuntimeException(se);
}
}
...
}
Measure
Themeasure for finding duplicated code is to search for code duplicationwithin classes and among other classes in the code base. Duplicationamong classes is more difficult to assess without the help of a tool.Because of the slight changes that copied code can often undergo, it'simportant to measure code that is not simply copied verbatim, but alsocode that is similar.
Tool
PMD'sCopy/Paste Detector (CPD) and CheckStyle are two of the open sourcetools available for finding similar code throughout a Java code base.The CheckStyle configuration file example in Listing 5 demonstrates theuse of the StrictDuplicateCode
module:
Listing 5. Using CheckStyle to find at least 10 lines of duplicate code
<module name="StrictDuplicateCode">
<property name="min" value="10"/>
</module>
The min
property inListing 5 sets the minimum number of duplicate lines that CheckStylewill flag for review. In this case, it will only indicate blocks ofcode that have at least 10 lines that are similar or have beenduplicated.
Figure 3 shows the results of the module settings from Listing 5 after the automated build runs:
Figure 3. CheckStyle report indicating excessive code duplication
Refactoring
In Listing 6, I take the duplicated code in Listing 4 and reduce the duplication by using the Pull Up Method refactoring — extracting behavior from a larger method into an abstract class method:
Listing 6. Pull Up Method
...
} finally {
closeDbConnection(rs, stmt, conn);
}
...
Duplicated code will happen. I would never recommend to a team to set the unrealistic goal of having noduplication whatsoever. However, an appropriate goal is to ensure thatduplicate code within the code base does not increase. By using astatic analysis tool such as PMD's CPD or CheckStyle, you can determineareas of high code duplication on a continual basis as part of runningan automated build.
Back to top
Long method (and large class)
Smell: Long method (and large class)
Measure: Source lines of code (SLOC)
Tool: PMD, JavaNCSS, CheckStyle
Refactorings: Extract Method, Replace Temp with Query, Introduce Parameter Object, Preserve Whole Object, Replace Method with Method Object
Smell
Ageneral rule of thumb I try to adhere to is to keep my methods to 20lines of code or fewer. Of course, there may be exceptions to thisrule, but if I've got methods over 20 lines, I want to know about it.Often, there's a correlation between long methods and conditionalcomplexity. And there's a connection between large classes and longmethods. I'd love to show the example of the 2,200-line method I foundon a project I had to maintain. I printed an entire class containing25,000 lines of code and asked my colleagues to find the bugs. Let'sjust say I made my point as I rolled the printed code down the hallway.
The highlighted section in Listing 7 shows a small portion of an example of a long method code smell:
Listing 7. Long method code smell
public void saveLedgerInformation() {
...
try {
if (ledger.getId() != null && filename == null) {
getLedgerService().saveLedger(ledger);
} else {
accessFiles().decompressFiles(files, filenames);
}
if (!files.get(0).equals(upload)) {
upload = files.get(0);
filename = filenames.get(0);
}
if (invalidFiles.isUnsupported(filename)) {
setError(fileName, message.getMessage());
} else {
LedgerFile entryFile = accessFiles().add(upload, filename);
if (fileType != null && FileType.valueOf(fileType) != null) {
entryFile.setFileType(FileType.valueOf(fileType));
}
getFileManagementService().saveLedger(ledger, entryFile);
if (!FileStatus.OPENED.equals(entryFile.getFileStatus())) {
getFileManagementService().importLedgerDetails(ledger);
}
if (uncompressedFiles.size() > 1) {
Helper.saveMessage(getText("ledger.file"));
}
if (user.getLastName() != null) {
SearchInfo searchInfo = ServiceLocator.getSearchInfo();
searchInfo.setLedgerInfo(null);
isValid = false;
setDefaultValues();
resetSearchInfo();
if (searchInfoValid && ledger != null) {
isValid = true;
}
}
} catch (InvalidDataFileException e) {
ResultType result = e.getResultType();
for (ValidationMessage message : result.getMessages()) {
setError(fileName, message.getMessage());
}
ledger.setEntryFile(null);
} ...
Measure
TheSLOC measure has been misused in years past as an indication ofproductivity. We all know, though, that it's not necessarily true thatthe more lines you write, the better. However, SLOC can be a usefulmeasure when it comes to complexity. The more lines in a method (orclass), the more likely it will be difficult to maintain the code inthe future.
Tool
The script in Listing 8 finds SLOC measures for long methods (and, optionally, large classes):
Listing 8. Gant script to identify large classes and methods
target(findLongMethods:"runs static code analysis"){
Ant.mkdir(dir:"target/reports")
Ant.taskdef(name:"pmd", classname:"net.sourceforge.pmd.ant.PMDTask",
classpathref:"build.classpath")
Ant.pmd(shortFilenames:"true"){
codeSizeRules.each{ rfile ->
ruleset(rfile)
}
formatter(type:"xml", tofile:"target/reports/pmd_report.xml")
formatter(type:"html", tofile:"target/reports/pmd_report.html")
fileset(dir:"src"){
include(name:"**/*.java")
}
}
}
Once again, I'm using Gant toaccess the Ant API to run Ant tasks. In Listing 8, I'm calling the PMDstatic analysis tool to search for long methods in the code base. PMD(along with JavaNCSS and CheckStyle) can also be used to find longmethods, large classes, and many other code smells.
Refactoring
Listing 9 shows an example of the Extract Method refactoring to reduce the long method smell in Listing 7. After extracting behavior from the method in Listing 7 to the code in Listing 9, I can call the newly created isUserValid()
method from the saveLedgerInformation()
method in Listing 7:
Listing 9. Extract Method refactoring
private boolean isUserValid(User user) {
boolean isValid = false;
if (user.getLastName() != null) {
SearchInfo searchInfo = ServiceLocator.getSearchInfo();
searchInfo.setLedgerInfo(null);
setDefaultValues();
resetSearchInfo();
if (searchInfoValid && ledger != null) {
isValid = true;
}
}
return isValid;
}
Often, long methods and largeclasses are indications of other code smells, such as conditionalcomplexity and duplicated code. By finding these methods and classes,you can fix other problems.
Back to top
Too many imports
Smell: Too many imports
Measure: Efferent coupling (fan-out per class)
Tool: CheckStyle
Refactorings: Move Method, Extract Class
Smell
Too many importsis an indication that a class relies on too many other classes. You'llnotice you've got this code smell when a change to one classnecessitates making changes to many other classes, because they are sotightly coupled to the class you are changing. The multiple imports inListing 10 are an example:
Listing 10. Multiple imports in a class
import com.integratebutton.search.SiteQuery;
import com.integratebutton.search.OptionsQuery;
import com.integratebutton.search.UserQuery;
import com.integratebutton.search.VisitsQuery;
import com.integratebutton.search.SiteQuery;
import com.integratebutton.search.DateQuery;
import com.integratebutton.search.EvaluationQuery;
import com.integratebutton.search.RangeQuery
import com.integratebutton.search.BuildingQuery;
import com.integratebutton.search.IPQuery;
import com.integratebutton.search.SiteDTO;
import com.integratebutton.search.UrlParams;
import com.integratebutton.search.SiteUtil;
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Collection;
import java.util.Collections;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.log4j.Logger;
...
Measure
A way to find a class with too much responsibility is through the efferent coupling measure, also referred to as fan out complexity. Fan out complexity assigns a 1 to every class the analyzed class is dependent on.
Tool
Listing 11 shows an example of using CheckStyle to set the maximum number of fan out complexity:
Listing 11. Setting a maximum fan out complexity with CheckStyle
<module name="ClassFanOutComplexity">
<property name="max" value="10"/>
</module>
Refactoring
There are several ways to fix the tight coupling that can occur as a result of too many imports. For code such as Listing 10, these refactorings include the Move Method refactoring: I move methods from the individual *Query
classes into a Java interface and define common methods that all Query
classes must implement. Then, I use the factory method pattern so that the coupling is with the interface.
Byusing a Gant automated build script to execute the CheckStyle Ant task,I can search my code base for classes that rely on too many otherclasses. When I modify code in these classes, I can implement certainrefactorings such as Move Method along with certain design patterns toimprove the ease of maintenance incrementally.
Back to top
Refactor ... early and often
ContinuousIntegration (CI) is the practice of integrating changes often. As it'stypically practiced, an automated CI server, run from a separatemachine, triggers an automated build with every change applied to aproject's version-control repository. To ensure that the build scriptsfrom Listing 3 and Listing 8 are run consistently and with every change to a code base, you'll want to configure a CI server such as Hudson (see Resources). Hudson is distributed as a WAR file that you can drop into any Java Web container.
Because the examples in Listing 3 and Listing 8 use Gant, I'll walk through the steps for configuring the Hudson CI server to run Gant scripts:
- From the Hudson dashboard, install the Gant plug-in for Hudson by selecting Manage Hudson, followed by Manage Plugins, and then select the Available tab. From this tab, select the Gant plug-in check box and click the Install button.
- Restart the Web container (for example, Tomcat).
- Select Manage Hudson, then Configure System. From the Gant installationsection, enter a unique name and the location of the Groovyinstallation on the machine where Hudson is running. Save the changes.
- Go back to the Dashboard (by selecting the Hudson link) and select an existing Hudson Job, select Configure, click the Add build step button, and select the Invoke Gant script option.
WithHudson configured to run the automated build scripts written in Gant,you can get rapid feedback on measures that are related to code smellssuch as long method and conditional complexity as soon as they areintroduced into the code base.
Back to top
Other smells and refactorings
Notall smells have correlating measures. However, static analysis toolscan uncover other smells than the ones I've demonstrated. Table 1 listsexamples of other code smells, tools, and possible refactorings:
Table 1. Other smells and refactorings
This article provides a pattern of correlating a code smell to ameasure that can be configured to be flagged from an automated staticanalysis tool. You can apply refactorings with or without the use ofcertain design patterns. This gives you a framework for consistentlyfinding and fixing code smells in a repeatable manner. I'm confidentthat the examples in this article will help you use static analysistools to find other types of code smells than the ones I'vedemonstrated.
Resources
Learn
- Automation for the people (Paul Duvall, developerWorks): Read the complete series. The "Continuous Inspection" (August 2006) and "Improving code with Eclipse plugins" (January 2007) installments are especially relevant to this article's topic.
- Smells to Refactorings: A table of recommended refactorings for specific code smells.
- Refactoring: Improving the Design of Existing Code (Martin Fowler, Addison-Wesley Professional, 1999): The seminal book on improving the design of an existing code base.
- Alpha List of Refactorings: Martin Fowler's list of refactorings.
- Refactoring to Patterns(Joshua Kereviesky, Addison-Wesley Professional, 2004): Applying designpatterns to refactorings to improve code. I got the idea of printingout a large class from one of Josh's talks at Agile 2005 in Denver.
- "In pursuit of code quality: Monitoring cyclomatic complexity" (Andrew Glover, IBM developerWorks, March 2006): What to do when code complexity is off the charts.
- "In pursuit of code quality: Refactoring with code metrics" (Andrew Glover, IBM developerWorks, May 2006): On-target refactoring with code metrics and the Extract Method pattern.
- "Continuous Integration: Improving Software Quality and Reducing Risk" (Paul Duvall et al., Addison-Wesley Signature Series, 2007): Chapter 7, Continuous Inspection, covers many of the tools covered in this article.
- "Build software with Gant" (Andrew Glover, developerWorks, May 2008): Read this tutorial to learn how to use Gant for flexible builds.
- "(Ant to Gant) automagically" (Andrew Glover, The Disco Blog, April 2008): Generate Gant scripts based on an existing Ant script.
- "Gant with Hudson in5 steps" (Andrew Glover, The Disco Blog, May 2008): Configuring the Hudson Continuous Integration server to run a Gant build script.
- "In pursuit of code quality: Code quality for software architects" (Andrew Glover, IBM developerWorks, April 2006): Use coupling metrics to support your system architecture.
- Browse the technology bookstore for books on these and other technical topics.
- developerWorks Java technology zone: Hundreds of articles about every aspect of Java programming.
Get products and technologies
- Gant: Download Gant and start building software in a predictable and repeatable manner.
- CheckStyle: Download CheckStyle to gather metrics and better assess certain code smells.
- PMD: Download PMD to gather metrics and better assess certain code smells.
- Hudson: A freely available open source server for Continuous Integration.
- Automation for the people: Continual refactoring
- Automation for the people: Continuous Inspection
- Automation for the people: Continuous feedback
- Automation for the people: Continuous testing
- Automation for the people: Asserting architectural soundness
- Automation for the people: Pushbutton documentation
- Automation for the people: Speed deployment with automation
- Automation for the people: Deployment-automation patterns, Part 1
- Automation for the people: Deployment-automation patterns, Part 2
- Automation for the people: Parallel development for mere mortals
- Automation for the people: Choosing a Continuous Integration server
- Automation for the people: Improving code with Eclipse plugins
- Automation for the people: Build Java projects with Raven
- Automation for the people: Continuous Integration anti-patterns Part 1
- Automation for the people: Continuous Integration anti-patterns, Part 2
- Automation for the people: Hands-off load testing
- Automation for the people: Manage dependencies with Ivy
- Automation for the people: Hands-free database migration
- java与c#中面向对象的不同实现
- UML基础知识
- Automation for the people: Pushbutton documentation
- delphi 快捷键
- DECLARE_DYNCREATE/IMPLEMENT_DYNCREATE等宏
- Automation for the people: Continual refactoring
- Automation for the people: Hands-free database migration
- 解决linux下poppler不支持中文PDF的问题。
- 实现表格隔行变色
- Automation for the people: Parallel development for mere mortals
- Linux下配置NTP 架设本地时间服务器
- reactos操作系统实现(91)
- 风雨哈佛路
- 从语言升级为平台:JAVA老矣,尚能饭否?