Static analysis helps manage risk in Java

March 01, 2013

When it comes to software development, the old adage is best spun in a slightly different way: better "early" than never. Accordingly, static analysis...

 

Today’s software development teams are under immense pressure; the market demands high-quality, secure releases at a constantly increasing pace while security threats become more and more sophisticated. Considering the high cost of product failures and security breaches, it is more important than ever to address these risks throughout the software development process. Potential problems need to be spotted early to prevent release delays or, worse, post-release failures.

Fortunately, there are numerous tools to help developers manage these risks, helping to identify potential problems early in the development phase when issues are less disruptive and easier to fix. They are readily accessible to developers and easy to use within many development environments. This applies to developers programming in any language; however, we focus on Java in this discussion (see Sidebar 1).

 

Sidebar 1: Though Java’s mature ecosystem, numerous IDEs, and abundance of reference materials ease Java application development, they can also bestow a false sense of security upon developers, who should be watchful to mitigate Java’s weaknesses.


21

 

 

Static analysis helps mitigate risk

When considering static analysis tools for Java or otherwise, it is important to understand what these tools are. The term “static analysis” refers to the approach of analyzing a program without executing it. As we’ll see in the next section, static analysis tools can be used to produce reports on anything from coding standard violations to specific errors or vulnerabilities. Simply put, static analysis tools analyze source code to find information useful for managing risk.

One benefit of static analysis is that it can be performed early in the development cycle, often before the application will even execute. It is commonly integrated into an automated build, so that there is virtually no overhead to running frequent analyses. By integrating static analysis into the inner development loop, users maximize the value they get from such tools.

When used in conjunction with a well-designed development process, static analysis tools provide crucial visibility into the state of the software. This enables development teams to understand the level of risk in their code and where the risk resides so they can take action to mitigate or remove it entirely (Table 1). Individual tools generally focus on specific problems faced by software development teams, and teams often use a combination of these tools to get a comprehensive view of their development effort.

 

Table 1: Static analysis tools typically find specific types of issues, with each type representing a different class of risk and requiring a different type of action.


21

 

 

Developers have traditionally used static analysis tools via a simple IDE integration or as stand-alone tools. While the tools add significant value to the development effort, the proliferation of tools has created efficiency problems as developers spend more and more time using and maintaining different tools and sifting through more and more results. To wisely manage development resources, teams must be able to effectively manage, filter, and prioritize all those issues.

To address these problems, development testing platforms have emerged to unify and manage all of this static analysis information in one place, simplifying the user experience and increasing visibility and efficiency at larger scales while providing relevant access controls and reporting. Development testing platforms are even starting to blur the line between static analysis and other types of analysis by utilizing – during the static analysis process – artifacts generated during earlier program runs. For example, these platforms can use code coverage information from test runs during static analysis to effectively identify missing test cases automatically. The traditional approach to this problem requires significant manual effort based on simple coverage thresholds. By leveraging data from different sources, these platforms are able to significantly reduce the manual effort and time required to accomplish this with other methods.

Selecting static analysis tools for Java

The most popular, free, static analysis tools for Java are probably Checkstyle, PMD, and FindBugs. While they all fall under the “static analysis” umbrella, their strengths are so sufficiently different that many consider the tools to be complementary rather than alternatives.

Checkstyle

Checkstyle is billed as “a development tool to help programmers write Java code that adheres to a coding standard[1],” although it does not strictly limit itself to coding standard enforcement. It provides a documented API for users to define their own custom checks. Typical coding standards utilize basic rules to make code more readable and reduce the likelihood that future code changes will introduce bugs. Standards tend to define conventions about formatting (white space, bracketing, naming, commenting, and so on), inheritance, and visibility. When adequately enforced, well-designed coding standards can help developers reduce risk. Enforcement can be difficult, though, since coding standards generate a lot of violations and there can be significant pressure to ignore noisy rules. With legacy code, this can make enforcing new coding standards unfeasible. While most of the issues identified by Checkstyle do not affect code correctness, robustness, or performance, there is real value in helping developers quickly understand code written by others. It is not always obvious how to quantify the risk represented by these violations and it is problematic to measure risk directly from violation counts, but changes in those counts can be a reasonable proxy for changes in risk.

PMD

PMD is described as “…a source code analyzer. It finds unused variables, empty catch blocks, unnecessary object creation, and so forth[2].” It, too, is evolving and the current checks focus mainly on syntactic oddities that might belie developer mistakes, such as overcomplicated expressions, empty blocks, unused variables, parameters, and class members. It also has a popular module to identify duplicated code. Because it is generally reporting “suspicious code” as opposed to specific coding errors or standards violations, the user will need to carefully select the checks enabled for everyday use. Because enforced rules are selected by the user, this tool can be useful for both legacy and greenfield projects, and it is often easy to correlate these counts with risk. Unfortunately, it might not be obvious whether reported issues should be considered defects or maintenance concerns.

FindBugs

FindBugs is probably the most popular of these tools. It looks for actual bugs in the code, as well as suspicious code and standards violations. Because of the wide range of reported issues, it is important to use a configuration that includes the most relevant checks for the project. This is especially true for legacy projects, as it’s easier to keep new projects clean from the beginning. Like PMD, any team can benefit from using FindBugs and associating issue counts to risk can be straightforward.

Commercial static analysis tools show similar diversity, identifying everything from standards violations to actual defects and security vulnerabilities. To illustrate how a commercial tool might compare to a free tool, I analyzed version 1.496 of the Jenkins job management system (www.jenkins-ci.org) using a proprietary static analysis solution and version 2.0.1 of FindBugs, with all checks enabled. On this code base, 852 unique issues were identified – with only 28 issues identified by both products. The proprietary solution found 197 unique issues, with 188 of those coming from high-impact categories (security and concurrency bugs, resource leaks, and unhandled exceptions like null dereferences). FindBugs found 627 unique issues, with 29 coming from those high-impact categories. In short, each of the tools found significant high-impact issues missed by the others, so using a proprietary solution or FindBugs alone will leave significant risk undetected.

Development testing – Tying it all together

Static analysis tools are a powerful ally in the software development effort for Java developers, as these tools enable developers to gain insight into risk throughout the software development life cycle. They are typically easy to automate, enabling users to spend their time fixing problems rather than running the tools.

When it comes to managing risk, more information is generally better – as long as that information illuminates actual sources of risk that developers care about. When deciding which tools to adopt, remember to consider not just the types of issues that analysis tools identify, but how those tools can work together to provide additional value. Also, be sure to configure them appropriately so that the number of issues doesn’t overwhelm your users.

Modern development testing platforms take testing tools to another level by unifying the data in one place, simplifying the user experience, and creating opportunities to provide even more value.

References

[1] http://checkstyle.sourceforge.net/

[2] http://pmd.sourceforge.net/

Jon Jarboe is Senior Technical Manager for Coverity, where he helps developers understand the value of adopting development testing and managing risk throughout the software development life cycle.

Coverity [email protected] www.coverity.com

Follow: Twitter Blog Facebook Google+ LinkedIn YouTube