Java For Small Teams
Java For Small Teams
Java For Small Teams
of Contents
Introduction 1.1
Process 1.2
Build Fast Feedback Loops 1.2.1
Instant Feedback 1.2.2
Fast Feedback 1.2.3
Slower Feedback 1.2.4
Agree The Language You Use For Tests 1.2.5
Use Coverage As A Tool Not A Target 1.2.6
Style 1.3
Consider Code Generators Carefully 1.3.1
Optimise For Readability 1.3.2
Prefer Readable Code To Comments 1.3.3
Javadoc Judiciously 1.3.4
Remember Kiss And Yagni 1.3.5
Prefer Composition 1.3.6
Keep It Solid 1.3.7
Keep Your Code Dry 1.3.8
Prefer Reversible Decisions 1.3.9
Make Dependencies Explicit 1.3.10
Prefer Immutable Objects 1.3.11
Use A Consistent Code Layout 1.3.12
Group Methods For Easy Comprehension 1.3.13
Keep Methods Small And Simple 1.3.14
Methods Should Do One Thing 1.3.15
Avoid Null 1.3.16
Use Final Liberally 1.3.17
Provide No More Than One Worker Constructor 1.3.18
Avoid Checked Exceptions 1.3.19
Specifics 1.4
Know How To Implement Hashcode And Equals 1.4.1
2
Do Not Reassign Parameters 1.4.2
Limit Scope 1.4.3
Prefer For Each Loops To For Loops 1.4.4
Prefer Maps And Filters To Imperative Loops 1.4.5
Avoid Apis From Prehistory 1.4.6
Beware Casts And Generics Warnings 1.4.7
Do Not Use Magic Numbers 1.4.8
Do Not Use The Assert Keyword 1.4.9
Avoid Floats And Doubles 1.4.10
Do Not Use Reflection 1.4.11
Tests 1.5
Write Specifications Not Tests 1.5.1
Think Units Not Methods 1.5.2
Name Tests With A Specification Style 1.5.3
Pick Examples Carefully 1.5.4
Make Tests Easy To Understand 1.5.5
Understand How To Use Mocks And Stubs 1.5.6
Understand Your Options For Code Reuse 1.5.7
Write Repeatable Tests 1.5.8
Only Unit Test Code It Makes Sense To Unit Test 1.5.9
Testing FAQS 1.5.10
Bad advice 1.6
Single Exit Point Rules 1.6.1
Always Use A StringBuffer 1.6.2
Hungarian Notation 1.6.3
3
Introduction
Introduction
What is This?
This book is an attempt to capture what "good" Java code looks like and the practices that
help produce it.
There is no one right answer to what good code looks like and there are many well-
respected books that serve the same purpose such as Effective Java, Clean Code and
others.
Freely distributable
Open for update - contributions, corrections and updates are encouraged
Brief - much is left out in an attempt to be easily digestible
Narrow - it captures one opinion of "good" appropriate for a specific context
We assume a number of things about you and the environment you are working in.
Some of the suggestions may be valid in other contexts, others might constitute terrible
advice for those contexts.
It is also just one opinion from many valid alternatives. To be useful it needs to be an opinion
that you can agree with and sign up to. If you disagree with something in this book please
make your own thoughts known so it can be improved.
Finally, not all the code we work on is perfect. Sometimes we inherit our own mistakes,
sometimes we inherit other people's.
The point of this document is not to say that all code must look like this but to have an
agreed destination that we are aiming for.
4
Introduction
Some sections will be more relevant to some audiences than others but we encourage
everyone to at least skim all sections even if you do not read them in depth.
Structure
The document is split into five sections:
Version
This book is updated often. The latest changes to the book can be viewed online at
gitbook.com.
Versioned releases are available for free from the book's website.
If you are reading a PDF or print copy of this book the release version will be displayed on
the inside cover. If there is no inside cover then you are reading an unreleased version of the
book.
History
Most of the content of this book began life as internal wiki pages at NCR Edinburgh. We
started to convert the wiki into this book at the end of 2015 so that it could be easily shared
with other parts of our company.
Rather than keep this as an internal document we decided to open it up to everyone in the
hope that together we could make it better.
5
Introduction
A word on Trade-Offs
There are no right answers in software engineering.
It is a balancing act in which we must trade off one concern against another and make a
judgement call about which balance is better for the specific scenario we have found
ourselves in.
One of the most common mistakes we've seen experienced programmers make is to blindly
consider only one or two concerns (often the ones with catchy acronyms) without thought for
others.
We've carefully set out the context in which we think the advice in this book will be useful,
but the context is still very broad. Slightly different situations might benefit from very different
trade offs. What worked well for you in one project might not work so well in the next.
When we do use them we have thought carefully before doing so, but what we really mean
is almost-always or almost-never.
Having said this if you find yourself discounting any of the recommendations in this book
please stop and think first. Don't fall into the trap of thinking certain advice cannot apply to
you. We often make our worst mistakes when we believe we are being elegant or clever.
The full horror of our ineptitude sometimes does not become apparent for months or years.
Where there are differences between Java 8 and 7 we will point them out. If you are unlucky
enough to be working with an earlier version of Java you will have to discover any
differences to Java 7 on your own.
License
This work is Copyright NCR, but may be read and shared with others under the terms of
the Creative Commons Attribution Share Alike licence v4.0.
6
Introduction
Author
This document was written by Henry Coles and numerous contributors.
Contributors
Francesco Burato
Gary Duprex
Grant Forrester
Kevin Grant
Keir Lawson
Marco Di Paola
Gordon Rogers
Herve Saint-Amand
Ewan Summers
Cover
The cover was produced by Peter Berry based on a wood engraving of a Long billed curlew
from the 1885 text "Nouveau dictionnaire encyclopdique universel illustr"
To our knowledge Long Billed Curlews are not especially proficient in Java.
7
Process
Process
Code (good and bad) doesn't just appear from nowhere, someone needs to sit down and
write it. How the coder approaches this task can influence the amount of time spent
swearing at the code later.
This section looks at development philosophy, workflow, and other factors that teams should
consider before starting to work together.
8
Build Fast Feedback Loops
Working on a legacy project, where the only way to discover if a code change is good is to
deploy it to a dev/test/qa environment, is frustrating and demoralizing.
Make sure your project has a well designed development workflow - the effort of setting this
up will be repaid many times over.
9
Instant Feedback
Instant Feedback
A modern IDE such as Eclipse or IntelliJ will provide instant feedback as you type, using the
underlying compiler and configurable static analysis tools.
You can increase the amount of instant feedback you receive by making good use of the
Java type system and configuring the static analysis tools.
While feedback from the IDE is fast and convenient, it has some drawbacks.
For these reasons you should avoid purely IDE centric work flows. Code should not be
considered complete by a developer until tests have been run via the build file.
10
Fast Feedback
Fast Feedback
Feedback from the build script has two major advantages, however:
Because slower feedback is acceptable from the build script, a larger set of static and
dynamic analysis can be run from here. This will usually include a repeat of your instant
feedback.
At least two suites should be maintained that are runnable locally on any developer
machine.
Because they are typically run immediately after compiling or before committing/pushing
code, Martin Fowler refers to these as:
The criteria for a test being placed in the compile suite should, however, be more than just
its execution speed.
They must be fast (milliseconds or less per test) but must also be highly deterministic and
repeatable. This ensures that the suite provides clean feedback - the only reason that a test
should fail after a code change is if the change has caused regression.
Although this sounds simple, in practice it requires considerable rigor to ensure that tests
cannot interfere with each other or be affected by external factors.
11
Fast Feedback
Tests in the commit suite may be slower and may also be slightly less repeatable. They
should aim to be 100% repeatable but they may do things that carry the risk of occasionally
causing a failure, like use network IO or write to disk.
Although many tests in this suite may do no more than launch code within the same JVM as
the tests themselves, some of the tests should also launch the built artifact (war, ear, jar) and
perform at least some degree of testing against it.
Although the commit suite will likely depend on external resources such as containers,
databases, queues, etc., it should still be runnable on any machine with a single command.
Installing and starting dependent resources should be handled automatically by the build
scripts and tests - your project should not come with a page of notes on how to set up a
development machine.
Commonly, the Maven Cargo plugin is used to download and configure containers for
testing.
12
Slower Feedback
Slower Feedback
Both the compile suite and the commit suite should be run on a CI server, normally triggered
by a commit/push to the repository.
In addition to the compile and commit suites, other suites should be created.
These suites may require resources not available on a local machine and/or take large
amounts of time to execute.
They may also re-run the same tests against more realistic dependencies. If an in-memory
database is normally used when running integration tests locally, the same tests might be
run again against a production database.
For a Maven build, these suites are likely to be implemented using profiles or as separate
Maven modules.
These suites will be run as frequently as possible. Most likely, this will mean on a timed
basis because it is likely they will consume too much time to be run on commit. Here, "too
much time" is defined as taking longer than the likely interval between commits/pushes to
the monitored repository.
Timed test runs also sometimes run the suites when no code changes have occurred - this
can provide useful information when identifying flaky tests.
13
Agree The Language You Use For Tests
The testing pyramid is a widely recognized diagram of how testing should be approached.
It shows large numbers of unit tests at the bottom, with a smaller number of integration tests
above them and a yet smaller number of system tests at the peak. Often, some clouds of
manual testing are added at the top.
This diagram has probably been drawn thousands of times. Although unit tests will appear at
the bottom of each version, the words used at the other levels will vary wildly.
Even when the same words are used the meanings attached to them might be different.
14
Agree The Language You Use For Tests
Although people might nod when you discuss "unit tests", "integration tests", "system tests",
"end 2 end tests", "service tests", there is no guarantee that they are thinking of the same
thing as you.
Depending on who you speak to, a "unit test" might be anything from a word document full of
instructions, "any test written by a programmer", through to various formal (but by no means
authoritative) definitions that appeared in text books.
Unit Tests
A fairly tight definition of unit tests is now in common use in the Java community. We
recommend that you and your team use this definition.
Note: Although writing your tests first is often a very good idea, a test that meets the other
criteria is still a unit test regardless of when it was written.
When we talk about "unit" testing, what constitutes a unit isn't necessarily that obvious.
A somewhat circular definition is that a unit is the smallest thing that makes sense to test
independently.
It will often be a single class, but this is not necessarily the case. It may make sense to treat
a group of classes as a unit (particularly if most of them are non-public) or occasionally even
a method.
If we accept that a unit is a small thing, and that we'll know it when we see it, then we can
see that the criteria for being a unit test largely matches the criteria we put forward for the
compile suite.
The only difference is that the compile suite does not care about isolation.
If we choose to write a test that tests two (or more) units in tandem, it still belongs in the
compile suite if it meets the other criteria.
15
Agree The Language You Use For Tests
System Tests
System tests are also fairly well-defined. They are tests that exercise the overall system - i.e
all your code and all the code it interacts with in a realistic environment.
Integration Tests
Integration tests are harder to define. They occupy the large space of everything that doesn't
fit the unit or system tests definitions.
The two following diagrams show how this terminology fits into our world of test suites.
This document will use the terminology unit test, Integration test and System test as shown
in these diagrams.
For clarity, it will sometimes state exactly what is being tested when discussing integration
tests - e.g "test via the REST API of the war file running in Tomcat".
Although it is tedious, this long-hand terminology is clear. It is recommended that you use it
when discussing testing across teams. Within your own team it is likely you will develop a
shorter language you all understand.
16
Agree The Language You Use For Tests
17
Use Coverage As A Tool Not A Target
The tool should work for you; you do not work for the tool.
It is most useful when code coverage is run at the point at which the code and tests are
being written, rather than on a CI server hours later.
Gaps in code coverage highlight areas of code that have not been tested. Some of these
gaps may be expected and intentional, others may be a surprise. It is these surprise gaps
that provide useful information.
Code that has 100% branch coverage may or may not have been tested. Code coverage
tells you that some tests have executed the code, not that they have meaningfully tested it.
Don't let it lull you into a false sense of security.
Some teams set coverage targets that code must meet (75% seems to be a common figure).
Although well-intentioned, this practice is often damaging.
Code coverage is easy to measure. Other properties of tests that are desirable (or highly
undesirable) are not easy to measure e.g.:
For a test to be of value, it must enable refactoring; tests that are tied to one particular way
of solving the problem often have negative value because they must be modified or rewritten
whenever the code is changed. Unfortunately, it is easy to write tests in this way for a
number of months or years before you realize you were doing it wrong.
By concentrating on the one property that is easy to measure, the others are de-
emphasized. But, much worse than this, trying to meet a coverage target can actively push
developers towards writing tests that are tied to the implementation. Bad tests are easier to
write than good tests.
It is probably fair to say that there is a problem when code has only 30% unit test coverage.
On the other hand, if coverage is achieved by setting a target, code with 80% coverage may
be harder to work with than code with a lower figure.
18
Use Coverage As A Tool Not A Target
So don't set targets, instead make sure your team is committed to writing good tests.
A good test is one which helps explains the code, catches regression and doesn't get in the
way when changes are made. Writing the tests before the code can help encourage good
tests and will ensure that code has high coverage.
19
Style
Style
There are many aspects to programming style, from the mundane questions of where to
place braces and new lines through to the more interesting questions of how you design and
structure your code.
This section looks at all of these aspects, starting with the more abstract concerns before
drilling down to the more concrete.
Although static analysis tools can measure aspects of many of the things discussed in most
cases they cannot make break-the-build decisions about whether the code is good. A skilled
human is required to make trade offs and apply discretion.
20
Consider Code Generators Carefully
Summary
Code generators can automatically implement certain types of functionality, saving time and
eliminating the possibility of certain classes of bugs.
Although they have much to recommend them, code generators also have a cost that should
be considered carefully before incorporating them into your project.
Favor generators that allow a clear separation between generated and non-generated
functionality, but make sure you understand the trade-offs you are making before including
any generator into your project.
Details
Code generators can be grouped into three general types:
Boilerplate generators
Compile time annotation processors
Runtime generators/frameworks
Boilerplate Generators
Boilerplate generators are the simplest form of code generation. They can be further split
into :
Generators that insert code into existing classes (e.g. methods auto generated by an
IDE).
Generators that produce scaffolding that is checked into version control and modified.
Generators that produce new classes from a model. The generated code is not normally
checked into version control.
We recommend that the first type are using sparingly, if at all. This is discussed further in
"Know How to Implement Hashcode and Equals".
Generating code from a model (such as a schema or grammar) can be a useful approach as
long as the generated code is not modified and is packaged separately. If generated and
non-generated code are packaged within the same module then this can start to cause
friction (see below).
21
Consider Code Generators Carefully
Most use the annotated classes purely as input, from which new classes are generated.
Often, the new classes extend or implement the annotated class or interface but remain
separate. These are really just a subset of model based boilerplate generators where the
model input model is annotated Java classes.
Some (such as project Lombok) update the annotated classes themselves, adding additional
behavior. This is likely to increase both surprise and friction which are discussed below.
Downsides
There are clearly a lot of upsides to code generators, so why wouldn't they always make
sense?
Surprise
If you generate code at compile or runtime the you are no longer programming in Java.
You are programming in an augmented Java that does things that developers maintaining
the code may not be aware of.
It may break fundamental assumptions that programmers have about what can or cannot
happen within their code.
Runtime generators will usually generate more surprise than compile time systems - they
add an element of magic that breaks the usual Java rules. Runtime generators also often
weaken type safety, moving classes of problem a developer would normally expect to occur
at compile time to runtime.
The first time a developer encounters a code generator in a project, everything it does will be
surprising.
After a period of learning, most of the surprise should go away but each developer will need
to go through this learning period. The learning involved can be significant - gaining a
complete understanding of framework such as Spring is, for example, a significant effort.
22
Consider Code Generators Carefully
The most worrying problem is when there is still some surprise left after the initial learning
period.
If you find yourself asking the question "could this be because of the code generator?" when
something unexpected happens with your system, and having to eliminate that possibility
each time, then you have introduced a very real cost into your project.
Friction
Code using compile-time generators will not import cleanly into IDEs unless the IDE
understands how to run the generator. Even when the system is supported by an IDE it may
require plugins to be installed, configuration options to be set etc.
The amount of friction, and how often it is encountered, will depend on the IDE and the
quality of the support. There may be little friction and it may only be encountered when a
new developer joins a project. Or it may be considerable and triggered each time code is
cleaned.
The most effective way to reduce the friction is to package the generated code separately
from the code that depends upon it. The generated code then becomes a normal binary
dependency and the fact that it is automated becomes an internal implementation detail.
While this works well, it may also have a downside. It may create artificial modules. If the
code was not auto-generated, would it have made sense to package it as a separate
module?
Runtime generators do not usually introduce much friction, although sometimes issues might
be experienced if javaagents are not present when running tests from the IDE.
The Trade-off
So those are the issues.
Surprise and friction sound like minor concerns compared to the promise of functionality for
free, but their impact can be significant.
Whether or not it makes sense to introduce a code generator often depends on how much it
will be used. If there is a large amount of functionality that can be auto-generated then it
probably makes sense, if the amount is relatively small it may be best to stick with vanilla
Java.
23
Optimise For Readability
Details
Many new programmers worry about the performance of each method they write, avoid code
they expect to be inefficient and write in a style that attempts to minimize object allocations,
method calls, assignments or other factors they expect to have a cost.
Within the context in which we work, performance should be one of the concerns considered
last. Instead, attention should be paid to making code as simple and readable as possible.
If a performance issue arises, profiling should be used to identify where the problems
actually lie.
This does not mean the performance should be disregarded completely, but it should always
be trumped by code readability and simplicity until it can be proven that there is a real
benefit to optimization. Where code can be written in a more efficient manner without any
increase in complexity or trade-off with readability then the (presumed) more efficient code
should be preferred.
24
Prefer Readable Code To Comments
If you are about to write a comment, first think if there is a way you could change the code
so that it would be understandable without comments.
Details
From Clean Code by Robert C Martin.
"Nothing can be quite so helpful as a well-placed comment. Nothing can clutter up a module
more than frivolous dogmatic comments. Nothing can be quite so damaging as an old crufty
comment that propagates lies and misinformation."
Comments should be used only to explain the intent behind code that cannot be refactored
to explain itself.
Bad
Better
if (employee.isEligibleForFullBenefits())
A comment is only useful if it explains something that the code itself cannot.
This means that any comments you do write should provide the why, not the what or the
how
Bad
25
Prefer Readable Code To Comments
Better
Better still
if (requiresRootPrivileges(port)) {
throw new InvalidPortError(port);
}
Here, the functional intent has been captured in the method name, the comment has been
used solely to provide some context as to why the logic makes sense.
The comment arguably still adds value however - if nothing else it gives a reader unfamiliar
with the topic two key phrases to search for on the web.
26
Javadoc Judiciously
Javadoc Judiciously
Summary
Javadoc can help document code but often there are better ways to do so. Think carefully
before deciding to write it.
Details
Javadoc is Good
Javadoc is invaluable for external teams that must consume your code without access to the
source.
All externally consumed code should have javadoc for its public methods.
Ensure that all such javadoc concentrates on what a method does, not how it does it.
Javadoc is Bad
Javadoc duplicates information that ought to be clear from the code itself and carries a
constant maintenance cost.
Do not Javadoc code that will be consumed and maintained only by your immediate team.
Instead spend effort ensuring that the code speaks for itself.
27
Remember Kiss And Yagni
Create only the functionality you need now - not what you think you might need in the future.
Details
The KISS (Keep It Simple, Stupid) and YAGNI (You Ain't Going To Need It) acronyms
provide good advice that is worth remembering while coding.
KISS advises that we keep our code and designs as simple as possible.
Few people would disagree with this, but unfortunately it is not always obvious what simple
means.
All of the above are reasonable definitions of simple. None of them is the single definition
always makes sense to follow.
Recognizing simple isn't easy and keeping things simple takes a lot of work.
If we could somehow measure the complexity of our software, we would find that there is
some minimum value that each piece of software must contain.
Real programs will always contain this inherent complexity plus a bit. This extra complexity
is the accidental complexity we have added because we are less than perfect.
Telling accidental complexity apart from inherent complexity is of course also hard.
Fortunately YAGNI gives us some useful advice on how to keep things simple without having
to tell accidental and inherent complexity apart.
28
Remember Kiss And Yagni
The more a system does, the higher its overall complexity will be. If we make a system that
does less, it will be simpler - it will have less inherent complexity and less accidental
complexity
Your goal is, therefore, to create the minimum amount of functionality that solves the
problems you have right now.
Don't implement things because you think you might need them later. Implement in the
future if you need it.
Don't try and make things "flexible" or "configurable". Make them do just what they need
to do - parameterize them at the point you have a need to do so.
If you create more than the minimum amount of functionality, you will have more code to
debug, understand and maintain from that point forward until someone has the confidence to
delete it.
29
Prefer Composition
First consider using composition, then fall back to using inheritance only when composition
does not seem to be a good fit.
Details
Composition means building things by adding other things together. Inheritance is building
things by extending behavior based on an existing class by creating a child classes.
To take a minimal example- If there is a requirement for a class to accept and store String
values, some programmers new to Java will reach for inheritance as follows:
class UsesComposition {
Despite requiring more code, an experienced Java programmer would not even consider the
first approach. So why is it that the second version is preferable?
30
Prefer Composition
There are several overlapping explanations, we'll start with the most abstract and move on
to more practical ones.
So favoring composition over inheritance is just one specific instance of the more general
advice to favor weak relationships between our classes.
Using inheritance makes sense when there is an IS-A relationship there but it is an
inappropriate mechanism to use purely for reusing code.
The interface to our class includes all sort of methods from ArrayList such as:
clear
remove
contains
Do these methods make sense for our class? If someone calls them, could it interfere with
the logic in performBusinessLogic ?
We don't know enough about what our example class is meant to do to answer these
questions definitively, but the answer is most likely that we would prefer not to expose these
methods.
If we switch from ArrayList to some other list implementation this is visible to the classes
clients. Code that previously compiled may now break even if no methods specific to
ArrayList are called - the change of type alone might cause compilation failures.
31
Prefer Composition
Java doesn't support multiple inheritance so we only get to pick one thing to extend. If our
class also needed to store Integers then inheritance isn't even an option so we'd have to use
composition.
class MyClass {
private final SomeDependency dependency;
MyClass(SomeDependency dependency) {
this.dependency = dependency;
}
It is easy to inject a mock into MyClass . Tricks exist to isolate the code in
MyUntestableClass from SomeDependency for the purpose of unit testing, but they are far
more involved.
Inheritance is Static
Inheritance sets a fixed relationship between concrete classes at compile time. With
composition it is possible to swap in different concrete classes at runtime.
Interface Inheritance
32
Prefer Composition
In fact, the design choice you often have to make is between implementation inheritance or
a combination of composition and interface inheritance.
In these situations, the advice is still to prefer the approach that uses composition.
ProcessorUpperCaseDecorator(Processor child) {
this.child = child;
}
@Override
public void process(String someString) {
child.process(someString.toUpperCase());
}
@Override
public void process(String someString) {
super.process(someString.toUpperCase());
}
With the composition based version we can decorate any Processor . With the inheritance
version we would need to re-implement the decorator for each concrete type to which we
wished to add the upper case behavior.
33
Prefer Composition
Almost anything that can be achieved with implementation inheritance can also be achieved
using the combination of interface inheritance and composition.
Implementation inheritance has one single advantage over composition - it's less verbose.
So implementation inheritance should be used when both of the following conditions are
met
34
Keep It Solid
Keep It SOLID
Summary
The SOLID acronym provides some guidance on design that you should follow.
Details
".. you should design modules that never change. When requirements change, you extend
the behavior of such modules by adding new code, not by changing old code that already
works."
Robert Martin
An indication that you might not be following this principle is the presence of switch
statements or if/else logic in your code.
One indication that you are breaking this principle is the presence of instanceof statements
in your code.
35
Keep It Solid
The Interface Segregation Principle states that clients should not be forced to implement
interfaces they don't use; prefer small, tailored interfaces to large, catch-all ones.
One indication that you might be breaking this principle is the presence of empty methods or
methods throwing OperationNotSupportedException in your code.
Abstractions should never depend upon details. Details should depend upon abstractions.
1. Package the interfaces a 'high-level' component depends upon with that component
2. Package the interface a component depends upon separately from both the client and
implementation
This first approach is classic dependency inversions (contrast it with the traditional approach
of have the high level component depend upon the lower layers).
The second approach is known as the "Separated Interface Pattern". It is a little more heavy
weight, but also more flexible as it makes no assumption about who should own the
interface.
An indication that you are breaking this principle is the presence of package cycles within
your code.
36
Keep Your Code Dry
Every time you copy and paste code, flick yourself in the eye. This is a great disincentive to
doing it again but over time may cause blindness.
Details
If the same logic is required more than once then it should not be duplicated; it should
instead, be extracted to a well named class or method.
This will be both easier to read and easier to maintain because a change will only be
required in one place should the logic need to change.
Bad
class Foo {
private int status;
private boolean approved;
foo() {
if (status == 12 || approved) {
doFoo();
}
}
bar() {
if (status == 12 || approved) {
doBar();
}
}
}
Better
37
Keep Your Code Dry
class Foo {
private final static int PRE_APPROVED = 12;
foo() {
if (isApproved()) {
doFoo();
}
}
bar() {
if (isApproved()) {
doBar();
}
}
private isApproved() {
return status == PRE_APPROVED || approved;
}
}
Things are a little trickier when we have similar but not identical logic.
Although it is quick and easy, the worst thing we can do is copy and paste.
Terrible
38
Keep Your Code Dry
This seemed quick and easy now, but is the start of a codebase that will suck time each time
we try to understand or change it.
A straightforward but very limited approach to re-use code is to introduce Boolean flags.
Not great
If we introduce an interface:
interface WidgetAction {
void apply(Widget widget);
}
Better
39
Keep Your Code Dry
The Java 7 version is quite verbose due to the anonymous inner class boiler plate.
Arguably, Boolean flags might be preferable for very simple cases such as this but, if we
extract the logic in performSideEffect and performDifferentSideEffect methods into top-
level classes implementing WidgetAction , then the Strategy version becomes compelling.
In Java 8, there is little question that the Strategy pattern is preferable in even the simplest
of cases.
40
Keep Your Code Dry
We do not need to introduce our own interface - the built-in Consumer<T> is enough. We
should consider introducing one if the doThings method were exposed publicly or if the
logic in performSideEffect was complex enough to pull into a top-level class.
As a pipeline
41
Prefer Reversible Decisions
Details
Many of the decisions you make while designing your code will eventually turn out to be
wrong.
If you can make your decisions reversible by containing their consequences and adding
abstractions then this future change will not matter.
For example - if you introduce a third party library and reference it throughout your code,
then you have made high the cost of reversing the decision to use that library. If you
constrain it to a single location and create an interface for it, the cost of reversing the
decision is low.
But don't forget KISS and YAGNI - if your abstractions complicate the design then it is better
to leave them out.
42
Make Dependencies Explicit
Always inject dependencies into a class using its constructor. Do not use other methods
such as setters or annotations on fields.
Details
Code is easier to understand if the interfaces and classes that each object depends on are
conspicuous and visible.
The most visible dependencies are the ones that are injected into a method as a parameter.
Less visible are the ones stored as fields but, depending on how those fields are populated,
the dependencies can still be relatively easy to discover.
Constructor Injection
Constructor dependency injection clearly communicates an object's dependencies in a
single location and ensures objects are only ever created in valid states. It allows fields to be
made final so that their life cycle is unambiguous.
Setter Injection
Setter injection increases the number of possible states an object could be in. Many of those
states will be invalid.
Bad
43
Make Dependencies Explicit
Better
Here, it is clear that we must supply a Bar as we are unable to construct the class without
it.
Field Annotations
While annotations on fields seem convenient they mean that the dependency will not be
visible in the public API. They also tie construction of your class to the frameworks that
understand them and prevent fields from being made final.
If you are working with a dependency injection framework such as Spring, either move
construction of your objects into configuration classes or restrict the use on annotations to
constructors. Both methods allow your classes to be constructed normally and ensure that
all dependencies are visible.
Hidden Dependencies
44
Make Dependencies Explicit
Anything that is not injected into a class using a constructor or as a method parameter is a
hidden dependency.
They are pulled in from Singletons , ThreadLocals , static method calls or by simply calling
new .
Bad
Here we must ensure that the Database class is in a valid state before calling the doThings
method of the code below, but we have no way of knowing this without looking through every
line of code.
Better
By definition, hidden dependencies are hard to discover but they have a second issue - they
are also hard to replace.
Seams
Seams are a concept introduced by Michael Feathers in "Working Effectively with legacy
code"
45
Make Dependencies Explicit
He defines it as:
"a place where you can alter behavior in your program without editing in that place."
This approach introduces a seam but does not address our concerns around visibility. The
dependency remains hidden.
If we used this approach, our codebase would remain hard to understand and we would find
ourselves constantly fighting test order dependencies.
With constructor injection, we gain a seam and make the dependency visible. Even if
Database is a singleton, we are still able to isolate our code from it for testing.
46
Prefer Immutable Objects
Details
Mutable state makes programs harder to understand and maintain.
When objects are short-lived, and do not leave method scope, mutable state causes few
problems. Writes and reads will be close together and there will be a clear order in which
this happens.
If an object escapes from a method then it may be accessed from more than one location
within the code.
We must start by assuming that anything that can happen to these objects will. We can only
confirm that certain situations do not occur by examining the whole program.
The set of things that might happen to an immutable object is far smaller than for a mutable
one. By constraining how long lived objects can behave we have made things simpler. There
are fewer possibilities that we must consider.
Unfortunately, it is not always easy to tell from a class definition what the lifecycle of objects
of that type will be. Perhaps only short-lived instances will be created. Perhaps only long-
lived ones. Perhaps a mixture of the two.
47
Prefer Immutable Objects
@Override
public int hashCode() {
return Objects.hashCode(id);
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Foo other = (Foo) obj;
return Objects.equals(id, other.id);
}
}
We would need to search our codebase for all usages of it to establish the following :
Writes to longs are not atomic and nothing within Foo itself establishes a happens-before
relationship between the field write and read.
If setId and getId are ever called from different threads we might get back stale or
garbage values.
The hashcode of this class relies on a mutable field. If we modify it after we place it in a set
then our program will not behave as we expect.
48
Prefer Immutable Objects
Even if our program behaves correctly, we need to do work in order to understand how it
functions.
setId can be called at any point after the object is created. We can, therefore, only
understand how data flows through our program by looking for all calls to setId - perhaps
there are several, perhaps there is only one. The only way we can discover this is by
examining the entire program.
Immutable Objects
If we can make our objects immutable we gain guarantees that mean we do not need to
worry about how our objects are used.
@Immutable
public final class Foo {
private final Long id;
@Override
public int hashCode() {
return Objects.hashCode(id);
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Foo other = (Foo) obj;
return Objects.equals(id, other.id);
}
}
It is inherently thread-safe.
49
Prefer Immutable Objects
We know that whatever value we construct it with will remain until it dies. There is only one
possible point where data is written so we do not need to search for others.
Annotations
This does not in any way change the object's functionality but provides a way to
communicate the intent of this being an immutable class. Static analysis tools such as
Mutablility Detector can check if this intent has been violated.
We can tell at a glance that Foo is immutable as it has final fields of a well known
immutable type.
The final keyword ensures only that the reference a field points to will not change.
If the field were of type Bar then we would not know if it were mutable or not without
examining Bar to see if it too were immutable. Even if we were not using a static analysis
tool the use of the Immutable annotation would make this assessment faster.
Instead of updating the state of immutable objects, we create new instances that retain the
state we do not wish to modify.
This pattern seems strange to some Java programmers at first, but the programming model
is similar to how the familiar String class works.
@Immutable
public final class Bar {
private int anInt;
private String aString;
@CheckReturnValue
public Bar withAnInt(int anInt) {
return new Bar(anInt, this.aString);
}
@CheckReturnValue
public Bar withAString(String aString) {
return new Bar(this.antInt, aString);
}
}
50
Prefer Immutable Objects
Instances of Bar with new values can be obtained by calling withAString and withAnInt .
The call here to withAnInt achieves nothing because the return value is not stored. Most
likely, the programmer intended to write:
If you know that a class will only ever be used to create short-lived, local objects then you
might consider making it mutable. But you must weigh this against the additional work
required to ensure that the class is only ever used in this fashion as the codebase grows.
Options exist to auto-generate both immutable and mutable classes, thereby removing
mutable objects' main advantage. Two of these options are discussed further in "Know How
to Implement Hashcode and Equals".
Mutable objects used to be the norm in Java. As a result, many common frameworks require
mutable objects. Persistence and serialization frameworks often require Java beans with no
args constructors and setters. Other frameworks might require you to use two-stage
construction with a lifecycle method such as init.
It is not always highlighted in the documentation but some long standing frameworks have
been updated to support immutable objects.
Jackson for example now allows constructors and factory methods to be annotated :-
51
Prefer Immutable Objects
@JsonCreator
public Foo(@JsonProperty("x") int x, @JsonProperty("y") int y) {
this.x = x;
this.y = y;
}
}
Other frameworks, such as Hibernate, can only be used with classes that provide a default
constructor. Although they can be configured to set fields directly without the need for setters
this causes more problems than it solves.
If you are tied to a framework that requires mutability then you will need to use mutable
objects where you interface with that framework.
52
Use A Consistent Code Layout
Detail
The way in which Java code is formatted and laid out is largely a matter of personal
preference.
Some styles (such as omitting braces in conditional statements) can arguably make certain
types of bug slightly more likely.
Others might require more work to keep the code compliant (such as aligning fields into
columns) but, to a first approximation, no particular scheme is greatly superior to any other.
Every codebase should, however, have a single agreed formatting style which is consistently
applied and is understood by everyone working on that codebase.
This prevents commit wars in which different team members re-format things to their
personal preference. It also makes code easier to understand as there is a cognitive cost for
the reader if formatting changes radically from file to file.
Unless there is already broad agreement across teams about how things should be
formatted, trying to enforce one official set of rules is likely to create more ill will than benefit.
A global preferred reference should therefore be set, but teams should be free to deviate
from this as they see fit as long as a consistent style is used for the code they maintain.
These formatting rules are well thought out, clearly documented and not overly prescriptive.
53
Use A Consistent Code Layout
We will not describe them in any detail here, but code formatted to these rules will look
something like the following :-
class Example {
int[] myArray = {1, 2, 3, 4, 5, 6};
int theInt = 1;
String someString = "Hello";
double aDouble = 3.0;
switch (a) {
case 0:
Other.doFoo();
break;
default:
Other.doBaz();
}
}
void bar(List<Integer> v) {
for (int i = 0; i < 10; i++) {
v.add(new Integer(i));
}
}
}
However, we suggest that the guidance in the Google guide on when to write Javadoc is
ignored in favor of our own.
Tabs may appear differently depending on how an editor is configured. This will result in
constant reformatting as different programmers adapt the file to their editor settings. Spaces
avoid this problem.
In some languages (e.g. JavaScript before the rise of code minifiers) tabs have/had an
advantage as they reduced the size of the source file compared to using multiple spaces.
The increase in size of the source file is of no relevance for Java.
54
Use A Consistent Code Layout
There are various arguments about the supposed advantages of this style, but we suggest
its use mainly because it is common in the Java community.
Although simple if else statements can be more concisely written by omitting the braces
we suggest that they are always included. This reduces the chance of a statement being
placed outside the conditional when this was not the intent.
55
Group Methods For Easy Comprehension
In addition to arranging by accessibility, they should also be ordered into a logical flow.
Detail
This scheme tries to achieve two goals:
To achieve the 2nd goal, methods should be arranged into logical groups, with methods
always appearing above the ones they call.
The two goals clearly conflict because grouping the public API methods together at the top
of the file prevents grouping them with the implementation methods that they used. If this
causes a large problem it may be an indication that the class has too many responsibilities
and could be refactored into one or more smaller classes.
Questions of the "correct" location of a method will also occur when an implementation
method is called from multiple locations or methods have recursive relationships. There is, of
course, no one right answer and any ordering that broadly meets the second goal may be
used.
Constructors and static factory methods should usually be placed first in the class. The fact
that a method is static should not, however, generally influence where it is placed.
Example
56
Group Methods For Easy Comprehension
private int a;
Layout() {...}
Fields should always be placed at the top of the class before any methods.
57
Keep Methods Small And Simple
Details
Small things are easier to understand than big things. Methods are no different.
One way to measure the size of a method is via the number of lines of code it contains.
As a guide methods should not usually be longer than 7 lines in length. This is not a hard
rule - just a guide of when to feel uncomfortable with a method's size.
Another way to gauge the size of a method is to see how many possible paths there are
through it. The Cyclomatic complexity of a method gives a measure of this - it will increase
as the amount of conditional logic and number of loops grows.
As a guide, methods should not usually have a complexity above 5. Again, this is not a hard
rule, just a guide of when to feel uncomfortable.
Your code will naturally contain some methods that are larger than others - some concepts
are inherently more complex than others and the implementation will not become simpler if
broken down further or expressed in a different way.
But most large methods can be made smaller in one of three ways :
Bad
58
Keep Methods Small And Simple
while (names.hasMoreElements()) {
String name = (String)names.nextElement();
String value = request.getHeader(name);
httpHeaders.put(name.toLowerCase(), value);
}
return httpHeaders;
}
Better
59
Keep Methods Small And Simple
Re-expressing logic
Terrible
Better
Neither of the above changes alter the structure of our program or even affect the signature
of the method. Both still reduce both line count and complexity while increasing readability.
Simplifying things with a series of higher impact changes that extract a model of our domain
is, however, often the best approach.
60
Keep Methods Small And Simple
It is difficult to guess what this model might look like for our contrived example, but is likely
that this conditional logic could be replaced with polymorphism.
enum ADomainConcept {
AAA(true),
ABA(true),
CC(true),
FWR(true),
OTHER(false),
ANDANOTHER(false);
boolean isFnardy() {
return isFnardy;
}
}
Some of these new features aren't all that new, but code is still written without them:
61
Methods Should Do One Thing
Details
A useful guide as to whether a function is doing only one thing is given in "Clean Code" by
Robert C Martin.
"another way to know that a function is doing more than one thing is if you can extract
another function from it with a name that is not merely a restatement of its implementation."
Bad
if (importantBusinessLogic()) {
foo.setStatus(FNAGLED);
this.collaborator.collectFnagledState(foo);
}
}
Better
62
Methods Should Do One Thing
63
Avoid Null
Try to limit the times you or your clients need to write the following:
if ( != null ) {
...
}
Details
Although it is likely that libraries and frameworks you interact with will return null, you should
try to ensure that this practice is isolated to third party code.
The core of your application should assume that it does not have to worry about null values.
The null object pattern - when you have something you think is optional
The type-safe null pattern (aka Option, Optional & Maybe) - when you need to express
that an interface might not return something
Design by contract
The pattern is very simple, just provide an implementation of the interface that does
"nothing" or has a neutral behavior. This can then be safely referenced by its clients, with no
need to check for null.
64
Avoid Null
It is a simple pattern. An Optional is basically just a box that can hold either one or zero
values. You can check if the box is empty (using isPresent ) and retrieve its value via a get
method.
Optional should be used whenever a public method might not return a value as part of
normal program flow.
It might not be immediately obvious what value Optional provides over just using null. If you
need to check that an Optional has something in it before calling get , how is this different
from checking a value is not null to avoid a NullPointerException ?
Firstly, if your method declares that it returns Optional<Person> then you can instantly see
from the type signature that it might not return a value. If it only returned Person you would
only know that it might return null if you looked at the source, tests or documentation.
Equally important, if you know that you always return Optional within your codebase when
something might not be present, then you know at a glance that a method returning Person
will always return a value and will never return null.
Finally, the preferred way to use Optionals is not to call the get method or to explicitly check
if it contains a value. Instead the values that are contained (or not contained) in an Optional
can be safely mapped, consumed and filtered by various method on the class.
In the simplest case a possibly empty Optional can be accessed by calling the orElse
method which takes a default value to use if the Optional is empty.
As mentioned, the sweet spot for using Optionals is for the return types for methods. They
should not generally be held as fields (use the null object pattern here instead) or passed to
public methods (instead provide overloaded versions that do not require the parameter).
One objection that is sometimes raised by Java programmers encountering Optional for the
first time is that it is possible for an Optional to be null itself. While this is true, returning a
null Optional from a method is a perverse thing to do and should be considered a coding
error.
Static analysis rules exists that can check for code that returns null Optionals.
Design by Contract
We wish for all code that we control to be able to ignore the existence of null (unless it
interfaces with some third party code that forces us to consider it).
65
Avoid Null
Objects.requireNonNull can be used to add a runtime assertion that null has not been
passed to a method.
Because your core code should generally assume that null will never be passed around,
there is little value in documenting this behavior with tests; assertions add value because
they ensure that an error occurs close to the point where the mistake was made.
JSR-305 provides annotations that can be used to declare where null is acceptable.
Although JSR-305 is dormant, and shows no signs of being incorporated into Java in the
near future, the annotations are available at the maven co-ordinates :-
<dependency>
<groupId>com.google.code.findbugs</groupId>
<artifactId>jsr305</artifactId>
<version>3.0.1</version>
</dependency>
Findbugs
Error Prone
These can be configured to break the build when null is passed as a parameter where we do
not expect it.
Annotating every class, method or parameter with @Nonnull would quickly become tedious
and it would be debatable whether the gain would be worth the amount of noise this would
generate.
@javax.annotation.ParametersAreNonnullByDefault
package com.example.somepackage ;
Once non null parameters have been made the default behavior, any parameters that do
accept null can be annotated with @Nullable .
66
Avoid Null
67
Use Final Liberally
Details
Making parameters and variables that are assigned once final makes a method easier to
understand because it constrains the things that could possibly happen within the code.
It would be reasonable to make all parameters and assign-once variables final, but this
needs to be weighed against the noise created by inserting the final keyword everywhere.
For short methods, whether the benefit outweighs the cost is arguable, but if a method is
large and unwieldy then the case for making things final is much stronger.
At a minimum, everything should be made final within large methods. This may also be
extended to shorter methods at the team's discretion. A blanket policy has the advantage of
being easy to automate/understand. A more nuanced policy is harder to communicate.
When working with legacy code, making parameters and variables final is also a useful first
step in gaining understanding of the method before re-factoring. Methods that have proved
difficult to express in smaller chunks will also become easier to understand when single
assignment variables are final.
68
Provide No More Than One Worker Constructor
Details
Having a single place where fields are assigned during construction makes it easy to
understand the states that class can be constructed in.
Bad
The duplication of values in the above code could be removed but it would remain confusing
because the concern of initializing the class is spread across three locations.
If more fields were added it would be easy to forget to initialize them in the existing
constructors.
69
Provide No More Than One Worker Constructor
Fortunately, we have made all fields final so this would give a compilation error. If the class
was mutable, we would have a bug to discover at runtime.
Better
Fields are now only written in one location, resulting in less duplication.
We can also see at a glance that Foo cannot be constructed with null values. In the
previous version, this could only be determined by scanning three different locations.
70
Avoid Checked Exceptions
Details
Exceptions are for exceptional circumstances - design your code such that they are not
thrown in scenarios that are expected to happen.
This means that they should not be used for normal control flow.
Checked exceptions bloat and complicate code. You should avoid adding them to your API,
except when there is a clear action that the caller can always take to recover from the error
scenario.
If you are working with a library that uses checked exceptions, you can wrap them by re-
throwing a runtime exception.
try {
myObject.methodThrowingException();
} catch (SomeCheckedException e) {
throw new RuntimeException(e);
}
If you have caught an Exception or a Throwable , so are unsure of the exact type, you can
avoid creating unnecessary wrappers using Guava's Throwables.propagate .
try {
myObject.methodThrowingException();
} catch (Exception e) {
throw Throwables.propagate(e);
}
This will wrap checked exceptions and re-throw unchecked exception as is.
71
Avoid Checked Exceptions
72
Specifics
Specifics
This section provides some more specific advice on various Java language features and
gotchas.
A lot of what is covered in this section can be automated by tools such as FindBugs, PMD,
Checkstyle and Sonar.
73
Know How To Implement Hashcode And Equals
Details
It is well known that if you override equals then you must also override the hashCode
method (see Effective Java item 9).
If logically-equal objects do not have the same hashCode they will behave in a surprising
manner if placed in a hash based collection such as HashMap .
By "surprising", we mean your program will behave incorrectly in a fashion that is very
difficult to debug.
The contract for equals is handily stated in the Javadoc of java.lang.Object . We will not
repeat it here or repeat the discussion of what it means, that can be found in Effective Java
and large swathes of the internet. Instead we will look at strategies for implementing it.
Whichever strategy you adopt, it is important that you first write tests for your
implementation.
It is easy for an equals method to cause hard-to-diagnose bugs if the code changes (e.g. if
fields are added or their type changes). Writing tests for equals methods used to be a painful
and time-consuming procedure, but libraries now exist that make it trivial to specify the
common cases (see Testing FAQs).
Don't
This is the simplest strategy and the one you should adopt by default in the interests of
keeping your codebase small.
Most classes do not need an equals method. Unless your class represents some sort of
value it makes little sense to compare it with another so stick with the inherited
implementation from Object.
74
Know How To Implement Hashcode And Equals
An irritating gray area are classes where the production code never has a requirement to
compare equality but the test code does. The dilemma here is whether to implement the
methods purely for the benefit of the tests or to complicate the test code with custom
equality checks.
There is, of course, no right answer here; we would suggest first trying the compare-it-in-the
test approach before falling back to providing a custom equals method.
The custom equality checks can be cleanly shared by implementing a custom assertion
using a library such as AssertJ or Hamcrest.
Effective Java tentatively suggests having your class throw an error if equals is unexpectedly
called
This seems like a good idea but, unfortunately, it will confuse most static analysis tools. On
balance, it probably creates more problems than it solves.
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((field1 == null) ? 0 : field1.hashCode());
result = prime * result + ((field2 == null) ? 0 : field2.hashCode());
return result;
}
75
Know How To Implement Hashcode And Equals
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
MyClass other = (MyClass) obj;
if (field1 == null) {
if (other.field1 != null)
return false;
} else if (!field1.equals(other.field1))
return false;
if (field2 == null) {
if (other.field2 != null)
return false;
} else if (!field2.equals(other.field2))
return false;
return true;
}
Unless your IDE can be configured to produce clean methods (as discussed below) we do
not generally recommend this approach. It is easy for bugs to be introduced into this code by
hand editing over time.
@Override
public int hashCode() {
return Objects.hash(field1, field2);
}
The Objects class also simplifies implementing equals a little by pushing most null checks
into the Objects.equals method.
76
Know How To Implement Hashcode And Equals
@Override
public boolean equals(Object obj) {
if (this == obj) // <- performance optimisation
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass()) // <- see note on inheritance
return false;
MyClass other = (MyClass) obj;
return Objects.equals(field1, other.field1) &&
Objects.equals(field2, other.field2);
}
The first if statement is not logically required and could be safely omitted; it may, however,
provide performance benefits.
Usually, we would recommend that such micro-optimizations are not included unless they
have been proven to provide a benefit. In the case of equals methods, we suggest that the
optimization is left in place. It is likely to justify itself in at least some of your classes and
there is value in having all methods follow an identical template.
The example above uses getClass to check that objects are of the same type. An
alternative is to use instanceof as follows
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (!(obj instanceof MyClass)) // <- compare with instanceof
return false;
MyClass other = (MyClass) obj;
return Objects.equals(field1, other.field1) &&
Objects.equals(field2, other.field2);
}
This results in a behavioral difference - comparing instances of MyClass with its subclasses
will return true with instanceof but false with getClass .
However, if instanceof is used, it is easy for the symmetric property of the equals contract
to be violated if a subclass overrides equals. i.e.:
77
Know How To Implement Hashcode And Equals
a.equals(b) // true
b.equals(a) // false, a violation of symmetry
If you find yourself in a situation where you need to consider the nuances of whether
subclasses are equal to their parents then we strongly suggest you reconsider your design.
Having to think about maintaining the equals contract in a class hierarchy is painful and you
shouldn't need to put yourself, or your team, through this for normal server-side coding
tasks.
In the majority of cases, if you think it makes sense for your class to implement hashCode
and equals , we strongly suggest you make your class final so hierarchies do not need to
be considered.
If you believe you have a case where it makes sense for subclasses to be treated as
equivalent to their parent, use instanceof but ensure that the parent equals method is
made final.
These classes do provide the option of a single line reflection based implementation.
78
Know How To Implement Hashcode And Equals
in your application. If you are confident that you would detect such adverse impacts then
using these methods as initial placeholder implementations may be a reasonable approach.
But in general we suggest you avoid them.
Code Generators
A number of projects exist that can auto-generate value objects at build-time. Two of the
better known options are :
Google auto
Project Lombok
Google Auto
Google Auto will create a subclass with the obvious implementation of an abstract class
annotated with @AutoValue . This implementation will include functioning hashcode and
equals methods.
import com.google.auto.value.AutoValue;
@AutoValue
abstract class Animal {
static Animal create(String name, int numberOfLegs) {
return new AutoValue_Animal(name, numberOfLegs);
}
Animal() {}
This is clearly far less effort than hand crafting a complete Animal class, but there are some
downsides.
Some of the issues with code generators are discussed in "Consider Code Generators
Carefully", which categorized them into friction and surprise.
Here, Google Auto introduces some friction as the code shown above will not compile within
an IDE until the generator has run to produce the AutoValue_Animal class.
79
Know How To Implement Hashcode And Equals
Because it is a value, Animal would normally be implemented as a final class - but we have
been forced to make it abstract. The team behind Auto recommend you add a package-
private constructor to prevent other child classes being created.
Unlike normal Java, the order in which accessors are declared is important because it is
used by the generator to define the order of the constructor parameters. Re-ordering the
accessors can, therefore, have the surprising effect of introducing a bug.
Lombok
Lombok can also (amongst other things) generate full implementations of value objects.
@Value
public class ValueExample {
String name;
@NonFinal int age;
double score;
}
It will alter the class at build-time to produce an implementation along the lines of:
80
Know How To Implement Hashcode And Equals
While Google Value asks the programmer to provide a valid public API for a class, Lombok
creates the public API based on a description of its internal state. The description is valid
Java syntax but has a different meaning when interpreted by Lombok.
Lombok causes some friction. It is not practical to use Lombok without an IDE that
understands it - code using the autogenerated api will appear to be invalid. An IDE plugin
must be installed.
While it (arguably) introduces less friction than auto once the IDE plugin is installed, the
behavior of Lombok is much more surprising. It is easy to explain what Auto does - it
generates a class at build-time that implements an interface you define. It is much harder to
predict or explain what Lombok will do.
Although Lombok requires the programmer to write less code than solutions such as Auto, it
deviates further from normal Java.
81
Know How To Implement Hashcode And Equals
If you consider using a code generator for Value classes, we would recommend you
consider approaches such as Auto before Lombok.
To its credit Lombok does provide an escape route (see "Prefer reversible decisions") in the
form of delombok which allows you to output the generated classes. These can then be
used to replace the annotated originals.
Removing Auto is similarly straightforward - the generated classes can be checked into the
source tree. The artificial abstract class/implementation split can then be removed via simple
refactorings.
82
Do Not Reassign Parameters
Detail
Reassigning to parameters makes code harder to understand and provides no meaningful
advantage over creating a new variable.
If the method is large, it can be difficult to track the lifecycle of a parameter. Even within
short methods, re-using parameters will cause problem. As the variable is being used to
represent two separate concepts, it is often not possible to name it meaningfully.
If another variable of the same type as a parameter is needed, it should be declared locally.
Bad
Better
return desiredStatus ;
}
Parameters may be declared final so that the reader can tell at a glance that its value will not
change.
83
Limit Scope
Details
Bad
Better
Better still
Sometimes, assigning to well-named, temporary variables will result in more readable code
than calling a method inline because it helps the reader follow the data and logical flow.
As a rule of thumb, if you are unable to come up with a name for a variable that does little
more than mirror a method from which its value was retrieved, then the variable should be
eliminated.
84
Limit Scope
85
Prefer For Each Loops To For Loops
Details
The for each loop introduced with Java 5 avoids the potential out-by-one errors of indexed
for loops and is more concise than code using iterators.
Bad
A little better
86
Prefer Maps And Filters To Imperative Loops
Details
Most loop based code can be re-written in a more declarative style using filters and maps.
Java 8 made this easy by introducing lambdas and the streams API, but the same style can
be applied in Java 7 using anonymous inner classes and third party libraries such as Guava.
Filters and maps highlight what the code is intended to achieve. This is less clear in the
imperative implementation.
Bad
Better (Java 8)
87
Prefer Maps And Filters To Imperative Loops
Note that, although the Java 7 version requires more lines of code (in the form of the ugly
boilerplate for the anonymous inner class), the logic of the selectValues method is clearer.
If the logic required in the Predicate or mapping Function is required in multiple places then
it is straightforward to move this to a common location. This is harder to achieve with the
imperative version.
Also note that the method that creates the Predicate has been made static. It is a good idea
to do this, where possible, when returning an anonymous class to prevent a long lived
instance preventing the parent class from being garbage collected. Although the Predicate is
only short-lived in this instance, applying static dogmatically in all cases avoids the overhead
of thinking.
88
Avoid Apis From Prehistory
Details
Java has been around for over 20 years. In order to maintain backwards compatibility, it has
hoarded all manner of APIs that no longer make sense to use. Some of them are handily
marked with @Deprecated annotation, others are not.
Unfortunately, many are still used in university courses and online examples. New Java
programmers may not be aware they have been replaced - a few to watch out for include:
Each of these replacements (except Enumeration ) differ from the originals by not being
synchronized. If you think you need a synchronized collection go away somewhere quiet and
think again.
89
Beware Casts And Generics Warnings
If you find yourself writing one, stop and ask yourself why you are writing it.
What would need to be changed in your code so you did not need to write that cast?
Detail
Java's type system is there to help us - it catches bugs at compile-time and documents our
code, making it easier to understand and navigate.
1. We have reached the limits of Java's type system and the programmer must take
control
2. The overall design of the code is poor
3. The code uses raw generic types
For example:
90
Beware Casts And Generics Warnings
The compiler will not be happy that we have failed to fully declare the type of List we are
dealing with and will (depending on how it's been configured) generate an error or warning
on the line where list is declared e.g.
Make sure that all such warnings are addressed, either by imposing a zero compiler
warnings policy or by configuring the compiler to treat them as errors.
In this case, removing both the cast and the warning is straight forward:
Poor Design
Sometimes, removing a cast or fixing a warning is non-trivial. We have bumped into issue
two - poor design.
For example:
91
Beware Casts And Generics Warnings
Normally, objects placed into a collection should be of a single type or of multiple types
related by a common superclass or interface.
Here, unrelated types have been placed into the same list with a String used to
communicate some sort of information about how "processing" of a widget has failed.
The classic OO fix for this code would be to introduce a ProcessResult interface with two
concrete implementations.
interface ProcessResult {
void doSomething();
}
@Override
public void doSomething() {
result.doSomething();
}
@Override
public void doSomething() {
// do something with result string
}
92
Beware Casts And Generics Warnings
It may also sometimes make sense to use a disjoint union type aka Either .
This technique can be particularly useful as an interim step when reworking legacy code that
uses mixed type raw collections, but can also be a sensible approach when dealing with
error conditions.
Unfortunately, Java does not provide an Either type out of the box but at its simplest it
looks something like:
boolean isLeft() {
return left != null;
}
L left() {
return left;
}
R right() {
return right;
}
Libraries such as Atlassian's Fugue provide implementations with much richer functionality.
Using the simplistic form of Either with Java 7 the code could be re-written as:
93
Beware Casts And Generics Warnings
While most Java programmers will prefer the earlier OO version, this version has two
advantages:
1. It requires no change to the structure of the original code - all we have really done is
make the types document what is happening
2. It requires less code
This pattern can help quickly tame a legacy code base that is difficult to comprehend.
Before we do this, we must make certain that the cast is safe and there is no better solution
to our problem.
Similarly, we may need to sometimes suppress a Generics warning, this can be done by
annotating with @SuppressWarnings e.g.
@SuppressWarnings("unchecked")
<T> T read(final Class<T> type, String xml) {
return (T) fromXml(xml);
}
Here, the compiler has no way of knowing what type has been serialized to the String.
Hopefully the programmer does or else a runtime error will occur.
94
Do Not Use Magic Numbers
Detail
Placing numeric or string literals directly into source code causes two problems:
Bad
Better
If the constants you extract relate to an identifiable concept, create an Enum instead:
Good
95
Do Not Use Magic Numbers
Some coding standards make statements such as "0 and 1 are exceptions to this rule". This
is, however, an oversimplification.
Sometimes 0 and 1 will have a clear local meaning as they are being used as part of low
level code e.g.:
if (list.size() == 0) {...}
But 0 and 1 may also have domain-specific values that should be extracted into constants
like any other literal.
Server-side Java can also often be re-written in a cleaner fashion without the use of numeric
literals, e.g.:
if (list.isEmpty()) {...}
96
Do Not Use The Assert Keyword
Details
Assertions written with the assert keyword are only enabled when the -ea JVM flag is set.
The intent of this flag is to allow the assertions to be enabled in development and testing but
disabled in production to avoid the performance overhead of assertion logic. This is usually a
premature optimization and increases the opportunity for mistakes as the code will behave
differently in development vs production.
Switching off assertions in production also greatly dilutes their value. If a coding error has
been made assertions ensure that it is reported early, close to the bug. If assertions are
turned off in production bugs may propagate silently. This may make their consequences
more severe and will certainly make the issue harder to diagnose.
So for these reasons, unless you are working in a very performance sensitive domain,
assertions should always be enabled.
For always-on assertions third party libraries such as Guava's preconditions provide a better
solution than the assert keyword.
A separate issue is the use of the assert keyword in tests. This is usually the result of a
lack of familiarity with Java and JUnit.
In codebases found in the wild where assert has been used in tests the -ea flag is rarely
set, meaning that the tests can never fail. For tests JUnit's built in assertions or modern test
focused assertion libraries such as AssertJ should always be used.
97
Avoid Floats And Doubles
Detail
Floats and doubles introduce a minefield of rounding and comparison issues. While they are
a sensible choice for some domains where you do not care about rounding errors integers or
BigDecimal are usually a better choice for server-side business code.
The core issue is that floating point numbers are not able to represent many numbers (e.g.
0.1 ).
This leads to unexpected results that may not be caught by simple test cases
But
The simplest solution in this case would be to replace the floats with integer values (i.e. track
the balance in units of cents rather than dollars).
In situations where floats can't be replaced by integers code can be re-written to use
BigDecimal .
98
Avoid Floats And Doubles
Note that although BigDecimal can be constructed from a float this would take us back to
where we started.
The primitive floating point types have performance advantages over BigDecimal that can
be significant in highly numerical domains such as machine learning, physics engines,
scientific applications etc. In these domains the performance benefit may greatly out-weigh
the additional risk of error.
Code using BigDecimal is also inherently more verbose and clumsy than code that uses
primitives. If you are working in a domain where the imprecision of floating point types is
acceptable you might prefer the cleaner code they allow, but be sure you are making this
choice consciously with an understanding of the pitfalls involved.
99
Do Not Use Reflection
Details
Reflection is a powerful tool; it allows Java to do things that would otherwise be either
impossible or require large amounts of boilerplate code.
So why would we want to avoid using a powerful tool that Java provides?
The compiler is our first form of defense against defects and the type system is one of the
most effective tools we have to document our code. We should not throw these things away
lightly.
Although they may make some attempt to take it into account, the additional possibilities
reflection creates for how a program might behave means the tools can no longer provide
rigorous guarantees that they have understood the program. In the presence of reflection
refactorings that would otherwise be safe may change program and analysis tools may
report incorrect results.
100
Do Not Use Reflection
This method is never called, I can safely delete it. Oh. Reflection.
I can safely change the behavior of this private method as I know where it is called from. Oh.
Reflection.
101
Tests
Good Tests
The testing pyramid and different levels of tests are discussed in "Agree the language you
use for tests".
As discussed in that section it is important to maintain a layered strategy with many fast
running unit tests and smaller numbers of integration and system tests. The precise
proportions that work best will vary from project to project but the pyramid formation will
hold.
Although it is important to write tests at all these levels this section mainly concerns itself
with unit tests as these are the ones we write most of and run most frequently.
102
Write Specifications Not Tests
There is an unfortunate tendency for developers to look at tests as a thing you have to do
because it's "best practice". Some extra work to be performed after the real work is done. A
chore.
If we are not writing tests that do this we should stop writing them.
When tests are viewed as a chore to be completed after the code is written only this first
point is considered.
If you have a test that is tied to the code's implementation, to change the way the code is
implemented you have to spend effort changing the test.
If the test must change whenever the implementation changes then we cannot trust that the
test will stop regression. How do we know we did not introduce a bug into the test when we
changed it?
103
Write Specifications Not Tests
Executable Specifications
So how do we make sure we do not write negative value tests? How do we make sure we
write tests that provide the benefits in our list?
To test something you only need to verify that it "does what it does". To specify you need to
describe the important things that it must do in a way that can be clearly understood.
It describes what something must do without making assumptions about how it will do it. It
allows for multiple implementations. If a specification is tied to one implementation then it is
over-specified and will have to change when the implementation does.
Specification First
One simple technique that can help is to write the specification before the code. i.e. TDD.
Next write just enough code to make that test pass (and no more).
Take a moment to see if there is a sensible refactoring that would improve the code, then
write the next test and continue the cycle.
104
Write Specifications Not Tests
Both of these practices are a good idea individually, even if they are not combined.
If we wrote our specification first, but moved in larger steps (possibly because we believed
we knew what our implementation should look like) we would realize our first advantage - a
guarantee that the code we wrote could be tested.
If code is not written with testing in mind then it can be difficult to write a test for it that fits
our definition of a unit test.
We can make our code more likely to be testable by following simple rules such as:
But even if we follow these rules we can still find that it is difficult to test our code if we have
not designed for it. Writing our specification first requires our design to consider testing.
Although we ensured our code was testable, because we moved in large steps with an
implementation in mind we might not achieve the other benefits.
If we were to write our code without first writing a test we might discover we were finished
that our code was difficult to test. The process of writing that code would however have been
easier if we had applied the second technique - moving in small steps.
If we wrote only a small amount of code before executing it and observing the result of each
small code change, we would probably spend less time debugging, be less likely to write
code we did not need and move faster over all.
Even if it is applied rigorously it is entirely possible to write terrible code and specifications.
TDD doesn't mean you can stop thinking.
Despite this, if you have a good understanding of the technologies and domain in which you
are working, TDD is usually the best approach if you wish to optimize for quality.
If you do not understand your domain or technology well you may find writing a specification
first hard.
The classic solution to this problem is to first gain understanding by producing a throw away
spike.
Spikes
105
Write Specifications Not Tests
A spike is just some quick and dirty code to explore how you might tackle the problem. At the
end of the spike you will know if that approach works well or if it is worth looking for
alternative approaches.
By producing a spike, you gain more understanding of both the domain and the technology
you are working with. Even if the conclusion at the end of the spike is that it was a poor
approach, the spike was still useful as it increased your understanding.
Once you have learned what you can from the spike, it should be thrown away and the final
code test driven using the knowledge you have gained.
An alternative is to try to stabilize the spike so that it is fit for use. If you do this, you will
usually end up with something of lower quality than if you had started again.
You will also end up spending more effort on this piece of code over the lifetime of the
project than if you had thrown the spike away.
What you gain for this loss in quality and increase in effort is a faster first delivery.
Sometimes this is a trade-off worth making, sometimes it is not.
106
Think Units Not Methods
What is a Unit?
To think in terms of units we have to first answer the difficult question of what a unit actually
is.
Testing in terms of methods is effectively the same as saying that a unit is a method. It is
easy to show why this does not always work.
If we were to try and write a unit test for the push method of java.util.Stack we might end
up with something like:
@Test
public void testPush() {
Stack<String> testee = new Stack<String>();
testee.push("foo");
assertThat(testee.pop()).isEqualTo("foo");
};
@Test
public void testPop() {
Stack<String> testee = new Stack<String>();
testee.push("foo");
assertThat(testee.pop()).isEqualTo("foo");
};
The problem we are hitting is that we have defined too small a unit. We are trying to
describe the behavior of something that is only useful when it collaborates with other units of
the same size.
If we start thinking of java.util.Stack as our unit then tests become much easier to write:
107
Think Units Not Methods
@Test
public void shouldRetrieveValuesInOrderTheyAreAdded() {
Stack<String> testee = new Stack<String>();
testee.push("a");
testee.push("b");
assertThat(testee.pop()).isEqualTo("b");
assertThat(testee.pop()).isEqualTo("a");
}
We have written a test that, instead of trying to describe what a method does, describes the
behavior of the class as a whole.
The idea that our job is to test methods is common with developers that are new to unit
testing, and is unfortunately re-enforced by some IDEs and tools that provide templates to
generate tests for each method of a class.
As we have seen, for Stack it makes far more sense to consider the behavior of the class
of a whole.
If we were to try to test the java.util.Collections class we would find that it is perfectly
reasonable to treat the sort , reverse , singleton , etc. methods as separate units. Each
one represents a self contained logical behavior.
If we were to inherit the code below without any tests what tests might we write for it?
108
Think Units Not Methods
ThingaMeBob() {
actions = Arrays.asList(new Addition(), new Subtraction());
}
109
Think Units Not Methods
@Test
public void shouldMatchWhenStringIsAdd() {
}
@Test
public void shouldNotMatchWhenStringIsNotAdd() {
}
@Test
public void shouldAddTwoNumbers() {
}
// etc
@Test
public void shouldAddTwoNumbers() {
}
@Test
public void shouldSubtractTwoNumbers() {
}
// etc
At some point we would hopefully question why this code is so over-engineered and
consider refactoring to something simpler like.
110
Think Units Not Methods
if ("minus".equals(s)) {
return a - b;
}
The answer of course is that the test which exercised all three classes through the public
interface of ThingaMeBob proved the most useful. We did not have to change it at all. When
it ran green we knew our refactoring was successful and everything still works.
We deleted the ones for Addition and Subtraction . The smaller units we created were
just implementation detail.
What if we were asked to test drive the desired behavior from scratch? What would we
write?
We would most likely write something that looked like our 2nd simpler version of
ThingaMeBob and a test that looked something like ThingAMeBobTest .
If we were then asked to add support for another 10 operations, we might leave our design
fundamentally the same.
What if a new requirement came for the behavior in ThingAMeBob to be more dynamic, with
different operations being enabled and disabled at runtime?
It would then make sense to refactor to something like our earlier more complex version.
We would already have tests written in terms of ThingaMeBob that describe all supported
behaviors. Should we also fully describe Addition , Subtraction and the other 10
operations with tests as we extract them into classes?
111
Think Units Not Methods
There is no right answer here, but I hope it is clear that the most useful unit that we have
identified is ThingaMeBob . The smaller units are part of just one implementation of the
functionality we require.
If we choose to write tests for each extracted class those tests would have some value.
The test written in terms of ThingaMeBob would do a poor job of describing what each of the
small extracted units does. If a test was failing it wouldn't be instantly obvious which class
the bug was in. If we had to change one of the extracted classes it wouldn't be instantly
obvious which test to run.
So there is definitely value in writing tests for each of the extracted classes. At the same
time, if we were not to do so, that would also be a reasonable decision and it would reduce
the cost of the refactoring.
The ThingaMeBob tests will be fast and repeatable and allow us to work easily with the code.
If we could only have tests at one level, the level we would choose is ThingaMeBob .
So, as a starting point, assume that a unit will be a class, but recognize that this is not a hard
rule.
A unit is really a "single self contained logical concern" - it may make sense to have several
classes collaborate in order to capture that concern - as long as that collaboration provides a
single well defined entry point.
Making units too large may result in tests that are difficult to understand and expensive to
maintain.
As a rule of thumb, if you might reasonably have made one or more classes inner classes of
a different class, perhaps they should be treated as a single unit.
112
Name Tests With A Specification Style
This is superfluous once you get good at writing test names, but in a mixed team it is useful
as it encourages thinking about the test in the right way.
The rest of the name should describe a behavior and, optionally, a scenario (identified by the
word When).
shouldBeEmptyWhenCreated
shouldReturnMostRecentlyAddedItemsFirst
shouldThrowAnErrorWhenItemsRemovedFromEmptyStack
Contrast this with common naming patterns found in some code bases:
emptyStack
testEmptyStack
testPush
These names alone tell us nothing about how a Stack should behave.
isEmptyWhenCreated
returnsItemsInOrderTheyAreAdded
throwsAnErrorWhenItemsRemovedFromEmptyStack
Although more verbose the formulaic should form has an advantage - it provides a clear
pattern to follow.
If a developer knows that a test name must start with should (often because they have seen
this pattern within existing tests) it is hard for them to revert to a different style and write a
test that is not a proposition.
The verbosity of should pays for itself by forcing developers to think about tests in the right
fashion.
Kevlin Henney compares shoulds to training wheels on a bike - a device to help while we are
learning.
113
Name Tests With A Specification Style
This depends on the makeup of the team and how often the team changes.
If the majority of people who are likely to work on the codebase over its lifetime are
accustomed to writing tests in this style then the added verbosity is not worth it. If a
sufficiently large proportion are not then it is probably best for the team to stick with the
convention.
If you're confident the answer is "no" to both then switch to a different style - example style.
In example style the name describes only the "When" part. It does not describe the expected
behavior, e.g.:
emptyStack
oneItemAdded
removalFromEmptyStack
To understand tests named with the example style, you must read the code within the tests.
For this reason, this specification style should be preferred when possible.
On a practical level this avoids the extra overhead of updating test names if method names
are ever refactored.
More subtly, including names can make you think in the wrong fashion - verifying method
implementation rather than specifying unit behavior.
This is not a hard rule - sometimes it will be difficult or impossible to describe a meaningful
behavior without referring to the unit's interface.
The domain language may also overlap with the method names, so you may find yourself
using the same words as are also used as a method name.
114
Name Tests With A Specification Style
115
Pick Examples Carefully
The overall behavior of the component or unit is explained by supplying a series of example
input and output values, or example interactions with other components.
One approach is to look at the possible inputs to the component under test.
We could fully specify our code if we provided the expected output for each possible input.
Usually, this is not practical because the possible range of inputs is far too large. Instead, we
can look for categories of values within the possible range of inputs (e.g valid and invalid)
and pick an example from each one.
However, the best approach is usually not to think in terms of possible inputs and examples,
but to instead think first of the behaviors we would like our code to exhibit.
Once we have identified the behavior we can then pick examples that demonstrate it. The
actual values used are often unimportant - "Make tests easy to understand" discusses some
techniques to make unimportant values less prominent in tests and highlight the important
ones.
Properties are identified that must hold true for all inputs or for a subset of possible inputs
that meet certain criteria. The tests do not contain any example values - just a description of
how they must be constrained. The examples used to check the properties are generated
randomly and only ever seen if the check fails.
The tests describe what is important about the input values. In example testing this
must be inferred by the reader
The tests will automatically find edge cases and bad assumptions made by the
programmer
116
Pick Examples Carefully
There is currently little experience with property-based testing in the Java community, so
questions remain on how best to use it.
One obvious issue is that it introduces randomness, although most frameworks provide
some mechanism to control it and repeat test runs.
A good rule thumb is that test cases covering 0 (or empty), 1 and "many" are likely to be
necessary. There will also be important edge cases, e.g. algorithmic code dealing with
integers might need to consider Integer.MAX and Integer.MIN .
The zero, one many rule defines the minimum number of cases you can hope to consider.
To properly describe your code's behavior will likely require many more.
When test driving, it is usually easiest to start with the zero test case.
Multiple assertions within a test may be an indicator that the test is testing more than one
thing. Multiple assertions should be treated with suspicion, but are not necessarily a problem
e.g.
@Test
public void shouldReturnItemsInOrderTheyWereAdded() {
ArrayDeque<String> testee = new ArrayDeque<String>();
testee.add("foo");
testee.add("bar");
assertEquals("foo",testee.pop());
assertEquals("bar",testee.pop());
}
This test tests only one concern, but uses multiple asserts to do so.
117
Pick Examples Carefully
Once you've tested a concern, don't let it leak into other tests - if you do then those tests are
no longer testing only one thing.
If the contract ever changes so that this side effect is not longer important, all tests will need
to be updated.
Although we shouldn't let a property leak into test cases where it does not belong this does
not necessarily mean that it will be confined to a single test case. It may take several
examples to fully demonstrate a property.
118
Make Tests Easy To Understand
We achieve this in part by choosing clear specification style names for each test case, but
we must also ensure that the code implementing each test case is easy to understand.
Given - create the values and objects required for the test
When - executes the code under test
Then - verifies the output/behavior is as expected
These stages are also sometimes called arrange, act and assert by people particularly
attached to the letter 'a'.
Although it is important that these three stages are visible, trying to rigorously separate them
or label them with comments adds noise to a test.
Bad
@Test
public void shouldRetrieveValuesInOrderTheyAreAdded() {
// given
Stack<String> testee = new Stack<String>();
String expectedFirstValue = "a";
String expectedSecondValue = "b";
// when
testee.push(expectedFirstValue);
testee.push(expectedSecondValue);
String actualFirstValue = testee.pop();
String actualSecondValue = testee.pop();
// then
assertThat(actualFirstValue).isEqualTo(secondValue);
assertThat(actualSecondValue).isEqualTo(firstValue);
}
Better
119
Make Tests Easy To Understand
@Test
public void shouldRetrieveValuesInOrderTheyAreAdded() {
Stack<String> testee = new Stack<String>();
testee.push("a");
testee.push("b");
assertThat(testee.pop()).isEqualTo("b");
assertThat(testee.pop()).isEqualTo("a");
}
If the unit you are testing is a class make this clear by always naming it testee within a
test.
If you need to store an expected value in a variable, call it expected (but don't store it in a
variable just for the sake of it).
If you need to store a result that you will compare against an expected value in variable,
name it actual (but don't store it in a variable just for the sake of it).
If an aspect of the input is important to the test case, highlight it by setting it explicitly in the
test case - don't rely on that value being set in a generic setup method.
Even if the same value is set by default, it is better to re-supply it in the test so it is clearly
visible.
If a particular value is not important, indicate this to the reader by using well-known neutral
values such as "foo" for strings, or use clear names such as someInt or anInt for
variables and methods that supply values.
120
Make Tests Easy To Understand
Bad
@Test
public void shouldXXX() {
MyClass testee = new MyClass();
assertThat(testee.process(0, "", 3))
.isEqualTo(Status.FAIL);
}
Better
@Test
public void shouldXXX() {
int invalidValue = 3;
MyClass testee = new MyClass();
assertThat(testee.process(anInt(), aString(), invalidValue))
.isEqualTo(Status.FAIL);
}
Why is it important that the testee below returns the enum CONTINUE ?
Bad
@Test
public void shouldXXX() {
assertThat(testee.process()).isEqualTo(CONTINUE);
}
@Before
public void setup() {
MyClass testee = new MyClass();
testee.setDefaultProcessState(CONTINUE);
}
Other tests might not need to care about what the default state is, but this test does so we
should write:
Better
121
Make Tests Easy To Understand
@Test
public void shouldXXX() {
testee.setDefaultProcessState(CONTINUE);
assertThat(testee.process()).isEqualTo(CONTINUE);
}
As we start to deal with more complex domain objects, it becomes harder to separate the
important values from the ones that are required to construct valid objects but not of
particular interest to our test. Fortunately, we can use the builder pattern to ease the pain,
reduce duplication, and keep the tests readable.
instead of:
Copy and paste coding is bad in tests as well as production code - the more code there is,
the harder it is to read and a change to a concern will result in shotgun surgery if it has been
duplicated throughout the tests.
Test code must tell more of a story - highlighting what is important and hiding what is not.
Test code should not be as DRY ( Don't Repeat Yourself ) as production code. It should be
DAMP ( contain Descriptive And Meaningful Phrases ).
If refactoring a small amount of code out of a test method into a shared method hides what
is happening, accept the duplication and leave it in place. If it does not affect readability then
refactor mercilessly.
122
Make Tests Easy To Understand
Although JUnit allows you to supply an assertion message this adds noise to the test. Like
comments, these messages should be saved for those occasions when you cannot
communicate using code alone.
Bad
Good
assertEquals(2, actual);
The built in assertions are fairly limited. Alternative assertion libraries such as AssertJ
provide richer functionality and result in more readable code.
123
Understand How To Use Mocks And Stubs
Worker code does stuff. We can test worker code with state based testing - i.e. asserting
that expected values are returned from methods, or objects are left in expected states.
Manager code is harder to test than worker code because we need to make a choice - do
we try to infer its behavior from its outputs using state based testing, or do we use
interaction based testing?
In interaction based testing, we check that objects talk to each other in the expected fashion.
To do this we need to somehow eavesdrop on the conversation. This is achieved by using
objects that impersonate real ones.
Mocking Frameworks
Although it is common to refer to all objects created by a mocking framework as mocks this
is inaccurate.
Dummy object - needs to be present to satisfy a type signature but is never actually
used
Stub - must be present and may supply indirect inputs
Mock - verifies that expected interactions take place
Fake - like a real thing but less heavy - e.g an in memory database
Spy - object that records its interactions with others
Of these only stubs, mocks and spies might be created by a mocking framework.
We will talk about spies in a moment, but most test doubles can be conceptually viewed as
being either a stub or a mock.
The important difference between them is that a mock has an expectation that will cause a
test to fail if it is not met. i.e. if an expected method is not called on a mock the test will fail.
124
Understand How To Use Mocks And Stubs
A stub does not care if it is called or not - its role is simply to supply values.
Traditional Mocks present a code readability dilemma. They define an expected outcome (a
then), but are also part of the fixture required for the test to execute (a given).
// given / arrange
Subscriber subscriber = context.mock(Subscriber.class);
Publisher publisher = new Publisher();
publisher.add(subscriber);
// when / act
publisher.publish(message);
// then / assert
context.assertIsSatisfied();
Spies
Spies record their interactions with other objects.
In practice this means that Spies act as stubs by default, but as mocks when we want them
to.
125
Understand How To Use Mocks And Stubs
// given
Subscriber subscriber = Mockito.mock(Subscriber.class);
Publisher publisher = new Publisher();
publisher.add(subscriber);
String message = "message";
// when
publisher.publish(message);
// then
Mockito.verify(subscriber).receive(message);
When spies act as mocks that must also supply indirect inputs, it is best to make them as
forgiving as possible when supplying values but as specific as possible when verifying.
Lets imagine that, for some reason, the subscribers in our example had to return a positive
integer in order for the code to execute without error. Perhaps there is some sort of assert
statement in the code:
publisher.publish(message);
Mockito.verify(subscriber).receive(message);
We will not discuss the Mockito API in any detail here, but this line:
Mockito.when(subscriber.receive(message)).thenReturn(1);
126
Understand How To Use Mocks And Stubs
Ensures that when the receive method is called on the spy with a string that equals the
message variable, it will return 1 .
If this line was not present the spy would do what Mockito does by default, which is to return
0 .
What would our test do if, due to a bug, receive was called with a different string?
Mockito.verify(subscriber).receive(message);
It would throw an error before it reached this point because the assertion in our production
code would trigger.
Mockito.when(subscriber.receive(anyString())).thenReturn(1);
This pattern of being lenient when supplying values, but specific when verifying also tends to
result in tests that are less brittle when things change.
It can be tempting to also use a mocking framework to stub values instead of using their
constructors and modifier methods. For complex objects using stubs can appear easier than
constructing real ones.
Don't do this.
Mocking frameworks should be used only to isolate our tests from objects with behavior. If
you have values that are difficult to construct consider the test data builder pattern instead
127
Understand How To Use Mocks And Stubs
Sometimes there is no choice about which to use. For example, it is not possible to
meaningfully specify how a cache should behave from its inputs and outputs alone. Other
times we must weigh the pros and cons.
A state-based test for manager code is likely to be less easy to read and understand as it
must rely on the behaviors of the objects the SUT interacts with. The test will also be
coupled to these behaviors and will require changes if those behaviors change - you have
effectively increased the size of the "unit" you are testing as discussed in "Think units not
methods".
Interaction-based testing requires us to peek beyond the unit's external interface and into its
implementation. This carries the risk that we might over-specify and create an
implementation-specific test.
On balance, it is preferable to lean towards state based testing and where possible enable it
in the design of your code. There will, however, be many situations in which you will decide
that interaction based testing is preferable.
128
Understand Your Options For Code Reuse
When people start programming in an OO language for the first time they tend to over-use
inheritance for this purpose before discovering that composition is generally a better idea.
Unfortunately, it is not easy to use composition to reuse code in JUnit tests and this can lead
you to write difficult-to-maintain test class hierarchies.
A small amount of duplication may be preferable to introducing a class hierarchy when other
options do not exist, but some types of test can be reused without inheritance.
Assertions
Code related to assertions is straightforward to reuse outside of class hierarchies. This can
be done trivially, by creating classes containing static assert methods that can be statically
imported (as the built in JUnit assertions now are), or more elegantly by creating custom
matchers for hamcrest or AssertJ.
Object Creation
For small, simple objects, the mother pattern can be used, but this can quickly become a
maintenance issue if the objects become more complex over time.
A better pattern is the Builder pattern, this can have the added advantage of allowing tests to
clearly highlight important and unimportant input.
Repeated Behaviors
If you are using JUnit then repeated section of code within a test can be packaged and re-
used as custom rules.
129
Write Repeatable Tests
If your test does any of these things then it is not a unit test. This is not to say that your test
is not valuable.
130
Only Unit Test Code It Makes Sense To Unit Test
If a compatible in-memory fake database is available then it can be meaningfully unit tested
against that. If no fake is available, there is no value in writing tests that mock out the JDBC
driver - the first level of testing should instead be integration testing against a real database.
There is also little value in explicitly specifying the behavior of very simple boiler plate code
such as get/set methods. The expected behavior is clear without the presence of a test and
their actual behavior ought to be verified by other tests that use the code while testing more
complex logic. If code coverage indicates that these methods have not been executed by
other tests perhaps you can delete them?
131
Testing FAQS
Testing FAQ
How Do I Test a Private Method?
You don't test methods (private or public), you test the behavior of a unit as a whole.
If you cannot exercise the logic of a private method via the public interface, is that logic
actually required? If it is required, and is sufficiently complex that it is causing you testing
pain, then perhaps you should extract that concern into a separate unit that can be tested in
isolation and injected in via the constructor?
If the method is void, it must be performing some sort of side effect that can be checked by
either state testing or interaction testing.
For example, if you are trying to 'test the add method' of collection class, you should
probably instead be writing tests like:
@Test
public void shouldIncreaseInSizeWhenItemsAdded() {
Collection testee = new ArrayList();
assertEquals(0, testee.size());
testee.add("itemA");
assertEquals(1, testee.size());
testee.add("itemB");
assertEquals(2, testee.size());
}
A good solution is to inject a strategy for retrieving the date/time into your class as a
dependency.
Java 8 provides the java.time.Clock class which can be used for this purpose.
132
Testing FAQS
The static factory method fixed will create an instance that represents a constant time.
Other methods provide implementations suitable for production use.
Java 7 does not provide an out of the box class for this purpose so you will need to roll your
own.
This is not a requirement for vanilla JUnit 4 tests, but it is possible that you may need to do
so if you are using a custom runner.
@Test(expected = FooException.class)
public void shouldThrowFooExceptionWhenFeelsLikeIt
Is concise and suffices for simple scenarios, but has a gotcha. If the test method exercises
more than one method of the testee, the expectation applies to the whole test method rather
than the specific interaction with the testee that is expected to throw it.
If data held within the exception is important, it is also not possible to assert on it with this
method.
133
Testing FAQS
@Test
public void shouldThrowFooExceptionWhenFeelsLikeIt() {
try {
testee.doStuff();
fail("Expected an exception");
} catch (FooException expectedException) {
assertThat(expectedException.getMessage(), is("felt like it"));
}
}
This is easy to follow, but a little verbose. It is also easy to forget to include the call to
fail() if you are not test driving your code.
JUnit now provides an alternate solution in the form of the 'ExpectedException' method rule.
This allows for more fine grained exception checking:
@Rule
public ExpectedException thrown= ExpectedException.none();
@Test
public void foo() throws IOException {
thrown.expect(FooException.class);
thrown.expectMessage("felt like it");
testee.doStuff();
}
This is more concise, but breaks the usual given/when/then flow of a test by moving the then
part to the start of the method.
For Java 8 AssertJ provides some custom assertions that can be used without breaking this
flow.
@Test
public void testException() {
assertThatThrownBy(() -> { testee.doStuff(); })
.isInstanceOf(Exception.class)
.hasMessageContaining("felt like it");
}
Although it maintains the flow, the lambda in which the testee is called looks a little ugly.
When it can be used we recommend sticking with the concise expected = format. For more
complex situations it is largely a matter of taste.
134
Testing FAQS
An abstract class is just a dependency that some other code will use - a dependency that
you have made harder than usual to isolate due to your choice to make it an abstract class.
So first off, would your design look better if the functionality was being re-used by
composition rather than inheritance?
Assuming that you can't improve your design by getting rid of the abstract class you can
either:
Treat it as an implementation detail and check that each of its clients behaves as
expected.
Test it in isolation by creating an anonymous concrete class
The first approach will result in tests that are less tied to the implementation, but there will be
repetition between the tests for each subclass.
The second approach will avoid repetition but is tied to the implementation and is likely to be
brittle.
http://www.jqno.nl/equalsverifier/
It checks that a class fulfills the hashcode-equals contract with a single line test that is trivial
to write:
@Test
public void shouldObeyHashCodeEqualsContract() {
EqualsVerifier.forClass(MyValue.class).verify();
}
It does, however, do a very thorough job of checking the contract - including how it interacts
with inheritance. It is non-trivial to make a non-final class conform to the contract.
Although equals verifier does a good job of checking the hashcode equals contract, it has no
knowledge of how you expect the methods to actually behave. If you wish equality to (for
example) be defined by a single ID field only, you must write additional tests that verify this
behavior.
For the common scenario of a class that should be considered equal based on all of its fields
the behavior may be checked in a single test:
135
Testing FAQS
@Test
public void shouldObeyHashCodeEqualsContract() {
EqualsVerifier.forClass(MyValue.class).allFieldsShouldBeUsed().verify();
}
This may become the default behavior in a future version of EqualsVerifier, but must be
specifically specified in 1.7.5
136
Bad advice
Bad Advice
Some truly terrible ideas are commonly circulated as "best practice". In this section we
provide examples and explain why they should be ignored.
137
Single Exit Point Rules
Details
Single exit point is an idea with a long history dating back to the era of liberally applied gotos
and spaghetti code.
In that context, adding constraints on what could happen within a function was helpful.
Knowing that there is only one point that a large function can exit from makes it easier to
understand.
Many modern functional languages continue to either enforce or encourage single exit
points.
Lets look what happens when we are told we must only have one exit point:
if (x == 10) {
retVal = -value;
} else if (x > 0) {
retVal = value + x;
}
return retVal;
}
Multiple exit
138
Single Exit Point Rules
if (x > 0) {
return value + x;
}
return 0;
}
}
There isn't much in it, but the multiple exit point version is easier to comprehend.
Trying to apply the single exit point constraint resulted in an additional local variable to hold
return state. In the multi exit version we can clearly see what is returned when none of the
conditions match. In the single exit version it is slightly less clear as the returned value is
declared at the start of the method then overwritten.
So does this mean that single exit point methods are bad?
No.
if (x == 10) {
retVal = -value;
} else if (x > 0) {
retVal = value + x;
} else {
retVal = 0;
}
return retVal;
}
We've addressed the issues we identified earlier. We only assign to retVal once and it is
clear what is assigned when none of the conditions match.
139
Single Exit Point Rules
Not really.
We have switched from using a statement (if) to working with expressions (i.e. things that
return a value). This allows us to get rid of the additional variable while maintaining a single
exit point.
Is this version clearer than the multi-exit version? That is debatable and ultimately a matter
of personal taste.
The code using ? is terse and some will find it harder to understand.
The multi-exit version is more verbose but its proponents would argue it is easier to
comprehend.
If your personal preference is for the ? operator version, it still does not follow that the
single exit point rule is something you should try to universally apply.
The most likely result is that you will push people towards writing code like the earlier
bloated version of our method. As Java is a largely statement-based language, you will also
encounter logic where the multi-exit version is undeniably clearer.
Martin Fowler and Kent Beck express things nicely in "Refactoring: Improving the Design of
Existing Code"
". . . one exit point is really not a useful rule. Clarity is the key principle: If the method is
clearer with one exit point, use one exit point; otherwise don't"
There is nothing wrong with single exit functions, but only write them when it makes sense to
do so.
140
Always Use A StringBuffer
The performance difference is unlikely to be significant in most cases, but the resulting code
isn't noticeably less-readable - so it is a premature optimization without a cost.
Lets see what happens when we apply this advice when no loop is present:
Is it less efficient?
141
Always Use A StringBuffer
NEW java/lang/StringBuilder
DUP
LDC "Foo"
INVOKESPECIAL java/lang/StringBuilder.<init> (Ljava/lang/String;)V
ALOAD 1
INVOKEVIRTUAL java/lang/StringBuilder.append /
(Ljava/lang/String;)Ljava/lang/StringBuilder;
ILOAD 2
INVOKEVIRTUAL java/lang/StringBuilder.append (I)Ljava/lang/StringBuilder;
INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
ARETURN
A StringBuilder is created by the compiler behind the scenes to handle the concatenation
so our simpler cleaner code produces identical bytecode to the more verbose option.
The presence of loops in the code may prevent the compiler performing this optimization,
but code without branches will be optimized every time. Although compilers may exist that
do not support this optimization it is unlikely that you will ever use them.
142
Hungarian Notation
For example:
bFlag
nSize
m_nSize
Where b indicates a Boolean type, n an integer type and m_ that the named variable is
a field.
Such notation might be useful if you are reading code printed to paper, but all the information
it provides is readily available in a modern IDE.
Naming things is hard enough without adding additional concerns that the name must
handle.
These types of notation are like comments. They add noise and must be maintained in
tandem with the information they duplicate. If extra effort is not spent to maintain them they
become misleading.
"nowadays, HN and other forms of type encoding are simply impediments. They make it
harder to change the name or type of a variable, function, member or class. They make
it harder to read the code. And they create the possibility that the encoding system will
mislead the reader"
143