When dealing with a long method, the simplest thing you can usually do is break it into its conceptual pieces and turn them into a series of functions, each of which tail-calls the function for the next chunk of the computation.
So given this:
[Two code listings not preserved: the original `methodThatIsTooLong`, then the same method split into a chain of `doChunk` calls.]
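As a rough illustration of the chained shape, here is a minimal sketch. Only the names `methodThatIsTooLong` and `doChunk1` through `doChunk3` come from this post; the class name, the parameters and the averaging logic are hypothetical stand-ins chosen to keep the example small.

```java
import java.util.List;

// Hypothetical sketch: ChainedReport, its parameters and its logic are invented for illustration.
final class ChainedReport {
    // Entry point: it only starts the chain; the real work is spread over the chunks.
    static String methodThatIsTooLong(List<Integer> values, String label) {
        return doChunk1(values, label);
    }

    // Chunk 1 sums the values, then tail-calls chunk 2.
    static String doChunk1(List<Integer> values, String label) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return doChunk2(total, values.size(), label);
    }

    // Chunk 2 computes the average. It never reads label; it only forwards it to chunk 3.
    static String doChunk2(int total, int count, String label) {
        double average = count == 0 ? 0.0 : (double) total / count;
        return doChunk3(average, label);
    }

    // Chunk 3 is the first place label is actually used.
    static String doChunk3(double average, String label) {
        return label + ": " + average;
    }

    public static void main(String[] args) {
        System.out.println(methodThatIsTooLong(List.of(1, 2, 3), "avg")); // prints "avg: 2.0"
    }
}
```

Note that every chunk returns the final `String`, so the only observable result of `doChunk1` or `doChunk2` is the output of the entire rest of the chain.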
The major downsides to breaking down a large method this way are:
- The code can be a bear to test, as you need fully fleshed-out `OtherTypes` and need to check vast details of `Things`. As we know, in long methods small perturbations in the inputs can often cause very unexpected changes in the outputs, so tests will often miss things they shouldn’t. You can test bottom-up (from `chunk1`), but it will still be hard to ensure you’ve covered the cases you care about.
- The code, while easier to understand than `methodThatIsTooLong`, is still not easy to understand in isolation. That is, you still need to understand the whole flow to know what the pieces are doing.
- Data values that `doChunk3` and later chunks need have to be passed through `doChunk2`, even though `doChunk2` could not care less about them. This makes the pass-through functions more difficult to understand, as you have to realize that these values are only being forwarded and are not actually used by the function itself.
How else to break up a function like this? It’s better to break it into self-contained functions with a top-level function that coordinates them. That would look more like this:
[Code listing not preserved: the top-level function coordinating the self-contained `doChunk` functions.]
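A minimal sketch of the coordinated shape might look like the following. As before, only `methodThatIsTooLong` and the `doChunk` names come from this post; `CoordinatedReport`, the `Totals` intermediate type and the averaging logic are hypothetical.

```java
import java.util.List;

// Hypothetical sketch: CoordinatedReport, Totals and the logic are invented for illustration.
final class CoordinatedReport {
    // Intermediate value with named fields, rather than a generic Pair.
    static final class Totals {
        final int sum;
        final int count;

        Totals(int sum, int count) {
            this.sum = sum;
            this.count = count;
        }
    }

    // Top-level function: the only place that knows the order of the steps.
    static String methodThatIsTooLong(List<Integer> values, String label) {
        Totals totals = doChunk1(values);
        double average = doChunk2(totals);
        return doChunk3(average, label);
    }

    // Leaf: sums and counts the values. It knows nothing about label.
    static Totals doChunk1(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return new Totals(sum, values.size());
    }

    // Leaf: turns the totals into an average.
    static double doChunk2(Totals totals) {
        return totals.count == 0 ? 0.0 : (double) totals.sum / totals.count;
    }

    // Leaf: the only function that receives label, because it is the only one that uses it.
    static String doChunk3(double average, String label) {
        return label + ": " + average;
    }

    public static void main(String[] args) {
        System.out.println(methodThatIsTooLong(List.of(1, 2, 3), "avg")); // prints "avg: 2.0"
    }
}
```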
This has several strong advantages:
- The `doChunk` methods can now be tested independently, as can their intermediate values.
- Because of this, the testing surface of the whole is much smaller, as the input and output values are now more tightly constrained.
- While you may have to define intermediate value POJOs (and for the love of everything good don’t use `Pair`), the intermediate values are now well defined, which simplifies understanding of the code.
- In languages with multiple return values, the extra return types for the intermediate values can be skipped, which can make the refactoring easier to do in the first place, even though real intermediate objects do make for more readable code in the end.
- The only arguments that need to be passed to each function are the data items needed by its own computation, not all the data the rest of the whole computation needs.
- It tends to reveal more opportunities for further refactorings, limiting mutability, rainbows and unicorns.
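To make the testing point concrete, here is a small sketch of checking one leaf chunk on its own. The `Totals` type, the averaging logic and the `check` helper are hypothetical; the point is only that the leaf can be exercised with a hand-built intermediate value and asserted on directly, without running any other chunk.

```java
// Hypothetical sketch: Totals, the averaging logic and check() are invented for illustration.
final class LeafChunkCheck {
    // Well-defined intermediate value that a test can construct by hand.
    static final class Totals {
        final int sum;
        final int count;

        Totals(int sum, int count) {
            this.sum = sum;
            this.count = count;
        }
    }

    // A leaf chunk: takes one intermediate value and returns another.
    static double doChunk2(Totals totals) {
        return totals.count == 0 ? 0.0 : (double) totals.sum / totals.count;
    }

    // Plain checks, so the sketch needs no test framework.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        check(doChunk2(new Totals(6, 3)) == 2.0, "typical input");
        check(doChunk2(new Totals(0, 0)) == 0.0, "empty input must not divide by zero");
        check(doChunk2(new Totals(-4, 2)) == -2.0, "negative sum");
        System.out.println("all leaf checks passed");
    }
}
```

In a chained version, the same edge cases could only be reached by constructing full inputs for the first chunk and inspecting the final output.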
So in short, refactoring to leaf methods rather than chained tail calls is a good thing.
See here for other posts on refactoring.