Small Functions considered Harmful
In this post, I aim to:
— Shed light on some of the presumed benefits of small functions
— Explain why I personally think some of the benefits don’t really pan out as well as advertised
— Explain why small functions can actually prove counterproductive sometimes
— Explain the times when I do think smaller functions truly shine
General programming advice invariably seems to extol the elegance and efficacy of small functions. The book Clean Code, often considered something of a programming bible, has a chapter dedicated to functions alone, and the chapter begins with an example of a truly dreadful function that also happens to be long. The book goes on to lay blame on the length of the function as its most grievous offense, stating that:
Not only is it (the function) long, but it’s got duplicated code, lots of odd strings, and many strange and inobvious data types and APIs. Do you understand the function after three minutes of study? Probably not. There’s too much going on in there at too many different levels of abstraction. There are strange strings and odd function calls mixed in with doubly nested if statements controlled by flags.
The chapter briefly ponders what qualities would make the code “easy to read and understand” and “allow a casual reader to intuit the kind of program they live inside”, before declaring that making the function smaller will necessarily achieve this purpose.
The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.
The idea that functions should be small is something that is almost considered too sacrosanct to call into question. It often gets trotted out during code reviews, on Twitter discussions, conference talks, books and podcasts on programming, articles on best practices for refactoring code and so forth. This idea made its merry way into my timeline again a few days ago in the form of this tweet:
Fowler, in his tweet, links to his article on function length, where he goes on to state that:
If you have to spend effort into looking at a fragment of code to figure out what it’s doing, then you should extract it into a function and name the function after that “what”.
Once I accepted this principle, I developed a habit of writing very small functions - typically only a few lines long. Any function more than half-a-dozen lines of code starts to smell to me, and it’s not unusual for me to have functions that are a single line of code.
Some people seem so enamored with small functions that the idea of abstracting any and every piece of logic that might seem even nominally complex into a separate function is something that is passionately advocated for.
I’ve worked on codebases inherited from folks who’d internalized this idea to such an unholy extent that the end result was pretty hellish and entirely antithetical to all the good intentions the road to it was paved with. In this post, I hope to explain why some of the oft-touted benefits don’t always pan out the way one hopes and the times when some of the ideas can actually prove to be counterproductive.
Supposed benefits of smaller functions
A number of reasons usually get wheeled out to prove the merit behind smaller functions.
Do one thing
The idea is simple — a function should only ever do one thing and do it well. On the face of it, this sounds like an extremely sound idea, in tune, even, with the Unix philosophy.
The bit where this gets murky is when this “one thing” needs to be defined. The “one thing” can be anything from a simple return statement to a conditional expression to a piece of mathematical computation to a network call. As it so happens, many a time this “one thing” means a single level of abstraction of some (often business) logic.
For instance, in a web application, a CRUD operation like “create user” can be “one thing”. Typically, at the very least, creating a user entails creating a record in the database (and handling any concomitant errors). Additionally, creating a user might also require sending them a welcome email. Furthermore, one might also want to trigger a custom event to a message broker like Kafka to feed this event into various other systems.
Thus, a “single level of abstraction” isn’t just a single level. What I’ve seen happen is that programmers who’ve completely bought in to the idea that a function should do “one thing” tend to find it hard to resist the urge to apply the same principle recursively to every function or method they write.
Thus, instead of a reasonably airtight abstraction that can be understood (and tested) as a single unit, we now end up with even smaller units that’ve been carved out to delineate each and every component of “the one thing” until it’s fully modular and entirely DRY.
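To make the contrast concrete, here is a minimal Python sketch of the “create user” example. Everything in it is hypothetical: the in-memory `db`, `outbox` and `events` containers merely stand in for a real database, email service and message broker, and all function names are invented for illustration.

```python
# Hypothetical in-memory stand-ins for a database, an email service and a broker.
db: dict = {}
outbox: list = []
events: list = []


def create_user(email: str, name: str) -> dict:
    """'Create user' kept as one unit that reads top to bottom."""
    if email in db:
        raise ValueError(f"user {email} already exists")
    user = {"email": email, "name": name}
    db[email] = user  # persist the record
    outbox.append((email, f"Welcome, {name}!"))  # queue the welcome email
    events.append({"type": "user_created", "email": email})  # publish an event
    return user


# The same behavior after applying "do one thing" recursively. Each helper is
# trivial, but following the flow now means visiting five more definitions.
def ensure_user_does_not_exist(email: str) -> None:
    if email in db:
        raise ValueError(f"user {email} already exists")


def build_user_record(email: str, name: str) -> dict:
    return {"email": email, "name": name}


def persist_user_record(user: dict) -> None:
    db[user["email"]] = user


def queue_welcome_email(user: dict) -> None:
    outbox.append((user["email"], f"Welcome, {user['name']}!"))


def publish_user_created_event(user: dict) -> None:
    events.append({"type": "user_created", "email": user["email"]})


def create_user_fragmented(email: str, name: str) -> dict:
    ensure_user_does_not_exist(email)
    user = build_user_record(email, name)
    persist_user_record(user)
    queue_welcome_email(user)
    publish_user_created_event(user)
    return user
```

Both versions behave identically. The difference lies only in how many definitions a reader has to visit, and how many names they have to hold in their head, before they can see what creating a user actually involves.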
The fallacy of DRY
DRY and a propensity to make functions as small as possible aren’t necessarily the same thing, but I’ve observed that the latter does many a time lead to the former. DRY, in my opinion, is a good guiding principle, but an awful lot of times, pragmatism and reason are sacrificed at the altar of a dogmatic adherence to DRY, especially by programmers of the Rails persuasion.
Raymond Hettinger, a core Python developer, has a fantastic talk called Beyond PEP8: Best practices for beautiful, intelligible code. This talk is a must-watch, not just for Python programmers but for anyone interested in programming or who programs for a living, because it very incisively lays bare the fallacies of a dogmatic adherence to PEP8, which is the Python style guide many linters implement. That the talk focuses on PEP8 matters less than the rich insights one can take away from it, many of which are language agnostic.
Even if you don’t watch the entire talk, you should watch this one minute of the talk which draws a frighteningly accurate analogy to the siren call of DRY. Programmers who insist on DRYing up as much code as possible risk not seeing the forest for the trees.
My main problem with DRY is that it coerces one into abstractions, nested and premature ones at that. Inasmuch as it’s impossible to abstract perfectly, the best we can do is abstract well enough. Defining “well enough” is hard and is contingent on a large number of factors.
In the following diagram, the word “abstraction” can be used interchangeably with “function”. For instance, if we’re considering how best to design the abstraction layer A, we might need to weigh the following:
— the nature of the assumptions underpinning abstraction A and how likely (and for how long) they are likely to hold water
— the extent to which the layers of abstractions underlying abstraction A (abstraction X and abstraction Y) as well as any abstraction built on top of abstraction A (abstraction Z) are prone to remain consistent, flexible, extensible and correct in their implementation and design
— the requirements and expectations of any future abstractions (abstraction M) that might be built on top of the abstraction A and any abstraction that might need to be supported beneath A (abstraction N)
The abstraction A we develop is inevitably going to be subject to constant reassessment in the future and in all likelihood partial or even complete invalidation as well. The one overarching feature that will stand us in good stead for the inevitable modification that’d be needed is to design our abstraction to be flexible.
DRYing up code to the fullest extent possible right now would mean depriving our future selves of the flexibility to accommodate any changes that might be required. What we really should be optimizing for is to allow ourselves enough leeway to make the inevitable changes that will be required sooner or later instead of optimizing for the perfect fit straight away.
The best abstraction is an abstraction that optimizes for good enough, not perfect. That’s a feature, not a bug. Understanding this very salient nature of abstractions is the key to designing good ones.
Alex Martelli, the coiner of the phrase duck-typing and famous Pythonista, has a famous talk titled The Tower Of Abstraction, and the slides are well worth a read.
The fabulous Rubyist Sandi Metz has a famous talk called All The Little Things, where she posits that “duplication is far cheaper than the wrong abstraction”, and thus to “prefer duplication over the wrong abstraction”.
Abstractions, in my opinion, can’t ever be entirely “right” or “wrong” since the line demarcating “right” from “wrong” is inherently blurred and ever-changing. Our carefully handcrafted artisanal “perfect” abstraction is only one business requirement or bug report away from being consigned to the status of “wrong”.
I think it helps to view abstractions as a spectrum as shown in the diagram we saw earlier in this post. One end of this spectrum optimizes for precision, where every last aspect of our code needs to be exactly precise. This certainly has its fair share of benefits but doesn’t serve well for designing good abstractions since it strives for a perfect alignment. The other end of this spectrum optimizes for imprecision and the lack of boundaries. While this does allow for maximal flexibility, I find this extreme to be prone to other drawbacks.
As with most other things, “the ideal” lies somewhere in between. There is no one-size-fits-all happy medium. The “ideal” also varies depending on a vast number of factors — both programmatic and interpersonal — and the hallmark of good engineering is to be able to identify where in the spectrum this “ideal” lies for any given context, as well as to constantly re-evaluate and recalibrate this ideal. Sometimes, given a context, there truly does exist the one best way of doing something. But this context can change anytime and so can this “best way”.
The name of the game
Speaking of abstraction, once it’s decided what to abstract and how, it’s important to give it a name.
And naming things is hard.
It’s considered something of a truism in programming that giving things longer, more descriptive names is a good thing, so much so that some even advocate for replacing comments in code with a function bearing the name of the comment. The idea here is that the more descriptive a name, the better the encapsulation. It’s not uncommon to find codebases littered with functions bearing extremely long names.
This might fly in the Java world where verbosity is the norm, but I’ve never particularly found code with such lengthy names easy to read. What could’ve been, say, 4–5 lines of code is now stashed away in a function that bears an extremely long name. When I’m reading code, seeing such a verbose word pop up suddenly gives me pause as I try to process all the different syllables in this function’s name, try to fit it into the mental model I’ve been building thus far and then decide whether or not to investigate the function in greater detail by jumping to its definition and reading the implementation.
The problem with “small functions” though, is that the quest for small functions ends up begetting even more small functions, all of which tend to be given extremely verbose names in the spirit of making code self documenting and eschewing comments.
As a result, the cognitive overhead of processing the verbose function (and variable) names, mapping them into the mental model I’ve been building so far, deciding which functions to dig deeper into and which to skim, and piecing together the puzzle to uncover the “big picture” becomes rather difficult.
Smaller functions require the programmer to write more functions, which in turn requires coming up with more names for those functions. Personally, I find the keywords, constructs and idioms offered by the programming language much easier to process visually than custom variable or function names. For instance, when I’m reading an if-else block, I rarely ever spend any mental cycles processing keywords like elseif but spend my time understanding the logical flow of the program.
Interrupting my flow of reasoning with aVeryVeryLongFuncNameAndArgList is a jarring disruption. This is especially true when the function being called is actually a one-liner that can be easily inlined. Context switches are expensive, whether they are CPU context switches or a programmer having to mentally switch context while reading code.
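As a small illustration of that cost, consider the following Python sketch. The discount rule, the 365-day threshold and every name in it are made up for the example. The first version hides a one-line condition behind a name that is longer than the code it replaces; the second inlines the condition and lets a short comment carry the intent.

```python
from datetime import date


# Hypothetical helper: a one-line body hidden behind a very long name.
def is_user_eligible_for_annual_loyalty_discount_based_on_signup_date(
    signup: date, today: date
) -> bool:
    return (today - signup).days >= 365


def price_with_helper(base: float, signup: date, today: date) -> float:
    if is_user_eligible_for_annual_loyalty_discount_based_on_signup_date(signup, today):
        return base * 0.9
    return base


def price_inlined(base: float, signup: date, today: date) -> float:
    # Loyalty discount: 10% off after a year of membership.
    if (today - signup).days >= 365:
        return base * 0.9
    return base
```

The two functions compute the same result. In the inlined version the reader stays in one place and parses only language constructs plus a single comment.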
The other problem with a surfeit of small functions, especially ones with very descriptive and unintuitive names, is that the codebase is now harder to search. A function named createUser is easy and intuitive to grep for; something like renderPageWithSetupsAndTeardowns (a name held up as a shining example in the book Clean Code), by contrast, is neither the most memorable nor the most searchable name. Many editors also do a fuzzy search of the codebase, so having too many functions with similar prefixes is also more likely to pull up extraneous results while searching, which is hardly ideal.
Loss of Locality
Small functions work best when we don’t have to jump across file or package boundaries to find the definition of a function. The book Clean Code proposes something called The Stepdown Rule to this end.
We want the code to read like a top-down narrative. We want every function to be followed by those at the next level of abstraction so that we can read the program, descending one level of abstraction at a time as we read down the list of functions. I call this The Stepdown Rule.
This sounds great in theory but rarely have I seen it play out well in practice. Instead, what I have seen almost invariably is the loss of locality as more functions are added to the code.
Let’s assume we start out with three functions, A, B and C, each of which is called (and ergo read) one after the other. Our initial abstractions were underpinned by certain assumptions, requirements and caveats, all of which we assiduously researched and reasoned about during the time of initial design.
Soon enough, let’s say we have a new requirement pop up or an edge case we hadn’t foreseen or a new constraint we need to cater to. We need to modify function A since “the one thing” it encapsulates isn’t valid anymore (or wasn’t ever valid to begin with and now we need to rectify it). In keeping with what we’ve read in Clean Code, we decide the best way to deal with this is to, well, create more functions that will hide away the messy new requirements that’ve cropped up.
A couple of weeks after we’ve made this change, if our requirements change yet again, we might need to create even more functions to encapsulate all the additional changes required.
Rinse and repeat, and we’ve arrived exactly at the problem Sandi Metz describes in her post on The Wrong Abstraction. The post goes on to state that:
Existing code exerts a powerful influence. Its very presence argues that it is both correct and necessary. We know that code represents effort expended, and we are very motivated to preserve the value of this effort. And, unfortunately, the sad truth is that the more complicated and incomprehensible the code, i.e. the deeper the investment in creating it, the more we feel pressure to retain it (the “sunk cost fallacy”).
While this might be true when the same team that originally worked on the codebase continues maintaining it, I’ve seen the opposite play out when new programmers (or managers) take ownership of the codebase. What started out with good intentions has now turned into spaghetti code that sure as hell ain’t clean anymore, and now the urge to “refactor” or sometimes even rewrite the code becomes all the more tempting.
Now one might argue that, to a certain extent, this is inevitable. And they’d be right. What we rarely talk about is how important it is to write code that will die a graceful death. I’ve written in the past about how important it is to make code operationally easy to decommission, but this is even more true when it comes to the codebase itself.
All too often, programmers think of code as “dead” only if it’s deleted or not in use anymore or if the service itself is decommissioned. If we start thinking about the code we write as something that dies every single time we add a new git commit, I think we might be more incentivized to write code that’s amenable to easy modification. When thinking about how to abstract, it greatly helps to be cognizant of the fact that the code we’re building might be only a few hours away from dying (being modified). Thus optimizing for ease of modification of code tends to work better than trying to build top-down narratives of the sort proposed in Clean Code.
Smaller functions also lead to either larger classes or simply more classes in languages that support Object Oriented Programming. In the case of a language like Go, I’ve seen this tendency lead to larger interfaces (combined with the double whammy of interface pollution) or a large number of tiny packages.
This exacerbates the cognitive overhead involved in mapping the business logic to the abstractions we’ve carved out. The more classes, interfaces and packages there are, the harder it is to “take it all in” in one fell swoop, which does zilch to justify the maintenance cost of everything we’ve built.
Proponents of smaller functions also almost invariably tend to champion that fewer arguments be passed to the function.
The problem with fewer function arguments is that one runs the risk of not making dependencies explicit. In languages with inheritance like Ruby, this leads to functions depending on a lot of global state and singletons. I’ve definitely seen Ruby classes with 5–10 tiny methods, all of which typically do something very trivial and take maybe a parameter or two as arguments. I’ve also seen a lot of them mutate shared global state or rely on singletons not explicitly passed to them, which is an anti-pattern if ever there was one.
Furthermore, when the dependencies aren’t explicit, testing becomes a lot more complicated into the bargain, what with the overhead of setting up and tearing down state before every individual test targeting our itty-bitty functions can be run.
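The difference between hidden and explicit dependencies can be sketched in a few lines of Python. The `CONFIG` dictionary, the invoice class and the tax rates below are all hypothetical; the point is only that the tiny methods look argument-free while quietly reading module-level state.

```python
# Hypothetical module-level state that small methods quietly depend on.
CONFIG = {"tax_rate": 0.25}


class ImplicitInvoice:
    """Tiny methods with few arguments, but a hidden global dependency."""

    def __init__(self, net: float):
        self.net = net

    def tax(self) -> float:
        return self.net * CONFIG["tax_rate"]  # reads global state

    def total(self) -> float:
        return self.net + self.tax()


def invoice_total(net: float, tax_rate: float) -> float:
    """Same computation with the dependency passed in explicitly."""
    return net + net * tax_rate
```

Testing `ImplicitInvoice` under a different tax rate means mutating `CONFIG` before the test and restoring it afterwards, whereas `invoice_total` only needs a different argument.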
Hard to Read
This has already been stated but it bears reiterating: an explosion of small functions, especially one-line functions, makes the codebase inordinately harder to read. This especially hurts those for whom the code should’ve been optimized in the first place: newcomers.
There are several flavors of newcomers to a codebase: users new to the language or framework in use, users new to the domain, users new to the organization or the team, and also in some cases, the original authors of the codebase who are returning to work on it after a while.
A good rule of thumb, in my experience, has been to keep in mind someone who might check a number of the aforementioned categories of “new”. Doing so helps me re-evaluate my assumptions and rethink the overhead I might be inadvertently imposing on someone new who’ll be reading the code for the first time. What I’ve realized is that this approach actually leads to far better and simpler code than might’ve been possible otherwise.
Simple code isn’t necessarily the easiest code to write, and rarely is it ever the DRYest code. It takes an enormous amount of careful thought, attention to detail and care to arrive at the simplest solution that is both correct and easy to reason about. What is most striking about such hard-won simplicity is that it lends itself to being easily understood by both old and new programmers, for all possible definitions of “old” and “new”.
When I’m “new” to a codebase, if I’m fortunate enough to already know the language and/or the framework being used, the biggest challenge for me is to understand the business logic or implementation details. When I’m not so fortunate and am faced with the daunting task of manoeuvring my way through a codebase written in a language foreign to me, the biggest challenge is to walk a tightrope: understanding just enough of the language or the framework to make sense of what the code is doing without going down a rabbit hole, while still being able to isolate the “one single thing” of interest that I’d need to understand to move the project to the next stage.
What I’m really hoping for during the times I venture into uncharted territory is to make the least number of mental hops and context switches while trying to find the answer to a given question. Investing time and thought into making things easy for the future maintainer or consumer of the code is something that will have a huge payoff, especially for open source projects. This is something I wish I’d done better earlier in my career and is something I’m very mindful of these days.
Shallow Modules and Classitis
One of the most powerful ideas proposed in the book A Philosophy of Software Design (which was published subsequent to the initial publication of this post) is that of deep and shallow modules.
The best modules are those that provide powerful functionality yet have simple interfaces. I use the term deep to describe such modules. Module depth is a way of thinking about cost versus benefit. The benefit provided by a module is its functionality. The cost of a module (in terms of system complexity) is its interface. A module’s interface represents the complexity that the module imposes on the rest of the system: the smaller and simpler the interface, the less complexity that it introduces. The best modules are those with the greatest benefit and the least cost.
The best modules are deep: they have a lot of functionality hidden behind a simple interface. A deep module is a good abstraction because only a small fraction of its internal complexity is visible to its users. A shallow module is one whose interface is relatively complex in comparison to the functionality that it provides. It is no simpler to think about the interface than to think about the full implementation. If the method is documented properly, the documentation will be longer than the method’s code.
Unfortunately, the value of deep classes is not widely appreciated today. The conventional wisdom in programming is that classes should be small, not deep. Students are taught that the most important thing in class design is to break up larger classes into smaller ones. The same advice is often given about methods: “Any method longer than N lines should be divided into multiple methods” (N can be as low as 10). This approach results in a large number of shallow classes and methods, which add to the overall system complexity.
The extreme of the “classes should be small” approach is a syndrome I call classitis, which stems from the mistaken view that “classes are good, so more classes are better.” Classitis may result in classes that are individually simple, but it produces tremendous complexity from the accumulated interfaces. It also tends to result in a verbose programming style from all of the boilerplate for each class.
A counterargument I’ve heard is that a class with a large number of small methods can still be deep if the public interface it offers comprises only a handful of methods. While this might be true, it doesn’t invalidate the fact that the class is still fairly complicated for a newcomer to the implementation.
When do smaller functions actually make sense
All things considered, I do believe small functions absolutely have their utility, especially when it comes to testing.
This isn’t a post on how to best write functional, integration and unit tests for a vast number of services. However, when it comes to unit tests, the way network I/O is tested is by, well, not actually testing it.
I’m not a terribly big fan of mocks. Mocks have several shortcomings. For one thing, mocks are an artificial simulation of some result. Mocks are only as good as our imagination and our ability to predict the various failure modes our application might encounter. Mocks are also very likely to get out of sync with the real service they stand in for, unless one painstakingly tests every mock against the real service. Mocks also work best when there is just a single instance of every particular mock and every test uses the same mock.
That said, mocks are still pretty much the only way one can unit-test certain forms of network I/O. We live in an era of microservices and outsourcing most (if not all) of the concerns not core to our main product to a vendor. A lot of an application’s core functionality now involves a network call or five, and the best way to unit-test some of these calls is by mocking them out.
On the whole, I find limiting the surface area of mocks to the least amount of code to work best. An API call to an email service to send our newly created user a welcome email requires making an HTTP connection. Isolating this request to the smallest possible function allows us to mock the least amount of code in tests. Typically, this should be a function with no more than 1–2 lines to make the HTTP connection and return any error along with the response. The same is applicable when publishing an event to Kafka or creating a new user in the database.
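A minimal Python sketch of this idea, using only the standard library, might look as follows. The endpoint URL, the request body format and the assumption that the email service answers with HTTP 202 are all invented for illustration; only `urllib.request` and `unittest.mock` are real.

```python
import urllib.request
from unittest import mock


def post_welcome_email(url: str, payload: bytes) -> int:
    """The only function that touches the network; kept tiny on purpose."""
    request = urllib.request.Request(url, data=payload, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.status


def send_welcome_email(address: str) -> bool:
    """Validation and payload building stay here and run for real in tests."""
    if "@" not in address:
        return False
    body = f"to={address}&template=welcome".encode()
    # Hypothetical endpoint and success code, assumed for this sketch.
    return post_welcome_email("https://mail.example.invalid/send", body) == 202


def test_send_welcome_email_without_network() -> None:
    fake_post = mock.Mock(return_value=202)
    # Swap out only the tiny network function; everything else is exercised.
    with mock.patch.dict(globals(), {"post_welcome_email": fake_post}):
        assert send_welcome_email("ada@example.com") is True
    fake_post.assert_called_once()
```

Because the mock replaces a two-line function, the test still exercises the validation and payload-building logic, and the amount of behavior the mock has to imitate is as small as it can be.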
Property based testing
For something that can provide such an enormous amount of benefit with so little code, property based testing is woefully underused. Pioneered by the Haskell library QuickCheck and gaining adoption in other languages like Scala (ScalaCheck) and Python (Hypothesis), property based testing allows one to generate a large number of inputs that match some specification to a given test and assert that the test passes for each and every one of these cases.
Many property based testing frameworks target functions and as such it makes sense to isolate anything that can be subjected to property based testing to a single function. I find this especially useful while testing the encoding or decoding of data, testing JSON or msgpack parsing and so forth.
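To give a flavor of the idea without any third-party dependency, here is a hand-rolled sketch in Python that uses the standard `random` module in place of a real framework. It is not how Hypothesis works internally (a real framework adds smarter generation and shrinking of failing inputs), and the `encode`, `decode` and `random_record` functions are hypothetical. The property being checked is the classic round trip: decoding an encoded record should give back the original.

```python
import json
import random
import string


def encode(record: dict) -> bytes:
    return json.dumps(record, sort_keys=True).encode("utf-8")


def decode(blob: bytes) -> dict:
    return json.loads(blob.decode("utf-8"))


def random_record(rng: random.Random) -> dict:
    """Build a small dict of string keys mapped to ints, strings or None."""

    def random_text() -> str:
        return "".join(rng.choices(string.printable, k=rng.randint(0, 8)))

    def random_value():
        return rng.choice([rng.randint(-10**6, 10**6), random_text(), None])

    return {random_text(): random_value() for _ in range(rng.randint(0, 5))}


def check_roundtrip(trials: int = 500, seed: int = 0) -> None:
    """Property: decode(encode(x)) == x for every generated record."""
    rng = random.Random(seed)
    for _ in range(trials):
        record = random_record(rng)
        assert decode(encode(record)) == record, record
```

Having `encode` and `decode` as small, pure functions is what makes this kind of test cheap to write: the property is stated once and then exercised against hundreds of generated inputs.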
The intention of this post was not to argue that DRY or small functions are inherently bad (even if the title disingenuously suggested so), only that they aren’t inherently good either.
The number of small functions in a codebase or the average function length in itself isn’t a metric to brag about. There’s a 2016 PyCon talk called Onelinerizer about an eponymous Python program that can convert any Python program (including itself) into a single line of code. While this makes for a fun and fascinating conference talk, it would be rather silly to write production code in the same manner.
The aforementioned advice applies universally, not just to Go. As the complexity of the programs we author has greatly increased and the constraints we work against have become all the more protean, it behooves programmers to adapt their thinking accordingly.
Programming orthodoxy, unfortunately, remains heavily influenced by books written during an era when Object Oriented Programming and Design Patterns reigned supreme. A lot of these widely promulgated ideas and best practices have gone largely unchallenged for decades and direly require reconsideration, especially since the programming landscape and its paradigms have evolved vastly in recent years.
Wheeling out old tropes is not only lazy but also lulls programmers into a false sense of reassurance they can ill afford.