Refactoring with Dependency Inversion and Injection
In this post I want to show how to refactor with the Dependency Inversion Principle. I find it hard to create a mock example with sufficient complexity, so I will just take the cryptocurrency buying software that I have featured in this blog a few times. In each iteration I will show the full code and talk about its problems.
We start with a really simple case. We just have a command line argument which controls how much money we want to invest. It will look like this:
    import argparse


    def main() -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument("--volume", type=float, default=25.0)
        options = parser.parse_args()
        print(f"Buying for {options.volume} on market X.")


    if __name__ == "__main__":
        main()
The problem here is that we only have one market, and we only have one strategy to buy. Although I have mocked that with a print call, we only have one particular library which makes the HTTP requests. But we don't really see these parts in the code; they exist only in the prose.
Adding some classes
Let us add some classes so that the code gets closer to a realistic level of complexity. Each class owns an instance of the ones that it needs. The high-level structure of the code is now this:
And this is the actual code:
    import argparse


    class CoolRequestLib:
        def __init__(self, auth_token: str):
            self.auth_token = auth_token

        def send_request(self, url: str) -> None:
            print("Cool:", url, self.auth_token)


    class ShinyMarketplace:
        def __init__(self, auth_token: str):
            self.cool_request_lib = CoolRequestLib(auth_token)

        def place_order(self, volume: float) -> None:
            self.cool_request_lib.send_request(
                f"https://shiny-marketplace.example/place_order?volume={volume}"
            )


    class TrivialStrategy:
        def __init__(self, auth_token: str, volume: float):
            self.shiny_marketplace = ShinyMarketplace(auth_token)
            self.volume = volume

        def execute(self):
            self.shiny_marketplace.place_order(self.volume)


    def main() -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument("--volume", type=float, default=25.0)
        parser.add_argument("--auth-token", type=str, default="test-auth-token")
        options = parser.parse_args()
        trivial_strategy = TrivialStrategy(options.auth_token, options.volume)
        trivial_strategy.execute()


    if __name__ == "__main__":
        main()
You can see how we have some concrete classes, TrivialStrategy, ShinyMarketplace and CoolRequestLib, which perform this task. At this stage, having a bunch of free functions might be easier, but in a realistic example you might have more state that you want to keep around.
Adding a second marketplace
This code is okay as long as one doesn't want to support a different strategy, a different marketplace or a different request library. Suppose we are to add a different marketplace, say CheapMarketplace. We could add a flag and change ShinyMarketplace so that it dispatches the request according to that flag. There are two ways that we could pass the flag down to the ShinyMarketplace class: either pass it as a constructor argument, or make it global. For the sake of the argument, let me make it global. The code then looks like this:
    import argparse

    options: argparse.Namespace


    class CoolRequestLib:
        def __init__(self, auth_token: str):
            self.auth_token = auth_token

        def send_request(self, url: str) -> None:
            print("Cool:", url, self.auth_token)


    class ShinyMarketplace:
        def __init__(self, auth_token: str):
            self.cool_request_lib = CoolRequestLib(auth_token)

        def place_order(self, volume: float) -> None:
            if options.marketplace == "Shiny":
                self.cool_request_lib.send_request(
                    f"https://shiny-marketplace.example/place_order?volume={volume}",
                )
            else:
                self.cool_request_lib.send_request(
                    f"https://cheap-marketplace.example/buy?fiat={volume}",
                )


    class TrivialStrategy:
        def __init__(self, auth_token: str, volume: float):
            self.shiny_marketplace = ShinyMarketplace(auth_token)
            self.volume = volume

        def execute(self):
            self.shiny_marketplace.place_order(self.volume)


    def main() -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument("--volume", type=float, default=25.0)
        parser.add_argument("--auth-token", type=str, default="test-auth-token")
        parser.add_argument("--marketplace", type=str, default="Shiny")
        global options
        options = parser.parse_args()
        trivial_strategy = TrivialStrategy(options.auth_token, options.volume)
        trivial_strategy.execute()


    if __name__ == "__main__":
        main()
The overall structure now has the additional options, which the marketplace now depends on:
Even more flags
We can imagine that we add more and more flags, such that we can have different strategies and different request libraries. This then gives us a pretty coupled structure:
    import argparse

    options: argparse.Namespace


    class CoolRequestLib:
        def __init__(self, auth_token: str):
            self.auth_token = auth_token

        def send_request(self, url: str) -> None:
            if options.requestlib == "Cool":
                print("Cool:", url, self.auth_token)
            else:
                pass


    class ShinyMarketplace:
        def __init__(self, auth_token: str):
            self.cool_request_lib = CoolRequestLib(auth_token)

        def place_order(self, volume: float) -> None:
            if options.marketplace == "Shiny":
                self.cool_request_lib.send_request(
                    f"https://shiny-marketplace.example/place_order?volume={volume}"
                )
            else:
                pass


    class TrivialStrategy:
        def __init__(self, auth_token: str, volume: float):
            self.shiny_marketplace = ShinyMarketplace(auth_token)
            self.volume = volume

        def execute(self):
            if options.strategy == "Trivial":
                self.shiny_marketplace.place_order(self.volume)
            else:
                pass


    def main() -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument("--volume", type=float, default=25.0)
        parser.add_argument("--auth-token", type=str, default="test-auth-token")
        parser.add_argument("--marketplace", type=str, default="Shiny")
        parser.add_argument("--requestlib", type=str, default="Cool")
        parser.add_argument("--strategy", type=str, default="Trivial")
        global options
        options = parser.parse_args()
        trivial_strategy = TrivialStrategy(options.auth_token, options.volume)
        trivial_strategy.execute()


    if __name__ == "__main__":
        main()
I have skipped over implementing the alternatives and just put in a pass as a placeholder. Think of that as similarly elaborate logic.
Problems with the approach
This approach has a significant advantage: It is easy to add more variants wherever we want. The procedure is always the same:
- Add a global flag to the command line.
- Insert an if options.something == "some value" and introduce different behavior this way.
No matter what you want to replace, this approach allows you to change it. This is great flexibility. And it will enable you to quickly iterate and react to changing needs. If I wanted to add a flag which would disable buying on certain weekdays, I could.
Another thing is that there is no indirection involved at all. All classes are concrete. You always know which particular instance you have. You don't necessarily know which branch in the control flow (if and else) is taken. This can be cured by just printing the options to the console at startup. Then you know for a particular run which values the options have. And the default values are also visible in the add_argument() calls. This way one can figure out statically which routes are taken through the code. The IDE will have an easy time showing you usages and definitions, because everything is concrete and can be statically analyzed.
Let us, however, take a look at the disadvantages of this approach. They may be subtle, depending on the ways that you want to work with the code. Also they may appear abstract, and only of academic but not practical concern.
There is a lot of coupling in this approach, and coupling is a source of complexity which I don't want in software. The problem domain is already complicated enough, the software design should help to manage the complexity and not cause more on its own. That of course sounds very abstract.
Testability
We can get very concrete and try to figure out how this can be tested. So, what can we do without changing the code? We have the command line runner, which instantiates a TrivialStrategy, which in turn instantiates a ShinyMarketplace, which then uses the CoolRequestLib. Whenever we execute that, it will always send an HTTPS request to the production service and trigger a trade with actual money. This is a production setting, and not a test setting. Yet I don't see how we could make ShinyMarketplace send its HTTPS request to a mock library instead without changing the code. I could either add another global flag which effectively neuters the request library, or I could use that flag to change something in the marketplace. Either way would require us to change the code. And that effectively testifies that the code cannot be tested as it currently is. And that is really bad.
The only way out that I see is to wrap the whole code into a container (say Docker) or somehow inject an HTTPS proxy which captures all the requests and delivers canned responses. This is not really a good solution because it is not a unit test, it is a system-level end-to-end test. And it doesn't test a single component, so we would need to test various flag combinations and requests in order to obtain satisfactory test coverage. Alternatively one could use mocking frameworks which use the introspection capabilities of Python to replace a function or object with something else. That can be used successfully, yet I would like my code to be testable without such magic tools. My rationale is that there may be other cases for re-use besides testing, and I don't want production code to rely on introspection and monkey-patching as a design element.
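To make the monkey-patching alternative concrete, here is a minimal sketch using unittest.mock from the standard library. It assumes the earlier version of the classes without the global options object; the helper name place_order_with_patch is made up for this illustration. The concrete class is patched from the outside, which works, but the test has to know that ShinyMarketplace happens to use CoolRequestLib internally.

```python
from unittest import mock


class CoolRequestLib:
    def __init__(self, auth_token: str):
        self.auth_token = auth_token

    def send_request(self, url: str) -> None:
        print("Cool:", url, self.auth_token)


class ShinyMarketplace:
    def __init__(self, auth_token: str):
        self.cool_request_lib = CoolRequestLib(auth_token)

    def place_order(self, volume: float) -> None:
        self.cool_request_lib.send_request(
            f"https://shiny-marketplace.example/place_order?volume={volume}"
        )


def place_order_with_patch(volume: float) -> list:
    """Hypothetical helper: place an order while send_request is patched out."""
    # The method is replaced on the class only for the duration of the block.
    with mock.patch.object(CoolRequestLib, "send_request") as fake_send:
        ShinyMarketplace("test-auth-token").place_order(volume)
    # The mock has recorded every call that would have gone out.
    return fake_send.call_args_list
```

Nothing in ShinyMarketplace had to change, but the test now depends on an implementation detail of the class it tests.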
Also we could test the lower levels, say the request library. That doesn't have any dependencies. We could also try to test the marketplace, but then we would have to already wrap that into the HTTPS proxy to get somewhere. We don't have a way to make sure that the strategy works the way that we intend it.
Extensibility
Software usually gets extended over time. And ideally we would be able to extend it in a way that doesn't cause more coupling, and doesn't hinder testing. Also the extensions would ideally be independent of each other.
In the current state we would have to add more flags for every extension and then add control flow somewhere else in the code. My major pain point with this approach is that I would have to change other code as well. We can either put the control flow to select the marketplace into the marketplace class (and make it monolithic), or we have to change the method that calls it. This then changes the callers, although we just want to supply an additional implementation.
Mental load
I cannot hold that much stuff in my head at the same time. Therefore I rely on taking notes, organizing tickets on a Kanban board and having small functions and focused classes. If I had to work on software in a state like the above, I would have a hard time. Since I couldn't keep all the dependencies in my head, I would have to re-read code, make notes about the interplay and use code folding in the IDE to hide irrelevant branches.
There is also the tendency to load up objects with more and more functions. This actually turns out to be convenient, because the more data you can access, the easier it is to program. However, you are just making more knots in an already tangled piece, so the coupling rapidly increases if left unchecked.
Breaking out the request library
The major hindrance to testing was that we have a fixed request library there. I want to be able to mock that, and I don't want to rely on a mock framework or monkey patching. So we will now use the Dependency Inversion Principle. The idea is that I define an interface for a request library, which the marketplace will depend on. And the concrete implementation will also depend on the interface, because it implements it. But the marketplace will not depend directly on the concrete implementation. In simplified UML notation it looks like this:
See how there is no directed connection from ShinyMarketplace to CoolRequestLib or the other way around? They are now independent. They both depend on the abstract RequestLib interface, but there is no transitive dependency. We have broken the dependency graph! Now let's see what that looks like in the code:
    import argparse

    options: argparse.Namespace


    class RequestLib:
        def send_request(self, url: str) -> None:
            raise NotImplementedError()


    class CoolRequestLib(RequestLib):
        def __init__(self, auth_token: str):
            self.auth_token = auth_token

        def send_request(self, url: str) -> None:
            print("Cool:", url, self.auth_token)


    class ShinyMarketplace:
        def __init__(self, auth_token: str):
            self.request_lib: RequestLib
            if options.requestlib == "Cool":
                self.request_lib = CoolRequestLib(auth_token)
            else:
                pass

        def place_order(self, volume: float) -> None:
            if options.marketplace == "Shiny":
                self.request_lib.send_request(
                    f"https://shiny-marketplace.example/place_order?volume={volume}"
                )
            else:
                pass


    class TrivialStrategy:
        def __init__(self, auth_token: str, volume: float):
            self.shiny_marketplace = ShinyMarketplace(auth_token)
            self.volume = volume

        def execute(self):
            if options.strategy == "Trivial":
                self.shiny_marketplace.place_order(self.volume)
            else:
                pass


    def main() -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument("--volume", type=float, default=25.0)
        parser.add_argument("--auth-token", type=str, default="test-auth-token")
        parser.add_argument("--marketplace", type=str, default="Shiny")
        parser.add_argument("--requestlib", type=str, default="Cool")
        parser.add_argument("--strategy", type=str, default="Trivial")
        global options
        options = parser.parse_args()
        trivial_strategy = TrivialStrategy(options.auth_token, options.volume)
        trivial_strategy.execute()


    if __name__ == "__main__":
        main()
The implementation of CoolRequestLib has become simpler: it only knows about what it needs to do, and there is no dependency on the global flags any more. We still construct the CoolRequestLib in the init function of ShinyMarketplace and have moved the flag for the choice of request library there. This hasn't really improved anything yet, but we are closer than before.
The next step is Dependency Injection. Instead of letting ShinyMarketplace construct its dependency, we will inject it. We can also see an interesting “code smell” which suggests that we should use Dependency Injection: the init method of ShinyMarketplace is passed the argument auth_token, which is only needed to construct CoolRequestLib. So the shiny marketplace doesn't really need that itself. In a sense, the init function has a factory function for a request library inlined. We are going to pull that out now. It doesn't change the UML diagram, it only removes the spurious direct connection that we still had but that I didn't put into the diagram.
    import argparse

    options: argparse.Namespace


    class RequestLib:
        def send_request(self, url: str) -> None:
            raise NotImplementedError()


    class CoolRequestLib(RequestLib):
        def __init__(self, auth_token: str):
            self.auth_token = auth_token

        def send_request(self, url: str) -> None:
            print("Cool:", url, self.auth_token)


    class ShinyMarketplace:
        def __init__(self, request_lib: RequestLib):
            self.request_lib = request_lib

        def place_order(self, volume: float) -> None:
            if options.marketplace == "Shiny":
                self.request_lib.send_request(
                    f"https://shiny-marketplace.example/place_order?volume={volume}"
                )
            else:
                pass


    class TrivialStrategy:
        def __init__(self, auth_token: str, volume: float):
            if options.requestlib == "Cool":
                request_lib = CoolRequestLib(auth_token)
            else:
                pass
            self.shiny_marketplace = ShinyMarketplace(request_lib)
            self.volume = volume

        def execute(self):
            if options.strategy == "Trivial":
                self.shiny_marketplace.place_order(self.volume)
            else:
                pass


    def main() -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument("--volume", type=float, default=25.0)
        parser.add_argument("--auth-token", type=str, default="test-auth-token")
        parser.add_argument("--marketplace", type=str, default="Shiny")
        parser.add_argument("--requestlib", type=str, default="Cool")
        parser.add_argument("--strategy", type=str, default="Trivial")
        global options
        options = parser.parse_args()
        trivial_strategy = TrivialStrategy(options.auth_token, options.volume)
        trivial_strategy.execute()


    if __name__ == "__main__":
        main()
The inline factory is now part of TrivialStrategy, and not part of ShinyMarketplace. The marketplace gets its dependency from the outside. We could now write a mock request library which would record the calls made, and even return some canned results. It is now easy to write unit tests for ShinyMarketplace: one just has to inject the mock request library and can test whatever one wants.
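Such a mock request library could look like the following sketch. The name RecordingRequestLib is made up for this illustration, and the global marketplace flag is left out of ShinyMarketplace here so that the example is self-contained.

```python
class RequestLib:
    def send_request(self, url: str) -> None:
        raise NotImplementedError()


class ShinyMarketplace:
    def __init__(self, request_lib: RequestLib):
        self.request_lib = request_lib

    def place_order(self, volume: float) -> None:
        self.request_lib.send_request(
            f"https://shiny-marketplace.example/place_order?volume={volume}"
        )


class RecordingRequestLib(RequestLib):
    """Hypothetical test double: records URLs instead of sending requests."""

    def __init__(self) -> None:
        self.urls: list[str] = []

    def send_request(self, url: str) -> None:
        self.urls.append(url)


def test_place_order_sends_exactly_one_request() -> None:
    recorder = RecordingRequestLib()
    # The test double is injected through the constructor, no patching needed.
    ShinyMarketplace(recorder).place_order(25.0)
    assert recorder.urls == [
        "https://shiny-marketplace.example/place_order?volume=25.0"
    ]
```

The test only talks to the marketplace through its constructor and its public method, so it does not need to know anything about CoolRequestLib.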
More dependency inversion
We still have the problem that the TrivialStrategy constructs the ShinyMarketplace, and it also constructs the CoolRequestLib that is needed there. This means that we cannot test the strategy on its own, we need to inject the dependencies there as well. And before we can do that, we need to define more interfaces. And in the end we will have interfaces and concrete implementations of these interfaces. The interfaces are independent from each other, but that is just accidental in this simple example. And the concrete implementations are independent of each other, which is always the goal.
We could write mock or fake implementations for each of the interfaces and then test one concrete implementation with that. And notice how the TrivialStrategy doesn't depend on ShinyMarketplace, and therefore doesn't have a transitive dependency on either RequestLib or CoolRequestLib. A fake marketplace doesn't even have to use requests, it can just simulate a broker with a certain sequence of prices and keep a local balance sheet. No need for requests at all. And we can also test the ShinyMarketplace as in the previous step.
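A fake marketplace along those lines might be sketched as follows. The class FakeMarketplace, its price sequence and its balance fields are all assumptions made for this illustration; they are not part of the actual project.

```python
class Marketplace:
    def place_order(self, volume: float) -> None:
        raise NotImplementedError()


class TrivialStrategy:
    def __init__(self, marketplace: Marketplace, volume: float):
        self.marketplace = marketplace
        self.volume = volume

    def execute(self) -> None:
        self.marketplace.place_order(self.volume)


class FakeMarketplace(Marketplace):
    """Hypothetical fake broker with canned prices and a local balance sheet."""

    def __init__(self, prices: list[float], fiat: float):
        self.prices = iter(prices)  # one price per order, in sequence
        self.fiat = fiat
        self.coins = 0.0

    def place_order(self, volume: float) -> None:
        price = next(self.prices)
        # Spend `volume` units of fiat money at the current simulated price.
        self.fiat -= volume
        self.coins += volume / price


def simulate_two_buys() -> FakeMarketplace:
    fake = FakeMarketplace(prices=[50.0, 25.0], fiat=100.0)
    strategy = TrivialStrategy(fake, volume=25.0)
    strategy.execute()
    strategy.execute()
    return fake
```

After the two simulated buys the balance sheet of the fake can be inspected directly, without a single request having been sent.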
This is the code at this stage:
    import argparse


    class RequestLib:
        def send_request(self, url: str) -> None:
            raise NotImplementedError()


    class CoolRequestLib(RequestLib):
        def __init__(self, auth_token: str):
            self.auth_token = auth_token

        def send_request(self, url: str) -> None:
            print("Cool:", url, self.auth_token)


    class Marketplace:
        def place_order(self, volume: float) -> None:
            raise NotImplementedError()


    class ShinyMarketplace(Marketplace):
        def __init__(self, request_lib: RequestLib):
            self.request_lib = request_lib

        def place_order(self, volume: float) -> None:
            self.request_lib.send_request(
                f"https://shiny-marketplace.example/place_order?volume={volume}"
            )


    class Strategy:
        def execute(self) -> None:
            raise NotImplementedError()


    class TrivialStrategy(Strategy):
        def __init__(self, marketplace: Marketplace, volume: float):
            self.marketplace = marketplace
            self.volume = volume

        def execute(self) -> None:
            self.marketplace.place_order(self.volume)


    def main() -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument("--volume", type=float, default=25.0)
        parser.add_argument("--auth-token", type=str, default="test-auth-token")
        parser.add_argument("--marketplace", type=str, default="Shiny")
        parser.add_argument("--requestlib", type=str, default="Cool")
        parser.add_argument("--strategy", type=str, default="Trivial")
        options = parser.parse_args()

        request_lib: RequestLib
        if options.requestlib == "Cool":
            request_lib = CoolRequestLib(options.auth_token)
        else:
            pass

        marketplace: Marketplace
        if options.marketplace == "Shiny":
            marketplace = ShinyMarketplace(request_lib)
        else:
            pass

        strategy: Strategy
        if options.strategy == "Trivial":
            strategy = TrivialStrategy(marketplace, options.volume)
        else:
            pass

        strategy.execute()


    if __name__ == "__main__":
        main()
The character of the code has changed. There are now six classes, three of which are purely abstract. There are no special interface types in Python. Java has them; C++, like Python, does not. So one just uses a class where the methods are not implemented. One can use the abc module for abstract base classes and mark them as such. There is also the possibility to use the protocols concept defined in PEP 544, which is supported by the static analysis tool mypy such that one doesn't have to use inheritance. I don't find this form of inheritance a problem, because it doesn't inherit any specific code, just the declaration of behavior.
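Both alternatives can be sketched briefly. In the following example RequestLib is declared with the abc module, and Marketplace is declared as a typing.Protocol in the sense of PEP 544. The classes PrintingRequestLib and LoggingMarketplace as well as the function buy are made up for this illustration.

```python
import abc
from typing import Protocol


class RequestLib(abc.ABC):
    """Abstract base class: cannot be instantiated directly."""

    @abc.abstractmethod
    def send_request(self, url: str) -> None:
        ...


class PrintingRequestLib(RequestLib):
    def send_request(self, url: str) -> None:
        print("Request:", url)


class Marketplace(Protocol):
    """Structural interface: anything with a matching place_order() fits."""

    def place_order(self, volume: float) -> None:
        ...


class LoggingMarketplace:
    # Deliberately no inheritance from Marketplace.
    def __init__(self) -> None:
        self.volumes: list[float] = []

    def place_order(self, volume: float) -> None:
        self.volumes.append(volume)


def buy(marketplace: Marketplace, volume: float) -> None:
    # A static checker such as mypy accepts LoggingMarketplace here
    # because its method signature matches the protocol.
    marketplace.place_order(volume)
```

With abc, trying to instantiate RequestLib itself raises a TypeError at runtime, whereas the protocol variant is checked statically and leaves no trace in the class hierarchy.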
Also the code for the trivial strategy, the shiny marketplace and the cool request library has become significantly simpler. In the init method they just take an instance of whatever object they need to fulfill their duties. There are no parameter chains which are passed down to construct objects. This means that we can even change the init method of ShinyMarketplace without having to change anything in TrivialStrategy. This wasn't the case before, but it is now. That improves the code also with regard to the Single Responsibility Principle because there is only one reason to change TrivialStrategy now. Only if we want to change the way that this strategy behaves do we need to change it. Also we have improved on the side of the Open–closed principle, because we can now extend the code without having to modify the existing source code.
We only had three classes here, but you might be able to imagine how this works in a larger code base, where there are many layers of classes. This allows us to get much closer to the Clean Architecture than we had been before with the monolithic code. We have severed various spurious connections in the code, and we can test and extend without having to change the classes that we have isolated. This is great progress!
Can we reason about the code?
One of the advantages of the monolithic design that we started with was that it was easier to reason about. We always knew which concrete classes we had. There was no indirection with interfaces. So how do we know which request library we have when we look at the shiny marketplace? We don't. There is no way to statically tell. The IDE cannot help us. We have to run the code in order to figure that out. The debugger will certainly help us when we step through the code. We can take a look at the objects and see their actual type, not just the interface that they implement.
If the types really adhere to their interfaces, and are tested themselves, I would argue that we don't need to know. We should design the code such that we don't need to know. I usually don't care about which type of sequence I have, as long as I can iterate over it. I trust that whoever has implemented range and list made sure that they work. I do care that I have a sequence, and mypy can assert that statically. So I would argue that the need of knowing the particular implementation is a symptom of not trusting the unit tests of that other part.
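As a small illustration of that point, the following function (its name total is made up here) only states that it needs a sequence of integers. It works the same for a range and a list, and a static checker can verify the call sites without knowing the concrete type.

```python
from typing import Sequence


def total(values: Sequence[int]) -> int:
    # Only the Sequence interface is used: iteration over the elements.
    result = 0
    for value in values:
        result += value
    return result


from_range = total(range(4))
from_list = total([0, 1, 2, 3])
```

Both calls give the same result, and the function body never has to ask which implementation it received.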
The decoupling allows us to reason about the elements in the code independently. That wasn't possible before, because the classes always owned concrete classes inside of them. I find this a large advantage over statically knowing exactly what I have. Also I can now test the pieces independently, and have more trust in the pieces and therefore the system that I build from them.
Monolithic factory
Although the code of the strategy, marketplace and request library feels really clean and decoupled, we still have that main function. That essentially is a monolithic factory. So we have refactored a monolithic code into a decoupled one, but we need a monolithic factory to put the system together during runtime. Given that we likely have more than three interfaces, and more than one implementation of each interface, the complexity of that factory will likely only increase. The number of flags used there will increase. At this stage the options object doesn't need to be global any more, but all the coupling is now contained in main.
We also don't know which are the default implementations of each interface without looking into the main factory method. Exposing that information is something that likely doesn't really hinder readability and testability of the code, but would add information to the developers. So perhaps we can find a way to decentralize this information a bit.
If all our objects were trivial to construct (__init__() having no arguments), we could just add instances as default arguments. But in this simplified example we already see how they need some parameters which are not directly available when the class is defined. Say the authentication token in the request library, or the volume that we put into the strategy.
To make it worse, our concrete classes depend on instances implementing other interfaces. The ShinyMarketplace for instance needs an instance of RequestLib, although it doesn't care whether it is CoolRequestLib or something else. We could try to construct those independently for each object, but then we would end up with multiple instances of CoolRequestLib, which we likely don't want. Imagine it uses a pool of connections that is to be re-used. If we have multiple instances, there would be more connections open than we really would want to have. Or we later introduce a database connection, then there would be many open connections and we could not really make a transaction which takes queries from different objects. We need some way to re-use instances to fabricate other instances.
Many entry points
One way to simplify the main() function would be to create multiple ones, where the if/else branches are gone, and a different option is chosen each time. We could create one console entry point which doesn't allow the user to switch implementations, but only takes the required options:
    def main_with_defaults():
        parser = argparse.ArgumentParser()
        parser.add_argument("--volume", type=float, default=25.0)
        parser.add_argument("--auth-token", type=str, default="test-auth-token")
        options = parser.parse_args()

        request_lib: RequestLib = CoolRequestLib(options.auth_token)
        marketplace: Marketplace = ShinyMarketplace(request_lib)
        strategy: Strategy = TrivialStrategy(marketplace, options.volume)
        strategy.execute()
This is already much simpler, and there is no control flow necessary. In case we want to write a test or experiment with a different marketplace, we could just copy-and-paste that, replace the ShinyMarketplace with CheapMarketplace and call that main_with_cheap_marketplace. This would duplicate some of the code, but it would make it easier to work with it, at least for the time being. When the construction of the CoolRequestLib changes, both command line entry points need to be changed. It is an improvement over before, because only the driver code has to be changed, not the core code. Yet it is not that great, and it just doesn't seem to scale with more interfaces and more options.
Abstract factories
We could add another level of abstraction. We could add factory methods for each of the interfaces that we have. Then we could group these in a class which can construct everything in our code (which already sounds bad and monolithic). From there we could also add an interface for this monolithic factory class, such that we could override certain methods of it. The console entry point then only needs to select a concrete factory, and it will be able to construct everything else using the abstract factory.
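A rough sketch of this idea, under the assumption that one factory class has one method per interface, could look like this. The names SystemFactory, DefaultFactory, RecordingFactory, RecordingRequestLib and run are made up for this illustration.

```python
class RequestLib:
    def send_request(self, url: str) -> None:
        raise NotImplementedError()


class Marketplace:
    def place_order(self, volume: float) -> None:
        raise NotImplementedError()


class Strategy:
    def execute(self) -> None:
        raise NotImplementedError()


class CoolRequestLib(RequestLib):
    def __init__(self, auth_token: str):
        self.auth_token = auth_token

    def send_request(self, url: str) -> None:
        print("Cool:", url, self.auth_token)


class ShinyMarketplace(Marketplace):
    def __init__(self, request_lib: RequestLib):
        self.request_lib = request_lib

    def place_order(self, volume: float) -> None:
        self.request_lib.send_request(
            f"https://shiny-marketplace.example/place_order?volume={volume}"
        )


class TrivialStrategy(Strategy):
    def __init__(self, marketplace: Marketplace, volume: float):
        self.marketplace = marketplace
        self.volume = volume

    def execute(self) -> None:
        self.marketplace.place_order(self.volume)


class SystemFactory:
    """Hypothetical abstract factory: one method per interface."""

    def request_lib(self) -> RequestLib:
        raise NotImplementedError()

    def marketplace(self) -> Marketplace:
        raise NotImplementedError()

    def strategy(self) -> Strategy:
        raise NotImplementedError()


class DefaultFactory(SystemFactory):
    def __init__(self, auth_token: str, volume: float):
        self.auth_token = auth_token
        self.volume = volume

    def request_lib(self) -> RequestLib:
        return CoolRequestLib(self.auth_token)

    def marketplace(self) -> Marketplace:
        return ShinyMarketplace(self.request_lib())

    def strategy(self) -> Strategy:
        return TrivialStrategy(self.marketplace(), self.volume)


class RecordingRequestLib(RequestLib):
    def __init__(self) -> None:
        self.urls: list[str] = []

    def send_request(self, url: str) -> None:
        self.urls.append(url)


class RecordingFactory(DefaultFactory):
    """Overrides a single factory method; everything else is inherited."""

    def __init__(self, auth_token: str, volume: float):
        super().__init__(auth_token, volume)
        self.recorder = RecordingRequestLib()

    def request_lib(self) -> RequestLib:
        return self.recorder


def run(factory: SystemFactory) -> None:
    # The entry point only knows the abstract factory.
    factory.strategy().execute()
```

The entry point would call run(DefaultFactory(...)) in production, while a test could pass a RecordingFactory and inspect the recorded URLs afterwards.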
Dependency injection framework
At this stage we have freed the bulk of the code from the global options object, removed a bunch of coupling and increased testability. But the price we pay at the moment is the cumbersome way of building the pieces and binding them together at runtime. We have a core of beautiful code, and then there is this mess which allows it to become concrete.
It has a similarity with functional programming, where functions don't have side effects. This sounds really nice and pure, but it means that running your code doesn't have an observable result. Somewhere you need to get real and cause a side effect, like writing a result to an output stream. You can of course use an external runner where you just pass your function, and the dirty side-effect ridden code is not part of your code base.
Similarly one can use a dependency injection (DI) framework. They seem to be uncommon in Python, and Jörg W. Mittag argues that the Java DI frameworks supply you with a scripting language outside of Java. And the language is bad. So in Python one already has a superb scripting language, and therefore doesn't need an additional interpreter around all that.
I still want to give it a try and have found the Dependency Injector library. Interestingly the pitch on the front page is pretty much the same as that of this article, and the real pitch starts exactly where we are now with the code.
So let us try this out. This is new to me too, so I'd be very grateful if you could point out shortcomings such that I can learn something as well!
After reading a bit of the documentation and applying it to our example, we have the following new code. I omit the interfaces and classes because they haven't changed in the last few iterations. And they still don't change, because that was the whole point of the dependency inversion that we have applied. This is the code:
    import argparse

    import dependency_injector.containers
    import dependency_injector.providers
    import dependency_injector.wiring


    class Container(dependency_injector.containers.DeclarativeContainer):
        config = dependency_injector.providers.Configuration()

        request_lib = dependency_injector.providers.Singleton(
            CoolRequestLib,
            auth_token=config.auth_token,
        )
        marketplace = dependency_injector.providers.Singleton(
            ShinyMarketplace,
            request_lib=request_lib,
        )
        strategy = dependency_injector.providers.Factory(
            TrivialStrategy,
            marketplace=marketplace,
            volume=config.volume,
        )


    @dependency_injector.wiring.inject
    def use_case(
        strategy: Strategy = dependency_injector.wiring.Provide[Container.strategy],
    ) -> None:
        strategy.execute()


    def main() -> None:
        parser = argparse.ArgumentParser()
        parser.add_argument("--volume", type=float, default=25.0)
        parser.add_argument("--auth-token", type=str, default="test-auth-token")
        parser.add_argument("--marketplace", type=str, default="Shiny")
        parser.add_argument("--requestlib", type=str, default="Cool")
        parser.add_argument("--strategy", type=str, default="Trivial")
        options = parser.parse_args()

        container = Container()
        container.config.auth_token.from_value(options.auth_token)
        container.config.volume.from_value(options.volume)

        if options.requestlib == "Crappy":
            container.override_providers(
                request_lib=dependency_injector.providers.Singleton(
                    CrappyRequestLib,
                    auth_token=container.config.auth_token,
                )
            )
        if options.marketplace == "Cheap":
            container.override_providers(
                marketplace=dependency_injector.providers.Singleton(
                    CheapMarketplace,
                    request_lib=container.request_lib,
                )
            )
        if options.strategy == "Complex":
            container.override_providers(
                strategy=dependency_injector.providers.Factory(
                    ComplexStrategy,
                    marketplace=container.marketplace,
                    volume=container.config.volume,
                )
            )

        container.wire(modules=[__name__])
        use_case()
We now have Container, which is a dependency injection container. It knows how to fabricate all the objects that we need. It can also store the objects, and one can specify whether they should be created each time (Factory) or whether there should be only one of them (Singleton). It describes how the objects depend on each other. It also contains its own configuration object where I can fill in the values from the command line parser.
Then one uses the inject decorator and a special default value such that it injects the object from the container. One can still easily override the value by explicitly passing a different instance, but this takes care of the default case.
In my main function I set up the command line arguments and the container. Depending on the options passed, I replace factories in the container with other ones, such that they construct some other implementation of the interfaces.
Looking at this code, it is more complex than the version before. One also needs to know yet another framework. If one isn't careful, the framework will infest the whole code; something that we'd like to avoid in the Clean Architecture. At this point, I don't really see the benefit. Perhaps I would have to try this framework in a more realistic example to see whether it is better than a plain Python factory method/class.
Self-written container
Now that I have learned what a DI container is, I can try to write one myself. The container code would then look like this:
import argparse
import types
from typing import Optional


class Factory:
    def __init__(self, options: argparse.Namespace):
        self.options = options
        # Cached instances, created lazily on first use.
        self._request_lib: Optional[RequestLib] = None
        self._marketplace: Optional[Marketplace] = None

    def request_lib(self) -> RequestLib:
        if self._request_lib is None:
            self._request_lib = CoolRequestLib(self.options.auth_token)
        return self._request_lib

    def marketplace(self) -> Marketplace:
        if self._marketplace is None:
            self._marketplace = ShinyMarketplace(self.request_lib())
        return self._marketplace

    def strategy(self) -> Strategy:
        return TrivialStrategy(self.marketplace(), self.options.volume)


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--volume", type=float, default=25.0)
    parser.add_argument("--auth-token", type=str, default="test-auth-token")
    parser.add_argument("--marketplace", type=str, default="Shiny")
    parser.add_argument("--requestlib", type=str, default="Cool")
    parser.add_argument("--strategy", type=str, default="Trivial")
    options = parser.parse_args()

    factory = Factory(options)

    # Re-bind individual factory methods on this one instance.
    if options.requestlib == "Crappy":
        factory.request_lib = types.MethodType(
            lambda self: CrappyRequestLib(self.options.auth_token), factory
        )
    if options.marketplace == "Cheap":
        factory.marketplace = types.MethodType(
            lambda self: CheapMarketplace(self.request_lib()), factory
        )
    if options.strategy == "Complex":
        factory.strategy = types.MethodType(
            lambda self: ComplexStrategy(self.marketplace(), self.options.volume),
            factory,
        )

    strategy = factory.strategy()
    strategy.execute()
I have created a factory which also holds singletons for the marketplace and the request library. There is no magic wiring going on; everything is explicitly visible. We don't depend on an external framework for this. Replacing the methods depending on the options is a bit cumbersome because of the re-binding to the object with types.MethodType. If one used inheritance instead, this would not be necessary.
This factory class is more versatile than the monolithic factory that only produced the Strategy object and did not expose the intermediate objects. One could expose the intermediate objects as a dictionary or a new class, but then one would be rather close to the factory class that I have now.
Likely the biggest advantage of a factory class is that one can subclass it and then write a different cohesive set of concrete classes to use. This might make sense if one has separate testing and production environments with different dependencies, and one wants to specify the interplay of the components just once. For instance, in testing one might want to swap out the PostgreSQL database for SQLite, and at the same time also swap the requests library for a mock of that library.
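As a rough sketch of that subclassing idea: the classes below are shortened stand-ins for the ones in this post, and RecordingRequestLib and OfflineFactory are hypothetical names that I introduce only for this example. The subclass overrides a single construction step, and the rest of the wiring is inherited.

```python
import argparse
from typing import List, Optional


class CoolRequestLib:
    """Simplified stand-in for the request library from the post."""

    def __init__(self, auth_token: str) -> None:
        self.auth_token = auth_token

    def send_request(self, url: str) -> None:
        print("Cool:", url, self.auth_token)


class RecordingRequestLib:
    """Hypothetical test double: remembers URLs instead of sending them."""

    def __init__(self) -> None:
        self.urls: List[str] = []

    def send_request(self, url: str) -> None:
        self.urls.append(url)


class ShinyMarketplace:
    def __init__(self, request_lib) -> None:
        self.request_lib = request_lib

    def place_order(self, volume: float) -> None:
        self.request_lib.send_request(
            f"https://shiny-marketplace.example/place_order?volume={volume}"
        )


class Factory:
    """Production wiring, shortened from the factory above."""

    def __init__(self, options: argparse.Namespace) -> None:
        self.options = options
        self._request_lib = None
        self._marketplace: Optional[ShinyMarketplace] = None

    def request_lib(self):
        if self._request_lib is None:
            self._request_lib = CoolRequestLib(self.options.auth_token)
        return self._request_lib

    def marketplace(self) -> ShinyMarketplace:
        if self._marketplace is None:
            self._marketplace = ShinyMarketplace(self.request_lib())
        return self._marketplace


class OfflineFactory(Factory):
    """Testing wiring: only the request library differs from production."""

    def request_lib(self):
        if self._request_lib is None:
            self._request_lib = RecordingRequestLib()
        return self._request_lib


options = argparse.Namespace(auth_token="test-auth-token", volume=25.0)
factory = OfflineFactory(options)
factory.marketplace().place_order(options.volume)
print(factory.request_lib().urls)
# ['https://shiny-marketplace.example/place_order?volume=25.0']
```

The marketplace method is written only once, yet it picks up the overridden request library, because it goes through self.request_lib().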
Conclusions
The first half, where we applied the dependency inversion principle, seems to be a clear win to me. We have decoupled the code and enabled testing. It became possible to instantiate objects without having to get the real instances of all the dependencies. This alone is an improvement.
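A small sketch of what that buys us in a test. I assume here that the interface is written as a typing.Protocol, which may differ from the actual code; FakeMarketplace and the check function are hypothetical names for this example.

```python
from typing import List, Protocol


class Marketplace(Protocol):
    """Interface that the strategy depends on (assumed shape)."""

    def place_order(self, volume: float) -> None: ...


class TrivialStrategy:
    def __init__(self, marketplace: Marketplace, volume: float) -> None:
        self.marketplace = marketplace
        self.volume = volume

    def execute(self) -> None:
        self.marketplace.place_order(self.volume)


class FakeMarketplace:
    """Hypothetical test double: records orders instead of placing them."""

    def __init__(self) -> None:
        self.orders: List[float] = []

    def place_order(self, volume: float) -> None:
        self.orders.append(volume)


def check_trivial_strategy_places_one_order() -> None:
    # No request library and no auth token are needed to exercise the strategy.
    marketplace = FakeMarketplace()
    TrivialStrategy(marketplace, volume=25.0).execute()
    assert marketplace.orders == [25.0]


check_trivial_strategy_places_one_order()
```

The strategy never learns that it talks to a fake, because it only depends on the place_order method of the interface.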
Then we have pushed the options object out of all the core code; there is no complicated control flow in our core classes any more. The control flow now happens in the dynamic binding of the instances, which we might need a debugger to trace. The argparse library is part of the Python standard library, so I have no qualms about depending on it directly. And we could always convert the argparse.Namespace object into a data class under our control if that became an issue. If the options are provided by a third-party library (for instance Click) or read from a configuration file, one would certainly want an abstraction around that.
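Such a conversion could look roughly like this. Options and parse_options are hypothetical names, and I only carry over two of the command line arguments to keep the sketch short.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Options:
    """Hypothetical options type that is owned by the application."""

    volume: float
    auth_token: str


def parse_options(argv: Optional[List[str]] = None) -> Options:
    parser = argparse.ArgumentParser()
    parser.add_argument("--volume", type=float, default=25.0)
    parser.add_argument("--auth-token", type=str, default="test-auth-token")
    namespace = parser.parse_args(argv)
    # Only this function ever touches the argparse.Namespace object.
    return Options(volume=namespace.volume, auth_token=namespace.auth_token)


options = parse_options(["--volume", "40"])
print(options)  # Options(volume=40.0, auth_token='test-auth-token')
```

If the options later come from Click or a configuration file, only parse_options would have to change.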
The cost of pushing out the options has been the monolithic factory. That seems bad, but it is no worse than the code which we had previously. The monolith is still there, but it has been pushed into a different layer of the architecture. That doesn't make the monolithic part more reusable, but it frees the core classes to be used and tested in many ways.
The attempts to reduce the factory further do not seem that great to me. In Python I don't really see the difference between having factory functions or classes and having a dependency injection container from an external library. The amount of code that I need to write seems to be roughly the same. The complexity is still there, because the interplay of the classes is not trivial. One object needs the other for its construction, so the DI container needs to know about that too. I am not sure whether the DI framework could completely decouple everything if I put every little piece into its own container and then had it automatically wire the containers against each other. In that case it would likely become hard to specify cohesive sets of dependencies like “production” and “testing”. From what I gathered from the documentation, one still has to specify the wiring somewhere. You can have multiple containers spread out over multiple modules, but there will still be a main container that contains the module containers. This would be equivalent to having one factory per component.
In total I think that one should get to the point where all the dependency construction is in main(), and all other code just depends on interfaces. Once one is there, one has already decoupled almost all the code. And then one can write a second main() with a DI framework, without having to change the first one. If it turns out better, then take it. Otherwise just revert, no harm done.