Dependencies that you own
April 22, 2023
The more time I spend these days thinking about architecture, the more one question weighs on my mind and keeps coming back: how often do we think about what we take for granted and include from “the internet” in our own projects?
Judging by the number of discussions I have had on the subject, not much, and not for a long time already. We go from search engine query, to Stack Overflow post, to a new dependency in the project within minutes. Even for the most trivial of things, there is a package out there. The question is, how well do we understand the impact and possible consequences of going “package” first? And that is on top of our projects already depending on N products that we don’t even own. That is the cost of doing business, and on-premise solutions are heading toward extinction in the software development world, with some good reasons that I am not going to spend time on.
Back to that small piece we do control and are paid for. Ask yourself, how often did you consider the possible implications of just pulling in a random “package” from the internet? Or how much time did you spend looking into its source code or researching any side effects that could come with it? From my personal experience and the conversations I have had with people about it, the majority assume “all is good”. But in recent years, we have seen that this is not the case. There are many malicious software packages that on the first or even the 10th glance seem valid. And with just one click, one of them could be running in your production environment within an hour. I am not even joking, I saw it happen. Bug, search engine, package, production. Under an hour. What could possibly go wrong…
What was also interesting in this particular scenario is that the package in question was not even being fully utilized; not even 1% of it was in use. It was included to do some specific string formatting. A single function. Of 4 lines of code. And now you own the entirety of the code base that comes with it. I understand that people often say don’t reinvent the wheel, but if you need a single screw, why are you carrying the entire car with it? And nowadays, with micro/pico/nano/whatever services-oriented architecture, this, in my opinion, defeats the intended purpose. If you’re not willing to write the code yourself, maybe just spend an hour or two to see how others did it, without including the whole package if you’re not going to use most of it. Copy (if the license allows it). Give credit. Write tests. And you have already decreased your “problem” surface by the (code_in_package - copied_code) amount.
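To make the "copy, give credit, write tests" idea concrete, here is a minimal sketch. The function pad_left, the credit placeholders, and the test are all hypothetical and only stand in for whatever tiny helper you would otherwise pull a whole package in for.

```python
# Adapted from <package name> (<license>), <link to the original source file>.
# NOTE: the three placeholders above are hypothetical; fill in the real
# package, license, and link for the code you actually copied.


def pad_left(text: str, width: int, fill: str = " ") -> str:
    """Left-pad text to the given width using a single fill character."""
    if len(fill) != 1:
        raise ValueError("fill must be exactly one character")
    # max() keeps the padding at zero when text is already wide enough.
    return fill * max(0, width - len(text)) + text


def test_pad_left() -> None:
    """The tests are now yours too, which is the whole point."""
    assert pad_left("7", 3, "0") == "007"
    assert pad_left("abcd", 2) == "abcd"
    assert pad_left("", 2, "*") == "**"


if __name__ == "__main__":
    test_pad_left()
    print("pad_left: all checks passed")
```

A handful of lines plus a test is something you can read in full during a code review, which is rarely true of an entire package.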
And if the package is the way to go, do spend some time to understand what it is that you’re now going to own. And yes, I mean it. You own it now. It is in your solution and, consequently, it is your problem. When something goes wrong with it, the original author will not be the one waking up at 2 AM because of some “unforeseen” side effect it had on the rest of your ecosystem. So invest time to put some borders around that package, maybe a bit of abstraction over the pieces of functionality you actually need. Add some monitoring and logging, so that when things go wrong you are able to pinpoint where.
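One way such a border could look is sketched below, under a few assumptions: the class FormatterBoundary, the logger name, and the injected format_fn are all made up for illustration, and str.title is only a stand-in for the real third-party function.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("deps.formatter")  # hypothetical logger name


class FormatterBoundary:
    """The single place where the rest of the code touches the package."""

    def __init__(self, format_fn: Callable[[str], str], package_name: str) -> None:
        self._format_fn = format_fn        # the third-party function we depend on
        self._package_name = package_name  # used only to label log messages

    def format(self, text: str) -> str:
        started = time.perf_counter()
        try:
            result = self._format_fn(text)
        except Exception:
            # Log with context so a 2 AM failure points at the dependency,
            # then re-raise so callers still see the original error.
            logger.exception(
                "dependency %s failed on input of length %d",
                self._package_name,
                len(text),
            )
            raise
        elapsed_ms = (time.perf_counter() - started) * 1000
        logger.debug("dependency %s took %.2f ms", self._package_name, elapsed_ms)
        return result


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    # str.title is a stand-in for the real package function.
    boundary = FormatterBoundary(str.title, "stand-in-package")
    print(boundary.format("hello world"))
```

Because the rest of the code only knows about the boundary, swapping the package for a copied function, or for a different package, becomes a change in one place.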
I know this paints a bleak picture, but that is not the whole story. There are so many people who, with no compensation, solved your problem. And the system is broken, because we take this for granted. So if you have ever been saved by some open source package, the least you can do is buy the author a coffee. Or send a thank you note. It is all about passion, and it will make the developer(s) feel great about their contribution to the world. On the other hand, don’t blame them for any issues you have, or when it breaks something on your end. They don’t work on your project and they didn’t build it specifically for your problem. And we’re all human, bugs are made all the time. So own it and, if you are able, contribute to the solution after that post-mortem. Pay it forward.
Back to the topic at hand; the broken appreciation of the open-source community is a problem for another time. I mentioned how you can guard yourself by treating the packages and code you include as your own. There are also automated solutions out there that can help you in your CI/CD pipelines. Some are paid, but there are really good open-source alternatives supported by the likes of GitHub and Google, for example. You can easily integrate them into your projects and reap the benefits of automatic scanning and threat detection. The tools I tried, played around with, and used on projects are GitHub Dependabot and Google OSV. They’re great tools and, in my experience, they can be set up in a project within an hour. That at least gets you some cover against being added to the crypto mining industry and things like it.
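To give a feel for what such a check does under the hood, here is a rough sketch that asks the public OSV API about one package version. The endpoint and field names reflect my understanding of the OSV query API and should be verified against its documentation; the helper names are hypothetical, and the package and version in the usage part are only example inputs. In a real pipeline you would more likely run the official scanner than hand-roll this.

```python
import json
import urllib.request

# Assumption: the public OSV query endpoint, as I understand it from the OSV docs.
OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name: str, ecosystem: str, version: str) -> dict:
    """Build the JSON body for a single package/version lookup."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def known_vulnerability_ids(
    name: str, ecosystem: str, version: str, timeout: float = 10.0
) -> list[str]:
    """Return the IDs of vulnerabilities OSV reports for this exact version."""
    body = json.dumps(build_osv_query(name, ecosystem, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: an empty object comes back when nothing is known.
    return [vuln["id"] for vuln in payload.get("vulns", [])]


if __name__ == "__main__":
    # Example inputs only; this call needs network access.
    for vuln_id in known_vulnerability_ids("jinja2", "PyPI", "2.4.1"):
        print(vuln_id)
```

Looping this over a lock file is essentially what the scanners automate, along with the parts that are easy to get wrong, such as transitive dependencies and version ranges.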
I wanna cry every time I see the number of packages in even the most basic scaffolded JS project… Anyhow, this helps you at least be aware of the possible attack surfaces your project has. This list is just a part of the entire scan, run on some old example code from a year or two back. And it was loooong. These tools support a multitude of other platforms, so feel free to check them out and experiment with them.
Hopefully, I have raised some eyebrows. So when the next pull request lands on your table and you see a newly defined dependency, ask some of these questions. Why? Can it be copied? Etc. And next to that, run some of the tools above to help you see the current state of your project.
See you next time, crypto miners.