Recent ESLint hack, or: how can we protect ourselves from installing malicious npm packages?
Three words: Supply Chain Management. Except that in our case the "supply" is dependencies, i.e. third-party libraries.
This isn't a problem unique to npm; it's a general problem in software development. It's the equivalent of googling for a manufacturer of screws, picking the first one you find, never asking about the specs, and then using those screws and hoping they do the job, don't break, don't erode or rust too fast, and can handle the strain... without ever having checked whether they've been tested or gone through proper quality control.

We can get away with this because usually nobody dies if your software is buggy or vulnerable. If a structure collapses because the screws were defective, that's a different story: there's accountability there. There's no such accountability in software, except in a few industries where lives are actually at stake when software bugs happen, or where huge and expensive machinery breaks (and possibly endangers lives) if your control software is buggy. And I mean that dead seriously. You can't just use a third-party library without rigorous checking on an embedded device controlling dangerous machinery, where you must uphold real-time communication to control some process and everything goes boom if the third-party library produces a CPU fault because it's buggy... because then your machine gets damaged for real.
The problem with software development is that we "google" for "some library" providing "some functionality", and then we just "npm install", "go get", or "pip install" it and we're happy. There's no verification process in there. How do you know the package provides what it claims? Do you know how well tested it is? Do you know if it's still actively maintained? Do you know what the API stability guarantees are? You can "go get" some package and find it completely broken a week later, because even official-looking Go code might not come with any API stability guarantees.
What version constraints do you use? `foo >= 1.5`? What if `foo 1.6` introduces a backdoor? Somebody compiling your code with the newest versions will have that backdoor in it. What if you pin `foo == 1.5`, but there are different mirrors, and the mirror your colleague uses to build your software contains a malicious version of the package? And don't forget: if you install/use a package/library, you not only need to verify that package/library... you need to verify the whole dependency tree.
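To make the floating-range pitfall concrete, here is a minimal sketch of npm's default caret (`^`) range semantics. This is not the real `semver` library; it only handles plain `x.y.z` versions with a major version of 1 or more, but it shows how a range written once keeps matching releases published later:

```javascript
// Minimal sketch of npm's caret (^) range semantics, to show why a
// range like "^1.5.0" happily pulls in a later (possibly backdoored)
// 1.6.x release. Simplified: assumes plain x.y.z versions, major >= 1.
function parseVersion(v) {
  return v.split(".").map(Number);
}

function satisfiesCaret(range, version) {
  const [maj, min, pat] = parseVersion(range.slice(1)); // drop leading "^"
  const [vMaj, vMin, vPat] = parseVersion(version);
  if (vMaj !== maj) return false;      // caret pins only the major version
  if (vMin !== min) return vMin > min; // any later minor version matches
  return vPat >= pat;                  // same minor: later patches match
}

console.log(satisfiesCaret("^1.5.0", "1.6.0")); // true: 1.6.0 is pulled in automatically
console.log(satisfiesCaret("^1.5.0", "2.0.0")); // false: major bump is excluded
```

So a dependency declared as `"foo": "^1.5.0"` silently upgrades to whatever 1.x the registry serves at install time, which is exactly the window the attack above exploited.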
What we'd need is a certification process. Once a company/developer thinks their package is ready, they publish it; then independent code-review and security companies review the package, and once there's a reasonable degree of confidence that it is "secure" to use, they sign it. Then you're only allowed to use dependencies that are signed, and you use hashes in addition to versions, such as `foo == ("1.5", "abc734defef373f..")`.
How do you apply this to npm?
Don't install npm packages without review. Who's the author? What's the general code quality? When was it last updated? What were the last few commits? What does its install script do? Does it have proper tests? What's the code/test coverage? How many people are using it? Who's using it?
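Some of those checks can be scripted. Below is a sketch of a pre-install review helper that inspects registry metadata, roughly the shape `npm view <pkg> --json` returns, though the flat field names and the thresholds here are simplifying assumptions, and flags common red flags:

```javascript
// Sketch: flag red flags in (simplified, assumed-shape) registry
// metadata before installing a package. Thresholds are arbitrary.
function reviewPackage(meta, now = Date.now()) {
  const flags = [];

  // Any lifecycle script runs arbitrary code at install time.
  const scripts = meta.scripts || {};
  for (const hook of ["preinstall", "install", "postinstall"]) {
    if (scripts[hook]) {
      flags.push(`runs a ${hook} script -- read it before installing`);
    }
  }

  // Staleness: no release in over two years suggests abandonment.
  const TWO_YEARS_MS = 2 * 365 * 24 * 3600 * 1000;
  if (now - Date.parse(meta.lastPublish) > TWO_YEARS_MS) {
    flags.push("no release in over two years -- possibly unmaintained");
  }

  if (!(meta.maintainers || []).length) {
    flags.push("no listed maintainers");
  }
  return flags;
}
```

A tool like this doesn't replace reading the code; it only tells you where to look first.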
Software development is focused on shipping updates and new things, then more updates and more new things. Proper testing and proper third-party code management cost A LOT, and usually neither the companies nor the consumers are willing to pay that price. After all, it's always risk vs. cost. If the cost of a security vulnerability is some users having their credentials stolen... nobody cares. They can't sue you; at worst it costs you a bit of public image. If the worst case is buggy software that doesn't work for some users, you can afford a few pissed-off users who were using the free version anyway. This would probably change if there were such a thing as warranty and accountability for software bugs (which kind of exists when you pay someone to write something for you), but since a lot of software is open source, which comes with exactly zero warranty/guarantees (that's actually one major drawback), or software that you're using for free, this is just not going to happen for that kind of software.
This problem also exists for third-party JavaScript on your website. Same thing. The problems are almost completely the same regardless of what language, frameworks, or technology you use: if you want to be secure, you need to verify third-party code.
One other thing:
There are packages/libraries that provide only a little utility... things you could've written yourself in 3 hours, maybe a day. This isn't a problem if you don't mind more dependencies (and more dependencies complicate your dependency management), but if you do, it's sometimes actually less effort to skip such simple libraries and write the functionality yourself, so that you have full control over it and can properly test it. The rule of thumb: if it takes you longer to verify someone else's code than to write it yourself, write it yourself. This also applies if you only need one or two functions from a whole framework. It's much easier to properly check your own versions of two functions than to include and verify a whole framework, because frameworks themselves have dependencies, and you'll have to verify those as well.
Other than that: you could block traffic to non-whitelisted websites during the installation process, which means this exact attack wouldn't work. But that doesn't help against malicious install scripts that just delete random files or insert backdoors into existing files. You'll still have to carefully read through the installation scripts.
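npm does ship a real switch for the install-script vector: `--ignore-scripts` on the command line, or the equivalent setting in `.npmrc`, disables lifecycle scripts entirely. A minimal config sketch (note this also skips legitimate build steps, e.g. for native addons, so you may have to run those by hand afterwards):

```ini
; .npmrc -- refuse to run preinstall/install/postinstall scripts
ignore-scripts=true
```

This neutralizes the "my postinstall script phones home" class of attack at install time, but of course not malicious code in the package itself, which still runs when you require it.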
Supply chain management is the right answer in theory. However, despite the efforts of commercial entities like Snyk and many others, there is no solution to this problem in the Node ecosystem in practice.
Node's supply chain record, with no disrespect intended to the numerous folks working to make it better, is uniquely awful. King of that infamous mountain used to be WordPress modules, but it's been Node for at least the last few years. Security fails are just a cost of doing business.
In theory you can:
- read all the code of dependencies before you install them
- read all the code of their dependencies
- install in sandboxes with server whitelists
- whitelist dependencies, requiring a bake period of weeks or more
- bring in vulnerability feeds, scanners and analyzers
- assume node engineer workstations are always compromised, use secrets management and don't allow them to see passwords, or directly access production systems or data
- have a CI/CD that can find and redeploy all applications consuming a specific dependency when fixes are announced
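The last bullet can be approximated with a small script: when an advisory lands, walk every repo's lockfile and report which applications pull in the bad release. A sketch against the older v1-style `package-lock.json` format (nested `dependencies` objects; assumed here), using the hijacked `eslint-scope` 3.7.2 release from this incident as the example target:

```javascript
// Sketch: walk a package-lock.json dependency tree (old nested
// "dependencies" format assumed) and report every path that pulls
// in a known-bad release.
function findBadRelease(lock, bad) {
  const hits = [];
  const walk = (deps, path) => {
    for (const [name, info] of Object.entries(deps || {})) {
      const here = path.concat(name);
      if (name === bad.name && info.version === bad.version) {
        hits.push(here.join(" > "));
      }
      walk(info.dependencies, here); // recurse into nested deps
    }
  };
  walk(lock.dependencies, []);
  return hits;
}

// Example target: the hijacked release from this incident.
const BAD = { name: "eslint-scope", version: "3.7.2" };
```

A CI job can run this across every lockfile it knows about and trigger redeploys only for the applications that actually hit.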
Or you can get some work done and hope for the best until the ecosystem improves (which it will, eventually).
Basically the issue here is that third-party software tried to steal private information and send it out over the network. This issue is not unique to npm; any software running as your user could do the same, since there's nothing to stop it from reading your user's data.
As a line of defense against such an attack, you might consider using an outbound firewall.
There are numerous such products, mostly commercial. With the fairly well-known Little Snitch (with which I am not affiliated), if the Node process attempts to connect to a host the firewall has not been configured to allow, the user is prompted to decide whether that connection should be established.
Naturally, if you saw such a prompt while installing an npm module, it should raise some red flags, and you would be able to stop the attack at that point by denying access until you audit the source of the connection.
Obviously this is no substitute for auditing the source code, but it does offer an extra layer of defense against rogue software trying to exfiltrate your private data system-wide, and it would likely be very effective in stopping an attack like this one (assuming you don't give the binaries the attack uses overly generous permissions).