What does yarn --pnp do?
I designed and implemented PnP, so I can talk for hours about it 🙂
tl;dr: We only write the `.pnp.js` file and the `.pnp` folder (on top of the regular Yarn cache). We don't store configuration anywhere else.
Without Plug'n'Play
When you run `yarn install` (even without PnP), a few things happen:
- If you use the offline mirror feature, we download the tarballs from the registry and store them within the offline mirror folder
- Regardless of whether or not you use the offline mirror, we unpack all the tarballs downloaded and store their files in the Yarn cache
- We then figure out which files from the cache should be copied into which location in the `node_modules` folder
- We apply the computed changes (a bunch of `rsync` operations, basically)
With Plug'n'Play
With PnP, the workflow becomes like this:
- No changes: we download the tarballs from the registry into the offline mirror (if enabled)
- No changes: we still unpack them into the Yarn cache
- We generate a `.pnp.js` file¹
And that's it. No file is generated other than the `.pnp.js` file (and the cache, but that was already there before).
¹ As you mentioned, we also generate a `.pnp` folder (`.yarn` as of Yarn 2) in the project. This folder is meant to contain two types of data:

Unplugged packages are packages that must be local to the project. Typically, those are the packages with postinstall scripts (we cannot store them in the cache, as the generated artifacts might differ from one project to another).

Virtual packages are symlinks created for each package in your dependency tree that lists peer dependencies (those files don't exist anymore as of Yarn 2 🙂). Without going into the details, they are a necessary part of the design, and are required to make `require.resolve` work as before.
How does it work?
The `.pnp.js` file contains information similar to the following:
```
webpack@1.0.0 -> /cache/webpack-1.0.0/
  -> it depends on lodash@1.0.0

lodash@1.0.0 -> /cache/lodash-1.0.0/
  -> no dependencies
```
With this information, the resolver can correctly infer that when a file within `/cache/webpack-1.0.0` makes a require call to `lodash`, the required files must be loaded from `/cache/lodash-1.0.0`. It's a bit more complex in practice (we keep an inverse map for improved performance, we use relative paths to ensure portability, etc.), but the general concept is there.
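As a rough sketch of that lookup: the data layout and function names below are illustrative, not Yarn's actual `.pnp.js` format (the real runtime exposes a `resolveRequest` function through the `pnpapi` module), but the shape of the logic is the same.

```javascript
// Illustrative, simplified stand-in for the data stored in .pnp.js.
// The real file also records version ranges, relative paths, etc.
const packageRegistry = new Map([
  ['webpack@1.0.0', {
    location: '/cache/webpack-1.0.0/',
    dependencies: new Map([['lodash', 'lodash@1.0.0']]),
  }],
  ['lodash@1.0.0', {
    location: '/cache/lodash-1.0.0/',
    dependencies: new Map(),
  }],
]);

// The "inverse map" mentioned above: from a location on disk back to
// the package that owns it, so we know which package issued a require.
const locationToLocator = new Map(
  [...packageRegistry].map(([locator, info]) => [info.location, locator]),
);

// Given the file issuing the require and the requested package name,
// return the folder the required files must be loaded from.
function resolveRequest(issuerPath, request) {
  const issuerLocation = [...locationToLocator.keys()]
    .find(location => issuerPath.startsWith(location));
  const issuer = packageRegistry.get(locationToLocator.get(issuerLocation));
  const dependency = issuer.dependencies.get(request);
  if (!dependency)
    throw new Error(`${request} isn't declared as a dependency of the issuer`);
  return packageRegistry.get(dependency).location;
}
```

So `resolveRequest('/cache/webpack-1.0.0/lib/index.js', 'lodash')` returns `/cache/lodash-1.0.0/`, and a require for an undeclared package throws instead of silently reaching into a hoisted folder - which is also why PnP surfaces missing-dependency bugs that `node_modules` hides.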
Bonus round: With Plug'n'Play+Zip loading (Yarn 2)
With Yarn 2, we're about to improve this workflow even more. This is what it will look like:
- We download the tarballs from the registry and store them in the cache (no more distinction between offline mirror and cache - they are the same)
- We generate the same `.pnp.js` file as before
And that's it! As you can see, we don't unpack the packages anymore (instead, we use a Node loader to read them from the package archives at runtime).
Doing this has a very interesting property: if both your cache and your `.pnp.js` file are there, you don't need to run `yarn install` for your application to work! And to ensure you have those files, you just have to add them to your repository and version them as you would everything else.²
It's very useful, as you don't need to remember to run `yarn install` after `git rebase`, `git pull`, or `git checkout`, and your CI systems become faster and more stable since they don't need special setup - just clone your application and it'll just work.
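Concretely, such a "zero-install" repository versions the Yarn files instead of ignoring them. A `.gitignore` sketch - the exact paths are an assumption based on the Yarn 2 layout, so check them against the Yarn docs for your version:

```
# Keep the cache and PnP data in the repository (zero-install setup).
# Assumption: Yarn 2 folder layout; adjust to your version.
.yarn/*
!.yarn/cache
!.yarn/releases
!.yarn/plugins

# The .pnp.js file lives at the project root and is committed as-is,
# so nothing needs to be ignored for it.
```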
² Before someone mentions it - checking binary files into a repository is perfectly fine. The reason `node_modules` was a very bad thing to check into your repository was the sheer number of text files, which put a huge strain on Git - technically, but also philosophically, as code reviews became impossible.
In the case I described we don't suffer from the same problem, because the number of files is constrained (exactly one file for each package), and reviewing them is very easy - in fact, it's better in that you can clearly see how many new packages a PR adds to your project!
It imports only the parts of a package you are going to use, making the bloated `node_modules` folder much, much leaner.
Think about, for example, having relatively big libraries like lodash or ramda when you use only 4-5 functions from them: how much you could save by pulling in only the minimum that's actually used.
I believe it is not yet 100% stable, but it's still a nice option to keep on your radar :)