Is there any asymmetrically encrypted file system?
Short concise answer
To the best of my knowledge, such a scheme is not available in a general purpose filesystem in a general purpose operating system using full block level RSA-only non-hybrid crypto. This is because operations like RSA in software are generally slower than their symmetric counterparts and require a lot of implementation extras, which are considered unnecessary when userland applications more than suffice for this purpose. The linux kernel and other kernels are concerned with supporting a wide range of general purpose applications in user space, so any non-necessary to your average general purpose user feature doesn't generally get included. However, this is not to say you cannot do it. People have implemented web servers in kernel space. You could do this if you wanted to.
Just to cover all my bases; obviously the kernel optionally supports a lot of hardware including some which may provide hardware assisted cryptography. In this case, your life is made considerably easier.
A hybrid scheme which PGP/TLS etc implement has a weakness in that the symmetric key is by nature read-write on the encrypted data. If you can intercept this, you can read the data. This is hard to do, no doubt, but possible. In a filesystem, your life is made easier because the read/write system calls will contain that unencrypted data in their buffer, which you can intercept, if you are root.
Long winded answer
I believe you're asking why on earth given that public key cryptography allows you to technically implement this form of access control (write in, no read out) why on earth it hasn't been done. Yes, I am trying to answer why you don't do block-by-block RSA. So, in two sections:
On why RSA hasn't been included in the kernel
You might be surprised to know there isn't an implementation of RSA in your linux kernel (or probably in the Windows kernel - I'd bet that it is in userspace).
Firstly, let's look at what is in the kernel: here's the crypto folder from Linus's main tree. Not an asymmetric algorithm in sight. Remember, we're still discussing filesystems here.
Next step, how do we implement RSA. At a register level, a x64 (as in, AMD64/EM64T compatible) processor can hold exactly 64 bits per register. This is why you have types like uint32_t
and uint64_t
in stdint.h
and why integer overflow occurs.
If you're not quite following why this is at all relevant, well, an RSA key requires that you produce a modulus of a certain bit length; 2048 is recommended I believe although factoring 1024 is going to be hard enough (search Thomas Pornin's posts for a discussion on factoring both here and on crypto). So let's say 2048 bits. You might have noticed that in a single register, there's no way we can actually store that number. So we're a little stuffed for loading it into a register and manipulating it then, aren't we?
Ok so how then do we process integers of such a size as we require for RSA. Chain lots of 64-bit blocks together is basically the answer. x86 processors support operations such as adc
which is short for add with carry. What happens when you add two registers who have the most significant bit set together and will therefore overflow is that the carry flag is set and the next adc
you run will have an additional 1 added to the least significant bit.
This very fact allows us to start approaching arbitrary precision arithmetic. Now, all we need is 32 64-bit blocks of memory to store one side of our multiplication and the result and a further x
lots of 64-bit for the other number. That gets us to actually generating our modulus.
Let's speed this theory up a bit or we'll be here all night. Addition is basically O(N) whichever way you look at it in a bignum library. Things like multiplication and exponentiation can be reduced to O(N) if one of the operators is only a single limb (64-bit chunk; limb is the bignum term for it); however, for two arbitrary numbers operations such as multiplication and division are always worse. Monumental effort has gone into speeding that up. The best summary that I know of of algorithms and their runtimes is available here: Modern Computer Arithmetic. However, I'd say it's actually highly theoretical, algorithm-listy and that Seminumerical Algorithms by Don Knuth contains a more accessible introduction.
Now, bignum operations are fast. Not just fast, like stupidly fast. Amazingly fast. Loop-unrolled hand optimised per platform assembly fast. GNU MP is widely regarded as the fastest general purpose bignum library on the planet and is used by Mathematica, Maple, Pari/GP and everyone else who does arbitrary precision arithmetic. It turns out neither GPG or OpenSSL use this in favour of their own bignum implementations, but that's their choice.
In contrast, AES, once you expand the SBox and the key expansion, is really two things: a hash table you never add to (the SBox)(and is therefore always O(1)) and a lookup table for the key expansion (also O(1)) and some xor-ing.
So, I think it's fair to say relative to AES, RSA is pretty slow, right? It's more computationally expensive per operation.
Edit I've just woken up with another reason in my mind as to why you don't do this. See this link for this:
Linux kernel code and data are not swappable and are never moved to swap.
So now you're forcing physical memory to always contain your bignum, RSA implementation and keys, leaving you less room for userland applications, forcing more swapping.
These are huge problems for the kernel. The kernel's job is to be as efficient as it can be. Very simple. This actually adds several other restrictions to kernel development, like having a fixed stack size and the fact that most of libc is considered excessive for kernel space. You also shouldn't use floating point operations in the kernel either. Let's be clear on what I'm saying: it is entirely technically possible to implement RSA in the kernel, but on a practical level, nobody does it because it is better off in userland, for which all these efficiency saving sacrifices are ultimately made.
RSA is also poorly suited to a block layer storage - before anyone starts, it is not obvious necessarily that this is a bad idea. Let's actually read the discussion I linked to last time. Directly quoting Thomas's answer there:
Nobody in their right frame of mind does such RSA-with-CBC encryption. Among the reasons are:
- the construction has not received sufficient scrutiny to be deemed secure;
- the size overhead implies that the encrypted message will be about 9.4% longer than the plaintext message;
- RSA encryption is not very fast (a modern CPU would still succeed in encrypting, say, 2 MBytes worth of data per second that way).
This last reason is the most often quoted, but, in my opinion, the least compelling of the three.
I have dealt mainly with the last issue because that's the most obvious drawback, but assuming that's not a problem and you're happy with being relatively slow on a block level, then you've also got the fact that there's a large overhead per byte you want to store. Not a massively efficient use of your storage device.
I caveat all of this with if you want to do it, it is technically possible. There is nothing stopping you doing this at all. I'm only trying to actually explain why it hasn't been done.
So, that's RSA-per-block out of the way. As everyone else has got around to yelling, the alternative is to use RSA for the small amount of really valuable data - an AES or other symmetric key. This works. This is how TLS works. This is how PGP/GPG/OpenPGP and just about every other cryptographic protocol on the internet works. The idea is that you use slow RSA to encrypt the key, transmit it to the recipient (the only person who can decrypt it) and then you both have the symmetric key and can do high volume data transfer with a significantly reduced overhead, assuming correct usage of all the algorithms.
On why crypto and keys on the same OS is ok, but not perfectly safe
So, now let's talk about operating systems. Your computer is like a house with housemates in it and a live in landlord. The landlord is the kernel, the housemates are programs. Now, to achieve your system, Bob is you and wants to make sure this happens. So, he buys two safes and instructs Alice and Carroll to manufacture padlocks and also buy a safe each. He, meanwhile, invents a combination for one of his safes, sets it and puts the safe in the hallway. Now, to protect that combination, he writes it down and asks each of Alice, Carroll (and himself) to put a copy of that combination in their own safe and lock each of their own safe with their own padlock. So, to open the safe in the hallway, Alice must first open her own safe, to which only she has the key, then read the combination, lock her safe and go and open the safe in the hallway.
Eve is an evil housemate and is very angry nobody has told her what's in the safe. She can't enter anyone else's room, but she can persuade the landlord to do it, because he can go anywhere. She only needs to:
- Observe the making of one of those padlocks.
- Observe the code being written down for the safe in the hallway.
- Observe said code being used, or whatever is in the safe.
Simple as that. The best bit from my original answer was this:
The battle for your OS is theoretically lost as soon as someone can get to root, or Administrator, simple as that
Now, by contrast, a network service works like this. Bob wants to post a message to George. So, he gets George to send him a padlock in the post. George lives with a landlord they both trust totally. So Bob sets the combination for the safe in the hallway and writes it down, puts it in his safe, puts Goerge's padlock on it and gives both to the postman. He doesn't trust the postman one bit, but he's not going to invite the postman in any time soon, just up to the door. Unless something he does on the doorstep gives him away, or unless there's a problem with the safe or the padlock, or unless the postman can persuade him to reveal anything, his message should get to George safely.
That is what I was trying to get to. Even if you use GPG, or a userland-based filesystem such as FUSE (totally possible) you must still trust the system you're making those keys and doing that encryption on. Otherwise, the best you can hope for is fairly hard to intercept. You can use public key cryptography and symmetric cryptography together but they're not achieving what you think they are - even disk encryption as access control is worthless vs someone with physical access to your system; the only reason to use it is because it makes forensic analysis of your hard disk 6 months after you've chucked it away a lot harder.
Cryptography only works so long as the key and the data are separate. It's that simple. Which is why TLS across the internet works (because an attacker doesn't have the key) but crypto locally, where the key is going, at some point, to touch the system, doesn't. So, by all means implement an encrypted filesystem and protect the keys used to encrypt it with asymmetric keys. This is the most secure you can get with such a scheme, but:
- Your requirement for write in all, read out one user will be enforced by the OS in terms of access control. There exists no current cryptography capable of doing this unless you do everything block-by-block RSA, which is slow and intensive relative to the hybrid implementation.
- Nothing you can possibly do will ever be totally, 100% immune to root/Administrator/the guy who walks by your desk every day.
A write-only file system is doable, even cryptographically, but it has little extra functionality over what you can already do with, say, GnuPG.
It would take some effort at defining precisely what you want. There are many behavioral options:
- Should it be possible for anybody to see the files metadata (name, length, date of last modification...) ?
- If files can be deposited, who can erase and/or replace them ? Anybody, only the directory owner, or both the owner and the file sender ?
- Should it be possible to append data to an encrypted file ?
- What about subdirectories ?
Having the system as a kernel-powered filesystem makes sense only if you want to be able to write the files through the normal API for files (open()
, write()
...), which in turn constraints the answers to the questions above. Supporting the "normal API" is for the benefit of applications which are unaware of your specific auto-encrypting filesystem. On the other hand, if you accept that users who wish to deposit a file in such a directory may have to invoke a specific command, then there is little reason to implement the feature in the kernel. A purely userspace solution would work equally well, and probably better since userland programming has less constraints than kernel programming. An existing userland solution already exists, and is called "e-mail", with GnuPG.
Also, if root is your enemy then you are doomed. Root is almighty on a Unix system; he can change the kernel as he sees fit, and thus activate a key logger, or just obtain a copy of every byte which is written to a file. On the other hand, if root is not an enemy, then normal non-crypto security features (e.g. file access rights) of the system ought to be enough to enforce data confidentiality; see the other responses for that.
With cryptography. If you don't want to rely upon filesystem permissions, you can solve this problem straightforwardedly with public-key encryption.
Details. Here's how. Radia, the recipient of all these messages, can use GPG to create a public-key keypair. Radia should then communicate her public key to everyone: for instance, she might store the public key in some read-only file that everyone can read. Radia should store her private key safely in a file that only she can read.
To send a message, Alice can import Radia's public key into her GPG keyfile, then encrypt her message to Radia's public key using GPG, and send the encrypted file to Radia. It doesn't really matter how the encrypted file gets to Radia; it could be transmitted by cleartext email, stored in a world-writeable directory, or any other insecure medium.
To receive the message, Radia can use GPG to decrypt it (using Radia's private key).
Only Radia will be able to decrypt the message. Bob doesn't know Radia's public key, so even if Bob intercepts the encrypted file, Bob can't read the message that Alice was trying to send.
Discussion. This is all a totally straightforward application of public-key cryptography. Rather than implementing public-key crypto yourself, I recommend that you use some existing, time-tested, heavily-vetted file encryption software. GPG comes highly recommended for this purpose. You could also use PGP's commercial product, if you prefer to pay money for it. S/MIME-encrypted email would also be a perfectly reasonable implementation. This solution doesn't require any special support from the filesystem; instead, the security is implemented at the application layer.
Caveat. Root will be able to read all files and read all messages. You said you don't want this. However, what you want is unachievable. On any existing operating system, if root is malicious, root can spy on everything that every user does. (You said "Let's assume the system is secure against keyloggers", but there is no way to do that, if root is the one trying to do keylogging.) Therefore, this achieves your goals to the extent that they are achievable.