Confused by "groups" and the Linux "permission model"
But I don't really get the concept of a group. From what I've gleaned a group is just a bunch of users. (But does that mean a user can belong to multiple groups?)
Yes, and yes.
A group as I understand is basically meant to make it easier to set/unset rwx permissions wholesale for a bunch of users (i.e. the group).
One way it makes setting permissions much easier is when a group needs access to many files or folders scattered around the system.
For example, whenever a new user joins, you would otherwise need to find all those files one by one and try to guess things like "if users A,B,C have access, then user D should be added". But if those permissions simply list a group name, then you don't need to update them at all – instead you just add the user to a group, done.
(It's not limited to just file permissions – some system services may also be configured to grant access to a group instead of individual users, with the same advantages as here.)
Does that mean that in addition to a file having an owner (i.e. the userid of the person who created the file) there is also a "group owner" for that file? And is this group typically one of the groups the file-owner belongs to?
Yes. Each user has a "primary group", and newly created files belong to that group. However, even non-root users can use chown/chgrp to reassign their own files to any group that they currently belong to.
(There's an exception: If the directory has the 'setgid' bit set, then newly created files in it inherit the directory's group, not the creator's. This is closer to how Windows NTFS works by default.)
Of course, this "group owner" system is a bit limiting when files can only have one group at a time. See the next section about that.
What if I have three groups A,B,C on my system and want to set permissions rw-, -wx, r-x respectively on the system?
Then you use another feature called "ACLs" (access control lists), which – as the name implies – allows you to specify an arbitrary list of users and groups to give access to.
Linux supports the POSIX ACL format, which is mostly a straightforward extension of the existing model. That is, if you first rewrite the existing permissions as:
user::rwx, group::r-x, other::---
now you can use setfacl or chacl to add your three additional groups as:
group:Family:rw-, group:Friends:-wx, group:Coworkers:r-x
Note to avoid confusion: POSIX ACLs try to remain as compatible as possible with traditional chmod, but this leads to a surprising feature. As soon as you add ACLs to a file, the "group" field in ls -l will instead start showing something called a 'mask', and a command like chmod g-w will change that mask, removing write access from the owning group and from every named user and group entry in the ACL, not just from the "owner group".
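The mask behaviour can be modelled in a few lines. This is only a sketch of the rule, not how the kernel stores ACLs: the entry names mirror the example above, the helper names (parse, show, effective) are made up for illustration, and the starting mask of rwx assumes the mask was computed as the union of all group entries.

```python
# Permission bits as small integers: r=4, w=2, x=1.
R, W, X = 4, 2, 1

def parse(perms: str) -> int:
    """Turn a string such as 'rw-' into its bit value (here 6)."""
    return sum(bit for ch, bit in zip(perms, (R, W, X)) if ch != "-")

def show(bits: int) -> str:
    """Turn a bit value back into an 'rwx'-style string."""
    return "".join(ch if bits & bit else "-" for ch, bit in zip("rwx", (R, W, X)))

def effective(acl: dict, mask: int) -> dict:
    """Apply the mask to every entry except 'user::' and 'other::'."""
    unmasked = {"user::", "other::"}
    return {
        entry: show(bits if entry in unmasked else bits & mask)
        for entry, bits in acl.items()
    }

acl = {
    "user::": parse("rwx"),
    "group::": parse("r-x"),
    "group:Family": parse("rw-"),
    "group:Friends": parse("-wx"),
    "group:Coworkers": parse("r-x"),
    "other::": parse("---"),
}

before = effective(acl, parse("rwx"))  # mask right after adding the entries
after = effective(acl, parse("r-x"))   # mask after something like chmod g-w

for entry in acl:
    print(f"{entry:16} {before[entry]} -> {after[entry]}")
```

In this model the Family entry drops from rw- to r-- and Friends from -wx to --x, while the file owner's own rwx is untouched.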
Why does Linux, or even Unix, use the 'owner/group/other' categorization if it could just use ACLs instead? It does because this simple categorization predates ACL support by decades.
Unix originally went for the simple approach, as most other operating systems did at the time, whether due to disk space constraints (permission bits fit in just two bytes), a deliberate design decision, or both (Multics may have had elaborate ACLs at the time, but plenty of things in Unix were intentionally simplified).
Eventually the APIs became set in stone – new ones could be added, but the existing "chmod" could not be changed, because programs already expected it to work in a certain way. (OpenVMS too had to keep its similar permission-bit system even after adding ACLs.)
In addition to that, unfortunately it's the only system cross-compatible between all of the Unix-like operating systems. Some other Unixes (e.g. FreeBSD, Solaris) may use a quite different ACL format, and yet others (OpenBSD) have no ACL support at all. Compare also to Windows, where all file protections are ACL-based.
The concept of Linux/Unix groups can be confusing. But let's try to unpack that.
Files and directories have both an owner and a group (or a "group owner" as you put it). They also have three sets of rwx permission bits: one for user, one for group and one for other. Additionally, they have three more permission bits: setuid, setgid and sticky. The user and group of a file or directory are stored internally as a UID and a GID, which are unsigned integers that serve as identifiers for users and groups.
Users in the system have a UID and a GID (typically set in the /etc/passwd file); the GID from that file indicates the user's primary group. Additionally, a user may belong to more groups (typically configured in the /etc/group file, which lists additional users for each group in the system).
You can check your user's UID, GID, primary group and additional groups with the id command, which will list all this information for the user running the command.
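The same information is available to programs. As a rough sketch for Unix-like systems, Python's standard os, pwd and grp modules can collect roughly what id prints; the function names here are made up for illustration, and the fallbacks cover IDs that have no entry in the user or group database.

```python
import grp
import os
import pwd

def group_name(gid: int) -> str:
    """Look up a group name, falling back to the number if it has no entry."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return str(gid)

def describe_current_user() -> dict:
    """Collect the UID, primary GID and supplementary groups of this process."""
    uid, gid = os.getuid(), os.getgid()
    try:
        user = pwd.getpwuid(uid).pw_name
    except KeyError:
        user = str(uid)
    return {
        "uid": uid,
        "user": user,
        "gid": gid,
        "primary_group": group_name(gid),
        # Supplementary groups; the primary group may or may not be listed here.
        "groups": {g: group_name(g) for g in sorted(os.getgroups())},
    }

info = describe_current_user()
print(info)
```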
When you try to access a file or directory, the system validates your access based on the permission bits. It starts by deciding whether to use the user, group or other bits. If your UID exactly matches the UID of the file's owner, then the "user" bits are used. Otherwise, if either your primary group or any of your additional groups (as reported by id) matches the group of the file, then the "group" bits are used. If neither matches, the "other" bits are used.
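The selection logic can be written down as a small model. This is a simplified sketch rather than the kernel's actual code: it ignores root's override and ACLs, and the function names and numeric IDs are invented for the example.

```python
def select_class(proc_uid: int, proc_gids: set, file_uid: int, file_gid: int) -> str:
    """Pick which set of permission bits applies; the first match wins."""
    if proc_uid == file_uid:
        return "user"
    if file_gid in proc_gids:  # primary or any additional group
        return "group"
    return "other"

def may_access(mode: int, proc_uid: int, proc_gids: set,
               file_uid: int, file_gid: int, wanted: int) -> bool:
    """Check wanted bits (r=4, w=2, x=1) against the selected class only."""
    shift = {"user": 6, "group": 3, "other": 0}
    cls = select_class(proc_uid, proc_gids, file_uid, file_gid)
    granted = (mode >> shift[cls]) & 0o7
    return wanted & granted == wanted

# A file with mode 604 (rw----r--) owned by UID 1000, group 100.
print(may_access(0o604, 1000, {100}, 1000, 100, 4))  # owner reads
print(may_access(0o604, 1001, {100}, 1000, 100, 4))  # group member reads
print(may_access(0o604, 2000, {200}, 1000, 100, 4))  # unrelated user reads
```

Note the middle case: only one class is ever consulted, so a member of the file's group is refused here even though "other" has read access.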
The meaning of permissions for files is fairly straightforward: r means you can open the file for reading, w means you can open the file for writing (and thus modify its contents) and x means you can run the file as an executable (whether it's a binary or a script).
For directories, it's a little more subtle. r means you can list the files in that directory (for example, with ls /path/to/dir), and w means you can create new files in that directory (or delete existing files from it). But you need x to be able to access any of the files in that directory: if you don't have x on a directory, you can't cd to it and you can't open files inside it, even if you know they exist. (This allows for quirky setups: with r but without x, you can list the filenames but can't open any of the files, while with x but without r you can open files in the directory only if you already know their names, since you can't list them.)
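The combinations can be summarised as a small lookup. This is a sketch of the rules only (the function and key names are made up), and it adds one detail not spelled out above: creating or deleting entries needs x as well as w, because the directory has to be traversed to change it.

```python
def directory_abilities(bits: int) -> dict:
    """Map a directory's rwx bits (r=4, w=2, x=1) to what they allow."""
    r, w, x = bool(bits & 4), bool(bits & 2), bool(bits & 1)
    return {
        "list": r,          # see the file names, e.g. with ls
        "traverse": x,      # cd into it, open files whose names you know
        "modify": w and x,  # create or delete entries
    }

for bits in (0o4, 0o1, 0o5, 0o2, 0o7):
    print(f"{bits:o}: {directory_abilities(bits)}")
```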
Assuming you have permissions to create a new file in a directory, a new file that you create will have your user as the owner and by default it will have your primary group as its "group owner". But that's not always the case!
Remember I mentioned the setgid bit earlier on? Well, if a directory has the setgid bit set (you can set it with chmod g+s /path/to/dir), then new files created in that directory inherit the group of the directory itself, rather than the primary group of the user who creates them. Furthermore, if you create a new subdirectory under such a setgid-enabled directory, the subdirectory will also have the setgid bit enabled. (That is necessary in order to preserve the group-inheritance property for the whole subtree.)
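As a sketch of that inheritance rule (a model of the Linux behaviour described above, not kernel code), the group and mode of a newly created entry could be derived like this. The function name and the numeric GIDs are invented, and the umask is left out to keep the example focused.

```python
import stat

def new_entry(dir_mode: int, dir_gid: int, creator_gid: int,
              requested_mode: int, is_dir: bool) -> tuple:
    """Return (gid, mode) for an entry created inside a directory."""
    if dir_mode & stat.S_ISGID:
        gid = dir_gid  # inherit the directory's group
        # Subdirectories also inherit the setgid bit itself.
        mode = requested_mode | stat.S_ISGID if is_dir else requested_mode
    else:
        gid = creator_gid  # fall back to the creator's primary group
        mode = requested_mode
    return gid, mode

# Directory owned by group 500; the creator's primary group is 1000.
plain = new_entry(0o775, 500, 1000, 0o664, is_dir=False)
shared_file = new_entry(0o2775, 500, 1000, 0o664, is_dir=False)
shared_subdir = new_entry(0o2775, 500, 1000, 0o775, is_dir=True)

print(plain, shared_file, shared_subdir)
```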
Setting the setgid bit on directories is quite useful for implementing shared directories. We'll get to that shortly.
One more note of interest is that the Unix systems in the BSD family (such as FreeBSD, NetBSD, OpenBSD) always behave as if the setgid bit were set on a directory. There, the primary group of a user is somewhat less meaningful, since being the group assigned to newly created files is typically its most visible role.
Another concept of interest is the "umask", which is a set of bits that gets "masked" when a new file or directory is created. You can check your umask in the shell by using the umask command, and you can also use that command with an argument to modify the current umask. Typical values are umask 002, umask 022, umask 027, etc.
The bits in the umask refer to the rwx bits, and the three octal digits map to the user, group and other bits in a permission mode. So umask 002 preserves all the bits for user and group (0 means no masking), while blocking the w bit for other (2 is w). It keeps files writable by user and group, but only readable by others. umask 027, on the other hand, leaves files writable only by the user, readable/executable but not writable by the group, and inaccessible to others (7 means masking all of rwx).
The umask is used every time a new file is created. Applications typically specify the permissions they would like, normally in the most liberal way, so that the umask can restrict that to more realistic permissions. For instance, normal applications will ask that files are created with 0666 (rw-rw-rw-) permissions, expecting the umask will drop at least the world-writable bit. Directories will typically be created with 0777 (rwxrwxrwx), assuming the same.
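The arithmetic behind this is a single bitwise operation: the resulting mode is the requested mode with every umask bit cleared. A small sketch (apply_umask is just an illustrative name) shows what the typical umask values do to the 0666 and 0777 requests mentioned above.

```python
def apply_umask(requested: int, umask: int) -> int:
    """Clear every bit of the requested mode that is set in the umask."""
    return requested & ~umask

for mask in (0o002, 0o022, 0o027):
    files = apply_umask(0o666, mask)
    dirs = apply_umask(0o777, mask)
    print(f"umask {mask:03o}: files {files:03o}, directories {dirs:03o}")
```

With umask 002 this gives 664 for files and 775 for directories; with 027 it gives 640 and 750.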
So how can we put this all together?
The setup typically used by Red Hat based Linux distributions (such as RHEL, CentOS and Fedora) is a pretty flexible one, worth looking into.
For each user that is created, a group of the same name is created too (typically with a GID matching the UID of the user) and that group is set as the primary group of that user. That group is meant to contain only the user by the same name. So files of my user are typically created as filbranden:filbranden, with my own primary group gating the group permission bits.
Since the group is essentially the same as just the user itself, the umask is set to 002, which means that all files and directories will be group-writable by default.
So how do you lock down directories to make them private? Simple, just remove the permission bits for "other" from the top-level directory. For example, if I use chmod 770 ~ (700 is also fine; 770 works because the primary group is my own), no other user will be able to access any of the files under my home directory. It doesn't matter that the files inside have read or execute bits for "other": without the x bit on the top directory itself, other users can never traverse it.
So how do you implement shared directories? Simple. Start by creating a group and adding all users who are meant to collaborate on that project to this group. Next, create one (or more) directories for that project. Set the "group owner" of the directories to the group you just created. Finally, enable the setgid bit on these directories. All the members of that group will be able to create files in those directories. Since they all have umask 002, the files they create will be group-writable. And because of the setgid bit on the top directory, all the files will be owned by the shared group (and not the per-user primary groups). This means users in the group will be able to modify files that were created by other members of the group, since they'll have write permissions to those files.
These shared directories can be world-readable (by keeping the r and x permissions for "other" on the top directory), or can be private to the group (by removing those permissions).
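The directory part of that recipe can be sketched with Python's standard library. This is an illustration under assumptions, not a complete tool: make_shared_dir is a made-up helper, creating the group and adding members is left to the usual system tools, and the demo uses the current process's own group in a temporary directory so that it runs without root.

```python
import os
import shutil
import stat
import tempfile

def make_shared_dir(path: str, group, world_readable: bool = False) -> None:
    """Create a directory, hand it to a group and turn on the setgid bit."""
    os.makedirs(path, exist_ok=True)
    shutil.chown(path, group=group)  # group may be a name or a numeric GID
    mode = stat.S_ISGID | 0o770      # rwx for user and group, plus setgid
    if world_readable:
        mode |= 0o005                # keep r and x for "other"
    # Explicit chmod, because the mode passed at creation is cut by the umask.
    os.chmod(path, mode)

# Demo with the process's own group, inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "project")
    make_shared_dir(target, os.getegid())
    demo_mode = stat.S_IMODE(os.stat(target).st_mode)
    demo_gid = os.stat(target).st_gid

print(f"{demo_mode:o}")
```

For a real project you would pass the name of the shared group instead, which normally requires being a member of that group or running as root.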
That's the gist of it: how Unix/Linux permissions typically work, and the rationale for why they work this way.
There are, of course, many caveats. Many of these settings (such as the umask) are kept separately in each session, and it's possible for them to get out of sync. Adding a user to a group typically means they need to log in again for the change to take effect. While creating a file in a setgid-enabled directory causes the group of the directory to be inherited, moving an existing file into that directory doesn't typically change its ownership (so you may end up with files in the group share that are not modifiable by other members of the group). Semantics around deleting files can be somewhat tricky as well.
Modern Unix/Linux systems keep all the logic behind users, groups and file ownership. But they typically also include additional mechanisms to enforce permissions, such as extended file ACLs, which can be much more granular in allowing read/write access to directory trees and do not suffer from many of the issues with basic permissions listed above.