List explicitly installed packages

Just the code

aptitude search '~i !~M' -F '%p' --disable-columns | sort -u > currentlyinstalled.txt
wget -qO - http://mirror.pnl.gov/releases/precise/ubuntu-12.04.3-desktop-amd64.manifest \
  | cut -f1 | sort -u > defaultinstalled.txt
comm -23 currentlyinstalled.txt defaultinstalled.txt

Explanation

One way to think about this problem is to break it into three parts:

  • How do I get a list of packages not installed as dependencies?
  • How do I get a list of the packages installed by default?
  • How can I get the difference between these two lists?

How do I get a list of packages not installed as dependencies?

The following command seems to work on my system:

$ aptitude search '~i !~M' -F '%p' --disable-columns | sort -u > currentlyinstalled.txt

Similar approaches can be found in the links that Gilles posted as a comment to the question. Some sources claim that this only works if you used aptitude to install the packages; however, I almost never use aptitude to install packages and it still worked for me. The --disable-columns option prevents aptitude from padding package names with trailing blanks, which would break the comparison below. The | sort -u sorts the list and removes duplicates, which makes the final step much easier.
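To see why the padding and duplicates matter, here is a small sketch on made-up data. normalize_list and padded_sample are hypothetical helpers I invented for illustration (they are not part of aptitude or apt), and the package names are only sample input:

```shell
# normalize_list: hypothetical helper, not part of aptitude or apt.
# Strips trailing blanks from each line, then sorts and removes duplicates.
normalize_list() {
    sed 's/[[:space:]]*$//' | LC_ALL=C sort -u
}

# Made-up sample that mimics column-padded output with a repeated name.
padded_sample() {
    printf '%s\n' 'vim      ' 'git' 'vim' 'curl   '
}

# Without the cleanup, 'vim      ' and 'vim' would count as different lines.
padded_sample | normalize_list
```

With --disable-columns the sed step should not be needed; it only shows what the option saves you from.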

How do I get a list of the packages installed by default?

Note: This section starts out with a 'wrong path' that I think is illustrative. The second piece of code is the one that works.

This is a bit trickier. I initially thought that a good approximation would be all of the packages that are dependencies of the meta-packages ubuntu-minimal, ubuntu-standard, ubuntu-desktop, and the various Linux kernel related packages. A few results from Google searches seemed to use this approach. To get a list of these dependencies, I first tried the following (which didn't work):

$ apt-cache depends ubuntu-desktop ubuntu-minimal ubuntu-standard linux-* | awk '/Depends:/ {print $2}' | sort -u

This seems to leave out some packages that I know had to come by default. I still believe that this method should work if one constructs the right list of metapackages.

However, it seems that Ubuntu mirrors contain a "manifest" file that contains all of the packages in the default install. The manifest for Ubuntu 12.04.3 is here:

http://mirror.pnl.gov/releases/precise/ubuntu-12.04.3-desktop-amd64.manifest

If you search through this page (or the page of a mirror closer to you):

http://mirror.pnl.gov/releases/precise/

You should be able to find the ".manifest" file that corresponds to the version and architecture you are using. To extract just the package names I did this:

wget -qO - http://mirror.pnl.gov/releases/precise/ubuntu-12.04.3-desktop-amd64.manifest | cut -f1 | sort -u > defaultinstalled.txt

The list was likely already sorted and unique, but I wanted to be sure it was properly sorted to make the next step easier. I then put the output in defaultinstalled.txt.
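As a rough illustration of what cut -f1 does here, the sketch below runs the same pipeline on a few made-up lines. It assumes the manifest has one package per line with the name and version separated by a tab, which is what the cut -f1 above relies on. sample_manifest is a hypothetical stand-in and the versions are invented:

```shell
# sample_manifest: hypothetical stand-in for a downloaded .manifest file.
# Each line is "<package><TAB><version>"; the versions here are made up.
sample_manifest() {
    printf 'zlib1g\t1:1.2.3.4\nadduser\t3.113\nbash\t4.2-2\n'
}

# cut -f1 keeps the first tab-separated field (the package name);
# sort -u then sorts the names and drops any duplicates.
sample_manifest | cut -f1 | sort -u
```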

How can I get the difference between these two lists?

This is the easiest part since most Unix-like systems have many tools to do this. The comm tool is one of many ways to do this:

comm -23 currentlyinstalled.txt defaultinstalled.txt

This should print the list of lines that are unique to the first file. Thus, it should print a list of installed packages not in the default install.
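Here is a minimal sketch of the comm step on two made-up lists (the file names and package names are only examples). Note that comm expects both inputs to be sorted in the same collation order, so I force LC_ALL=C for both sort and comm here; that is my own precaution, not something the commands above do:

```shell
# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)

# Made-up stand-ins for currentlyinstalled.txt and defaultinstalled.txt.
printf '%s\n' curl git htop vim | LC_ALL=C sort -u > "$workdir/current.txt"
printf '%s\n' bash curl vim | LC_ALL=C sort -u > "$workdir/default.txt"

# -2 hides lines unique to the second file, -3 hides lines common to both,
# so only lines unique to the first file remain.
extra=$(LC_ALL=C comm -23 "$workdir/current.txt" "$workdir/default.txt")
printf '%s\n' "$extra"

rm -r "$workdir"
```

With this sample, git and htop are the only names that appear in the first list but not in the second.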


You can use either of these two one-liners. Both yield the exact same output on my machine and are more precise than all solutions proposed up until now (July 2014) in this question. They are a combination of the two answers (1) and (2). Note that I originally posted this answer here.

Using apt-mark:

comm -23 <(apt-mark showmanual | sort -u) <(gzip -dc /var/log/installer/initial-status.gz | sed -n 's/^Package: //p' | sort -u)

Using aptitude:

comm -23 <(aptitude search '~i !~M' -F '%p' | sed "s/ *$//" | sort -u) <(gzip -dc /var/log/installer/initial-status.gz | sed -n 's/^Package: //p' | sort -u)

A few packages still fall through the cracks, although I suspect these were actually installed by the user, either right after installation through the language localization setup or, for example, through the Totem codec installer. The version-specific linux-headers packages also seem to accumulate, even though I only installed the non-version-specific metapackage. Examples:

libreoffice-help-en-gb
openoffice.org-hyphenation
gstreamer0.10-fluendo-mp3
linux-headers-3.13.0-29

How it works

  1. Get the list of manually installed packages. For aptitude, the additional sed strips out remaining whitespace at the end of the line.
  2. Get the list of packages installed right after a fresh install.
  3. Compare the two lists and output only the lines in list 1 that are not present in list 2.
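Step 2 can be tried out without touching the real log. The sketch below builds a tiny gzip file as a made-up stand-in for /var/log/installer/initial-status.gz, assuming it holds "Field: value" stanzas as the sed expression above expects, and then runs the same extraction on it. The package names and versions are invented:

```shell
workdir=$(mktemp -d)

# Made-up stand-in for /var/log/installer/initial-status.gz:
# two stanzas separated by a blank line.
printf '%s\n' \
    'Package: bash' 'Status: install ok installed' 'Version: 4.3-7' '' \
    'Package: adduser' 'Status: install ok installed' 'Version: 3.113' \
    | gzip -c > "$workdir/initial-status.gz"

# sed -n prints nothing by default; the trailing p prints only the lines
# where "Package: " was stripped, leaving just the package names.
initial=$(gzip -dc "$workdir/initial-status.gz" | sed -n 's/^Package: //p' | LC_ALL=C sort -u)
printf '%s\n' "$initial"

rm -r "$workdir"
```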

Other possibilities don't work as well:

  • Using the ubuntu-14.04-desktop-amd64.manifest file (here for Ubuntu 14.04) instead of /var/log/installer/initial-status.gz. More packages are shown as manually installed even though they are not.
  • Using apt-mark showauto instead of /var/log/installer/initial-status.gz. apt-mark showauto, for example, doesn't list the xserver-xorg package, while initial-status.gz does.

Both list more packages than the above solution.


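According to man apt-mark, the following commands list the automatically installed and the manually installed packages, respectively: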

apt-mark showauto
apt-mark showmanual