What does spool mean for printing?
In a nutshell, a spooler consists of:
- a background program
- a directory per printer
- a file per print job
In your case, the foreground program (lpr) sends its print jobs to cups, which stores them and then uses serial, parallel, USB, network, ... communication to actually start the printing process.
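To make the three pieces concrete (background program, directory per printer, file per job), here is a minimal sketch in Python. It is not how cups or lpd are implemented; the names `submit`, `drain`, the `.job` suffix and the `send` callback are all made up for illustration, and the in-process counter stands in for a real job-id allocator.

```python
import itertools
import tempfile
from pathlib import Path

# Hypothetical job counter; a real spooler would allocate ids safely across processes.
_job_ids = itertools.count(1)


def submit(spool_root: Path, printer: str, data: bytes) -> Path:
    """Foreground side: drop the job into the printer's spool directory and return."""
    queue_dir = spool_root / printer                    # one directory per printer
    queue_dir.mkdir(parents=True, exist_ok=True)
    job_file = queue_dir / f"{next(_job_ids):06d}.job"  # one file per print job
    job_file.write_bytes(data)
    return job_file


def drain(spool_root: Path, printer: str, send) -> int:
    """Background side: feed queued jobs to the device, oldest first."""
    queue_dir = spool_root / printer
    jobs = sorted(queue_dir.glob("*.job"))  # zero-padded names sort in submission order
    for job in jobs:
        send(job.read_bytes())  # stand-in for serial/parallel/USB/network output
        job.unlink()            # the job is done, so remove it from the spool
    return len(jobs)


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    submit(root, "laser", b"page one")
    submit(root, "laser", b"page two")
    drain(root, "laser", print)
```

The foreground call returns as soon as the file is written; only `drain`, which would run in the background program, ever waits on the device.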
So that's why nowadays even when the printer runs out of paper you can still continue using your computer, whereas back when I was a kid on CP/M, the whole computer locked up until you added more paper...
Why is it called "spooling"?
Because in those times, large computers used tape to store these kinds of files, as disks were too expensive. So when you were working inside the machine room, the first thing you would hear was the tapes starting to spin up¹, and only after a second or 3-4 would the printer start printing (if you were lucky). ;-)
Note 1: A "spool" is a noun meaning "a cylindrical device on which magnetic tape can be wound", therefore "spooling" is the cylindrical device spinning up and winding up tape...
A print spool is effectively a buffer, managed per job, with a program (the spooler) responsible for receiving jobs from submitting programs and feeding them to one or more printers. The point of a spool is to handle communication between two systems with different speeds, and to control access to shared devices. The former means programs can submit print jobs as fast as they want, and those jobs are dealt with as fast (or slowly) as printers can handle. The latter (as pointed out by RonJohn) ensures that jobs are handled coherently: thus when printing, jobs aren’t mixed up.
Networked printers provide their own spools, and print servers (CUPS, lpd etc.) also implement spools. Most print systems also handle access control, quotas, banners, print options etc. Spools are used in other contexts; for example, tape-based backup servers now spool backup data from networked hosts on a fast disk-based storage system, so that they can then feed modern tape drives at the tremendous speeds they need to avoid tape shoe-shine.
In the context of the comment, the relevance of a spool is that it decouples the print job submission from its fulfillment. Not spooling would mean that the submission would only complete with the print job, and thus your lpr command would only complete once the job completed. Removing the spool on your computer might not have the desired result though, since the printer itself could spool too!
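The decoupling can be illustrated with a small Python sketch: an in-memory queue plays the spool, a thread plays the slow printer, and the submitter finishes long before the jobs do. Everything here (`run_spooled`, the sleep that simulates printing time) is an assumption made for the example, not part of any real print system.

```python
import queue
import threading
import time


def run_spooled(job_names, seconds_per_job=0.05):
    """Submit jobs through a queue and report how long submission vs. printing took."""
    spool = queue.Queue()
    printed = []

    def printer():
        # Slow consumer: handles one job at a time, in submission order.
        while True:
            job = spool.get()
            if job is None:  # sentinel: no more jobs
                return
            time.sleep(seconds_per_job)  # simulated printing time
            printed.append(job)

    worker = threading.Thread(target=printer)
    worker.start()

    start = time.monotonic()
    for name in job_names:
        spool.put(name)  # returns immediately; the job just waits in the spool
    submit_elapsed = time.monotonic() - start

    spool.put(None)
    worker.join()  # wait for the "printer" to finish everything
    total_elapsed = time.monotonic() - start
    return submit_elapsed, total_elapsed, printed
```

Calling `run_spooled(["a", "b", "c"])` returns a submission time far smaller than the total time, while the jobs still come out one at a time and in order. Without the queue, each submission would have to block for the full printing time.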
First, let’s begin with the meaning of the term “spooling”: sometimes the size of a document is larger than the printer’s memory, so “printer spooling” allows multiple documents to be sent to a printer and put in a queue.
Now, under Unix there are two printing systems:
- The BSD spooling system uses lpd daemon to schedule the print jobs.
- The SVR4 spooling system uses lpsched as the scheduler.
Jeff Lessem’s USAIL: Unix system administration independent learning has a section on Printing under Unix which provides a good overview of both the BSD and SVR4 systems:
The BSD spooling system extends well to large, heterogeneous networks allowing many computers to share printers.
Under the BSD spooling system, access to printers is controlled by lpd daemon and the lpr program. lpr is the only program on a BSD system that can queue files for printing. lpr accepts data to be printed, puts it in a spooling directory, and notifies the lpd daemon. For each print job, lpr creates two files, a control file (cfxxx) and a data file (dfxxx) in the spool directory, xxx indicating a unique job-id. The control file contains the information for handling the print job, including the identity of the owner. The data file contains the actual data to be printed.
The lpd daemon checks the /etc/printcap file to identify the destination printer. If the destination printer is a local device, lpd makes sure a copy of the lpd daemon is running on that print queue. Otherwise lpd opens a connection to the remote host to which the printer is connected and transfers both the control and data file to it.
Print jobs are scheduled by lpd on a First-In, First-Out (FIFO) basis. However, the system administrator may use the lpc command to alter the priority of the jobs in the print queue.
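The control-file/data-file pairing described above can be sketched in a few lines of Python. This is only an illustration of the idea: the `key=value` control-file layout, the function names and the three-digit ids are invented here and are not the format real lpr/lpd use.

```python
import tempfile
from pathlib import Path


def queue_job(spool_dir: Path, job_id: int, owner: str, data: bytes) -> tuple[Path, Path]:
    """Write a data file (dfNNN) and a matching control file (cfNNN) for one job."""
    spool_dir.mkdir(parents=True, exist_ok=True)
    data_file = spool_dir / f"df{job_id:03d}"
    control_file = spool_dir / f"cf{job_id:03d}"
    # In this sketch the data is written first, so a control file never
    # points at data that is not there yet.
    data_file.write_bytes(data)
    # Made-up control format: one key=value pair per line.
    control_file.write_text(f"owner={owner}\ndatafile={data_file.name}\n")
    return control_file, data_file


def fifo_jobs(spool_dir: Path) -> list[dict[str, str]]:
    """Return the parsed control files, lowest job id (first in) first."""
    jobs = []
    for control_file in sorted(spool_dir.glob("cf*")):
        lines = control_file.read_text().splitlines()
        jobs.append(dict(line.split("=", 1) for line in lines))
    return jobs


if __name__ == "__main__":
    spool = Path(tempfile.mkdtemp()) / "lp0"  # hypothetical queue name
    queue_job(spool, 1, "alice", b"first document")
    queue_job(spool, 2, "bob", b"second document")
    for job in fifo_jobs(spool):
        print(job["owner"], (spool / job["datafile"]).read_bytes())
```

Keeping the metadata in its own file means the scheduler can list, reorder or reject jobs by reading only the small control files, without touching the data to be printed.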
SVR4 spooling system is used by Solaris and HP-UX. It offers more control and flexibility but was not designed for network printing and is more complicated to set up.
Under SVR4 spooling system, the lp command accepts the data to be printed, makes a copy of it in the spool directory associated with the destination. The destination consists of a printer name and an optional specification of a class to which the printer belongs. When the specified printer is busy the job is sent to another printer in the same class. The spool directory is normally /var/spool/lp/request/printer-name and the print file is given a unique name to identify both the job and the user.
Access to the printer is controlled by lpsched daemon. It picks up the jobs from the spool directory and sends them to appropriate destination when it becomes available. lpsched also keeps a log, usually in /usr/spool/lp/log. The log file would indicate any error in processing the print jobs, as well as the user-name,
See also: Printers and printer spooler – lp, lpstat and cancel commands | Tips & Tricks for IT's Blog