Debian Package Management
By Scott Brady
- Package Contents
- How Debian is Partitioned
- Holding a Package
- Synchronizing Two Machines
- apt - Advanced Package Tool
- Cleaning Out the apt Cache
- Custom Debian Kernel Package
- Installing Source Packages
- synaptic: GUI Package Management
In the good old days when men were men, women were women and everyone wrote their own device drivers, when you wanted to install a program you acquired a compressed archive of the source (e.g. program.tar.gz). You probably got it from a floppy, a CDROM or over the network. Today you would get it from the Internet.
Once you got the source code you would decompress and untar the contents. The next step would be to run the "./configure" command, which would inevitably fail after having not found some required library. You would then need another floppy, CDROM or download from the network to get the missing library source code. Another round of "./configure" and compiling would be in order ad infinitum. Once the original program you wanted to install was properly configured you could then commence with compiling it. If your computer was slow or the program very large (e.g. X.org, KDE, Firefox, etc.) you would lose the better part of the day waiting for the compile to finish.
When the Debian developers started up the project they realized this scenario was not feasible for most people (especially desktop users). They decided to create something called a package. A package is a file that contains all the data for a program (binary files, documentation, configuration files, etc.) along with meta data (package name, dependencies, package maintainer, etc.). Debian has two types of packages: binary and source. (Debian source packages are much more sophisticated than vanilla tar.gz files.)
A Debian source package actually consists of three files: one .dsc file, one .tar.gz file, and one .diff.gz file. The .dsc file contains PGP signed meta data about the source package which includes the package name, author, dependencies, etc. It also contains checksums for the two .gz files to guarantee the integrity of the archives.
A Debian binary package is one file that contains meta data for the package along with all the binary files, configuration files and documentation for that package.
Debian package files are named using a standard convention:
package_version-revision_architecture.deb or .dsc
for example: apache_1.3.33-3_i386.deb
The version field can be any version scheme that the package management system can understand. What that means is the package management system must be able to examine the version field and determine if a version comes before or after another version. A package maintainer couldn't designate the first release "z" and the second release "y", but "a" followed by "b" could be used (this is common sense but should be mentioned). The most common version formats are "dot" versioning (e.g. 1.2.3) and date versioning (e.g. 20050516). Notice that the date version wouldn't work if the maintainer were to use the MMDDYYYY format: 05162005 (while a human can sort that just fine, a computer can't).
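The date example can be checked with a few lines of shell. This is only a sketch of why the field order matters: "newest" is a hypothetical helper that compares its arguments as plain numbers, which is not the packaging system's actual algorithm. On a Debian system, "dpkg --compare-versions" applies the real ordering rules.

```shell
# newest prints the argument that sorts last when compared as plain numbers.
newest() {
    printf '%s\n' "$@" | sort -n | tail -n 1
}

newest 20041231 20050516   # YYYYMMDD: prints 20050516, the later date
newest 12312004 05162005   # MMDDYYYY: prints 12312004, although May 2005 is later
```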
The revision field indicates that the package has been changed for the same version of the package. If the package maintainer created a bug in the package that someone noticed and filed a bug report about, the maintainer would fix the bug in the package, increment the revision number by one, and release the new package. The change could also be prompted by the discovery of a bug in the program the package provides.
The architecture field indicates the CPU architecture the package is compiled to run on. Debian supports a total of eleven architectures with two more architectures being added in the next release. The current list is:
i386, m68k, sparc, alpha, powerpc, arm, mips, mipsel, hppa, ia64, and s390.
"amd64" is one of the architectures that will be supported in the next stable release. It's important to note that some packages have an architecture of "all". This is used for scripts (perl, bash, etc.) that don't need to be recompiled for every architecture. Debian recently started converting some of the core C programs used in the base operating system over to scripts in order to aid portability. Obviously, this endeavor can only be taken so far because some programs can't or shouldn't be scripts.
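The fields of the naming convention described above can be split apart mechanically. The following is a minimal sketch using plain shell parameter expansion. "parse_deb_name" is a hypothetical helper, not a dpkg command, and it assumes the file name includes a revision.

```shell
# Split a Debian binary package file name into name, version, revision, arch.
parse_deb_name() {
    file=${1##*/}            # drop any leading directory
    base=${file%.deb}        # drop the .deb suffix
    arch=${base##*_}         # text after the last underscore
    rest=${base%_*}          # package_version-revision
    name=${rest%%_*}         # text before the first underscore
    verrev=${rest#*_}        # version-revision
    revision=${verrev##*-}   # text after the last hyphen
    version=${verrev%-*}     # everything before the last hyphen
    printf '%s %s %s %s\n' "$name" "$version" "$revision" "$arch"
}

parse_deb_name apache_1.3.33-3_i386.deb   # prints: apache 1.3.33 3 i386
```

Splitting the version at the last hyphen matters because the version itself may contain hyphens, while the revision may not.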
A Debian source "package" actually consists of three files. Here are the three source package files for the grep program:
- Signed package meta data and source file hashes.
- The original source code.
- diff file containing the changes to the original source code made by the package maintainer.
Every package will have the same three files but obviously will have a different name, version and revision.
A Debian binary package contains all the data for the program in one file. Because the file is a standard archive, the contents can be extracted with the "ar" program:
ar xvf grep_2.5.1.ds1-4_i386.deb
This will produce three files that are named exactly the same for all Debian binary packages:
- debian-binary (package format version number)
- control.tar.gz
  - preinst/postinst/prerm/postrm (shell scripts)
  - md5sums (hashes of all files)
  - control (package meta data: author, package name, etc.)
- data.tar.gz
  - all the binary, config, doc, and other files for the package
The "debian-binary" file contains the version number of the package format. This lets the package management system know what format the package is in. When this document was authored the current version was "2.0". The "control.tar.gz" file contains scripts to execute before or after the installation or removal of the package, checksums for all the files to verify their integrity, and the control file that contains package meta data. Finally, the "data.tar.gz" file contains the actual binary, configuration, documentation and other files for the program.
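To make the three-member layout concrete, the sketch below builds a toy archive with the same member names and lists it. It is an illustration of the container format only, not an installable package. The package name "toy" and the helper "make_toy_deb" are made up for the example, and it assumes "ar", "tar" and "mktemp" are available.

```shell
# Build a toy archive with the same three members as a .deb, then list it.
make_toy_deb() {
    workdir=$(mktemp -d) || return 1
    (
        cd "$workdir" || exit 1
        printf '2.0\n' > debian-binary                 # package format version
        mkdir control data
        printf 'Package: toy\nVersion: 1.0-1\n' > control/control
        printf 'hello\n' > data/hello.txt
        tar -czf control.tar.gz -C control .           # meta data and scripts
        tar -czf data.tar.gz -C data .                 # the files to install
        ar rc toy_1.0-1_all.deb debian-binary control.tar.gz data.tar.gz
        ar t toy_1.0-1_all.deb                         # list the members in order
    )
    status=$?
    rm -rf "$workdir"
    return $status
}

# Only run the demonstration when "ar" is installed.
if command -v ar >/dev/null 2>&1; then
    make_toy_deb
fi
```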
Debian is divided into three branches: stable, testing and unstable. All development starts in unstable and works its way down to testing and then eventually to stable in that order. All packages start their life in unstable and after adequate debugging are migrated to testing. After extensive debugging in testing, and only after a period where the testing branch is "frozen" and the critical bug count is reduced to zero, a release is deemed ready to be called stable. At that point the testing branch becomes the stable release and a new testing branch is created.
Slink <- Potato <- Woody <- Sarge <~~ Sid
- Slink, Potato: former stable releases (unsupported)
- Woody: stable (supported)
- Sarge: testing (Etch will be the next testing)
- Sid: unstable
Debian also utilizes code names that can be thought of as symbolic links to the actual branch they represent. This allows groups of packages that represent a release to flow through the release system under one code name. For example, Sarge is the current code name for the testing branch. When Sarge has been deemed ready for release it will become the new stable. Woody, the current stable release, will move into the land of unsupported releases. Finally, a new code name (Etch) will be attached to the testing branch. Rinse and repeat. The only branch/code name link that never changes is that of unstable and Sid. Sid is always unstable; that is simply the way of the universe.
Debian provides bug fixes and security updates for both the stable and testing branches (and unstable, but you shouldn't be running unstable). When a new stable release is made the previous stable release is supported for a short period of time and then support is dropped. That means you're on your own for bug fixes and security updates. The good news is that Debian provides an easy way to upgrade between releases (more on that later). I've upgraded between releases multiple times on my servers and never had a problem.
Like many people, you're probably wondering which branch is best for you. This simple chart is your best guide to picking the right branch:
- unstable: Debian developers only
- testing: Desktop
- stable: Server
I want to make it clear that unstable is not for you. The unstable branch is only for serious developers who have extensive knowledge about Debian and are actually developing for Debian. I know a good deal about Debian but I would never run unstable; I value the stability of my system too much.
If you plan on using Debian for a desktop machine then testing is the branch you want. You can keep up to date with new versions of software like X.org, KDE, GNOME, Firefox, Evolution, Open Office, etc. without sacrificing stability.
If you plan on using Debian for a server or any other situation where reliability is crucial, then stable is your best option. Stable doesn't have the latest, hottest, bleeding edge software. It's not supposed to. The software in stable has undergone a long period of review and debugging and therefore is very stable. You're also guaranteed that a major version change of a software package will not occur within the stable release. The only changes will be updates that fix bugs or security holes.
Some of you are probably wondering where these funny names come from. The astute reader will notice that all the code names are characters from the movie Toy Story. Bruce Perens, the successor to Ian Murdock (the first Debian project leader), was employed by Pixar when they were creating Toy Story.
Here's another piece of trivia: ever wonder where Debian got its name from? Could it be Yiddish for "freedom" or maybe Swahili for "more fish"? Nope. Debian came from the man who founded the project, Ian Murdock. He combined his first name with the first name of his wife, Debra, to get Debian (Deb + Ian).
Debian has three different categories of software in the repository: main, non-free, and contrib. Packages in main meet the Debian Free Software Guidelines (DFSG), packages in non-free fail to meet the DFSG, and packages in contrib meet the DFSG but depend on packages that fail to meet the DFSG.
- main (packages meet DFSG)
- non-free (packages fail to meet DFSG)
- contrib (packages meet DFSG but depend on packages that fail DFSG)
Unless you have political reasons for using completely free software there is no problem with using software from the contrib or non-free categories. You should keep in mind that the software in those categories may put restrictions on the redistribution of binaries or source code. You should read the license that comes with the software if you have any concerns.
Because of draconian encryption export and patent laws in the United States, Debian maintains two archives: one that has to be distributed outside the U.S. (debian-non-us) and one that can be distributed inside the U.S. (debian).
- debian: Almost everything.
- debian-non-us: Debian packages that can't be distributed in the U.S. due to software patents or use of encryption.
Software in debian-non-us either violates a U.S. software patent or runs afoul of U.S. encryption export regulations. I am a U.S. citizen yet I use software from the debian-non-us archive and have never heard of any U.S. Debian users going to jail for using software in debian-non-us. Ultimately, you have to decide your own comfort level and exposure to litigation when choosing to use packages from the debian-non-us archive.
The entire file system is fair game for the Debian package manager to overwrite, except for all files and directories in /usr/local and configuration files in /etc. The package manager will never touch anything in /usr/local. When a package wants to change a configuration file in /etc, the package manager will ask you how to proceed. You will have the option of keeping the existing configuration file, comparing differences between the files, or overwriting the file with the new version.
off limits: /usr/local
protected: /etc (will ask user before making changes)
The Debian developers are also sane enough to not step on data in other sensitive places, like /home and /var.
dpkg is a medium-level Debian package management utility. It sits on top of dpkg-deb which comprises the lowest level of the package management hierarchy (we won't be covering dpkg-deb in this article). dpkg allows you to gather information about packages, get information on the state of an operating system's package database, and manipulate the package database (e.g. installing packages, removing packages, etc.). You can check on the status of any package in the operating system's database with the following command:
dpkg -s package
The status line begins with the package's selection state, which is one of the following:
- install - the user wants the package installed or upgraded
- deinstall - the user wants the package deinstalled
- purge - the user wants the package to be removed completely, including its configuration files.
- hold - the user wants this package not to be processed, i.e., keep the current version with the current status.
If the package is installed, additional information about the package will also be output, including the package author, dependencies, path to configuration files, and a description of what the package does.
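The output of "dpkg -s" is a list of "Field: value" lines, so a single field can be picked out with a small filter. The sketch below assumes that layout. "get_field" is a hypothetical helper, and the sample lines fed to it are illustrative rather than captured from a real system.

```shell
# get_field prints the value of one field from "Field: value" lines on stdin.
# Only the first line of the value is printed; continuation lines are ignored.
get_field() {
    awk -v field="$1" -F': ' \
        '$1 == field { print substr($0, length(field) + 3); exit }'
}

# Illustrative input in the same layout as "dpkg -s apache" output:
printf 'Package: apache\nStatus: hold ok installed\nVersion: 1.3.33-3\n' |
    get_field Status   # prints: hold ok installed
```

On a real system the filter would be used as "dpkg -s apache | get_field Version".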
If you want to list the status of every installed or removed package, run the following command:
dpkg -l | less
That will return the status of every package that was at one time, and may still be, installed. If the package was removed the status will indicate the package's removal.
You can use the list of installed packages to search for a specific package by piping the output of dpkg into grep. You should note that the default column width often truncates package names so you should set the column width variable before running the command. Here's an example:
COLUMNS=150 dpkg -l | grep apache
(note that this is one command)
dpkg allows you to search the package database for package names that match your search string. The following command will search for a package that has a name matching 'apache*':
dpkg -l 'apache*'
Please note that dpkg only searches package names. It won't search package meta data (e.g. the description) for the string.
You can also find what package provided a file. The following command returns the name of the package that installed the file called "grep":
dpkg -S grep
You can supply either just the file name or the full path to the file (e.g. grep or /bin/grep). Supplying just the file name may yield false positives if files of the same name are supplied by other packages in other locations in the file system. Giving the full path will guarantee it gets the right match.
If you've acquired a Debian binary package from some source (floppy, CDROM, the Internet, etc) you can install it on your system with one command:
dpkg -i foo_ver-rel.deb
It's important that you include the path to the file (or just the file name if it's in the current directory) and the file's full name (not just the package name). You're telling it to install a package file, not a package by name. dpkg will handle unpacking the archive, will ask you questions if needed about how to configure the package, and will back up old configuration files (if you chose to update the configuration file). If the package depends on other packages that are not installed, it will give you a list of the packages you need to install. It's up to you to go find them and install them.
dpkg allows you to remove installed packages with one command. Because you are telling dpkg to remove a package by name, it's important that you not supply the name of the package file. All you need is the package name:
dpkg -r apache
The -r option will remove the package's files but will leave any configuration files in place. If you also want to remove the configuration files, you need to tell dpkg to purge the package:
dpkg -P foo
When you install a package, dpkg will ask you questions about how you want the package configured. If you change your mind and want to reconfigure a package, you can issue the following command:
dpkg-reconfigure package
Here are some miscellaneous commands that you may find useful:
contents of installed package: dpkg -L package
contents of package file: dpkg-deb -c foo_ver-rel.deb
extract single file from package: dpkg --fsys-tarfile foo_ver-rel.deb | tar -xf - thefile*
(commands that install, remove or reconfigure packages must be run as root; the commands that only query packages also work as a regular user)
I'm afraid that many new Debian users have been scared off by dselect. You can really tell this program was designed by a bunch of engineers. The interface is very confusing, cumbersome and overall quite dreadful. If anyone tells you to use dselect get out your garlic and cross, yell "NO!" and run away. Save yourself the misery.
Say you want to prevent the package "apache" from being upgraded. You can change the status of any package to "hold" which will prevent it from being upgraded. First you need to dump the list of selections to a text file:
dpkg --get-selections \* > selections.txt
Now you need to open selections.txt in your favorite text editor (vi, emacs, etc.) and search for the line that says apache (or whatever package you want to hold):
apache install
Now replace install with hold:
apache hold
You now need to read in the modified selections file:
dpkg --set-selections < selections.txt
If you now run "dpkg -s apache" the status of the package will be "hold".
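The editing step can also be scripted. The following is a minimal sketch, assuming the selections list has one package name and one state per line separated by whitespace. "set_hold" is a hypothetical helper that only rewrites text; it does not touch the package database.

```shell
# set_hold rewrites a selections list read from standard input, changing
# "install" to "hold" for the package named in $1.
set_hold() {
    awk -v pkg="$1" 'BEGIN { OFS = "\t" }
        $1 == pkg && $2 == "install" { $2 = "hold" }
        { print }'
}

# Example on a two-line list in the format written by "dpkg --get-selections":
printf 'apache\tinstall\ngrep\tinstall\n' | set_hold apache
# prints "apache<TAB>hold" followed by "grep<TAB>install"
```

The rewritten list would then be read back with "dpkg --set-selections" as described above. Piping a single line, as in echo "apache hold" | dpkg --set-selections, should achieve the same result without a temporary file.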
The selections file can also be used to dump the list of installed packages from one computer and insert the list into the package database of another computer. After copying the selections.txt file over to the other computer you would run the following commands:
dpkg --set-selections < selections.txt
While this would cause both computers to have the same installed packages (assuming they are both running the same release), it wouldn't synchronize configuration files. You would still need to copy over any configuration files and any other data to the new machine.
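Before loading a selections file on the second machine it can help to see what would be added. The sketch below compares two selections files as plain text. "to_install" is a hypothetical helper and the package lists are made up for the example. Recording selections only marks what is wanted; on a real system a follow-up such as "apt-get dselect-upgrade" is typically needed to make apt act on them (an assumption, since that step is not shown in the text above).

```shell
# to_install prints the packages marked "install" in the selections file
# copied from the source machine ($1) that are not marked "install" in the
# file dumped on the target machine ($2).
to_install() {
    awk 'FILENAME == ARGV[1] { if ($2 == "install") have[$1] = 1; next }
         $2 == "install" && !($1 in have) { print $1 }' "$2" "$1"
}

# Demonstration with two small made-up lists:
src=$(mktemp) dst=$(mktemp)
printf 'apache\tinstall\ngrep\tinstall\nvim\tinstall\n' > "$src"
printf 'grep\tinstall\nvim\tdeinstall\n' > "$dst"
to_install "$src" "$dst"   # prints apache and vim, one per line
rm -f "$src" "$dst"
```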
Now that we've learned the basics of dpkg, it's time to go up one layer of the Debian package management system. Sitting on top of dpkg is a powerful set of utilities called "apt" (short for Advanced Package Tool). apt provides the ability to install and remove packages and upgrade the entire system by issuing one command.
apt uses the /etc/apt/sources.list file to determine where to retrieve package files. In order for apt to work your sources.list file must be properly configured. This is when our earlier discussion about Debian branches, code names, categories and archives comes in handy. You need to construct lines in the configuration file that tell apt what server to get the packages from, what branch or code name to use, what categories of packages to use and which archive to use. It's really not all that hard, especially with the following example file:
# from /etc/apt/sources.list:
# use the US mirror, testing code name, main/contrib/non-free
deb ftp://ftp.us.debian.org/debian/ testing main contrib non-free
# same as above but use non-US packages
deb ftp://non-us.debian.org/debian-non-US testing/non-US main contrib non-free
# same as above but for security updates
deb ftp://security.debian.org/debian-security/ testing/updates main contrib non-free
# again, same as above but for non-US security updates
deb ftp://security.debian.org/debian-non-US/ testing/non-US main contrib non-free
The first thing you'll want to do is determine which mirror to use. A list of mirrors is available on the Debian web site. You'll also need to use the correct branch or code name for your installed operating system. After that it's up to you which categories and archives to use.
You might be wondering whether to specify a branch or a code name. If you're running a desktop box on testing then you probably want to use the branch name "testing". That way when the testing branch becomes stable you will keep on using testing. On the other hand, if you're running a server you'll want to use the code name (e.g. woody). You don't want a surprise one day when you go to download security updates and because a new stable has been released it wants to install a couple hundred packages. You'll want to upgrade to the new stable release eventually, but at your own time and when you're ready; not by accident.
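For the server case, a sources.list that tracks a code name could look like the fragment below. It simply mirrors the first and third lines of the example file above with "woody" substituted for "testing". The mirror host and the choice of categories are assumptions to adjust for your own installation.

```
# from /etc/apt/sources.list on a server that should stay on woody
deb ftp://ftp.us.debian.org/debian/ woody main contrib non-free
deb ftp://security.debian.org/debian-security/ woody/updates main contrib non-free
```

With these lines a new stable release changes nothing on the machine until you edit the code name yourself.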
If editing the sources.list file by hand is too daunting, you can use a curses based program that will guide you through configuring the file. Run the following command for the guided sources.list configuration:
Whenever you change your sources.list file you need to tell apt to download all the package data from the servers. That is done with the following command:
apt-get update
You'll also want to run this command periodically to get the latest list of packages on the server. I don't necessarily recommend making it a cron job because that wastes bandwidth. Just get in the habit of running "apt-get update" before you install new packages or upgrade installed packages.
apt maintains a cache of packages that are available on the server. You can search the package names and descriptions by passing the apt-cache utility the "search" parameter followed by the string to search for:
apt-cache search keyword
If you find a package that you want more information about, you can tell apt-cache to show you all the meta data about the package:
apt-cache show package
That will return the package author, description, dependencies, architecture, version, etc.
Once you've found a package you want to install, you can tell apt-get to install the package for you:
apt-get install package
apt-get will then figure out if the package you want to install depends on any other packages that are not already installed. This includes packages that are installed but are the wrong version (e.g. old). apt-get will then connect to the server listed in your sources.list file and download the latest version of the package and all of its dependent packages. It will then install all of the packages and if needed ask you questions about how you want the packages configured. You can install any program in Debian's vast repository by issuing one command. The Debian package management system takes care of the rest for you.
Removing packages is just as easy as installing them. One command can remove any installed package:
apt-get remove package
To prevent you from inadvertently removing a package you don't want to remove, apt-get will prompt you before continuing. It will also let you know how removing the package will affect other packages that depend on the package. If you remove a package that is required by another package that other package will also be removed.
A normal remove does not remove any of the program's configuration files. If you also want the configuration files removed, you need to issue the purge command:
apt-get remove --purge package
If for some reason you want apt to go through the process of installing a package again (for example, if you accidentally removed one of the package's files or deleted its configuration file), you can tell apt to reinstall an already installed package:
apt-get install --reinstall package
If you want to see what apt-get would do without actually changing your system, you can simulate an install or remove by passing apt-get the "-s" switch:
apt-get install -s package
This will print out all the messages as if it were really working but without actually touching the system.
The kernel (file system) in Debian supports replacing files even while they're being used. This means that you can upgrade a running program without trouble. In fact, you can upgrade every package on your system with one command:
apt-get upgrade -u
The "-u" tells apt-get to list all of the packages that will be upgraded. If you don't care what it will upgrade you can leave that switch off and it will still upgrade just fine.
Let's say you've been running the stable release for a while and the Debian project has released a new stable. You're eager to upgrade to the new stable release. First, you should change your sources.list file to make sure it points to the release you want. You'll then need to run "apt-get update" if you made any changes or if you haven't updated in a while. The difference between upgrading packages within a release and upgrading from one release to another is the command you need to issue:
apt-get dist-upgrade
dist-upgrade is smarter than regular upgrade when it comes to sorting out dependencies between two releases.
apt-get dist-upgrade does a pretty good job of upgrading to a new release, but sometimes it gets confused. To fix any inconsistencies, run the following command until the problem is corrected:
apt-get -f install
If that doesn't work, you can also try telling dpkg to clean things up:
dpkg --configure -a
I've upgraded between releases multiple times on my computers and have never had an unsuccessful upgrade.
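The release upgrade steps described above can be strung together in a small script. This is a sketch only: "run" and "upgrade_release" are hypothetical helper names, the recovery steps are tried once rather than repeated, and DRY_RUN defaults to 1 so that the commands are printed instead of executed.

```shell
# Dry-run wrapper: with DRY_RUN=1 (the default here) commands are printed
# instead of executed, so the sketch is safe to run anywhere.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

upgrade_release() {
    run apt-get update || return 1
    if ! run apt-get -u dist-upgrade; then
        # Recovery steps from the text, tried only when dist-upgrade fails.
        run apt-get -f install
        run dpkg --configure -a
    fi
}

upgrade_release
```

Setting DRY_RUN=0 as root would execute the commands for real, after sources.list has been pointed at the new release.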
If your installation of Debian is configured to cache all the packages you download, you will quickly accumulate a large number of unneeded package files. Debian stores all of the package files in /var/cache/apt/archives/. You can tell it to remove all of these packages with the following command:
apt-get clean
You can create a Debian package that contains your custom compiled kernel. After you've configured your kernel (set all the drivers to load and other options), issue the following command in your kernel source code directory:
make-kpkg -rev custom.N kernel_image
Replace N with the number of your custom kernel. "1" is a good place to start. When your kernel is done compiling a Debian package file will be waiting to be installed:
dpkg -i ../kernel-image-ver_custom.N_i386.deb
"ver" indicates the version of your kernel (e.g. 2.6.7) and N the number you chose.
To remove a kernel package you have created, issue the following command:
dpkg --purge --force-remove-essential kernel-image-ver
dpkg is smart enough to prevent you from removing the current running kernel.
While we've spent all of our time talking about binary packages, you also have the option of installing packages from source. The first step you need to take is to add deb-src directives to your sources.list file:
# from /etc/apt/sources.list:
deb-src ftp://ftp.us.debian.org/debian/ testing main non-free contrib
deb-src ftp://non-us.debian.org/debian-non-US testing/non-US main contrib non-free
Refer back to the section on configuring your sources.list to get all the options right for your installation. Because you've changed the sources.list file you will need to run "apt-get update". Now, run the following commands:
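# download only; without -d apt-get would also unpack the source itself
apt-get source -d foo
apt-get build-dep foo
dpkg-source -x foo_version-revision.dsc
# the build must be started from inside the unpacked source directory
cd foo-version
dpkg-buildpackage -rfakeroot -b
dpkg -i ../foo_version-revision_arch.deb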
You'll need to substitute the actual package name, version number, revision and architecture of the package you are compiling.
Now that we've gone over all the facets of Debian package management, you're probably wondering if there's an easier way, right? The answer to your question is Synaptic, a graphical front-end to the Debian package management system. With Synaptic you can manage your sources.list file, install and remove packages, and upgrade your system all through a convenient GUI.
- Run synaptic
- Enter root password
- Manage your packages