How to list licences of all installed packages in Debian-based distros?


I want to list the licenses of all installed packages on my Ubuntu server. I can dump everything using this approach (from a 2013 post):

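# Keep installed packages only and strip any ":arch" suffix (e.g. libc6:amd64),
# because the directories under /usr/share/doc use the plain package name
packages=$( dpkg --get-selections | awk '$2 == "install" { sub(/:.*/, "", $1); print $1 }' | sort -u )
for package in $packages; do
  echo "$package: "
  # Not every package ships its own copyright file; skip missing ones quietly
  cat "/usr/share/doc/$package/copyright" 2>/dev/null
  echo; echo
done > /tmp/licenses.txt
less /tmp/licenses.txt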

But the output is a huge useless file with all the copyright data for each package. I need something like:

package: package_name        licence: licence_name

Is there a parser or some other tool to get data like this?


Solution

  • What you are trying to do is poorly supported at the moment, though there is an effort under way to provide machine-readable information in the /usr/share/doc/*/copyright files. See for example this excerpt:

    Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
    Upstream-Name: at
    Source: git://anonscm.debian.org/collab-maint/at.git
    Comment: This package was debianized by its author Thomas Koenig
     <ig25@rz.uni-karlsruhe.de>, taken over and re-packaged first by Martin
     Schulze <joey@debian.org> and then by Siggy Brentrup <bsb@winnegan.de>,
     and then taken over by Ryan Murray <rmurray@debian.org>.
     .
     In August 2009 the upstream development and Debian packaging were taken over
     by Ansgar Burchardt <ansgar@debian.org> and Cyril Brulebois <kibi@debian.org>.
     .
     This may be considered the experimental upstream source, and since there
     doesn't seem to be any other upstream source, the only upstream source.
    
    Files: *
    Copyright: 1993-1997,  Thomas Koenig <ig25@rz.uni-karlsruhe.de>
               1993,       David Parsons
               2002, 2005, Ryan Murray <rmurray@debian.org>
    License: GPL-2+
    
    Files: getloadavg.c
    Copyright: 1985-1995, Free Software Foundation Inc
    License: GPL-2+
    
    Files: posixtm.*
    Copyright: 1989-2007, Free Software Foundation Inc
    License: GPL-3+
    
    Files: parsetime.pl
    Copyright: 2009, Ansgar Burchardt <ansgar@debian.org>
    License: ISC 
    
    License: GPL-2+
     This program is free software; you can redistribute it
     and/or modify it under the terms of the GNU General Public
     License as published by the Free Software Foundation; either
     version 2 of the License, or (at your option) any later
     version.
    

    See the specification at http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ (the same URL as in the Format: field above) for details.

    As you can see, the basic assumption that there is necessarily a single license per package is false. A single copyright file may list several licenses, each applying to different files in the package. Depending on which problem you are trying to solve, it may of course be possible to ignore many of them: for example, if you want to investigate whether or not you have anything under the Apache license, that should be easy to do for packages which have transitioned to this new format.
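
    As a rough illustration of what the machine-readable format allows, the sketch below collects the distinct License: short names from each package's copyright file and prints them in roughly the "package: ... licence: ..." shape asked for. The helper names list_licenses and report_licenses are made up for this example, and the script assumes that a Format: header mentioning copyright-format is a good enough sign that the file follows the specification. It reports every license named in the file, not one license per package.

```shell
#!/bin/sh
# Sketch only: list_licenses and report_licenses are hypothetical helper names.

# list_licenses FILE
# Print the distinct License: short names found in FILE, comma-separated.
# Fails (status 1) if FILE does not declare the machine-readable format.
list_licenses() {
    grep -q '^Format:.*copyright-format' "$1" 2>/dev/null || return 1
    awk '/^License:/ {
             sub(/^License:[ \t]*/, "")   # drop the field name
             sub(/[ \t]+$/, "")           # drop trailing blanks
             if ($0 != "" && !seen[$0]++) {
                 sep = (out == "" ? "" : ", ")
                 out = out sep $0
             }
         }
         END { print out }' "$1"
}

# report_licenses [DOC_ROOT]
# Read package names on stdin, one per line, and print one summary line each.
# DOC_ROOT defaults to /usr/share/doc.
report_licenses() {
    doc_root=${1:-/usr/share/doc}
    while read -r pkg; do
        if lic=$(list_licenses "$doc_root/$pkg/copyright"); then
            printf 'package: %s\tlicence: %s\n' "$pkg" "$lic"
        else
            printf 'package: %s\tlicence: (no machine-readable copyright file)\n' "$pkg"
        fi
    done
}

# On a Debian-based system:
#   dpkg-query -W -f='${Package}\n' | sort -u | report_licenses
```

    Duplicates are dropped because the same short name normally appears both in a Files: paragraph and in the stand-alone License: paragraph that carries the full text, as in the excerpt above.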

    This is new with Debian Jessie, released in 2015; older versions of Debian do not have anything like this. The best you can do if you need to audit a system with older packages is probably to grep the copyright files for fragments which look like GPL, BSD, MIT, etc. and then hope you're not missing too much. But hope on top of some flimsy grepping seems anathema to any proper legal work, which I think we can assume is the reason you are attempting this. A better approach might be to find the current copyright files for the packages you are auditing, with the roughly machine-readable information, and hope (there's that word again) that they are adequate for the older version you have installed, too.
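
    For what it's worth, the grep fallback for free-form copyright files could look something like the sketch below. The function name guess_licenses and the handful of patterns are assumptions chosen for illustration; they are not taken from any existing tool and are nowhere near complete. Copyright files wrap their text, so a phrase split across two lines will be missed, which is exactly the kind of flimsiness described above.

```shell
#!/bin/sh
# Sketch only: guess_licenses is a hypothetical helper, and the patterns are
# a small, incomplete sample of wording that commonly appears in license texts.

# guess_licenses FILE
# Print the license families whose typical wording appears somewhere in FILE.
guess_licenses() {
    [ -r "$1" ] || return 1
    found=""
    # "GPL" not preceded by "L", or the opening words of the GPL notice
    grep -Eq 'GNU General Public|(^|[^L])GPL' "$1" && found="$found GPL"
    grep -Eq 'GNU (Lesser|Library) General Public|LGPL' "$1" && found="$found LGPL"
    # Opening words shared by the common BSD license variants
    grep -Eq 'Redistribution and use in source and binary forms' "$1" && found="$found BSD"
    # Opening words of the MIT (Expat) license
    grep -Eq 'Permission is hereby granted, free of charge' "$1" && found="$found MIT"
    grep -Eq 'Apache License' "$1" && found="$found Apache"
    printf '%s\n' "${found# }"
}

# Example:
#   guess_licenses /usr/share/doc/at/copyright
```

    An empty result only means that none of these patterns matched, not that the package is unlicensed.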

    (For comparison, older versions, too, are available at http://metadata.ftp-master.debian.org/changelogs/main/a/at/ for you to examine.)

    I don't follow Ubuntu very closely any longer, but I assume they have been picking up this change for a few versions now. Indeed, http://packages.ubuntu.com/xenial/at seems to have the same copyright file.