23.2.2016

Introduction

Petr Bravenec has recently introduced the application GeoSign, which aims to facilitate the work of Czech surveyors. GeoSign needs to manipulate PDF documents. This manipulation is handled by PDF Manipulation Utility (PDFMU), a utility we created for this purpose. We are releasing the source code of PDFMU under the AGPL license. The current version of the program is 1.0.

How can you use PDFMU and what can you use it for?

PDFMU can perform various operations on PDF documents, for example adding digital signatures or attachments. It uses a command line interface, so it is suitable for batch processing. Its standard Windows name is pdfmu (or pdfmu.exe). I will use this name in the following sections to denote the PDFMU executable.

Since PDFMU is implemented in Java, a JRE (version 7 or newer) must be installed in order to run it.

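PDFMU supports the following operations on PDF documents:
  • update-version: changes the PDF version of a document
  • update-properties: sets or removes the properties of a document
  • attach: attaches a file to a document
  • sign: adds a digital signature to a document
  • inspect: prints out the PDF version, properties and digital signatures of a document

In the following sections I will describe these operations.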

update-version

Every PDF document records the version of PDF used to encode it. The PDF versions released so far are backwards compatible, so it is possible to raise the PDF version of a document without damaging the document's consistency. However, changing the version of a document using PDFMU invalidates all the embedded digital signatures, so it is recommended to change the version (if necessary) before signing the document.

Usage example
  • pdfmu update-version document.pdf --force --version=1.6: changes the PDF version of the document document.pdf to the value 1.6

PDFMU supports PDF versions 1.2 through 1.7.
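
The recommendation to change the version before signing can be captured in a small script. The following Python sketch only builds the argument lists; it uses just the options shown in the examples in this article, and the helper names update_version_command and version_then_sign are my own, not part of PDFMU.

```python
# PDF versions the article lists as supported by PDFMU.
SUPPORTED_VERSIONS = ("1.2", "1.3", "1.4", "1.5", "1.6", "1.7")


def update_version_command(pdf_path, version):
    """Return the argument list for one `pdfmu update-version` call.

    Hypothetical helper: it only mirrors the usage example above.
    """
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported PDF version: {version!r}")
    return ["pdfmu", "update-version", pdf_path, "--force", f"--version={version}"]


def version_then_sign(pdf_path, version, keystore_path):
    """Return both commands in a safe order: raise the version first, sign last."""
    sign = [
        "pdfmu", "sign", pdf_path, "--force",
        "--keystore-type=pkcs12", f"--keystore={keystore_path}",
    ]
    return [update_version_command(pdf_path, version), sign]


if __name__ == "__main__":
    for command in version_then_sign("document.pdf", "1.6", "cert.p12"):
        print(command)
```

Each list can be passed to subprocess.run(command, check=True) once the pdfmu executable is on the PATH.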

update-properties

PDF documents contain so-called properties. Every property in a document has a unique name and a value. PDFMU can set and remove any properties in a PDF document with the exception of the properties Producer and ModDate.

Some of the properties have special support in the PDF specification; we call them standard properties. The operation pdfmu update-properties has special options to set the standard properties:

| Internal name | English name | Description | PDFMU option |
| --- | --- | --- | --- |
| Title | Title | | --Title |
| Author | Author | The name of the person who created the document. | --Author |
| Subject | Subject | | --Subject |
| Keywords | Keywords | | --Keywords |
| Creator | Application | The name of the product that created the document in the original format. | --Creator |
| Producer | PDF Producer | The name of the product that converted the document from the original format to PDF. | (none) |
| CreationDate | Created | The date and time the document was created. | --CreationDate |
| ModDate | Modified | The date and time the document was most recently modified. | (none) |
| Trapped | | Has the document been modified to include trapping information? | --Trapped |

PDFMU sets the properties Producer and ModDate (date and time of last modification) automatically.

Usage examples
  • pdfmu update-properties document.pdf --force --Title="My document": sets the property Title in the document document.pdf to the value "My document"
  • pdfmu update-properties document.pdf --force --set "document owner" "Al Pine": sets the property "document owner" of the document document.pdf to the value "Al Pine"
  • pdfmu update-properties document.pdf --force --clear "document owner": removes the property "document owner" from the document document.pdf
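
The rules above (dedicated options for standard properties, --set and --clear for the others, and no user control over Producer and ModDate) can be sketched as a Python helper that builds the argument list. The function name is hypothetical, and combining several changes in a single call is my assumption; the examples above only show one change per call.

```python
# Standard properties that have a dedicated option in the table above.
STANDARD = ("Title", "Author", "Subject", "Keywords",
            "Creator", "CreationDate", "Trapped")
# PDFMU sets these two itself, so the user cannot change them.
READ_ONLY = ("Producer", "ModDate")


def update_properties_command(pdf_path, set_values=None, clear_names=()):
    """Return the argument list for one `pdfmu update-properties` call.

    Hypothetical helper. Passing several properties at once is an
    assumption; the article only shows one change per call.
    """
    command = ["pdfmu", "update-properties", pdf_path, "--force"]
    for name, value in (set_values or {}).items():
        if name in READ_ONLY:
            raise ValueError(f"{name} is set automatically by PDFMU")
        if name in STANDARD:
            # Standard property: use its dedicated option.
            command.append(f"--{name}={value}")
        else:
            # Any other property: use the generic --set option.
            command += ["--set", name, value]
    for name in clear_names:
        if name in READ_ONLY:
            raise ValueError(f"{name} is set automatically by PDFMU")
        command += ["--clear", name]
    return command


if __name__ == "__main__":
    print(update_properties_command("document.pdf", {"Title": "My document"}))
    print(update_properties_command("document.pdf", {"document owner": "Al Pine"}))
    print(update_properties_command("document.pdf", clear_names=["document owner"]))
```

Building a list of arguments instead of a single command string avoids quoting problems with values that contain spaces, such as "My document".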

attach

A PDF document may contain other files as attachments. Each attachment has a name and a description. A call to pdfmu attach attaches one file and optionally sets its name (option --rename) and description (option --description).

Usage examples
  • pdfmu attach document.pdf attachment.txt --force: attaches the file attachment.txt to the document document.pdf (under the name attachment.txt)
  • pdfmu attach document.pdf attachment.txt --force --rename="my file.txt" --description="Important information": attaches the file attachment.txt to the document document.pdf under the name my file.txt with the description "Important information"
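
Because one call attaches one file, attaching several files means several calls. A minimal Python sketch of that loop follows; the helper names are my own and only the options shown above are used.

```python
def attach_command(pdf_path, attachment_path, rename=None, description=None):
    """Return the argument list for one `pdfmu attach` call (hypothetical helper)."""
    command = ["pdfmu", "attach", pdf_path, attachment_path, "--force"]
    if rename is not None:
        command.append(f"--rename={rename}")
    if description is not None:
        command.append(f"--description={description}")
    return command


def attach_all_commands(pdf_path, attachment_paths):
    """One call attaches one file, so build one command per attachment."""
    return [attach_command(pdf_path, path) for path in attachment_paths]


if __name__ == "__main__":
    print(attach_command("document.pdf", "attachment.txt",
                         rename="my file.txt",
                         description="Important information"))
    for command in attach_all_commands("document.pdf", ["a.txt", "b.txt"]):
        print(command)
```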

sign

Adding digital signatures is the most complex operation of PDFMU with respect to both the user interface and the implementation. For a better understanding of digital signatures in PDF documents I recommend the book Digital signatures for PDF documents. In the following paragraphs I will attempt to summarize the basic knowledge necessary to work with digital signatures in PDF.

A digital signature of a file is information that is tied both to the content of the file and to the identity of the person or institution that created the signature. A signature allows us to verify that the file has not changed since the moment of signing and that it was really signed by whoever claims to be the signer. Signatures are usually used to confirm authorship or to authenticate a document.

A timestamp of a file is information that allows us to verify that the file already existed in its current form at the time recorded in the timestamp.

A timestamp authority is a server that issues timestamps. If we trust the server, we can also trust the (valid) timestamps issued by this server.

An internal digital signature (or timestamp) of a file is a digital signature (or timestamp) embedded in the file.

A PDF document may contain one or more internal digital signatures. Every signature contains its signing time. If the signing time is recorded in the form of a timestamp, it can be verified.
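
To illustrate what "tied to the content of the file" means, here is a deliberately simplified Python sketch. It is not a digital signature and it is not how PDFMU works internally: it only computes a SHA-256 digest, which is the kind of content fingerprint that signature schemes typically sign with the signer's private key. The key and certificate part is left out entirely, and the sample bytes are made up.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a SHA-256 fingerprint of the given content."""
    return hashlib.sha256(data).hexdigest()


# Made-up stand-in for the bytes of a document.
original = b"example document content"
# The fingerprint recorded at "signing" time.
recorded = content_digest(original)

# An unchanged file yields the same fingerprint.
print(content_digest(original) == recorded)  # prints True

# Any change to the content, even one added space, yields a different one.
tampered = original + b" "
print(content_digest(tampered) == recorded)  # prints False
```

A real signature additionally binds the fingerprint to the signer's identity, which is what lets a verifier check who signed the file and not only whether it has changed.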

Usage examples
  • pdfmu sign document.pdf --force --keystore-type=pkcs12 --keystore=cert.p12: signs the document document.pdf using the private key and certificate saved in the file cert.p12
  • pdfmu sign document.pdf --force --keystore-type=pkcs12 --keystore=cert.p12 --tsa-url="http://example.com/tsa" --tsa-username=name --tsa-password=password: same as previous, and additionally adds a timestamp from the timestamp authority http://example.com/tsa, using the username name and password password for authorization
  • pdfmu sign document.pdf --force --keystore-type=Windows-MY --key-alias="Al Pine": signs the document document.pdf using the private key and certificate saved in the Windows certificate store under the friendly name "Al Pine"
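
The three variants above differ only in which options are present, so they fit into one Python helper. This is a sketch with a hypothetical function name; it uses only the options shown in the examples and adds the timestamp authority credentials only when a timestamp authority URL is given.

```python
def sign_command(pdf_path, keystore_type, keystore=None, key_alias=None,
                 tsa_url=None, tsa_username=None, tsa_password=None):
    """Return the argument list for one `pdfmu sign` call (hypothetical helper)."""
    command = ["pdfmu", "sign", pdf_path, "--force",
               f"--keystore-type={keystore_type}"]
    if keystore is not None:
        command.append(f"--keystore={keystore}")
    if key_alias is not None:
        command.append(f"--key-alias={key_alias}")
    if tsa_url is not None:
        # Request a timestamp from the given timestamp authority.
        command.append(f"--tsa-url={tsa_url}")
        if tsa_username is not None:
            command.append(f"--tsa-username={tsa_username}")
        if tsa_password is not None:
            command.append(f"--tsa-password={tsa_password}")
    return command


if __name__ == "__main__":
    print(sign_command("document.pdf", "pkcs12", keystore="cert.p12"))
    print(sign_command("document.pdf", "pkcs12", keystore="cert.p12",
                       tsa_url="http://example.com/tsa",
                       tsa_username="name", tsa_password="password"))
    print(sign_command("document.pdf", "Windows-MY", key_alias="Al Pine"))
```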

inspect

The operation inspect prints out the PDF version, properties and digital signatures of a document, that is, all the information that can be modified using the operations update-version, update-properties and sign.

Usage example
  • pdfmu inspect document.pdf: prints out the information about the document document.pdf
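
Since PDFMU is meant for batch processing, inspecting a whole directory is a natural use. The following Python sketch builds one inspect command per PDF file; the helper name is my own and the directory is whatever you pass in.

```python
from pathlib import Path


def inspect_commands(directory):
    """Return one `pdfmu inspect` command per *.pdf file in `directory`.

    Hypothetical helper; the files are processed in sorted order.
    """
    pdf_paths = sorted(Path(directory).glob("*.pdf"))
    return [["pdfmu", "inspect", str(path)] for path in pdf_paths]


if __name__ == "__main__":
    # List the commands for the PDF files in the current directory.
    for command in inspect_commands("."):
        print(command)
```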

Additional options

Besides the options shown in the examples above, PDFMU offers many more, for example printing in JSON format or timestamp authority authorization using a certificate. You can display all the supported options using the option --help. Calling pdfmu --help displays the basic options; calling, for example, pdfmu sign --help displays the options specific to the operation sign.

Do you want to try PDFMU out?

Then head to the GitHub repository, download the source code and build the program using Maven. You can find more detailed instructions in the README.

Implementation

PDFMU is written in Java. The core functionality of each of the supported operations is implemented in the library iText; PDFMU basically exposes a part of the iText API through a command line interface.

Other than that, PDFMU uses the library Argparse4j for parsing command line arguments and the library Jackson for formatting the output in JSON.

The source code of PDFMU is freely available in the GitHub repository.

Hobrasoft s.r.o.