How to edit pdf metadata from command line?

Ask Ubuntu Asked by bdr529 on December 10, 2021

I need a command line tool for editing metadata of pdf-files.

I’m using an Aiptek MyNote Premium tablet to write my notes and minutes. I import them later and convert them to PDF automatically with a simple script that uses Inkscape and Ghostscript.

Is there a command-line tool that can add categories to a PDF’s metadata, so I can find the PDF later (e.g. with gnome-do) by category?

Update: I tried the solution with pdftk and it works, but it seems that gnome-do doesn’t take PDF metadata into account. Is there a way to get gnome-do to do that?

7 Answers

I needed to blank out the Author field in a PDF exported from LibreOffice. None of the solutions listed above worked for me, so I used hexedit and overwrote the Author field. Blunt instrument but effective!

In detail:

  1. Run:

    $ hexedit file.pdf
    
  2. Tab to switch to ASCII.

  3. Ctrl+S to search for "Author".

  4. Skip the <FEFF at the start of the field.

  5. Write 0 over all characters (except I preserved three 0x03 characters... YMMV) up to the closing >.

  6. Ctrl+X to save and exit.
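Before opening the file in hexedit, it can help to check where the field sits. This is a minimal sketch, assuming GNU grep and assuming the Author entry is stored uncompressed in the file (as it was in my LibreOffice export); `find_field_offsets` is a hypothetical helper name, not part of any tool:

```shell
# Hypothetical helper: print the byte offset of each "/<Field>" key in a PDF.
# Assumes GNU grep: -a treats the binary file as text, -b prints byte offsets,
# -o prints only the matching part of the line.
find_field_offsets() {
    grep -abo "/$2" "$1"
}

# Usage (file name is a placeholder):
#   find_field_offsets file.pdf Author
```

Each output line has the form offset:/Author, which tells you roughly where to look once the file is open in the hex editor. If nothing is printed, the field is probably not stored in plain form and the hexedit approach will not apply.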

Answered by Jonathan on December 10, 2021

The act library also supports this, so you can edit PDF metadata from the command line with it as well.

$ npm install @lancejpollard/act -g
$ act update input.pdf --title foo --author bar --subject baz -k one -k two

You can also set -p publisher, -c creator, -t0 created date, and -tn updated date.

Answered by Lance Pollard on December 10, 2021

I have extensively tested pdftk and exiftool, using exiftool both at the command line and through a graphical window. I tested them on small, medium-sized and very large PDF documents and found issues with the largest and most complex ones. In my experience, pdftk and exiftool work reliably only for small PDF documents with simple formatting. For large and complex PDF documents (e.g. more than 80 pages with multiple fonts), images and/or characters may drop out of the last pages after the metadata has been edited. The solution may lie in using Ghostscript, which I have only just seen. No doubt these programs will improve with time.

In the meantime, I have found a solution: running a tiny, single-window freeware program under the current version of Wine on Ubuntu. It also works for these large, complex PDF documents: BeCyPDFMetaEdit (available e.g. from freeware libraries such as Softpedia).

Answered by Aristo T. on December 10, 2021

To elaborate on the pdftk method, which is nice because it shows you everything that's being set while letting you change anything you like, here is a script (for your .bashrc or other aliases file) that does it with one command. It dumps the metadata to a text file, opens that file in your favourite editor, writes a new version of the PDF with your changes applied, and sets the timestamps of the new PDF to match the original's. To use it, after re-sourcing your .bashrc file, just type

editPDFmetadata myfile.pdf

Here's the function:

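editPDFmetadata() {
    # The new copy is written next to the original, e.g. myfile-new.pdf
    local output="${1%.pdf}-new.pdf"
    # Build the temp file name from the base name so paths in $1 also work
    local metadata="${TMPDIR:-/tmp}/$(basename "$1")-report.txt"
    pdftk "$1" dump_data output "$metadata"
    # Fall back to vi if $EDITOR is unset
    ${EDITOR:-vi} "$metadata"
    pdftk "$1" update_info "$metadata" output "$output"
    # Copy the original's timestamps to the new file
    touch -r "$1" "$output"
}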

Simply place the definition above into the .bashrc file in your home folder, then open a new terminal and it will be ready to use.

Answered by CPBL on December 10, 2021

Using Ghostscript

Install ghostscript with:

$ sudo apt install ghostscript

Create a file named pdfmarks with content similar to this:

[ /Title (Document title)
  /Author (Author name)
  /Subject (Subject description)
  /Keywords (comma, separated, keywords)
  /ModDate (D:20061204092842)
  /CreationDate (D:20061204092842)
  /Creator (application name or creator note)
  /Producer (PDF producer name or note)
  /DOCINFO pdfmark

Then combine this pdfmarks file with a PDF, PS or EPS input file:

gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=output.pdf original.pdf pdfmarks

Source: http://milan.kupcevic.net/ghostscript-ps-pdf/
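For a conversion script, the pdfmarks file can be generated rather than written by hand. The values are PostScript strings, so parentheses and backslashes in them should be escaped. The following is only a sketch under stated assumptions: the values are plain ASCII, and `ps_escape` and `write_pdfmarks` are hypothetical helper names I made up for illustration.

```shell
# Hypothetical helper: escape backslashes and parentheses so the text is
# safe inside a PostScript (...) string.
ps_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/[()]/\\&/g'
}

# Hypothetical helper: print a DOCINFO pdfmark with title, author and keywords.
write_pdfmarks() {
    printf '[ /Title (%s)\n' "$(ps_escape "$1")"
    printf '  /Author (%s)\n' "$(ps_escape "$2")"
    printf '  /Keywords (%s)\n' "$(ps_escape "$3")"
    printf '  /DOCINFO pdfmark\n'
}

# Usage (file names and values are placeholders):
#   write_pdfmarks "Minutes (draft)" "Author name" "notes, minutes" > pdfmarks
#   gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=output.pdf original.pdf pdfmarks
```

Only three fields are written here to keep the sketch short; the other keys from the example above can be added in the same way.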

Answered by Serge Stroobandt on December 10, 2021

Give exiftool a try; it is available in the repositories from the package libimage-exiftool-perl.

As an example, if you have a PDF file called drawing.pdf and want to update its metadata, use exiftool like this:

exiftool -Title="This is the Title" -Author="Happy Man" -Subject="PDF Metadata" drawing.pdf

For some reason, the Subject entered ends up in the Keywords field of the PDF's metadata. This is not a problem in some cases and may even be desirable, but it can be problematic: Evince and the Nautilus metadata previewer do not show this, whereas Adobe Acrobat viewer and PDF-XChange Viewer do.

The program creates a backup of the original file unless you use the -overwrite_original switch, which means a duplicate will exist in the folder where the updated PDF is. In the example above, a file named drawing.pdf_original will be created.

Use the overwrite switch at your own risk. My suggestion is not to use it, and instead to script something that moves the backup file to a better location, just in case.
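One way to script that could look like the sketch below. The function names `stash_backup` and `tag_and_stash_backup` and the backup directory are hypothetical; the only thing taken from exiftool's behaviour is the _original suffix described above.

```shell
# Hypothetical helper: move exiftool's "<file>_original" backup into a
# separate directory instead of leaving it next to the PDF.
stash_backup() {
    local pdf="$1" backup_dir="$2"
    [ -e "${pdf}_original" ] || return 0   # nothing to move
    mkdir -p "$backup_dir" && mv -- "${pdf}_original" "$backup_dir/"
}

# Hypothetical wrapper: run exiftool with the given tag options, then
# stash the backup it leaves behind.
tag_and_stash_backup() {
    local pdf="$1" backup_dir="$2"
    shift 2
    exiftool "$@" "$pdf" || return 1
    stash_backup "$pdf" "$backup_dir"
}

# Usage (paths and values are placeholders):
#   tag_and_stash_backup drawing.pdf "$HOME/pdf-backups" -Title="This is the Title" -Author="Happy Man"
```

Note that a later run on the same file would overwrite the earlier backup in that directory, so a dated subdirectory may be worth adding if you need to keep every version.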

Answered by Sabacon on December 10, 2021

You can edit metadata using pdftk. Check out the update_info operation (or update_info_utf8 if you need accented characters). Below is an example of the data file:

InfoKey: Title
InfoValue: Mt-Djing: multitouch DJ table
InfoKey: Subject
InfoValue: Dissertation for Master degree
InfoKey: Keywords
InfoValue: DJing, NUI, multitouch, user-centered design
InfoKey: Author
InfoValue: Pedro Lopes

(Source)
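For the use case in the question (adding categories from a script), the data file can be generated on the fly. This is a rough sketch, not a tested recipe: `make_keywords_info` and `tag_pdf` are hypothetical names, and the file is written in the InfoKey/InfoValue format of the example above. Newer pdftk versions print an InfoBegin line before each pair in their dump_data output, so compare with what your version dumps and adjust the helper if needed.

```shell
# Hypothetical helper: print a pdftk data file that sets only the Keywords
# field, in the InfoKey/InfoValue format of the example above.
make_keywords_info() {
    printf 'InfoKey: Keywords\nInfoValue: %s\n' "$1"
}

# Hypothetical wrapper: write "<name>-tagged.pdf" with the given keywords.
tag_pdf() {
    local input="$1" keywords="$2" info status
    info="$(mktemp)" || return 1
    make_keywords_info "$keywords" > "$info"
    pdftk "$input" update_info "$info" output "${input%.pdf}-tagged.pdf"
    status=$?
    rm -f "$info"
    return "$status"
}

# Usage (file name and keywords are placeholders):
#   tag_pdf minutes.pdf "minutes, project-x"
```

The original PDF is left untouched; the tagged copy is written next to it.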

Answered by Olli on December 10, 2021
