
How do I get document information from the command line? [closed]

asked 2012-02-18 01:00:44 +0200

Armando Ortiz

updated 2012-02-18 19:32:22 +0200

cloph

I love the command line capability because it allows me to work with documents like never before. However, I was wondering whether there is a way to get document info/properties in the same manner.

Maybe something like:

oodocinfo <filenameofdocument>


Closed for the following reason: the question is answered, right answer was accepted by Alex Kemp
close date 2016-02-28 20:34:55.173710

6 Answers


answered 2012-02-19 16:18:03 +0200

Luuk

updated 2012-03-05 14:11:31 +0200

There are probably others with more experience in writing scripts, but I submitted a script to pastebin.com: http://pastebin.com/mpMA1qxM

#!/bin/bash
# documentinfo, gives document info about a LibreOffice document.
# Created: 2012-02-19
#
# uses: xml, from http://xmlstar.sourceforge.net/
#

# show usage when no argument is given or the file does not exist
if [[ -z "$1" || ! -e "$1" ]]; then
        echo "Usage: $(basename "$0") <filename>"
        exit 1
fi

if [ -e "meta.xml" ]; then
        echo "Sorry, this cannot be done because some 'meta.xml' already exists"
else
        # extract 'meta.xml' from the inputfile
        unzip -qo "$1" meta.xml

        # 'xml el' lists the path of every element in meta.xml
        for f in $(xml el meta.xml); do
                # keep only the part after the last ':' (the local name)
                n=${f/*:}
                if [[ ! "$n" =~ "meta" ]]; then
                        w=$(xml sel -t -v "$f" meta.xml)
                        echo "$n: $w"
                fi
        done
        rm meta.xml
fi

answered 2012-02-19 22:33:15 +0200

Klau3

updated 2012-02-27 00:23:46 +0200

It's not perfect, but usable:

unzip -c [FILE NAME].odt meta.xml | tr -s " " "\n" | fmt

You could use this command's output to write a ~/.bashrc function that extracts the desired information, or maybe even extend the file command. It's not a one-liner, but it's possible.
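The ~/.bashrc function suggested above could look roughly like the sketch below. It is only an illustration: the names odf_meta_field and odfmeta are made up here, it uses unzip -p (print to stdout) instead of unzip -c, and it assumes that the wanted element sits on a single line of meta.xml and that simple text matching is good enough (XML entities such as &amp; are not decoded).

```shell
# Hypothetical ~/.bashrc helpers; the names are illustrative, not an existing tool.

# Read meta.xml on stdin and print the text of the first element whose
# local name is $1 (for example: title, subject, creator).
odf_meta_field() {
    grep -o "<[a-z]*:$1[ >][^<]*</" |
        sed -e 's/^[^>]*>//' -e 's/<\/$//' |
        head -n 1
}

# Usage: odfmeta FILE FIELD
# Streams meta.xml out of the ODF archive without writing it to disk.
odfmeta() {
    unzip -p "$1" meta.xml | odf_meta_field "$2"
}
```

With a document called report.odt (a placeholder name), `odfmeta report.odt title` would print the title stored in the document properties.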

:)


Comments

That's nice! Also remember ODF files are ZIP files, for which you can obtain detailed information using any ZIP file utility (zipinfo for example in GNU/Linux).

MagicFab (2012-03-05 14:09:37 +0200)

answered 2014-07-06 19:04:27 +0200

bencomp

You can use ExifTool to extract document information from many types of files, including Open Document Format files.

For example, after installing it, you can extract all information using the command

exiftool -a file.odt

and get, for example:

ExifTool Version Number         : 9.67
File Name                       : exiftest.odt
Directory                       : D:/Software
File Size                       : 9.1 kB
File Modification Date/Time     : 2014:07:06 18:39:05+02:00
File Access Date/Time           : 2014:07:06 18:39:05+02:00
File Creation Date/Time         : 2014:07:06 18:39:04+02:00
File Permissions                : rw-rw-rw-
File Type                       : ODT
MIME Type                       : application/vnd.oasis.opendocument.text
Initial-creator                 : Firstname Lastname
Creation-date                   : 2014:07:06 18:37:48.864000000
Editing-cycles                  : 1
Editing-duration                : P0D
Description                     : En een beetje commentaar.
Keyword                         : een
Keyword                         : sleutelwoord
Subject                         : Blaat
Title                           : Test
Date                            : 2014:07:06 18:39:04.738000000
Creator                         : Firstname Lastname
Document-statistic Table-count  : 0
Document-statistic Image-count  : 0
Document-statistic Object-count : 0
Document-statistic Page-count   : 1
Document-statistic Paragraph-count: 1
Document-statistic Word-count   : 3
Document-statistic Character-count: 17
Document-statistic Non-whitespace-character-count: 15
Generator                       : LibreOffice/4.2.4.2$Windows_x86 LibreOffice_project/63150712c6d317d27ce2db16eb94c2f3d7b699f8
User-defined Name               : Department
User-defined                    : My Department
User-defined Name               : Extra
User-defined                    : nog wat.
Preview PNG                     : (Binary data 1365 bytes, use -b option to extract)

answered 2013-04-01 21:39:03 +0200

migmruiz

updated 2013-04-01 21:40:08 +0200

I ended up with a small script to find out the number of pages. It is based on Luuk's answer and I published it here: https://github.com/migmruiz/opendocument-utils

To use it, just run:

wget https://raw.github.com/migmruiz/opendocument-utils/master/documentpages
chmod +x documentpages
./documentpages <filename.od?>

You can read it and adapt it if you want:

# show usage when no argument is given or the file does not exist
if [[ -z "$1" || ! -e "$1" ]]; then
  echo "Usage: $(basename "$0") <filename>"
  exit 1
fi

if [ -e "content.xml" ]; then
  echo "Sorry, this cannot be done because some 'content.xml' already exists"
else
  # extract 'content.xml' from the inputfile
  unzip -qo "$1" content.xml

  # the counter starts at 1 and grows by one per page-break element
  i=1

  for f in $(xmlstarlet el content.xml); do
    # keep only the part after the last ':' (the local name)
    n=${f/*:}
    if [[ "$n" =~ "page-break" ]]; then
      i=$((i + 1))
    fi
  done
  rm content.xml
  echo "$i"
fi
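As an alternative to counting page breaks, the ExifTool output quoted in another answer shows a page-count entry among the document statistics, which are stored as attributes in meta.xml. A minimal sketch that reads that value, assuming the attribute is written as meta:page-count="N" and that the number saved with the document is recent enough for your purpose (it may be missing or stale, depending on what last saved the file). The function names are made up for this example.

```shell
# Hypothetical helpers; the names are illustrative only.

# Read meta.xml on stdin and print the stored page count, if there is one.
odf_page_count() {
    grep -o 'meta:page-count="[0-9]*"' |
        head -n 1 |
        tr -cd '0-9\n'    # keep only the digits and the trailing newline
}

# Usage: odfpages FILE
# Streams meta.xml to stdout, so nothing is extracted into the current directory.
odfpages() {
    unzip -p "$1" meta.xml | odf_page_count
}
```

Because meta.xml is piped straight from the archive, this variant does not need the check for an existing file in the working directory.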

answered 2012-02-18 15:00:21 +0200

Luuk

$> file test.doc
test.doc: Composite Document File V2 Document, Little Endian, Os: Windows, Version 1.0, Code page: -535, Title: this is the title, Subject: this is the subject, Keywords: some keywords here, Comments: and some uninteresting comments here, Revision Number: 1, Total Editing Time: 01:05, Create Time/Date: Fri Feb 17 13:50:36 2012, Last Saved Time/Date: Fri Feb 17 13:51:39 2012

But this only works when the document is saved as .doc:

$> file test.odt
test.odt: OpenDocument Text


answered 2012-02-18 18:46:59 +0200

cloph

OpenDocument files are ZIP files that contain a bunch of XML and other files. The document info/properties (available in the UI via File|Properties) are stored in meta.xml inside that archive.

So when you want to read it from the command line, you need to write a little utility in your language of choice that extracts the info from meta.xml and prints it out.
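A minimal sketch of such a utility in bash, using only unzip, grep and sed. The function names are hypothetical, and the text matching is a simplification: it assumes each element and its text sit on one line of meta.xml, it skips elements that carry their data in attributes only (such as the document statistics), and it does not decode XML entities.

```shell
#!/bin/bash
# Sketch only: the function names are hypothetical.

# Read meta.xml on stdin and print "name: value" for every element that
# directly contains text.
print_odf_meta() {
    grep -o '<[a-z]*:[a-z-]*[^<>]*>[^<][^<]*</' |
        sed 's/^<[a-z]*:\([a-z-]*\)[^>]*>\(.*\)<\/$/\1: \2/'
}

# Usage: odfinfo FILE
# unzip -p writes meta.xml to stdout instead of extracting it to disk.
odfinfo() {
    unzip -p "$1" meta.xml | print_odf_meta
}
```

Called as `odfinfo report.odt` (a placeholder file name), this prints one line per property, for example a title line and a creator line if those are set.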
