How to remove date stamp from changes and comments?

Lawyers, writers, editors and others dealing with sensitive text documents have a need to remove metadata from changes and comments. This feature seems to be nonexistent in LO Writer.

Example: I am a writer/editor/translator. Clients send me documents for review or editing. The automatic date stamping of changes and comments allows them to monitor my work patterns, time of day when the work was done, and my time usage from the files I send back. None of this is their business.

Is there a way to get rid of change and comment metadata? If not built into Writer, maybe with an external script? This is important, and the lack of this feature is a serious drawback in Writer.

(Running LibreOffice 3.5.7.2 on Linux Mint 13 Maya KDE)

I also need a way to remove this information from comments. I usually insert many comments on each page, so without the metadata (author name and date) there would be more room for other comments. I think it’s a serious missing feature.

Not sure if this meets your needs: ‘File’>‘Properties …’>‘Security’ ; ‘Record changes’ off.

Unzip the .odt file; it contains a content.xml file. A comment looks like this:

<office:annotation office:name="__Fieldmark__1_514712390">
  <dc:creator>John Smith</dc:creator>
  <dc:date>2014-05-29T14:38:39.846905891</dc:date>
  <text:list text:style-name="">
   <text:list-item>
    <text:p text:style-name="P2">
     <text:span text:style-name="T1">A comment text.</text:span>
    </text:p>
   </text:list-item>
  </text:list>
</office:annotation>

Now all you need to do is replace the value of every <dc:date> element, such as <dc:date>2014-05-29T14:38:39.846905891</dc:date>, with a dummy date, e.g. 2000-01-01T12:00:00.0, and save the result back into the original .odt file. Or why not use the date when the document was created (in meta.xml): <meta:creation-date>2014-05-29T14:38:30.386644381</meta:creation-date>

The following Python script does it for you and saves the result as name-clean.odt:

#!/usr/bin/env python3

import glob, re, os, sys, zipfile

for pattern in sys.argv[1:]:
    for filepath in glob.glob(pattern):
        dirname, basename = os.path.split(filepath)
        root, ext = os.path.splitext(basename)
        newname = '%s-clean%s' % (root, ext)
        outpath = os.path.join(dirname, newname)
        zin = zipfile.ZipFile(filepath, 'r')
        zout = zipfile.ZipFile(outpath, 'w')
        for item in zin.infolist():
            if item.filename == 'mimetype':
                zout.writestr(item, zin.read(item.filename))
        for item in zin.infolist():
            if item.filename == 'meta.xml':
                data = zin.read(item.filename).decode('utf-8')
                # Find the document creation timestamp.
                timestamp = re.search(r'<meta:creation-date>([^<]+)</meta:creation-date>', data).group(1)
        for item in zin.infolist():
            if item.filename == 'content.xml':
                data = zin.read(item.filename).decode('utf-8')
                # Replace every <dc:date> value with the creation timestamp.
                data = re.sub(r'<dc:date>[^<]+</dc:date>', '<dc:date>%s</dc:date>' % timestamp, data)
                zout.writestr(item, data.encode('utf-8'))
            elif item.filename != 'mimetype':
                zout.writestr(item, zin.read(item.filename))
        zout.close()
        zin.close()

This deals only with comments, but it looks like you need to strip the dates from tracked changes as well. Feel free to modify the script to suit your needs.
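To see which timestamps a document actually carries, it can help to list every <dc:date> in content.xml together with its parent element. The following is only a minimal sketch: the function names find_dates and find_dates_in_odt are made up for illustration, and it assumes that tracked changes keep their date in the same <dc:date> element (inside office:change-info), which is worth verifying on your own files.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace-qualified tag of <dc:date> as ElementTree reports it.
DC_DATE = "{http://purl.org/dc/elements/1.1/}date"

def find_dates(xml_bytes):
    """Return (parent local name, timestamp) for every <dc:date> found."""
    root = ET.fromstring(xml_bytes)
    found = []
    for parent in root.iter():
        for child in parent:
            if child.tag == DC_DATE:
                # Drop the '{namespace}' prefix from the parent's tag.
                found.append((parent.tag.split('}')[-1], child.text))
    return found

def find_dates_in_odt(path):
    """Hypothetical helper: run find_dates on content.xml of an .odt file."""
    with zipfile.ZipFile(path) as z:
        return find_dates(z.read('content.xml'))
```

If the list shows parents other than annotation, those dates sit outside comments; since the script above rewrites every <dc:date> in content.xml, it may already cover them.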

@mahfiaz your Python script is so cool! It did the trick. However, it showed a warning: "remove_time_date_stamps.py:28: UserWarning: Duplicate name: u'mimetype'
zout.writestr(item, zin.read(item.filename))" What does this warning mean, please?

It warns that a file named mimetype is already in the zip (it was added on line 15, because the mimetype file has to be the first entry in the zip). This means the zip is somewhat malformed, since it contains two files with exactly the same name. It has not been a problem for any viewer I have used, nor for LibreOffice.

To get rid of the warning you could change line 18 from
else:
to
elif item.filename != 'mimetype':

I did this change in the script above.
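The same idea can be written so that mimetype is emitted first and exactly once in a single pass, which avoids the duplicate by construction. This is a rough sketch, not a drop-in replacement: copy_odt and its transform parameter are hypothetical names, and storing mimetype uncompressed is done here on the assumption that ODF readers expect it that way.

```python
import zipfile

def copy_odt(src, dst, transform=None):
    """Copy an ODF zip, writing 'mimetype' first and only once.

    transform is an optional, hypothetical hook taking (name, data)
    and returning the bytes to store for that entry.
    """
    with zipfile.ZipFile(src, 'r') as zin, \
         zipfile.ZipFile(dst, 'w', zipfile.ZIP_DEFLATED) as zout:
        items = zin.infolist()
        # Put the mimetype entry ahead of everything else.
        ordered = ([i for i in items if i.filename == 'mimetype'] +
                   [i for i in items if i.filename != 'mimetype'])
        for item in ordered:
            data = zin.read(item.filename)
            if transform is not None:
                data = transform(item.filename, data)
            if item.filename == 'mimetype':
                # Store uncompressed (assumption about what readers expect).
                zout.writestr('mimetype', data,
                              compress_type=zipfile.ZIP_STORED)
            else:
                zout.writestr(item, data)
```

A transform that rewrites only content.xml and returns every other entry unchanged would reproduce what the script above does.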

This does not remove all the metadata from a Writer document. There are still Author names and time stamps on all the comments you add.

Does anyone know how to remove these?

Thanks!

What would “this” refer to?
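If "this" means the script above, then the author names are indeed left alone: it only rewrites <dc:date>. Going by the comment XML shown earlier, the name lives in <dc:creator>, so the same regex approach could be extended to cover it. A minimal sketch, assuming a placeholder name is acceptable; scrub_metadata and its default values are invented for illustration, and meta.xml may hold further author fields that this does not touch.

```python
import re
from xml.sax.saxutils import escape

def scrub_metadata(xml_text, author="Anonymous", date="2000-01-01T12:00:00"):
    """Replace every <dc:creator> and <dc:date> value in an XML string."""
    creator = '<dc:creator>%s</dc:creator>' % escape(author)
    stamp = '<dc:date>%s</dc:date>' % escape(date)
    # Lambdas keep backslashes in the replacement text literal.
    xml_text = re.sub(r'<dc:creator>[^<]*</dc:creator>',
                      lambda m: creator, xml_text)
    xml_text = re.sub(r'<dc:date>[^<]*</dc:date>',
                      lambda m: stamp, xml_text)
    return xml_text
```

Applied to the decoded content.xml in place of the single re.sub call in the script, this would replace both fields in one go.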

An enhancement request regarding this feature has now been filed:

https://bugs.documentfoundation.org/show_bug.cgi?id=90401