Comparing documents with OpenOffice and Python
20 Aug 2005
I'm finally able to focus on something other than reading and playing Tales of Symphonia, and I've decided I need some actual actions to work with before I look at making an action registry in KnowledgeTree.
I've been interested in automatic document conversion, and I've got recipes for OpenOffice.org and Office to generate PDF from word processor documents (obviously OO.org can do more source formats).
What really seemed cool, though, was the ability to generate diffs between documents, which OpenOffice.org does rather nicely. It seems that no-one has a simple working example of using this via the OpenOffice.org UNO/URP interface, but reading up from multiple sources was enough to come up with a short prototype.
You need OpenOffice.org running with the remote socket open (such as: /usr/lib/openoffice2/program/soffice.bin -headless -invisible '-accept=socket,host=localhost,port=2002;urp;'), and the Python UNO bridge module. OpenOffice.org v1 doesn't seem to generate a proper PDF; I tested on an OpenOffice.org 2 beta.
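One thing that is easy to get wrong: the host and port in the -accept argument have to match the ones in the resolver URL used by the script. Here is a minimal sketch that builds both strings from the same values. The function names accept_argument and resolver_url are my own, purely for illustration; they are not part of OpenOffice.org or the UNO bridge.

```python
def accept_argument(host="localhost", port=2002):
    """Return the -accept argument that makes soffice listen on a socket."""
    return "-accept=socket,host=%s,port=%d;urp;" % (host, port)


def resolver_url(host="localhost", port=2002):
    """Return the matching URL to hand to UnoUrlResolver.resolve()."""
    return ("uno:socket,host=%s,port=%d;urp;StarOffice.ComponentContext"
            % (host, port))


if __name__ == "__main__":
    # Both strings are derived from the same host/port, so they cannot drift apart.
    print(accept_argument())
    print(resolver_url())
```

With the defaults this gives the same two strings used in this post.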
#!/usr/bin/env python
import uno
from com.sun.star.beans import PropertyValue
url = uno.systemPathToFileUrl('/home/nbm/360survey-changed.doc')
url_original = uno.systemPathToFileUrl('/home/nbm/360survey-original.doc')
url_save = uno.systemPathToFileUrl('/home/nbm/360survey-diffs.pdf')
### Get Service Manager
# Local component context, only used to create the URL resolver
context = uno.getComponentContext()
resolver = context.ServiceManager.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", context)
# Remote context of the running OpenOffice.org instance
ctx = resolver.resolve("uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
smgr = ctx.ServiceManager
### Load document
properties = []
p = PropertyValue()
p.Name = "Hidden"
p.Value = True
properties.append(p)
properties = tuple(properties)
desktop = smgr.createInstanceWithContext("com.sun.star.frame.Desktop", ctx)
doc = desktop.loadComponentFromURL(url, "_blank", 0, properties)
### Compare with original document
properties = []
p = PropertyValue()
p.Name = "URL"
p.Value = url_original
properties.append(p)
properties = tuple(properties)
dispatch_helper = smgr.createInstanceWithContext("com.sun.star.frame.DispatchHelper", ctx)
dispatch_helper.executeDispatch(doc.getCurrentController().getFrame(), ".uno:CompareDocuments", "", 0, properties)
### Save File
properties = []
p = PropertyValue()
p.Name = "Overwrite"
p.Value = True
properties.append(p)
p = PropertyValue()
p.Name = "FilterName"
p.Value = 'writer_pdf_Export'
properties.append(p)
properties = tuple(properties)
doc.storeToURL(url_save, properties)
doc.dispose()
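The script builds a tuple of PropertyValue structs three times in the same way. A possible tidy-up, as a sketch only: the helper name make_properties is my own invention, and the stand-in class in the except branch exists only so the snippet can be tried on a machine without the UNO bridge. It is not the real UNO struct.

```python
try:
    import uno  # noqa: F401  (makes the com.sun.star.* imports work)
    from com.sun.star.beans import PropertyValue
except ImportError:
    # Assumption for illustration: a plain stand-in with the two fields
    # the script uses, so the helper can be tried without OpenOffice.org.
    class PropertyValue(object):
        def __init__(self):
            self.Name = ""
            self.Value = None


def make_properties(**kwargs):
    """Turn keyword arguments into a tuple of PropertyValue structs."""
    properties = []
    for name, value in kwargs.items():
        p = PropertyValue()
        p.Name = name
        p.Value = value
        properties.append(p)
    return tuple(properties)


# The property sets used by the script above
load_props = make_properties(Hidden=True)
save_props = make_properties(Overwrite=True, FilterName="writer_pdf_Export")
```

With this in place, the load call would read desktop.loadComponentFromURL(url, "_blank", 0, load_props), and the save call doc.storeToURL(url_save, save_props).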
5 Responses
Yuppie — August 23, 2005 at 10:26 AM.
Al — December 07, 2005 at 08:45 PM.
Rudolf — December 10, 2006 at 11:48 AM.
Louis Legas — January 29, 2007 at 11:45 PM.
Thanks a lot!
Exactly what we needed for our Zope site to compare multiple OO/Word documents.
Louis Legas — January 30, 2007 at 11:36 PM.
Fine! It took a little work, but in the end it worked fine.
I had to comment out the "Hidden" property when I load the first document.
Please don't ask me why. With "Hidden" included I always got an IOError when I stored the file.
Best regards LL