Friday, 23 July 2010

Converting OpenOffice Documents in Bulk

I had a request from a customer earlier this week: they wanted a copy of all the diagrams in a specification I've been writing for them. All those diagrams are in separate OpenOffice Draw (.odg) files. The customer doesn't use OpenOffice but was happy to have PDF versions. The only problem is that there are 42 of them, so converting them by hand would take ages. One quick Google search later, I had found a way to convert documents from the command line.

So, to adapt it to Ubuntu, download the DocumentConverter.py file mentioned in the post above and store it where your documents are. Then, start OpenOffice in headless mode:

$ ooffice -headless -accept="socket,port=8100;urp;" &
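
OpenOffice may take a moment before it starts accepting connections, so it can be worth checking that the port is open before launching the converter. The snippet below is only a sketch of such a check: the function name listener_is_up is my own invention, and it assumes the listener is on localhost port 8100 as in the command above. It only tests that something accepts TCP connections on that port, not that it speaks UNO.

```python
import socket


def listener_is_up(host="localhost", port=8100, timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: it does not verify that the listener is OpenOffice.
    """
    try:
        # create_connection raises an OSError subclass if the connection
        # is refused or the attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling listener_is_up() in a short retry loop before the first conversion should be enough to avoid starting too early.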

As the version of Python installed on Ubuntu already includes the UNO bindings, there is no need to use a special OpenOffice version of Python to do the job: the standard one will do. The script takes two arguments, the input file and the output file, and works out the formats from the file extensions. The bulk conversion is therefore extremely easy:

$ for f in *.odg; do
> echo "$f"
> python DocumentConverter.py "$f" "${f%.*}.pdf"
> done
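
If you would rather drive the batch from Python than from the shell, the same loop can be sketched roughly as follows. This is an illustration rather than part of DocumentConverter.py: the helper names pdf_name_for, build_commands and convert_all are made up for the example, and it assumes Python 3.5 or later (for subprocess.run), a python command on the PATH that can run the script, and the script sitting in the current directory.

```python
import glob
import os
import subprocess


def pdf_name_for(path):
    """Swap the final extension for .pdf, like ${f%.*}.pdf in the shell loop."""
    base, _ext = os.path.splitext(path)
    return base + ".pdf"


def build_commands(paths, script="DocumentConverter.py", interpreter="python"):
    """Return one argument list per input file, without running anything."""
    return [[interpreter, script, p, pdf_name_for(p)] for p in paths]


def convert_all(pattern="*.odg"):
    """Convert every file matching pattern, stopping at the first failure."""
    for cmd in build_commands(sorted(glob.glob(pattern))):
        print(cmd[2])  # show the input file, like the echo in the shell loop
        # Passing a list avoids shell quoting issues with spaces in file names.
        subprocess.run(cmd, check=True)
```

Calling convert_all() from the directory holding the .odg files runs the whole batch. Building the commands separately from running them makes it easy to print them first as a dry run.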

Job done! It took my laptop 5 minutes to convert all 42 files, during which time I made coffee rather than repetitively clicking on UI buttons, and I even had enough time left to blog about it. Oh, and I have a happy customer: that's the most important thing.
