Tuesday, February 12, 2013

Bulk Convert Office Documents into PDF using unoconv

There are many ways to convert a large number of document files into PDF files. With open source libraries and tools (for example the Open Office headless service or unoconv), it can be done programmatically.

A drawback of the Open Office headless service shows up when the source files are spread across subdirectories: it writes the converted files to the current working directory instead of next to each source file. In that case the unoconv tool is a better fit. It supports all the formats that Open Office supports.

Unoconv

       Unoconv is a command-line utility that can convert any file format that LibreOffice can import to any file format that LibreOffice can export. To list the formats it supports, run:
$ unoconv --show

Example

       To convert all the odt files in a directory and its subdirectories, use the script below. It can be adapted to other input and output formats as required.

#!/bin/bash
# Split only on newlines and tabs so file names containing spaces stay intact
IFS="$(printf '\n\t')"
# Find .odt files in the current directory and its subdirectories and convert them one by one
for file in $(find . -type f -name "*.odt"); do
  echo "Processing file: $file ..."
  unoconv -d document -f pdf "$file"
done
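
The same idea can be wrapped in a small function that takes the input extension and output format as arguments. This is only a sketch: convert_tree and the UNOCONV override variable are names made up for illustration and are not part of unoconv. It uses find -exec instead of a for loop, so file names with spaces need no IFS handling, and it leaves out -d document so the document type is not forced for non-text formats.

```shell
#!/bin/sh
# Sketch only: convert_tree and UNOCONV are hypothetical names for this example.
# Usage: convert_tree [root_dir] [input_extension] [output_format]
convert_tree() {
  root="${1:-.}"       # directory to search, defaults to the current one
  in_ext="${2:-odt}"   # extension of the files to convert
  out_fmt="${3:-pdf}"  # value passed to unoconv -f
  # find runs the converter once per matching file and passes each path
  # as a single argument, so spaces in file names are safe.
  # Setting UNOCONV=echo prints the commands instead of converting (dry run).
  find "$root" -type f -name "*.${in_ext}" \
    -exec "${UNOCONV:-unoconv}" -f "$out_fmt" {} \;
}
```

For example, convert_tree . docx pdf would convert every docx file under the current directory, and running it with UNOCONV=echo exported first shows which commands would be executed without converting anything.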
