HomePage: http://pypi.python.org/pypi/wildcard.pdfpal

Author: Nathan Van Gheem

Download: https://pypi.python.org/packages/source/w/wildcard.pdfpal/wildcard.pdfpal-0.7b6.zip

This package provides some nice integrations for PDF-heavy web sites.

* Generates thumbnails from PDFs
* Adds a folder view for PDFs that uses the generated thumbnails
* Adds OCR for PDF indexing
* Everything is configurable, so you can choose not to use thumbnail generation or OCR
* Ability to create searchable PDFs with HOCR
* Use the `@@async-monitor` URL to monitor asynchronous jobs that have yet to run


OCR requires Ghostscript and Tesseract to be installed. Just use your package manager
to install these packages:

  # sudo apt-get install ghostscript tesseract-ocr

This will install tesseract 2, not tesseract 3.

Searchable PDFs

Requires an svn checkout of tesseract version 3.01 or 3.00 with the hocr configuration in place.
Take a look at this thread to find out how to configure hocr: http://ubuntuforums.org/showthread.php?t=1647350

In addition, you'll need exactimage, pdftk and libtiff-tools installed:

  # sudo apt-get install exactimage pdftk libtiff-tools
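
Before enabling OCR or searchable PDFs, it can help to confirm that the external tools are actually on the PATH. The following is a minimal standalone sketch (Python 3, run outside of Plone) and is not part of this package. The executable names in `REQUIRED_TOOLS` are assumptions about what each package installs and may differ on your distribution:

```python
import shutil

# Package name -> executable assumed to be provided by that package.
# These names are assumptions; adjust them for your distribution.
REQUIRED_TOOLS = {
    "ghostscript": "gs",
    "tesseract-ocr": "tesseract",
    "exactimage": "hocr2pdf",
    "pdftk": "pdftk",
    "libtiff-tools": "tiffcp",
}


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the sorted package names whose executable is not on PATH."""
    return sorted(pkg for pkg, exe in tools.items() if which(exe) is None)


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All external tools found.")
```

The `which` parameter exists only so the lookup can be replaced when testing the helper.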

If you are not using the latest tesseract version, you will have to add this to your
instance's declaration:

  environment-vars += AUTHORIZE_OLD_TESSERACT_VERSION true
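
To verify that the variable reached the running process, you can inspect the environment from Python, for example in a debug session. This is only a sketch: the helper name `old_tesseract_allowed` is hypothetical, and treating exactly `true` (case-insensitive) as enabled is an assumption, not necessarily how this package interprets the value:

```python
import os


def old_tesseract_allowed(environ=os.environ):
    """Report whether AUTHORIZE_OLD_TESSERACT_VERSION is set to 'true'.

    Assumption: only the value 'true' (any case) counts as enabled.
    """
    value = environ.get("AUTHORIZE_OLD_TESSERACT_VERSION", "")
    return value.strip().lower() == "true"


if __name__ == "__main__":
    print(old_tesseract_allowed())
```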

Plone 3

* Requires hashlib


You can convert all PDFs at once by calling the URL `@@queue-up-all`.
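
Both `@@queue-up-all` and `@@async-monitor` are view names appended to a URL in your site. As a small sketch, the helper below builds such a URL. The site address is a hypothetical example, and the function `view_url` is not part of this package:

```python
def view_url(site_url, view_name):
    """Join a site URL and a view name such as '@@queue-up-all'."""
    return site_url.rstrip("/") + "/" + view_name.lstrip("/")


if __name__ == "__main__":
    # Hypothetical local site; replace with your own site URL.
    site = "http://localhost:8080/Plone"
    print(view_url(site, "@@queue-up-all"))
    print(view_url(site, "@@async-monitor"))
```

Open the resulting URL in a browser while logged in with sufficient permissions.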


0.7b6 ~ 2012-04-20

- fix uninstall

0.7b5 ~ 2012-04-19

- do not run conversion if documentviewer is installed

- add better uninstall support

0.7b4 ~ 2012-04-09

- fix image url for album view.

0.7b3 ~ 2012-04-05

- fix content type spec for thumbnail response

- display image thumb urls in album view

0.7b2 ~ 2011-04-12

- more checks on reading files
- provide button to manually index document
- add ability to split pdf up into multiple PDFs

0.7b1 ~ 2011-01-06

- fixes for quality and size issues

0.6b2 ~ 2011-01-04

- fix async monitor view to work with plone.app.async = 1.0
  It changed the order of some args in the job.

0.6b1 ~ 2011-01-04

- added ability to make PDFs searchable and make it work seamlessly
  if wc.pageturner is installed so flex paper is created with the searchable
  PDF version.

0.5b5 ~ 2010-12-07

- fix: plone.app.async was not imported conditionally

0.5b4 ~ 2010-12-06

* better info on async monitor

* only reindex searchabletext when doing OCR so the modification date on
  the object does not get set.
* make sure to catch exceptions so it doesn't leave around files after a bad conversion

* add colorbox for pdf folder view

0.5b3 ~ 2010-12-02

* add ability to queue up all pdf files

0.5b2 ~ 2010-12-02

* fix async monitor view

0.5b1 ~ 2010-12-02

* Initial release