wildcard.pdfpal

HomePage: http://pypi.python.org/pypi/wildcard.pdfpal

Author: Nathan Van Gheem

Download: https://pypi.python.org/packages/source/w/wildcard.pdfpal/wildcard.pdfpal-0.7b6.zip

Introduction
============
This package provides integrations for PDF-heavy web sites.

* Generates thumbnails from PDFs
* Adds a folder view for PDFs that uses the generated thumbnails
* Adds OCR for PDF indexing
* Everything is configurable, so you can turn off thumbnail generation or OCR
* Can create searchable PDFs with hOCR
* Use the `@@async-monitor` URL to monitor asynchronous jobs that have yet to run


OCR
---

OCR requires Ghostscript and Tesseract to be installed. Use your package manager
to install these packages:

  # sudo apt-get install ghostscript tesseract-ocr

This will install Tesseract 2, not Tesseract 3.
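
A deployment script can confirm that both tools are on the `PATH` before OCR is enabled. The following is a minimal sketch and not part of wildcard.pdfpal: it assumes Python 3 and that the executables are named `gs` (Ghostscript) and `tesseract`. The helper name `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(tools=("gs", "tesseract"), which=shutil.which):
    """Return the names in `tools` that cannot be found on the PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing OCR dependencies: " + ", ".join(missing))
    else:
        print("Ghostscript and Tesseract were found.")
```

This only checks that the executables exist; it does not check which Tesseract version is installed.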

Searchable PDFs
---------------

Requires an svn checkout of Tesseract version 3.01 or 3.00 with the hOCR configuration in place.
See this thread for how to configure hOCR: http://ubuntuforums.org/showthread.php?t=1647350

In addition, you'll need exactimage, pdftk, and libtiff-tools installed:

  # sudo apt-get install exactimage pdftk libtiff-tools

To use a Tesseract version other than the latest, you will have to add this to your
instance's declaration:

  environment-vars += AUTHORIZE_OLD_TESSERACT_VERSION true
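
How the package reads this variable is not described here. As a rough sketch only, a boolean flag of this kind could be read from the process environment as shown below. The helper name `old_tesseract_allowed` and the set of accepted values are assumptions, not the package's actual code.

```python
import os


def old_tesseract_allowed(environ=None):
    """Return True if AUTHORIZE_OLD_TESSERACT_VERSION is set to a truthy value.

    Assumption: "true", "1", "yes" and "on" (case-insensitive) count as enabled.
    """
    if environ is None:
        environ = os.environ
    value = environ.get("AUTHORIZE_OLD_TESSERACT_VERSION", "")
    return value.strip().lower() in ("true", "1", "yes", "on")
```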


Plone 3
-------

* Requires hashlib


Extra
-----

You can convert all PDFs at once by calling the URL `@@queue-up-all`.
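
View names such as `@@queue-up-all` and `@@async-monitor` are appended to the URL of the object they are called on. The sketch below builds such URLs. The site URL is a placeholder, the helper name `view_url` is hypothetical, and calling the views on the site root is an assumption.

```python
def view_url(context_url, view_name):
    """Join a context URL and a browser view name such as "@@queue-up-all"."""
    return context_url.rstrip("/") + "/" + view_name.lstrip("/")


if __name__ == "__main__":
    site = "http://localhost:8080/Plone"  # hypothetical site URL
    print(view_url(site, "@@queue-up-all"))
    print(view_url(site, "@@async-monitor"))
```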


Changelog
=========

0.7b6 ~ 2012-04-20
------------------

- fix uninstall
  [vangheem]

0.7b5 ~ 2012-04-19
------------------

- do not run conversion if documentviewer is installed
  [vangheem]

- add better uninstall support
  [vangheem]


0.7b4 ~ 2012-04-09
------------------

- fix image url for album view.
  [vangheem]


0.7b3 ~ 2012-04-05
------------------

- fix content type spec for thumbnail response
  [vangheem]

- display image thumb urls in album view
  [vangheem]


0.7b2 ~ 2011-04-12
------------------

- more checks on reading files
  [vangheem]
  
- provide button to manually index document
  [vangheem]
  
- add ability to split pdf up into multiple PDFs
  [vangheem]

0.7b1 ~ 2011-01-06
------------------

- fixes for quality and size issues
  [vangheem]


0.6b2 ~ 2011-01-04
------------------

- fix async monitor view to work with plone.app.async = 1.0
  It changed the order of some args in the job.
  [vangheem]

0.6b1 ~ 2011-01-04
------------------

- added ability to make PDFs searchable and make it work seamlessly
  if wc.pageturner is installed so flex paper is created with the searchable
  PDF version.

0.5b5 ~ 2010-12-07
------------------

- did not conditionally import plone.app.async


0.5b4 ~ 2010-12-06
------------------

* better info on async monitor

* only reindex searchabletext when doing OCR so the modification date on
  the object does not get set.
  
* make sure to catch exceptions so it doesn't leave around files after a bad conversion

* add colorbox for pdf folder view


0.5b3 ~ 2010-12-02
------------------

* add ability to queue up all pdf files


0.5b2 ~ 2010-12-02
------------------

* fix async monitor view

0.5b1 ~ 2010-12-02
------------------

* Initial release