djvu/ocrodjvu

log

age author description
2 months Jakub Wilk Sometimes Cuneiform returns files with broken encoding or with control characters. Let's fix it. [default] [tip]
2 months Jakub Wilk utils: add sanitize_utf8().
2 months Jakub Wilk Prepare to release 0.4.6.
2 months Jakub Wilk Added tag 0.4.5 for changeset 5bfe1d3396c7
2 months Jakub Wilk Release 0.4.5. [0.4.5]
2 months Jakub Wilk Fix order in which text with inline markup is read.
2 months Jakub Wilk tests: update to match current output.
2 months Jakub Wilk Properly handle hOCR with inline formatting.
2 months Jakub Wilk Fix handling of ‘deu’ and ‘rus-eng’ languages.
3 months Jakub Wilk djvu2hocr: add ocr-system and ocr-capabilities meta information, as required by hOCR specification.
3 months Jakub Wilk Prepare to release 0.4.5.
3 months Jakub Wilk Added tag 0.4.4 for changeset 6a8a19ea3e4c
3 months Jakub Wilk Release 0.4.4. [0.4.4]
3 months Jakub Wilk Don't remove temporary directory if ocrodjvu crashed.
3 months Jakub Wilk ocrodjvu: adjust indentation.
3 months Jakub Wilk Document that ocrodjvu honours TMPDIR environment variable.
3 months Jakub Wilk Standardise how and when to sanitize locale.
3 months Jakub Wilk Prepare to release 0.4.4.
4 months Jakub Wilk be: remove.
4 months Jakub Wilk tests/djvu2hocr/run-test: make it actually fail if necessary.
4 months Jakub Wilk Include *.djvused in the upstream tarball.
4 months Jakub Wilk Added tag 0.4.3 for changeset 4889b7ac20b9
4 months Jakub Wilk Release 0.4.3. [0.4.3]
4 months Jakub Wilk Document how djvu2hocr deals with non-XML characters.
4 months Jakub Wilk be: target for XMP metadata extraction is now 0.5.0.
4 months Jakub Wilk tests: reorganize, part 4.
4 months Jakub Wilk tests: test if djvu2hocr is preserving non-XML characters.
4 months Jakub Wilk tests: add an empty 1000×1000 DjVu file, which might be useful for future tests.
4 months Jakub Wilk tests: reorganize, part 3.
4 months Jakub Wilk hgignore: update.
4 months Jakub Wilk tests: reorganize, part 2.
4 months Jakub Wilk tests: reorganize.
4 months Jakub Wilk be: target is 0.4.3.
4 months Jakub Wilk Update homepage URL.
4 months Jakub Wilk Update the changelog.
4 months Jakub Wilk Add pointers to Debian bug reports.
4 months Jakub Wilk Fix -h/--help.
4 months Jakub Wilk Prepare to release 0.4.3.
4 months Jakub Wilk hocr: break with a more helpful error if number of bboxes doesn't match text length.
4 months Jakub Wilk Rename a few variables.
4 months Jakub Wilk Added tag 0.4.2 for changeset 4c420d3da87e
4 months Jakub Wilk Added tag 0.4.1 for changeset 95fd1851ed2e [0.4.2]
4 months Jakub Wilk Release 0.4.2.
4 months Jakub Wilk be: XMP metadata extraction won't be fixed in this release.
4 months Jakub Wilk Add support for Cuneiform 0.9.
4 months Jakub Wilk hocr: relax parsing of bbox sequences.
4 months Jakub Wilk hocr: fix a typo.
4 months Jakub Wilk hocr: strip trailing whitespace from Cuneiform output.
4 months Jakub Wilk doc: clarify that tesslanguage does not affect non-OCRopus engines.
4 months Jakub Wilk Add basic support for hOCR generated by Cuneiform 0.9.
4 months Jakub Wilk Fix a typo.
4 months Jakub Wilk Fix off-by-one error in text area coordinates.
4 months Jakub Wilk New bugs.
4 months Jakub Wilk doc: explain --list-languages shows only languages for the currently selected OCR engine.
4 months Jakub Wilk New options for ocrodjvu: --render=mask, --render=foreground, --render=all.
4 months Jakub Wilk Update my e-mail address.
6 months Jakub Wilk Release 0.4.1. [0.4.1]
6 months Jakub Wilk Include test DjVu files in the tarball.
6 months Jakub Wilk Be stricter when reading hOCR produced by OCRopus 0.3.1.
6 months Jakub Wilk Prepare to release 0.4.1.