Package Release Info

tesseract-ocr-5.3.4-bp155.3.6.1

Update Info: openSUSE-2024-67
Available in Package Hub : 15 SP5 Update

platforms

AArch64
ppc64le
s390x
x86-64

subpackages

libtesseract5
libtesseract5-64bit
tesseract-ocr
tesseract-ocr-devel

Change Logs

* Fri Jan 19 2024 ecsos <ecsos@opensuse.org>
- Update to 5.3.4
  - Fixes for autoconf, clang and sw builds
  - Send output of combine_tessdata -d to stdout instead of stderr.
    Fixes #4149 in #4150
  - Move bail_out function before libtoolize check in #4151
  - Improve OCR for an image URL
  - Fail on curl download errors
  - Add new parameter curl_cookiefile for curl_easy_setopt in #4156
  - Set User-Agent: header field in HTTP request for curl downloads
  - Force TCP v4 for socket to ScrollView server. Fixes #3000 in #4162
  - Fix some compiler warnings and avoid unnecessary conversions
    from std::string to char pointer in #4174
  - Fix a tiny typo in publictypes.h in #4178
  - Fixes for autoconf, clang and sw builds
  - Other small improvements for code and documentation.
* Fri Dec 15 2023 ecsos <ecsos@opensuse.org>
- Update to 5.3.3
  - Disable -mfpu=neon for aarch64 in #4098
  - Fix build without git clone in cloned directory in #4099
  - Fix some issues which were reported by Coverity Scan in #4097
  - Update ScrollView.java in #4103
  - Fix some code comments in #4113
  - Optimize function ImageFind::FindImages in #4114
  - Rename BibTex file to please GitHub in #4115
  - Fix Broken URLs in citations.bib in #4118
  - initDSProfile: correct std::vector usage in #4124
  - Fix typo in stepblob.h in #4133
  - Fix regression in layout detection since 5.0.0 (fixes issue #4014) in #4136
  - Update ScrollView.java in #4104
  - Fix loading of sublangs (regression) in #4141
- Update to 5.3.2
  - fix: Fix snap package building in #4043
  - Support for Sgaw and W Pwo Karen languages in the Myanmar validator. in #4065
  - Replace bool array by more compact vector in #4067
  - Replace deprecated sprintf in #4068
  - Improve format of logging from lstmtraining in #4066
  - Clean code in #4071
  - Abort with error message if OSD is requested with LSTM-only model in #4073
  - Fix typos in #4096
Version: 5.3.1-bp155.3.3.1
* Thu Jun 08 2023 Ondřej Súkup <mimi.vx@gmail.com>
- update to 5.3.1
- revert back to autoconf build as upstrem doesn't support CMAKE
  outside windows
  * Bugfixes for special case scenarios
* Wed Mar 08 2023 Martin Pluskal <mpluskal@suse.com>
- Build AVX2 enabled hwcaps library for x86_64-v3
* Sun Feb 26 2023 malcolmlewis@opensuse.org
- Define TESSDATA_PREFIX during build to point at /usr/share
  (since it's the prefix) rather than package name, tessdata
  suffix is automatically added.
* Thu Jan 05 2023 Martin Pluskal <mpluskal@suse.com>
- Move unversioned libraries to main package
* Mon Jan 02 2023 Markus Ebner <info@ebner-markus.de>
- Update to version 5.3.0:
  * Fix memory issues in ScrollView::MessageReceiver
  * autotools: Add rule for svpaint executable
  * Replace call of exit function by return statement in main function
  * Fix the build on CodeQL/Analyze by @arseniy-sonar in #3888
  * CI: Remove Ubuntu 18.04
  * configure.ac: fix build on aarch64_be
  * SW CI: Add paths filter
  * Create .mailmap
  * Fix tesseract.pc from cmake to match autotools
  * Update README.md
  * Fixed 2 errors
  * fix issue #3940 - remove colormap before thresholding
  * Update upload-artifact action
  * Update checkout action to version 3
  * Fix Markdownlint
  * Fix broken links in CONTRIBUTING.md
  * pdfrenderer.cpp: Ignore non-text blocks
  * lstm.train: allow .box from .raw.png too
  * Fix a number of performance issues (reported by Coverity Scan)
  * Fix training tools for legacy engine (issue #3925)
  * Fix function tesseract::WriteFeature (issue #3925)
  * Modernize function ObjectCache::DeleteUnusedObjects (fix issue with s…
  * More fixes for issue #3925
- Fixed packaging to include missing shared libs:
  * libcommon_training.so
  * libunicharset_training.so
* Fri Sep 16 2022 Markus Ebner <info@ebner-markus.de>
- Switched to new CMake buildsystem
- Update to version 5.2.0:
  * Improvements and fixes for continuous integration, autoconf and cmake builds
  * Set /Os for some 32 bit MS compilers
  * Improve comments and other documentation
  * Add initial support for Intel AVX512F
  * Fix for very large PDF files on 32 bit hosts
  * Fix NEON detection on FreeBSD
  * Fix regression with UZN files
  * Fix calling delete[] for memory allocated by malloc in C API
  * Add an API function to init tesseract with traineddata from memory
  * Replace direct access to Leptonica internal data structures by function calls and
    support latest releases of Leptonica.
  * Replace std::regex by std::string functions.
  * Use compiled-in TESSDATA_PREFIX also on Windows
  * Add new parameter 'invert_threshold', change the default threshold from 0.5 to 0.7
    and mark parameter 'tessedit_do_invert' as deprecated
- Update to version 5.1.0:
  * Handle image and line regions in output formats ALTO, hOCR and text.
  * New parameter curl_timeout for curl_easy_setop.
  * Build fixes and improvements.
  * Catch nullptr in PageIterator::Orientation to improve robustness.
  * Remove unused code.
- Update to version 5.0.1:
  * Add SPDX-License-Identifier to public include files.
  * Support redirections when running OCR on a URL.
  * Lots of fixes and improvements for cmake builds.
  * Distributions should use the autoconf build.
  * Fix broken msys2 build with gcc 11.
  * Fix parameter certainty_scale (was duplicated).
  * Fix some compiler warnings and clean code.
  * Correctly detect amd64 and i386 on FreeBSD.
  * Add libarchive and libcurl in continuous integration actions.
  * Update submodule googletest to release v1.11.0.
- Update to version 5.0.0:
  * Enable fast float32 LSTM by default
  * Switch to NFC normalisation everywhere
  * Remove banner message
  * Disable music staff detection and removal
  * Add new command line option --loglevel
  * Fix regression for OCR with more than one model file
  * Optimizations
  * Improve training messages
  * Add RowAttributes getter to PageIterator
  * Limit BCER to interval [0,1]
  * Improved build process
  * Cleaned code
- Update to version 4.1.3:
  * Fix broken autoconf build
- Update to version 4.1.2:
  * Allow line images with larger width for training
  * Bugfixes
  * Build updates and fixes
- Removed tesseract-ocr-no-cpudetection.patch
  Obsolete with the use of CMake build system instead of Makefiles
Version: 4.1.1-bp155.2.15
* Fri Jan 03 2020 Tomáš Chvátal <tchvatal@suse.com>
- Require libarchive in the devel package
Version: 4.1.1-bp154.1.77
* Thu Mar 26 2020 Bernhard Wiedemann <bwiedemann@suse.com>
- Add tesseract-ocr-no-cpudetection.patch
  to avoid crashing on older CPUs
  and to make package build reproducible (boo#1159231)
* Fri Jan 03 2020 Tomá? Chvátal <tchvatal@suse.com>
- Require libarchive in the devel package
* Fri Dec 27 2019 Ismail Dönmez <idonmez@suse.com>
- Update to version 4.1.1
  * Bugfixes
* Fri Dec 13 2019 Martin Pluskal <mpluskal@suse.com>
- Packaging Cleanups
- Update dependencies and enable openCL
* Fri Dec 13 2019 hiwatari.seiji@gmail.com
- Update to 4.1.0
  * Added a new output option formatted in the ALTO standard
  * SIMD optimization
  * Bugfixes
- Update to 4.0.0
  * New OCR engine based on LSTMs
  * Removed Cube OCR engine
  * Updated build system
  * Cleanups and fixes
Version: 3.05.01-bp150.2.4
* Tue Feb 20 2018 jweberhofer@weberhofer.at
- Update to 3.05.01
  * Fixed several build issues
  * Fixed C-API
  * Backport pdfrenderer changes
  * Code clean up
- Spec file cleaned up
* Fri Feb 17 2017 idonmez@suse.com
- Update to 3.05.00
  * Made some fine tuning to the hOCR output.
  * Added TSV as another optional output format.
  * Fixed ABI break introduced in 3.04.00 with the AnalyseLayout()
    method.
  * text2image tool - Enable all OpenType ligatures available in
    a font. This feature requires Pango 1.38 or newer.
  * Training tools - Replaced asserts with tprintf() and exit(1).
  * Improved multipage tiff processing.
  * Improved the embedded pdf font (pdf.ttf).
  * Enable selection of OCR engine mode from command line.
  * Changed tesseract command line parameter '-psm' to '--psm'.
  * Added new C API for orientation and script detection, removed
    the old one.
  * Fixed many compiler warning.
  * Fixed memory and resource leaks.
* Fri Feb 19 2016 idonmez@suse.com
- Update to 3.04.01
  * No changelog upstream
* Fri Oct 02 2015 asterios.dramis@gmail.com
- Update to version 3.04.00:
  * Added OpenCL support (experimental).
  * Many bug fixes.
  From version 3.03.00:
  * Added new training tool text2image to generate box/tif file
    pairs from text and truetype fonts.
  * Added support for PDF output with searchable text.
  * Removed entire IMAGE class and all code in image directory.
  * Tesseract executable: support for output to stdout; limited
    support for one page images from stdin  (especially on Windows)
  * Added Renderer to API to allow document-level processing and
    output of document formats, like hOCR, PDF.
  * Major refactor of word-level recognition, beam search,
    eliminating dead code.
  * Refactored classifier to make it easier to add new ones.
  * Generalized feature extractor to allow feature extraction from
    greyscale.
  * Improved sub/superscript treatment.
  * Improved baseline fit.
  * Added set_unicharset_properties to training tools.
  * Many bug fixes.
  * More training source data included.
- Added new build requirements cairo-devel, doxygen, libicu-devel
  and pango-devel.
- Recommend tesseract-ocr-traineddata-english instead of
  tesseract-ocr-traineddata-american (based on new (3.04.00)
  tesseract-ocr traineddata files).
* Mon Sep 14 2015 asterios.dramis@gmail.com
- Fix Recommends: entry to tesseract-ocr-traineddata-american.
* Sat Jun 20 2015 mailaender@opensuse.org
- rename to match upstream tarball and fix boo#900303
* Sat Jun 22 2013 asterios.dramis@gmail.com
- Split library into separate package (libtesseract3).
- Removed debuginfo package (not needed).
- There is no need anymore to regenerate the build system (removed automake and
  libtool build requirements).
- Added pkg-config build requirement (fix for rpmlint error
  "no-pkg-config-provides"). Removed also not needed
  "Provides: pkgconfig(%{name})" entry.
* Mon May 06 2013 idonmez@suse.com
- Update license, some files are GPL-2.0+ licensed
* Mon Oct 29 2012 jw@suse.com
- Update to version 3.02.02
  * untested
- Notable features:
  * Hebrew with BiDi support.
  * More languages.
- removed upstreamed patch0
* Mon Jun 25 2012 asterios.dramis@gmail.com
- Update to version 3.01:
  * Removed old/dead serialise/deserialze methods on *LISTIZED classes.
  * Total rewrite of DENORM to better encapsulate operation and make
    for potential to extract features from images.
  * Thread-safety! Moved all critical globals and statics to
    members of the appropriate class. Tesseract is now
    thread-safe (multiple instances can be used in parallel
    in multiple threads.) with the minor exception that some
    control parameters are still global and affect all threads.
  * Added Cube, a new recognizer for Arabic. Cube can also be
    used in combination with normal Tesseract for other languages
    with an improvement in accuracy at the cost of (much) lower speed.
    There is no training module for Cube yet.
  * OcrEngineMode in Init replaces AccuracyVSpeed to control cube.
  * Greatly improved segmentation search with consequent accuracy and
    speed improvements, especially for Chinese.
  * Added PageIterator and ResultIterator as cleaner ways to get the
    full results out of Tesseract, that are not currently provided
    by any of the TessBaseAPI::Get* methods.
    All other methods, such as the ETEXT_STRUCT in particular are
    deprecated and will be deleted in the future.
  * ApplyBoxes totally rewritten to make training easier.
    It can now cope with touching/overlapping training characters,
    and a new boxfile format allows word boxes instead of character
    boxes, BUT to use that you have to have already boostrapped the
    language with character boxes. "Cyclic dependency" on traineddata.
  * Auto orientation and script detection added to page layout analysis.
  * Deleted *lots* of dead code.
  * Fixxht module replaced with scalable data-driven module.
  * Output font characteristics accuracy improved.
  * Removed the double conversion at each classification.
  * Upgraded oldest structs to be classes and deprecated PBLOB.
  * Removed non-deterministic baseline fit.
  * Added fixed length dawgs for Chinese.
  * Handling of vertical text improved.
  * Handling of leader dots improved.
  * Table detection greatly improved.
- Removed the various languages traineddata subpackages (to be included in a
  separate package "tesseract-traineddata").
- Changed License to Apache-2.0 (SPDX style).
- Removed libtiff-devel build dependency (not needed anymore).
- Added new build dependency liblept-devel, required now by the package.
- Added automake and libtool build dependencies in order to regenerate the
  build system because of missing Makefile.in.
- Removed tesseract-traineddata-deu from recommended entries.
- Removed nonvoid.patch (fixed upstream).
- Added a patch (svutil.cpp_fix.patch) to fix compilation due to missing
  includes (taken from upstream).
- Disabled compilation of static libraries.
* Mon Oct 25 2010 prusnak@opensuse.org
- fixed missing returns in nonvoid functions (nonvoid.patch)
- added missing post/postun scripts calling ldconfig
* Thu Sep 23 2010 michal.smrz@opensuse.cz
- update to tesseract-3.00
- added plenty od new supported languages
- created tesseract-package-creator.py which will, hopefully, make future
  updates easier
* Fri Jul 10 2009 puzel@novell.com
- update to tesseract-2.04
  * Integrated bug fixes and patches and misc changes for portability.
  * Integrated a patch to remove some of the "access" macros.
  * Removed dependence on lua from the viewer, speeding it up
    dramatically.
  * Fixed the viewer so it compiles and runs properly!