What's New in Ayfie Saga 5.4.0

Ayfie Saga 5.4.0 has been released. Here are the highlights:

UI Improvements

General UI improvements make the user experience more pleasant and consistent. Fonts have been standardized across the entire website, the design of all dialogs has been updated, and the appearance of inputs and buttons has been enhanced.

Automatically Warming Caches after Commits

Solr’s caches are an essential way to improve query performance; see https://solr.apache.org/guide/solr/latest/configuration-guide/caches-warming.html for details.

The caches are cleared after each commit and must be repopulated before they provide any benefit again. To counteract this, caches can be "warmed" before a new searcher is considered open, by automatically populating the new cache with entries from the old cache. In the Locator search, the most important cache to warm is the filterCache, which is used by the query that retrieves the refiners. In Ayfie Saga 5.4.0, the autowarmCount for the filterCache is set to 100%, which means that 100% of the old cache is copied to the new cache during the “warming”. In addition, the filterCache size has been increased from 512 to 2048, so the cache can hold more entries and evicts fewer of them.
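To make the effect of autowarmCount more concrete, here is a minimal toy model in Python. It is only a sketch: the function name autowarm_keys and the sample filter keys are hypothetical, it assumes that the most recently used entries are the ones selected, and it does not reproduce Solr's internal warming logic.

```python
from collections import OrderedDict


def autowarm_keys(old_cache, autowarm_count):
    """Return the keys of old_cache that a new cache would be warmed with.

    old_cache is assumed to be ordered from least to most recently used.
    autowarm_count is either an absolute number or a percentage string
    such as "100%".
    """
    if isinstance(autowarm_count, str) and autowarm_count.endswith("%"):
        percent = int(autowarm_count[:-1])
        count = len(old_cache) * percent // 100
    else:
        count = int(autowarm_count)
    # Walk the old cache from most to least recently used (assumption)
    # and keep the first `count` keys.
    most_recent_first = list(old_cache.keys())[::-1]
    return most_recent_first[:count]


# Hypothetical filter entries, oldest first.
old = OrderedDict((f"type:{t}", None) for t in ("pdf", "docx", "msg", "xlsx"))

all_keys = autowarm_keys(old, "100%")   # every entry is carried over
half_keys = autowarm_keys(old, "50%")   # only the two most recent entries
```

With "100%", every entry of the old cache is selected for warming, which matches the setting described above; a lower percentage would carry over only part of the old cache.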

Query Refiners in Parallel Instead of Serially

Solr’s facet.threads parameter is set to 4 by default in Ayfie Saga 5.4.0. With this setting, the underlying fields used in faceting are loaded in parallel, using up to 4 threads, instead of serially.
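For illustration, the sketch below builds the query string of a faceted Solr request that sets facet.threads explicitly. Since Ayfie Saga 5.4.0 applies this value by default, passing it by hand should not normally be needed. The helper name build_facet_query and the field names author and doctype are hypothetical; facet, facet.field and facet.threads are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_facet_query(fields, threads=4):
    """Build the query string for a facet-only Solr request.

    fields  -- names of the fields to facet on (hypothetical examples below)
    threads -- value passed as facet.threads
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),       # only facet counts are of interest here
        ("facet", "true"),
    ]
    # One facet.field parameter per field; these are the units that can be
    # processed in parallel.
    params.extend(("facet.field", name) for name in fields)
    params.append(("facet.threads", str(threads)))
    return urlencode(params)


query_string = build_facet_query(["author", "doctype"])
```

The resulting string can be appended to a collection's select URL. With several facet.field parameters in one request, a facet.threads value above 1 lets Solr work on more than one field at a time.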

Processing of PDFs in the Converter Has Been Reworked

In earlier versions, images in PDFs were not always OCR’ed because the converter relied on the PDF text layer. As a result, text in images inside PDFs was not always searchable. In other situations, the converter would duplicate the PDF text layer. In the current version, PDFs are processed by first reading the text layer and then OCR'ing every image in a single operation. This avoids text layer duplication while ensuring that all images are processed.

Option to Control Processing in the Converter

A new environment variable, AYFIE_CONVERTER_CONVERSION_TYPE, has been introduced. It controls document processing in the converter and accepts the values Standard, NoOCR, or AlwaysOCR.

  • Standard - the default value. Conversion consists of extracting the text and then running OCR on images.

  • NoOCR - OCR is disabled; only the text is extracted.

  • AlwaysOCR - the same as Standard, except that PDFs are OCR'ed from scratch (any existing text layer is completely ignored).
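As a rough illustration of the three values, the following Python sketch reads and validates the variable the way a deployment script might before starting the converter. It is not the converter's own code: the helper names are hypothetical, and treating the values as case-sensitive is an assumption.

```python
import os

ENV_NAME = "AYFIE_CONVERTER_CONVERSION_TYPE"
VALID_CONVERSION_TYPES = ("Standard", "NoOCR", "AlwaysOCR")


def conversion_type(env=None):
    """Return the configured conversion type, falling back to Standard."""
    env = os.environ if env is None else env
    value = env.get(ENV_NAME, "Standard")
    # Assumption: values must match exactly, including case.
    if value not in VALID_CONVERSION_TYPES:
        raise ValueError(
            f"{ENV_NAME} must be one of {VALID_CONVERSION_TYPES}, got {value!r}"
        )
    return value


def runs_ocr(value):
    """OCR runs in every mode except NoOCR."""
    return value != "NoOCR"


def uses_pdf_text_layer(value):
    """AlwaysOCR ignores any existing PDF text layer."""
    return value != "AlwaysOCR"


# Example with an explicit mapping instead of the real environment.
mode = conversion_type({ENV_NAME: "AlwaysOCR"})
```

Validating the value up front surfaces a misspelled setting immediately, before any documents are processed.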

Other Improvements

See the Release Notes for an exhaustive list:

https://ayfie-dev.atlassian.net/wiki/spaces/SAGA/pages/3339943954