Version 2.8.15

March 18, 2022, Supported Locator version(s): 2.11

Bugs Fixed

  • CFD-4300 Web connector - too many threads are being created by fetch service
  • CFD-4521 Web connector - slow fetch


Version 2.8.14

September 03, 2021, Supported Locator version(s): 2.10, 2.11

Bugs Fixed

  • CFD-4301 Web connector - the crawler is crawling content that should be ignored by its settings

Tasks Completed

  • CFD-4271 Web connector - add support for TLS 1.2 in preview plugin


Version 2.8.13

August 09, 2021, Supported Locator version(s): 2.10, 2.11

Bugs Fixed

  • CFD-4264 Web connector - fetch job uses a different CrawlDecisionMaker than discovery
  • CFD-4257 Web connector - invalid redirect handling in fetch job
  • CFD-4250 Web Connector - connector is constantly deleting items per run


Version 2.8.12

July 16, 2021, Supported Locator version(s): 2.10, 2.11

Bugs Fixed

  • CFD-4238 Web Connector - custom plugins are loaded from default directory path if you specify incorrect directory path
  • CFD-4230 Web Connector 2.8.11 does not crawl documents
  • CFD-3956 Web Connector - Attempts to parse PDFs as HTML


Version 2.8.11

May 07, 2021, Supported Locator version(s): 2.10, 2.11

Bugs Fixed

  • CFD-4148 Web connector - Saving pages and links to the disk is not working
  • CFD-4147 Web Connector - Key objects such as WebCrawler do not release unmanaged resources and managed objects

Tasks Completed

  • CFD-4132 Web Connector Parser Plugin - optimize memory usage
  • CFD-4130 Web connector - update Abot libraries



Version 2.8.10

February 12, 2021, Supported Locator version(s): 2.10, 2.11, 3.0

Improvements

  • CFD-4096 Web Connector - Improve exception handling for custom plugins

Bugs Fixed

  • CFD-4095 Web Connector - Unassigned configuration property "CurrentSeedUrl"



Version 2.8.9

November 25, 2020, Supported Locator version(s): 2.10, 2.11, 3.0

Improvements

  • CFD-3751 Connectors - Add metadata text and document text fields merging rule

Bugs Fixed

  • CFD-3956 Web Connector - Attempts to parse PDFs as HTML



Version 2.8.8

June 23, 2020, Supported Locator version(s): 2.10, 2.11, 3.0

Bugs Fixed

  • CFD-3844 Web Connector - The sitemap configured for Web Connector appears in search results as a document
  • CFD-3830 Web Connector - Fetches some of the documents with incorrect title
  • CFD-2259 Web Connector - Crawler not indexing links
  • CFD-1896 Web Connector - Connector not keeping crawler state



Version 2.8.7

April 29, 2020, Supported Locator version(s): 2.10, 2.11, 3.0

Bugs Fixed

  • CFD-3742 Web Connector - Should not access HTML properties for non-HTML items

Tasks Completed

  • CFD-3423 Move Web Connector to Rapid



Version 2.8.6

Bugs

  • CFD-2756 Web Connector - Not all pages removed from sitemap are removed from index



Version 2.8.5

Tasks

  • CFD-2750 Publish Connectors to Connectors 2.9 Feed based on SDK 1.5
  • SDK-280 Publish Connectors to Connectors 2.10 Feed based on SDK 1.6



Version 2.8.4

Improvements

  • CFD-2574 Web connector - Add Platform Date
  • CFD-2146 Web Connector - Add support for new hit fields in RestService version 6
  • CFD-2654 Universal Web connector - proxy support



Version 2.8.3

Bugs

  • CFD-2435 Web crawler - 301 redirect links that shouldn't be indexed are indexed anyway.
    (To avoid indexing redirects, set IsHttpRequestAutoRedirectsEnabled to False.)
  • CFD-2434 Web crawler does not handle robots with ending /.
    (Bug confirmed and reported to Abot, the third-party crawler. Until it is fixed there, it needs a temporary workaround through configuration changes.)
  • CFD-2367 Web - The sign '?' working as designed in robots.txt Disallow.
    (Bug confirmed and reported to Abot, the third-party crawler. Until it is fixed there, it needs a temporary workaround through configuration changes.)
  • CFD-2228 Web connector - Canonical URL not working
  • CFD-2078 Web connector - Only tries to crawl with protocol TLS 1.0

Tasks

  • CFD-2314 Web Connector - Release with new branding



Version 2.8.2

Improvements

  • CFD-1758 Web connector - Expose and add setting "IsRespectHttpXRobotsTagHeaderNoFollowEnabled" and 3 other missing config values

Bugs

  • CFD-1953 Web connector crashed after 5 days of discovery
  • CFD-1894 Web connector - crawler indexes "http://virtualworks.com" when specifying "http://virtualworks.com/contact" as seedurl
  • CFD-1817 Web connector - meta robots = "nofollow" not working
  • CFD-1771 Web connector - "The directory is not empty" when crawling

Tasks

  • CFD-1954 Web connector - Investigate high cpu/disk/memory usage
  • CFD-1932 Web connector - Release version 2.8.2
  • CFD-1914 Web connector - Crawl of single page not checking for canonical



Version 2.8.0

Summary

  • Hidden settings now added to the Admin Wizard and database
  • New custom setting to build only pages whose rel canonical link equals the page URL (turned off by default)
  • Bug fixes and an update of the third-party crawler API
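
The canonical rule in the summary above can be illustrated with a minimal Python sketch. It is not the connector's implementation: the names `CanonicalLinkParser` and `should_build_page` are hypothetical, and two behaviours are assumptions made for the example, namely that a page without a canonical link is still built and that a trailing slash is ignored in the comparison.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class CanonicalLinkParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonical = attr["href"]


def should_build_page(page_url: str, html: str) -> bool:
    """Return True when the page's canonical link points at the page itself."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    if parser.canonical is None:
        # Assumption: a page that declares no canonical link is still built.
        return True
    # Resolve a relative canonical href against the page URL before comparing.
    canonical = urljoin(page_url, parser.canonical)
    # Assumption: a trailing slash does not make two URLs different.
    return canonical.rstrip("/") == page_url.rstrip("/")
```

With a page that declares `https://example.com/a` as canonical, `should_build_page("https://example.com/a", html)` is true, while the same markup served at `https://example.com/a?x=1` would be skipped.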

Bugs

  • CFD-1813 Web connector - Bugs in admin custom settings
  • CFD-1812 Web connector - Won't remove page after e.g. reducing crawl depth
  • CFD-1366 Web connector - Unable to add new settings in wizard

Tasks

  • CFD-1884 Web connector - Update the Abot Crawler
  • CFD-1876 Web connector - Add "tracking" to detect deleted/excluded pages
  • CFD-1810 Web connector - Publish Data Sheet
  • CFD-1759 Web connector - Add feature to only crawl canonical URLs
  • CFD-1797 Web connector - Release next version



Version 2.7.8

To enable preview for web pages, change the REST service's web config: add "html" to the extension list (ExtList) of the document previewer entry, like this:

<add AppName="document" Action="preview" Script="DisplayLink" ExtList="txt;doc;docx;dotx;docm;docxm;dot;pdf;cs;css;js;fax;xml;xls;xlsm;xlsx;xlsxm;xlt;xltm;xltx;xps;msg;html" DocTypeList="" SkipRootExtList="" SkipRootDocTypeList="" Priority="20"></add>
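
If the change has to be applied to several installations, it could be scripted. The following Python sketch appends an extension to ExtList only when it is missing. The function name `add_preview_extension` and the `<handlers>` wrapper element are invented for the example; the only parts taken from the entry above are the `add` element and its `AppName`, `Action` and `ExtList` attributes. Where the entry sits inside the real web config is not stated here, so the sketch simply searches every `add` element.

```python
import xml.etree.ElementTree as ET


def add_preview_extension(root: ET.Element, extension: str = "html") -> bool:
    """Append `extension` to ExtList of the document preview entry.

    Returns True if any entry was changed, False if nothing needed changing.
    """
    changed = False
    for entry in root.iter("add"):
        # Only touch the document previewer entry described above.
        if entry.get("AppName") != "document" or entry.get("Action") != "preview":
            continue
        extensions = [e for e in entry.get("ExtList", "").split(";") if e]
        if extension not in extensions:
            extensions.append(extension)
            entry.set("ExtList", ";".join(extensions))
            changed = True
    return changed


# Shortened sample entry; the <handlers> wrapper is made up for this example.
sample = ET.fromstring(
    '<handlers><add AppName="document" Action="preview" Script="DisplayLink" '
    'ExtList="txt;pdf;msg" Priority="20"></add></handlers>'
)
add_preview_extension(sample)
print(sample.find("add").get("ExtList"))  # txt;pdf;msg;html
```

Note that writing a file back with `xml.etree.ElementTree` drops XML comments and original formatting, so keep a backup of the web config before applying a script like this to a real installation.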

Features

  • CFD-1251 Web connector - Add support for preview of files



Version 2.7.6

...