Web Connector Release Notes
Version 2.8.17
June 03, 2022, Supported Locator version(s): 2.11
Bugs Fixed
- CFD-4637 Web connector - Pages with robots meta tag "noindex" are indexed
Version 2.8.16
May 10, 2022, Supported Locator version(s): 2.11
Bugs Fixed
- CFD-4615 Web connector - not supporting robots meta tag "noindex"
Version 2.8.15
March 18, 2022, Supported Locator version(s): 2.11
Bugs Fixed
- CFD-4300 Web connector - too many threads are being created by fetch service
- CFD-4521 Web connector - slow fetch
Version 2.8.14
September 03, 2021, Supported Locator version(s): 2.10, 2.11
Bugs Fixed
- CFD-4301 Web connector - the crawler is crawling content that should be ignored by its settings
Tasks Completed
- CFD-4271 Web connector - add support for TLS 1.2 in preview plugin
Version 2.8.13
August 09, 2021, Supported Locator version(s): 2.10, 2.11
Bugs Fixed
- CFD-4264 Web connector - fetch job is using different CrawlDecisionMaker than discovery.
- CFD-4257 Web connector - invalid redirects handling in fetch job
- CFD-4250 Web Connector - connector is constantly deleting items per run
Version 2.8.12
July 16, 2021, Supported Locator version(s): 2.10, 2.11
Bugs Fixed
- CFD-4238 Web Connector - custom plugins are loaded from default directory path if you specify incorrect directory path
- CFD-4230 Web Connector 2.8.11 does not crawl documents
- CFD-3956 Web Connector - Attempts to parse PDFs as HTML
Version 2.8.11
May 07, 2021, Supported Locator version(s): 2.10, 2.11
Bugs Fixed
- CFD-4148 Web connector - Saving pages and links to the disk is not working
- CFD-4147 Web Connector - The crucial objects like WebCrawler are not releasing unmanaged resources and managed objects.
Tasks Completed
- CFD-4132 Web Connector Parser Plugin - optimize memory usage
- CFD-4130 Web connector - update Abot libraries
Version 2.8.10
February 12, 2021, Supported Locator version(s): 2.10, 2.11, 3.0
Improvements
- CFD-4096 Web Connector - Improve handling of exceptions regarding custom plugins
Bugs Fixed
- CFD-4095 Web Connector - Unassigned configuration property "CurrentSeedUrl"
Version 2.8.9
November 25, 2020, Supported Locator version(s): 2.10, 2.11, 3.0
Improvements
- CFD-3751 Connectors - Add metadata text and document text fields merging rule
Bugs Fixed
- CFD-3956 Web Connector - Attempts to parse PDFs as HTML
Version 2.8.8
June 23, 2020, Supported Locator version(s): 2.10, 2.11, 3.0
Bugs Fixed
- CFD-3844 Web Connector - The sitemap configured for Web Connector appears in search results as a document
- CFD-3830 Web Connector - Fetches some of the documents with incorrect title
- CFD-2259 Web Connector - Crawler not indexing links
- CFD-1896 Web Connector - Connector not keeping crawler state
Version 2.8.7
April 29, 2020, Supported Locator version(s): 2.10, 2.11, 3.0
Bugs Fixed
- CFD-3742 Web Connector - Should not access HTML properties for non-HTML items
Tasks Completed
- CFD-3423 Move Web Connector to Rapid
Version 2.8.6
Bugs
- CFD-2756 Web Connector - Not all pages removed from sitemap are removed from index
Version 2.8.5
Tasks
- CFD-2750 Publish Connectors to Connectors 2.9 Feed based on SDK 1.5
- SDK-280 Publish Connectors to Connectors 2.10 Feed based on SDK 1.6
Version 2.8.4
Improvements
- CFD-2574 Web connector - Add Platform Date
- CFD-2146 Web Connector - Add support for new hit fields in RestService version 6
- CFD-2654 Universal Web connector - proxy support
Version 2.8.3
Bugs
- CFD-2435 Web crawler - 301 redirect links that shouldn't be indexed are indexed anyway.
(To avoid indexing redirects, set IsHttpRequestAutoRedirectsEnabled to False)
- CFD-2434 Web crawler does not handle robots with ending /.
(Bug confirmed and reported to Abot, the third-party crawler. Needs to be worked around temporarily with configuration changes)
- CFD-2367 Web - The sign '?' working as designed in robots.txt Disallow.
(Bug confirmed and reported to Abot, the third-party crawler. Needs to be worked around temporarily with configuration changes)
- CFD-2228 Web connector - Canonical URL not working
- CFD-2078 Web connector - Only tries to crawl with protocol TLS 1.0
Tasks
- CFD-2314 Web Connector - Release with new branding
Version 2.8.2
Improvements
- CFD-1758 Web connector - Expose and add setting "IsRespectHttpXRobotsTagHeaderNoFollowEnabled" and 3 other missing config values
Bugs
- CFD-1953 Web connector crashed after 5 days of discovery
- CFD-1894 Web connector - crawler indexes "http://virtualworks.com" when specifying "http://virtualworks.com/contact" as seedurl
- CFD-1817 Web connector - meta robots = "nofollow" not working
- CFD-1771 Web connector - "The directory is not empty" when crawling
Tasks
- CFD-1954 Web connector - Investigate high cpu/disk/memory usage
- CFD-1932 Web connector - Release version 2.8.2
- CFD-1914 Web connector - Crawl of single page not checking for canonical
Version 2.8.0
Summary
- Hidden settings now added to the Admin Wizard and database
- New custom setting for building only pages whose rel canonical link is equal to the page URL (turned off by default)
- Bug fixes and an update of the third-party crawler API.
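The canonical setting in the summary above can be illustrated with a short Python sketch. It is not the connector's implementation: the names CanonicalLinkParser and is_canonical_page are hypothetical, and both the trailing-slash normalization and the choice to accept pages without a canonical link are assumptions made for this example.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values; a bare attribute is None.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def is_canonical_page(page_url: str, html: str) -> bool:
    """Return True when the page's canonical link matches its own URL."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    if parser.canonical is None:
        # Assumption: a page without a canonical link is still built.
        return True
    # Assumption: a trailing slash is not significant for the comparison.
    return parser.canonical.rstrip("/") == page_url.rstrip("/")


page = '<html><head><link rel="canonical" href="https://example.com/a"></head></html>'
print(is_canonical_page("https://example.com/a", page))
print(is_canonical_page("https://example.com/a?page=2", page))
```

In this sketch, a page reached through a query-string variant of its canonical URL would be skipped, which is the kind of duplicate the setting is meant to keep out of the index.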
Bugs
- CFD-1813 Web connector - Bugs in admin custom settings.
- CFD-1812 Web connector - Won't remove page after e.g. reducing crawl depth
- CFD-1366 Web connector - Unable to add new settings in wizard
Tasks
- CFD-1884 Web connector - Update the Abot Crawler
- CFD-1876 Web connector - Add "tracking" to detect deleted/excluded pages
- CFD-1810 Web connector - Publish Data Sheet
- CFD-1759 Web connector - Add feature to only crawl canonical URLs
- CFD-1797 Web connector - Release next version
Version 2.7.8
To enable preview for web pages, change the REST service's web config: add "html" to the extension list for the document previewer, like this:
<add AppName="document" Action="preview" Script="DisplayLink" ExtList="txt;doc;docx;dotx;docm;docxm;dot;pdf;cs;css;js;fax;xml;xls;xlsm;xlsx;xlsxm;xlt;xltm;xltx;xps;msg;html" DocTypeList="" SkipRootExtList="" SkipRootDocTypeList="" Priority="20"></add>
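If the change needs to be scripted rather than made by hand, the edit could be sketched in Python roughly as follows. This is a minimal sketch: the helper name add_preview_extension is hypothetical, it works on a single <add> element passed as a string, and locating that element inside the actual web config file is left out because the file's layout is not described here.

```python
import xml.etree.ElementTree as ET


def add_preview_extension(element_xml: str, extension: str = "html") -> str:
    """Return the <add> element with `extension` appended to ExtList if missing."""
    element = ET.fromstring(element_xml)
    # ExtList is a semicolon-separated list of file extensions.
    extensions = [e for e in element.get("ExtList", "").split(";") if e]
    if extension not in extensions:
        extensions.append(extension)
    element.set("ExtList", ";".join(extensions))
    return ET.tostring(element, encoding="unicode")


# Shortened example entry; the real ExtList is longer.
entry = '<add AppName="document" Action="preview" Script="DisplayLink" ExtList="txt;pdf" Priority="20"></add>'
updated = add_preview_extension(entry)
print(updated)
```

The helper leaves the entry unchanged when "html" is already listed, so running it more than once is harmless.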
Features
- CFD-1251 Web connector - Add support for preview of files
Version 2.7.6
Task
- [CFD-1250] - Web connector - Filename and filext have semi-colon on the end
Version 2.7.5
Bug
- CFD-1240 - Not all specified MIME types were downloaded.
Version 2.7.4
Bug
- [CFD-538] - Web connector: Deployment issue - missing authentication plug-in file
- [CFD-539] - Web connector: AuthRealm is required to be configured
- [CFD-784] - Some of the web links aren't crawled
Task
- [CFD-540] - Web connector: Change SDK version from 1.1 to SDK 1.2