Web Connector: Adding a New Connection
Please note: There is no globally enforced standard for Web resources that the Web connector could use to obtain the last-update timestamp of a given document. To ensure that all changes are covered, the connector therefore treats every item in the repository as "updated".
Because crawl settings default to continuous crawls, this can result in high traffic between your fetch server and the configured Web repositories.
Unless your Web data source is updated multiple times a day, we recommend scheduling your Web connections to be crawled once a day, within a set time period.
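To illustrate why a reliable timestamp is not available: HTTP servers may send a Last-Modified response header, but it is optional and is often missing or malformed. The following Python sketch is illustrative only and is not part of the connector; the function name last_modified and the sample headers are assumptions made for this example.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def last_modified(headers: Mapping[str, str]) -> Optional[datetime]:
    """Return the Last-Modified time from HTTP response headers, or None."""
    # Header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    raw = normalised.get("last-modified")
    if raw is None:
        return None  # the server sent no timestamp at all
    try:
        return parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return None  # the header is present but is not a valid HTTP date


# Hypothetical responses: one with a usable timestamp, one without.
with_stamp = last_modified({"Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT"})
without_stamp = last_modified({"Content-Type": "text/html"})
```

Whenever such a function returns None, a crawler has no choice but to fetch the document again, which is the situation the note above describes.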
Page 1
- Proxy settings
- Address: Enter the FQDN or IP address of the proxy server
- Port: Enter the port number the proxy server listens on
- Login details
Note: Either leave all of the fields below empty (if the proxy server does not require authentication) or fill in every one of them.
- Username: Enter the user name
- Password: Enter the password
- Domain: Enter the domain name
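The all-or-nothing rule for the login fields can be expressed as a small check. This is a minimal Python sketch, not the connector's own validation; the function name proxy_login_is_valid is hypothetical, and treating whitespace-only input as empty is an assumption.

```python
def proxy_login_is_valid(username: str, password: str, domain: str) -> bool:
    """Return True when the three login fields are all empty or all filled."""
    # Assumption: a field that contains only whitespace counts as empty.
    filled = [bool(value.strip()) for value in (username, password, domain)]
    return all(filled) or not any(filled)
```

For example, proxy_login_is_valid("", "", "") and proxy_login_is_valid("crawler", "secret", "CORP") both pass, while supplying only a user name does not.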
Page 2
- Connection name
- Add URLs to index
- Enable or disable sitemap mode (in sitemap mode, indexed pages are removed from the index not only when they no longer exist, but also when no other page references them)
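The removal rule for sitemap mode can be pictured with plain set operations. This Python sketch only illustrates the rule as described above; it does not reflect how the connector is implemented, and the function name, parameter names, and example URLs are all hypothetical.

```python
from typing import Set


def pages_to_remove(
    indexed: Set[str],
    still_exists: Set[str],
    referenced: Set[str],
    sitemap_mode: bool,
) -> Set[str]:
    """Return the indexed pages that would be dropped after a crawl."""
    # Pages that are gone are removed in either mode.
    gone = indexed - still_exists
    if not sitemap_mode:
        return gone
    # In sitemap mode, pages that no other page links to are removed as well.
    unreferenced = indexed - referenced
    return gone | unreferenced


indexed = {"https://example.com/a", "https://example.com/b", "https://example.com/c"}
still_exists = {"https://example.com/a", "https://example.com/b"}
referenced = {"https://example.com/a"}

normal = pages_to_remove(indexed, still_exists, referenced, sitemap_mode=False)
sitemap = pages_to_remove(indexed, still_exists, referenced, sitemap_mode=True)
```

Here page c is removed in both modes because it no longer exists, while page b is removed only in sitemap mode because nothing references it.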
Page 3
- Settings (there is no reason to change these unless sitemap mode is enabled; in that case, consider setting Max crawl depth to 1)
Page 4
- Add MIME types as a comma-separated list (no spaces). Full list: https://www.sitepoint.com/web-foundations/mime-types-complete-list/
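If the MIME types come from a longer list, a short helper can produce the expected comma-separated value without spaces. This is a minimal Python sketch; the function name format_mime_types is hypothetical, and dropping duplicates is an assumption rather than a documented requirement.

```python
from typing import Iterable, List


def format_mime_types(mime_types: Iterable[str]) -> str:
    """Join MIME types with commas and no spaces, skipping blanks and repeats."""
    cleaned: List[str] = []
    for mime_type in mime_types:
        mime_type = mime_type.strip()  # remove stray spaces around each entry
        if mime_type and mime_type not in cleaned:
            cleaned.append(mime_type)
    return ",".join(cleaned)


field_value = format_mime_types([" text/html", "application/pdf ", "text/html"])
```

The resulting string can be pasted into the field as is.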
Page 5