SharePoint Connector: On-premise Document Connection

Server Page

Setting Name | Description
SharePoint Server Site

SharePoint Server address: Enter the address of the SharePoint server using the DNS name.
  • Note: To use Active Directory Single Sign-On, you must use the DNS name. An IP address cannot be used as the SharePoint server address.
Display Name

Enter the display name for the connection.

Username

Enter the index user account name in the form of either domain\username or username@domain.com

Password

The password of the index user.

Ecspand Repository

Enables the Ecspand plugin for the connection. If the connection you are configuring is an Ecspand repository, check this box. Otherwise, leave it unchecked. Requires a license for the Ecspand plugin.
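
The two input rules on this page (a DNS name rather than an IP address for the server, and an index user in one of two account formats) can be checked before the values are entered in the wizard. The following is a minimal sketch, not part of the connector: the function names and the regular expressions are hypothetical, and the address is assumed to include a scheme such as https://.

```python
import ipaddress
import re
from urllib.parse import urlsplit


def host_is_dns_name(server_address: str) -> bool:
    """Return True if the host part of the address is not an IP literal.

    Assumes the address includes a scheme (for example https://).
    """
    host = urlsplit(server_address).hostname
    if not host:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return True  # not parseable as IPv4/IPv6, so treat it as a DNS name
    return False


# Hypothetical patterns for the two documented account formats.
DOWN_LEVEL = re.compile(r"^[^\\@\s]+\\[^\\@\s]+$")     # domain\username
UPN = re.compile(r"^[^\\@\s]+@[^\\@\s]+\.[^\\@\s]+$")  # username@domain.com


def username_format_ok(username: str) -> bool:
    """Return True if the name looks like domain\\username or username@domain.com."""
    return bool(DOWN_LEVEL.match(username) or UPN.match(username))
```

An address that fails the first check still connects, but according to the note above it cannot be used with Active Directory Single Sign-On.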

Site Collection Page

The Site Collection Configuration window allows you to choose to index all site collections or only the site collection at the specified server URL.

Setting Name | Description
Index all 'Site Collections' on this server

The setting "Index all 'Site Collections' on this server" is available for on-premises SharePoint 2013 or 2016 deployments. Checking this box lets you choose to index all path-based site collections or host-named site collections in the default SharePoint web application (port 80 or 443). Non-standard ports are not supported.

If you enable this option, the SharePoint connector queries the SharePoint Search REST API to get the list of site collections. This option requires a running SharePoint Search Service.

Include Path-Based Site Collections

When this option is selected, the URLs of the site collections are matched against the SharePoint Server address configured for this connection (see Server Page). Site collections that do not match the path are ignored. Matching is performed by comparing the URI authority of the URLs.

Newly created site collections within the default web application matching the path will automatically be indexed when this setting is enabled.

Include Host-Named Site Collection

If you are indexing host-named site collections, choose which host name to index from the drop-down list. Only one host-named site collection can be configured for each SharePoint connection.

Newly created site collections within the default web application matching the path or host name will automatically be indexed when this setting is enabled.

Custom property name

Select or provide the name of a property that all included site collections must have. Site collections without this property are excluded. You can press the Auto Discover Properties button to look up the available properties*.

Custom property value

Select or provide a value for the custom property. Site collections that do not have a property with the configured name and value are excluded. You can press the Auto Discover Properties button to look up the available properties*.

* If your server contains many site collections, this lookup can take a while. In that case, you can pause the lookup at any time after the first site collection has been processed and work with the limited data.
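
The path-based matching described above compares the URI authority (host and port) of each site collection URL with that of the configured server address. The sketch below illustrates this kind of comparison. It is not the connector's implementation: the function names are hypothetical, and treating an explicit default port (80 for http, 443 for https) as equal to no port is an assumption.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def authority(url: str) -> str:
    """Return host plus port, omitting the scheme's default port."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(parts.scheme):
        return host
    return f"{host}:{port}"


def matching_site_collections(server_address, site_collection_urls):
    """Keep only the URLs whose authority equals that of the server address."""
    wanted = authority(server_address)
    return [u for u in site_collection_urls if authority(u) == wanted]


if __name__ == "__main__":
    urls = [
        "https://sp.example.com/sites/hr",
        "https://teams.example.com/",
    ]
    print(matching_site_collections("https://sp.example.com", urls))
```

Under these assumptions, a site collection on another host name, or on the same host but a non-standard port, does not match and would be ignored.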

The "Index all 'Site Collections' on this server" setting can also be used to index users' MySites. Each MySite (which contains a library called OneDrive for Business) is considered its own site collection in SharePoint Server. Locator can index these OneDrive libraries with this setting as long as the MySites are hosted on the default web application. Individual MySites can be excluded from indexing by checking them on the SubSites page of the SharePoint wizard.

Site Collection Sets Page

This window allows you to divide the discovery of site collections across multiple connections (sets) that index them in parallel. Each set can have a different index user.

Setting Name | Description
Index set of 'Site Collection'

If you enable this option, the SharePoint connector will index only one set of site collections.

Total number of sets

The number of sets. Each set must be created in a new connection.

Current set number

The set for which the current connection is being created.

  • Important Note: Create a connection for every set, even if a set is currently empty. When a new site is created, it may be assigned to the empty set.
Details

Information that helps determine the optimal number of sets/index users.
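
This page does not describe how site collections are assigned to sets. The sketch below shows one common way to make such an assignment deterministic, a stable hash of the site collection URL taken modulo the number of sets. It is purely illustrative: the function name and the hashing scheme are assumptions and do not describe the connector's actual algorithm. It does show why every set needs its own connection: a newly created site can land in any set number, including one that is currently empty.

```python
import hashlib


def set_number_for(site_collection_url: str, total_sets: int) -> int:
    """Map a site collection URL to a set number in 1..total_sets.

    Hypothetical scheme: stable SHA-256 hash of the lowercased URL.
    """
    if total_sets < 1:
        raise ValueError("total_sets must be at least 1")
    digest = hashlib.sha256(site_collection_url.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % total_sets + 1


if __name__ == "__main__":
    for url in ("https://sp.example.com/sites/hr", "https://sp.example.com/sites/it"):
        print(url, "-> set", set_number_for(url, total_sets=3))
```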

SubSites Page

After clicking ‘next’, what’s displayed on the SharePoint Subsites window depends on your selections from the previous window:

  • If "Index all 'Site Collections' on this server" is not checked, Locator displays all subsites found under the specified server URL.
  • If "Index all 'Site Collections' on this server" is checked with Path-Based Site Collections, Locator displays a list of site collections found in the default web application that match the path of the specified server URL.
    • Clicking the + next to a site collection queries for the subsites under that site collection.
  • If "Index all 'Site Collections' on this server" is checked with Host-Named Site Collections, Locator displays a list of site collections found in the default web application that match the selected host name.
    • Clicking the + next to a site collection queries for the subsites under that site collection.

Note: When using the "Index all 'Site Collections' on this server" setting, if the index user doesn't have the necessary permissions to a site collection, that site collection is excluded by default and disabled (greyed out).

Setting Name | Description
Subsites

By default, all sub-sites will be indexed. You can choose to exclude sites by checking the box to the left of the site name.

Exclude items marked as NoCrawl

You can choose to exclude items marked as NoCrawl in SharePoint.
  • The NoCrawl flag is controlled in the "Search and Offline Availability" settings for the SharePoint List.
  • If "Allow this site to appear in search results" is set to No, the list will have the NoCrawl flag.
Exclude items from Hidden lists

You can choose to exclude items from hidden lists.
  • A list is hidden when the "Hide from Browser" flag is checked in SharePoint Designer.

Include SharePoint Web Sites

The default setting is not to include the SharePoint Sites.
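
The two exclusion options for NoCrawl and hidden lists act as independent filters on the lists found in a site. The sketch below illustrates that logic with a hypothetical ListInfo record. The field names and the function are assumptions made for the example and are not taken from the connector.

```python
from dataclasses import dataclass


@dataclass
class ListInfo:
    """Hypothetical summary of a SharePoint list."""
    title: str
    no_crawl: bool = False  # list is marked as NoCrawl
    hidden: bool = False    # "Hide from Browser" is checked


def lists_to_index(lists, exclude_no_crawl: bool, exclude_hidden: bool):
    """Return the lists that remain after applying the two exclusion options."""
    kept = []
    for item in lists:
        if exclude_no_crawl and item.no_crawl:
            continue
        if exclude_hidden and item.hidden:
            continue
        kept.append(item)
    return kept
```

With both options off, every list is kept. Each option only removes lists that carry the corresponding flag.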

Content Type Page

This window allows you to choose to exclude specific data content types from being indexed.  Just as you can with sites, simply check the box to the left of the content type name for those content types you wish to exclude from the index.

Setting Name | Description
Content Type

You can also choose to exclude specific data content types from being indexed.  Just as you can with sites, simply check the box to the left of the content type name for those content types you wish to exclude from the index.

Content Type Advanced Configuration Page

The "Advanced configuration" link allows you to include and exclude items of specific content types. More specific content types always have higher priority. A more specific content type can be recognized by the length of its ID: the content type with ID 0x010201 is more specific than 0x0102. For example, if you exclude the Item (0x01) content type and include the Event (0x0102) content type, this connection will index only Event items.
After you confirm the new configuration, the Content Type Page is updated and no longer shows the excluded content types.
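
The priority rule above can be read as "the longest matching content type ID wins". The following sketch resolves include/exclude rules that way. It is an illustration under that reading, not the connector's code: the function name, the rule dictionary, and the default for IDs that match no rule are hypothetical.

```python
def is_indexed(content_type_id: str, rules: dict, default: bool = True) -> bool:
    """Decide whether an item of the given content type is indexed.

    rules maps a content type ID to True (include) or False (exclude).
    The rule with the longest ID that is a prefix of content_type_id wins.
    """
    cid = content_type_id.lower()
    best_id, best_include = None, default
    for rule_id, include in rules.items():
        rid = rule_id.lower()
        if cid.startswith(rid) and (best_id is None or len(rid) > len(best_id)):
            best_id, best_include = rid, include
    return best_include


if __name__ == "__main__":
    # Exclude Item (0x01), include Event (0x0102), as in the example above.
    rules = {"0x01": False, "0x0102": True}
    for cid in ("0x0102", "0x010201", "0x0101"):
        print(cid, is_indexed(cid, rules))
```

With these rules, Event (0x0102) and any content type derived from it are indexed, while other content types that start with 0x01 fall under the Item exclusion.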

File Type Page

Setting Name | Description
File Type to index

Select file types. Only the most common file types are included by default.

The file extensions to index. The options are:

  • Index all file types
    • Indexes all files, regardless of file format
    • For file types associated with the built-in DocFilter content filter, all text content available will be indexed
    • For other file types, only available metadata (like file name and path) will be indexed
  • Index selected file types only (default)
    • Will only index the file types listed in the dialog
  • Exclude selected file types
    • Will index everything - except the file types defined in the list
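
The three options above amount to a simple decision on the file extension. The sketch below models them. The enum, the function name, and the matching on the last extension only are assumptions for illustration and not the connector's implementation.

```python
from enum import Enum
from pathlib import PurePosixPath


class Mode(Enum):
    ALL = "Index all file types"
    SELECTED_ONLY = "Index selected file types only"
    EXCLUDE_SELECTED = "Exclude selected file types"


def should_index(file_name: str, mode: Mode, selected) -> bool:
    """Return True if a file with this name is indexed under the given mode."""
    ext = PurePosixPath(file_name).suffix.lower().lstrip(".")
    chosen = {e.lower().lstrip(".") for e in selected}
    if mode is Mode.ALL:
        return True
    if mode is Mode.SELECTED_ONLY:
        return ext in chosen
    return ext not in chosen  # Mode.EXCLUDE_SELECTED
```

For example, with the selection ["pdf", "docx"], a file named report.PDF is indexed under "Index selected file types only" and skipped under "Exclude selected file types".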


Finalize Page

The next wizard screen will allow you to finalize the connection and select whether you want to utilize the "change sets" feature.  

Setting Name | Description
SharePoint ChangeSets

Utilizing change sets is a more efficient method for Locator to determine what has been changed on the SharePoint server since the previous "crawl".

  • Important Note for SharePoint Connector versions older than 2.9.0.0: To use this feature, the index user must be able to access the SiteData web service and therefore must be a site collection administrator.
    Since SharePoint Connector version 2.9.0.0, the SiteData web service is no longer used to retrieve changes, and the index user no longer has to be a site collection administrator to use the ChangeSet functionality.

Include SharePoint Lists

The default setting is to include the SharePoint Lists.

Include SharePoint Comments

Check this option if you want to index comments under pages in SharePoint. The default setting is to exclude comments.
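
The version-dependent requirement for change sets can be expressed as a version comparison. The sketch below assumes a four-part, dot-separated version string such as 2.9.0.0. The function names are hypothetical, and how the installed connector version is obtained is outside the scope of the example.

```python
def parse_version(text: str) -> tuple:
    """Parse a dotted version string into a 4-tuple of integers, padding with zeros."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (4 - len(parts))
    return tuple(parts[:4])


def changesets_need_site_collection_admin(connector_version: str) -> bool:
    """Return True for connector versions older than 2.9.0.0."""
    return parse_version(connector_version) < (2, 9, 0, 0)
```

Comparing integer tuples rather than strings keeps a version such as 2.10.0.0 ordered after 2.9.0.0.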

Upon completion of the Locator SharePoint Connection, the Locator Server will be ready for scheduling the SharePoint connection for indexing.  As soon as the connection wizard is finished, you will return to the SharePoint connections overview in the Management Console.  You can click the "Schedule" option to configure the schedule for the connection.

ayfie