Locator Dashboard Tutorial
Introduction
This tutorial aims to familiarize the reader with the Locator Dashboard. Upon completing this tutorial, the reader should know the following:
How to obtain document ids
How to enable hidden menu options
How to manage repositories (adding, deleting, disabling, scheduling)
How to see when a repository scan started, when it ended and whether a scan is currently ongoing
How to see how many changes were made to the index and of what type (documents added, deleted or updated)
What the relationship between a source reference id and a document id is
How to refetch a specific document
Prerequisites
This tutorial assumes that one already has a Locator 5.x instance up and running. If that is not the case, check out https://ayfie-dev.atlassian.net/wiki/spaces/SAGA/pages/2400714758 first.
Download the Test Data
Create directory C:\test_data and extract the content of the docs.zip file below into the newly created directory:
After the extraction the directory should contain these two files:
doc.docx
doc.zip - which in turn contains these files:
doc.pdf
doc.xlsx
doc.txt
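To double-check the extraction, a short script can list the files in the directory together with the entries of any .zip archive found there. The following is a minimal sketch in Python; the function name list_test_files is made up for this example, and the path is the one used in this tutorial. Only files with a .zip extension are opened, since .docx and .xlsx files are technically zip archives as well.

```python
import zipfile
from pathlib import Path


def list_test_files(directory):
    """Return the file names in `directory`, plus the entries of any .zip file found there."""
    names = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        names.append(path.name)
        # Only open real .zip archives; .docx/.xlsx are zip containers too.
        if path.suffix.lower() == ".zip":
            with zipfile.ZipFile(path) as archive:
                # Prefix each entry with the archive name to show the nesting.
                names.extend(f"{path.name}/{entry}" for entry in sorted(archive.namelist()))
    return names


if __name__ == "__main__":
    test_dir = Path(r"C:\test_data")  # directory created in this tutorial
    if test_dir.is_dir():
        for name in list_test_files(test_dir):
            print(name)
```

With the test data in place, the listing should show doc.docx and doc.zip, followed by the three files inside doc.zip.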
Index the Test Files
Install the File Server Connector as instructed in section Connector Management of the https://ayfie-dev.atlassian.net/wiki/spaces/SAGA/pages/2400714758.
Once installed, create a new file server connection for extracting the files in the test_data directory we created above. Make sure to enable all 5 file types listed above (including the .zip file type) when prompted for them in the wizard (see the bottom right part of the screenshot below).
After a few minutes, verify that one is able to successfully search for the five files with a * wildcard search in the Locator search frontend. The screenshot below is from an installation that has been set up using a hidden self-signed engineering certificate option, which makes it possible to access the search frontend only locally via the URL https://localhost/search. Depending on your installation, you might have to use another URL.
The search result can be shown in any of 3 different views: details, cards and list. The search result above is shown using the card view, the middle one of the 3 red-circled options in the upper right. Try out each of the 3 options to see the effect.
In later sections we will need to know the document id of a few of the documents. The id is obtained by adding debug=true as the last parameter of the search page URL. The document id will then appear at the bottom of the search result cards as shown here:
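If the URL is built in a script rather than edited by hand in the browser, the parameter can be appended with the Python standard library. This is a minimal sketch; the helper name with_query_param is made up for this example, and https://localhost/search is the local URL used in this tutorial, so substitute your own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_param(url, key, value):
    """Return `url` with `key=value` set as the last query parameter."""
    parts = urlsplit(url)
    # Drop any existing occurrence of the key so it is not duplicated.
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != key
    ]
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_query_param("https://localhost/search", "debug", "true"))
# prints: https://localhost/search?debug=true
```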
Accessing the Locator Dashboard
So far we have used the Locator Management Console to configure a connector connection and the Locator search frontend to search for documents and to obtain document ids. We will now access the Locator Dashboard at https://localhost/Dashboard (again, depending on your setup, you are likely to have to use another FQDN than localhost):
The opening page shows several statistics and displays these main menu options at the top:
Repositories
System
Analytics
Index
Configuration
In addition to these 5 there is also a hidden Rules option that we will learn how to make visible in the next section below.
Hidden Menu Options
Just as we earlier used a special URL parameter to obtain more information on the search page, there is also a special URL parameter, se=1, that reveals the following hidden menu options:
The Rules option in the main menu in the top bar
The Create Refiner and Create Index Field options on the page reached from the Index option in the main menu
We will not be using the hidden menu options in this tutorial. However, in the https://ayfie-dev.atlassian.net/wiki/spaces/ACADEMY/pages/3120463873 we will use them extensively when working with the Rule Engine to add new refiners to the search page.
Repositories
The leftmost main menu option is Repositories. A repository corresponds to a single connector connection. In the graphic below we have 5 repositories (connector connections) across 3 connectors (Exchange, SharePoint and WorkSite):
Managing Repositories
The Locator Dashboard only shows the repositories and reports their performance and status. To make changes to them, such as adding, deleting or disabling a connection, one has to use the Locator Management Console:
Experiment with the file server connection that was made earlier in this tutorial to answer the questions below. Do not worry about losing the index data, as in this case one can very quickly add it back again.
What does it mean when a connection has an orange background color in the Dashboard, as we can see for the SharePoint connector in the previous section?
What types of connection scheduling are possible?
If one disables indexing for a connection in the Management Console, does that indeed disable the indexing or just the fetching of more documents?
What happens to documents that have already been fetched but not yet indexed when one disables indexing?
Is it still possible to search for documents from a repository that has had indexing disabled?
Is there a way to recover documents from a repository that has been deleted?
What happens if one changes the connection configuration, for instance by removing one of the file types to be retrieved? Will the index be automatically updated according to the new configuration?
Repository Scanning
To see more information about a repository, click the repository name as we have done here for one of the two WorkSite repositories:
Notice how we have red-circled the last two scans, one that has completed and one that started after that and is still ongoing. Notice also how we are told how long each scan took, what actions have taken place (documents added, updated or deleted) and whether there were any errors.
How many scans have taken place for the test data for which a connection was created earlier?
How long does each scan take?
What actions do we see for each scan?
Do something to the data so that one ends up with one line of scan information that shows 1 delete, 1 add and 1 update, without losing any of the 5 original files.
Full Scan or Change Set
There are two kinds of connector scan operations:
Full Scan - all items are inspected one by one to check for changes (including documents that have been removed)
Change Set - only items that have changed are inspected
All connectors except for the Exchange Online connector support full scan and use this method. The scanning can be scheduled; if it is not, it repeats continuously with a 5-minute pause between the end of one scan and the start of the next.
Not all connectors support change set. However, those that do can be configured to run change sets in addition to full scans.
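To make the idea of a full scan concrete, the sketch below mimics what such a scan conceptually has to do: compare every item currently in the source with what was seen in the previous scan, and classify the differences as added, updated or deleted. It is an illustration only, not Locator's actual implementation; the function name, the version marker per item and the sample file names are all assumptions.

```python
def diff_full_scan(previous, current):
    """Compare two snapshots that map item path -> version marker (e.g. a modification time)."""
    added = sorted(p for p in current if p not in previous)
    deleted = sorted(p for p in previous if p not in current)
    updated = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return {"added": added, "updated": updated, "deleted": deleted}


# Hypothetical snapshots from two consecutive scans.
previous_scan = {"a.txt": 1, "b.txt": 1}
current_scan = {"a.txt": 2, "c.txt": 1}

print(diff_full_scan(previous_scan, current_scan))
# prints: {'added': ['c.txt'], 'updated': ['a.txt'], 'deleted': ['b.txt']}
```

Note that detecting the deleted item requires looking at the complete set of items, which is why the full scan description above mentions removed documents explicitly. A change set, by contrast, inspects only the items that have changed.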
In the graphic below we see how the dashboard shows two tables, one for the full scans and another one for the change sets.
Source Reference ID vs. Document ID
The repository page has two lookup boxes, the top one for the source reference id and the bottom one for the document id.
What is the difference between the two types of IDs? And why does it say that we have sent in only 3 documents when we know we sent in 4 (docx, pdf, xlsx, txt), or 5 if we count the zip file holding 3 of the files?
If we follow the click route below we will see that the directory we made to hold the test files, C:\test_data, has source reference id 1 and that the docx file and the zip file have source reference ids 15 and 16 respectively (these are sequence numbers, so one will have other numbers).
If one now clicks on either of the two references (15 - docx or 16 - zip) one will see that they do not have children (try it with your own data). This means that our test data consists of 3 sources linked to each other like this:
source ref id 1: directory C:\test_data
source ref id 15: doc.docx
source ref id 16: doc.zip
This also means that the column title in the Dashboard UI is misleading as it should have said Sources instead of Documents.
If we now start exploring one of the sources, in this case the zip file, we see that the zip file also has a document id (25) that is different from its source reference id (16). We also see that each of the files contained in the zip file is listed below it: first the doc.txt file (26), then the doc.xlsx file (27) and finally (out of view in the screenshot below) the pdf file. Even without seeing it in the screenshot, we know its id from earlier in this tutorial, when we looked up ids in the search frontend. What is the document id of the pdf file, and are the document ids in the search frontend the same as below for the other files?
By using the lookup functionality in the Dashboard UI we are able to find all document ids by using the 3 source reference ids we identified above.
source ref id 1: C:\test_data (doc id 1)
source ref id 15: doc.docx (doc id 23)
source ref id 16: doc.zip (doc id 25)
doc.txt (doc id 26)
doc.xlsx (doc id 27)
doc.pdf (doc id 28)
The ids are sequential, so they will differ between installations. Produce the same table as above for your own installation and, when done, confirm that all document ids match the ones obtained in the search frontend using the debug=true parameter.
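The relationship in the table above can also be written down as a small data structure. The sketch below is a minimal illustration in Python using the example ids from this tutorial (yours will differ); the names SOURCES and source_ref_for_document are made up for this example and are not part of Locator.

```python
# Each source reference id maps to the documents produced from that source.
# The ids are the example values from this tutorial; other installations
# will have other numbers.
SOURCES = {
    1: {"C:\\test_data": 1},
    15: {"doc.docx": 23},
    16: {"doc.zip": 25, "doc.txt": 26, "doc.xlsx": 27, "doc.pdf": 28},
}


def source_ref_for_document(doc_id):
    """Return the source reference id a document id belongs to, or None if unknown."""
    for source_ref_id, documents in SOURCES.items():
        if doc_id in documents.values():
            return source_ref_id
    return None


print(source_ref_for_document(28))  # doc.pdf inside doc.zip
# prints: 16
```

One source can hold several documents, which is why there are only 3 sources here but 6 document ids.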
Refetching a Document
Sometimes one needs to refetch a specific document. The reasons for wanting to do that are covered in other tutorials; here we will just learn how it is done. Imagine that we, for whatever reason, need to refetch the pdf file above. The first thing we need to do is obtain the document id. That is normally done with the method we have already learned, where we add the debug=true parameter at the end of the search page URL.
Notice how Locator on its own maps the pdf file with document id 28 to the corresponding source reference 16 (the source reference number derived and displayed automatically by Locator).
By refreshing the page one can see whether the fetch request has been executed and, if it has, when that was and how long it took.
We see there are 3 search boxes to the left in the screenshot above. We are now familiar with the two top lookup boxes. The third box, the validation box, we will learn about in the https://ayfie-dev.atlassian.net/wiki/spaces/ACADEMY/pages/3120463873.
System
TBD
Analytics
TBD
Index
TBD
Configuration
TBD