Introduction

The objective of this document is to provide an architectural overview of ayfie Locator.

...

In the following sections we will go through all these components one by one and outline their functions.

Connectors

The ayfie Locator has a large number of connectors that support a wide variety of data sources. Out of the box only 3 of these (File Server, Exchange and SharePoint) are part of the Locator installation. The rest of them have to be installed individually.

...

  1. The connector retrieves document information, that is, the id/path/file name and the document type (file extension), as well as ACLs (Access Control Lists).
  2. It is possible to add an extension DLL with customized operations to be performed on the metadata for each document.
  3. The connector stores information about any document of a "qualifying file type" in the database. It also adds the document to the document fetch queue.
  4. The index builder detects new or updated rows of data (by checking time stamps) and stores the name and type of the document (but not the content, which has not yet been retrieved).

    The fetch phase consists of steps 5-10 described below:

  5. The connector SDK retrieves the document path/id/etc from the document fetch queue and passes this on to the relevant connector
  6. The connector downloads the document
  7. The same external custom DLL as in step 2 (if any) is now run on the fetched document content
  8. The content is extracted by one of the converters (for instance OCR conversion takes place here)
  9. The connector stores the converted content in the database under the doc id created during discovery (or earlier)
  10. The index builder picks up and indexes any new or updated documents from the database  
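
The two phases above can be sketched in a few lines of Python. This is a minimal illustration, not the connector SDK: the DocumentStore class, the row layout and the download/convert callbacks are all hypothetical names introduced here, and the custom DLL hook and the index builder are left out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DocumentStore:
    """Stand-in for the Locator database (hypothetical structure)."""
    rows: dict = field(default_factory=dict)           # doc_id -> row
    fetch_queue: deque = field(default_factory=deque)  # doc ids awaiting fetch


def discover(store, listing, qualifying_types):
    """Discovery: record metadata and ACLs, then queue qualifying documents."""
    for doc_id, path, acl in listing:
        extension = path.rsplit(".", 1)[-1].lower() if "." in path else ""
        if extension not in qualifying_types:
            continue  # not a "qualifying file type", so it is never stored
        store.rows[doc_id] = {"path": path, "type": extension,
                              "acl": acl, "content": None}
        store.fetch_queue.append(doc_id)


def fetch(store, download, convert):
    """Fetch: download each queued document, convert it and store the text."""
    while store.fetch_queue:
        doc_id = store.fetch_queue.popleft()
        row = store.rows[doc_id]
        raw = download(row["path"])
        # The content is stored under the doc id created during discovery.
        row["content"] = convert(row["type"], raw)
```

Splitting the work this way means a document is visible by name and type as soon as discovery has run, while its content arrives later when the fetch queue is drained.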

Converter Services

All files, regardless of file type or file format, are passed on to the database by a Connector, as shown below. First, the file is copied from its location at the source to a temp folder managed by the Converter Services. In addition to the copy operation, the Connector also provides the Converter Services with the full file name, and the Converter Services use the file name extension to determine which converter to use. Three converters are currently available. Two of them, OmniPage and Tesseract, are Optical Character Recognition (OCR) converters used for converting scanned documents. These two are never used together, as Locator is configured to use either one or the other. All other files are handled by the DocFilter Converter.
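
The extension-based selection described above could look roughly like the following Python sketch. The set of extensions treated as scanned documents and the function name select_converter are assumptions made for illustration; the actual mapping used by the Converter Services is not given in this document.

```python
from pathlib import Path

# Assumption for illustration only: extensions treated as scanned documents.
OCR_EXTENSIONS = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def select_converter(file_name: str, ocr_engine: str = "OmniPage") -> str:
    """Pick a converter name from the file name extension.

    Only one OCR engine is configured at a time, so it is passed in
    rather than chosen per file.
    """
    if ocr_engine not in ("OmniPage", "Tesseract"):
        raise ValueError(f"unknown OCR engine: {ocr_engine}")
    extension = Path(file_name).suffix.lower()
    return ocr_engine if extension in OCR_EXTENSIONS else "DocFilter"
```

For example, select_converter("invoice.docx") falls through to the DocFilter Converter, while a scanned image goes to whichever OCR engine the installation is configured with.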

Database

By default, ayfie Locator comes with a PostgreSQL database. ayfie Locator can also be set up to use Microsoft SQL Server. Some customers choose to replace PostgreSQL with Microsoft SQL Server for performance or stability reasons, or simply because they are unfamiliar with PostgreSQL.

The database is used to store the incoming data in its original form, including metadata and security data (ACLs). It is also used for storing configurations, output from running analytics, user settings and more.

Index Builder

The job of the Index Builder is to keep the search index in sync with the documents stored in the database. Any added, deleted or altered document in the database is detected by the Index Builder, which then re-indexes it. The Index Builder detects changes by monitoring two specific database tables (doc.document and doc.document_tombstone). If it finds any row with a time stamp more recent than its last visit, it processes that row.
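
The time stamp comparison can be illustrated with a small Python sketch. It uses SQLite so that it is self-contained, and it assumes simplified table names (document, document_tombstone) and hypothetical columns (id, modified); the real schema of doc.document and doc.document_tombstone is not described here.

```python
import sqlite3


def changes_since(conn: sqlite3.Connection, last_visit: int):
    """Return (updated_ids, deleted_ids) for rows newer than last_visit.

    Table and column names are assumptions for this sketch.
    """
    updated = [row[0] for row in conn.execute(
        "SELECT id FROM document WHERE modified > ? ORDER BY id",
        (last_visit,))]
    deleted = [row[0] for row in conn.execute(
        "SELECT id FROM document_tombstone WHERE modified > ? ORDER BY id",
        (last_visit,))]
    return updated, deleted
```

A polling loop would call changes_since with the time of its previous pass, re-index the updated ids, remove the deleted ids from the index, and then record the new time of visit.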

Linguistic Services

Linguistic Service (a.k.a. Lingo) is a module that does not come with the out-of-the-box ayfie Locator installation but has to be installed separately. Its purpose is to extract entities from the incoming data and use them to populate fields created for and dedicated to those particular values. The Linguistic Service is utilized by the Index Builder.

Index Service

By default, the ayfie Locator comes with a SOLR search engine that is pre-configured with 3 shards. A shard is an index fragment. Any indexed document is placed in one shard only. The motivation for using 3 shards is that this has been found to give the best overall performance trade-off between indexing and search for a single-node Locator installation.

...

  1. Incoming documents are indexed by the Index Builder and passed on to SOLR.
  2. SOLR consults with Zookeeper to know in which shard to place each document.
  3. At some later time there is an incoming search query that is passed in from the IIS-hosted Rest Service to SOLR.
  4. SOLR again consults with Zookeeper, this time to know which shards to search within. For a single-node installation, this step has no added value, as all three shards need to be searched and the 3 results merged into one. However, for a multi-node installation and/or an installation with failover, this last step is crucial for the operation.
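
The idea that each document lives in exactly one shard while a query fans out to all shards can be sketched in Python as follows. This does not reproduce SOLR's router or Zookeeper's cluster state: the CRC32-modulo placement, the in-memory dictionaries and the function names are stand-ins chosen purely for illustration.

```python
import zlib

NUM_SHARDS = 3  # matches the default number of shards described above


def shard_for(doc_id: str) -> int:
    """Map a document id to one shard (stand-in hash, not SOLR's router)."""
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS


def index(shards, doc_id: str, text: str) -> None:
    """Place the document in exactly one shard."""
    shards[shard_for(doc_id)][doc_id] = text


def search(shards, term: str):
    """Query every shard and merge the partial results into one list."""
    hits = []
    for shard in shards:
        hits.extend(doc_id for doc_id, text in shard.items() if term in text)
    return sorted(hits)
```

Because placement is deterministic, re-indexing an updated document overwrites the copy in the same shard rather than creating a duplicate in another one.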

Query & Result Processing

Below we see how an incoming query and the corresponding search result propagate through Locator:

...

  1. The user sends in a query, be it via the ayfie Locator front end or some other application (we have in this case used SharePoint as an example). The query is passed in via the Search API.
  2. The user is authenticated and identified. This is done by an ayfie or a customer-developed plugin and is normally done towards Microsoft Active Directory, and often also towards one or more target source systems.
  3. The original query is expanded with the user's ACLs that were obtained in the previous step.
  4. The ACLs of the items in the search result are compared to the user's ACLs obtained in step 2, and all items to which the user does not have access are removed from the result
  5. The search result is modified according to the rules stored in the search result rule engine
  6. The ayfie, custom or third party front end presents the result and provides item preview (via an ayfie plug-in)
  7. If the end user has downloaded the ayfie Document Handler, any result clicked will be opened in its native application
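
The ACL comparison in step 4 amounts to keeping only the items whose ACL shares at least one entry with the user's ACLs. A minimal Python sketch of that check follows; the item layout (an "id" and an "acl" list) and the function name are hypothetical, and real ACL handling (deny entries, nested groups) is not modeled.

```python
def trim_results(results, user_principals):
    """Drop every result item the user has no access to.

    An item is kept when its ACL and the user's principals overlap.
    """
    allowed = set(user_principals)
    return [item for item in results if allowed & set(item["acl"])]
```

For instance, a user who resolves to the principals "alice" and "finance" would see an item whose ACL is ["finance"], but not one restricted to ["hr"].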

The ayfie Document Handler

The ayfie Document Handler is a native ayfie utility for Windows that is installed separately from ayfie Locator. It is used to open links in the ayfie Locator search result using the relevant Microsoft Office or other supported application. Without this tool, the end user first has to download the file and then open it using the same application.

The SharePoint App Plug-In

The ayfie Locator Application for SharePoint is an add-on package to SharePoint that allows users of SharePoint to search with ayfie Locator from within the SharePoint GUI.

Data Enrichment Services (Rules Engine)

It is possible to use the Rules Engine to alter the out of the box behavior of ayfie Locator. There are two places where the rules of the Rules Engine come into play:

...

  1. The ayfie Locator administrator uses the Dashboard web site to create or update rules
  2. Index side rules are uploaded to the Index Builder
  3. Query and result side rules are uploaded to the Rest Service
  4. Data is fed and the rules are consulted by the Index Builder during indexing
  5. The user passes in a query
  6. The search result is processed based on Query Result Rules before being returned to the user
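
As a rough illustration of result-side processing in step 6, the Python sketch below models a rule as a (condition, action) pair applied to each result item. This representation is an assumption made for the example only; the actual rule format of the Rules Engine is not described in this document, and the field names "source" and "score" are hypothetical.

```python
def apply_result_rules(items, rules):
    """Apply each (condition, action) rule in order, then rank by score.

    Actions return a new item, so the input list is left unchanged.
    """
    for condition, action in rules:
        items = [action(item) if condition(item) else item for item in items]
    return sorted(items, key=lambda item: item["score"], reverse=True)
```

A rule that doubles the score of every SharePoint item, for example, would be written as a condition testing item["source"] and an action returning a copy of the item with the adjusted score.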

License Service

The ayfie Locator license service is consulted at connector startup (# 5 in graphic below) and at user login (# 6).

...