Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Markdown
- [Introduction](#introduction)
  - [Configuration Philosophy](#configuration-philosophy)
- [Prerequisites](#prerequisites)
- [Determine Configuration Values](#determine-configuration-values)
- [Configure the Azure OpenAI Models](#configure-the-azure-openai-models)

# Introduction
This guide describes how to configure Azure to produce the values required to configure the Ayfie Personal Assistant feature.

## Configuration Philosophy
As we will learn in the next section, one is to set Personal Assistant up with three or four models: an Embeddings model, a Main model, a High Quality model, and an Embeddingsoptionally, a High Quality Plus model. The Embeddings model is always the same and the two High Quality modelmodels isare normally easy to determine; itthey isare whichever modelmodels that, at the time, isare considered to produce the most correct and accurate responses to user prompts. Deciding which model to use as the Main model can be a bit trickier. One may want to select a model with a lower cost or, if the model to be used as the High Quality modelmodels is a bit slow, then one may opt for a model with a faster response time as the Main model.

This is further complicated by the fact that various factors, such as pricing and model performance, are subject to change over time. New models or new versions of existing models are also constantly being released. This means that the conditions that led to one configuration at installation time might be very different a few months later. For an overview of released models and their regional availability, as well as links to pricing information, check out [Azure OpenAI Service Documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview).

The configuration recommendation provided in the following section reflects the current advantages of GPT-3.54o mini, which include its lower cost and faster response time over the more correct and accurate GPT-4 model and GPT-4o models. We advise periodically reviewing this documentation to determine if evolving conditions warrant a reconfiguration.

# Prerequisites
These are the prerequisites that must be fulfilled before setting up Personal Assistant:
- **Obtain an Azure Subscription**   - One must have an active Azure subscription. If one don’t have one, one can sign up for an Azure subscription on the [Azure website](https://azure.microsoft.com/en-us/).
At times, Microsoft may impose subscription type specific limitations on their OpenAI services, particularly concerning the amount of data (referred to as the token quota) that can be exchanged during chat interactions. Please see this Microsoft documentation on [token quotas- **Get Azure OpenAI Approval** - The Azure subscription needs to be approved for [Azure Open AI](https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limitsoverview). forHow to detailsdo onthat whichis subscriptiondescribed typesin may[Ayfie havePersonal restrictions.Assistant - **GetHow Azureto OpenAI Approval**
  - The Azure subscription needs to be approved for [Azure Open AIRequest Access to the Azure OpenAI Service](https://learnayfie-dev.microsoftatlassian.com/en-us/azure/ai-services/openai/overview). How to do that is described in [Ayfie Personal Assistant - How to Request Access to the Azure OpenAI Service](https://ayfie-dev.atlassian.net/wiki/spaces/SAGA/pages/3443523634/Ayfie+Personal+Assistant+-+How+to+Request+Access+to+the+Azure+OpenAI+Service).

 Once the prerequisites above have been completed, one can then start configuring Ayfie Personal Assistant.

# Determine Configuration Values
One needs to determine the following configuration values:
- **Deployment Name**
- **API Address**
- **API Key**

for each of the following 3 Azure OpenAI models:
- **Main Model**
- **High Quality Model**
- **Embeddings Model**

The intended difference between the Main Model and the High Quality Model is the quality of the chat responses when users toggle between the two in the Personal Assistant UI. The normal approach is to set the Main Model up with GPT-3.5 and the High Quality Model with GPT-4. The reason for having two options and not just a single High Quality Model option is that GPT-3.5 can, depending on the current offerings, be faster and/or lower cost than GPT-4.

The last listed model is for creating embeddings. Embeddings are numerical representations of words that are learned from large amounts of text data. Currently, there is only one supported model.

Each of the 3 models requires an API address and an API key. However, unless one chose to spread the models across geographical regions, all 3 models will be reached via the same API address and an API key.

Given the limited options for each setting, the configuration of the prerequisites in Azure described in the next section, has a very predictable outcome:
- Main Model Deployment Name: ***gpt-35-turbo***
- High Quality Model Deployment Name: ***gpt-4***
- Embeddings Model Deployment Name: ***text-embedding-ada-002***
- API Address: *the same one for all three*
- API Key: *the same one for all three*

# Configure the Azure OpenAI Models
For more information, consult [Azure OpenAI Service Documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview).

Follow these steps to set up the deployments for the 3 Azure OpenAI models:
- Log in to the Azure portal at [portal.azure.com](https://portal.azure.com/)
- Make sure the account that is logged in has at least one subscription
- Go tonet/wiki/spaces/SAGA/pages/3443523634/Ayfie+Personal+Assistant+-+How+to+Request+Access+to+the+Azure+OpenAI+Service).

 Once the prerequisites above have been completed, one can then start configuring Ayfie Personal Assistant.

# Determine Configuration Values
One needs to determine the following configuration values:
- **Deployment Name**
- **API Address**
- **API Key**

for each of the following 3 Azure OpenAI models:
- **Main Model**
- **High Quality Model**
- **Embeddings Model**

and optionally for the following Azure OpenAI model:
- **High Quality Plus Model**

The intended difference between the Main Model, High Quality Model and the High Quality Plus Model is the quality of the chat responses when users toggle between the three in the Personal Assistant UI. The recommended approach at the time of this writing, is to set the Main Model up with *GPT-4o mini*, the High Quality Model with *GPT-4* and High Quality Plus Model with *GPT-4o* and to set either the High Quality Model or High Quality Plus Model as the default model in the UI.

The last model is for creating embeddings. Embeddings are numerical representations of words that are learned from large amounts of text data. Currently, only the *text-embedding-ada-002* model should be used.

Each of the 4 models requires an API address and an API key. However, unless one chose to spread the models across geographical regions, all 4 models will be reached via the same API address and can use the same API key.

These are the currently recommended settings:
- Main Model Deployment Name: ***gpt-4o-mini*** (only available in some regions)
- High Quality Model Deployment Name: ***gpt-4***
- High Quality Plus Model Deployment Name: ***gpt-4o*** (only available in some regions)
- Embeddings Model Deployment Name: ***text-embedding-ada-002***
- API Address: *the same one for all*
- API Key: *the same one for all*

# Configure the Azure OpenAI Models
For more information, consult [Azure OpenAI Service Documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview).

Follow these steps to set up the deployments for the Azure OpenAI models:
- Log in to the Azure portal at [portal.azure.com](https://portal.azure.com/)
- Make sure the account that is logged in has at least one subscription
- Go to *Azure OpenAI*
  - Click *Create*, to create a Resource
    - In *Project Details*, select *Subscription*
    - In *Project Details*, select *Resource Group*
    - In *Instance Details*, select *Region*.  Not all models are available in all regions, consult with [Azure OpenAI Service models](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models) for availability. EU country customers should for legal reason select a region that is within EU. *Sweden Central* would normally be a good choice as this region currently offers most models. Please note that the standalone Personal Assistant purchased from Azure Marketplace and the Locator integrated version addressed in this documentation, compete for the same token per minutes quotas if they share both subscription and region. Please contact Ayfie Support for advise on which region to select in case of quota conflicts.
    - In *Instance Details*, set the *Name*
    - In *Instance Details*, select *Pricing Tier*
    - Click *Next* to go to the *Network* tab
    - In *Type*, select *All networks, including internet can access this resource.*
    - Click *Next* to go to the *Tags* tab
    - Click *Next* to go to the *Review + submit* tab
    - Click *Create*
  - When Resource is created, select the resource in *Azure OpenAI*
  - Click *Create*, to create a Resource
    - In *Project Details*, select *Subscription*
    - In *Project Details*, select *Resource Group*
    - In *Instance Details*, select *Region*.  Not all models are available in all regions, consult with [Azure OpenAI Service models](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models) for availability. EU country customers should for legal reason select a region that is within EU. To avoid any data quota conflict with a current or a future deployment of the standalone version of the Personal Assistant, it is recommended to **not** select the regions *Sweden Central*, *UK South* and *Canada East*.
    - In *Instance Details*, set the *Name*
    - In *Instance Details*, select *Pricing Tier*
    - Click *Next* to go to the *Network* tab
    - In *Type*, select *All networks, including internet can access this resource.Keys and Endpoint* in the left menu under *Resource Management*
    - **Copy the value of *Endpoint*, it will be required later as the API Address**
    - **Copy the value of *KEY 1*, it will be required later as the API Key** (optionally *KEY 2*, both keys are valid)
  - Click *Model deployments* in the left menu under *Resource Management*
  - Click *Manage Deployments* (this will open a new portal). This guide is based on the old look. Make sure to switch to it in the top menu.
  - Click *Create new deployment* to create Main model
    - Select the model (*gpt-4o-mini* recommended)
    - Select the Model Version (Set to *2024-07-18 (Default)*)
    - Set the *Deployment Name* (must be same as model name)
      - **Copy the *Deployment Name*, it will be required later**
    - ClickSet *NextDeployment Type* to go to Standard (will ensure that the data stays and is processed within the *Tags* tabselected region)
    - ClickSet *Next*Tokens toper goMinute toRate theLimit (thousands)*Review to +the submit*maximum tabvalue
    - ClickSet *CreateContent Filter* to the -appropriate Whenvalue Resource is created, selector use the resource in *Azure OpenAI*default value
  - Click *KeysCreate andnew Endpointdeployment* to increate theHigh leftQuality menumodel
    - **CopySelect the valuemodel of *Endpoint*, it will be required later as the API Address**(*gpt-4* recommended)
    - Select the Model Version (*1106-Preview* recommended)
    - Set the *Deployment Name*Copy the(must valuebe ofsame *KEY 1*, it will be required later as the API Key** (optionally *KEY 2*, both keys are valid)
  - Click *Model deployments* in the left menu
  - Click *Manage Deployments* (this will open a new portal)
  - Click *Create new deployment* to create Main model
    - Select the model (recommended *gpt-35-turbo*)
    - Select the Model Version (Set to *Auto-update to default*)
    - Set the *Deployment Name* (must be same as model name)
 as model name)
      - **Copy the *Deployment Name*, it will be required later**
    - Set *Deployment Type* to Standard (will ensure that the data stays and is processed within the selected region)
    - Set *Tokens per Minute Rate Limit (thousands)* to the maximum value
    - Set *Content Filter* to the appropriate value or use the default value
  - Click *Create new deployment* to create Embeddings model
    - **CopySelect the *Deployment Name*, it will be required later**model (must be *text-embedding-ada-002*)
    - In *Advanced Options*, set *Tokens per Minute Rate Limit (thousands)* to maximum value.Select the Model Version (*2 (Default)* recommended)
    - ClickSet the *CreateDeployment new deploymentName* to(must createbe Highsame Qualityas model name)
      - Select the model (recommended *gpt-4*) **Copy the *Deployment Name*, it will be required later**
    - Select the Model Version (recommended *1106-Preview*)
    - Set the *Deployment Name* (must be same as model name).
      - **Copy the *Deployment Name*, it will be required later**
    - In *Advanced Options*, set *Tokens per Minute Rate Limit (thousands)* to maximum value.
  - ClickSet *Deployment Type* to Standard (will ensure that the data stays and is processed within the selected region)
    - Set *Tokens per Minute Rate Limit (thousands)* to the maximum value
    - Set *Content Filter* to the appropriate value or use the default value.
  - If one are to configure the High Quality Plus Model, click *Create new deployment* to create Embeddingsthe High Quality Plus model
    - Select the model (must be *text-embedding-ada-002**gpt-4o* recommended)
    - Select the Model Version (*2024-05-13* recommended)
    - Set the *Deployment Name* (must be same as model name).
      - **Copy the *Deployment Name*, it will be required later**
    - In *Advanced Options*, setbe required later**
    - Set *Deployment Type* to Standard (will ensure that the data stays and is processed within the selected region)
    - Set *Tokens per Minute Rate Limit (in thousands)* to the maximum value
    - Set *Content Filter* to the appropriate value or use the default value