Okahu

# Okahu concepts

Components

Components are the core building blocks of the user’s AI ecosystem. Every component has a unique identifier, a friendly name, a description, and a type. A component can optionally include a set of attributes relevant to its type, e.g. a Triton server specifies its connection endpoint and a Model includes its parameters.

Infrastructure component

An infrastructure component is an instance of a service, e.g. an NVIDIA Triton server instance, a Kubernetes cluster instance, or a Postgres database server. Every instance of a service type is a separate component, e.g. the RDS Postgres instances Finance-Service-EastUS and Sales-Service-WestUS are both Postgres servers but are two different components.

Logical component

A logical component is a piece of code or data, e.g. workflow code, a GPT 3.5 Turbo model, or a vector data set. Every copy of such code or data is a separate component.
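As a minimal sketch of the component model described above, the two kinds of component can share one record type. The class and field names, the `Kind` enum, the use of UUIDs as identifiers, and the endpoint URL are all assumptions made for illustration; they are not Okahu's actual schema or API.

```python
from dataclasses import dataclass, field
from enum import Enum
from uuid import uuid4


class Kind(Enum):
    INFRASTRUCTURE = "infrastructure"  # an instance of a service
    LOGICAL = "logical"                # a piece of code or data


@dataclass
class Component:
    name: str         # friendly name
    description: str
    type: str         # e.g. "postgres", "triton", "model" (free-form in this sketch)
    kind: Kind
    attributes: dict = field(default_factory=dict)        # optional, type-specific
    id: str = field(default_factory=lambda: uuid4().hex)  # unique identifier (assumed UUID)


# Two instances of the same service type are two separate components.
finance_db = Component("Finance-Service-EastUS", "RDS Postgres for finance",
                       "postgres", Kind.INFRASTRUCTURE)
sales_db = Component("Sales-Service-WestUS", "RDS Postgres for sales",
                     "postgres", Kind.INFRASTRUCTURE)

# Type-specific attributes: a Triton server carries its connection endpoint
# (hypothetical URL), a model carries its parameters.
triton = Component("triton-1", "Shared inference server", "triton",
                   Kind.INFRASTRUCTURE, {"endpoint": "http://triton.example:8000"})
gpt = Component("gpt-3.5-turbo", "Chat model", "model",
                Kind.LOGICAL, {"parameters": {"temperature": 0.2}})
```

Here `finance_db` and `sales_db` have the same `type` but different `id` values, which mirrors the rule that every instance of a service is its own component.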

Dependencies between components

Components can depend on one another. A component can be ‘hosted’ by another component, e.g. a Model is hosted on an inference server, and the inference server is hosted on Kubernetes. Note that a logical component can be hosted on an infrastructure component but can’t host any other component.
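The hosting rule can be sketched as a small check. This is an illustration only: the `Component` class, the `host` and `hosting_chain` helpers, and the component names are hypothetical and not part of any Okahu API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    name: str
    kind: str  # "infrastructure" or "logical"
    hosted_by: Optional["Component"] = None


def host(guest: Component, on: Component) -> None:
    """Record that `guest` is hosted by `on`; a logical component cannot host."""
    if on.kind == "logical":
        raise ValueError(f"{on.name} is logical and cannot host {guest.name}")
    guest.hosted_by = on


def hosting_chain(component: Component) -> list:
    """Return the names of the hosts of `component`, nearest host first."""
    chain = []
    current = component.hosted_by
    while current is not None:
        chain.append(current.name)
        current = current.hosted_by
    return chain


kubernetes = Component("kubernetes", "infrastructure")
server = Component("inference-server", "infrastructure")
model = Component("model", "logical")

host(server, kubernetes)  # infrastructure hosted on infrastructure
host(model, server)       # logical hosted on infrastructure
print(hosting_chain(model))
```

Calling `host(server, model)` raises `ValueError`, because the model is a logical component and cannot host anything.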

Application

An application is a business concept. It is a set of logical components combined by some code or workflow to deliver a business KPI. Each of these logical components is hosted on an infrastructure component. An application has a unique name/ID, a friendly name, and a description, and can include additional business metadata.

Linking logical components within an application

Each application has a set of logical components stitched together. The output of one logical component is consumed by another, e.g. a Langchain workflow uses a foundational model and consumes its inference output. Note that a given logical component can be used in more than one application, e.g. a GPT 3 model used by two different applications.

## Traces

A trace is the full view of a single end-to-end application run, e.g. a Chatbot application providing a response to an end user’s question. A trace consists of metadata about the application run, including status, start time, duration, and inputs/outputs. It also includes a list of individual steps, aka “spans”, with details about each step. It’s typically the workflow code components of an application that generate the traces for application runs.

Spans

Spans are the individual steps executed by the application to perform a GenAI-related task, e.g. the app retrieving vectors from a DB or querying an LLM for inference. A span includes the type of operation, start time, duration, and metadata relevant to that step, e.g. the model name, parameters, and model endpoint/server for an inference request. It’s typically the workflow code components of an application that generate the traces for application runs.
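To make the relationship between a trace and its spans concrete, here is a minimal sketch of one chatbot run. The `Trace` and `Span` classes, their field names, and all sample values (timings, store name, endpoint URL) are assumptions for illustration, not the format Okahu actually emits.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Span:
    operation: str       # type of operation, e.g. retrieval or inference
    start_time: datetime
    duration: timedelta
    metadata: dict = field(default_factory=dict)  # step-specific details


@dataclass
class Trace:
    application: str
    status: str
    start_time: datetime
    duration: timedelta
    inputs: str
    outputs: str
    spans: list = field(default_factory=list)  # individual steps of the run

    def time_in_spans(self) -> timedelta:
        """Sum of the durations of all recorded spans."""
        return sum((s.duration for s in self.spans), timedelta())


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
trace = Trace(
    application="Customer Chatbot",
    status="success",
    start_time=start,
    duration=timedelta(milliseconds=1200),
    inputs="Where is my order?",
    outputs="It ships tomorrow.",
    spans=[
        Span("vector_retrieval", start, timedelta(milliseconds=150),
             {"store": "orders-vectors"}),
        Span("llm_inference", start + timedelta(milliseconds=150),
             timedelta(milliseconds=900),
             {"model": "gpt-3.5-turbo",
              "parameters": {"temperature": 0.2},
              "endpoint": "https://llm.example/v1"}),
    ],
)

# Time not covered by any span (workflow overhead in this made-up run).
overhead = trace.duration - trace.time_in_spans()
```

The trace carries the run-level metadata, while each span carries only what is relevant to its own step, such as the model name and endpoint for the inference request.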

Example - Stitching all together

Application logical view

Consider an LLM ecosystem at Contoso Inc, which has deployed two applications: Customer Chatbot and Recommendation Copilot.

Logical components

Application infrastructure view

Logical and Infra components

The picture above illustrates the full view of the applications and all of their components.

System layout

System map view

As one can see, various components are reused across multiple applications. The system map shows the relationships among components and between components and applications. For example, all applications use Azure, and two different models used by two different applications are hosted on the same Triton inference server.
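A system map of this kind can be approximated with two lookups: which logical components each application uses, and which component hosts which. The sketch below is hypothetical throughout: the component names, the choice of which components are shared, and the exact hosting layout are invented for illustration and are not read from the Contoso diagrams.

```python
from collections import defaultdict

# application -> logical components it uses (hypothetical names)
uses = {
    "Customer Chatbot": ["chatbot-workflow", "model-a", "vector-store"],
    "Recommendation Copilot": ["copilot-workflow", "model-b", "vector-store"],
}

# component -> the component that hosts it (hypothetical layout)
hosted_by = {
    "model-a": "triton-server",
    "model-b": "triton-server",
    "triton-server": "azure",
    "chatbot-workflow": "azure",
    "copilot-workflow": "azure",
    "vector-store": "azure",
}


def with_hosts(component):
    """Return the component followed by everything it is transitively hosted on."""
    seen = []
    while component is not None and component not in seen:
        seen.append(component)
        component = hosted_by.get(component)
    return seen


def applications_by_component():
    """Map every component to the set of applications that depend on it."""
    result = defaultdict(set)
    for app, components in uses.items():
        for component in components:
            for node in with_hosts(component):
                result[node].add(app)
    return result


system_map = applications_by_component()
shared = sorted(name for name, apps in system_map.items() if len(apps) > 1)
print(shared)
```

In this made-up layout, `shared` lists the components that more than one application depends on, directly or through hosting, which is the kind of reuse the system map is meant to surface.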