Okahu Concepts

Components

Components are the core building blocks of the user’s AI ecosystem. Every component has a unique identifier, a friendly name, a description, and a type. A component can optionally include a set of attributes relevant to its type. For example, a Triton server specifies its connection endpoint, and a Model includes its parameters.
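As an illustration of the fields listed above, a component record could be sketched as follows. This is a minimal sketch, not Okahu's actual schema: the class name `Component`, its field names, the type strings, and the example endpoint are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Component:
    """Hypothetical component record; field names are assumptions."""
    component_id: str    # unique identifier
    name: str            # friendly name
    description: str
    component_type: str  # e.g. "triton_server" or "model" (made-up labels)
    # Optional attributes whose keys depend on the component type.
    attributes: dict[str, Any] = field(default_factory=dict)


# A Triton server carries its connection endpoint as an attribute.
triton = Component(
    component_id="triton-001",
    name="Triton inference server",
    description="Serves models for the chatbot",
    component_type="triton_server",
    attributes={"endpoint": "http://triton.example.internal:8000"},
)

# A model carries its parameters instead.
model = Component(
    component_id="model-001",
    name="Chat model",
    description="Model used for question answering",
    component_type="model",
    attributes={"temperature": 0.2, "max_tokens": 512},
)
```

Keeping the type-specific values in a free-form `attributes` mapping is one way to let every component share the same four required fields while still varying by type.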

Infrastructure Component

An infrastructure component is an instance of a service, such as an NVIDIA Triton server instance, Kubernetes cluster instance, or Postgres database server. Every instance of a service type is a separate component; for example, RDS Postgres Finance-Service-EastUS and Sales-Service-WestUS are both Postgres servers but are two different components.

Logical Component

A logical component is a piece of code or data, such as workflow code, a GPT-3.5 Turbo model, or a vector data set. Every copy of such code or data is a separate component.

Dependencies Between Components

Components can depend on one another. A component can be 'hosted' by another component: for example, a Model is hosted on an inference server, and the inference server is hosted on Kubernetes. Note that a logical component can be hosted on an infrastructure component but cannot host any other component.
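The hosting rule above can be sketched as a small check. This is an illustrative sketch under assumptions: `Kind`, `Component`, `host`, and `hosting_chain` are hypothetical names, not part of any Okahu API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    INFRASTRUCTURE = "infrastructure"
    LOGICAL = "logical"


@dataclass
class Component:
    component_id: str
    kind: Kind
    hosted_by: Optional["Component"] = None


def host(guest: Component, on: Component) -> None:
    """Record that `guest` is hosted by `on`; logical components cannot host."""
    if on.kind is Kind.LOGICAL:
        raise ValueError(f"{on.component_id} is logical and cannot host components")
    guest.hosted_by = on


def hosting_chain(component: Component) -> list[str]:
    """Follow hosted_by links from a component down to its root host."""
    chain = []
    current: Optional[Component] = component
    while current is not None:
        chain.append(current.component_id)
        current = current.hosted_by
    return chain


k8s = Component("k8s-cluster", Kind.INFRASTRUCTURE)
triton = Component("triton-server", Kind.INFRASTRUCTURE)
model = Component("gpt-model", Kind.LOGICAL)

host(triton, on=k8s)    # the inference server is hosted on Kubernetes
host(model, on=triton)  # the model is hosted on the inference server

try:
    host(triton, on=model)  # rejected: a logical component cannot be a host
except ValueError as err:
    print(err)
```

Calling `hosting_chain(model)` walks from the model through the inference server to the cluster, which mirrors the Model, inference server, Kubernetes example in the text.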

Application

An application is a business concept: a set of logical components combined by code or a workflow to deliver specific business KPIs. Each of these logical components is hosted on an infrastructure component. An application has a unique name/ID, a friendly name, and a description, and can include additional business metadata.

Linking

Each application has a set of logical components stitched together so that the output of one logical component is consumed by another. For example, a LangChain workflow calls a foundation model and consumes the inference output. Note that a given logical component can be used in more than one application, such as a GPT-3 model used by two different applications.
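One way to picture linking is as a list of producer/consumer pairs per application. The sketch below is hypothetical: the `Application` class, its fields, the `shared_components` helper, and all identifiers are made up for illustration and do not reflect how Okahu stores links.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Application:
    app_id: str
    name: str
    description: str
    # Each link is (producer_id, consumer_id): the consumer uses the producer's output.
    links: list[tuple[str, str]] = field(default_factory=list)

    def component_ids(self) -> set[str]:
        """All logical components that appear in at least one link."""
        return {component_id for link in self.links for component_id in link}


def shared_components(apps: list[Application]) -> dict[str, set[str]]:
    """Map each component used by more than one application to those applications."""
    usage: dict[str, set[str]] = defaultdict(set)
    for app in apps:
        for component_id in app.component_ids():
            usage[component_id].add(app.app_id)
    return {cid: app_ids for cid, app_ids in usage.items() if len(app_ids) > 1}


chatbot = Application(
    app_id="chatbot",
    name="Support chatbot",
    description="Answers end-user questions",
    links=[("gpt-3", "chatbot-workflow"), ("faq-vectors", "chatbot-workflow")],
)
summarizer = Application(
    app_id="summarizer",
    name="Ticket summarizer",
    description="Summarizes support tickets",
    links=[("gpt-3", "summarizer-workflow")],
)

shared = shared_components([chatbot, summarizer])
```

Here `shared` contains only the `gpt-3` entry, since that model is the one component both applications link to.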

Traces

A trace is the full view of a single end-to-end application run, such as a chatbot application responding to an end user’s question. A trace consists of metadata about the application run, including status, start time, duration, and inputs/outputs. It also includes a list of individual steps, known as "spans," with details about each step.

It’s typically the workflow code components of an application that generate the traces for application runs.
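The trace fields described above could be modeled roughly as follows. This is a sketch only: the `Trace` and `Span` classes, their field names, and the sample values (timestamps, durations, text) are invented for illustration and are not taken from Okahu.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Span:
    operation: str
    start_time: datetime
    duration: timedelta


@dataclass
class Trace:
    application: str
    status: str
    start_time: datetime
    duration: timedelta
    input: str
    output: str
    spans: list[Span] = field(default_factory=list)  # individual steps of the run


# Illustrative values only.
start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
trace = Trace(
    application="chatbot",
    status="success",
    start_time=start,
    duration=timedelta(milliseconds=900),
    input="What is a span?",
    output="A span is one step of an application run.",
    spans=[
        Span("vector_retrieval", start, timedelta(milliseconds=200)),
        Span("llm_inference", start + timedelta(milliseconds=200),
             timedelta(milliseconds=650)),
    ],
)

# Time accounted for by the recorded steps, to compare against the whole run.
time_in_spans = sum((s.duration for s in trace.spans), timedelta())
```

Comparing `time_in_spans` with `trace.duration` shows how much of the run is explained by the recorded steps and how much is overhead in the workflow code itself.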

Spans

Spans are the individual steps executed by the application to perform a GenAI-related task, for example, an app retrieving vectors from a database or querying an LLM for inference. Each span includes the type of operation, start time, duration, and metadata relevant to that step, such as the model name, parameters, and model endpoint/server for an inference request.

As with traces, it’s typically the workflow code components of an application that generate the spans for each step of an application run.
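To show how workflow code might record a span around each step, here is a minimal sketch using a context manager. It is not Okahu's instrumentation: the `span` helper, the `recorded_spans` list, the metadata keys, and the endpoint URL are all assumptions made for the example.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Any, Iterator


@dataclass
class Span:
    operation: str          # type of operation
    start_time: float       # wall-clock start, seconds since the epoch
    duration: float = 0.0   # seconds, filled in when the step finishes
    metadata: dict[str, Any] = field(default_factory=dict)


recorded_spans: list[Span] = []


@contextmanager
def span(operation: str, **metadata: Any) -> Iterator[Span]:
    """Time the enclosed step and record it, even if the step raises."""
    current = Span(operation, start_time=time.time(), metadata=dict(metadata))
    started = time.perf_counter()
    try:
        yield current
    finally:
        current.duration = time.perf_counter() - started
        recorded_spans.append(current)


with span("vector_retrieval", database="faq-vectors"):
    pass  # placeholder for the vector database query

with span(
    "llm_inference",
    model_name="gpt-3.5-turbo",
    endpoint="https://inference.example.internal",
    parameters={"temperature": 0.2},
):
    pass  # placeholder for the inference request
```

Using `time.perf_counter()` for the duration and `time.time()` for the start keeps the elapsed measurement monotonic while still giving each span a wall-clock timestamp.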