LettreAI Documentation
Core Concepts

AI Task is built around a hierarchical organization of components that work together to process data. Understanding this hierarchy is essential for effective use of the framework.

Organizational Hierarchy

AI Task uses a three-level organizational structure:

  1. Productions: High-level containers that organize related pipelines
  2. Pipelines: Sequences of processing steps
  3. Tasks: Individual operations within pipelines

This hierarchy allows for flexible and powerful workflows while maintaining clean organization.

Working Directory Requirements

Critical Directory Requirements

All AI Task commands, especially ai-partitur, MUST be executed from the project root directory to ensure proper path resolution. This is the base directory containing your partitur/, instruction/, and other directories.

# CORRECT: Run from project root directory
cd /path/to/project_root
ai-partitur profile_document 30727

# INCORRECT: Will cause path resolution errors
cd /path/to/project_root/partitur
ai-partitur profile_document 30727

The expected standard project structure:

project_root/
├── partitur/          # Contains all partitur files
├── instruction/       # Contains all template files
├── function/          # Contains custom functions
├── profile/           # Contains all profile data
└── diagnostic/        # Contains logs and error tracking
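The layout above can be bootstrapped with standard shell commands (directory names taken directly from the structure shown; the project_root name is illustrative):

```shell
# Create the standard AI Task project layout from scratch;
# run all subsequent ai-partitur commands from inside project_root/
mkdir -p project_root/partitur project_root/instruction \
         project_root/function project_root/profile project_root/diagnostic
```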

Productions

Productions are the highest-level organizational containers in AI Task. They represent collections of related pipelines that follow the same pattern but may process different inputs.

Similar to theatrical or musical productions, an AI Task production encompasses multiple “performances” (instances) of related pipelines. Each instance is identified by a unique ID and maintains its own execution history and outputs.

Productions are configured using a dedicated section in pipeline files:

production:
  name: "profiles"
  home: "/path/to/production_home"
  id_format: "{name}_{id:05d}"  # Format string for instance directories
  settings_dir: "settings"       # Name of the settings directory

For more details, see the Productions section.
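The id_format field uses Python format-string syntax, so you can preview how an instance directory name is derived. A quick sketch with illustrative values:

```python
# id_format is a Python format string; {id:05d} zero-pads the ID to 5 digits
id_format = "{name}_{id:05d}"

# Derive the directory name for instance 42 of the "profiles" production
instance_dir = id_format.format(name="profiles", id=42)
print(instance_dir)  # profiles_00042
```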

Pipelines and Partiturs

The central concept in AI Task is that every AI operation is a pipeline. A pipeline represents a sequence of one or more processing steps (tasks) that transform input data into output data. Even when a pipeline contains just a single task, it’s still conceptually a pipeline, which simplifies the mental model and avoids introducing separate concepts.

Each pipeline is defined in a YAML configuration file called a “partitur” (from the musical term for a complete score), typically with a .yml or .ai extension:

name: "Example Pipeline"
description: "A pipeline that processes text"

# Optional production configuration
production:
  name: "text_analysis"
  home: "./productions"

pipe:
  - type: function
    name: "input_loader"
    function: aisource
    params:
      file: "input.txt"
  
  - type: llm
    name: "processor"
    tmpl: "process"
    model: "claude-3-7-sonnet-latest"
  
  - type: function
    name: "output_writer"
    function: airesult
    params:
      file: "output.txt"
      format: "text"
      
settings:
  function_dir: "function"  # Directory for custom functions
  continue_on_error: false  # Whether to continue on error

Tasks

A task is a single step in a pipeline. Each task takes input data, processes it in some way, and produces output data. AI Task supports several types of tasks:

Function Tasks

A function task executes a Python function that processes the input data. AI Task comes with several built-in functions, and you can also define your own custom functions.

- type: function
  name: "input_loader"
  function: aisource
  params:
    file: "input.txt"
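The exact interface AI Task expects from custom functions is documented in the Functions section; as a rough sketch, assuming a function receives the pipeline data dictionary and returns an updated one (this dict-in/dict-out signature is an assumption, not the confirmed API), a custom function placed in the function/ directory might look like:

```python
# Hypothetical custom function: the dict-in/dict-out signature is an
# assumption for illustration, not the documented AI Task interface.
def word_counter(data: dict) -> dict:
    """Count words in the incoming text and attach the result."""
    text = data.get("pipe.pipein-text", "")
    data["word_count"] = len(text.split())
    # Pass the text through unchanged for the next task
    data["pipe.pipeout-text"] = text
    return data
```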

LLM Tasks

An LLM task sends the input data to a large language model (LLM) for processing. The LLM task uses an instruction to format the input data before sending it to the model.

- type: llm
  name: "processor"
  tmpl: "process"
  model: "claude-3-7-sonnet-latest"

Instructions

An instruction (formerly called a template) is a text pattern that defines how to format the input data for an LLM task. Instructions use the Jinja2 templating language and can include variables, conditionals, and loops.

Process the following text and extract the key points:

{{ pipe.pipein-text }}
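Because instructions are standard Jinja2, you can preview how one renders outside the pipeline. A minimal sketch using a simplified variable name (the exact context variables AI Task injects, such as pipe.pipein-text, are set up by the engine and not reproduced here):

```python
from jinja2 import Template

# Simplified instruction; the real pipeline injects its own pipe.* context
instruction = Template(
    "Process the following text and extract the key points:\n\n{{ text }}"
)
rendered = instruction.render(text="AI Task pipelines pass data between tasks.")
print(rendered)
```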

Executing AI Workflows

The primary way to execute AI Task workflows is through the ai-partitur command-line interface:

Command-Line Interface Usage
# Standard command format (preferred)
ai-partitur <partitur_name> <profile_id>

# Examples
ai-partitur inqua_objektiv 30727
ai-partitur profile_document 34611

# File-specific usage (only when necessary)
ai-partitur --file partitur/specialized_workflow.ai workflow_name profile_id

# Processing multiple profiles
for profile_id in 30727 34611 32101; do
    ai-partitur profile_document $profile_id
done

Remember: Always run commands from the project root directory to ensure proper path resolution.

For programmatic access, you can use the Orchestra class:

from ai_task.orchestra import Orchestra

# Initialize the Orchestra with a partitur file
orchestra = Orchestra(pipe_file="partitur/workflow_name.yml", id="12345")

# Execute the workflow
orchestra.run()

Data Flow

Data flows through the pipeline from one task to the next. Each task receives the output of the previous task as its input. The data is represented as a dictionary with key-value pairs.

The special key pipe.pipein-text is used to pass text data between tasks. When a task produces text output, it stores it in the pipe.pipeout-text key, which will be automatically mapped to pipe.pipein-text for the next task.

Input Dictionary → Task 1 → Output Dictionary → Task 2 → ... → Final Output
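This hand-off can be sketched in plain Python. The following is a simplified model of the data flow described above, not the engine's actual implementation:

```python
def run_pipeline(tasks, data):
    """Simplified model of AI Task data flow: each task receives the previous
    task's output dict, with pipe.pipeout-text remapped to pipe.pipein-text."""
    for task in tasks:
        data = task(data)
        if "pipe.pipeout-text" in data:
            data["pipe.pipein-text"] = data.pop("pipe.pipeout-text")
    return data

# Two toy tasks: a loader that produces text, and an uppercaser that consumes it
load = lambda d: {**d, "pipe.pipeout-text": "hello"}
shout = lambda d: {**d, "pipe.pipeout-text": d["pipe.pipein-text"].upper()}

result = run_pipeline([load, shout], {})
print(result["pipe.pipein-text"])  # HELLO
```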

Context Variables

In addition to the pipeline data, tasks have access to context variables that provide information about the execution environment:

  • Pipeline context: Information about the current pipeline
  • Production context: Information about the current production (if applicable)
  • System context: Information about the system environment

These contexts can be accessed in instructions and function tasks to customize behavior.

Workflow Variables

AI Task uses standardized variables denoted by curly braces:

  • {production}: The name of the production
  • {id}: The numerical identifier of a profile/performance
  • {genre}: The category of documents being processed
  • {no}: The sequential number of a document within its genre

These variables ensure consistency throughout the pipeline and can be referenced from any part of the system.
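These variables follow Python format-string syntax, so a path template can be resolved with standard string formatting. A sketch with an illustrative path layout (the directory scheme here is hypothetical, not prescribed by AI Task):

```python
# Hypothetical output path built from the standardized workflow variables
path_template = "profile/{production}/{id}/{genre}_{no}.txt"

variables = {"production": "profiles", "id": 30727, "genre": "report", "no": 1}
output_path = path_template.format_map(variables)
print(output_path)  # profile/profiles/30727/report_1.txt
```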

Troubleshooting

If you encounter issues with AI Task:

  1. Working Directory: Verify you’re running from the project root directory
  2. Path Resolution: Ensure all paths in partitur files are relative to the project root
  3. Function Directory: For function-based partitur files, verify the function_dir setting
  4. File Existence: Check that all referenced files exist
  5. Fallback Method: Try the direct Python approach for more control

Next Steps

Now that you understand the core concepts of AI Task, you can:

  • Learn about configuration options
  • Explore instructions in depth
  • Understand productions
  • Discover built-in functions
  • Learn about LLM integration
  • License: MIT