AI Partitur Orchestration System Guide

The AI Partitur orchestration system is the core workflow engine of the AI Task framework. It configures and runs multi-step AI workflows that are defined in YAML files.

Key Concepts

Terminology

Partitur: A configuration file that defines a processing pipeline (from the German musical term for a score or orchestration)
Profile: A collection of related files identified by a unique ID (e.g., profile_32101)
Step: A single operation in a processing pipeline (transcribe, analyze, convert, etc.)
Pipe: The complete sequence of steps defined in a partitur file
Instruction: A template containing prompts and instructions for AI models

Components

Partitur Files: YAML configuration defining the workflow
Instruction Templates: Jinja2 templates for AI model prompts
Functions: Processing functions for non-LLM operations
AI Models: Large language models such as Gemini for content processing

Command-Line Interface

The ai-partitur command provides a simple interface to the orchestration system:

ai-partitur <partitur_name> <profile_id> [--overwrite] [--file <direct_path>]

Arguments:

<partitur_name>: Name of the partitur file (without .yml extension)
<profile_id>: Unique identifier for the profile to process
--overwrite: Force regeneration of existing output files
--file: Direct path to a partitur file (bypasses name lookup)

Examples:

# Run the inqua_full partitur for profile 32101
ai-partitur inqua_full 32101

# Force regeneration of outputs
ai-partitur inqua_full 32101 --overwrite

# Use a specific partitur file
ai-partitur --file /path/to/custom.yml 32101
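
The invocations above could be parsed roughly as follows. This is a minimal sketch using Python's standard argparse module, not the actual ai-partitur implementation; the function names build_parser and parse_cli are hypothetical, and the rule that the partitur name may be omitted when --file is given is an assumption drawn from the third example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented ai-partitur arguments."""
    parser = argparse.ArgumentParser(prog="ai-partitur")
    # One or two positionals: <partitur_name> <profile_id>,
    # or only <profile_id> when --file is used (assumption).
    parser.add_argument("names", nargs="+", metavar="ARG")
    parser.add_argument("--overwrite", action="store_true",
                        help="force regeneration of existing output files")
    parser.add_argument("--file", default=None, metavar="direct_path",
                        help="direct path to a partitur file")
    return parser


def parse_cli(argv):
    """Parse argv and return a plain dict of the recognised settings."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.file is not None and len(args.names) == 1:
        partitur_name, profile_id = None, args.names[0]
    elif len(args.names) == 2:
        partitur_name, profile_id = args.names
    else:
        parser.error("expected <partitur_name> <profile_id>")
    return {
        "partitur_name": partitur_name,
        "profile_id": profile_id,
        "overwrite": args.overwrite,
        "file": args.file,
    }
```

For example, parse_cli(["inqua_full", "32101", "--overwrite"]) yields the partitur name, the profile ID, and overwrite set to True.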

Directory Structure

The partitur system uses a standardized directory structure:

project/
├── partitur/            # Orchestration files (.yml)
├── instruction/         # Instruction templates (.j2)
└── profile/             # Data organized by profile ID
    └── profile_<id>/    # Individual profile directories
        ├── audio/       # Audio input files
        ├── transcription/ # Transcription outputs
        └── analysis/    # Analysis outputs

Global directories also exist at ~/.ai/:

~/.ai/
├── partitur/            # Global partitur files
└── instruction/         # Global instruction templates

File Resolution

Partitur files are resolved in the following order:

1. Direct path (if --file is used)
2. Current directory (./<name>.yml)
3. Local partitur directory (./partitur/<name>.yml)
4. Global partitur directory (~/.ai/partitur/<name>.yml)
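
The lookup order can be illustrated with a short Python sketch. The function name resolve_partitur and its cwd and home parameters are hypothetical and exist only to make the example testable. The sketch also assumes that a direct path which does not exist ends the search instead of falling back to name lookup, since --file is described as bypassing it.

```python
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def resolve_partitur(name: Optional[str],
                     direct_path: Optional[PathLike] = None,
                     cwd: Optional[PathLike] = None,
                     home: Optional[PathLike] = None) -> Optional[Path]:
    """Return the first existing partitur file, or None if nothing matches."""
    # 1. A direct path bypasses name lookup entirely (assumption: no fallback).
    if direct_path is not None:
        direct = Path(direct_path)
        return direct if direct.is_file() else None

    if not name:
        return None

    cwd_dir = Path(cwd) if cwd is not None else Path.cwd()
    home_dir = Path(home) if home is not None else Path.home()
    candidates = [
        cwd_dir / f"{name}.yml",                       # 2. current directory
        cwd_dir / "partitur" / f"{name}.yml",          # 3. local partitur directory
        home_dir / ".ai" / "partitur" / f"{name}.yml",  # 4. global partitur directory
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Returning None instead of raising leaves it to the caller to decide how to report a missing partitur file.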

Partitur File Format

Partitur files are YAML documents with the following structure:

name: example_workflow
description: "Example workflow for processing audio data"
template_dir: "instruction"  # Directory containing instruction templates

pipe:
  # Step definition format:
  - name: step_name        # Unique name for the step
    type: step_type        # Type of step (llm or function)
    # ... step-specific parameters ...

LLM Step Format

- name: transcribe_audio
  type: llm
  model: gemini-1.5-pro    # AI model to use
  tmpl: "template_name.j2" # Instruction template file
  source-file: "path/to/input_file.ext"  # Input file
  result-file: "path/to/output_file.ext" # Output file
  overwrite: true          # Optional: force overwrite existing file

Function Step Format

- name: convert_docx
  type: function
  function: convert_to_docx  # Name of function to execute
  params:                     # Parameters for the function
    source: "path/to/input.txt"
    result: "path/to/output.docx"
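
As an illustration of the two step formats, the following Python sketch checks already-parsed step dictionaries for the keys shown above. It is not part of the framework: validate_step, validate_pipe, and the REQUIRED_KEYS table are hypothetical names, and treating every listed key except overwrite as required is an assumption.

```python
from typing import Dict, List

# Keys each step type is assumed to need, based on the formats shown above.
REQUIRED_KEYS = {
    "llm": ("model", "tmpl", "source-file", "result-file"),
    "function": ("function", "params"),
}


def validate_step(step: Dict) -> List[str]:
    """Return a list of human-readable problems for one step (empty if fine)."""
    problems = []
    name = step.get("name", "<unnamed>")
    if "name" not in step:
        problems.append("step is missing 'name'")
    step_type = step.get("type")
    if step_type not in REQUIRED_KEYS:
        problems.append(f"{name}: unknown type {step_type!r}")
        return problems
    for key in REQUIRED_KEYS[step_type]:
        if key not in step:
            problems.append(f"{name}: missing '{key}'")
    return problems


def validate_pipe(pipe: List[Dict]) -> List[str]:
    """Validate every step and flag duplicate step names."""
    problems = []
    seen = set()
    for step in pipe:
        problems.extend(validate_step(step))
        name = step.get("name")
        if name is not None:
            if name in seen:
                problems.append(f"duplicate step name {name!r}")
            seen.add(name)
    return problems
```

Running a check like this before any model call surfaces configuration mistakes early, when they are cheapest to fix.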

Path Handling in Partitur Files

Paths in partitur files can be specified in different ways:

Absolute Paths: Starting with / (e.g., /home/user/data/file.txt)
Relative Paths: Relative to the partitur file location (e.g., profile/data/file.txt)
Parent Directory Paths: Using ../ to reference parent directories (e.g., ../resources/file.txt)
Variable Paths: Using {{id}} as a placeholder for the profile ID (e.g., profile/profile_{{id}}/file.txt)
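
The four path forms can be handled by one small helper, sketched here in Python. The name expand_path is hypothetical. The sketch assumes POSIX-style paths (hence PurePosixPath), tolerates optional whitespace inside the placeholder braces, and replaces the placeholder with a plain substitution; the real system may render paths differently.

```python
import posixpath
import re
from pathlib import PurePosixPath

# Matches {{id}} and also {{ id }} (whitespace tolerance is an assumption).
ID_PLACEHOLDER = re.compile(r"\{\{\s*id\s*\}\}")


def expand_path(raw: str, profile_id, partitur_file: str) -> PurePosixPath:
    """Substitute the profile ID and resolve relative to the partitur file."""
    text = ID_PLACEHOLDER.sub(str(profile_id), raw)
    path = PurePosixPath(text)
    if not path.is_absolute():
        # Relative paths are anchored at the directory holding the partitur file.
        path = PurePosixPath(partitur_file).parent / path
    # Collapse "../" segments lexically, without touching the filesystem.
    return PurePosixPath(posixpath.normpath(str(path)))
```

With a partitur file at /proj/partitur/x.yml, the path ../resources/file.txt expands to /proj/resources/file.txt.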

Example Complete Partitur File

name: inqua_full
description: "Complete pipeline for INQUA2 processing"
template_dir: "instruction"

pipe:
  # Step 1: Transcribe audio
  - name: transcribe_audio
    type: llm
    model: gemini-1.5-pro
    tmpl: "INQUA_transcribe10_ger.j2"
    source-file: "profile/profile_{{id}}/audio/document_{{id}}.m4a"
    result-file: "profile/profile_{{id}}/transcription/INQUA2_{{id}}_transcription_01.txt"
    overwrite: true

  # Step 2: Convert transcription to DOCX
  - name: convert_to_docx
    type: function
    function: convert_to_docx
    params:
      source: "profile/profile_{{id}}/transcription/INQUA2_{{id}}_transcription_01.txt"
      result: "profile/profile_{{id}}/transcription/INQUA2_{{id}}_transcription_01.docx"

  # Step 3: Generate sequence analysis
  - name: generate_sequence
    type: llm
    model: gemini-1.5-pro
    tmpl: "INQUA_sequence.j2"
    source-file: "profile/profile_{{id}}/transcription/INQUA2_{{id}}_transcription_01.txt"
    result-file: "profile/profile_{{id}}/sequence/INQUA2_{{id}}_sequence_01.txt"
    overwrite: true
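
The overwrite settings suggest that a step whose output already exists is skipped unless regeneration is forced. The following Python sketch shows that decision for parsed steps like the ones above. It rests on that assumption, and the names should_run and plan_steps are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def should_run(result_path: str, step_overwrite: bool = False,
               cli_overwrite: bool = False) -> bool:
    """Run when regeneration is forced or when the output does not exist yet."""
    if step_overwrite or cli_overwrite:
        return True
    return not Path(result_path).exists()


def plan_steps(steps: List[Dict], cli_overwrite: bool = False) -> List[str]:
    """Return the names of the steps that would execute, in pipe order."""
    to_run = []
    for step in steps:
        # llm steps use 'result-file'; function steps use params['result'].
        result = step.get("result-file") or step.get("params", {}).get("result")
        step_overwrite = bool(step.get("overwrite", False))
        if result is None or should_run(result, step_overwrite, cli_overwrite):
            to_run.append(step["name"])
    return to_run
```

Under this assumption, the example file would always rerun both llm steps because they set overwrite: true, while the DOCX conversion would be skipped once its output exists.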

Creating Instruction Templates

Instruction templates are Jinja2 templates that provide instructions to AI models. They are stored with a .j2 extension.

Example template for transcription (INQUA_transcribe10_ger.j2):

You are a professional German transcriber. Transcribe the following audio recording accurately.
Use the following format:

[timestamp] Speaker: spoken content

Rules:
1. Use "I:" for the interviewer and "K:" for the interviewee
2. Include timestamps approximately every 1-2 minutes in [MM:SS] format
3. Transcribe word-for-word without summarizing
4. Do not add or remove content
5. Maintain all speech patterns, including filler words
6. Indicate unclear speech with [unclear]
7. Note background sounds in [brackets] when relevant

Begin transcription:

Best Practices

Organize by Project: Keep partitur files, instructions, and profiles together in a project directory
Use Descriptive Names: Choose meaningful names for partitur files and steps
Standardize File Paths: Follow consistent patterns for file organization
Version Outputs: Include version numbers in output file names (e.g., _01.txt)
Use Path Variables: Use {{id}} placeholders for profile IDs to make partitur files reusable

Troubleshooting

Common Issues

File not found errors: Check that all paths are correct and directories exist
Template errors: Verify that template files are in the correct location and format
Path resolution problems: Relative paths are resolved from the partitur file location, not from the directory where the command is run; use an absolute path if in doubt
Function not found: Ensure custom functions are properly defined and imported

Debugging Techniques

Use --overwrite to force regeneration of outputs
Check the console output for detailed error messages
Verify file paths by manually checking file existence
Inspect template files for syntax errors
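
Checking file existence by hand can be automated. The Python sketch below lists the input files of parsed steps that are missing on disk. The name missing_sources is hypothetical, base_dir stands for the directory against which relative paths are resolved, and inputs that an earlier step in the same pipe produces are not reported, because they are expected to be absent before the first run.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def missing_sources(steps: List[Dict], profile_id,
                    base_dir) -> List[Tuple[str, str]]:
    """Return (step name, path) pairs for inputs that do not exist."""
    base = Path(base_dir)
    produced = set()  # outputs of earlier steps in the same pipe
    missing = []

    def resolve(raw: str) -> Path:
        path = Path(raw.replace("{{id}}", str(profile_id)))
        return path if path.is_absolute() else base / path

    for step in steps:
        params = step.get("params", {})
        source = step.get("source-file") or params.get("source")
        result = step.get("result-file") or params.get("result")
        if source is not None:
            path = resolve(source)
            if path not in produced and not path.exists():
                missing.append((step.get("name", "<unnamed>"), str(path)))
        if result is not None:
            produced.add(resolve(result))
    return missing
```

An empty result means every external input was found; otherwise each pair names the step and the exact path that was checked.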