ResearchX Docs

Execution Platform

Admin read-only view of the deployment-managed execution platform configuration

The Execution Platform page shows the deployment-managed runtime backend configuration used to orchestrate containers. The admin page is read-only: the runtime mode and its backend settings come from deployment configuration, not from in-app edits.

Overview

The execution platform determines how the system creates and manages containers:

  • Docker mode: Creates and manages containers via Docker Engine
  • Kubernetes mode: Uses the Kubernetes runtime backend and cluster resources
  • Slurm mode: Schedules process actions with Slurm and runs workloads through Apptainer/SIF

Admin View

Entry: Admin → Execution Platform (/workspace/admin/execution-platform)

Current Settings

  • Runtime mode: Selected by EXECUTION_PLATFORM_MODE
  • Kubernetes settings: Read from deployment environment variables when Kubernetes mode is enabled
  • Slurm settings: Read from deployment environment variables when Slurm mode is enabled

Deployment Variables

When EXECUTION_PLATFORM_MODE=kubernetes, configure:

  • K8S_NAMESPACE
  • K8S_SERVICE_ACCOUNT
  • K8S_WORKSPACE_PVC_NAME
  • K8S_WORKSPACE_ROOT_PATH
  • K8S_IMAGE_PULL_SECRET_NAME (optional)
  • K8S_REGISTRY_REPO_PREFIX (optional)
  • K8S_JOB_TTL_SECONDS (optional)
  • RESEARCHX_K8S_EPHEMERAL_JOB_TIMEOUT_SEC (optional, defaults to 600; set to 0 to wait indefinitely when no runtime or step timeout is configured)
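A pre-deployment check for the Kubernetes variables listed above might look like the following. This is a minimal sketch, not part of ResearchX: the function name is hypothetical, and treating every variable not marked "optional" as required is an assumption drawn from the list.

```python
from typing import List, Mapping

# Variables the list above does not mark as optional. Treating them as
# required is an assumption of this sketch.
REQUIRED_K8S_VARS = (
    "K8S_NAMESPACE",
    "K8S_SERVICE_ACCOUNT",
    "K8S_WORKSPACE_PVC_NAME",
    "K8S_WORKSPACE_ROOT_PATH",
)


def missing_k8s_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required Kubernetes variables that are unset or blank.

    Returns an empty list when Kubernetes mode is not selected.
    """
    if env.get("EXECUTION_PLATFORM_MODE") != "kubernetes":
        return []
    return [name for name in REQUIRED_K8S_VARS if not env.get(name, "").strip()]
```

Calling `missing_k8s_vars(os.environ)` before restarting the service gives a quick list of settings that still need values.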

K8S_REGISTRY_REPO_PREFIX is applied to Kubernetes runtime images that do not include an explicit registry host. For example, python:3.12-slim becomes <prefix>/python:3.12-slim, while ghcr.io/acme/tool:latest is left unchanged.
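The prefix rule can be illustrated with a short sketch. How ResearchX actually detects an explicit registry host is not stated here; the heuristic below follows the common Docker image-reference convention (the first path component names a registry when it contains a dot or colon, or is `localhost`), and both function names are hypothetical.

```python
def has_registry_host(image: str) -> bool:
    """Guess whether an image reference starts with a registry host.

    Assumption: uses the Docker convention that the first path component is
    a registry when it contains '.' or ':' or equals 'localhost'.
    """
    if "/" not in image:
        return False
    first = image.split("/", 1)[0]
    return "." in first or ":" in first or first == "localhost"


def apply_repo_prefix(image: str, prefix: str) -> str:
    """Prepend the prefix only to images without an explicit registry host."""
    if not prefix or has_registry_host(image):
        return image
    return prefix.rstrip("/") + "/" + image
```

With a hypothetical prefix of `registry.example.com/mirror`, `python:3.12-slim` maps to `registry.example.com/mirror/python:3.12-slim`, while `ghcr.io/acme/tool:latest` is returned unchanged.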

For long-running process actions, set the runtime config.timeouts.max_execution_sec or the agent step timeout_sec to the desired wall-clock limit. A value of 0 disables the application-layer wait timeout, which is useful for Kubernetes Jobs that can run longer than 24 hours.
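One way to picture how these timeout settings combine is the sketch below. It is illustrative only: the function name is hypothetical, and the precedence between the step timeout and the runtime timeout is an assumption (the step value is checked first here). The only behavior taken from this page is that the environment variable acts as the fallback when neither is configured, and that 0 means no application-layer limit.

```python
from typing import Mapping, Optional

DEFAULT_EPHEMERAL_JOB_TIMEOUT_SEC = 600  # documented default


def effective_wait_timeout(
    step_timeout_sec: Optional[int],
    max_execution_sec: Optional[int],
    env: Mapping[str, str],
) -> Optional[int]:
    """Return the wait limit in seconds, or None to wait indefinitely."""
    # Assumption: the step timeout wins over the runtime timeout when both
    # are set. The page does not state the actual precedence.
    for configured in (step_timeout_sec, max_execution_sec):
        if configured is not None:
            return None if configured == 0 else configured
    # Neither timeout is configured: fall back to the deployment variable.
    fallback = int(
        env.get("RESEARCHX_K8S_EPHEMERAL_JOB_TIMEOUT_SEC", DEFAULT_EPHEMERAL_JOB_TIMEOUT_SEC)
    )
    return None if fallback == 0 else fallback
```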

In Kubernetes mode, built-in runtime assets are published into the workspace PVC at startup and on version refresh. User-managed global asset repositories default to live workspace-backed storage under AGENT_WORKSPACE_ROOT/global/repositories/* and AGENT_WORKSPACE_ROOT/global/global-skills/repositories, so sync, import, and edit operations are visible to runtime pods without rebuilding the built-in asset bundle.

When EXECUTION_PLATFORM_MODE=slurm, configure:

  • RESEARCHX_SLURM_TRANSPORT: local-cli for production/HPC deployments, or ssh-docker-exec for the Docker-hosted development environment
  • RESEARCHX_SLURM_REMOTE_WORKSPACE_ROOT: shared workspace root as seen by Slurm jobs and Apptainer
  • RESEARCHX_SLURM_APPTAINER_CACHE_ROOT: SIF cache root for OCI images converted to Apptainer SIF
  • RESEARCHX_SLURM_DEFAULT_PARTITION (optional)
  • RESEARCHX_SLURM_GPU_GRES_NAME (optional, defaults to gpu)
  • RESEARCHX_SLURM_SIF_CACHE_MAX_IMAGES (optional, defaults to 50)
  • RESEARCHX_SLURM_ACTION_POLL_INTERVAL_MS (optional, defaults to 3000; interval for checking action exit_code files)
  • RESEARCHX_SLURM_ACTION_SQUEUE_INTERVAL_MS (optional, defaults to 10000; minimum interval between action Slurm queue checks)
  • RESEARCHX_RUNTIME_JOB_MISSING_GRACE_MS (optional, generic missing-job grace period)
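The Slurm variables and their documented defaults can be summarized as a typed settings object. This is a sketch under assumptions: the class and function names are hypothetical, rejecting any transport other than the two named above is an assumption, and RESEARCHX_RUNTIME_JOB_MISSING_GRACE_MS is omitted because no default is documented for it.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SlurmSettings:
    transport: str
    remote_workspace_root: str
    apptainer_cache_root: str
    default_partition: Optional[str]
    gpu_gres_name: str
    sif_cache_max_images: int
    action_poll_interval_ms: int
    action_squeue_interval_ms: int


def load_slurm_settings(env: Mapping[str, str]) -> SlurmSettings:
    """Build settings from an environment mapping, applying documented defaults."""
    transport = env["RESEARCHX_SLURM_TRANSPORT"]
    # Assumption: only the two transports named in the list are valid.
    if transport not in ("local-cli", "ssh-docker-exec"):
        raise ValueError(f"unsupported RESEARCHX_SLURM_TRANSPORT: {transport!r}")
    return SlurmSettings(
        transport=transport,
        remote_workspace_root=env["RESEARCHX_SLURM_REMOTE_WORKSPACE_ROOT"],
        apptainer_cache_root=env["RESEARCHX_SLURM_APPTAINER_CACHE_ROOT"],
        default_partition=env.get("RESEARCHX_SLURM_DEFAULT_PARTITION") or None,
        gpu_gres_name=env.get("RESEARCHX_SLURM_GPU_GRES_NAME", "gpu"),
        sif_cache_max_images=int(env.get("RESEARCHX_SLURM_SIF_CACHE_MAX_IMAGES", "50")),
        action_poll_interval_ms=int(env.get("RESEARCHX_SLURM_ACTION_POLL_INTERVAL_MS", "3000")),
        action_squeue_interval_ms=int(env.get("RESEARCHX_SLURM_ACTION_SQUEUE_INTERVAL_MS", "10000")),
    )
```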

For local development against the single-container Slurm environment, also configure:

  • RESEARCHX_SLURM_SSH_HOST
  • RESEARCHX_SLURM_SSH_USER
  • RESEARCHX_SLURM_CONTAINER
  • RESEARCHX_SLURM_DOCKER_EXEC_USER=root
  • RESEARCHX_SLURM_WORKSPACE_OWNER=1000:1000

In Slurm mode, AGENT_WORKSPACE_ROOT must point to the same shared filesystem that Slurm jobs see through RESEARCHX_SLURM_REMOTE_WORKSPACE_ROOT. Production deployments should prefer identical shared paths, for example AGENT_WORKSPACE_ROOT=/workspaces and RESEARCHX_SLURM_REMOTE_WORKSPACE_ROOT=/workspaces. Local development can use an SSHFS mount; see slurm/README.md and .env.slurm.conf.
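The relationship between the two roots can be sketched as a path translation: a file under AGENT_WORKSPACE_ROOT on the application host corresponds to the same relative path under RESEARCHX_SLURM_REMOTE_WORKSPACE_ROOT on the Slurm side. The helper name and the SSHFS mount point in the usage note are hypothetical.

```python
from pathlib import PurePosixPath


def to_remote_path(local_path: str, agent_root: str, remote_root: str) -> str:
    """Map a path under the local workspace root to the Slurm-visible path.

    Raises ValueError when local_path is not inside agent_root.
    """
    relative = PurePosixPath(local_path).relative_to(PurePosixPath(agent_root))
    return str(PurePosixPath(remote_root) / relative)
```

With identical roots (`/workspaces` on both sides) the function returns its input unchanged, which is why production deployments are simpler to reason about. With a hypothetical SSHFS mount at `/mnt/sshfs/workspaces`, the file `/mnt/sshfs/workspaces/proj/run.sh` maps to `/workspaces/proj/run.sh`.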

Action and project images may be normal OCI image references, which are converted and cached as SIF files, or direct SIF paths. A direct SIF path can be written either as a plain path such as /workspaces/shared/images/tool.sif or with the sif:// scheme, as in sif:///workspaces/shared/images/tool.sif. SIF paths must be absolute and visible to Slurm and Apptainer.
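The distinction between the two kinds of image reference can be sketched as a small classifier. The function name is hypothetical, and recognizing plain SIF paths by their `.sif` suffix is an assumption of this sketch; the actual detection rule is not described here.

```python
from pathlib import PurePosixPath
from typing import Tuple

SIF_SCHEME = "sif://"


def classify_image(ref: str) -> Tuple[str, str]:
    """Return ("sif", absolute_path) or ("oci", reference).

    Raises ValueError for a SIF path that is not absolute.
    """
    if ref.startswith(SIF_SCHEME):
        path = ref[len(SIF_SCHEME):]
    elif ref.endswith(".sif"):
        # Assumption: a plain reference ending in .sif is a direct SIF path.
        path = ref
    else:
        return ("oci", ref)
    if not PurePosixPath(path).is_absolute():
        raise ValueError(f"SIF path must be absolute: {path!r}")
    return ("sif", path)
```

Both `/workspaces/shared/images/tool.sif` and `sif:///workspaces/shared/images/tool.sif` resolve to the same absolute path, while a reference such as `python:3.12-slim` is treated as an OCI image to be converted and cached.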

Risk Warning

  • Switching the execution platform affects all persistent environments and future agent process runs
  • Make the change in deployment configuration and restart the service in a low-risk window
  • Verify the target backend is fully configured before restarting the application

Relationship to Other Admin Modules

The execution platform works in conjunction with:

  • Resource Quotas: Manages resource allocation and limits on top of the execution platform
  • Container Management: Project-level containers are created and run on the execution platform
  • Container Mounts: Mount configurations are applied to containers on the execution platform