Device Management

Hardware detection, CPU/NUMA topology analysis, and optimal JAX/XLA device configuration for CPU-bound workloads.

Hardware Configuration

Hardware configuration and CMC backend selection.

This module provides hardware detection utilities for selecting the optimal CMC (Consensus Monte Carlo) backend based on the available resources.

class heterodyne.device.config.ClusterType

Bases: Enum

HPC cluster job scheduler type.

STANDALONE = 'standalone'
PBS = 'pbs'
SLURM = 'slurm'

class heterodyne.device.config.CMCBackend

Bases: Enum

CMC execution backend.

JIT = 'jit'
MULTIPROCESSING = 'multiprocessing'
PBS = 'pbs'
SLURM = 'slurm'

class heterodyne.device.config.HardwareConfig

Bases: object

Hardware configuration for CMC execution.

cpu_info

Detected CPU information.

cluster_type

Type of HPC cluster (if any).

available_cores

Number of CPU cores available for computation.

memory_gb

Available memory in GB.

recommended_backend

Recommended CMC backend based on hardware.

recommended_chains

Recommended number of MCMC chains.

max_parallel_chains

Maximum chains that can run in parallel.

cpu_info: CPUInfo
cluster_type: ClusterType
available_cores: int
memory_gb: float
recommended_backend: CMCBackend
recommended_chains: int
max_parallel_chains: int
__init__(cpu_info, cluster_type, available_cores, memory_gb, recommended_backend, recommended_chains, max_parallel_chains)

heterodyne.device.config.detect_cluster_type()

Detect the HPC cluster scheduler type from environment.

Return type:

ClusterType

Returns:

ClusterType enum indicating the detected scheduler.
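One common way to recognize a scheduler is to look for the job-ID variables that PBS and Slurm export inside a running job. The sketch below assumes that approach. The enum is a local mirror of ClusterType and the helper name sketch_detect_cluster_type is hypothetical, so the checks performed by the library itself may differ.

```python
import os
from enum import Enum


class ClusterType(Enum):
    """Local mirror of the documented enum, for illustration only."""
    STANDALONE = "standalone"
    PBS = "pbs"
    SLURM = "slurm"


def sketch_detect_cluster_type(env=None):
    """Guess the scheduler from job-ID variables (assumed heuristic)."""
    env = os.environ if env is None else env
    if "SLURM_JOB_ID" in env:   # exported by Slurm inside a job
        return ClusterType.SLURM
    if "PBS_JOBID" in env:      # exported by PBS inside a job
        return ClusterType.PBS
    return ClusterType.STANDALONE
```

Passing a plain dict as env makes the heuristic easy to exercise outside a cluster.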

heterodyne.device.config.get_available_memory()

Get available system memory in GB.

Return type:

float

Returns:

Available memory in GB, or a conservative estimate if detection fails.
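The "detect, else fall back" behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the helper name and the 4.0 GB fallback are assumptions, and os.sysconf only exposes these values on some platforms (notably Linux).

```python
import os


def sketch_available_memory_gb(fallback_gb=4.0):
    """Return available memory in GB, or fallback_gb if detection fails."""
    try:
        pages = os.sysconf("SC_AVPHYS_PAGES")   # free physical pages
        page_size = os.sysconf("SC_PAGE_SIZE")  # bytes per page
    except (AttributeError, ValueError, OSError):
        # os.sysconf does not exist on Windows, and some platforms
        # do not define these names.
        return float(fallback_gb)
    if pages <= 0 or page_size <= 0:
        return float(fallback_gb)
    return pages * page_size / 1024**3
```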

heterodyne.device.config.detect_hardware()

Detect hardware configuration and recommend CMC settings.

Return type:

HardwareConfig

Returns:

HardwareConfig with detected settings and recommendations.

Note

This function considers:

  • CPU core count (physical cores preferred)
  • Available memory (for chain parallelism)
  • Cluster environment (PBS/Slurm for distributed)
  • NUMA topology (for jit backend)
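To make the interplay of these factors concrete, here is a rough sketch of how a recommendation could be derived from them. Every rule in it is an assumption made for illustration (the function name, the 2 GB-per-chain memory budget, and the decision order are all hypothetical), and it ignores NUMA topology entirely; the library's actual logic is not documented here.

```python
def sketch_recommend(cluster, physical_cores, memory_gb, gb_per_chain=2.0):
    """Return (backend_name, max_parallel_chains) under assumed rules."""
    # Memory bounds how many chains can be resident at once.
    by_memory = max(1, int(memory_gb // gb_per_chain))
    # Cores bound how many chains can actually run in parallel.
    max_parallel = max(1, min(physical_cores, by_memory))

    if cluster in ("pbs", "slurm"):
        backend = cluster              # hand work to the scheduler
    elif max_parallel > 1:
        backend = "multiprocessing"    # several local worker processes
    else:
        backend = "jit"                # single-process execution
    return backend, max_parallel
```

For example, a standalone machine with 8 physical cores and 16 GB would be limited by both cores and memory to 8 parallel chains under these assumed rules.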

heterodyne.device.config.get_backend_name(backend)

Get the string name of a CMC backend for configuration.

Parameters:

backend (CMCBackend) – CMCBackend enum value.

Return type:

Literal['jit', 'multiprocessing', 'pbs', 'slurm']

Returns:

Backend name string for use in configuration.
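Because each CMCBackend member carries its configuration string as its value, the mapping can be illustrated with a local mirror of the enum. The mirror class and the helper name below are for illustration only; whether the library returns backend.value directly is an assumption.

```python
from enum import Enum


class CMCBackend(Enum):
    """Local mirror of the documented enum, for illustration only."""
    JIT = "jit"
    MULTIPROCESSING = "multiprocessing"
    PBS = "pbs"
    SLURM = "slurm"


def sketch_backend_name(backend):
    """Return the configuration string for a backend member."""
    return backend.value
```

The reverse lookup uses the standard Enum call syntax: CMCBackend("slurm") returns CMCBackend.SLURM.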

heterodyne.device.config.configure_optimal_device(mode='auto', num_chains=None, strict=False)

Configure device settings for optimal CMC execution.

This function should be called before running CMC to ensure proper CPU threading and JAX device configuration.

Parameters:
  • mode (str) – Configuration mode:
      - “auto”: Automatically detect and configure
      - “cmc”: Optimize for CMC (4 chains typical)
      - “cmc-hpc”: Optimize for HPC CMC (8 chains)
      - “nlsq”: Optimize for NLSQ (single device)

  • num_chains (int | None) – Override number of chains (None for auto).

  • strict (bool) – If True, raise RuntimeError if JAX has already been imported.

Return type:

HardwareConfig

Returns:

HardwareConfig with applied settings.
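The relationship between mode and num_chains can be sketched as a small lookup with an override. The chain counts for “cmc” and “cmc-hpc” come from the mode descriptions above; treating “nlsq” as one chain and capping “auto” at 4 chains are assumptions, and the helper name is hypothetical.

```python
# Chain counts from the mode descriptions; "nlsq" -> 1 is an assumption
# standing in for "single device".
MODE_CHAINS = {"cmc": 4, "cmc-hpc": 8, "nlsq": 1}


def sketch_resolve_chains(mode="auto", num_chains=None, available_cores=1):
    """Pick a chain count: explicit override first, then the mode default."""
    if num_chains is not None:
        if num_chains < 1:
            raise ValueError("num_chains must be >= 1")
        return num_chains
    if mode == "auto":
        # Assumption: cap at 4 chains and never exceed the core count.
        return max(1, min(4, available_cores))
    try:
        return MODE_CHAINS[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
```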

heterodyne.device.config.get_device_status()

Get current device configuration status.

Return type:

dict[str, object]

Returns:

Dictionary with current device settings and detected hardware.

CPU Detection

CPU detection and HPC optimization utilities.

This module provides hardware-aware configuration for JAX workloads on CPU, including physical core detection, NUMA topology awareness, and optimal environment variable configuration for HPC clusters.

class heterodyne.device.cpu.CPUInfo

Bases: object

CPU hardware information.

physical_cores

Number of physical CPU cores.

logical_cores

Number of logical cores (includes hyperthreading).

numa_nodes

Number of NUMA nodes (memory domains).

architecture

CPU architecture string (e.g., ‘x86_64’, ‘arm64’).

vendor

CPU vendor (e.g., ‘Intel’, ‘AMD’, ‘Apple’).

model_name

Full CPU model name.

has_avx

Whether AVX instructions are available.

has_avx2

Whether AVX2 instructions are available.

has_avx512

Whether AVX-512 instructions are available.

cache_sizes

Cache sizes in bytes (L1, L2, L3).

physical_cores: int
logical_cores: int
numa_nodes: int = 1
architecture: str = ''
vendor: str = ''
model_name: str = ''
has_avx: bool = False
has_avx2: bool = False
has_avx512: bool = False
cache_sizes: dict[str, int]
__init__(physical_cores, logical_cores, numa_nodes=1, architecture='', vendor='', model_name='', has_avx=False, has_avx2=False, has_avx512=False, cache_sizes=<factory>)

heterodyne.device.cpu.detect_cpu_info()

Detect CPU hardware information.

Return type:

CPUInfo

Returns:

CPUInfo dataclass with hardware details.

Note

This function uses platform-specific methods:

  • Linux: lscpu, /proc/cpuinfo
  • macOS: sysctl
  • Windows: wmic (basic support)
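As an illustration of the Linux path, the sketch below derives physical and logical core counts from lscpu-style "Key: value" text. It is a simplified stand-in rather than the library's parser: the function name is hypothetical, it takes the text as an argument instead of running lscpu, and it relies on the field labels "CPU(s)", "Socket(s)", "Core(s) per socket", and "NUMA node(s)".

```python
def sketch_parse_lscpu(text):
    """Extract core counts from lscpu-style 'Key: value' lines."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()

    logical = int(fields["CPU(s)"])
    sockets = int(fields.get("Socket(s)", 1))
    # Fall back to the logical count if the per-socket field is absent.
    cores_per_socket = int(fields.get("Core(s) per socket", logical))
    numa_nodes = int(fields.get("NUMA node(s)", 1))
    return {
        "physical_cores": sockets * cores_per_socket,
        "logical_cores": logical,
        "numa_nodes": numa_nodes,
    }
```

On a machine with hyperthreading, physical_cores comes out smaller than logical_cores, which is the distinction CPUInfo records.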

heterodyne.device.cpu.configure_cpu_hpc(cpu_info=None, use_physical_cores_only=True, numa_aware=True)

Configure environment variables for HPC CPU optimization.

This function sets environment variables for optimal CPU performance with JAX and underlying libraries (MKL, OpenBLAS, OpenMP).

Parameters:
  • cpu_info (CPUInfo | None) – CPU information (auto-detected if None).

  • use_physical_cores_only (bool) – If True, limit threads to physical cores (recommended for compute-bound workloads).

  • numa_aware (bool) – If True, configure for NUMA-aware memory allocation.

Return type:

dict[str, str]

Returns:

Dictionary of environment variables that were set.
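The kind of dictionary such a function might produce can be sketched with the standard thread-count variables for OpenMP, MKL, and OpenBLAS. Which variables the library actually sets is not stated here, so the selection below, the affinity values, and the helper name are all assumptions. The sketch only builds the mapping and does not modify the environment.

```python
def sketch_thread_env(physical_cores, logical_cores,
                      use_physical_cores_only=True, numa_aware=True):
    """Build (but do not apply) threading variables for BLAS/OpenMP."""
    threads = physical_cores if use_physical_cores_only else logical_cores
    env = {
        "OMP_NUM_THREADS": str(threads),
        "MKL_NUM_THREADS": str(threads),
        "OPENBLAS_NUM_THREADS": str(threads),
    }
    if numa_aware:
        # Standard OpenMP affinity controls: pin threads to cores and
        # keep them close together, near the memory they first touch.
        env["OMP_PROC_BIND"] = "close"
        env["OMP_PLACES"] = "cores"
    return env
```

To take effect, such values would need to be applied (for example with os.environ.update(env)) before the numerical libraries are loaded.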

heterodyne.device.cpu.get_optimal_batch_size(cpu_info=None, data_size=1000, element_bytes=8)

Calculate optimal batch size based on CPU cache hierarchy.

This heuristic aims to fit working data in L3 cache while maintaining enough parallelism for efficient vectorization.

Parameters:
  • cpu_info (CPUInfo | None) – CPU information (auto-detected if None).

  • data_size (int) – Size of the input data dimension.

  • element_bytes (int) – Bytes per element (8 for float64, 4 for float32).

Return type:

int

Returns:

Recommended batch size.
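A cache-fitting heuristic of this kind can be sketched as simple arithmetic: divide a share of the L3 cache by the bytes needed per batch row. The function name, the 50% cache share, and the clamping bounds below are assumptions chosen for illustration, not the library's values.

```python
def sketch_batch_size(l3_bytes, data_size=1000, element_bytes=8,
                      cache_fraction=0.5, min_batch=1, max_batch=4096):
    """Largest batch whose rows fit in a fraction of L3 (assumed heuristic)."""
    row_bytes = data_size * element_bytes
    if row_bytes <= 0:
        raise ValueError("data_size and element_bytes must be positive")
    budget = int(l3_bytes * cache_fraction)
    # Clamp so tiny caches still yield a usable batch and huge caches
    # do not produce unbounded batches.
    return max(min_batch, min(max_batch, budget // row_bytes))
```

With a 32 MiB L3 cache, 1000 float64 elements per row occupy 8000 bytes, so half the cache holds 16777216 // 8000 = 2097 rows.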

heterodyne.device.cpu.benchmark_cpu_performance(cpu_info=None, matrix_size=1000)

Run a simple CPU benchmark for performance profiling.

Parameters:
  • cpu_info (CPUInfo | None) – CPU information (auto-detected if None).

  • matrix_size (int) – Size of test matrices for BLAS benchmark.

Return type:

dict[str, float]

Returns:

Dictionary with benchmark results (GFLOPS, memory bandwidth).
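For reference, a GFLOPS figure for a dense matrix multiply is conventionally derived from the roughly 2n³ floating-point operations an n × n product requires. The helpers below sketch that arithmetic and a best-of-N timer using only the standard library; both names are hypothetical and the library's benchmark may measure differently.

```python
import time


def matmul_gflops(n, seconds):
    """GFLOPS for an n x n matmul: about 2 * n**3 floating-point operations."""
    return 2.0 * n**3 / seconds / 1e9


def time_call(fn, repeats=3):
    """Best wall-clock time of fn() over several runs, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best
```

Taking the best of several runs reduces the influence of one-off delays such as cold caches or scheduler noise.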

heterodyne.device.cpu.get_jax_cpu_flags(cpu_info=None, num_devices=None)

Generate XLA_FLAGS for optimal JAX CPU execution.

Parameters:
  • cpu_info (CPUInfo | None) – CPU information (auto-detected if None).

  • num_devices (int | None) – Number of CPU devices to expose (default: physical cores).

Return type:

str

Returns:

XLA_FLAGS string to set in environment.
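Exposing several CPU devices to JAX is normally done with the XLA flag --xla_force_host_platform_device_count. The sketch below builds a flags string around that one flag while preserving any other flags already present. The helper name is hypothetical, and the library may emit additional flags that are not shown here.

```python
def sketch_xla_flags(num_devices, existing=""):
    """Return an XLA_FLAGS string with the host device count set."""
    if num_devices < 1:
        raise ValueError("num_devices must be >= 1")
    prefix = "--xla_force_host_platform_device_count="
    # Drop any earlier device-count setting, keep unrelated flags.
    kept = [flag for flag in existing.split() if not flag.startswith(prefix)]
    return " ".join(kept + [f"{prefix}{num_devices}"])
```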

heterodyne.device.cpu.configure_jax_cpu(cpu_info=None, num_devices=None, strict=False)

Configure JAX for optimal CPU execution.

This should be called before importing JAX or at the start of a script.

Parameters:
  • cpu_info (CPUInfo | None) – CPU information (auto-detected if None).

  • num_devices (int | None) – Number of CPU devices (default: physical cores).

  • strict (bool) – If True, raise RuntimeError if JAX has already been imported.

Return type:

Mapping[str, str]

Returns:

Dictionary of environment variables that were set.
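The ordering requirement exists because environment variables such as XLA_FLAGS are read when JAX sets up its backend, so changing them afterwards may have no effect. A strict check of the kind described can be sketched by inspecting sys.modules. The helper name, the warning used in non-strict mode, and the message text are assumptions.

```python
import sys
import warnings


def sketch_check_jax_not_imported(strict=False, modules=None):
    """Return True if JAX is not loaded yet; warn or raise otherwise."""
    modules = sys.modules if modules is None else modules
    if "jax" not in modules:
        return True
    message = "JAX is already imported; XLA_FLAGS changes may be ignored."
    if strict:
        raise RuntimeError(message)
    warnings.warn(message, RuntimeWarning, stacklevel=2)
    return False
```

The modules parameter exists only so the check can be exercised without importing JAX.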