arch/s390/kernel/hiperdispatch.c

Source file repositories/reference/linux-study-clean/arch/s390/kernel/hiperdispatch.c

File Facts

System: Linux kernel
Corpus path: arch/s390/kernel/hiperdispatch.c
Extension: .c
Size: 12100 bytes
Lines: 431
Domain: Architecture Layer
Bucket: arch/s390
Inferred role: Architecture Layer: implementation source
Status: source implementation candidate

Why This File Exists

s390 architecture code that implements hiperdispatch: it dynamically calculates the optimum number of high-capacity COREs from steal time and schedules a topology update whenever CPU capacities need to change.

Uses kernel synchronization; read lock ordering, sleepability, and interrupt context assumptions before translating.
Defines or uses C structs; map object ownership, embedded links, reference counts, and lock ownership.

Dependency Surface

linux/cpufeature.h
linux/cpumask.h
linux/debugfs.h
linux/device.h
linux/kernel_stat.h
linux/kstrtox.h
linux/ktime.h
linux/sysctl.h
linux/types.h
linux/workqueue.h
asm/hiperdispatch.h
asm/setup.h
asm/smp.h
asm/topology.h
asm/trace/hiperdispatch.h

Detected Declarations

function hd_set_hiperdispatch_mode
function hd_reset_state
function hd_add_core
function hd_update_times
function hd_update_capacities
function hd_disable_hiperdispatch
function hd_enable_hiperdispatch
function hd_steal_avg
function hd_calculate_steal_percentage
function hd_capacity_work_fn
function hiperdispatch_ctl_handler
function hd_steal_threshold_show
function hd_steal_threshold_store
function hd_delay_factor_show
function hd_delay_factor_store
function hd_greedy_time_get
function hd_conservative_time_get
function hd_adjustment_count_get
function hd_create_debugfs_counters
function hd_create_attributes
function hd_init

Annotated Snippet

// SPDX-License-Identifier: GPL-2.0
/*
 * Copyright IBM Corp. 2024
 */

#define pr_fmt(fmt) "hd: " fmt

/*
 * Hiperdispatch:
 * Dynamically calculates the optimum number of high capacity COREs
 * by considering the state the system is in. When hiperdispatch decides
 * that a capacity update is necessary, it schedules a topology update.
 * During topology updates the CPU capacities are always re-adjusted.
 *
 * There are two places where CPU capacities are being accessed within
 * hiperdispatch.
 * -> hiperdispatch's recurring work function reads CPU capacities to
 *    determine high capacity CPU count.
 * -> during a topology update hiperdispatch's adjustment function
 *    updates CPU capacities.
 * These two can run on different CPUs in parallel which can cause
 * hiperdispatch to make wrong decisions. This can potentially cause
 * some overhead by leading to extra rebuild_sched_domains() calls
 * for correction. Access to capacities within hiperdispatch has to be
 * serialized to prevent the overhead.
 *
 * Hiperdispatch decision making revolves around steal time.
 * HD_STEAL_THRESHOLD value is taken as reference. Whenever steal time
 * crosses the threshold value hiperdispatch falls back to giving high
 * capacities to entitled CPUs. When steal time drops below the
 * threshold boundary, hiperdispatch utilizes all CPUs by giving all
 * of them high capacity.
 *
 * The theory behind HD_STEAL_THRESHOLD is related to the SMP thread
 * performance. Comparing the throughput of:
 * - single CORE, with N threads, running N tasks
 * - N separate COREs running N tasks,
 * using individual COREs for individual tasks yields better
 * performance. This performance difference is roughly ~30% (can change
 * between machine generations)
 *
 * Hiperdispatch tries to hint the scheduler to use individual COREs for
 * each task, as long as steal time on those COREs is less than 30%,
 * therefore delaying the throughput loss caused by using SMP threads.
 */

#include <linux/cpufeature.h>
#include <linux/cpumask.h>
#include <linux/debugfs.h>
#include <linux/device.h>
#include <linux/kernel_stat.h>
#include <linux/kstrtox.h>
#include <linux/ktime.h>
#include <linux/sysctl.h>
#include <linux/types.h>
#include <linux/workqueue.h>
#include <asm/hiperdispatch.h>
#include <asm/setup.h>
#include <asm/smp.h>
#include <asm/topology.h>

#define CREATE_TRACE_POINTS
#include <asm/trace/hiperdispatch.h>

#define HD_DELAY_FACTOR			(4)
#define HD_DELAY_INTERVAL		(HZ / 4)
#define HD_STEAL_THRESHOLD		10
#define HD_STEAL_AVG_WEIGHT		16

static cpumask_t hd_vl_coremask;	/* Mask containing all vertical low COREs */
static cpumask_t hd_vmvl_cpumask;	/* Mask containing vertical medium and low CPUs */
static int hd_high_capacity_cores;	/* Current CORE count with high capacity */
static int hd_entitled_cores;		/* Total vertical high and medium CORE count */
static int hd_online_cores;		/* Current online CORE count */

static unsigned long hd_previous_steal;	/* Previous iteration's CPU steal timer total */
static unsigned long hd_high_time;	/* Total time spent while all cpus have high capacity */
static unsigned long hd_low_time;	/* Total time spent while vl cpus have low capacity */
static atomic64_t hd_adjustments;	/* Total occurrence count of hiperdispatch adjustments */

static unsigned int hd_steal_threshold = HD_STEAL_THRESHOLD;
static unsigned int hd_delay_factor = HD_DELAY_FACTOR;
static int hd_enabled;

static void hd_capacity_work_fn(struct work_struct *work);
static DECLARE_DELAYED_WORK(hd_capacity_work, hd_capacity_work_fn);

static int hd_set_hiperdispatch_mode(int enable)
{
	if (!cpu_has_topology())
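
The decision rule described in the header comment can be illustrated with a small user-space sketch. It is not the kernel's implementation: the helper names `sketch_steal_avg` and `sketch_high_capacity_cores` are hypothetical, reading `HD_STEAL_AVG_WEIGHT` as the divisor of a running weighted average is an assumption, and so is treating a sample equal to the threshold as having crossed it. The two constants copy the values shown in the excerpt.

```c
#include <assert.h>

/* Values copied from the excerpt above. */
#define HD_STEAL_THRESHOLD	10
#define HD_STEAL_AVG_WEIGHT	16

/*
 * Hypothetical helper: fold a new steal percentage sample into a running
 * average, giving the new sample a weight of 1 / HD_STEAL_AVG_WEIGHT.
 * Integer division truncates, as it would in kernel arithmetic.
 */
unsigned long sketch_steal_avg(unsigned long prev_avg, unsigned long sample)
{
	return (prev_avg * (HD_STEAL_AVG_WEIGHT - 1) + sample) /
	       HD_STEAL_AVG_WEIGHT;
}

/*
 * Hypothetical helper: pick the number of COREs that should have high
 * capacity. At or above the threshold only the entitled COREs keep high
 * capacity; below it every online CORE gets high capacity.
 */
int sketch_high_capacity_cores(unsigned long steal_pct, int entitled_cores,
			       int online_cores)
{
	assert(entitled_cores >= 0 && entitled_cores <= online_cores);

	if (steal_pct >= HD_STEAL_THRESHOLD)
		return entitled_cores;
	return online_cores;
}
```

Under these assumptions, a smoothed steal value of 5% on a system with 4 entitled and 8 online COREs selects all 8 COREs, while a value of 10% or more falls back to the 4 entitled COREs.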

Annotation

Immediate include surface: `linux/cpufeature.h`, `linux/cpumask.h`, `linux/debugfs.h`, `linux/device.h`, `linux/kernel_stat.h`, `linux/kstrtox.h`, `linux/ktime.h`, `linux/sysctl.h`.
Detected declarations: `function hd_set_hiperdispatch_mode`, `function hd_reset_state`, `function hd_add_core`, `function hd_update_times`, `function hd_update_capacities`, `function hd_disable_hiperdispatch`, `function hd_enable_hiperdispatch`, `function hd_steal_avg`, `function hd_calculate_steal_percentage`, `function hd_capacity_work_fn`.
Atlas domain: Architecture Layer / arch/s390.
Implementation status: source implementation candidate.
Synchronization appears in or near this file; preserve lock ordering, sleepability, and interrupt-context constraints.
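
The serialization requirement stated in the file's header comment can be sketched in user space by taking one lock around every read and every write of the capacity array. This sketch swaps in a POSIX mutex for whatever primitive the kernel file actually uses, which the excerpt does not show; the struct, the function names, and the capacity values are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>

#define SKETCH_NR_CPUS		8
#define SKETCH_CAP_HIGH		1024	/* hypothetical "high capacity" value */
#define SKETCH_CAP_LOW		8	/* hypothetical "low capacity" value */

/* Hypothetical shared state: one capacity per CPU, guarded by one lock. */
struct sketch_capacities {
	pthread_mutex_t lock;
	unsigned long cap[SKETCH_NR_CPUS];
};

/* Writer side: stands in for the adjustment made during a topology update. */
void sketch_set_capacities(struct sketch_capacities *sc, int high_cpus)
{
	assert(high_cpus >= 0 && high_cpus <= SKETCH_NR_CPUS);

	pthread_mutex_lock(&sc->lock);
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sc->cap[cpu] = cpu < high_cpus ? SKETCH_CAP_HIGH : SKETCH_CAP_LOW;
	pthread_mutex_unlock(&sc->lock);
}

/* Reader side: stands in for the periodic work function's capacity count. */
int sketch_count_high(struct sketch_capacities *sc)
{
	int count = 0;

	pthread_mutex_lock(&sc->lock);
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (sc->cap[cpu] == SKETCH_CAP_HIGH)
			count++;
	pthread_mutex_unlock(&sc->lock);
	return count;
}
```

Because both paths hold the same lock for the whole loop, the reader never observes a half-updated array, which is the kind of inconsistent view the header comment says can lead to extra rebuild_sched_domains() calls.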

Implementation Notes

This generated page is the file-by-file coverage layer; curated subsystem chapters should link here when they synthesize a multi-file control flow.
Core OS pages should be promoted from atlas-only to deep-reviewed when they explain data structures, invariants, locking, lifecycle, and C implementation snippets.
Driver-family pages are intentionally pattern-oriented unless they are part of the selected PCIe/NVMe representative device path.