# LLM Analysis Guide

NodeLens can analyze Hugging Face causal language models at the channel level. The LLM workflow is designed for activation and gradient capture, FFN channel metrics, ablation probes, and structured pruning.

## Quick Start

```bash
python scripts/run_experiment.py --config configs/examples/gpt2_fast_test.yaml
python scripts/run_experiment.py --config configs/prune_llm/llama3_8b_unified.yaml
```

The GPT-2 config is a small smoke test. The Llama, Mistral, Qwen, and OLMo configs under configs/prune_llm/ are larger workflows and may require model access, GPU memory planning, and local cache setup.

## What The LLM Workflow Computes

- FFN activation statistics, including activation magnitude and outlier scores.
- Gradient-informed scores such as Taylor, curvature, and SCAR loss proxy.
- Supernode-style protected cores when a config asks for top-scoring channels.
- Halo and cross-layer diagnostics when enabled.
- Structured FFN channel pruning and perplexity evaluation.
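The activation-capture step can be sketched with standard PyTorch forward hooks. This is a minimal illustration of per-channel magnitude accumulation, not NodeLens's actual implementation; the `ChannelStats` class and the stand-in linear layer are assumptions for the example:

```python
import torch
import torch.nn as nn

class ChannelStats:
    """Accumulate per-channel squared-activation sums from forward hooks."""
    def __init__(self):
        self.sum_sq = None
        self.count = 0

    def hook(self, module, inputs, output):
        # Flatten batch/sequence dims: (..., channels) -> (num_tokens, channels)
        acts = output.detach().reshape(-1, output.shape[-1]).float()
        sq = acts.pow(2).sum(dim=0)
        self.sum_sq = sq if self.sum_sq is None else self.sum_sq + sq
        self.count += acts.shape[0]

    def l2_norm(self):
        # Root-mean-square activation magnitude per channel
        return (self.sum_sq / self.count).sqrt()

# Stand-in layer playing the role of a tracked FFN projection
layer = nn.Linear(16, 64)
stats = ChannelStats()
handle = layer.register_forward_hook(stats.hook)
layer(torch.randn(2, 8, 16))   # one calibration batch
handle.remove()
```

In a real run, one such accumulator would be registered per tracked module so statistics stream in without storing full activation tensors.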

## Example Config Structure

```yaml
experiment:
  name: "llama3_8b_analysis"
  type: "llm_alignment"
  device: "cuda"

model:
  name: "hf_causal_lm"
  model_id: "meta-llama/Llama-3.1-8B"
  dtype: "bfloat16"
  device_map: "auto"
  tracked_layers:
    - "model.model.layers.*.mlp.up_proj"
    - "model.model.layers.*.mlp.gate_proj"
    - "model.model.layers.*.mlp.down_proj"

dataset:
  name: "wikitext"
  subset: "wikitext-2-raw-v1"
  split: "train"
  batch_size: 1

calibration:
  num_samples: 128
  max_length: 2048
  batch_size: 4

metrics:
  scar:
    enabled: true
    num_samples: 64
    max_length: 512

supernode:
  enabled: true
  score_metric: "scar_loss_proxy"
  core_fraction: 0.01
  halo_fraction: 0.10
  protect_core: true

pruning:
  enabled: true
  ratios: [0.1, 0.3, 0.5]
  structured: true
  dependency_aware: true
  algorithms:
    - "magnitude"
    - "wanda"
    - "sparsegpt"
    - "scar_loss_proxy"
    - "supernode_protection_score"
```
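The `tracked_layers` wildcards behave like glob patterns over module names. Assuming glob-style semantics (an assumption about NodeLens internals; the module names below are illustrative stand-ins for what `named_modules()` would yield), matching can be sketched with `fnmatch`:

```python
from fnmatch import fnmatch

patterns = [
    "model.model.layers.*.mlp.up_proj",
    "model.model.layers.*.mlp.gate_proj",
    "model.model.layers.*.mlp.down_proj",
]

# Illustrative module names as produced by named_modules() on a wrapped HF model
module_names = [
    "model.model.embed_tokens",
    "model.model.layers.0.self_attn.q_proj",
    "model.model.layers.0.mlp.up_proj",
    "model.model.layers.0.mlp.gate_proj",
    "model.model.layers.1.mlp.down_proj",
]

# Keep only modules matching at least one tracked-layer pattern
tracked = [n for n in module_names if any(fnmatch(n, p) for p in patterns)]
```

Attention projections and embeddings fall through; only the three FFN projections per layer are selected.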

## Metric Families

| Family | Examples |
| --- | --- |
| Activation | `activation_l2_norm`, `activation_variance`, `activation_outlier_index` |
| SCAR | `scar_activation_power`, `scar_taylor`, `scar_curvature`, `scar_loss_proxy` |
| Alignment | `rayleigh_quotient`, `delta_alignment` |
| Information | `mutual_information_gaussian`, `average_redundancy`, `pairwise_redundancy_gaussian` |
| Baselines | `magnitude`, `weight_magnitude`, `wanda`, `sparsegpt` |
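As one concrete example from the baselines family, the published Wanda score weights each parameter magnitude by the L2 norm of its input feature over calibration tokens. A minimal sketch (array names are illustrative, and this is the per-weight form, not NodeLens's channel aggregation):

```python
import numpy as np

def wanda_scores(weight, calib_inputs):
    """Wanda importance: |W_ij| * ||X_j||_2, where X_j collects the j-th
    input feature across calibration tokens."""
    # calib_inputs: (num_tokens, in_features); weight: (out_features, in_features)
    input_norms = np.linalg.norm(calib_inputs, axis=0)   # (in_features,)
    return np.abs(weight) * input_norms[None, :]          # (out, in)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))
X = rng.standard_normal((100, 8))
scores = wanda_scores(W, X)
```

A feature that never activates on the calibration set gets a zero norm, so every weight reading from it scores zero regardless of magnitude, which is the intuition behind activation-aware baselines.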

## Structured FFN Pruning

For Llama-style FFNs, structured channel pruning masks the corresponding intermediate channel across `gate_proj`, `up_proj`, and `down_proj`. This asks a channel-level question: which full FFN units can be removed while preserving model quality?

Unstructured weight pruning is a different setting. It can be useful as a compression baseline, but it should be labeled separately from structured channel pruning.
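Masking the same intermediate channels across all three projections can be sketched as follows. This illustrates the cross-projection dependency rather than NodeLens's pruning code; shapes follow a Llama-style SwiGLU MLP where `gate_proj`/`up_proj` map hidden to intermediate and `down_proj` maps back:

```python
import torch
import torch.nn as nn

def mask_ffn_channels(gate_proj, up_proj, down_proj, channels):
    """Zero the same intermediate channels in all three projections, so the
    masked FFN computes exactly what a physically smaller FFN would."""
    idx = torch.tensor(channels)
    with torch.no_grad():
        gate_proj.weight[idx, :] = 0.0   # rows index intermediate channels
        up_proj.weight[idx, :] = 0.0
        down_proj.weight[:, idx] = 0.0   # columns index intermediate channels

hidden, inter = 16, 64
gate = nn.Linear(hidden, inter, bias=False)
up = nn.Linear(hidden, inter, bias=False)
down = nn.Linear(inter, hidden, bias=False)

def ffn(x):
    # SwiGLU-style FFN used by Llama models
    return down(nn.functional.silu(gate(x)) * up(x))

x = torch.randn(2, hidden)
mask_ffn_channels(gate, up, down, channels=[0, 3, 7])
out = ffn(x)
```

Because the gate and up rows of a masked channel are zeroed, its contribution to the elementwise product is exactly zero, and the zeroed `down_proj` column guarantees nothing leaks back into the residual stream.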

## Supernode And Halo Diagnostics

When `supernode.enabled` is true, NodeLens ranks channels by the configured `score_metric` and marks the top `core_fraction` as a protected or analyzed core. The same outputs can be used for ablation, pruning protection, or overlap analysis with activation-defined outliers.

Halo diagnostics are optional. They measure local write-overlap and redundancy around the high-scoring core and are useful when the question is whether neighboring non-core channels behave differently from other channels.
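Core and halo selection reduces to a top-k over the chosen score. A minimal sketch, with synthetic scores and the fractions from the example config (the function name and halo-as-next-slice convention are assumptions for illustration):

```python
import numpy as np

def select_core(scores, core_fraction=0.01, halo_fraction=0.10):
    """Top core_fraction of channels by score form the protected core;
    the next halo_fraction below the core form the halo."""
    n = scores.size
    n_core = max(1, int(n * core_fraction))
    n_halo = int(n * halo_fraction)
    order = np.argsort(scores)[::-1]          # channel indices, descending score
    return order[:n_core], order[n_core:n_core + n_halo]

rng = np.random.default_rng(0)
scores = rng.random(1000)                     # e.g. scar_loss_proxy per channel
core, halo = select_core(scores)
```

With 1000 channels this yields a 10-channel core and a 100-channel halo, which can then feed ablation, protection, or overlap analyses.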

## Memory Notes

- Use `batch_size: 1` for very large models.
- Use `device_map: "auto"` when model parallelism is available.
- Use `dtype: "bfloat16"` or `"float16"` when supported.
- Reduce `calibration.num_samples`, `calibration.max_length`, or SCAR sample counts for smoke tests.
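Applied to the example config above, a smoke-test override might look like this (the specific values are illustrative, not recommended defaults):

```yaml
calibration:
  num_samples: 16      # reduced from 128 for a quick pass
  max_length: 512      # reduced from 2048

metrics:
  scar:
    enabled: true
    num_samples: 8     # reduced from 64
```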

## Outputs

LLM runs usually write:

- per-layer metric arrays and score summaries
- pruning and ablation results
- perplexity or downstream-task evaluations
- plots, tables, and JSON summaries when enabled

Use the copied `experiment_config.yaml` in each output directory to audit the exact settings for a run.