Whitepaper-in-Progress (August 28, 2025)
Search across internet content sucks. Users have hardly any control over their algorithms: you can't explore, on a whim, which tweets are the most simultaneously "sweet" and "war-like", and you can't refine "war-like" to specifically mean modern physical combat. This problem doesn't need to exist anymore, since LLMs (with proper frameworks) are really good at fleshing out arbitrary subjective attribute magnitudes, like whether some tweet is .8 or .95 on "embodying modern warfare". My plan:
- Popularize LLMs doing pairwise ratio comparisons to flesh out arbitrary subjective attribute magnitudes (as in the Analytic Hierarchy Process). Ship a program that solves the incomplete pairwise ratio comparison matrices and automagically finds prompts that have good consistency scores and match user intuitions (a toy example follows this list).
- MVP: build a UI for exploring all community-archive.org tweets across arbitrary subjective dimensions (e.g. "sweet", "war-like"). The user should be able to view attribute ratings that others have created, decide how much to weight those ratings, and, where the dimension they care about doesn't exist, create it themselves by writing a prompt like "evidence of shockingly high voltage thinking". Our agent, using e.g. GPT-5-nano, then validates the prompt on a test set (yielding consistent numbers that the user roughly agrees with); the user can assent to pay the ~dollar's worth of credits to rate a hundred thousand promising tweets, get an email when the job is done, and then explore tweets organized along the whole new dimension. In the future we'll figure out more appealing pricing models and ways to save substantially on API costs.
- Build a clean database+API for hosting subjective priors, facilitating the mapping out of our subjective world. Facilitate LLMs automatically doing the Analytic Hierarchy Process. Build a clean UX for users exploring the database, spawning AHP agents, and sharing their custom views.
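To make the pairwise-ratio idea concrete, here is a toy example (illustrative numbers, not real model output): ask an LLM "how many times more war-like is tweet A than tweet B?" and get 3, then "B than C?" and get 2. Even with the A-vs-C comparison left unasked, solving the incomplete matrix yields magnitudes w_A : w_B : w_C = 6 : 2 : 1 (i.e. 2/3, 2/9, 1/9 after normalizing to sum to 1) and fills in A/C = 6. If we had also asked A vs C and been told 12, the judgements would be inconsistent, the consistency score would flag it, and we would revise the prompt before spending real money on ratings.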
Some Key Insights
- Decision problems in messy human space are often handled quite well by factorizing them into subjective qualities. This is the core of the Analytic Hierarchy Process.
- LLMs are great at mapping out these subjective qualities. The key insight is that we can measure how consistent their judgements are, for any arbitrary prompt. When we get inconsistent judgements for an attribute, we can alter the prompt, perhaps providing more context on the subtle qualities we're measuring (a sketch of this check follows these insights).
- DB lookups are much faster than LLM inference. A more prosperous future involves AIs sorting things out all the time, so it makes sense to cache computation for collective reuse.
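Here is a minimal sketch of that consistency check, written against the optpcm solver included later in this document (the ratios below are placeholders standing in for LLM answers, and `exopriors` is simply the crate path used in the solver's own quick-start):

```rust
use exopriors::optpcm::*;

fn main() -> Result<(), Error> {
    // Three tweets rated on one attribute; a ratio answers
    // "how many times more war-like is tweet i than tweet j?".
    let mut s = Solver::new(3);
    s.add_edge(0, 1, 3.0, 1.0)?; // tweet 0 vs tweet 1
    s.add_edge(1, 2, 2.0, 1.0)?; // tweet 1 vs tweet 2
    s.add_edge(0, 2, 5.0, 1.0)?; // direct cross-check of the implied 6.0
    let sol = s.solve()?;
    if sol.saaty_cr > 0.10 {
        // Judgements too inconsistent: refine the attribute prompt and re-elicit.
        eprintln!("CR = {:.3}; the prompt needs work", sol.saaty_cr);
    } else {
        println!("attribute magnitudes: {:?}", sol.weights);
    }
    Ok(())
}
```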
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS citext;
CREATE TYPE rater_kind AS ENUM ('human','model');
CREATE TYPE privacy_level AS ENUM ('self', 'public');
CREATE TABLE human_users (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
username citext NOT NULL UNIQUE,
email citext NOT NULL UNIQUE,
password_hash VARCHAR(255) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT username_len CHECK (char_length(username) BETWEEN 4 AND 50),
CONSTRAINT email_len CHECK (char_length(email) <= 254)
);
-- agents (humans or models)
CREATE TABLE agents (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
-- who created and owns this agent definition
owner_id UUID NOT NULL REFERENCES human_users(id),
kind rater_kind NOT NULL,
-- If model rater, store model metadata; else NULL
model_name TEXT,
temperature DOUBLE PRECISION CHECK (temperature >= 0 AND temperature <= 2),
params JSONB, -- e.g. top_p, seed, system_prompt hash, etc.
created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT model_requires_name CHECK ((kind='model' AND model_name IS NOT NULL) OR (kind='human')),
CONSTRAINT model_fields_presence CHECK (
(kind='model' AND model_name IS NOT NULL AND temperature IS NOT NULL)
OR
(kind='human' AND model_name IS NULL AND temperature IS NULL)
)
);
CREATE TABLE entities (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
creator_id UUID REFERENCES agents(id),
-- Open question: deletability of entities. Ideally entities are deduplicated so everyone references the same row; users can request deletion of theirs, but the row is only actually deleted once every referencing user has done so.
uri VARCHAR(2048),
payload TEXT,
hash VARCHAR(64) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
UNIQUE (hash)
);
CREATE TABLE attributes (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
creator_id UUID NOT NULL REFERENCES agents(id),
name VARCHAR(255) NOT NULL,
version INTEGER NOT NULL DEFAULT 1,
short_description VARCHAR(200) NOT NULL,
prompt TEXT NOT NULL,
-- e.g. USD, probability, utility, kg*s^-2
type_signature VARCHAR(50),
created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
UNIQUE (name, version)
);
CREATE TABLE binary_ratio_comparisons (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
agent_id UUID NOT NULL REFERENCES agents(id),
attribute_id UUID NOT NULL REFERENCES attributes(id),
entity_a_id UUID NOT NULL REFERENCES entities(id),
entity_b_id UUID NOT NULL REFERENCES entities(id),
ratio_estimate DOUBLE PRECISION NOT NULL CHECK (ratio_estimate > 0),
weight DOUBLE PRECISION NOT NULL DEFAULT 1.0 CHECK (weight >= 0 AND weight <= 10),
privacy_type privacy_level NOT NULL DEFAULT 'self'::privacy_level,
created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT different_entities CHECK (entity_a_id != entity_b_id)
);
-- Scores will need to be calculated in Rust due to the computational cost of solving incomplete pairwise ratio comparison matrices (a sketch of this write path follows the table below).
CREATE TABLE computed_agent_scores (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
agent_id UUID NOT NULL REFERENCES agents(id),
entity_id UUID NOT NULL REFERENCES entities(id),
attribute_id UUID NOT NULL REFERENCES attributes(id),
value DOUBLE PRECISION NOT NULL,
total_weight DOUBLE PRECISION NOT NULL CHECK (total_weight > 0),
publicity privacy_level NOT NULL DEFAULT 'self'::privacy_level,
last_updated TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP
);
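A hedged sketch (not the shipped implementation) of the intended write path: rows from binary_ratio_comparisons are mapped onto solver indices, solved with the optpcm solver below, and the resulting per-entity values are what would land in computed_agent_scores. The row struct and helper here are illustrative stand-ins, not a real data-access layer.

```rust
use std::collections::HashMap;
use exopriors::optpcm::*;

/// Illustrative stand-in for a fetched binary_ratio_comparisons row.
struct ComparisonRow {
    entity_a: String,
    entity_b: String,
    ratio_estimate: f64,
    weight: f64,
}

/// Solve one (agent, attribute) slice of comparisons into per-entity values.
fn score_entities(rows: &[ComparisonRow]) -> Result<Vec<(String, f64)>, Error> {
    // Assign a dense solver index to every entity UUID seen in the comparisons.
    let mut index: HashMap<&str, usize> = HashMap::new();
    for r in rows {
        let next = index.len();
        index.entry(r.entity_a.as_str()).or_insert(next);
        let next = index.len();
        index.entry(r.entity_b.as_str()).or_insert(next);
    }
    let mut solver = Solver::new(index.len());
    for r in rows {
        solver.add_edge(
            index[r.entity_a.as_str()],
            index[r.entity_b.as_str()],
            r.ratio_estimate,
            r.weight,
        )?;
    }
    let sol = solver.solve()?;
    // These become the `value` column; `total_weight` bookkeeping is omitted here.
    Ok(index
        .into_iter()
        .map(|(id, i)| (id.to_string(), sol.weights[i]))
        .collect())
}
```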
-- Represent a custom list of (attribute_id, weight) pairings as a JSONB array of two-element arrays:
-- Example: '[["uuid1", 0.5], ["uuid2", 1.2]]'
CREATE TABLE linear_combination_goals (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
agent_id UUID NOT NULL REFERENCES agents(id),
name VARCHAR(255) NOT NULL,
description TEXT NOT NULL,
attribute_weights JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
-- Ensure attribute_weights is an array of arrays, each with two elements
CONSTRAINT attribute_weights_is_array CHECK (jsonb_typeof(attribute_weights) = 'array'),
-- Ensure unique names per agent
CONSTRAINT unique_goal_name_per_agent UNIQUE (agent_id, name)
);
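And a small sketch of how a goal's attribute_weights JSONB could be combined with an entity's per-attribute computed scores in application code (serde_json is an assumed dependency; the function name is illustrative):

```rust
use std::collections::HashMap;

/// Weighted sum of an entity's attribute scores according to one linear_combination_goal.
fn goal_score(
    attribute_weights_json: &str,     // e.g. r#"[["uuid1", 0.5], ["uuid2", 1.2]]"#
    scores: &HashMap<String, f64>,    // attribute_id -> computed value for one entity
) -> Result<f64, serde_json::Error> {
    // The JSONB column is an array of [attribute_id, weight] pairs.
    let weights: Vec<(String, f64)> = serde_json::from_str(attribute_weights_json)?;
    // Attributes without a computed score simply contribute nothing.
    Ok(weights
        .iter()
        .map(|(attr_id, w)| w * scores.get(attr_id).copied().unwrap_or(0.0))
        .sum())
}
```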
-- TODO eventually: attribute embeddings table to help LLM agents find relevant attributes
/* too complex to add for now
CREATE TYPE elicitation_type AS ENUM ('ratio','real_value','percentile');
CREATE TABLE single_elicitations (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
rater_id UUID NOT NULL REFERENCES agents(id),
attribute_id UUID NOT NULL REFERENCES attributes(id),
entity_id UUID NOT NULL REFERENCES entities(id),
elicitation_type elicitation_type NOT NULL,
value DOUBLE PRECISION NOT NULL,
weight DOUBLE PRECISION NOT NULL DEFAULT 1.0 CHECK (weight >= 0 AND weight <= 1),
privacy_type privacy_level NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT percentile_range CHECK (
(elicitation_type <> 'percentile') OR (value >= 0 AND value <= 100)
),
CONSTRAINT ratio_not_allowed_in_single CHECK (
elicitation_type <> 'ratio'
),
CONSTRAINT unique_single_judgment UNIQUE (rater_id, attribute_id, entity_id, elicitation_type)
); */
-- ratio-of-absolute-differences
-- |u(A)−u(B)|=x⋅|u(C)−u(D)|.
-- “The gap between A and B feels about 3× the gap between C and D.” B may equal C, allowing trinary comparisons. E.g. the difference in fun between a Ferrari and a Koenigsegg is about 30% of that between a Ferrari and a Charger.
-- These add complexity, but differences can be more intuitive than direct ratios.
/* CREATE TABLE quaternary_comparisons (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
rater_id UUID NOT NULL REFERENCES agents(id),
attribute_id UUID NOT NULL REFERENCES attributes(id),
entity_a_id UUID NOT NULL REFERENCES entities(id),
entity_b_id UUID NOT NULL REFERENCES entities(id),
entity_c_id UUID NOT NULL REFERENCES entities(id),
entity_d_id UUID NOT NULL REFERENCES entities(id),
ratio_estimate DOUBLE PRECISION NOT NULL CHECK (ratio_estimate > 0),
weight DOUBLE PRECISION NOT NULL DEFAULT 1.0 CHECK (weight >= 0 AND weight <= 1),
privacy_type privacy_level NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT different_entities_pair1 CHECK (entity_a_id != entity_b_id),
CONSTRAINT different_entities_pair2 CHECK (entity_c_id != entity_d_id),
-- this allows trinary comparisons, so e.g. B can be C.
CONSTRAINT different_pairs CHECK (
(entity_a_id != entity_c_id OR entity_b_id != entity_d_id) AND
(entity_a_id != entity_d_id OR entity_b_id != entity_c_id)
)
); */
//! optpcm — Optimal Incomplete Pairwise Comparison Solver
//!
//! utilities for solving **weighted, possibly incomplete pairwise ratio matrices**
//! with principled optimization, diagnostics, completion, and query selection.
//!
//! ## Design goals
//! - **Clear math**: log‑least‑squares on edges, with optional Huber robustness and Gaussian node priors.
//! - **Two engines** (mutually exclusive):
//! - `ExactDense` (Cholesky/KKT) for small/medium n.
//! - `IterativeCG` (sparse mat‑vec) for large/sparse graphs with warm starts.
//! - **Deterministic & validated** errors instead of panics (except in `debug_assert!`).
//! - **Configurable accuracy**: user chooses tolerance / iteration caps / mode presets.
//! - **Consistent CR**: single, selectable Random Index table.
//! - **Recommenders**: degree‑impact (cheap) and, planned, effective‑resistance (scalable CG approx; currently falls back to degree‑impact).
//! - **“Node coordinates”**: we treat `x = log w` as the canonical 1‑D coordinate; completion is `w[i]/w[j]`.
//!
//! ## Math summary
//! We estimate `x ∈ R^n` (log‑weights) minimizing the convex objective
//! ```text
//! J(x) = Σ_{(i,j)∈E} w_ij * ρ( (x_i - x_j) - y_ij ) + Σ_i τ_i * (x_i - μ_i)^2 + ε * ||x||^2
//! ```
//! where `y_ij = log(ratio_ij)`, `w_ij ≥ 0` are edge confidences, `(μ_i, τ_i)` are node priors, `ε` is tiny ridge.
//! `ρ` is L2 or Huber (piecewise L2/L1). Gauge is fixed either by `x_0 = 0` or `Σ x_i = 0` (selectable).
//! We then map to positive weights `w_i = exp(x_i)` and normalize `Σ w_i = 1`.
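//! In the pure L2 case, setting the gradient of `J` to zero yields a single linear system whose
//! matrix is the weighted graph Laplacian plus the prior precisions and the ridge:
//! ```text
//! (L_w + diag(τ) + ε·I) x = r,   r_i = Σ_{(i,j)∈E} w_ij·y_ij − Σ_{(j,i)∈E} w_ji·y_ji + τ_i·μ_i
//! ```
//! which is what `build_normal_eq_dense` assembles; the Huber case solves a short sequence of
//! such re-weighted systems (IRLS).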
//!
//! ## Quick start
//! ```rust
//! use exopriors::optpcm::*;
//! let mut s = Solver::new(4);
//! s.add_edge(0, 1, 2.0, 1.0).unwrap();
//! s.add_edge(0, 2, 4.0, 1.0).unwrap();
//! s.add_edge(0, 3, 8.0, 1.0).unwrap();
//! s.add_edge(1, 2, 2.0, 1.0).unwrap();
//! s.add_edge(2, 3, 2.0, 1.0).unwrap();
//! s.options_mut().engine = SolveEngine::IterativeCG(Default::default());
//! s.options_mut().robustness = Robustness::Huber { delta: 1.0, irls_iters: 5, irls_tol: 1e-10 };
//! s.options_mut().gauge = Gauge::ZeroMean; // Try a different gauge
//! let sol = s.solve().unwrap();
//! assert!((sol.completed[(1, 3)] - 4.0).abs() < 0.2);
//! println!("CR={:.3}, Koczkodaj={:.3}", sol.saaty_cr, sol.koczkodaj);
//! ```
use std::collections::HashMap;
// ===============================================================================================
// ===== Public API: Structs, Enums, and Errors
// ===============================================================================================
/// Represents a single pairwise comparison edge `i -> j` with a given ratio and confidence weight.
#[derive(Clone, Copy, Debug)]
pub struct Edge {
pub i: usize,
pub j: usize,
pub ratio: f64,
pub weight: f64,
}
/// A Gaussian prior on a node's log-weight, specified by mean `mu` and precision `τ`.
/// This is equivalent to adding a "soft" constraint `x_i ≈ mu` with confidence `τ`.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct NodePrior {
/// The expected log-weight, `log(weight)`.
pub mu: f64,
/// The precision (inverse variance) of the prior. A higher value means more confidence.
pub precision: f64,
}
/// The gauge-fixing strategy to ensure a unique solution.
/// The log-weights `x_i` are only defined up to a constant shift (`x_i + c`).
/// A gauge fix removes this ambiguity.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Gauge {
/// Pins the first log-weight to zero: `x_0 = 0`.
PinX0,
/// Enforces a zero-mean constraint on all log-weights: `Σ x_i = 0`.
ZeroMean,
}
impl Default for Gauge {
fn default() -> Self {
Gauge::PinX0
}
}
/// The table of Random Index (RI) values used for calculating Saaty's Consistency Ratio (CR).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RiTable {
SaatyUpTo13Then158,
SaatyUpTo10Then151,
}
impl Default for RiTable {
fn default() -> Self {
RiTable::SaatyUpTo13Then158
}
}
/// The loss function used for measuring residuals, allowing for robust estimation.
#[derive(Clone, Debug, PartialEq)]
pub enum Robustness {
/// Standard L2 (least-squares) loss. Optimal for Gaussian noise but sensitive to outliers.
L2,
/// Huber loss, which behaves like L2 for small errors and L1 for large errors.
Huber {
delta: f64,
irls_iters: usize,
irls_tol: f64,
},
}
impl Default for Robustness {
fn default() -> Self {
Robustness::L2
}
}
/// Parameters for the iterative (Conjugate Gradient) solver.
#[derive(Clone, Debug, PartialEq)]
pub struct IterativeParams {
pub cg_tol: f64,
pub cg_iters: usize,
pub ridge: f64,
}
impl Default for IterativeParams {
fn default() -> Self {
Self {
cg_tol: 1e-8,
cg_iters: 10_000,
ridge: 1e-8,
}
}
}
/// Parameters for the exact (Cholesky decomposition) solver.
#[derive(Clone, Debug, PartialEq)]
pub struct DenseParams {
pub pivot_eps: f64,
pub ridge: f64,
}
impl Default for DenseParams {
fn default() -> Self {
Self {
pivot_eps: 1e-10,
ridge: 1e-12,
}
}
}
/// The core computational engine for solving the optimization problem.
#[derive(Clone, Debug, PartialEq)]
pub enum SolveEngine {
/// A direct solver using Cholesky decomposition. Best for small, dense problems (n < ~1000).
ExactDense(DenseParams),
/// An iterative solver using Conjugate Gradient. Best for large, sparse problems.
IterativeCG(IterativeParams),
}
impl Default for SolveEngine {
fn default() -> Self {
SolveEngine::IterativeCG(Default::default())
}
}
/// Pre-configured accuracy levels for a simpler user experience.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Preset {
Fast,
Balanced,
Precise,
}
/// Specifies the desired accuracy for the solver.
#[derive(Clone, Debug, PartialEq)]
pub enum Accuracy {
/// Use a pre-configured set of tolerances and iteration caps.
Preset(Preset),
/// Manually specify all accuracy-related parameters.
Manual {
cg_tol: f64,
cg_iters: usize,
pivot_eps: f64,
ridge: f64,
},
}
impl Default for Accuracy {
fn default() -> Self {
Accuracy::Preset(Preset::Balanced)
}
}
/// Strategy for suggesting which missing comparison would be most informative to add next.
#[derive(Clone, Debug, PartialEq)]
pub enum Suggester {
/// A cheap heuristic based on node degrees. It prioritizes connecting low-degree nodes.
Impact,
/// A more principled (but expensive) method based on effective resistance.
/// Note: Not yet implemented; falls back to `Impact`.
EffectiveResistance { max_candidates: usize },
}
impl Default for Suggester {
fn default() -> Self {
Suggester::EffectiveResistance {
max_candidates: 2000,
}
}
}
/// A collection of all configurable options for the solver.
#[derive(Clone, Debug)]
pub struct Options {
pub engine: SolveEngine,
pub gauge: Gauge,
pub robustness: Robustness,
pub accuracy: Accuracy,
pub ri_table: RiTable,
pub suggester: Suggester,
}
impl Default for Options {
fn default() -> Self {
Self {
engine: Default::default(),
gauge: Default::default(),
robustness: Default::default(),
accuracy: Default::default(),
ri_table: Default::default(),
suggester: Default::default(),
}
}
}
/// An error that can occur during the solving process.
#[derive(Debug, PartialEq)]
pub enum Error {
TooSmall,
BadIndex(usize, usize, usize),
BadRatio,
Disconnected,
Linear,
NoEdges,
}
/// Diagnostic information about the last solve operation.
#[derive(Clone, Debug)]
pub struct Diagnostics {
pub n: usize,
pub m_edges: usize,
pub connected: bool,
pub avg_degree: f64,
pub huber_used: bool,
pub iters_last: usize,
pub converged: Option<bool>,
pub ri_used: f64,
pub fallbacks_used: Vec<String>,
}
/// The complete solution from a successful solve operation.
#[derive(Clone, Debug)]
pub struct Solution {
pub weights: Vec<f64>,
pub log_weights: Vec<f64>,
pub completed: Matrix,
pub saaty_cr: f64,
pub koczkodaj: f64,
pub diagnostics: Diagnostics,
}
/// A simple dense row-major matrix.
#[derive(Clone, Debug, Default)]
pub struct Matrix {
n: usize,
data: Vec<f64>,
}
impl Matrix {
/// Creates a new `n x n` matrix initialized to zero.
pub fn new(n: usize) -> Self {
Self {
n,
data: vec![0.0; n * n],
}
}
/// Returns the dimension `n` of the matrix.
pub fn size(&self) -> usize {
self.n
}
}
impl std::ops::Index<(usize, usize)> for Matrix {
type Output = f64;
#[inline]
fn index(&self, ij: (usize, usize)) -> &Self::Output {
&self.data[ij.0 * self.n + ij.1]
}
}
impl std::ops::IndexMut<(usize, usize)> for Matrix {
#[inline]
fn index_mut(&mut self, ij: (usize, usize)) -> &mut Self::Output {
&mut self.data[ij.0 * self.n + ij.1]
}
}
// ===============================================================================================
// ===== Solver Definition & API
// ===============================================================================================
/// A more descriptive internal type for edge data storage.
#[derive(Clone, Copy, Debug)]
struct EdgeData {
log_ratio: f64,
weight: f64,
}
/// The main solver object that holds the problem definition and configuration.
#[derive(Clone, Debug)]
pub struct Solver {
n: usize,
edges: HashMap<(usize, usize), EdgeData>,
priors: Vec<NodePrior>,
x_ws: Vec<f64>, // Warm start log-weights
opts: Options,
}
impl Solver {
/// Creates a new solver for a problem with `n` items.
pub fn new(n: usize) -> Self {
Self {
n,
edges: HashMap::new(),
priors: vec![NodePrior::default(); n],
x_ws: vec![0.0; n],
opts: Default::default(),
}
}
/// Returns a mutable reference to the solver's options for configuration.
pub fn options_mut(&mut self) -> &mut Options {
&mut self.opts
}
/// Adds a pairwise comparison edge `i -> j` with `ratio = w_i / w_j`.
/// If the same edge is added multiple times, the entries are merged by a weighted average of their log-ratios.
pub fn add_edge(&mut self, i: usize, j: usize, ratio: f64, weight: f64) -> Result<(), Error> {
if i >= self.n || j >= self.n || i == j {
return Err(Error::BadIndex(i, j, self.n));
}
if !ratio.is_finite() || ratio <= 0.0 {
return Err(Error::BadRatio);
}
// Convert to log-space and canonical order (i < j).
let (a, b, log_ratio) = if i < j {
(i, j, ratio.ln())
} else {
(j, i, -ratio.ln())
};
let new_weight = weight.max(1e-12);
self.edges
.entry((a, b))
.and_modify(|edge| {
// Perform a weighted average of log-ratios if the edge already exists.
let new_total_weight = edge.weight + new_weight;
edge.log_ratio =
(edge.log_ratio * edge.weight + log_ratio * new_weight) / new_total_weight;
edge.weight = new_total_weight;
})
.or_insert(EdgeData {
log_ratio,
weight: new_weight,
});
Ok(())
}
/// Adds a prior belief about an item's weight.
pub fn add_prior(
&mut self,
i: usize,
mu_weight_ratio: f64,
precision: f64,
) -> Result<(), Error> {
if i >= self.n {
return Err(Error::BadIndex(i, i, self.n));
}
if !mu_weight_ratio.is_finite() || mu_weight_ratio <= 0.0 {
return Err(Error::BadRatio);
}
// Priors are defined in log-space.
self.priors[i] = NodePrior {
mu: mu_weight_ratio.ln(),
precision: precision.max(0.0),
};
Ok(())
}
/// Solves the pairwise comparison problem.
/// This method orchestrates the main three-stage pipeline: prepare, solve, and package.
/// This structure separates validation, computation, and post-processing for clarity.
pub fn solve(&mut self) -> Result<Solution, Error> {
// STAGE 1: Validate inputs and prepare all data for the numerical solver.
let inputs = self.prepare_solve_data()?;
// STAGE 2: Run the core numerical optimization routine. This can fail for singular matrices.
let core_solution = self.run_core_solver(&inputs)?;
// Store the raw result as a warm start for subsequent solves, improving performance.
self.x_ws = core_solution.x.clone();
// STAGE 3: Post-process the raw result into the final user-facing Solution struct.
let solution = self.package_solution(core_solution, &inputs);
Ok(solution)
}
/// Suggests the top `k` most informative missing edges to add next.
pub fn suggest(&self, k: usize) -> (Vec<(usize, usize, f64)>, Vec<String>) {
// Report the suggester fallback even when no suggestions are requested (k == 0),
// so that `Diagnostics::fallbacks_used` stays accurate.
let fallbacks = match self.opts.suggester {
Suggester::Impact => vec![],
Suggester::EffectiveResistance { .. } => vec![
"EffectiveResistance suggester is not implemented, falling back to Impact.".to_string(),
],
};
if k == 0 {
return (vec![], fallbacks);
}
let missing = missing_edges(self.n, &self.edges);
if missing.is_empty() {
return (vec![], fallbacks);
}
let suggestions = suggest_impact(self.n, &missing, &self.edges);
(suggestions.into_iter().take(k).collect(), fallbacks)
}
}
// ===============================================================================================
// ===== Internal Solver Logic: A Functional-Style Pipeline
// ===============================================================================================
/// A container for all the data needed by the core numerical solver.
struct SolveInputs {
effective_engine: SolveEngine,
m_edges: usize,
connected: bool,
avg_degree: f64,
}
/// The raw output from the core numerical solver.
struct CoreSolution {
/// The raw log-weights vector.
x: Vec<f64>,
/// The number of iterations the solver took.
iters: usize,
/// Whether the iterative solver converged to its tolerance.
converged: Option<bool>,
}
impl Solver {
/// STAGE 1: Gathers inputs, performs validation, and calculates initial diagnostics.
/// This function is pure; it reads the solver state and returns data or an error.
fn prepare_solve_data(&self) -> Result<SolveInputs, Error> {
if self.n < 2 {
return Err(Error::TooSmall);
}
if self.edges.is_empty() {
return Err(Error::NoEdges);
}
let connected = is_connected(self.n, &self.edges);
if !connected && !has_anchoring_priors(&self.priors) {
return Err(Error::Disconnected);
}
let m_edges = self.edges.len();
let avg_degree = 2.0 * (m_edges as f64) / (self.n as f64);
// Resolve `Accuracy` settings into concrete engine parameters.
let effective_engine = self.get_effective_engine();
Ok(SolveInputs {
effective_engine,
m_edges,
connected,
avg_degree,
})
}
/// STAGE 2: The computational heart. Dispatches to the correct numerical method based on options.
fn run_core_solver(&self, inputs: &SolveInputs) -> Result<CoreSolution, Error> {
let (x, iters, converged) = match (&inputs.effective_engine, &self.opts.robustness) {
(SolveEngine::ExactDense(dp), Robustness::L2) => {
(self.solve_dense_l2(dp)?, 1, Some(true))
}
(
SolveEngine::ExactDense(dp),
Robustness::Huber {
delta,
irls_iters,
irls_tol,
},
) => {
let (x, it, conv) = self.solve_dense_huber(*delta, *irls_iters, *irls_tol, dp)?;
(x, it, Some(conv))
}
(SolveEngine::IterativeCG(ip), Robustness::L2) => {
let (x, it, conv) = self.solve_iterative_l2(ip);
(x, it, Some(conv))
}
(
SolveEngine::IterativeCG(ip),
Robustness::Huber {
delta,
irls_iters,
irls_tol,
},
) => {
let (x, it, conv) = self.solve_iterative_huber(*delta, *irls_iters, *irls_tol, ip);
(x, it, Some(conv))
}
};
Ok(CoreSolution {
x,
iters,
converged,
})
}
/// STAGE 3: Transforms the raw numerical result into the final `Solution`.
fn package_solution(&self, core_solution: CoreSolution, inputs: &SolveInputs) -> Solution {
let SolutionPackageInputs {
log_weights,
weights,
completed,
} = self.post_process(&core_solution.x);
// Check for suggester fallbacks even if not requesting suggestions.
let (_suggestions, fallbacks) = self.suggest(0);
let ri_used = random_index(self.n, self.opts.ri_table);
let diagnostics = Diagnostics {
n: self.n,
m_edges: inputs.m_edges,
connected: inputs.connected,
avg_degree: inputs.avg_degree,
huber_used: matches!(self.opts.robustness, Robustness::Huber { .. }),
iters_last: core_solution.iters,
converged: core_solution.converged,
ri_used,
fallbacks_used: fallbacks,
};
// Consistency metrics are computed against the observed ratios (the completed
// matrix is perfectly consistent by construction); entries without an observed
// comparison are filled from the completed matrix.
let mut observed = completed.clone();
for (&(i, j), edge) in &self.edges {
observed[(i, j)] = edge.log_ratio.exp();
observed[(j, i)] = (-edge.log_ratio).exp();
}
let saaty_cr = saaty_cr(&observed, &weights, ri_used);
let koczkodaj = if self.n <= 200 {
koczkodaj_exact(&observed)
} else {
koczkodaj_sampled(&observed, 20_000)
};
Solution {
weights,
log_weights,
completed,
saaty_cr,
koczkodaj,
diagnostics,
}
}
// --- Core Solver Implementations ---
/// Solves the standard L2 (least-squares) problem using a dense Cholesky/KKT solver.
fn solve_dense_l2(&self, dp: &DenseParams) -> Result<Vec<f64>, Error> {
let (l, b) = build_normal_eq_dense(self.n, &self.edges, &self.priors, dp.ridge);
let x = solve_spd_reduced(&l, &b, self.opts.gauge, dp.pivot_eps).ok_or(Error::Linear)?;
Ok(project_gauge(x, self.opts.gauge))
}
/// Solves the robust Huber problem using Iteratively Reweighted Least Squares (IRLS) with a dense solver.
fn solve_dense_huber(
&self,
delta: f64,
iters: usize,
tol: f64,
dp: &DenseParams,
) -> Result<(Vec<f64>, usize, bool), Error> {
let mut x = self.solve_dense_l2(dp)?;
for t in 0..iters {
let wscale = huber_scales(&x, &self.edges, delta);
let (l, b) =
build_normal_eq_dense_scaled(self.n, &self.edges, &self.priors, &wscale, dp.ridge);
let x_new =
solve_spd_reduced(&l, &b, self.opts.gauge, dp.pivot_eps).ok_or(Error::Linear)?;
// Check for convergence of the outer IRLS loop.
if step_inf(&x_new, &x) < tol {
return Ok((project_gauge(x_new, self.opts.gauge), t + 1, true));
}
x = x_new;
}
Ok((project_gauge(x, self.opts.gauge), iters, false))
}
/// Solves the standard L2 problem using the iterative Conjugate Gradient (CG) method.
fn solve_iterative_l2(&self, ip: &IterativeParams) -> (Vec<f64>, usize, bool) {
let sys = NormalEq::new(self.n, &self.edges, &self.priors, ip.ridge, self.opts.gauge);
let (mut x, it, conv) = cg_solve(&sys, self.x_ws.clone(), ip.cg_iters, ip.cg_tol);
x = project_gauge(x, self.opts.gauge);
(x, it, conv)
}
/// Solves the robust Huber problem using IRLS with the iterative CG solver.
fn solve_iterative_huber(
&self,
delta: f64,
iters: usize,
tol: f64,
ip: &IterativeParams,
) -> (Vec<f64>, usize, bool) {
let mut x = self.x_ws.clone(); // Use warm start as initial guess.
let mut total_cg_iters = 0;
let mut converged = false;
for t in 0..iters {
let scales = huber_scales(&x, &self.edges, delta);
let sys = NormalEq::new_scaled(
self.n,
&self.edges,
&self.priors,
&scales,
ip.ridge,
self.opts.gauge,
);
let (x_new, iters_cg, _) = cg_solve(&sys, x.clone(), ip.cg_iters, ip.cg_tol);
total_cg_iters += iters_cg;
// Check for convergence of the outer IRLS loop.
if t > 0 && step_inf(&x_new, &x) < tol {
converged = true;
x = x_new;
break;
}
x = x_new;
}
(project_gauge(x, self.opts.gauge), total_cg_iters, converged)
}
}
/// Helper struct for passing around post-processed solution components.
struct SolutionPackageInputs {
log_weights: Vec<f64>,
weights: Vec<f64>,
completed: Matrix,
}
impl Solver {
/// Helper to resolve the final engine parameters based on `Accuracy` settings.
fn get_effective_engine(&self) -> SolveEngine {
let (mut dense_params, mut iterative_params) = match &self.opts.engine {
SolveEngine::ExactDense(dp) => (dp.clone(), Default::default()),
SolveEngine::IterativeCG(ip) => (Default::default(), ip.clone()),
};
let (cg_tol, cg_iters, pivot_eps, ridge) = match self.opts.accuracy {
Accuracy::Preset(preset) => match preset {
Preset::Fast => (1e-4, 500, 1e-9, 1e-10),
Preset::Balanced => (1e-8, 10_000, 1e-10, 1e-12),
Preset::Precise => (1e-12, 50_000, 1e-12, 1e-14),
},
Accuracy::Manual {
cg_tol,
cg_iters,
pivot_eps,
ridge,
} => (cg_tol, cg_iters, pivot_eps, ridge),
};
iterative_params.cg_tol = cg_tol;
iterative_params.cg_iters = cg_iters;
iterative_params.ridge = ridge;
dense_params.pivot_eps = pivot_eps;
dense_params.ridge = ridge;
match self.opts.engine {
SolveEngine::ExactDense(_) => SolveEngine::ExactDense(dense_params),
SolveEngine::IterativeCG(_) => SolveEngine::IterativeCG(iterative_params),
}
}
/// Pure function to post-process the raw log-weights `x`.
fn post_process(&self, x: &[f64]) -> SolutionPackageInputs {
let weights = to_probability(x);
let completed = completed_from_weights(&weights);
SolutionPackageInputs {
log_weights: x.to_vec(),
weights,
completed,
}
}
}
// ===============================================================================================
// ===== Utilities: Graph, Math, Consistency Metrics
// ===============================================================================================
/// Checks if a graph is connected using a simple Depth First Search (DFS).
fn is_connected(n: usize, edges: &HashMap<(usize, usize), EdgeData>) -> bool {
if n < 2 {
return true;
}
let mut g = vec![vec![]; n];
for (&(a, b), _) in edges {
g[a].push(b);
g[b].push(a);
}
let mut visited = vec![false; n];
let mut stack = vec![0];
visited[0] = true;
let mut count = 1;
while let Some(u) = stack.pop() {
for &v in &g[u] {
if !visited[v] {
visited[v] = true;
stack.push(v);
count += 1;
}
}
}
count == n
}
/// Checks if there is at least one node with a non-zero precision prior.
fn has_anchoring_priors(pr: &[NodePrior]) -> bool {
pr.iter().any(|p| p.precision > 0.0)
}
/// Converts log-weights `x` to a probability distribution `w` that sums to 1.
fn to_probability(x: &[f64]) -> Vec<f64> {
let max_x = x.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
let mut exps: Vec<f64> = x.iter().map(|&xi| (xi - max_x).exp()).collect();
let sum_exps: f64 = exps.iter().sum();
if sum_exps == 0.0 {
return vec![1.0 / (x.len() as f64); x.len()];
}
exps.iter_mut().for_each(|v| *v /= sum_exps);
exps
}
/// Creates a complete `n x n` pairwise comparison matrix from a vector of weights.
fn completed_from_weights(w: &[f64]) -> Matrix {
let n = w.len();
let mut m = Matrix::new(n);
for i in 0..n {
for j in 0..n {
m[(i, j)] = w[i] / w[j].max(1e-18);
}
}
m
}
/// Calculates Saaty's Consistency Ratio (CR).
fn saaty_cr(m: &Matrix, w: &[f64], ri: f64) -> f64 {
let n = m.size();
if n < 3 {
return 0.0;
}
let lambda_max: f64 = (0..n)
.map(|i| (0..n).map(|j| m[(i, j)] * w[j]).sum::<f64>() / w[i].max(1e-18))
.sum::<f64>()
/ n as f64;
let ci = (lambda_max - n as f64) / ((n - 1) as f64);
if ri > 0.0 {
ci / ri
} else {
0.0
}
}
/// Provides the Random Index (RI) value for a given `n`.
fn random_index(n: usize, table: RiTable) -> f64 {
match table {
RiTable::SaatyUpTo13Then158 => match n {
1 | 2 => 0.0,
3 => 0.58,
4 => 0.90,
5 => 1.12,
6 => 1.24,
7 => 1.32,
8 => 1.41,
9 => 1.45,
10 => 1.49,
11 => 1.51,
12 => 1.48,
13 => 1.56,
_ => 1.58,
},
RiTable::SaatyUpTo10Then151 => match n {
1 | 2 => 0.0,
3 => 0.58,
4 => 0.90,
5 => 1.12,
6 => 1.24,
7 => 1.32,
8 => 1.41,
9 => 1.45,
10 => 1.49,
_ => 1.51,
},
}
}
/// Calculates Koczkodaj's Inconsistency Index by checking all `(i,j,k)` triplets.
fn koczkodaj_exact(m: &Matrix) -> f64 {
let n = m.size();
(0..n)
.flat_map(|i| {
(i + 1..n).flat_map(move |j| {
(0..n).filter(move |&k| k != i && k != j).map(move |k| {
let a = m[(i, j)];
let b = m[(i, k)] * m[(k, j)];
if a > 0.0 && b > 0.0 {
1.0 - (a / b).min(b / a)
} else {
0.0
}
})
})
})
.fold(0.0, f64::max)
}
/// Approximates Koczkodaj's Index by random sampling, for large `n`.
fn koczkodaj_sampled(m: &Matrix, samples: usize) -> f64 {
let n = m.size();
if n < 3 {
return 0.0;
}
let mut rng_state: u64 = 0x5EC0DE1234ABCD;
let mut worst: f64 = 0.0;
let mut next_rand = || {
rng_state = rng_state.wrapping_mul(6364136223846793005).wrapping_add(1);
(rng_state >> 1) as usize
};
for _ in 0..samples {
let i = next_rand() % n;
let j = next_rand() % n;
let k = next_rand() % n;
if i == j || j == k || i == k {
continue;
}
let a = m[(i, j)];
let b = m[(i, k)] * m[(k, j)];
if a > 0.0 && b > 0.0 {
worst = worst.max(1.0 - (a / b).min(b / a));
}
}
worst
}
/// Builds the dense normal equation system `Lx = b`. `L` is the graph Laplacian.
fn build_normal_eq_dense(
n: usize,
edges: &HashMap<(usize, usize), EdgeData>,
priors: &[NodePrior],
ridge: f64,
) -> (Vec<f64>, Vec<f64>) {
let mut l = vec![0.0; n * n];
let mut b = vec![0.0; n];
edges.iter().for_each(|(&(i, j), edge)| {
let w = edge.weight.max(1e-12);
l[i * n + i] += w;
l[j * n + j] += w;
l[i * n + j] -= w;
l[j * n + i] -= w;
b[i] += w * edge.log_ratio;
b[j] -= w * edge.log_ratio;
});
priors.iter().enumerate().for_each(|(i, p)| {
if p.precision > 0.0 {
l[i * n + i] += p.precision;
b[i] += p.precision * p.mu;
}
});
(0..n).for_each(|i| l[i * n + i] += ridge);
(l, b)
}
/// Builds the dense normal equations for a re-weighted (Huber) problem.
fn build_normal_eq_dense_scaled(
n: usize,
edges: &HashMap<(usize, usize), EdgeData>,
priors: &[NodePrior],
scale: &HashMap<(usize, usize), f64>,
ridge: f64,
) -> (Vec<f64>, Vec<f64>) {
let mut l = vec![0.0; n * n];
let mut b = vec![0.0; n];
edges.iter().for_each(|(&(i, j), edge)| {
let s = scale.get(&(i, j)).copied().unwrap_or(1.0);
let w = (edge.weight * s).max(1e-12);
l[i * n + i] += w;
l[j * n + j] += w;
l[i * n + j] -= w;
l[j * n + i] -= w;
b[i] += w * edge.log_ratio;
b[j] -= w * edge.log_ratio;
});
priors.iter().enumerate().for_each(|(i, p)| {
if p.precision > 0.0 {
l[i * n + i] += p.precision;
b[i] += p.precision * p.mu;
}
});
(0..n).for_each(|i| l[i * n + i] += ridge);
(l, b)
}
/// Solves a symmetric positive (semi-)definite system `Lx=b` by applying a gauge fix and using an appropriate solver.
fn solve_spd_reduced(l: &[f64], b: &[f64], gauge: Gauge, pivot_eps: f64) -> Option<Vec<f64>> {
match gauge {
Gauge::PinX0 => {
let n = b.len();
let k = n - 1; // Size of the reduced system
let mut a = vec![0.0; k * k];
let mut rhs = vec![0.0; k];
// Extract the submatrix L_11 and subvector b_1
for r in 1..n {
rhs[r - 1] = b[r];
for c in 1..n {
a[(r - 1) * k + (c - 1)] = l[r * n + c];
}
}
let xr = chol_solve(&a, &rhs, k, pivot_eps)?;
let mut x = vec![0.0; n];
x[1..].copy_from_slice(&xr);
Some(x)
}
Gauge::ZeroMean => solve_dense_kkt(l, b),
}
}
/// A standard Cholesky solver (`Ax=b` where `A` is SPD).
fn chol_solve(a: &[f64], b: &[f64], n: usize, pivot_eps: f64) -> Option<Vec<f64>> {
let mut l = a.to_vec();
// Decompose A into L*L^T
for i in 0..n {
for j in 0..i {
let s: f64 = (0..j).map(|k| l[i * n + k] * l[j * n + k]).sum();
l[i * n + j] = (l[i * n + j] - s) / l[j * n + j];
}
let s: f64 = (0..i).map(|k| l[i * n + k].powi(2)).sum();
let d = l[i * n + i] - s;
if d < pivot_eps || !d.is_finite() {
return None;
} // Not positive definite
l[i * n + i] = d.sqrt();
}
// Solve Ly = b (forward substitution)
let mut y = vec![0.0; n];
for i in 0..n {
let s: f64 = (0..i).map(|k| l[i * n + k] * y[k]).sum();
y[i] = (b[i] - s) / l[i * n + i];
}
// Solve L^T x = y (backward substitution)
let mut x = vec![0.0; n];
for i in (0..n).rev() {
let s: f64 = ((i + 1)..n).map(|k| l[k * n + i] * x[k]).sum();
x[i] = (y[i] - s) / l[i * n + i];
}
Some(x)
}
/// Solves `Lx=b` with constraint `1^T x = 0` using a KKT system.
fn solve_dense_kkt(l: &[f64], b: &[f64]) -> Option<Vec<f64>> {
let n = b.len();
let m = n + 1;
let mut a = vec![0.0; m * m];
let mut rhs = vec![0.0; m];
// Build the augmented KKT matrix: [L 1; 1^T 0]
for r in 0..n {
rhs[r] = b[r];
for c in 0..n {
a[r * m + c] = l[r * n + c];
}
a[r * m + n] = 1.0;
a[n * m + r] = 1.0;
}
// Simple Gaussian elimination with partial pivoting
for i in 0..m {
let mut max_row = i;
for k in i + 1..m {
if a[k * m + i].abs() > a[max_row * m + i].abs() {
max_row = k;
}
}
// swap entire rows i and max_row in-place (element-wise swap to satisfy borrow checker)
if i != max_row {
for j in 0..m {
a.swap(i * m + j, max_row * m + j);
}
}
rhs.swap(i, max_row);
if a[i * m + i].abs() < 1e-12 {
return None;
} // Singular
for k in i + 1..m {
let f = a[k * m + i] / a[i * m + i];
for j in i + 1..m {
a[k * m + j] -= a[i * m + j] * f;
}
rhs[k] -= rhs[i] * f;
}
}
// Back substitution
let mut x = vec![0.0; m];
for i in (0..m).rev() {
let sum: f64 = (i + 1..m).map(|j| a[i * m + j] * x[j]).sum();
x[i] = (rhs[i] - sum) / a[i * m + i];
}
x.truncate(n);
Some(x)
}
/// A "matrix-free" representation of the normal equations `Lx = b`.
struct NormalEq {
n: usize,
edges: Vec<(usize, usize, f64, f64)>,
priors: Vec<NodePrior>,
ridge: f64,
gauge: Gauge,
}
impl NormalEq {
fn new(
n: usize,
edges: &HashMap<(usize, usize), EdgeData>,
priors: &[NodePrior],
ridge: f64,
gauge: Gauge,
) -> Self {
let e = edges
.iter()
.map(|(&(a, b), d)| (a, b, d.log_ratio, d.weight.max(1e-12)))
.collect();
Self {
n,
edges: e,
priors: priors.to_vec(),
ridge,
gauge,
}
}
fn new_scaled(
n: usize,
edges: &HashMap<(usize, usize), EdgeData>,
priors: &[NodePrior],
scale: &HashMap<(usize, usize), f64>,
ridge: f64,
gauge: Gauge,
) -> Self {
let e = edges
.iter()
.map(|(&(a, b), d)| {
let s = scale.get(&(a, b)).copied().unwrap_or(1.0);
(a, b, d.log_ratio, (d.weight * s).max(1e-12))
})
.collect();
Self {
n,
edges: e,
priors: priors.to_vec(),
ridge,
gauge,
}
}
fn matvec(&self, x: &[f64], y: &mut [f64]) {
y.fill(0.0);
for &(i, j, _, w) in &self.edges {
let d = x[i] - x[j];
y[i] += w * d;
y[j] -= w * d;
}
for i in 0..self.n {
y[i] += (self.priors[i].precision + self.ridge) * x[i];
}
// For PinX0 gauge, the operator itself is modified to be non-singular.
if let Gauge::PinX0 = self.gauge {
y[0] = x[0];
}
}
fn rhs(&self) -> Vec<f64> {
let mut b = vec![0.0; self.n];
for &(i, j, y, w) in &self.edges {
b[i] += w * y;
b[j] -= w * y;
}
for i in 0..self.n {
b[i] += self.priors[i].precision * self.priors[i].mu;
}
// Project the RHS onto the appropriate subspace to ensure consistency.
project_cg_vec(&mut b, self.gauge);
b
}
}
/// A standard Conjugate Gradient (CG) solver for `Ax=b`.
fn cg_solve(sys: &NormalEq, mut x: Vec<f64>, max_iter: usize, tol: f64) -> (Vec<f64>, usize, bool) {
let b = sys.rhs();
let n = x.len();
let mut r = vec![0.0; n];
let mut ax = vec![0.0; n];
project_cg_vec(&mut x, sys.gauge); // Ensure initial guess is in the correct subspace
sys.matvec(&x, &mut ax);
for i in 0..n {
r[i] = b[i] - ax[i];
}
project_cg_vec(&mut r, sys.gauge); // Project residual
let mut p = r.clone();
let mut rr_old = dot(&r, &r);
let b_norm = norm(&b).max(1e-16);
if (rr_old.sqrt() / b_norm) < tol {
return (x, 0, true);
}
let mut tmp = vec![0.0; n];
for it in 1..=max_iter {
sys.matvec(&p, &mut tmp);
// For ZeroMean, must project to stay in the subspace orthogonal to the nullspace.
// For PinX0, this has no effect on relevant components but is harmless.
project_cg_vec(&mut tmp, sys.gauge);
let denom = dot(&p, &tmp);
if denom.abs() < 1e-30 {
return (x, it, false);
} // Search direction is orthogonal to A*p
let alpha = rr_old / denom;
// Update solution and residual
for i in 0..n {
x[i] += alpha * p[i];
r[i] -= alpha * tmp[i];
}
project_cg_vec(&mut x, sys.gauge);
project_cg_vec(&mut r, sys.gauge);
let rr_new = dot(&r, &r);
if (rr_new.sqrt() / b_norm) < tol {
return (x, it, true);
}
// Update search direction
let beta = rr_new / rr_old;
rr_old = rr_new;
for i in 0..n {
p[i] = r[i] + beta * p[i];
}
project_cg_vec(&mut p, sys.gauge);
}
(x, max_iter, false)
}
/// Calculates the scaling factors for IRLS with Huber loss.
fn huber_scales(
x: &[f64],
edges: &HashMap<(usize, usize), EdgeData>,
delta: f64,
) -> HashMap<(usize, usize), f64> {
edges
.iter()
.map(|(&(a, b), edge)| {
let r = (x[a] - x[b]) - edge.log_ratio;
let v = if r.abs() <= delta {
1.0
} else {
(delta / r.abs()).max(1e-6)
};
((a, b), v)
})
.collect()
}
// ===== Math, Gauges, and Suggesters =====
/// Enforces the gauge condition on a vector. Essential for CG's correctness.
fn project_cg_vec(x: &mut [f64], gauge: Gauge) {
match gauge {
Gauge::PinX0 => x[0] = 0.0,
Gauge::ZeroMean => {
let mean = x.iter().sum::<f64>() / x.len() as f64;
x.iter_mut().for_each(|v| *v -= mean);
}
}
}
/// After solving, strictly enforce the gauge condition on the final solution vector.
fn project_gauge(mut x: Vec<f64>, gauge: Gauge) -> Vec<f64> {
project_cg_vec(&mut x, gauge);
x
}
#[inline]
fn dot(a: &[f64], b: &[f64]) -> f64 {
a.iter().zip(b).map(|(x, y)| x * y).sum()
}
#[inline]
fn norm(a: &[f64]) -> f64 {
dot(a, a).sqrt()
}
#[inline]
fn step_inf(a: &[f64], b: &[f64]) -> f64 {
a.iter()
.zip(b)
.map(|(x, y)| (x - y).abs())
.fold(0.0, f64::max)
}
/// Finds all pairs of nodes `(i, j)` with `i < j` that do not have a connecting edge.
fn missing_edges(n: usize, edges: &HashMap<(usize, usize), EdgeData>) -> Vec<(usize, usize)> {
(0..n)
.flat_map(|i| {
(i + 1..n)
.filter(move |&j| !edges.contains_key(&(i, j)))
.map(move |j| (i, j))
})
.collect()
}
/// A simple heuristic that suggests edges based on the sum of the degrees of the endpoints.
fn suggest_impact(
n: usize,
missing: &[(usize, usize)],
edges: &HashMap<(usize, usize), EdgeData>,
) -> Vec<(usize, usize, f64)> {
let mut deg = vec![0; n];
for (&(a, b), _) in edges {
deg[a] += 1;
deg[b] += 1;
}
let mut out: Vec<_> = missing
.iter()
.map(|&(i, j)| (i, j, 1.0 / ((deg[i].max(1) + deg[j].max(1)) as f64)))
.collect();
out.sort_by(|a, b| b.2.partial_cmp(&a.2).unwrap());
out
}
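// ===============================================================================================
// ===== Tests (an illustrative sketch of expected behavior, not an exhaustive suite)
// ===============================================================================================
#[cfg(test)]
mod tests {
    use super::*;

    /// A perfectly consistent chain of comparisons should recover the implied
    /// ratios and report (near-)zero inconsistency.
    #[test]
    fn consistent_chain_recovers_ratios() {
        let mut s = Solver::new(3);
        s.add_edge(0, 1, 2.0, 1.0).unwrap();
        s.add_edge(1, 2, 3.0, 1.0).unwrap();
        let sol = s.solve().unwrap();
        // The completed matrix should contain the implied ratio w0/w2 = 6.
        assert!((sol.completed[(0, 2)] - 6.0).abs() < 1e-4);
        assert!(sol.saaty_cr.abs() < 1e-4);
        assert!(sol.koczkodaj < 1e-4);
    }

    /// `suggest` should propose the one missing pair and, with the default
    /// suggester, report the EffectiveResistance fallback.
    #[test]
    fn suggest_reports_missing_edges() {
        let mut s = Solver::new(3);
        s.add_edge(0, 1, 2.0, 1.0).unwrap();
        s.add_edge(1, 2, 3.0, 1.0).unwrap();
        let (suggestions, fallbacks) = s.suggest(5);
        assert_eq!(suggestions.len(), 1);
        assert_eq!((suggestions[0].0, suggestions[0].1), (0, 2));
        assert!(!fallbacks.is_empty());
    }
}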