
Commit 69fe6de

Add files via upload (#64)
1 parent 5f1cf19 commit 69fe6de

File tree

2 files changed: +481 -0 lines changed


note/dd_note.typ

Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
#set page(
  paper: "a4",
  margin: (x: 2cm, y: 2cm),
  numbering: "I",
)
#set text(
  font: ("Linux Libertine", "Source Han Serif SC"),
  lang: "en",
  size: 11pt
)
#show math.equation: set text(font: "Latin Modern Math")

// Nice explanation box style
#let note-box(title, color, body) = block(
  fill: color.lighten(90%),
  stroke: (left: 4pt + color),
  inset: 12pt,
  radius: 4pt,
  width: 100%,
  [
    #text(fill: color, weight: "bold", size: 12pt)[#title]
    #v(0.5em)
    #body
  ]
)

#set heading(numbering: none)

= Note

== 1. #link(<NP_PP-complete_return>)[#text(fill:blue)[$("NP")^(PP)$-complete (complexity hierarchy)]] <NP_PP-complete>

This class describes computational problems that combine optimization with counting, sitting above both NP and PP in the complexity hierarchy and implying extreme computational hardness.

#note-box("Intuition: nested decision and summation", blue)[
  To understand this, break it into two layers (a brute-force illustration follows after this box):
  - *NP (Non-deterministic Polynomial)*: represents nondeterministic polynomial time, corresponding to the *MAX operation* in MMAP. This is like searching for an optimal solution in a huge space (e.g., TSP).
  - *PP (Probabilistic Polynomial)*: corresponds to the *SUM operation* in MMAP. It requires summing over all possibilities, i.e., computing marginal probabilities, which is often harder than merely finding an optimum.

  *Meaning of $("NP")^(PP)$*:
  This denotes an NP machine with access to a PP "oracle": we must solve an NP-hard optimization problem, but evaluating each candidate requires solving a PP-hard summation problem.

  *Conclusion*: This is harder than NP-complete or PP-complete alone. Since exact solutions are infeasible in polynomial time, in QEC or large-scale probabilistic inference we *must* rely on approximate algorithms (e.g., variational inference or dual decomposition).
]
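
To make the max-over-sum structure concrete, here is a rough brute-force sketch of MMAP on a made-up two-variable toy model; the probability table and function names are illustrative only.

```python
# Brute-force illustration of the MAX-over-SUM structure of MMAP on a made-up
# two-variable model: m is maximized over (the NP/MAX layer), s is summed out
# (the PP/SUM layer).

def joint(m, s):
    # Hypothetical unnormalized joint probability table over two binary variables.
    table = {
        (0, 0): 0.10, (0, 1): 0.30,
        (1, 0): 0.25, (1, 1): 0.20,
    }
    return table[(m, s)]

def mmap_bruteforce():
    best_m, best_score = None, float("-inf")
    for m in (0, 1):                                # MAX: search over candidate assignments
        score = sum(joint(m, s) for s in (0, 1))    # SUM: marginalize out the rest
        if score > best_score:
            best_m, best_score = m, score
    return best_m, best_score

print(mmap_bruteforce())   # -> (1, 0.45): m = 1 carries the larger marginal mass
```

With many MAX variables and many SUM variables, both loops grow exponentially, which is exactly the nested hardness described above.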

== 2. #link(<Lagrange_Multipliers_return>)[#text(fill:blue)[Lagrange Multipliers]] <Lagrange_Multipliers>

In optimization theory, this is the core technique for handling constrained problems. In dual decomposition, it acts as the "coordination variable."

#note-box("Mechanism: reach consensus via prices", orange)[
  When we decompose a complex global problem into independent subproblems (e.g., subgraphs A and B), these subproblems can disagree about the variables they share.

  - *Hard constraint*: require $x_A = x_B$. Enforcing this directly is difficult.
  - *Relaxation*: drop the hard constraint and instead add a penalty term, weighted by the Lagrange multiplier $delta$, to the objective.

  *Role of $delta$*:
  Think of $delta$ as the *price of inconsistency*.
  - If subproblem A predicts a higher value than B, the algorithm adjusts $delta$ to "fine" A and "subsidize" B.
  - By iteratively updating $delta$ (usually via subgradient steps), we force each subproblem to approach global consistency while still optimizing locally (a minimal numeric sketch follows below).
]
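
As a rough numeric sketch of this price mechanism: the local scores `f_A`, `f_B` and the step size below are made-up toy values, not part of any real decoder.

```python
# A minimal sketch of the "price of inconsistency": two subproblems share one
# binary variable x, and the multiplier delta is adjusted until their local
# optimizers agree. All score values below are hypothetical.

f_A = {0: 1.0, 1: 3.0}   # subproblem A's local scores for x = 0 / x = 1
f_B = {0: 2.0, 1: 1.5}   # subproblem B's local scores for x = 0 / x = 1

delta, step = 0.0, 0.5
for it in range(100):
    x_A = max((0, 1), key=lambda x: f_A[x] + delta * x)  # A is "fined"/"subsidized" by +delta*x
    x_B = max((0, 1), key=lambda x: f_B[x] - delta * x)  # B sees the opposite price, -delta*x
    if x_A == x_B:
        break
    delta -= step * (x_A - x_B)  # subgradient step on delta: shrink the disagreement

print(f"iteration {it}: x_A = {x_A}, x_B = {x_B}, delta = {delta}")
# With these numbers the copies agree on x = 1 after a few updates,
# which is also the optimum of the original (consistent) problem.
```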

== 3. #link(<Variational_Upper_Bound_return>)[#text(fill:blue)[Variational Upper Bound]] <Variational_Upper_Bound>

When the objective itself is intractable (e.g., a partition function or marginal likelihood), we construct a tractable function that is guaranteed to upper-bound the true objective.

#note-box("Geometric intuition: lowering the envelope", green)[
  Suppose the true optimum is $Phi^*$ (the true MMAP log-probability). Computing it directly requires high-dimensional sums or integrals.

  *Variational strategy:*
  1. *Construct the dual function $L(delta)$* via dual decomposition. By weak duality, $L(delta) >= Phi^*$ for every $delta$.
  2. *Minimize the upper bound*: since $L(delta)$ always lies above $Phi^*$, we search for the $delta$ that lowers it.
  3. *Approximation*: as we lower this "ceiling," it approaches the true $Phi^*$.

  At convergence the value may still be approximate, but the *upper bound* provides a theoretical guarantee on solution quality: the duality gap. A small numeric check of the bound follows below.
]
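
A small numeric check of the weak-duality claim, reusing the same made-up two-subproblem toy as in the previous sketch; all numbers are illustrative.

```python
# Numeric check that the dual function always sits above the true constrained
# optimum, L(delta) >= Phi*, and that lowering it tightens the bound.

f_A = {0: 1.0, 1: 3.0}
f_B = {0: 2.0, 1: 1.5}

# True constrained optimum Phi*: both copies must take the same value.
phi_star = max(f_A[x] + f_B[x] for x in (0, 1))          # = 4.5, attained at x = 1

def dual(delta):
    # L(delta): each subproblem optimizes independently with its price term added.
    return (max(f_A[x] + delta * x for x in (0, 1))
            + max(f_B[x] - delta * x for x in (0, 1)))

for delta in [-3.0, -2.0, -1.0, 0.0, 1.0]:
    print(f"delta = {delta:+.1f}   L(delta) = {dual(delta):.2f}   Phi* = {phi_star:.2f}")
# Every printed L(delta) is >= Phi*; the minimum over delta is the tightest upper bound.
```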

== 4. #link(<Dual_Decomposition_return>)[#text(fill:blue)[Dual Decomposition]] <Dual_Decomposition>

This is the core algorithmic framework for inference in complex graphical models. Its philosophy is "divide and coordinate."

#note-box("Core logic: split and negotiate", purple)[
  For a complex global problem (e.g., MMAP on a 2D grid), direct solution is extremely hard because the variables are tightly coupled. Dual decomposition proceeds as follows (a schematic loop is sketched after this box):

  1. *Decompose*: cut some variable interactions and split the big graph into disjoint, easy subgraphs (trees or chains).
  2. *Solve locally*: each subgraph performs inference independently. Because the structure is simple, this is fast (polynomial time).
  3. *Coordinate*: the split is artificial, so subgraphs may disagree on shared variables. We introduce *dual variables* (see section 2) to penalize disagreement and drive consensus.

  This is like distributing a big project across multiple teams: the manager (the master algorithm) adjusts incentives until the teams' outputs align.
]
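
Schematically, the whole loop can be sketched as follows. The subproblem objectives `f_1`, `f_2`, the shared variable name, and the brute-force local solver are toy stand-ins for the tree/chain solvers a real implementation would use.

```python
# A schematic "split, solve locally, coordinate" loop with two toy subproblems
# that share one binary variable v. Each local solve is brute force here; in
# practice it would be a chain or tree handled in polynomial time.
from itertools import product

def f_1(a):  # toy subproblem over variables v, u
    return 2.0 * a["v"] + 1.0 * a["u"] - 1.5 * a["v"] * a["u"]

def f_2(a):  # toy subproblem over variables v, w
    return -1.0 * a["v"] + 2.0 * a["w"] + 0.5 * a["v"] * a["w"]

subproblems = [(f_1, ("v", "u")), (f_2, ("v", "w"))]
shared = "v"                 # both subproblems own a copy of v
delta = [0.0, 0.0]           # one price per copy; the prices always sum to zero
step = 0.5

def solve_local(f, names, price):
    # Step 2 (solve locally): maximize f plus the price on this copy of the shared variable.
    best_a, best_val = None, float("-inf")
    for vals in product((0, 1), repeat=len(names)):
        a = dict(zip(names, vals))
        val = f(a) + price * a[shared]
        if val > best_val:
            best_a, best_val = a, val
    return best_a

for _ in range(100):
    sols = [solve_local(f, names, d) for (f, names), d in zip(subproblems, delta)]
    copies = [s[shared] for s in sols]
    if all(c == copies[0] for c in copies):
        break                                        # consensus on the shared variable
    mean = sum(copies) / len(copies)
    delta = [d - step * (c - mean) for d, c in zip(delta, copies)]  # step 3: coordinate

print(copies, delta)         # with these toy numbers the copies agree on v = 1
```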

== 5. #link(<Grid_Decomposition_return>)[#text(fill:blue)[Grid decomposition details (Row/Col Decomposition)]] <Grid_Decomposition>

For surface codes or 2D Ising models, how do we decompose an $N times N$ grid into easy structures?

#note-box("Operational details: edge-based split", red)[
  A 2D grid has *nodes* and *edges*. The difficulty comes from *loops*. Our goal is to remove the loops while keeping tree structures (a short sketch of the split follows after this box).

  *Steps:*
  1. *Duplicate nodes*: for each node $x_(i,j)$, create a copy $x_(i,j)^("row")$ in the horizontal set and a copy $x_(i,j)^("col")$ in the vertical set.
  2. *Assign edges (key step)*:
    - *Row strips*: keep only *horizontal edges*. The grid becomes $N$ independent horizontal chains (1D, no loops).
    - *Col strips*: keep only *vertical edges*. The grid becomes $N$ independent vertical chains.
  3. *Result*: instead of one complex 2D grid, we now have $2N$ simple 1D chains.

  This decomposition is ideal for parallel computation because each chain's inference (transfer matrix or forward-backward) runs independently.
]
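
A short sketch of the edge assignment; the helper `split_grid` and its return format are hypothetical, written only to count chains and edges.

```python
# Horizontal edges go to row chains, vertical edges to column chains, so an
# N x N grid becomes 2N independent 1-D chains with no loops.

def split_grid(N):
    # Row strips: N chains, each a path over one row's horizontal edges.
    row_chains = [[((i, j), (i, j + 1)) for j in range(N - 1)] for i in range(N)]
    # Col strips: N chains, each a path over one column's vertical edges.
    col_chains = [[((i, j), (i + 1, j)) for i in range(N - 1)] for j in range(N)]
    return row_chains, col_chains

rows, cols = split_grid(4)
print(len(rows) + len(cols))                        # 2N = 8 independent 1-D chains
print(sum(map(len, rows)) + sum(map(len, cols)))    # 2 * N * (N - 1) = 24 grid edges, each assigned once
```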

== 6. #link(<Consensus_Constraint_return>)[#text(fill:blue)[Consistency constraint (Why Consistency?)]] <Consensus_Constraint>

In section 5 we created "row copies" and "col copies." Why must we force them to agree?

#note-box("Logic: returning to physical reality", teal)[
  *Why consistency?*
  In the real physical system (the original problem), qubit $(i,j)$ is unique.
  - It cannot be "in error" ($x = 1$) from the row view while being "error-free" ($x = 0$) from the column view.
  - If the copies disagree, the solution violates physical reality and is invalid.

  *Role of Lagrangian relaxation:*
  - Ideally we would impose the hard constraint $x^("row") = x^("col")$, which is difficult to enforce directly.
  - *Relaxation*: allow temporary disagreement, but charge for it with the Lagrange multipliers $delta$ (a small numeric check of why this is safe follows below).
  - At convergence, if the penalties do their job, the copies settle on the same value and $L(delta)$ equals (or closely approximates) the original optimum.
]
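
A tiny numeric check of why the relaxation is safe: on any consistent assignment the penalty term $delta (x^("row") - x^("col"))$ vanishes, so consistent configurations keep their original objective value for every choice of $delta$. The grid size and $delta$ values below are arbitrary.

```python
# The penalty delta * (x_row - x_col) is exactly zero whenever the copies agree,
# so relaxing the constraint never changes the value of consistent configurations.
import random

random.seed(0)
N = 3
delta = {(i, j): random.uniform(-1, 1) for i in range(N) for j in range(N)}

def penalty(x_row, x_col):
    return sum(delta[q] * (x_row[q] - x_col[q]) for q in delta)

consistent = {q: random.randint(0, 1) for q in delta}
print(penalty(consistent, consistent))          # exactly 0.0: agreement costs nothing

inconsistent = {**consistent, (0, 0): 1 - consistent[(0, 0)]}
print(penalty(inconsistent, consistent))        # non-zero: disagreement is "charged" by delta
```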
