@@ -382,28 +382,38 @@ <h2 id="what-will-we-need-in-the-case-of-a-quantum-computer">What will we need i
382382&lt; div class ="alert alert-block alert-text-normal ">
383383< b > </ b >
384384< p >
385- < p > We will have to translate the classical data point \( \vec {x} \)
386- into a quantum datapoint \( \vert \Phi{(\vec {x})} \rangle \). This can
387- be achieved by a circuit \( \mathcal{U}_{\Phi(\vec {x})} \vert 0\rangle \).
385+ < p > We will have to translate the classical data point \( \boldsymbol {x} \)
386+ into a quantum datapoint \( \vert \Phi{(\boldsymbol {x})} \rangle \). This can
387+ be achieved by a circuit \( \mathcal{U}_{\Phi(\boldsymbol {x})} \vert 0\rangle \).
388388</ p >
389389
390390< p > Here \( \Phi() \) could be any classical function applied
391- on the classical data \( \vec {x} \).
391+ on the classical data \( \boldsymbol {x} \).
392392</ p >
393393</ div >
394394
395395&lt; div class ="alert alert-block alert-text-normal ">
396396< b > </ b >
397397< p >
398- < p > We need a parameterized quantum circuit \( W(\theta ) \) that
398+ < p > We need a parameterized quantum circuit \( W(\Theta ) \) that
399399processes the data so that, in the end, we
400400can apply a measurement that returns a classical value \( -1 \) or
401- \( 1 \) for each classical input \( \vec {x} \) that indentifies the label
401+ \( 1 \) for each classical input \( \boldsymbol {x} \) that identifies the label
402402of the classical data.
403403</ p >
404404</ div >
405405</ section >
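The two requirements above, a feature-map circuit \( \mathcal{U}_{\Phi(\boldsymbol{x})} \) and a parameterized circuit \( W(\Theta) \) followed by a \( \pm 1 \) measurement, can be sketched with plain NumPy. This is a toy one-qubit illustration: the choices \( \mathcal{U}_{\Phi(x)} = R_x(x) \) and \( W(\Theta) = R_y(\Theta) \) are assumptions made here for concreteness, not circuits fixed by the text.

```python
import numpy as np

def rx(t):  # single-qubit rotation R_x(t) = exp(-i t X / 2)
    return np.array([[np.cos(t / 2), -1j * np.sin(t / 2)],
                     [-1j * np.sin(t / 2), np.cos(t / 2)]])

def ry(t):  # single-qubit rotation R_y(t) = exp(-i t Y / 2)
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2), np.cos(t / 2)]], dtype=complex)

Z = np.diag([1.0, -1.0]).astype(complex)    # Pauli-Z observable
ket0 = np.array([1.0, 0.0], dtype=complex)  # |0>

def label(x, theta):
    """Encode x, process with W(theta), read off a +/-1 label from <Z>."""
    psi = ry(theta) @ rx(x) @ ket0          # W(Theta) U_Phi(x) |0>
    exp_z = float(np.real(np.conj(psi) @ (Z @ psi)))
    return 1 if exp_z >= 0 else -1

print(label(0.1, 0.0))  # state stays close to |0>, so the label is +1
```

For this toy model \( \langle Z\rangle = \cos x \cos\Theta \), so the decision boundary sits where \( \cos x \cos\Theta \) changes sign.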
406406
407+ < section >
408+ < h2 id ="parameterized-quantum-circuits "> Parameterized quantum circuits </ h2 >
409+
410+ < br /> < br />
411+ < center >
412+ < p > < img src ="figures/pqc.png " width ="900 " align ="bottom "> </ p >
413+ </ center >
414+ < br /> < br />
415+ </ section >
416+
407417< section >
408418< h2 id ="the-most-general-ansatz "> The most general ansatz </ h2 >
409419
@@ -412,7 +422,7 @@ <h2 id="the-most-general-ansatz">The most general ansatz </h2>
412422</ p >
413423< p > < br >
414424$$
415- W(\theta ) \mathcal{U}_{\Phi}(\vec {x}) \vert 0 \rangle.
425+ W(\Theta ) \mathcal{U}_{\Phi}(\boldsymbol {x}) \vert 0 \rangle.
416426$$
417427< p > < br >
418428
@@ -425,7 +435,7 @@ <h2 id="quantum-svm">Quantum SVM </h2>
425435< p > In the case of a quantum SVM we will only use the quantum feature maps</ p >
426436< p > < br >
427437$$
428- \mathcal{U}_{\Phi(\vec {x})},
438+ \mathcal{U}_{\Phi(\boldsymbol {x})},
429439$$
430440< p > < br >
431441
@@ -444,14 +454,14 @@ <h2 id="defining-the-quantum-kernel">Defining the Quantum Kernel </h2>
444454</ p >
445455< p > < br >
446456$$
447- K(\vec {x}, \vec {z}) = \vert \langle \Phi (\vec {x}) \vert \Phi(\vec {z}) \rangle \vert^2 = \langle 0^n \vert \mathcal{U}_{\Phi(\vec {x})}^{t} \mathcal{U}_{\Phi(\vec {z})} \vert 0^n \rangle,
457+ K(\boldsymbol {x}, \boldsymbol {z}) = \vert \langle \Phi (\boldsymbol {x}) \vert \Phi(\boldsymbol {z}) \rangle \vert^2 = \langle 0^n \vert \mathcal{U}_{\Phi(\boldsymbol {x})}^{\dagger} \mathcal{U}_{\Phi(\boldsymbol {z})} \vert 0^n \rangle,
448458$$
449459< p > < br >
450460
451461< p > but now with the quantum feature maps</ p >
452462< p > < br >
453463$$
454- \mathcal{U}_{\Phi(\vec {x})}.
464+ \mathcal{U}_{\Phi(\boldsymbol {x})}.
455465$$
456466< p > < br >
457467
@@ -839,8 +849,8 @@ <h2 id="quantum-neural-network">Quantum neural network </h2>
839849kernel machines with a particular kernel determined by the circuit .
840850In fact, one can often find a kernel SVM that matches or outperforms
841851the variational model. In practice, one can combine these: use a
842- trainable quantum embedding \( U(\boldsymbol{x};\theta ) \) with tunable
843- parameters \( \theta \), and optimize \( \theta \) to maximize the SVM
852+ trainable quantum embedding \( U(\boldsymbol{x};\Theta ) \) with tunable
853+ parameters \( \Theta \), and optimize \( \Theta \) to maximize the SVM
844854classification accuracy. This is called a quantum kernel learning
845855approach.
846856</ p >
@@ -1848,8 +1858,8 @@ <h2 id="variational-quantum-circuits">Variational Quantum Circuits </h2>
18481858optimizer . In this framework, a Variational Quantum Circuit (VQC)
18491859typically has three parts : (i) a state preparation or feature map
18501860that encodes classical input \( \mathbf{x} \) into a quantum state; (ii) a
1851- parameterized circuit \( W(\boldsymbol\theta ) \) (often called the ansatz)
1852- that depends on trainable parameters \( \boldsymbol\theta \); and (iii) a
1861+ parameterized circuit \( W(\boldsymbol\Theta ) \) (often called the ansatz)
1862+ that depends on trainable parameters \( \boldsymbol\Theta \); and (iii) a
18531863measurement that extracts a classical output from the final quantum
18541864state.
18551865</ p >
@@ -1868,20 +1878,20 @@ <h2 id="setting-up-a-vqc">Setting up a VQC </h2>
18681878
18691879< p > where \( U(\mathbf{x}) \) is a unitary (possibly composed of rotations)
18701880that depends on the data. We then apply the variational circuit
1871- \( W(\boldsymbol\theta ) \), often built as a product of layers
1872- \( V_j(\theta_j ) \), so that the final state is
1881+ \( W(\boldsymbol\Theta ) \), often built as a product of layers
1882+ \( V_j(\Theta_j ) \), so that the final state is
18731883</ p >
18741884
18751885< p > < br >
18761886$$
1877- \vert \Psi(\mathbf{x};\boldsymbol\theta )\rangle = W(\boldsymbol\theta ),U(\mathbf{x}),|0\rangle^{\otimes n}.
1887+ \vert \Psi(\mathbf{x};\boldsymbol\Theta )\rangle = W(\boldsymbol\Theta )\,U(\mathbf{x})\,|0\rangle^{\otimes n}.
18781888$$
18791889< p > < br >
18801890
18811891< p > For instance, one common ansatz is the hardware-efficient circuit:
18821892layers of parameterized single-qubit rotations and entangling gates
18831893(like CNOTs) repeated several times. The structure of
1884- \( W(\boldsymbol\theta ) \) can dramatically affect the circuit’s
1894+ \( W(\boldsymbol\Theta ) \) can dramatically affect the circuit’s
18851895expressivity and trainability.
18861896</ p >
18871897</ section >
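A hardware-efficient \( W(\boldsymbol\Theta) \) of the kind just described can be assembled layer by layer as a matrix product. The sketch below (two qubits, NumPy only; the block structure of one \( R_y \) per qubit followed by a CNOT is an illustrative assumption) checks that the result is unitary:

```python
import numpy as np

def ry(t):
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2), np.cos(t / 2)]], dtype=complex)

CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

def hardware_efficient(thetas):
    """W(Theta): one R_y per qubit, then a CNOT, repeated per row of thetas."""
    W = np.eye(4, dtype=complex)
    for t0, t1 in thetas:
        W = CNOT @ np.kron(ry(t0), ry(t1)) @ W
    return W

# 3 layers x 2 qubits -> 6 trainable parameters
thetas = np.random.default_rng(0).uniform(0.0, 2 * np.pi, size=(3, 2))
W = hardware_efficient(thetas)
print(np.allclose(W.conj().T @ W, np.eye(4)))  # unitarity check: True
```

With one rotation per qubit per layer, \( L \) layers on \( n \) qubits contribute \( nL \) parameters; deeper circuits are more expressive but, as discussed later, harder to train.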
@@ -1895,21 +1905,21 @@ <h2 id="outputs">Outputs </h2>
18951905</ p >
18961906< p > < br >
18971907$$
1898- f_k(\mathbf{x};\boldsymbol\theta ) ;=; \langle \Psi(\mathbf{x};\boldsymbol\theta ) | \hat B_k | \Psi(\mathbf{x};\boldsymbol\theta )\rangle.
1908+ f_k(\mathbf{x};\boldsymbol\Theta ) \;=\; \langle \Psi(\mathbf{x};\boldsymbol\Theta ) | \hat B_k | \Psi(\mathbf{x};\boldsymbol\Theta )\rangle.
18991909$$
19001910< p > < br >
19011911
19021912< p > Equivalently, with</ p >
19031913< p > < br >
19041914$$
1905- \vert \Psi(\mathbf{x};\boldsymbol\theta )\rangle = W(\boldsymbol\theta )U(\mathbf{x})|0\rangle,
1915+ \vert \Psi(\mathbf{x};\boldsymbol\Theta )\rangle = W(\boldsymbol\Theta )U(\mathbf{x})|0\rangle,
19061916$$
19071917< p > < br >
19081918
19091919< p > one has</ p >
19101920< p > < br >
19111921$$
1912- f_k(\mathbf{x};\boldsymbol\theta ) = \langle 0|U(\mathbf{x})^\dagger W(\boldsymbol\theta )^\dagger ,\hat B_k, W(\boldsymbol\theta ) U(\mathbf{x}),|0\rangle.
1922+ f_k(\mathbf{x};\boldsymbol\Theta ) = \langle 0|U(\mathbf{x})^\dagger W(\boldsymbol\Theta )^\dagger \,\hat B_k\, W(\boldsymbol\Theta ) U(\mathbf{x})\,|0\rangle.
19131923$$
19141924< p > < br >
19151925
@@ -1920,9 +1930,9 @@ <h2 id="outputs">Outputs </h2>
19201930< h2 id ="short-summary "> Short summary </ h2 >
19211931
19221932< p > In summary, a variational quantum model
1923- \( f(\mathbf{x};\boldsymbol\theta ) \) maps inputs to outputs via the
1933+ \( f(\mathbf{x};\boldsymbol\Theta ) \) maps inputs to outputs via the
19241934hybrid quantum-classical procedure. During training, the classical
1925- optimizer adjusts \( \boldsymbol\theta \) (e.g. by gradient descent) to
1935+ optimizer adjusts \( \boldsymbol\Theta \) (e.g. by gradient descent) to
19261936minimize a cost function (like mean-squared error) defined on a
19271937dataset. Because the mapping is inherently quantum, these models can,
19281938in principle, harness the high-dimensional Hilbert space for richer
@@ -1944,21 +1954,21 @@ <h2 id="mathematical-example">Mathematical example </h2>
19441954< p > and a variational layer is</ p >
19451955< p > < br >
19461956$$
1947- V(\boldsymbol\theta )=R_y(\theta_1 )\otimes R_y(\theta_2 ),\mathrm{CNOT}(0,1),
1957+ V(\boldsymbol\Theta )=\mathrm{CNOT}(0,1)\,\bigl(R_y(\Theta_1 )\otimes R_y(\Theta_2 )\bigr),
19481958$$
19491959< p > < br >
19501960
19511961< p > (apply \( R_y \) on each qubit then entangle). After
1952- applying \( W(\boldsymbol\theta )=V(\boldsymbol\theta ) \) to \( |0,0\rangle \),
1962+ applying \( W(\boldsymbol\Theta )=V(\boldsymbol\Theta ) \) to \( |0,0\rangle \),
19531963we measure \( \hat B=Z\otimes I \) on qubit 0. The output is
19541964</ p >
19551965< p > < br >
19561966$$
1957- f(\mathbf{x};\boldsymbol\theta ) = \langle 0,0|,U(\mathbf{x})^\dagger,V(\boldsymbol\theta )^\dagger, (Z\otimes I), V(\boldsymbol\theta ),U(\mathbf{x}),|0,0\rangle.
1967+ f(\mathbf{x};\boldsymbol\Theta ) = \langle 0,0|\,U(\mathbf{x})^\dagger\,V(\boldsymbol\Theta )^\dagger\,(Z\otimes I)\,V(\boldsymbol\Theta )\,U(\mathbf{x})\,|0,0\rangle.
19581968$$
19591969< p > < br >
19601970
1961- < p > This \( f(x;\theta ) \) is then compared to the target in a cost function for optimization.</ p >
1971+ &lt; p > This \( f(\mathbf{x};\boldsymbol\Theta ) \) is then compared to the target in a cost function for optimization.&lt;/ p >
19621972</ section >
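The two-qubit example can be reproduced with small matrices. The encoding \( U(\mathbf{x}) \) is not restated in this excerpt, so the sketch assumes the common angle encoding \( U(\mathbf{x}) = R_x(x_1)\otimes R_x(x_2) \) as a stand-in; the variational layer applies the \( R_y \) rotations and then the CNOT:

```python
import numpy as np

def rx(t):
    return np.array([[np.cos(t / 2), -1j * np.sin(t / 2)],
                     [-1j * np.sin(t / 2), np.cos(t / 2)]])

def ry(t):
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2), np.cos(t / 2)]], dtype=complex)

CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
ket00 = np.array([1.0, 0.0, 0.0, 0.0], dtype=complex)

def f(x, theta):
    """f(x; Theta) = <00| U^dag V^dag (Z x I) V U |00>, Z measured on qubit 0."""
    U = np.kron(rx(x[0]), rx(x[1]))                  # assumed angle encoding
    V = CNOT @ np.kron(ry(theta[0]), ry(theta[1]))   # rotate, then entangle
    psi = V @ U @ ket00
    return float(np.real(np.conj(psi) @ (np.kron(Z, I2) @ psi)))

print(f([0.0, 0.0], [0.0, 0.0]))  # all angles zero leaves |00>: f = 1.0
```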
19631973
19641974< section >
@@ -1971,15 +1981,15 @@ <h2 id="key-elements">Key elements </h2>
19711981on each qubit , while more complex feature maps may exploit
19721982entanglement. The circuit output is obtained via expectation values
19731983of observables (e.g. Pauli-Z), yielding a differentiable function
1974- \( f(\mathbf{x};\boldsymbol\theta ) \) .
1984+ \( f(\mathbf{x};\boldsymbol\Theta ) \) .
19751985</ p >
19761986</ section >
19771987
19781988< section >
19791989< h2 id ="test-yourself-exercises "> Test yourself exercises </ h2 >
19801990
19811991< ol >
1982- < p > < li > Compute the state \( |\Psi(\mathbf{x};\boldsymbol\theta )\rangle \) explicitly for a 1-qubit VQC with \( U(x)=R_x(x) \) and \( W(\theta )=R_y(\theta ) \). What is \( \langle Z\rangle \) as a function of \( x,\theta \)?</ li >
1992+ < p > < li > Compute the state \( |\Psi(\mathbf{x};\boldsymbol\Theta )\rangle \) explicitly for a 1-qubit VQC with \( U(x)=R_x(x) \) and \( W(\Theta )=R_y(\Theta ) \). What is \( \langle Z\rangle \) as a function of \( x,\Theta \)?</ li >
19831993< p > < li > Draw (or describe) a hardware-efficient ansatz for 3 qubits with 2 layers of rotations and CNOTs. How many parameters does it have?</ li >
19841994</ ol >
19851995< p >
@@ -2029,7 +2039,7 @@ <h2 id="input-encoding">Input Encoding </h2>
20292039< h2 id ="qnn-architecture-and-models "> QNN Architecture and Models </ h2 >
20302040
20312041< p > A general QNN can be viewed as a parameterized unitary
2032- \( U(\mathbf{x},\boldsymbol\theta ) \) acting on \( n \) qubits, followed by
2042+ \( U(\mathbf{x},\boldsymbol\Theta ) \) acting on \( n \) qubits, followed by
20332043measurements. Fig. 2 (placeholder) might depict a generic QNN with
20342044several layers of trainable gates. Each layer can entangle qubits,
20352045building up complexity. The output is then a (classical) vector of
@@ -2045,8 +2055,8 @@ <h2 id="a-simple-feedforward-qnn-structure">A simple feedforward QNN structure <
20452055< p >
20462056< ol >
20472057< p > < li > Embedding Layer: Convert \( \mathbf{x} \) to \( |0\rangle^{\otimes n} \) via \( U(\mathbf{x}) \).</ li >
2048- < p > < li > Variational Layers: Repeat \( L \) blocks of parameterized gates \( W(\boldsymbol\theta ^{(l)}) \) (each block may act on all or subsets of qubits).</ li >
2049- < p > < li > Measurement: Measure selected qubits or observables to obtain the output predictions \( f(\mathbf{x};\boldsymbol\theta ) \).</ li >
2058+ < p > < li > Variational Layers: Repeat \( L \) blocks of parameterized gates \( W(\boldsymbol\Theta ^{(l)}) \) (each block may act on all or subsets of qubits).</ li >
2059+ < p > < li > Measurement: Measure selected qubits or observables to obtain the output predictions \( f(\mathbf{x};\boldsymbol\Theta ) \).</ li >
20502060</ ol >
20512061</ div >
20522062</ section >
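The three steps above can be sketched as one function (two qubits, NumPy only; the angle embedding and the \( R_y \)-plus-CNOT block are illustrative assumptions, not a prescribed architecture):

```python
import numpy as np

def rx(t):
    return np.array([[np.cos(t / 2), -1j * np.sin(t / 2)],
                     [-1j * np.sin(t / 2), np.cos(t / 2)]])

def ry(t):
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2), np.cos(t / 2)]], dtype=complex)

CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
ket00 = np.array([1.0, 0.0, 0.0, 0.0], dtype=complex)

def qnn(x, thetas):
    """Embedding -> L variational blocks -> measurement of Z on qubit 0."""
    psi = np.kron(rx(x[0]), rx(x[1])) @ ket00            # 1. embedding layer
    for t0, t1 in thetas:                                # 2. variational layers
        psi = CNOT @ np.kron(ry(t0), ry(t1)) @ psi
    return float(np.real(np.conj(psi) @ (np.kron(Z, I2) @ psi)))  # 3. measurement

out = qnn([0.4, -0.9], [(0.1, 0.5), (1.3, -0.2)])  # L = 2 blocks
print(-1.0 <= out <= 1.0)  # an expectation of Z always lies in [-1, 1]: True
```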
@@ -2055,8 +2065,8 @@ <h2 id="a-simple-feedforward-qnn-structure">A simple feedforward QNN structure <
20552065< h2 id ="example "> Example </ h2 >
20562066
20572067< p > For example, a 2-layer QNN on 2 qubits might apply encoding
2058- \( R_x(x_1)\otimes R_x(x_2) \), then apply \( W(\theta ^{(1)}) \), then again
2059- encoding (or not), then \( W(\theta ^{(2)}) \), and finally measure. In
2068+ \( R_x(x_1)\otimes R_x(x_2) \), then apply \( W(\Theta ^{(1)}) \), then again
2069+ encoding (or not), then \( W(\Theta ^{(2)}) \), and finally measure. In
20602070classification tasks, one typically assigns a label based on the sign
20612071of \( \langle Z\rangle \) or uses multiple measurements for multi-class
20622072outputs.
@@ -2074,19 +2084,19 @@ <h2 id="example">Example </h2>
20742084< section >
20752085< h2 id ="training-output-and-cost-loss-function "> Training output and Cost/Loss-function </ h2 >
20762086
2077- < p > Given a QNN with output \( f(\mathbf{x};\boldsymbol\theta ) \) (a real
2087+ < p > Given a QNN with output \( f(\mathbf{x};\boldsymbol\Theta ) \) (a real
20782088number or vector of real values), one must define a loss function to
20792089train on data. Common choices are the mean squared error (MSE) for
20802090regression or cross-entropy for classification. For a training set
20812091\( \{\mathbf{x}_i,y_i\} \), the MSE cost/loss-function is
20822092</ p >
20832093< p > < br >
20842094$$
2085- C(\boldsymbol\theta ) = \frac{1}{N} \sum_{i=1}^N \bigl(f(\mathbf{x}i;\boldsymbol\theta ) - y_i\bigr)^2.
2095+ C(\boldsymbol\Theta ) = \frac{1}{N} \sum_{i=1}^N \bigl(f(\mathbf{x}_i;\boldsymbol\Theta ) - y_i\bigr)^2.
20862096$$
20872097< p > < br >
20882098
2089- < p > One then computes gradients \( \nabla{\boldsymbol\theta }C \) and updates
2099+ &lt; p > One then computes gradients \( \nabla_{\boldsymbol\Theta }C \) and updates
20902100parameters via gradient descent or other optimizers.
20912101</ p >
20922102</ section >
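A complete, toy training loop for this cost: a one-qubit model \( f(x;\Theta)=\langle Z\rangle \) after \( R_x(x) \) then \( R_y(\Theta) \) (an assumed architecture), synthetic targets generated from a hidden parameter value, and plain gradient descent with gradients from the parameter-shift rule.

```python
import numpy as np

def rx(t):
    return np.array([[np.cos(t / 2), -1j * np.sin(t / 2)],
                     [-1j * np.sin(t / 2), np.cos(t / 2)]])

def ry(t):
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2), np.cos(t / 2)]], dtype=complex)

Z = np.diag([1.0, -1.0]).astype(complex)
ket0 = np.array([1.0, 0.0], dtype=complex)

def f(x, theta):
    psi = ry(theta) @ rx(x) @ ket0
    return float(np.real(np.conj(psi) @ (Z @ psi)))

rng = np.random.default_rng(1)
xs = rng.uniform(-1.0, 1.0, 8)
ys = np.array([f(x, 0.9) for x in xs])  # synthetic labels from a "true" Theta

def cost(theta):
    return float(np.mean([(f(x, theta) - y) ** 2 for x, y in zip(xs, ys)]))

theta, lr = 2.5, 0.5
c0 = cost(theta)
for _ in range(100):
    # dC/dTheta: chain rule through the squared error, df/dTheta by parameter shift
    grad = np.mean([2.0 * (f(x, theta) - y)
                    * 0.5 * (f(x, theta + np.pi / 2) - f(x, theta - np.pi / 2))
                    for x, y in zip(xs, ys)])
    theta -= lr * grad
print(cost(theta) < c0)  # training reduced the MSE: True
```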
@@ -2095,7 +2105,7 @@ <h2 id="training-output-and-cost-loss-function">Training output and Cost/Loss-fu
20962106&lt; h2 id ="exampe-variational-classifier "> Example: Variational Classifier &lt;/ h2 >
20962106
20972107< p > A binary classifier can output
2098- \( f(\mathbf{x};\boldsymbol\theta )=\langle Z_0\rangle \) on qubit 0, and
2108+ \( f(\mathbf{x};\boldsymbol\Theta )=\langle Z_0\rangle \) on qubit 0, and
20992109predict label \( +1 \) if \( f\ge0 \), else \( -1 \).
21002110</ p >
21012111</ section >
@@ -2136,15 +2146,15 @@ <h2 id="training-qnns-and-loss-landscapes">Training QNNs and Loss Landscapes </h
21362146< section >
21372147< h2 id ="gradient-computation "> Gradient Computation </ h2 >
21382148
2139- < p > Gradients \( \partial f/\partial\theta_j \) are obtained using the parameter-shift rule. For many gates \( e^{-i\theta P/2} \) (with \( P \) a Pauli), one can compute</ p >
2149+ < p > Gradients \( \partial f/\partial\Theta_j \) are obtained using the parameter-shift rule. For many gates \( e^{-i\Theta P/2} \) (with \( P \) a Pauli), one can compute</ p >
21402150< p > < br >
21412151$$
2142- \frac{\partial}{\partial\theta }\langle B\rangle
2143- = \frac{1}{2}\Bigl[\langle B\rangle_{\theta +\pi/2} - \langle B\rangle_{\theta -\pi/2}\Bigr],
2152+ \frac{\partial}{\partial\Theta }\langle B\rangle
2153+ = \frac{1}{2}\Bigl[\langle B\rangle_{\Theta +\pi/2} - \langle B\rangle_{\Theta -\pi/2}\Bigr],
21442154$$
21452155< p > < br >
21462156
2147- < p > where \( \langle B\rangle_{\theta \pm\pi/2} \) are expectation values
2157+ < p > where \( \langle B\rangle_{\Theta \pm\pi/2} \) are expectation values
21482158evaluated at shifted parameter values. This formula allows exact
21492159gradients by two circuit evaluations per parameter (independent of
21502160circuit size). PennyLane automatically applies parameter-shift rule
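The rule is easy to verify numerically. For a single qubit with \( f(\Theta)=\langle 0|R_y(\Theta)^\dagger Z R_y(\Theta)|0\rangle = \cos\Theta \), the shifted-expectation difference must reproduce the analytic derivative \( -\sin\Theta \) exactly. A NumPy sketch (no quantum SDK assumed):

```python
import numpy as np

def ry(t):
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2), np.cos(t / 2)]], dtype=complex)

Z = np.diag([1.0, -1.0]).astype(complex)
ket0 = np.array([1.0, 0.0], dtype=complex)

def expZ(theta):
    """<0| R_y(theta)^dagger Z R_y(theta) |0> = cos(theta)."""
    psi = ry(theta) @ ket0
    return float(np.real(np.conj(psi) @ (Z @ psi)))

theta = 0.8
shift_grad = 0.5 * (expZ(theta + np.pi / 2) - expZ(theta - np.pi / 2))
print(np.isclose(shift_grad, -np.sin(theta)))  # matches d/dtheta cos(theta): True
```

Unlike finite differences, the two evaluations here are exact for any shift-rule-compatible gate, not an approximation that degrades with step size.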
@@ -2183,7 +2193,7 @@ <h2 id="barren-plateaus">Barren Plateaus </h2>
21832193< section >
21842194< h2 id ="cost-loss-landscape-visualization "> Cost/Loss-landscape visualization </ h2 >
21852195
2186- < p > One can imagine the cost/loss function \( C(\boldsymbol\theta ) \) over the
2196+ < p > One can imagine the cost/loss function \( C(\boldsymbol\Theta ) \) over the
21872197parameter space. Unlike convex classical problems, this landscape may
21882198have many local minima and saddle points. Barren plateaus correspond
21892199to regions where \( \nabla C\approx 0 \) almost everywhere. Even if
@@ -2205,8 +2215,8 @@ <h2 id="cost-loss-landscape-visualization">Cost/Loss-landscape visualization </h
22052215< h2 id ="exercises "> Exercises </ h2 >
22062216
22072217< ol >
2208- < p > < li > Compute a gradient by hand: For a circuit with one qubit and \( f(\theta )=\langle0|R_y(\theta )^\dagger Z R_y(\theta )|0\rangle \), use the parameter-shift rule to compute \( df/d\theta \).</ li >
2209- < p > < li > Explore barren plateaus: Numerically evaluate \( \partial f/\partial\theta \) for a simple 5-qubit random circuit as depth increases. Observe the trend of gradient norms. What does this suggest?</ li >
2218+ < p > < li > Compute a gradient by hand: For a circuit with one qubit and \( f(\Theta )=\langle0|R_y(\Theta )^\dagger Z R_y(\Theta )|0\rangle \), use the parameter-shift rule to compute \( df/d\Theta \).</ li >
2219+ < p > < li > Explore barren plateaus: Numerically evaluate \( \partial f/\partial\Theta \) for a simple 5-qubit random circuit as depth increases. Observe the trend of gradient norms. What does this suggest?</ li >
22102220< p > < li > Optimizer effects: Implement a small QNN (2 qubits) and train with both SGD and Adam optimizers. Compare convergence speed.</ li >
22112221</ ol >
22122222</ section >