Commit 895a2f1

updates week 42
1 parent f2a9b51 commit 895a2f1

4 files changed

+4101
-0
lines changed

Lines changed: 247 additions & 0 deletions
@@ -0,0 +1,247 @@
1+
TITLE: Project 2 on Machine Learning, deadline November 4 (Midnight)
2+
AUTHOR: "Data Analysis and Machine Learning FYS-STK3155/FYS4155":"http://www.uio.no/studier/emner/matnat/fys/FYS3155/index-eng.html" {copyright, 1999-present|CC BY-NC} at Department of Physics, University of Oslo, Norway
3+
DATE: today
4+
5+
6+
===== Classification and Regression, from linear and logistic regression to neural networks =====
7+
8+
The main aim of this project is to study both classification and
9+
regression problems by developing our own feed-forward neural network
10+
(FFNN) code. We can reuse the regression algorithms studied in project
11+
1. We will also include logistic regression for classification
12+
problems and write our own FFNN code for studying both regression and
13+
classification problems. The codes developed in project 1, including
14+
bootstrap _and/or_ cross-validation as well as the computation of the
15+
mean-squared error and/or the $R^2$ or the accuracy score
16+
(classification problems) functions can also be utilized in the
17+
present analysis.
18+
19+
20+
The data sets that we propose here are (the default sets)
21+
22+
* Regression (fitting a continuous function). In this part you will need to bring back your results from project 1 and compare these with what you get from your Neural Network code to be developed here. The data sets could be
23+
o A simple one-dimensional function or the Franke function or the terrain data from project 1, or data sets you propose. It could be a simpler function than the Franke function. We recommend testing a simpler function (see below). But if you wish to try a more complex function, feel free to do so.
24+
* Classification. Here you will also need to develop a Logistic regression code that you will use to compare with the Neural Network code. The data set we propose is the so-called "Wisconsin Breast Cancer Data":"https://www.kaggle.com/uciml/breast-cancer-wisconsin-data" set, whose features describe tumors and are computed from images. A longer explanation with links to the scientific literature can be found at the "Machine Learning repository of the University of California at Irvine":"https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29". Feel free to consult this site and the pertinent literature.
25+
You can find more information about this at the "Scikit-Learn site":"https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html" or at the "University of California at Irvine":"https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original)".
26+
27+
28+
However, if you would like to study other data sets, feel free to
29+
propose other sets. What we list here are mere suggestions from our
30+
side. If you opt for another data set, consider using a set which has
31+
been studied in the scientific literature. This makes it easier for
32+
you to compare and analyze your results. Comparing with existing
33+
results from the scientific literature is also an essential element of
34+
the scientific discussion. The University of California at Irvine
35+
with its Machine Learning repository at
36+
URL:"https://archive.ics.uci.edu/ml/index.php" is an excellent site to
37+
look for examples and
38+
inspiration. "Kaggle.com":"https://www.kaggle.com/" is an equally
39+
interesting site. Feel free to explore these sites.
40+
41+
42+
We will start with a regression problem and we will reuse our codes from project 1 starting with writing our own Stochastic Gradient Descent (SGD) code.
43+
44+
=== Part a): Write your own Stochastic Gradient Descent code, first step ===
45+
46+
In order to get started, we will now replace in our standard ordinary
47+
least squares (OLS) and Ridge regression codes (from project 1) the
48+
matrix inversion algorithm with our own gradient descent (GD) and SGD
49+
codes. You can use the Franke function or the terrain data from
50+
project 1. _However, we recommend using a simpler function like_
51+
$f(x)=a_0+a_1x+a_2x^2$ or higher-order one-dimensional polynomials.
52+
You can obviously test your final codes against for example the Franke
53+
function.
54+
55+
The exercise set for week 41 should help in solving this part of the project.
56+
57+
58+
You should include in your analysis of the GD and SGD codes the following elements
59+
o A plain gradient descent with a fixed learning rate (you will need to tune it) using the analytical expression for the gradient.
60+
o Add momentum to the plain GD code and compare convergence with a fixed learning rate (you may need to tune the learning rate). Keep using the analytical expression for the gradient.
61+
o Repeat these steps for stochastic gradient descent with mini batches and a given number of epochs. Use a tunable learning rate as discussed in the lectures from weeks 39 and 40. Discuss the results as functions of the various parameters (size of batches, number of epochs etc). Use the analytical gradient.
62+
o Implement the Adagrad method in order to tune the learning rate. Do this with and without momentum for plain gradient descent and SGD.
63+
o Add RMSprop and Adam to your library of methods for tuning the learning rate.
64+
The lecture notes from "weeks 39 and 40 contain more
65+
details":"https://compphysics.github.io/MachineLearning/doc/pub/week39/html/week39.html" and code examples. Feel free to use these examples.
66+
o Replace thereafter your analytical gradient with either _Autograd_ or _JAX_.
67+
68+
_Feel free to use codes on these methods from the lecture notes from week 39 and week 40_.
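
As a starting point for the list above, here is a minimal sketch of plain gradient descent with momentum and of mini-batch SGD for OLS on the second-order polynomial. The data, the coefficients, the function names (`ols_gradient`, `gd_momentum`, `sgd`) and all hyper-parameter values are illustrative assumptions, not taken from the lecture notes, and the learning rates will need tuning for other data.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Synthetic data from f(x) = a0 + a1*x + a2*x^2; the coefficients are illustrative.
n = 200
x = rng.uniform(-1.0, 1.0, n)
true_beta = np.array([1.0, 2.0, 3.0])
X = np.column_stack([np.ones(n), x, x**2])  # design matrix
y = X @ true_beta + 0.05 * rng.normal(size=n)


def ols_gradient(X, y, beta):
    """Analytical gradient of the cost C(beta) = (1/n) * ||y - X beta||^2."""
    return 2.0 / len(y) * X.T @ (X @ beta - y)


def gd_momentum(X, y, eta=0.1, gamma=0.9, n_iter=2000):
    """Gradient descent with a fixed learning rate eta and momentum gamma."""
    beta = np.zeros(X.shape[1])
    velocity = np.zeros_like(beta)
    for _ in range(n_iter):
        velocity = gamma * velocity + eta * ols_gradient(X, y, beta)
        beta -= velocity
    return beta


def sgd(X, y, rng, eta=0.05, batch_size=20, n_epochs=500):
    """Mini-batch SGD: reshuffle the data every epoch, one update per batch."""
    n_samples = len(y)
    beta = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        perm = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            idx = perm[start:start + batch_size]
            beta -= eta * ols_gradient(X[idx], y[idx], beta)
    return beta


beta_gd = gd_momentum(X, y)
beta_sgd = sgd(X, y, rng)
beta_exact = np.linalg.lstsq(X, y, rcond=None)[0]  # reference solution
print("GD with momentum:", beta_gd)
print("Mini-batch SGD:  ", beta_sgd)
print("Least squares:   ", beta_exact)
```

Setting `gamma=0` in `gd_momentum` recovers plain gradient descent, which makes it easy to compare convergence with and without momentum.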
69+
70+
71+
In summary, you should
72+
perform an analysis of the results for OLS and Ridge regression as
73+
a function of the chosen learning rates, the number of mini-batches and
74+
epochs as well as the algorithm for scaling the learning rate. You can
75+
also compare your own results with those that can be obtained using
76+
for example _Scikit-Learn_'s various SGD options. Discuss your
77+
results. For Ridge regression you now need to study the results as functions of the hyper-parameter $\lambda$ and
78+
the learning rate $\eta$. Discuss your results.
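
The algorithms for scaling the learning rate can be sketched as small classes that turn a gradient into an update step. What follows is one possible layout, using the commonly quoted update rules for Adagrad, RMSprop and Adam. The class names, the `step` interface and the default hyper-parameters are assumptions made for illustration, and the quadratic test function is only a stand-in for your own cost function.

```python
import numpy as np


class Adagrad:
    """Scale the step by the accumulated sum of squared gradients."""

    def __init__(self, eta=0.1, eps=1e-8):
        self.eta, self.eps = eta, eps
        self.G = 0.0

    def step(self, grad):
        self.G = self.G + grad**2
        return self.eta * grad / (np.sqrt(self.G) + self.eps)


class RMSprop:
    """Scale the step by a running average of squared gradients."""

    def __init__(self, eta=0.01, rho=0.9, eps=1e-8):
        self.eta, self.rho, self.eps = eta, rho, eps
        self.s = 0.0

    def step(self, grad):
        self.s = self.rho * self.s + (1.0 - self.rho) * grad**2
        return self.eta * grad / (np.sqrt(self.s) + self.eps)


class Adam:
    """Bias-corrected running averages of the gradient and its square."""

    def __init__(self, eta=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
        self.eta, self.beta1, self.beta2, self.eps = eta, beta1, beta2, eps
        self.m, self.v, self.t = 0.0, 0.0, 0

    def step(self, grad):
        self.t += 1
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * grad**2
        m_hat = self.m / (1.0 - self.beta1**self.t)
        v_hat = self.v / (1.0 - self.beta2**self.t)
        return self.eta * m_hat / (np.sqrt(v_hat) + self.eps)


# Illustration on f(theta) = ||theta - target||^2, whose gradient is 2*(theta - target).
target = np.array([1.0, -2.0])
theta = np.zeros(2)
optimizer = Adam(eta=0.05)
for _ in range(2000):
    theta = theta - optimizer.step(2.0 * (theta - target))
print("Adam result:", theta)
```

Inside a GD or SGD loop the only change is to replace `beta -= eta * grad` by `beta -= optimizer.step(grad)`, so the same loop can be reused for every method.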
79+
80+
You will need your SGD code for the setup of the Neural Network and
81+
Logistic Regression codes. You will find the Python "Seaborn
82+
package":"https://seaborn.pydata.org/generated/seaborn.heatmap.html"
83+
useful when plotting the results as a function of the learning rate
84+
$\eta$ and the hyper-parameter $\lambda$ when you use Ridge
85+
regression. Since you will use different gradient descent methods, you can also add Lasso regression. This is however optional. How to code Lasso regression is discussed in the lecture notes from week 40.
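
A grid search over $\eta$ and $\lambda$ for Ridge regression could be organized as in the sketch below. It assumes the cost $(1/n)\|y - X\beta\|^2 + \lambda\|\beta\|^2$, penalizes the intercept for simplicity, and uses made-up data, grid values and a helper `ridge_gd`. The resulting array `mse` has one row per learning rate and one column per $\lambda$, which is the layout a heatmap routine such as `seaborn.heatmap` expects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data from a second-order polynomial with a little noise.
n = 100
x = rng.uniform(-1.0, 1.0, n)
X = np.column_stack([np.ones(n), x, x**2])
y = 1.0 + 2.0 * x + 3.0 * x**2 + 0.05 * rng.normal(size=n)

# Simple split: the points are already in random order.
X_train, X_test = X[:80], X[80:]
y_train, y_test = y[:80], y[80:]


def ridge_gd(X, y, eta, lam, n_iter=500):
    """Plain GD on the Ridge cost (1/n)||y - X beta||^2 + lam * ||beta||^2."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = 2.0 / len(y) * X.T @ (X @ beta - y) + 2.0 * lam * beta
        beta -= eta * grad
    return beta


etas = np.logspace(-3, -1, 3)     # 0.001, 0.01, 0.1
lambdas = np.logspace(-4, 0, 5)   # 1e-4, ..., 1

mse = np.zeros((len(etas), len(lambdas)))
for i, eta in enumerate(etas):
    for j, lam in enumerate(lambdas):
        beta = ridge_gd(X_train, y_train, eta, lam)
        mse[i, j] = np.mean((y_test - X_test @ beta) ** 2)

i_best, j_best = np.unravel_index(np.argmin(mse), mse.shape)
print("Test MSE grid (rows: eta, columns: lambda):")
print(mse)
print("Best eta:", etas[i_best], "best lambda:", lambdas[j_best])
```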
86+
87+
We recommend reading chapter 8 on optimization from the textbook of Goodfellow, Bengio and Courville at URL:"https://www.deeplearningbook.org/". This chapter contains many useful insights and discussions on the optimization part of machine learning.
88+
89+
=== Part b): Writing your own Neural Network code ===
90+
91+
Your aim now, and this is the central part of this project, is to
92+
write your own Feed Forward Neural Network code implementing the back
93+
propagation algorithm discussed in the lecture slides from "week 41":"https://compphysics.github.io/MachineLearning/doc/pub/week41/ipynb/week41.ipynb" and
94+
"week 42":"https://compphysics.github.io/MachineLearning/doc/pub/week42/ipynb/week42.ipynb".
95+
96+
We will focus on a regression problem first and study either the simple second-order polynomial from part a) or the
97+
Franke function or terrain data (or both or other data sets) from
98+
project 1.
99+
100+
Discuss again your choice of cost function.
101+
102+
Write an FFNN code for regression with a flexible number of hidden
103+
layers and nodes using the Sigmoid function as activation function for
104+
the hidden layers. Initialize the weights using a normal
105+
distribution. How would you initialize the biases? And which
106+
activation function would you select for the final output layer?
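
To make the structure of such a code concrete, here is a minimal sketch of an FFNN for regression with a flexible number of layers, trained with full-batch gradient descent and back propagation. It makes one possible set of choices for the questions above, namely a small constant bias and a linear output layer with the MSE as cost function. These choices, the class layout and the hyper-parameters are assumptions for illustration only and should be discussed and tested in your own code.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def mse(y_pred, y):
    return np.mean((y_pred - y) ** 2)


class FFNN:
    """Feed-forward network: Sigmoid hidden layers, linear output, MSE cost."""

    def __init__(self, layer_sizes, rng, bias_init=0.01):
        pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
        # Weights from a normal distribution, biases set to a small constant.
        self.W = [rng.normal(0.0, 1.0, size=(n_in, n_out)) for n_in, n_out in pairs]
        self.b = [np.full(n_out, bias_init) for _, n_out in pairs]

    def forward(self, X):
        """Return the list of activations, starting with the input itself."""
        activations = [X]
        last = len(self.W) - 1
        for l, (W, b) in enumerate(zip(self.W, self.b)):
            z = activations[-1] @ W + b
            activations.append(z if l == last else sigmoid(z))
        return activations

    def backward(self, activations, y):
        """Back propagation: gradients of the MSE for all weights and biases."""
        n_layers = len(self.W)
        grads_W, grads_b = [None] * n_layers, [None] * n_layers
        delta = 2.0 * (activations[-1] - y) / y.size  # output error, linear output
        for l in reversed(range(n_layers)):
            grads_W[l] = activations[l].T @ delta
            grads_b[l] = delta.sum(axis=0)
            if l > 0:
                a = activations[l]  # Sigmoid output, so its derivative is a * (1 - a)
                delta = (delta @ self.W[l].T) * a * (1.0 - a)
        return grads_W, grads_b

    def train(self, X, y, eta=0.05, n_epochs=2000):
        for _ in range(n_epochs):
            grads_W, grads_b = self.backward(self.forward(X), y)
            for l in range(len(self.W)):
                self.W[l] -= eta * grads_W[l]
                self.b[l] -= eta * grads_b[l]


rng = np.random.default_rng(42)
x = rng.uniform(-1.0, 1.0, size=(100, 1))
y = 1.0 + 2.0 * x + 3.0 * x**2

net = FFNN([1, 8, 1], rng)  # one input, one hidden layer with 8 nodes, one output
mse_before = mse(net.forward(x)[-1], y)
net.train(x, y)
mse_after = mse(net.forward(x)[-1], y)
print("MSE before training:", mse_before)
print("MSE after training: ", mse_after)
```

More hidden layers are obtained by extending the list of layer sizes, for example `FFNN([1, 8, 8, 1], rng)`. Replacing the full-batch update in `train` with your SGD code from part a) is the natural next step.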
107+
108+
Train your network and compare the results with those from your OLS and Ridge Regression codes from project 1 if you use the Franke function or the terrain data.
109+
You should test your results against a similar code using _Scikit-Learn_ (see the examples in the above lecture notes from weeks 41 and 42) or _TensorFlow/Keras_ or _PyTorch_ (for PyTorch, see Raschka et al.'s text chapters 12 and 13).
110+
111+
Comment on your results and give a critical discussion of the results
112+
obtained with the Linear Regression code and your own Neural Network
113+
code.
114+
Make an analysis of the regularization parameters and the learning rates employed to find the optimal MSE and $R^2$ scores.
115+
116+
A useful reference on the back propagation algorithm is Nielsen's book at URL:"http://neuralnetworksanddeeplearning.com/". It is an excellent
117+
read.
118+
119+
120+
121+
=== Part c): Testing different activation functions ===
122+
123+
You should now also test different activation functions for the hidden layers. Try out the Sigmoid, the ReLU and the Leaky ReLU functions and discuss your results. You may also study the way you initialize your weights and biases.
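
The three activation functions and the derivatives needed in back propagation can be written as below. This is a minimal sketch. The function names and the Leaky ReLU slope `alpha=0.01` are illustrative choices, and the derivative of the ReLU at $z=0$ is set to zero by convention.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def relu(z):
    return np.maximum(0.0, z)


def relu_derivative(z):
    # Derivative is 1 for z > 0 and 0 otherwise (0 at z = 0 by convention).
    return (z > 0).astype(float)


def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)


def leaky_relu_derivative(z, alpha=0.01):
    return np.where(z > 0, 1.0, alpha)


z = np.array([-2.0, 0.0, 3.0])
print("ReLU:      ", relu(z))
print("Leaky ReLU:", leaky_relu(z))
print("Sigmoid:   ", sigmoid(z))
```

Passing the activation function and its derivative as a pair to the network makes it straightforward to switch between them in the hidden layers.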
124+
125+
=== Part d): Classification analysis using neural networks ===
126+
127+
128+
129+
With a well-written code it should now be easy to change the
130+
activation function for the output layer.
131+
132+
Here we will change the cost function for our neural network code
133+
developed in parts b) and c) in order to perform a classification analysis.
134+
135+
We will here study the Wisconsin Breast Cancer data set. This is a typical binary classification problem with just one single output, either True or False, $0$ or $1$ etc.
136+
You find more information about this at the "Scikit-Learn
137+
site":"https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html" or at the "University of California
138+
at Irvine":"https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original)".
139+
140+
141+
To measure the performance of our classification problem we use the
142+
so-called *accuracy* score. The accuracy is as you would expect just
143+
the number of correctly guessed targets $t_i$ divided by the total
144+
number of targets, that is
145+
146+
147+
!bt
148+
\[
149+
\text{Accuracy} = \frac{\sum_{i=1}^n I(t_i = y_i)}{n} ,
150+
\]
151+
!et
152+
153+
where $I$ is the indicator function, $1$ if $t_i = y_i$ and $0$
154+
otherwise if we have a binary classification problem. Here $t_i$
155+
represents the target and $y_i$ the outputs of your FFNN code and $n$ is simply the number of targets $t_i$.
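
The accuracy score defined above translates directly into code. The sketch below assumes that the network outputs are probabilities between $0$ and $1$, which are converted to class labels with a threshold of $0.5$. The function name and the threshold are illustrative choices.

```python
import numpy as np


def accuracy(targets, outputs, threshold=0.5):
    """Fraction of targets t_i matched by the thresholded outputs y_i."""
    targets = np.asarray(targets).ravel()
    predictions = (np.asarray(outputs).ravel() >= threshold).astype(int)
    return np.mean(predictions == targets)


t = [1, 0, 1, 1]
y = [0.9, 0.2, 0.4, 0.7]  # the third output falls below the threshold
print("Accuracy:", accuracy(t, y))
```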
156+
157+
Discuss your results and give a critical analysis of the various parameters, including hyper-parameters like the learning rates and the regularization parameter $\lambda$ (as you did in Ridge Regression), the activation functions, and the number of hidden layers and nodes.
158+
159+
160+
As stated in the introduction, it can also be useful to study other
161+
datasets.
162+
163+
Again, we strongly recommend that you compare your own neural network
164+
code for classification and pertinent results against a similar code using _Scikit-Learn_ or _TensorFlow/Keras_ or _PyTorch_.
165+
166+
167+
168+
169+
170+
=== Part e): Write your Logistic Regression code, final step ===
171+
172+
Finally, we want to compare the FFNN code we have developed with
173+
Logistic regression, that is we wish to compare our neural network
174+
classification results with the results we can obtain with another
175+
method.
176+
177+
Define your cost function and the design matrix before you start writing your code.
178+
Write thereafter a Logistic regression code using your SGD algorithm. You can also use standard gradient descent in this case, with a learning rate as hyper-parameter.
179+
Study the results as functions of the chosen learning rates.
180+
Add also an $l_2$ regularization parameter $\lambda$. Compare your results with those from your FFNN code as well as those obtained using _Scikit-Learn_'s logistic regression functionality.
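
The steps above can be sketched as follows, using the cross-entropy as cost function with an added penalty $\lambda\|\beta\|^2$ and mini-batch SGD. The data are two synthetic Gaussian clusters rather than the Wisconsin data, the intercept is penalized for simplicity, and the function name `logistic_sgd` and all parameter values are assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def logistic_sgd(X, t, eta=0.1, lam=1e-3, batch_size=16, n_epochs=50, seed=0):
    """Mini-batch SGD on the mean cross-entropy plus lam * ||beta||^2."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            prob = sigmoid(X[idx] @ beta)
            grad = X[idx].T @ (prob - t[idx]) / len(idx) + 2.0 * lam * beta
            beta -= eta * grad
    return beta


# Two synthetic, well-separated classes in two dimensions.
rng = np.random.default_rng(1)
n_per_class = 100
class0 = rng.normal(-2.0, 1.0, size=(n_per_class, 2))
class1 = rng.normal(2.0, 1.0, size=(n_per_class, 2))
features = np.vstack([class0, class1])

# Design matrix with an intercept column, targets 0 and 1.
X = np.column_stack([np.ones(2 * n_per_class), features])
t = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])

beta = logistic_sgd(X, t)
predictions = (sigmoid(X @ beta) >= 0.5).astype(int)
acc = np.mean(predictions == t)
print("Parameters:", beta)
print("Training accuracy:", acc)
```

For the real data set it is advisable to scale the features before training and to evaluate the accuracy on a separate test set.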
181+
182+
The weblink here URL:"https://medium.com/ai-in-plain-english/comparison-between-logistic-regression-and-neural-networks-in-classifying-digits-dc5e85cd93c3" compares logistic regression and FFNN using the so-called MNIST data set. You may find several useful hints and ideas from this article.
183+
184+
185+
=== Part f) Critical evaluation of the various algorithms ===
186+
187+
After all these glorious calculations, you should now summarize the
188+
various algorithms and come up with a critical evaluation of their pros
189+
and cons. Which algorithm works best for the regression case and which
190+
is best for the classification case? These codes can also be part of
191+
your final project 3, but now applied to other data sets.
192+
193+
194+
195+
196+
===== Background literature =====
197+
198+
o The text of Michael Nielsen is highly recommended, see Nielsen's book at URL:"http://neuralnetworksanddeeplearning.com/". It is an excellent read.
199+
200+
o Goodfellow, Bengio and Courville, Deep Learning at URL:"https://www.deeplearningbook.org/". Here we recommend chapters 6, 7 and 8.
201+
202+
o Raschka et al. at URL:"https://sebastianraschka.com/blog/2022/ml-pytorch-book.html". Here we recommend chapters 11, 12 and 13.
203+
204+
===== Introduction to numerical projects =====
205+
206+
Here follows a brief recipe and recommendation on how to write a report for each
207+
project.
208+
209+
* Give a short description of the nature of the problem and the numerical methods you have used.
210+
211+
* Describe the algorithm you have used and/or developed. Here you may find it convenient to use pseudocoding. In many cases you can describe the algorithm in the program itself.
212+
213+
* Include the source code of your program. Comment your program properly.
214+
215+
* If possible, try to find analytic solutions, or known limits in order to test your program when developing the code.
216+
217+
* Include your results either in figure form or in a table. Remember to label your results. All tables and figures should have relevant captions and labels on the axes.
218+
219+
* Try to evaluate the reliability and numerical stability/precision of your results. If possible, include a qualitative and/or quantitative discussion of the numerical stability, possible loss of precision etc.
220+
221+
* Try to give an interpretation of your results in your answers to the problems.
222+
223+
* Critique: if possible include your comments and reflections about the exercise, whether you felt you learnt something, ideas for improvements and other thoughts you've made when solving the exercise. We wish to keep this course at the interactive level and your comments can help us improve it.
224+
225+
* Try to establish a practice where you log your work at the computer lab. You may find such a logbook very handy at later stages in your work, especially when you don't properly remember what a previous test version of your program did. Here you could also record the time spent on solving the exercise, various algorithms you may have tested or other topics which you feel are worth mentioning.
226+
227+
228+
229+
230+
231+
232+
===== Format for electronic delivery of report and programs =====
233+
234+
The preferred format for the report is a PDF file. You can also use DOC or PostScript formats, or hand in an IPython notebook file. As programming language we prefer that you choose between C/C++, Fortran2008 or Python. The following prescription should be followed when preparing the report:
235+
236+
* Use Canvas to hand in your projects, log in at URL:"https://www.uio.no/english/services/it/education/canvas/" with your normal UiO username and password.
237+
238+
* Upload _only_ the report file or the link to your GitHub/GitLab or similar type of repository! For the source code file(s) you have developed please provide us with your link to your GitHub/GitLab or similar domain. The report file should include all of your discussions and a list of the codes you have developed. Do not include library files which are available at the course homepage, unless you have made specific changes to them.
239+
240+
* In your GitHub/GitLab or similar repository, please include a folder which contains selected results. These can be in the form of output from your code for a selected set of runs and input parameters.
241+
242+
243+
Finally,
244+
we encourage you to collaborate. Optimal working groups consist of
245+
2-3 students. You can then hand in a common report.
246+
247+
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
#!/bin/sh
2+
doconce clean
3+
rm -rf *.pdf *.tex ipynb*.tar.gz *.html ._*.html *~ reveal.js Trash README.txt
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
1+
#!/bin/bash
2+
set -x
3+
4+
function system {
5+
"$@"
6+
if [ $? -ne 0 ]; then
7+
echo "make.sh: unsuccessful command $@"
8+
echo "abort!"
9+
exit 1
10+
fi
11+
}
12+
13+
if [ $# -eq 0 ]; then
14+
echo 'bash make.sh slides1|slides2'
15+
exit 1
16+
fi
17+
18+
name=$1
19+
rm -f *.tar.gz
20+
21+
opt="--encoding=utf-8"
22+
opt=
23+
24+
rm -f *.aux
25+
26+
27+
28+
# Plain HTML documents
29+
html=${name}
30+
system doconce format html $name --pygments_html_style=default --html_style=bloodish --html_links_in_new_window --html_output=$html $opt
31+
system doconce split_html $html.html --method=space10
32+
33+
# Bootstrap style
34+
html=${name}-bs
35+
system doconce format html $name --html_style=bootstrap --pygments_html_style=default --html_admon=bootstrap_panel --html_output=$html $opt
36+
system doconce split_html $html.html --method=split --pagination --nav_button=bottom
37+
38+
# IPython notebook
39+
system doconce format ipynb $name $opt
40+
41+
42+
# Ordinary plain LaTeX document
43+
system doconce format pdflatex $name --print_latex_style=trac --latex_admon=paragraph $opt
44+
system doconce ptex2tex $name envir=verbatim
45+
# Add special packages
46+
doconce subst "% Add user's preamble" "\g<1>\n\\usepackage{simplewick}" $name.tex
47+
doconce replace 'section{' 'section*{' $name.tex
48+
pdflatex -shell-escape $name
49+
pdflatex -shell-escape $name
50+
mv -f $name.pdf ${name}.pdf
51+
cp $name.tex ${name}.tex
52+
53+
# Publish
54+
dest=../../../../Projects/2024
55+
if [ ! -d $dest/$name ]; then
56+
mkdir $dest/$name
57+
mkdir $dest/$name/pdf
58+
mkdir $dest/$name/html
59+
mkdir $dest/$name/ipynb
60+
fi
61+
cp ${name}*.tex $dest/$name/pdf
62+
cp ${name}*.pdf $dest/$name/pdf
63+
cp -r ${name}*.html ._${name}*.html $dest/$name/html
64+
65+
# Figures: cannot just copy link, need to physically copy the files
66+
if [ -d fig-${name} ]; then
67+
if [ ! -d $dest/$name/html/fig-$name ]; then
68+
mkdir $dest/$name/html/fig-$name
69+
fi
70+
cp -r fig-${name}/* $dest/$name/html/fig-$name
71+
fi
72+
73+
cp ${name}.ipynb $dest/$name/ipynb
74+
ipynb_tarfile=ipynb-${name}-src.tar.gz
75+
if [ ! -f ${ipynb_tarfile} ]; then
76+
cat > README.txt <<EOF
77+
This IPython notebook ${name}.ipynb does not require any additional
78+
programs.
79+
EOF
80+
tar czf ${ipynb_tarfile} README.txt
81+
fi
82+
cp ${ipynb_tarfile} $dest/$name/ipynb
83+
84+
85+
86+
87+
