This is my repository for the M1 coursework. The code for this project is in the Jupyter Notebooks in the root directory of this repository.
- notebook1.ipynb contains the code for Questions 1 and 2 of the coursework.
- notebook2.ipynb contains the code for Question 3 of the coursework.
- notebook3.ipynb contains the code for Question 4 of the coursework.
- notebook4.ipynb contains the code for Question 5 of the coursework.
The report for this project is in PDF format and is located in the report directory.
Note that for Question 2, the models/weights for the best performing neural network are saved in best_models/model_945.keras and best_models/model_945.weights.h5. The other models in the plots for Question 2 are saved in the init_models directory.
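As a minimal sketch of how the saved model might be reloaded, assuming the file is in the standard Keras format and that TensorFlow is installed in the environment: the helper name load_best_model is hypothetical and does not appear in the notebooks. The import is deferred so that the snippet can be read and imported without TensorFlow present.

```python
from pathlib import Path

# Paths as given in this README, relative to the repository root.
MODEL_PATH = Path("best_models/model_945.keras")
WEIGHTS_PATH = Path("best_models/model_945.weights.h5")


def load_best_model(model_path=MODEL_PATH):
    """Load the full saved model (architecture and weights).

    Hypothetical helper for illustration; the notebooks may load the
    model differently.
    """
    # Imported here so that TensorFlow is only required when the
    # function is actually called.
    from tensorflow import keras

    return keras.models.load_model(str(model_path))


# The .weights.h5 file holds weights only. It can be applied to a model
# with a matching architecture via model.load_weights(str(WEIGHTS_PATH)).
```

Calling load_best_model() from the root directory of the repository should return the model, on which summary() or predict() can then be called.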
Clone this GitLab repository to your local machine:

    git clone https://gitlab.developers.cam.ac.uk/phy/data-intensive-science-mphil/assessments/m1_coursework/fm565.git

Create a conda environment by running

    conda env create -f environment.yml

in the root directory of this repository.
This will create a new conda environment called M1Coursework. This will install the necessary packages for this project, listed in the requirements.txt file.
Activate the environment by running:

    conda activate M1Coursework

This may automatically create a Jupyter Kernel for the new environment. If not, you can create a kernel manually e.g.

    python -m ipykernel install --user --name M1Coursework --display-name "M1Coursework (Python 3.11)"

You should now be able to run the notebooks in this repository.
To deactivate the conda environment, run conda deactivate.
Microsoft Copilot was used in the following cases:
- In creating a train-validation-test split, Copilot suggested splitting into a 'train' and 'temp' set, and then splitting the 'temp' data into 'validation' and 'test' sets. I then implemented this manually.
- In understanding how to set the right random seeds to ensure reproducibility (e.g. using random, numpy, tensorflow).
- In applying t-SNE to the embedding layer, Copilot suggested a way of constructing a model where the output is the penultimate layer of a neural network. I then implemented this manually for the best performing neural network I had previously saved.
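The three techniques listed above can be sketched roughly as follows. This is an illustration only, not the code used in the notebooks: the function names, the split fractions and the seed values are all assumptions, and the split is written with the standard library so that it runs without extra packages.

```python
import random


def set_seeds(seed):
    """Seed the Python, NumPy and TensorFlow RNGs (hypothetical helper).

    NumPy and TensorFlow are seeded only if they are installed.
    """
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)
    except ImportError:
        pass


def train_val_test_split(items, val_frac=0.15, test_frac=0.15, seed=0):
    """Two-stage split: train vs temp, then temp into validation and test.

    The default fractions are assumed values chosen for illustration.
    """
    rng = random.Random(seed)  # local RNG, independent of global state
    shuffled = list(items)
    rng.shuffle(shuffled)

    # Stage 1: hold out a 'temp' set containing validation + test data.
    n_temp = round(len(shuffled) * (val_frac + test_frac))
    temp, train = shuffled[:n_temp], shuffled[n_temp:]

    # Stage 2: divide 'temp' into test and validation sets.
    n_test = round(len(temp) * test_frac / (val_frac + test_frac))
    test, val = temp[:n_test], temp[n_test:]
    return train, val, test


def make_embedding_model(model):
    """Return a model whose output is the penultimate layer of `model`.

    Assumes `model` is a built Keras model; TensorFlow is imported only
    when this function is called.
    """
    from tensorflow import keras

    return keras.Model(inputs=model.inputs, outputs=model.layers[-2].output)


train, val, test = train_val_test_split(range(100), seed=42)
```

The output of make_embedding_model(model).predict(x) could then be passed to a t-SNE implementation such as the one in scikit-learn.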
A declaration of the use of generative tools in writing the report is given in the report itself.