### Issue type

Need help

### Summary
Some functions in `/cellbox/train.py` are ambiguous about what task they perform. Understanding them is crucial for reproducing similar results in the PyTorch version of CellBox, so this issue is for resolving that ambiguity.
### Details
- At lines 76 to 79 of `train.py`, are `loss_valid_i` and `loss_valid_mse_i` evaluated on one random batch fetched from `args.feed_dicts['valid_set']`, or are these losses evaluated on the whole validation set?
- The `eval_model` function returns different values in different calls. At lines 101 to 103, it returns both the total and MSE loss over `args.n_batches_eval` batches of the validation set. At lines 109 to 111, it returns only the MSE loss over `args.n_batches_eval` batches of the test set. And at line 262, it returns the expression predictions `y_hat` for the whole test set. Are all of these statements correct?
- The `record_eval.csv` file generated after training with the default arguments and config file specified in the README (`python scripts/main.py -config=configs/Example.random_partition.json`) has its `test_mse` column set to None. Is this the expected behaviour of the code?
- `random_pos.csv`, generated after training, stores the indices of the perturbation conditions. Does it indicate how the conditions are split into training, validation, and test sets?
- After each substage, say substage 6, the code generates `6_best.y_hat.loss.csv`, containing the expression predictions for the test-set perturbation conditions across all nodes, but the file does not indicate which row corresponds to which perturbation condition. How are this file and `random_pos.csv` related?
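For concreteness, here is a minimal sketch of the row-to-condition mapping I am currently assuming for the last two questions. Everything here is a guess, not confirmed behaviour: that `random_pos.csv` lists the shuffled condition indices in `[train | valid | test]` order, that the split sizes (a hypothetical 70/10/20 here) come from the config, and that row `i` of `6_best.y_hat.loss.csv` corresponds to the `i`-th test index.

```python
import numpy as np

# Hypothetical stand-in for random_pos.csv: a shuffled list of the
# perturbation-condition indices (here 100 conditions).
rng = np.random.default_rng(0)
random_pos = rng.permutation(100)

# Assumption: random_pos is ordered as [train | valid | test], with a
# hypothetical 70/10/20 split (the real ratios would come from the config).
n_train, n_valid = 70, 10
test_idx = random_pos[n_train + n_valid:]

# Assumption: row i of 6_best.y_hat.loss.csv corresponds to test_idx[i],
# i.e. predictions are written in the order of the test partition.
row_to_condition = {row: cond for row, cond in enumerate(test_idx)}
print(len(row_to_condition))  # 20 test rows
```

If this interpretation is wrong, a pointer to where the split and the prediction ordering are actually determined in the code would resolve both questions.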