@tanikina
Collaborator

@tanikina tanikina commented Mar 17, 2025

This PR adds a brief project description, an illustration of the "probing" workflow, and a citation for the current arXiv publication. The citation needs to be updated once the proceedings are out.

I also updated our pre-commit config because the previous version seems to be broken due to the docstring formatting. Originally we had docstring v1.4 in .pre-commit-config.yaml, which results in the following error message:

    pre_commit.clientlib.InvalidManifestError:
    =====> /home/username/.cache/pre-commit/repo_d3cgo1i/.pre-commit-hooks.yaml is not a file

I also had to pin an older version, wandb-0.16.0, in requirements.txt, because it still allows importing wandb.wandb_torch. Unfortunately, the current one (wandb-0.19.8) does not support this, and the tests fail.
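
A quick way to confirm whether a given environment is affected is to probe the import directly. The following is only a minimal sketch: the helper name `can_import` is hypothetical and not part of this repository, and the only module name taken from this PR is `wandb.wandb_torch`.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if the module can be imported in this environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package and a missing submodule are both reported as False.
        return False
    return True


if __name__ == "__main__":
    # Module name as mentioned in this PR; the message is illustrative only.
    if can_import("wandb.wandb_torch"):
        print("wandb.wandb_torch is importable")
    else:
        print("wandb.wandb_torch is not importable; check the pinned wandb version")
```

Running this before the test suite gives a clearer message than a failing test import.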

Another issue appeared when running the tests: "AttributeError: np.float_ was removed in the NumPy 2.0 release. Use np.float64 instead." So, for now I simply added an older version of numpy to requirements.txt; the code works with numpy==1.24.1. Or should I rather update everything to make it work with NumPy 2.0?

EDIT: Do we need to remove DFKI-related info from the setup description (e.g. where to find the data on the cluster)?

Important! This repository needs to be made public, so that we can add a working link to it in the paper :)

PS: any improvement suggestions, feedback and comments are very welcome!

@tanikina tanikina requested a review from ArneBinder March 17, 2025 20:31
Collaborator

@ArneBinder ArneBinder left a comment

> I also updated our pre-commit config because the previous version seems to be broken due to the docstring formatting,

👍

> I also had to pin an older version, wandb-0.16.0, in requirements.txt, because it still allows importing wandb.wandb_torch. Unfortunately, the current one (wandb-0.19.8) does not support this, and the tests fail.

👍

> Another issue appeared when running the tests: "AttributeError: np.float_ was removed in the NumPy 2.0 release. Use np.float64 instead." So, for now I simply added an older version of numpy to requirements.txt; the code works with numpy==1.24.1. Or should I rather update everything to make it work with NumPy 2.0?

No, using numpy<2.0.0 should be fine; upgrading sounds like a pain.
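
To make the pin fail loudly instead of surfacing later as the `np.float_` AttributeError, one option is a small version guard. This is a sketch under assumptions: the function names are hypothetical, and the check only compares the major version, which is enough for `numpy<2.0.0` but is not a full version-specifier parser.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string: str) -> int:
    """Return the leading integer of a version string such as '1.24.1'."""
    return int(version_string.split(".")[0])


def satisfies_numpy_pin(version_string: str) -> bool:
    """Return True when the version is below 2.0.0, i.e. matches numpy<2.0.0."""
    return major_version(version_string) < 2


if __name__ == "__main__":
    try:
        installed = version("numpy")
    except PackageNotFoundError:
        print("numpy is not installed")
    else:
        status = "ok" if satisfies_numpy_pin(installed) else "violates numpy<2.0.0"
        print(f"numpy {installed}: {status}")
```

If the code is migrated to NumPy 2.0 later, replacing `np.float_` with `np.float64` (as the error message suggests) works on both major versions, and the guard can be dropped.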

> EDIT: Do we need to remove DFKI-related info from the setup description (e.g. where to find the data on the cluster)?

I would not bother

> Important! This repository needs to be made public, so that we can add a working link to it in the paper :)

done

> PS: any improvement suggestions, feedback and comments are very welcome!

  1. There is a command in the readme that uses a script usrun.sh but I do not see it anywhere.
  2. Would it be possible to mention the commands to execute the actual experiments? Considering the current readme it is not really clear how to reproduce the results from the paper (but maybe I overlooked something).

@tanikina
Collaborator Author

Thanks for the feedback!

> 1. There is a command in the readme that uses a script usrun.sh but I do not see it anywhere.

Right, I added a short description in the README file. usrun.sh is simply a wrapper that specifies the container-mounts, container-image etc. in a separate script. One can also use a standard srun command, of course. I just copied it from my command line and forgot that it was a bit custom, sorry.
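
For readers without usrun.sh, the idea of the wrapper can be sketched as follows. This is not the actual script: the function name, image path, mount list and experiment command are all hypothetical placeholders, and the sketch assumes that the cluster's srun accepts `--container-image` and `--container-mounts` options matching the settings named above.

```python
import shlex
from typing import List


def build_srun_command(image: str, mounts: List[str], command: List[str]) -> List[str]:
    """Assemble an srun argument list with the container settings in front."""
    return [
        "srun",
        f"--container-image={image}",
        # Several mounts are passed as one comma-separated value.
        f"--container-mounts={','.join(mounts)}",
        *command,
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    args = build_srun_command(
        image="/path/to/image.sqsh",
        mounts=["/data:/data", "/home/username:/home/username"],
        command=["python", "train.py"],
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(args))
```

Keeping the container options in one place is the same design choice as the wrapper script: the experiment command stays short and the cluster-specific details are easy to swap.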

> 2. Would it be possible to mention the commands to execute the actual experiments? Considering the current readme it is not really clear how to reproduce the results from the paper (but maybe I overlooked something).

This is a bit tricky since we had a LOT of different experiments. They are quite well documented in both results/coref.md and log.md, which are linked to each other. I added a brief explanation here. Let me know if this is sufficient or whether I should provide more precise instructions. I believe that it should be possible to reproduce our results by simply following the links in the tables from results/coref.md. All configurations are already available in configs.

@tanikina tanikina merged commit a19a333 into main Mar 19, 2025
2 checks passed
@tanikina tanikina deleted the project_description branch March 19, 2025 17:39