-
Notifications
You must be signed in to change notification settings - Fork 1
Diffusion model #64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Diffusion model #64
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #64 +/- ##
===========================================
+ Coverage 97.58% 97.83% +0.25%
===========================================
Files 13 16 +3
Lines 663 786 +123
===========================================
+ Hits 647 769 +122
- Misses 16 17 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I realise I should probably not allow the inputs to be |
rozyczko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments, but nothing serious enough to not approve.
| if isinstance(Q, Numeric): | ||
| Q = np.array([Q]) | ||
|
|
||
| if isinstance(Q, list): | ||
| Q = np.array(Q) | ||
|
|
||
| if not isinstance(Q, np.ndarray): | ||
| raise TypeError("Q must be a number, list, or numpy array.") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a copy of the check in calculate_width - consider refactoring out.
Something like this maybe?
def _validate_and_convert_Q(self, Q) -> np.ndarray:
if isinstance(Q, Number):
Q = np.array([Q])
if isinstance(Q, list):
Q = np.array(Q)
if not isinstance(Q, np.ndarray):
raise TypeError("Q must be a number, list, or numpy array.")
return QThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And consider adding the Q.ndim check below to that as well.
| raise TypeError("diffusion_coefficient must be a number.") | ||
| self._diffusion_coefficient.value = diffusion_coefficient | ||
|
|
||
| def calculate_width(self, Q: np.ndarray) -> np.ndarray: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q is type-hinted as np.ndarray, but the code manually handles Numeric and list.
Maybe accept ArrayLike to cover both lists and arrays? Then simply convert to np.array at the start.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll eventually also want to accept sc.Variable, but I didn't want to deal with all the checks and conversions
| "angstrom": self._angstrom, | ||
| } | ||
|
|
||
| def __repr__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can easily be in the base class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dependency map expression is different for diferent models, and same for the __repr__ - other models have more Parameters than just D and scale
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but in case the child doesn't override __repr__ we probably should have some default way of showing what the model is.
| ------- | ||
| str or sc.Unit or None | ||
| """ | ||
| return self._unit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This differs from def unit in ModelComponent. Consider standardizing.
Are we returning sc.Unit/str or just str?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just the str, thanks for catching it
| diffusion_coefficient : float or Parameter, optional | ||
| Diffusion coefficient D. If a number is provided, it is assumed to be in the unit given by diffusion_unit. Defaults to 1.0. | ||
| diffusion_unit : str, optional | ||
| Unit for the diffusion coefficient D. Default is "meV*Å**2". Options are 'meV*Å**2' or 'm**2/s' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. the current default is m**2/s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops
| def _write_width_dependency_map_expression(self) -> Dict[str, str]: | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value is not Dict[str, str], since self.diffusion_coefficient is a Parameter and self._hbar and self._angstrom are Descriptors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, I blame copilot and myself for trusting copilot
| for Q_value in Q: | ||
| # Q is given as a float, so we need to divide by angstrom**2 to get the right units | ||
| width = ( | ||
| self._hbar | ||
| * self.diffusion_coefficient | ||
| * Q_value**2 | ||
| / (self._angstrom**2) | ||
| ) | ||
| width.convert_unit(self.unit) | ||
| width_list.append(width.value) | ||
| width = np.array(width_list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really should be able to vectorize this.
This would be awesome to have:
width_var = (
self._hbar
* self.diffusion_coefficient
* Q_arr**2
/ (self._angstrom**2)
)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be awesome. But to do that we need to extend our arithmetic dunder methods to allow arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then let's have it done in the core...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unit_conversion_factor = (
self._hbar * self.diffusion_coefficient / (self._angstrom**2)
)
unit_conversion_factor.convert_unit(self.unit)
width = Q**2 * unit_conversion_factor.value
This may be more complicated with other diffusion models, but at least here it's simple.
damskii9992
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly minor comments, was one of your easier PRs to review :)
| "DeltaFunction", | ||
| "DampedHarmonicOscillator", | ||
| "Polynomial", | ||
| "BrownianTranslationalDiffusion", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note, maybe you don't want everything exposed at the top-level?
More obscure/rare functionality makes sense to require a full path to import, just like how it makes sense to import your components from the components group:
from easydynamic.sample_model.components import Gaussian, LorentzianRather than:
from easydynamics import GaussianOr whatever your modules are called. Just a note for consideration :)
| raise TypeError("diffusion_coefficient must be a Parameter or a number.") | ||
|
|
||
| if not isinstance(diffusion_unit, str): | ||
| raise TypeError("diffusion_unit must be 'meV*Å**2' or 'm**2/s'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message here should be: "diffusion_unit must be a valid string, such as: 'meV*Å2' or 'm2/s'."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well right now it's only allowed to be one of those two specific strings :)
| raise ValueError("diffusion_unit must be 'meV*Å**2' or 'm**2/s'.") | ||
|
|
||
| if not isinstance(scale, Parameter): | ||
| scale = Parameter(name="scale", value=float(scale), fixed=False, min=0.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you allow users to pass Parameters as well, then you also need to check if the given parameter has the correct unit. And maybe also "min" in this case?
And the same for the diffusion_cofficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm filing this under the same "don't allow parameters" issue as the components. I might just remove it since I don't think I need it.
| """ | ||
|
|
||
| if not (unit is None or isinstance(unit, (str, sc.Unit))): | ||
| raise TypeError("unit must be None, a string, or a scipp Unit") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not enough to check its type, you also need to check if this is actually a valid unit. Otherwise passing stuff like: "test" would work, even though that is clearly not a valid unit. Consider adding a:
try:
test = Descriptor(unit=unit)
test.to('meV')
except exception as e:
raise UnitError('Not a valid unit etc.) from e| unit: str | sc.Unit = "meV", | ||
| scale: Parameter | Numeric = 1.0, | ||
| diffusion_coefficient: Parameter | Numeric = 1.0, | ||
| diffusion_unit: str = "m**2/s", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit confusing that diffusion_unit cannot be a sc.Unit, I would standardize, either always allow both, or always only allow a string.
And this is of course not just for this PR, but throughout your codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I guess it can be a sc.Unit. The issue is that units are a mess with diffusion models since people (myelf included) tend to absorb hbar here and there and not be explicit about it.
| if isinstance(Q, Numeric): | ||
| Q = np.array([Q]) | ||
|
|
||
| if isinstance(Q, list): | ||
| Q = np.array(Q) | ||
|
|
||
| if not isinstance(Q, np.ndarray): | ||
| raise TypeError("Q must be a number, list, or numpy array.") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And consider adding the Q.ndim check below to that as well.
| QISF = self.calculate_QISF(Q) | ||
|
|
||
| # Create a Lorentzian component for each Q-value, with width D*Q^2 and area equal to scale. No delta function, as the EISF is 0. | ||
| for i in range(len(Q)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use enumerate instead.
| Write the dependency expression for the width as a function of Q to make dependent Parameters. | ||
| """ | ||
| if not isinstance(Q, (float)): | ||
| raise TypeError("Q must be a float.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't REALLY need checks on internal functions inputs, as you yourself is the only user (or other developers).
| ) | ||
|
|
||
| # Resolving the dependency can do weird things to the units, so we make sure it's correct. | ||
| lorentzian_component.width.convert_unit(self.unit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhh, if resolving the dependency makes the unit weird, this line only solves it temporarily. As soon as the fit runs, this unit will be weird again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes. Halp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can add the convert_unit to the dependency expression? That way it will be evaluated everytime the dependency is updated as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to make a unit test to check if this works though. So make a diffusion_model and its component collection, then update the value of the diffusion_constant, and then check the unit/value of the component_collections Lorentzian.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this is a serious problem. The dependency expression is
f"hbar * D* {Q} **2*1/(angstrom**2)"
hbar is in mev*s, D is in m**2/s and angstrom is in m**2, so the resulting unit should be meV. However, it becomes J. This is easy to fix with convert_unit, but of course problematic if it changes every fit. Does this mean I need to rethink how I do the dependency, or do we need to add a unit option to dependencies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem to work. I tried
return f"(hbar * D* {Q} **2*1/(angstrom**2)).convert_unit('meV')"
But get
A Parameter with unique_name meV does not exist. Please check your dependency expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything in a string is assumed to be a unique name by the parser, which makes sense. It seems the best solution is to allow specifying a desired unit for dependencies. I'll bring it up at standup tomorrow :)
| "angstrom": brownian_diffusion_model._angstrom, | ||
| } | ||
|
|
||
| assert expression_map == expected_map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't test internal methods unless you're afraid that they will break. Internal methods are implementation details so such tests will test your implementation rather than your functionality.
Remember, a 100% tested codebase is over-tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should very much test internal methods, but preferably through the external API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm doing both :) We can always delete the test if they get annoying, but last time I added a test to go from 98 to 100% coverage I found a bug, so even though I agree in principle, I fear I need to over-test my code at least for the time being.
Basic diffusion model.
I've kept it a bit simpler than some of the other PRs in terms of allowed inputs, to get things flowing faster.
Once the overall structure is approved, I have ~6 more diffusion models to implement following the same style.