-
Notifications
You must be signed in to change notification settings - Fork 97
Combined multifluid derivatives to reduce fluid kernel wrapper size #1779
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
😨 🎉 👍 🥇 |
TotoGaz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job so far, can't wait to see if anything can be done about #1225 (comment) 🤞
src/coreComponents/physicsSolvers/multiphysics/MultiphasePoromechanicsKernel.hpp
Show resolved
Hide resolved
| { | ||
| dPhaseVolFrac_dComp[ip][jc] = | ||
| (dPhaseFrac_dComp[ip][jc] - phaseVolFrac[ip] * dPhaseDens_dComp[ip][jc]) * phaseDensInv; | ||
| (dPhaseFrac[ip][Deriv::dC+jc] - phaseVolFrac[ip] * dPhaseDens[ip][Deriv::dC+jc]) * phaseDensInv; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I bet this Deriv::dC+jc is a trick to put keep the same dimension for all your arrays?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in the current version of the PR, the derivatives are grouped and ordered as follows:
- 0: derivative wrt pressure
- 1: derivative wrt temperature
- 2: first derivative wrt component fraction
With this approach, there is only one array storing derivatives for a given physical quantity. For instance, for the phase component fractions (of size: numPhases x numComps), there is one array of derivatives of size numPhases x numComps x numDofs, with numDofs = numComps + 2.
|
The code in the PR is now working and ready for review. The new implementation is a little more difficult to read than before and introduces a different treatment between the derivatives living in the solver ( I would be glad to try implementing this approach: #1225 (comment) and #1225 (comment), if we want something cleaner. |
klevzoff
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that that, in addition to solving a technical problem, this PR is a net decrease in code size, and readability is not affected much. In fact, I'm glad to get rid of dX_dY names - there was always something unsettling about those underscores violating our naming rules.
| internal::ArraySliceOrRef< T, DIM, USD > dPres; /// derivative w.r.t. pressure | ||
| internal::ArraySliceOrRef< T, DIM, USD > dTemp; /// derivative w.r.t. temperature | ||
| internal::ArraySliceOrRef< T, DIM + 1, USD_DC > dComp; /// derivative w.r.t. composition | ||
| internal::ArraySliceOrRef< T, DIM + 1, USD_DC > derivs; /// derivative w.r.t. pressure, temperature, compositions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where (along with View class below) where we would potentially add dP, dT and dC methods as described in #1225 (comment) and #1225 (comment) if we agree on that idea. The raw slice derivs need to remain accessible though, because it must be passed directly to applyChainRule.
This would somewhat decrease the flexibility of these classes however. For example, currently they are usable both with and without temperature derivatives - the owner decides whether or not it is present by resizing and then using the derivs array accordingly. By adding a dT() method we are forcing the temperature derivative to be always present. These classes could also be used outside of MultiFluid family, anywhere one need to bundle values with a bunch of derivatives - for example in RelativePermeabilityBase and CapillaryPressureBase, physics solvers, etc.
For that reason I'd propose a refactoring in which this family of classes is moved into some shared place (common probably?) under a different name (ValuesAndDerivatives[View|Slice]?), while here we create MultiFluid-specific wrappers that either inherit from or contain the above classes and add the specialized access methods. I would propose maybe for this to be a separate PR.
I am also not opposed to just keeping the use of Deriv::dP, etc. as explicit offsets for the foreseeable future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I perceive this pattern as really similar (in nature) to the extrinsicMeshData wrappings of #1750 (the current situation brings additional complexity).
There would be an extremely carefully designed implementation (in the current case, some CompositionalVarView, in the case of #1750, the *StencilAccessors), that may be "specialized" depending on the context. That way, if you do not want the dT, then don't ask your smart implementation for space to store dT, and don't provide the dT accessors. Therefore why would we lose flexibility?
For example, it's fine for the underlying implementation to provide variadic accessors real64 & dT( INDICES ... indices ) but providing a more constrained real64 & dT( int a, int b, int c ) member function will be simpler in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would vote for the following plan:
- We merge this PR so that I can then finish and merge Activated thermal option in CO2-brine fluid model #1755 using the derivative format implemented in this PR
- Once Activated thermal option in CO2-brine fluid model #1755 is merged, we work on implementing the
ValuesAndDerivativesgetters for the constitutive models and solvers
Even though this plan implies a transition period during which Deriv::dP etc will be used, it will allow me to start testing #1225 soon, which is important.
Regarding the idea of implementing getters (dP(), dT(), dC()) to get the derivatives in the kernels, will it slow down the code a little bit? Or not significantly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That way, if you do not want the dT, then don't ask your smart implementation for space to store dT, and don't provide the dT accessors. Therefore why would we lose flexibility?
What I meant is, if we just added dT() getter to this class, it would become unusable in context where there is no temperature dependence. But of course as you said, if we provide appropriately specialized wrappers for different contexts, there will be no issues 👍
I would vote for the following plan:
Sounds good to me!
Regarding the idea of implementing getters (
dP(),dT(),dC()) to get the derivatives in the kernels, will it slow down the code a little bit? Or not significantly?
I expect them to be inlined 100% of the time, so no slowdown, if only a tiny compilation time hit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go! 🚗 💨
I would vote for the following plan:
If I were you, I would open an issue with multiple tick boxes, references to meaningful comments, etc.
TotoGaz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving forward!
| template< typename VEC1, typename VEC2 > | ||
| GEOSX_HOST_DEVICE | ||
| void copy( integer const N, VEC1 const & v1, VEC2 const & v2 ) | ||
| void copy( integer const N, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to bother you with this, but documenting why we need this firstDeriv could be helpful (in which context we need this, e.g.)
More important, instead of providing firstDeriv, couldn't v2 be a range/slice instead? That would keep this function more mathematical and less implementation aware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as we discussed, it would be nice to passe a slice or range, but I don't know how to do it.
Here for example:
https://github.com/GEOSX/GEOSX/blob/3e72aadf62b7083dd7f434052f3d90c8f3470230/src/coreComponents/constitutive/fluid/MultiFluidBase.hpp#L722
my phaseFrac.derivs[ip] is already a slice of size numComp+2, but I don't need first two entries (derivatives with respect to pressure and temperature). I would need a way to skip them, maybe by doing:
applyChainRuleInPlace( numComp, dCompMoleFrac_dCompMassFrac, phaseFrac.derivs[ip][Deriv::dC], work );
Now that I am writing this, I should probably try that to see if it works.
For now, I added the Doxygen documentation in the applyChainRule function to explain why we have to pass Deriv::dC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case v2 represents a general single-subscriptable entity, which may be a local array or an ArraySlice. The latter may be not contiguous in memory, and in that case precomputing an offset pointer is not safe or correct. So we have to rely on the type's operator[] to do the job at all times.
Theoretically it's possible to write a SubSlice type which represents a sub-selection of an ArraySlice along one or more dimensions. The type would store the offset(s) internally and apply as needed. But we don't have it yet.
| integer counter = 0; | ||
| for( integer iComp = 0; iComp < 3; ++iComp ) | ||
| { | ||
| for( localIndex iPres = 0; iPres < 3; ++iPres ) | ||
| for( integer iPres = 0; iPres < 3; ++iPres ) | ||
| { | ||
| for( localIndex iTemp = 0; iTemp < 3; ++iTemp ) | ||
| for( integer iTemp = 0; iTemp < 3; ++iTemp ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nice that you are changing these localIndex into integer but I do think that you are wasting your energy.
If we really want to do this, then a carefully written search/replace on the whole code base will do the duty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I heard the discussion this morning about integer loops but I still do not fully understand. Take this loop
for( integer ip = 0; ip < numPhase; ++ip )
{
phaseMolecularWeight[ip] = props.getMolecularWeight( m_phaseTypes[ip] ).value;
phaseMolecularWeight[ip][Deriv::dP] = props.getMolecularWeight( m_phaseTypes[ip] ).dP;
[...]I thing the indexing type of phaseMolecularWeight or phaseMolecularWeight is localIndex. So what is the point at using integer instead of phaseMolecularWeight::IndexType which appear to be a localIndex? Performances really get improved there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me this is not about performance but semantics. A localIndex is a counterpart to globalIndex where "local" and "global" refer, generally speaking, to parallel partition of the physical domain and mesh. Therefore these terms make sense when indexing elements, nodes, faces, etc. which can be either local or global. Model properties like number of phases or components are in a different category - they are much smaller, and the notion of local/global does not apply to them. Ideally we could introduce separate aliases for them, e.g.
using phaseIndex = /* some type */;
using componentIndex = /* some type */;
but I thought this might be a bit overkill. Therefore I started changing, and suggesting to others in reviews (mainly @francoishamon who did most of the work), those indices to a generic integer. As a secondary concern, we also get to save some bytes in kernel wrapper objects, which sometimes need to carry phase/component indices as members. But it's a really minor saving compared to what this PR achieves at scale.
Now, phaseMolecularWeight::IndexType is indeed currently a localIndex. But that is also wrong in my opinion! For certain "small" arrays (e.g. phaseMolecularWeight or componentMolecularWeight) we could be using a smaller type such as integer as index type, and it would also help a bit with reducing the size of kernel arguments. I don't advocate necessarily that we implement this change, because the increasing complexity in the type system has cognitive costs (there would be array1d<T> and some array1dSmallIndex<T> to keep in mind). But neither am I concerned about integer promotions taking place, just like I'd never force someone to change
std::vector< double > v;
v[0] ...to
v[ std::vector< double >::size_type(0) ] ...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks for the explanations. Some kind of Hungarian stuff on types, in a nutshell?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you can think of it that way. A type name that conveys meaning makes a certain type of errors less likely. But like everything (and like Hungarian), it's good in moderation.
211fda5 to
b9b88e3
Compare
As suggested here, this PR combines the multifluid derivatives (pressure, temperature, component fractions) into a single array to reduce the size of the fluid kernel wrapper and make sure that we can compile on GPU platforms in the thermal PRs #1755 and #1225.
I am getting the following result here:
sizeof(fluidWrapper)= 3096 bytessizeof(fluidWrapper)= 2136 bytesSo I am confident that I will be able to compile with the thermal option activated.
Requires rebaseline, but no numerical diffs: https://github.com/GEOSX/integratedTests/pull/193