Skip to content

Gdcm update 2022 01 05#3042

Closed
seanm wants to merge 2 commits intoInsightSoftwareConsortium:masterfrom
seanm:GDCM-update-2022-01-05
Closed

Gdcm update 2022 01 05#3042
seanm wants to merge 2 commits intoInsightSoftwareConsortium:masterfrom
seanm:GDCM-update-2022-01-05

Conversation

@seanm
Copy link
Copy Markdown
Contributor

@seanm seanm commented Jan 5, 2022

No description provided.

GDCM Upstream and others added 2 commits January 5, 2022 16:05
Code extracted from:

    https://github.com/malaterre/GDCM.git

at commit 17ec5d668eca38e13a33555b6396de61a7125343 (release).
# By GDCM Upstream
* upstream-GDCM:
  GDCM 2022-01-05 (17ec5d66)
@github-actions github-actions Bot added area:ThirdParty Issues affecting the ThirdParty module type:Data Changes to testing data type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots labels Jan 5, 2022
@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Jan 5, 2022

This will fix some of the few remaining -Wreserved-id-macro warnings.

@dzenanz
Copy link
Copy Markdown
Member

dzenanz commented Jan 5, 2022

I guess we need to increase allowed file sizes here and here. While at it, we should probably clean up main .gitattributes.

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Jan 5, 2022

Which exactly to update though? For example, one error message says commit c1974cb creates blob 0e6f95c9ebff44cde44a50323e9549552ab44d2f at Modules/ThirdParty/GDCM/src/gdcm/Source/DataDictionary/privatedicts.xml with size 952489 bytes (930.17 KiB) which is greater than the maximum size 300000 bytes (292.97 KiB). If the file is intended to be committed, set the hooks-max-size attribute on its path. But where is this 300000 maximum? It doesn't seem to be in any of 3 files you linked to.

@dzenanz
Copy link
Copy Markdown
Member

dzenanz commented Jan 5, 2022

Changes to the files I mentioned should be done before triggering the update. 300000 might be the default limit.

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Jan 5, 2022

Something is still confusing though:

commit 21fbc51 creates blob 802c69d9f1a79607e3973ad9d49b48598ca983d0 at Source/InformationObjectDefinition/Part3.xml with size 2336006 bytes (2281.26 KiB) which is greater than the maximum size 1048576 bytes (1024.00 KiB). If the file is intended to be committed, set the hooks-max-size attribute on its path.

Where is this maximum size 1048576 bytes from?

The three files you mentioned have:

Source/InformationObjectDefinition/Part3.xml hooks-max-size=4336000

and

/Source/InformationObjectDefinition/Part3.xml hooks.MaxObjectKiB=4096

Both those number are already > 2336006 bytes.

Comment thread Modules/ThirdParty/GDCM/src/gdcm/Source/Common/gdcmSwapper.txx
Comment thread Modules/ThirdParty/GDCM/src/gdcm/Source/Common/gdcmSwapper.txx
issakomi referenced this pull request in malaterre/GDCM Jan 6, 2022
@Leengit
Copy link
Copy Markdown
Contributor

Leengit commented Jan 6, 2022

There's always the approach with reinterpret_cast

return *reinterpret_cast<float *>(&Swap(*reinterpret_cast<uint32_t *>(&val)));

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Jan 6, 2022

There's always the approach with reinterpret_cast

Strictly speaking, that's UB too, which is why the memcpy dance is best.

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Jan 6, 2022

@Leengit see for example https://adriann.github.io/undefined_behavior.html

@N-Dekker
Copy link
Copy Markdown
Contributor

N-Dekker commented Jan 6, 2022

return *reinterpret_cast<float *>(&Swap(*reinterpret_cast<uint32_t *>(&val)));

@Leengit Definitely UB, but I guess it won't compile anyway, as it tries to take the address of a temporary 😸 I mean: the r-value returned by Swap! Or did you recently overload operator& to support taking the address of an r-value? 😉

FYI https://godbolt.org/z/1j71M8e49 (gcc 11.2) says:

error: lvalue required as unary '&' operand

Update: just look here for a minute 😱😱: https://twitter.com/hankadusikova/status/1479391214659067906/photo/1 😸😸

@Leengit
Copy link
Copy Markdown
Contributor

Leengit commented Jan 6, 2022

As @issakomi mentions, we need a second memcpy, yes? As in:

template <>
inline float
SwapperNoOp::Swap<float>(float val)
{
  uint32_t temp;
  memcpy(&temp, &val, sizeof(uint32_t));
  temp = Swap(temp);
  memcpy(&val, &temp, sizeof(float));
  return val;
}

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Jan 6, 2022

As @issakomi mentions, we need a second memcpy, yes?

malaterre/GDCM#135

@issakomi
Copy link
Copy Markdown
Member

issakomi commented Jan 7, 2022

Just for completeness, looked what other libraries do, e.g. boost 1.77.0

Other built-in types, such as bool, float, or unscoped enumerations, do not have the same property, which means that reversing their constituent bytes may produce an invalid value, leading to undefined behavior. These types are therefore disallowed in endian_reverse, but are still allowed in endian_reverse_inplace. Even if an object becomes invalid as a result of reversing its bytes, as long as its value is never read, there would be no undefined behavior.

And here in boost FAQ

Is there floating point support?
An attempt was made to support four-byte floats and eight-byte doubles, limited to IEEE 754 (also known as ISO/IEC/IEEE 60559) floating point and further limited to systems where floating point endianness does not differ from integer endianness. Even with those limitations, support for floating point types was not reliable and was removed. For example, simply reversing the endianness of a floating point number can result in a signaling-NAN.
Support for float and double has since been reinstated for endian_buffer, endian_arithmetic and the conversion functions that reverse endianness in place. The conversion functions that take and return by value still do not support floating point due to the above issues; reversing the bytes of a floating point number does not necessarily produce another valid floating point number.

SDL does exactly the same thing as in my example with union

Looks like there is probably no general good solution at all (IEEE 754 floating-point standard does not specify endianness), there are also very special implementations. If the conversion is required a programmer has to find the best solution that works in particular case.

@Leengit
Copy link
Copy Markdown
Contributor

Leengit commented Jan 7, 2022

A workable solution at the bottom that unfortunately would surely need changes to the code that calls it would be to not convert float to float. Instead an encoding Swap function (from actual floats to a byte array that is written to disk) would be float to uint32_t and decoding Swap function (from a byte array back to floats) would be uint32_t to float.

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Jan 7, 2022

So this leaves figuring out the max sizes that @dzenanz and I were discussing...

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Jan 16, 2022

I guess I could just arbitrarily increase a few of them until CI is happy...

@dzenanz dzenanz mentioned this pull request Feb 15, 2022
@seanm seanm closed this Feb 17, 2022
@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Feb 18, 2022

Anyone know where hooks.MaxObjectKiB and hooks-max-size come from? They don't seem to be a git thing. Most search results are very kitware-specific. It would help to have docs about how they work.

@dzenanz
Copy link
Copy Markdown
Member

dzenanz commented Feb 18, 2022

Judging by this comment, hooks-max-size comes from ghostflow:
https://github.com/BRAINSia/BRAINSTools/blob/d2fd21fb2295c071593e2f0728f0a3ab0890a1b3/.gitattributes#L45-L46

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Feb 18, 2022

@dzenanz
Copy link
Copy Markdown
Member

dzenanz commented Feb 18, 2022

Blame leads to this commit: 3251408 by @thewtex. Maybe @bradking or @mathstuf can explain where hooks-max-size comes from?

@mathstuf
Copy link
Copy Markdown
Contributor

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Feb 18, 2022

Thanks! And this hooks.MaxObjectKiB thing? Anyone know where it's from? Seems not to be from ghostflow, and most search results are very old...

@N-Dekker
Copy link
Copy Markdown
Contributor

FYI hooks-max-size has previously triggered error messages in a pull request that would extend "itkMacro.h", which was almost 100 KB already: #2785 (review) saying:

commit 8f26661 creates blob 12398f35ac335d57dafd510e044b91bd28b4d29f at Modules/Core/Common/include/itkMacro.h with size 101090 bytes (98.72 KiB) which is greater than the maximum size 100000 bytes (97.66 KiB). If the file is intended to be committed, set the hooks-max-size attribute on its path.

Fortunately this could be fixed by removing a lot of white-space from itkMacro.h 😃 By clang-format AlignEscapedNewlines: commit 7829ef6

@mathstuf
Copy link
Copy Markdown
Contributor

hooks.MaxObjectKiB is from the hooks branch used for pre-commit checks. These can largely be removed these days IMO. It was renamed because the KiB meant anything wanting partial KiBs was dealing with floating point parsing, so just straight bytes was preferred.

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Feb 20, 2022

OK thanks. So maybe this is what we need: malaterre/GDCM@ba4d34a

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Feb 21, 2022

OK that's now upstreamed, so maybe this will work: #3215

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:ThirdParty Issues affecting the ThirdParty module type:Data Changes to testing data type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants