Skip to content

Conversation

@AdamGlustein
Copy link
Collaborator

@AdamGlustein AdamGlustein commented Oct 10, 2025

Cleanup and a few fixes to #574, thanks @sim15 for the contribution!

Simon Beyzerov and others added 4 commits September 16, 2025 12:03
Signed-off-by: Simon Beyzerov <simon.beyzerov@point72.com>
Signed-off-by: Adam Glustein <adam.glustein@point72.com>
…ance bug

Signed-off-by: Adam Glustein <adam.glustein@point72.com>
Signed-off-by: Adam Glustein <adam.glustein@point72.com>
@timkpaine timkpaine added the type: enhancement Issues and PRs related to improvements to existing features label Oct 10, 2025
Signed-off-by: Adam Glustein <adam.glustein@point72.com>
@ptomecek
Copy link
Collaborator

ptomecek commented Oct 27, 2025

Previously, when all fields on structs were optional, we hard-coded these two flags for pydantic validation

Can we only apply this logic when allow_unset=True?
I think that when allow_unset is False, pydantic should infer it correctly based on the declared type and the default value, though we should add a unit test.

@AdamGlustein
Copy link
Collaborator Author

Previously, when all fields on structs were optional, we hard-coded these two flags for pydantic validation

Can we only apply this logic when allow_unset=True? I think that when allow_unset is False, pydantic should infer it correctly based on the declared type and the default value, though we should add a unit test.

I tested out this approach and it doesn't validate correctly in the case of a required field with a default value.

class MyStrictStruct(csp.Struct, allow_unset=False):
            req_default_str: str = "default"
E       pydantic_core._pydantic_core.SchemaError: Error building "list" validator:
E         SchemaError: Error building "function-wrap" validator:
E         SchemaError: Error building "typed-dict" validator:
E         SchemaError: Field 'req_default_str': a required field cannot have a default value

@AdamGlustein AdamGlustein requested a review from ptomecek November 3, 2025 20:59
Signed-off-by: Adam Glustein <adam.glustein@point72.com>
template<typename T>
bool InputAdapter::consumeTick( const T & value )
{
if constexpr( CspType::Type::fromCType<T>::type == CspType::TypeTraits::STRUCT )
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a central place we can do this check on push rather than consume? Will be much harder to track down on consumption

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not see such a place - we could add it to each adapter type (push, pull, pushpull and managers) separately in their pushTick impls, but I think it's cleaner to have the validation logic in one spot.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but my point stands that it would be a lot harder to tell where the error is coming from
Also unfortunate to vall validate on all input adapters, even python based ones which were already validated due to PyStruct creation in python

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As to the second point here, we could store a "dirty" flag on the struct to avoid validating it twice if it has a) been validated already and b) not had any modifications since the last validation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And actually, since you can't delete fields on a strict struct, you only ever need to validate it once so point b) is irrelevant.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, so we could store a bit to indicate its already validated, but im not sure its necessary at this point since we almost always be validating once up front

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really we can skip the validation check in InputAdapter::consume for any PyStruct.
In a somewhat hacky way we already have this info as we could check struct -> dialectPtr() != nullptr

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are concerned about the double validation I would go with adding a validated bit, but maybe in a separate PR

@ptomecek
Copy link
Collaborator

Previously, when all fields on structs were optional, we hard-coded these two flags for pydantic validation

Can we only apply this logic when allow_unset=True? I think that when allow_unset is False, pydantic should infer it correctly based on the declared type and the default value, though we should add a unit test.

I tested out this approach and it doesn't validate correctly in the case of a required field with a default value.

class MyStrictStruct(csp.Struct, allow_unset=False):
            req_default_str: str = "default"
E       pydantic_core._pydantic_core.SchemaError: Error building "list" validator:
E         SchemaError: Error building "function-wrap" validator:
E         SchemaError: Error building "typed-dict" validator:
E         SchemaError: Field 'req_default_str': a required field cannot have a default value

I think this just means that if there is a default value, we leave it as "optional" (as the user doesn't have to provide it), but if there isn't, then it's not. I guess I was wrong about pydantic inferring it. However, the behavior for non-defaulted fields will be different for the two types of Struct.

@AdamGlustein AdamGlustein marked this pull request as draft November 19, 2025 21:23
…ield is set to None in the struct's bitmask

Signed-off-by: Adam Glustein <adam.glustein@point72.com>
Signed-off-by: Adam Glustein <adam.glustein@point72.com>
@AdamGlustein AdamGlustein marked this pull request as ready for review November 21, 2025 17:46
Signed-off-by: Adam Glustein <adam.glustein@point72.com>
PushBatch batch( m_engine -> rootEngine() );
m_inputAdapter -> processMessage( c, t, &batch );
}
catch( csp::Exception & err )
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this try/catch here now if we are throwing in consume?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is doing anything, right. Reverted the change here

if constexpr( CspType::Type::fromCType<T>::type == CspType::TypeTraits::STRUCT )
{
if( unlikely( !( value -> validate() ) ) )
CSP_THROW( ValueError, "Struct " << value -> meta() -> name() << " is not valid; required fields " << value -> formatAllUnsetStrictFields() << " were not set on init" );
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're likely leaving this in consume, can we add something i meant to add.
Please add a virtual const char * name() const; method on InputAdapter, can default it to return "InputAdapter"
we should really implement name where we can, and use name() in the exception here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for purposes of this PR we can just define name() as "InputAdapter" and separately we can implement it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done and added to the error message

m_partialSize = m_size - baseSize;
m_maskLoc = m_size - m_maskSize;

uint8_t numRemainingBits = ( m_fields.size() - m_firstPartialField + optionalFieldCount ) % 8;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like m_firstPartialField is always 0 at this point.
In any case I dont think its needed here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed and removed

for( auto & f : m_fields )

// Set optional fields first so that their 2-bits never cross a byte boundary
m_optionalFieldsSetBits = new uint8_t[ m_maskSize ]();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::unique_ptr or just use std::vector
also consider merging into one allocation for both masks to avoid potential fragmentation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use vectors. I could store it as one vector containing both the set bitmasks and the none bitmasks, but I think the added complexity with that approach is not really worth it, considering this is a member of the meta.

}
}
// deal with last (partial) byte filled with optional fields
auto lastOptIndex = maskLoc - m_maskLoc;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i forget if we have a way to run coverage right now but would be good to ensure all of these code paths are being tested or at least executed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified this logic a lot to just set the bits programmatically rather than using these precalculated constants, not sure why I didn't do that to start with

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is a member of the meta so space isnt a concern, its more the frequent access of these vectors.
if the allocations did happen to fragment it would be unfortunate

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see what you mean, if those vectors get allocated on separate cachelines then each validate call will be loading in both.
I am fine to make it a single vector and just use offset logic to access each mask.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made it so that all the set/none masks are in the same vector.

if( field -> isNone( struct_ptr ) )
py_obj = PyObjectPtr::incref( Py_None );
else
py_obj = switchCspType( field -> type(), [field, struct_ptr, this]( auto tag )
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we do keep None processing, brackets around multi-line clause

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

for( const auto & field: fields )
{
if( !field -> isSet( struct_ptr ) )
if( !field -> isSet( struct_ptr ) && !field -> isNone( struct_ptr ) )
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same Q, do we want to propagate None here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see what else we would do, just treat it as unset? Makes more sense to pass it through as a null for consistency.


- **Default:** `False` (for backwards compatibility)
- **When set to `True`:** Enforces *strict struct semantics* — required fields must be set at initialization.
- Note that `del struct.field` is not allowed for strict structs — this will raise an error. As a result, for any strict struct, `hasattr(struct, field)` always returns True for any defined field.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move these two bullets below to the semantics section

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


- **Default:** `False` (for backwards compatibility)
- **When set to `True`:** Enforces *strict struct semantics* — required fields must be set at initialization.
- Note that `del struct.field` is not allowed for strict structs — this will raise an error. As a result, for any strict struct, `hasattr(struct, field)` always returns True for any defined field.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for any strict struct, hasattr(struct, field) always returns True for any defined field.
" and never has to be checked"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

1. **Example**

```python
class MyStruct(csp.Struct, allow_unset=False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix param name

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Signed-off-by: Adam Glustein <adam.glustein@point72.com>
Signed-off-by: Adam Glustein <adam.glustein@point72.com>
Signed-off-by: Adam Glustein <adam.glustein@point72.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type: enhancement Issues and PRs related to improvements to existing features

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants