@PadLex PadLex commented Jan 9, 2026

The current synthetic test is position-invariant, so it fails to test the positional encoding (PE). I've added a Symmetry Detection Test that converges to 98% accuracy with sinusoidal PE and stays at 50% without it. I also added a real-world sentiment analysis test; it is optional because it requires Hugging Face's `datasets` and `transformers` libraries.
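For readers who want a concrete picture of such a test, here is a minimal NumPy sketch. It is not the code from this PR: `make_symmetry_batch` and `sinusoidal_pe` are hypothetical names, and the way negatives are built (swapping the first two tokens of a palindrome) is an assumption chosen so that positives and negatives contain the same tokens and differ only in order.

```python
import numpy as np


def make_symmetry_batch(n, seq_len=8, vocab_size=10, seed=0):
    """Return (tokens, labels); label 1 means the sequence is a palindrome.

    Hypothetical helper: negatives are palindromes with their first two
    tokens swapped, so only token order separates the two classes.
    """
    assert seq_len >= 4 and seq_len % 2 == 0 and vocab_size >= 2
    rng = np.random.default_rng(seed)
    half = seq_len // 2
    left = rng.integers(0, vocab_size, size=(n, half))
    # Force the first two tokens to differ so that swapping them
    # is guaranteed to break the symmetry.
    offset = rng.integers(1, vocab_size, size=n)
    left[:, 1] = (left[:, 0] + offset) % vocab_size
    # Mirror the left half to obtain palindromes.
    tokens = np.concatenate([left, left[:, ::-1]], axis=1)
    labels = rng.integers(0, 2, size=n)
    neg = labels == 0
    # Negatives: swap positions 0 and 1 (same tokens, different order).
    swapped = tokens[neg][:, [1, 0]]
    tokens[neg, :2] = swapped
    return tokens, labels


def sinusoidal_pe(seq_len, d_model):
    """Standard sinusoidal positional encoding; d_model must be even."""
    assert d_model % 2 == 0
    pos = np.arange(seq_len)[:, None]           # (seq_len, 1)
    two_i = np.arange(0, d_model, 2)[None, :]   # (1, d_model / 2)
    angles = pos / np.power(10000.0, two_i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe
```

Because both classes share the same token counts in this sketch, a model that ignores token order cannot separate them, which is consistent with the chance-level accuracy reported without PE.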

Previously, the model expected one-hot encoded vectors, which it passed through a dense layer. This is non-standard and quite slow for real-world use cases, so I replaced the dense layer with a standard embedding layer. The model also used a non-standard mean pooling, which can destroy PE-related gradients. I replaced it with a BERT-style CLS token.
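To illustrate the two changes, here is a small NumPy sketch (again not the PR's code; `dense_on_one_hot`, `embedding_lookup`, `prepend_cls`, and the choice of `vocab_size` as the CLS id are assumptions made for the example). It shows that an embedding lookup gives the same result as a bias-free dense layer applied to one-hot vectors, and how a CLS token is prepended and then read out at position 0.

```python
import numpy as np


def dense_on_one_hot(ids, weights):
    """Old approach: build one-hot vectors, then multiply by the weights."""
    vocab_size = weights.shape[0]
    one_hot = np.eye(vocab_size)[ids]   # (batch, seq_len, vocab_size)
    return one_hot @ weights            # (batch, seq_len, d_model)


def embedding_lookup(ids, weights):
    """New approach: index rows of the weight matrix directly."""
    return weights[ids]                 # (batch, seq_len, d_model)


def prepend_cls(ids, cls_id):
    """Prepend a CLS token id to every sequence in the batch."""
    cls_col = np.full((ids.shape[0], 1), cls_id, dtype=ids.dtype)
    return np.concatenate([cls_col, ids], axis=1)


rng = np.random.default_rng(0)
vocab_size, d_model = 12, 4
ids = rng.integers(0, vocab_size, size=(3, 5))

# One extra row in the table for the CLS token (assumed id: vocab_size).
table = rng.normal(size=(vocab_size + 1, d_model))

same = np.allclose(dense_on_one_hot(ids, table), embedding_lookup(ids, table))

ids_cls = prepend_cls(ids, cls_id=vocab_size)
# Stand-in for the encoder output; a real model would apply PE and
# attention layers between the embedding and the pooling step.
hidden = embedding_lookup(ids_cls, table)
cls_pooled = hidden[:, 0, :]       # BERT-style: read the CLS position
mean_pooled = hidden.mean(axis=1)  # previous style: average all positions
```

The lookup avoids materializing a `(batch, seq_len, vocab_size)` one-hot tensor, which is where the speed difference comes from as the vocabulary grows.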
