Add grammar & converter for Christian holidays #162
Open: rlskoeser wants to merge 1 commit into develop from feature/christian-holidays
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| %import common.WS | ||
| %ignore WS | ||
|
|
||
| %import .undate_common.DATE_PUNCTUATION | ||
| %ignore DATE_PUNCTUATION | ||
|
|
||
|
|
||
| holiday_date: movable_feast year | fixed_date year? | ||
|
|
||
| // holidays that shift depending on the year | ||
| movable_feast: EASTER | EASTER_MONDAY | HOLY_SATURDAY | ASCENSION | ||
| | PENTECOST | WHIT_MONDAY | TRINITY | ASH_WEDNESDAY | SHROVE_TUESDAY | ||
|
|
||
| // holidays that are always on the same date | ||
| fixed_date: EPIPHANY | CANDLEMASS | ST_PATRICKS | ALL_FOOLS | ST_CYPRIANS | ||
|
|
||
| year: /\d{4}/ | ||
|
|
||
| // all patterns use case-insensitive regex | ||
|
|
||
| // Fixed-date holidays | ||
| EPIPHANY: /epiphany/i | ||
| CANDLEMASS: /candlemass?/i // matches the spelling with one or two s | ||
| ST_PATRICKS: /st\.?\s*patrick'?s?\s*day/i | ||
| ALL_FOOLS: /(april|all)\s*fools?\s*day/i | ||
| ST_CYPRIANS: /st\.?\s*cyprian'?s?\s*day/i | ||
|
|
||
| // Moveable feasts | ||
| EASTER: /easter/i | ||
| EASTER_MONDAY: /easter\s*monday/i | ||
| HOLY_SATURDAY: /holy\s*saturday/i | ||
| ASCENSION: /ascension\s*day|ascension/i | ||
| PENTECOST: /pentecost/i | ||
| WHIT_MONDAY: /whit\s*monday|whitsun\s*monday/i | ||
| TRINITY: /trinity\s*sunday|trinity/i | ||
| ASH_WEDNESDAY: /ash\s*wednesday/i | ||
| SHROVE_TUESDAY: /shrove\s*tuesday/i | ||
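The terminal patterns above are ordinary case-insensitive regular expressions, so their behavior can be checked outside the parser. The following is a minimal sketch using only Python's `re` module. The three patterns are copied from the grammar, while the `matches` helper is a hypothetical name introduced for illustration. It checks each pattern in isolation and does not reproduce how Lark tokenizes a full input.

```python
import re

# Patterns copied from the grammar terminals; re.IGNORECASE mirrors the /i flag.
CANDLEMASS = re.compile(r"candlemass?", re.IGNORECASE)
ST_PATRICKS = re.compile(r"st\.?\s*patrick'?s?\s*day", re.IGNORECASE)
ALL_FOOLS = re.compile(r"(april|all)\s*fools?\s*day", re.IGNORECASE)


def matches(pattern, text):
    """Return True when the whole string matches the pattern (hypothetical helper)."""
    return pattern.fullmatch(text) is not None


# Both spellings of Candlemas are accepted.
print(matches(CANDLEMASS, "Candlemas"), matches(CANDLEMASS, "candlemass"))

# The period, apostrophe, and possessive s are all optional.
print(matches(ST_PATRICKS, "St. Patrick's Day"), matches(ST_PATRICKS, "st patricks day"))

# The pattern has no optional apostrophe, so the possessive spelling is not matched.
print(matches(ALL_FOOLS, "April Fools Day"), matches(ALL_FOOLS, "April Fool's Day"))
```

One observation from this sketch: unlike the saint's-day patterns, `ALL_FOOLS` does not allow an apostrophe, so "April Fool's Day" would not match this terminal on its own.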
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,171 @@ | ||
| """ | ||
| Holiday date Converter: parse Christian liturgical dates and convert to Gregorian. | ||
| """ | ||
|
|
||
| import datetime | ||
|
|
||
| from lark import Lark, Transformer, Tree, Token | ||
| from lark.exceptions import UnexpectedInput | ||
|
|
||
| from convertdate import holidays | ||
| from undate import Undate, Calendar | ||
| from undate.converters.base import BaseDateConverter, GRAMMAR_FILE_PATH | ||
|
|
||
| # To add a new holiday: | ||
| # 1. Add a name and pattern to holidays.lark grammar file | ||
| # 2. Include it in the appropriate section (fixed or movable) | ||
| # 3. Add an entry to FIXED_HOLIDAYS or MOVEABLE_FEASTS; must match grammar terminal name | ||
|
|
||
|
|
||
| # holidays that fall on the same date every year | ||
| # key must match grammar term; value is tuple of numeric month, day | ||
| FIXED_HOLIDAYS = { | ||
| "EPIPHANY": (1, 6), # January 6 | ||
| "CANDLEMASS": (2, 2), # February 2; 40th day & end of epiphany | ||
| "ST_PATRICKS": (3, 17), # March 17 | ||
| "ALL_FOOLS": (4, 1), # All / April fools day, April 1 | ||
| "ST_CYPRIANS": (9, 16), # St. Cyprian's Feast day: September 16 | ||
| } | ||
|
|
||
| # holidays that shift depending on the year; value is days relative to Easter | ||
| MOVEABLE_FEASTS = { | ||
| "EASTER": 0, # Easter, no offset | ||
| "HOLY_SATURDAY": -1, # day before Easter | ||
| "EASTER_MONDAY": 1, # day after Easter | ||
| "ASCENSION": 39, # fortieth day of Easter | ||
| "PENTECOST": 49, # 7 weeks after Easter | ||
| "WHIT_MONDAY": 50, # Monday after Pentecost | ||
| "TRINITY": 56, # first Sunday after Pentecost | ||
| "ASH_WEDNESDAY": -46, # Wednesday of the 7th week before Easter | ||
| "SHROVE_TUESDAY": -47, # day before Ash Wednesday | ||
| } | ||
|
|
||
|
|
||
| parser = Lark.open( | ||
| str(GRAMMAR_FILE_PATH / "holidays.lark"), rel_to=__file__, start="holiday_date" | ||
| ) | ||
|
|
||
|
|
||
| class HolidayTransformer(Transformer): | ||
| calendar = Calendar.GREGORIAN | ||
|
|
||
| def year(self, items): | ||
| value = "".join([str(i) for i in items]) | ||
| return Token("year", value) | ||
| # return Tree(data="year", children=[value]) | ||
|
|
||
| def movable_feast(self, items): | ||
| # moveable feast day can't be calculated without the year, | ||
| # so pass through | ||
| return items[0] | ||
|
|
||
| def fixed_date(self, items): | ||
| item = items[0] | ||
| holiday_name = item.type.split("__")[-1] | ||
| # token_type = item.type | ||
| # token type is holiday fixed-date name; use to determine month/day | ||
| month, day = FIXED_HOLIDAYS.get(holiday_name) | ||
| return Tree("fixed_date", [Token("month", month), Token("day", day)]) | ||
| # for key in FIXED_HOLIDAYS: | ||
| # if token_type == key or token_type == f"holidays__{key}": | ||
| # month, day = FIXED_HOLIDAYS[key] | ||
| # return Tree("fixed_date", [Token("month", month), Token("day", day)]) | ||
| # raise ValueError(f"Unknown fixed holiday: {item.type}") | ||
|
|
||
| def holiday_date(self, items): | ||
| parts = self._get_date_parts(items) | ||
| return Undate(**parts) | ||
|
|
||
| def _get_date_parts(self, items) -> dict[str, int | str]: | ||
| # recursive method to take parsed tokens and trees and generate | ||
| # a dictionary of year, month, day for initializing an undate object | ||
| # handles nested tree with month/day (for fixed date holidays) | ||
| # and includes movable feast logic, after year is determined. | ||
|
|
||
| parts = {} | ||
| date_parts = ["year", "month", "day"] | ||
| movable_feast = None | ||
| for child in items: | ||
| field = value = None | ||
| # if this is a token, get type and value | ||
| if isinstance(child, Token): | ||
| # month/day from fixed date holiday | ||
| if child.type in date_parts: | ||
| field = child.type | ||
| value = child.value | ||
| # check for movable feast terminal | ||
| elif child.type in MOVEABLE_FEASTS: | ||
| # collect but don't handle until we know the year | ||
| movable_feast = child.type | ||
| # handle namespaced token type; happens when called from combined grammar | ||
| elif ( | ||
| "__" in child.type and child.type.split("__")[-1] in MOVEABLE_FEASTS | ||
| ): | ||
| # collect but don't handle until we know the year | ||
| movable_feast = child.type.split("__")[-1] | ||
|
|
||
| # if a tree, check for type and anonymous token | ||
| if isinstance(child, Tree): | ||
| # if tree is a date field (i.e., year), get the value | ||
| if child.data in date_parts: | ||
| field = child.data | ||
| # in this case we expect one value; | ||
| # convert anonymous token to value | ||
| value = child.children[0] | ||
| # if tree has children, recurse to get date parts | ||
| elif child.children: | ||
| parts.update(self._get_date_parts(child.children)) | ||
|
|
||
| # if date fields were found, add to dictionary | ||
| if field and value: | ||
| # currently all date parts are integer only | ||
| parts[str(field)] = int(value) | ||
|
|
||
| # if date is a movable feast, calculate relative to Easter based on the year | ||
| if movable_feast is not None: | ||
| offset = MOVEABLE_FEASTS[movable_feast] | ||
| holiday_date = datetime.date( | ||
| *holidays.easter(parts["year"]) | ||
| ) + datetime.timedelta(days=offset) | ||
| parts.update({"month": holiday_date.month, "day": holiday_date.day}) | ||
|
|
||
| return parts | ||
|
|
||
|
|
||
| class HolidayDateConverter(BaseDateConverter): | ||
| """ | ||
| Converter for Christian liturgical dates. | ||
|
|
||
| Supports fixed-date holidays (Epiphany, Candlemass, etc.) and | ||
| Easter-relative moveable feasts (Easter, Ash Wednesday, Pentecost, etc.). | ||
|
|
||
| Example usage:: | ||
|
|
||
| Undate.parse("Easter 1942", "holidays") | ||
| Undate.parse("Ash Wednesday 1942", "holidays") | ||
| Undate.parse("Epiphany", "holidays") | ||
|
|
||
| Does not support serialization. | ||
| """ | ||
|
|
||
| name = "holidays" | ||
|
|
||
| def __init__(self): | ||
| self.transformer = HolidayTransformer() | ||
|
|
||
| def parse(self, value: str) -> Undate: | ||
| if not value: | ||
| raise ValueError("Parsing empty string is not supported") | ||
|
|
||
| try: | ||
| parsetree = parser.parse(value) | ||
| # transform the parse tree into an undate | ||
| undate_obj = self.transformer.transform(parsetree) | ||
| # set the input holiday text as a label on the undate object | ||
| undate_obj.label = value | ||
| return undate_obj | ||
| except UnexpectedInput as err: | ||
| raise ValueError(f"Could not parse '{value}' as a holiday date") from err | ||
|
|
||
| def to_string(self, undate: Undate) -> str: | ||
| raise ValueError("Holiday converter does not support serialization") |
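The movable-feast logic in the converter comes down to "compute Easter for the year, then add a day offset". Here is a self-contained sketch of that arithmetic. It is not the converter's code: the PR calls `holidays.easter` from `convertdate`, whereas this sketch substitutes the well-known anonymous Gregorian computus so that it runs with the standard library only. The names `gregorian_easter`, `OFFSETS`, and `feast_date` are hypothetical, and the offsets are a subset copied from `MOVEABLE_FEASTS`.

```python
import datetime


def gregorian_easter(year):
    """Easter Sunday in the Gregorian calendar (anonymous Gregorian algorithm).

    Stand-in for convertdate's holidays.easter, used here only to keep the
    sketch free of third-party imports.
    """
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return datetime.date(year, month, day + 1)


# Subset of the offsets from MOVEABLE_FEASTS: days relative to Easter Sunday.
OFFSETS = {
    "EASTER": 0,
    "HOLY_SATURDAY": -1,
    "ASCENSION": 39,
    "PENTECOST": 49,
    "ASH_WEDNESDAY": -46,
    "SHROVE_TUESDAY": -47,
}


def feast_date(name, year):
    """Resolve a movable feast to a calendar date for the given year."""
    return gregorian_easter(year) + datetime.timedelta(days=OFFSETS[name])


print(feast_date("ASH_WEDNESDAY", 2000))  # same case as the test suite: 2000-03-08
print(feast_date("HOLY_SATURDAY", 2018))
```

Going through `datetime.date` plus `timedelta`, as the converter does, lets month boundaries and leap years (for example Shrove Tuesday 1940) work without special handling.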
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,74 @@ | ||
| import pytest | ||
|
|
||
| from undate import Undate, Calendar | ||
| from undate.date import Weekday | ||
| from undate.converters.holidays import HolidayDateConverter | ||
|
|
||
|
|
||
| class TestHolidayConverter: | ||
| converter = HolidayDateConverter() | ||
|
|
||
| @pytest.mark.parametrize( | ||
| "input_string,expected", | ||
| [ | ||
| ("Epiphany 1921", Undate(1921, 1, 6)), | ||
| ("candlemas 1913", Undate(1913, 2, 2)), | ||
| ("Candlemass 1862", Undate(1862, 2, 2)), | ||
| ("st. patrick's day 1823", Undate(1823, 3, 17)), | ||
| ("st patrick's day 1901", Undate(1901, 3, 17)), | ||
| ("all fools day 1933", Undate(1933, 4, 1)), | ||
| ("st. cyprian's day 1902", Undate(1902, 9, 16)), | ||
| ], | ||
| ) | ||
| def test_fixed_holidays(self, input_string, expected): | ||
| assert self.converter.parse(input_string) == expected | ||
|
|
||
| @pytest.mark.parametrize( | ||
| "input_string,expected,expected_weekday", | ||
| [ | ||
| ("Easter 1900", Undate(1900, 4, 15), Weekday.SUNDAY), | ||
| ("easter monday 1925", Undate(1925, 4, 13), Weekday.MONDAY), | ||
| ("holy saturday 2018", Undate(2018, 3, 31), Weekday.SATURDAY), | ||
| ("Ash Wednesday 2000", Undate(2000, 3, 8), Weekday.WEDNESDAY), | ||
| ("shrove tuesday 1940", Undate(1940, 2, 6), Weekday.TUESDAY), | ||
| ("Ascension 1988", Undate(1988, 5, 12), Weekday.THURSDAY), | ||
| ("Ascension Day 1999", Undate(1999, 5, 13), Weekday.THURSDAY), | ||
| ("Pentecost 2016", Undate(2016, 5, 15), Weekday.SUNDAY), | ||
| ("whit monday 2005", Undate(2005, 5, 16), Weekday.MONDAY), | ||
| ("whitsun monday 2023", Undate(2023, 5, 29), Weekday.MONDAY), | ||
| ("trinity 1978", Undate(1978, 5, 21), Weekday.SUNDAY), | ||
| ("Trinity Sunday 1967", Undate(1967, 5, 21), Weekday.SUNDAY), | ||
| ], | ||
| ) | ||
| def test_moveable_feasts(self, input_string, expected, expected_weekday): | ||
| result = self.converter.parse(input_string) | ||
| assert result == expected | ||
| assert result.label == input_string | ||
| assert result.earliest.weekday == expected_weekday | ||
|
|
||
| def test_holiday_without_year(self): | ||
| result = self.converter.parse("Epiphany") | ||
| assert result.label == "Epiphany" | ||
| assert result.format("EDTF") == "XXXX-01-06" | ||
| assert not result.known_year | ||
| assert result.calendar == Calendar.GREGORIAN | ||
|
|
||
| def test_undate_parse(self): | ||
| # accessible through main undate parse method | ||
| assert Undate.parse("Epiphany 1942", "holidays") == Undate(1942, 1, 6) | ||
|
|
||
| def test_parse_empty(self): | ||
| with pytest.raises(ValueError, match="empty string"): | ||
| self.converter.parse("") | ||
|
|
||
| def test_parse_error(self): | ||
| with pytest.raises(ValueError, match="Could not parse"): | ||
| self.converter.parse("Not a holiday") | ||
|
|
||
| def test_moveable_without_year(self): | ||
| with pytest.raises(ValueError, match="Could not parse"): | ||
| self.converter.parse("Easter") | ||
|
|
||
| def test_to_string_error(self): | ||
| with pytest.raises(ValueError, match="does not support"): | ||
| self.converter.to_string(Undate(1916)) |
🧩 Analysis chain
Repository: dh-tech/undate-python
🏁 Scripts executed:
- `fd -t f holidays.py` (find and examine the holidays.py file)
- `cat -n src/undate/converters/holidays.py | head -150`
Reject `0000` before it reaches the movable-feast calculation.

The grammar currently allows `year: /\d{4}/` to match `0000`, but the movable-feast resolver at line 127 uses `datetime.date(*holidays.easter(parts["year"]))`, which only supports years 1–9999. Parsing inputs like `Easter 0000` will therefore result in a `ValueError` at transformation time instead of a normal parse failure.
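The concern can be illustrated with the standard library alone. The sketch below shows that `datetime.date` rejects year 0, and tries one possible stricter year pattern. The `(?!0000)\d{4}` pattern and the helper names are assumptions made for illustration. They are not the reviewer's suggested grammar change, which is not shown on this page, and whether the lookahead is the best fit for the Lark grammar would need to be checked against the parser.

```python
import datetime
import re

# Hypothetical stricter year pattern: exactly four digits, but not "0000".
YEAR_PATTERN = re.compile(r"(?!0000)\d{4}")


def is_supported_year(text):
    """Return True if the text is a four-digit year other than 0000."""
    return YEAR_PATTERN.fullmatch(text) is not None


def date_rejects_year_zero():
    """Return True if datetime.date refuses to build a date in year 0."""
    try:
        datetime.date(0, 4, 1)
    except ValueError:
        return True
    return False


print(datetime.MINYEAR)            # lowest year datetime.date accepts
print(date_rejects_year_zero())
print(is_supported_year("0000"), is_supported_year("0001"), is_supported_year("1942"))
```

Rejecting the value in the grammar would turn `Easter 0000` into an ordinary parse failure, which is the behavior the review comment asks for.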