Use Pydantic classes for configuration settings #325
Merged
13 commits
4ca7e4f (no-verify) move execution out of module import (PGijsbers)
79d3938 Define classes for the configuration sections (PGijsbers)
1ba61fa Move some configuration options around, start using Config classes (PGijsbers)
f1d6ca0 Use configuration types internally (PGijsbers)
270a942 Allow setting config directly without parsing (PGijsbers)
43f5661 Do not cache get_config (PGijsbers)
5425cef Include port when creating engine (PGijsbers)
993667c Explain why this caching mechanism is used (PGijsbers)
b90981c restructure db configuration loading (PGijsbers)
52edb0b Determine server url only after configuration is loaded (PGijsbers)
c1d58cf define api key validation pattern at runtime (PGijsbers)
fbce8ef Remove duplicate comment (PGijsbers)
6a50c08 call server_url() (PGijsbers)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,84 +1,181 @@ | ||
| import functools | ||
| """Configuration logic and schema definitions. | ||
|
|
||
| The `get_config` function provides access to the most recently loaded configuration. | ||
| A default configuration is loaded if none is explicitly set. | ||
|
|
||
| Use `set_config` to use a different configuration. | ||
| To parse a configuration from a file and environment variables, use `parse_config`. | ||
| Example of loading a configuration with a custom TOML and .env file: | ||
|
|
||
| ``` | ||
| config = parse_config( | ||
| dotenv_file=Path("path/to/.env"), | ||
| configuration_file=Path("path/to/config.toml") | ||
| ) | ||
| set_config(config) | ||
| ``` | ||
| and then subsequent calls to `get_config` will return that configuration. | ||
| """ | ||
|
|
||
| import os | ||
| import tomllib | ||
| import typing | ||
| from pathlib import Path | ||
| from typing import Literal, cast | ||
|
|
||
| from dotenv import load_dotenv | ||
| from loguru import logger | ||
| from pydantic import AnyUrl, BaseModel, Field | ||
|
|
||
| TomlTable = dict[str, typing.Any] | ||
|
|
||
| CONFIG_DIRECTORY_ENV = "OPENML_REST_API_CONFIG_DIRECTORY" | ||
| CONFIG_FILE_ENV = "OPENML_REST_API_CONFIG_FILE" | ||
| DOTENV_FILE_ENV = "OPENML_REST_API_DOTENV_FILE" | ||
|
|
||
| OPENML_DB_USERNAME_ENV = "OPENML_DATABASES_OPENML_USERNAME" | ||
| OPENML_DB_PASSWORD_ENV = "OPENML_DATABASES_OPENML_PASSWORD" # noqa: S105 # not a password | ||
| EXPDB_DB_USERNAME_ENV = "OPENML_DATABASES_EXPDB_USERNAME" | ||
| EXPDB_DB_PASSWORD_ENV = "OPENML_DATABASES_EXPDB_PASSWORD" # noqa: S105 # not a password | ||
|
|
||
| _config_directory = Path(os.getenv(CONFIG_DIRECTORY_ENV, Path(__file__).parent)) | ||
| _config_directory = _config_directory.expanduser().absolute() | ||
| _config_file = Path(os.getenv(CONFIG_FILE_ENV, _config_directory / "config.toml")) | ||
| _config_file = _config_file.expanduser().absolute() | ||
| _dotenv_file = Path(os.getenv(DOTENV_FILE_ENV, _config_directory / ".env")) | ||
| _dotenv_file = _dotenv_file.expanduser().absolute() | ||
| _config: Configuration | None = None | ||
|
|
||
|
|
||
| logger.info( | ||
| "Determined configuration sources.", | ||
| configuration_directory=_config_directory, | ||
| configuration_file=_config_file, | ||
| dotenv_file=_dotenv_file, | ||
| ) | ||
| # The reason we use a module variable instead of functools.cache | ||
| # is that this method allows a custom configuration to be set | ||
| # through `set_config` and subsequently loaded through `get_config`. | ||
| def get_config() -> Configuration: | ||
| if _config is None: | ||
| config = parse_config() | ||
| set_config(config) | ||
| return cast("Configuration", _config) | ||
|
|
||
|
|
||
| load_dotenv(dotenv_path=_dotenv_file) | ||
|
|
||
| def set_config(configuration: Configuration) -> None: | ||
| global _config # noqa: PLW0603 | ||
| _config = configuration | ||
|
|
||
| def _apply_defaults_to_siblings(configuration: TomlTable) -> TomlTable: | ||
| defaults = configuration["defaults"] | ||
| return { | ||
| subtable: (defaults | overrides) if isinstance(overrides, dict) else overrides | ||
| for subtable, overrides in configuration.items() | ||
| if subtable != "defaults" | ||
| } | ||
|
|
||
| class Configuration(BaseModel, frozen=True): | ||
| openml_database: DatabaseConfiguration | ||
| expdb_database: DatabaseConfiguration | ||
| development: DevelopmentConfiguration | ||
| routing: RoutingConfiguration | ||
| logging: list[LoggingConfiguration] | ||
|
|
||
| @functools.cache | ||
| def _load_configuration(file: Path) -> TomlTable: | ||
| return tomllib.loads(file.read_text()) | ||
|
|
||
| class DatabaseConfiguration(BaseModel, frozen=True): | ||
| """Settings for one database connection.""" | ||
|
|
||
| def load_routing_configuration(file: Path = _config_file) -> TomlTable: | ||
| return typing.cast("TomlTable", _load_configuration(file)["routing"]) | ||
| host: str = Field(default="database", description="Database server host name") | ||
| port: int = Field(default=3306, gt=0) | ||
| database: str = Field(description="Database name") | ||
| username: str = Field(default="root") | ||
| password: str = Field(default="ok") | ||
| echo: bool = Field( | ||
| default=False, | ||
| description="https://docs.sqlalchemy.org/en/20/core/engines.html#sqlalchemy.create_engine.params.echo", | ||
| ) | ||
| drivername: str = Field( | ||
| default="mysql+aiomysql", | ||
| description="SQLAlchemy `dialect` and `driver`: https://docs.sqlalchemy.org/en/20/dialects/index.html", | ||
| ) | ||
|
|
||
|
|
||
| class DevelopmentConfiguration(BaseModel, frozen=True): | ||
| """Settings for development or test specific features.""" | ||
|
|
||
| allow_test_api_keys: bool = Field(default=False) | ||
|
|
||
| @functools.cache | ||
| def load_database_configuration(file: Path = _config_file) -> TomlTable: | ||
| configuration = _load_configuration(file) | ||
| database_configuration = _apply_defaults_to_siblings( | ||
| configuration["databases"], | ||
|
|
||
| class RoutingConfiguration(BaseModel, frozen=True): | ||
| root_path: str = Field(default="", description="Path prefix under which the service is hosted.") | ||
| minio_url: AnyUrl = Field(description="URL to the MinIO server or service") | ||
| server_url: AnyUrl = Field( | ||
| description="URL to this server (excluding the path prefix of `fastapi.root_path`).", | ||
| ) | ||
| database_configuration["openml"]["username"] = os.environ.get( | ||
| OPENML_DB_USERNAME_ENV, | ||
| "root", | ||
|
|
||
|
|
||
| class LoggingConfiguration(BaseModel, frozen=True): | ||
| """Configuration for a single log sink. | ||
|
|
||
| You can add any arguments that `loguru.logger.add` allows, | ||
| the `sink` will be used as first positional argument. | ||
| See also: https://loguru.readthedocs.io/en/stable/api/logger.html | ||
| """ | ||
|
|
||
| sink: str | ||
| level: Literal["TRACE", "DEBUG", "INFO", "SUCCESS", "WARNING", "ERROR"] | ||
| rotation: str | None = Field( | ||
| default=None, | ||
| description="Set rotation policy by date or file size.", | ||
| ) | ||
| database_configuration["openml"]["password"] = os.environ.get( | ||
| OPENML_DB_PASSWORD_ENV, | ||
| "ok", | ||
| retention: str | None = Field( | ||
| default=None, | ||
| description="Timespan after which automatic cleanup occurs.", | ||
| ) | ||
| database_configuration["expdb"]["username"] = os.environ.get( | ||
| EXPDB_DB_USERNAME_ENV, | ||
| "root", | ||
| compression: str | None = Field(default="gz") | ||
| # Logs provided variables as JSON | ||
| serialize: bool = Field(default=True) | ||
| # Decouples log calls from I/O and makes it multiprocessing safe. | ||
| enqueue: bool = Field(default=True) | ||
|
|
||
|
|
||
| def _db_env_credentials(alias: str) -> dict[str, str]: | ||
| return { | ||
| "username": os.environ.get( | ||
| f"OPENML_DATABASES_{alias.upper()}_USERNAME", | ||
| "root", | ||
| ), | ||
| "password": os.environ.get( | ||
| f"OPENML_DATABASES_{alias.upper()}_PASSWORD", | ||
| "ok", | ||
| ), | ||
| } | ||
|
|
||
|
|
||
| def parse_config( | ||
| dotenv_file: Path | None = None, | ||
| configuration_file: Path | None = None, | ||
| ) -> Configuration: | ||
| """Load configuration from file and environment variables. | ||
|
|
||
| The parsed configuration is returned but not used by default for other calls in this module. | ||
| """ | ||
| _config_directory = Path(os.getenv(CONFIG_DIRECTORY_ENV, Path(__file__).parent)) | ||
| _config_directory = _config_directory.expanduser().absolute() | ||
| logger.info( | ||
| "Determined configuration directory to be {configuration_directory}.", | ||
| configuration_directory=_config_directory, | ||
| ) | ||
| database_configuration["expdb"]["password"] = os.environ.get( | ||
| EXPDB_DB_PASSWORD_ENV, | ||
| "ok", | ||
|
|
||
| if not dotenv_file: | ||
| dotenv_filepath = os.getenv(DOTENV_FILE_ENV, _config_directory / ".env") | ||
| dotenv_file = Path(dotenv_filepath).expanduser().absolute() | ||
|
|
||
| logger.info( | ||
| "Determined dotenv file path to be {dotenv_file}.", | ||
| dotenv_file=dotenv_file, | ||
| ) | ||
| return database_configuration | ||
| load_dotenv(dotenv_file) | ||
|
|
||
| if not configuration_file: | ||
| config_filepath = os.getenv(CONFIG_FILE_ENV, _config_directory / "config.toml") | ||
| configuration_file = Path(config_filepath).expanduser().absolute() | ||
|
|
||
| def load_configuration(file: Path | None = None) -> TomlTable: | ||
| file = file or _config_file | ||
| return tomllib.loads(file.read_text()) | ||
| logger.info( | ||
| "Determined config file path to be {config_file}.", | ||
| config_file=configuration_file, | ||
| ) | ||
|
|
||
| config = tomllib.loads(configuration_file.read_text()) | ||
| db_section = config["databases"] | ||
| openml_db = DatabaseConfiguration(**db_section["openml"], **_db_env_credentials("openml")) | ||
| expdb_db = DatabaseConfiguration(**db_section["expdb"], **_db_env_credentials("expdb")) | ||
|
|
||
| return Configuration( | ||
| routing=RoutingConfiguration(**config["routing"]), | ||
| logging=[ | ||
| LoggingConfiguration(**sink_configuration) | ||
| for sink_configuration in config["logging"].values() | ||
| ], | ||
| openml_database=openml_db, | ||
| expdb_database=expdb_db, | ||
| development=DevelopmentConfiguration(**config["development"]), | ||
| ) | ||
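The comment above `get_config` notes that a module variable is used instead of `functools.cache` so that `set_config` can replace the active configuration. A minimal, standard-library sketch of that pattern follows. `Settings`, `parse_settings`, `get_settings`, and `set_settings` are hypothetical stand-ins for the PR's `Configuration`, `parse_config`, `get_config`, and `set_config`, and a frozen dataclass is swapped in for the Pydantic model only to keep the sketch free of third-party dependencies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for the PR's `Configuration` model."""

    root_path: str = ""


# Module-level holder for the active settings; None until first use.
_settings: Optional[Settings] = None


def parse_settings() -> Settings:
    """Stand-in for `parse_config`; the real function reads TOML and env vars."""
    return Settings()


def set_settings(settings: Settings) -> None:
    """Replace the active settings with an explicitly provided object."""
    global _settings
    _settings = settings


def get_settings() -> Settings:
    """Return the active settings, loading defaults lazily on first access."""
    global _settings
    if _settings is None:
        _settings = parse_settings()
    return _settings
```

With a cached getter, an override would have to go through the cache machinery; with the module variable, `set_settings(Settings(root_path="/api"))` is a plain assignment and every later `get_settings()` call sees it.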
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,39 +1,24 @@ | ||
| arff_base_url="https://test.openml.org" | ||
| minio_base_url="https://openml1.win.tue.nl" | ||
|
|
||
| [development] | ||
| allow_test_api_keys=true | ||
|
|
||
| # Any number of logging.NAME configurations can be added. | ||
| # NAME is for reference only, it has no meaning otherwise. | ||
| # You can add any arguments to `loguru.logger.add`, | ||
| # the `sink` variable will be used as first positional argument. | ||
| # https://loguru.readthedocs.io/en/stable/api/logger.html | ||
| [logging.develop] | ||
| sink="develop.log" | ||
| # One of loguru levels: TRACE, DEBUG, INFO, SUCCESS, WARNING, ERROR | ||
| level="DEBUG" | ||
| # Automatically create a new file by date or file size | ||
| rotation="50 MB" | ||
| # Retention specifies the timespan after which automatic cleanup occurs. | ||
| retention="1 day" | ||
| compression="gz" | ||
|
|
||
| [fastapi] | ||
| root_path="" | ||
|
|
||
| [databases.defaults] | ||
| host="database" | ||
| port="3306" | ||
| # SQLAlchemy `dialect` and `driver`: https://docs.sqlalchemy.org/en/20/dialects/index.html | ||
| drivername="mysql+aiomysql" | ||
|
|
||
| [databases.expdb] | ||
| database="openml_expdb" | ||
|
|
||
| [databases.openml] | ||
| database="openml" | ||
|
|
||
| [routing] | ||
| root_path="" | ||
| minio_url="http://minio:9000/" | ||
| server_url="http://php-api:80/" |