Mean functions: better defaults and a formula interface

Most GP libraries treat the mean function as an afterthought. The default is usually zero, and most examples never change it. But for practitioners, the mean function matters:

- A zero mean function means the GP's predictions revert to zero away from data, which is rarely what you want
- A constant mean is a better default (the GP reverts to the fitted constant, roughly the data mean), but still leaves trend on the table
- A linear mean function captures obvious trends and lets the kernel focus on residual structure. This is the semi-parametric GP pattern, and it works well in practice
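
To make the reversion behaviour concrete, here is a minimal NumPy sketch (not ptgp code) that computes a GP posterior mean under the three choices above and evaluates it far from the training data. The RBF kernel, the noise level, the synthetic data, and the use of the sample mean and a least-squares line as "estimated" means are all assumptions made for illustration.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def posterior_mean(x, y, x_new, mean_fn, noise=1e-2):
    """GP posterior mean: m(x*) + K(x*, x) (K(x, x) + noise * I)^-1 (y - m(x))."""
    K = rbf(x, x) + noise * np.eye(len(x))
    resid = y - mean_fn(x)
    return mean_fn(x_new) + rbf(x_new, x) @ np.linalg.solve(K, resid)

# Synthetic data with an obvious linear trend (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
y = 10.0 + 2.0 * x + 0.1 * rng.standard_normal(x.size)

# Three candidate mean functions. The constant and linear ones are
# estimated from the data in the simplest possible way.
c = y.mean()
slope, intercept = np.polyfit(x, y, 1)
means = {
    "zero": lambda t: np.zeros_like(t),
    "constant": lambda t: np.full_like(t, c),
    "linear": lambda t: intercept + slope * t,
}

# Far from the data the kernel term vanishes, so each prediction
# falls back to whatever its mean function says.
x_far = np.array([50.0])
preds = {name: float(posterior_mean(x, y, x_far, m)[0]) for name, m in means.items()}
for name, value in preds.items():
    print(f"{name:>8}: {value:.2f}")
```

At `x = 50` the zero-mean GP predicts 0, the constant-mean GP predicts the sample mean of `y`, and the linear-mean GP continues the fitted trend.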

ptgp currently has a `Zero` mean function. Some things to consider:

## Better default

Should the default mean function be a constant (estimated from data) rather than zero? This is a small change that would make out-of-the-box behavior much more reasonable for practitioners who don't think about mean functions.

## Formula interface

A random idea that might be interesting: a Wilkinson-style formula language (like R's `lm`) for specifying mean functions:

```python
gp = pg.VFE(kernel=k, mean=pg.mean.Formula("1 + x1 + C(x2)"), ...)
```
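
For a sense of what such a formula would have to expand into, here is a rough sketch of a parser for a tiny subset of Wilkinson notation. `design_matrix` is a hypothetical helper, not part of ptgp, and it only handles additive terms: `1`, a numeric column name, and `C(name)` with treatment coding. It always drops the first categorical level, which is only appropriate when an intercept is present. A real implementation would more likely delegate to an existing formula library.

```python
import re
import numpy as np

def design_matrix(formula, data):
    """Expand a tiny subset of Wilkinson notation into a design matrix.

    Hypothetical helper for illustration. Supported additive terms:
    `1` (intercept), a bare column name (numeric), and `C(name)`
    (treatment-coded categorical with the first level dropped).
    """
    n = len(next(iter(data.values())))
    columns, names = [], []
    for term in (t.strip() for t in formula.split("+")):
        if term == "1":
            columns.append(np.ones(n))
            names.append("Intercept")
        elif (m := re.fullmatch(r"C\((\w+)\)", term)):
            col = m.group(1)
            values = np.asarray(data[col])
            levels = sorted(set(values.tolist()))
            # One indicator column per level, skipping the reference level.
            for level in levels[1:]:
                columns.append((values == level).astype(float))
                names.append(f"{col}[{level}]")
        else:
            columns.append(np.asarray(data[term], dtype=float))
            names.append(term)
    return np.column_stack(columns), names

data = {"x1": [0.5, 1.5, 2.5, 3.5], "x2": ["a", "b", "a", "c"]}
X, names = design_matrix("1 + x1 + C(x2)", data)
print(names)  # ['Intercept', 'x1', 'x2[b]', 'x2[c]']
print(X)
```

The mean function is then `X @ beta` for a coefficient vector `beta`, which could be trained jointly with the kernel hyperparameters.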

## Open questions

- What should the default mean function be? Constant or linear?
- Do mean functions need their own class, or can they just be a Python callable?
- What do people actually use for mean functions in practice?
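
On the class-versus-callable question, one possible middle ground is a structural type: anything callable on the inputs counts as a mean function, and classes are only needed when there are parameters to train. The sketch below uses NumPy and `typing.Protocol`. `MeanFunction`, `Linear`, `parameters`, and `evaluate` are hypothetical names for illustration, not ptgp's API.

```python
from typing import Protocol
import numpy as np

class MeanFunction(Protocol):
    """Anything that maps an (n, d) input array to an (n,) mean vector."""
    def __call__(self, X: np.ndarray) -> np.ndarray: ...

def zero_mean(X: np.ndarray) -> np.ndarray:
    """A plain function is enough when there is nothing to learn."""
    return np.zeros(X.shape[0])

class Linear:
    """A class gives trainable parameters somewhere to live."""
    def __init__(self, weights, bias=0.0):
        self.weights = np.asarray(weights, dtype=float)
        self.bias = float(bias)

    def parameters(self):
        # Hypothetical hook an optimizer could use to find trainable values.
        return {"weights": self.weights, "bias": self.bias}

    def __call__(self, X: np.ndarray) -> np.ndarray:
        return X @ self.weights + self.bias

def evaluate(mean: MeanFunction, X: np.ndarray) -> np.ndarray:
    """Model code only relies on the call signature, not on a base class."""
    return mean(X)

X = np.array([[1.0, 2.0], [3.0, 4.0]])
print(evaluate(zero_mean, X))
print(evaluate(Linear([1.0, -1.0], bias=0.5), X))
```

The trade-off is that a bare callable hides its parameters from the optimizer, so anything with learnable coefficients probably still needs a class or some registration mechanism.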
