@bbbales2 bbbales2 commented Oct 8, 2020

Submission Checklist

  • Builds locally
  • Declare copyright holder and open-source license: see below

Summary

I copy-pasted the formulas from the math docs; the function signatures come from what the stanc3 compiler says it exposes. I moved cov_exp_quad to the deprecated functions.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Columbia University

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@bbbales2 bbbales2 requested a review from avehtari October 8, 2020 23:11

@drezap drezap left a comment

See comments. Thank you :)

<!-- matrix; cov_exp_quad; (row_vectors x, real alpha, real rho); -->
\index{{\tt \bfseries cov\_exp\_quad }!{\tt (row\_vectors x, real alpha, real rho): matrix}|hyperpage}
$$
k(x_i, x_j) = \alpha^2 \exp \left( -\dfrac{1}{2\rho^2} |x_i - x_j|^2 \right)
Member

The parameters for this formula and the inputs of the math library implementation don't match, and it's a bit misleading.

Can we agree on parameter names? sigma for magnitude and l for length scale are fine, as these are both used in references, and these match the math library implementation.

I've been using tau for the magnitude parameter and l for the length-scale parameter to match BDA3 on this branch: https://github.com/stan-dev/docs/tree/issue-185, but sigma and l are fine.

Member Author

Ooo, yeah, I just copy-pasted that from cov_exp_quad. I'll go with \sigma and l, named magnitude and length scale.

With scale $\sigma$ and length scale $l$, the exponential kernel is:

$$
k(x_i, x_j) = \sigma^2 exp(-\frac{|x_i - x_j|}{2 l^2})
Member

Can you use \left( and \right) for the parentheses so they are sized to fit their contents? Same for the kernels below.
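
For illustration, the exponential kernel line quoted above might look like this with auto-sized delimiters. This sketch changes only the delimiters and writes `\exp` as a command; the argument of the exponential is copied as it appears in the diff and has not been checked against the math library.

```latex
$$
k(x_i, x_j) = \sigma^2 \exp \left( -\frac{|x_i - x_j|}{2 l^2} \right)
$$
```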

}
```

With scale $\alpha$ and length scale $l$, the exponentiated quadratic kernel is:
Member

alpha isn't a scale parameter. It's a magnitude or marginal SD parameter. Scale and length scale are synonymous.

Member Author

I'll go with magnitude here. Also, I mixed notations again with the l vs. rho thing.

set of points and itself, $K_{ij} = k(x_i, x_j)$, or the covariance between
two different sets of points, $K_{ij} = k(x_i, x_j)$. $x$ can be an array of
scalars for a one dimensional kernel or an array of vectors for a
multidimensional kernel.
Member

This is not clear. Something like: Gaussian process covariance functions compute the covariance between each observation in the input data set, or the cross-covariance between two input data sets. And the notation for covariance and cross covariance should be unique, in some way. Right now it's the same.

Member

Maybe "vector of doubles" instead of "scalars". Maybe make x boldface so we know it's a vector?

Can you also indicate somewhere that x_i is observation (row) i of x?

Also, instead of the kernel being one-dimensional, the input data is multi-dimensional. Or the design matrix has more than one covariate. Something like that. We say the GP is 1-D, not the kernel.
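
A minimal sketch of the covariance versus cross-covariance distinction raised in this thread, in plain Python. The helper names (`exp_quad`, `cov`, `cross_cov`) are hypothetical and not part of Stan; the kernel is the exponentiated quadratic with magnitude `sigma` and length scale `l`, and each `x[i]` is one observation (row) of the input.

```python
import math

def exp_quad(xi, xj, sigma, l):
    """Exponentiated quadratic kernel between two points (lists of floats)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return sigma ** 2 * math.exp(-sq_dist / (2.0 * l ** 2))

def cov(x, sigma, l):
    """Covariance of one input set with itself: K[i][j] = k(x[i], x[j])."""
    return [[exp_quad(xi, xj, sigma, l) for xj in x] for xi in x]

def cross_cov(x1, x2, sigma, l):
    """Cross-covariance between two input sets: K[i][j] = k(x1[i], x2[j])."""
    return [[exp_quad(xi, xj, sigma, l) for xj in x2] for xi in x1]

# Three 2-D observations and two new 2-D points (illustrative values only).
x = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
x_new = [[0.5, 0.5], [1.0, 1.0]]

K = cov(x, sigma=1.5, l=0.8)                      # square and symmetric
K_cross = cross_cov(x, x_new, sigma=1.5, l=0.8)   # rectangular in general

print(len(K), len(K[0]))
print(len(K_cross), len(K_cross[0]))
```

The covariance matrix is N x N with `sigma ** 2` on its diagonal, while the cross-covariance is N1 x N2 and need not be square, which is one reason to give the two cases distinct notation.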


### Dot product kernel

With scale $\sigma$ the dot product kernel is:
Member

this isn't a scale parameter. This is an intercept or bias parameter.

Member

Also, maybe use a subscript 0 on the sigma parameter to differentiate it from the magnitude parameter.

Member Author

bias and \sigma_0 sound good.
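
For reference, a sketch of the dot product kernel in the agreed notation. The formula is not quoted in this thread, so the form below (squared bias plus the inner product of the two inputs) is an assumption based on the usual definition of this kernel.

```latex
$$
k(\mathbf{x}_i, \mathbf{x}_j) = \sigma_0^2 + \mathbf{x}_i^T \mathbf{x}_j
$$
```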


### Exponential kernel

With scale $\sigma$ and length scale $l$, the exponential kernel is:
Member

Same here: sigma is not a scale parameter. This applies to all Matern kernels (including the exponential).

Member Author

Oh I see it's magnitude in the code. I'll go with that.

With scale $\sigma$ and length scale $l$, the Matern 3/2 kernel is:

$$
k(x_i, x_j) = \sigma^2(1 + \frac{\sqrt{3}|x_i - x_j|}{l}) \exp(-\frac{\sqrt{3}|x_i - x_j|)}{l})
Member

you're missing an opening ( in the exponential. Either remove the closing one or add an opening one.
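
With the parentheses balanced and sized by `\left(` and `\right)`, the quoted Matern 3/2 line would read roughly as follows. This is a sketch based on the standard form of the kernel, keeping the diff's symbols $\sigma$ and $l$.

```latex
$$
k(x_i, x_j) = \sigma^2 \left( 1 + \frac{\sqrt{3}|x_i - x_j|}{l} \right)
  \exp \left( -\frac{\sqrt{3}|x_i - x_j|}{l} \right)
$$
```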

`matrix` **`cov_exp_quad`**`(vectors x1, vectors x2, real alpha, real rho)`<br>\newline
The covariance matrix with an exponentiated quadratic kernel of x1 and
x2.
Gaussian process with squared exponential kernel in multiple dimensions.
Member

This is picky, and optional, but I'd be explicit about the dimension of D being greater than 1.

Member Author

I left it at multiple dimensions. The code accepts 1D, so it'd work if someone did it.

x2.
`matrix` **`gp_exp_quad_cov`**`(vector[] x, real sigma, real[] length_scale)`<br>\newline

Gaussian process with squared exponential kernel in multiple dimensions with a
Member

something like: with a separate length scale parameter for each dimension.
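
To make "a separate length scale parameter for each dimension" concrete, here is a small plain-Python sketch. The function name `exp_quad_ard` is hypothetical, and the formula (each coordinate difference divided by its own length scale before squaring) is an assumption about what the `real[] length_scale` signature computes, not something confirmed in this thread.

```python
import math

def exp_quad_ard(xi, xj, sigma, length_scale):
    """Exponentiated quadratic kernel with one length scale per input dimension.

    xi, xj and length_scale are lists of floats of the same length D.
    """
    scaled_sq_dist = sum(
        ((a - b) / l) ** 2 for a, b, l in zip(xi, xj, length_scale)
    )
    return sigma ** 2 * math.exp(-0.5 * scaled_sq_dist)

# Dimension 1 varies quickly (short length scale), dimension 2 slowly.
value = exp_quad_ard([0.0, 0.0], [1.0, 2.0], sigma=1.0, length_scale=[1.0, 2.0])
print(value)
```

When every entry of `length_scale` is equal, this reduces to the single length scale kernel.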


`matrix` **`gp_exp_quad_cov`**`(vector[] x1, vector[] x2, real sigma, real length_scale)`<br>\newline

Gaussian process with squared exponential kernel in multiple dimensions between
Member

Something like: computes the cross covariance between two sets of data points x1, x2. Number of dimensions must be the same, as well.

bbbales2 commented Oct 18, 2020

@drezap I went through and did the updates.

Here's the rendered file for your convenience: gaussian-process-covariance-functions.html.zip

Edit: I went with magnitude for the \sigmas on everything, and switched most of the xs to look like vectors. I copied the intro from what you posted and added some text to differentiate the cross-covariances from the regular covariances.

@avehtari

Quick comment: If the covariance function is named gp_exp_quad_cov, we should always write "exponentiated quadratic" instead of "squared exponential" in the documentation.

@bbbales2

"exponentiated quadratic" instead of "squared exponential"

Oh yeah, I changed it to squared exponential. I can change it back.

@bbbales2

Oh I see. Squared exponential seems like exp(2x), whereas exponentiated quadratic is like exp(x^2).
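
A quick numeric check of that naming point, as a sketch with an arbitrary value of `x`: squaring an exponential gives exp(2x), while exponentiating a quadratic gives exp(x^2).

```python
import math

x = 1.5  # arbitrary illustrative value

squared_exponential = math.exp(x) ** 2      # (exp(x))^2, equal to exp(2x)
exponentiated_quadratic = math.exp(x ** 2)  # exp(x^2)

print(math.isclose(squared_exponential, math.exp(2 * x)))
print(math.isclose(squared_exponential, exponentiated_quadratic))
```

The two quantities differ for general `x`, which is why the kernel, whose exponent is quadratic in the distance, is better described as "exponentiated quadratic".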

@bbbales2

Updated (and new rendered html here: gaussian-process-covariance-functions.html.zip)

drezap commented Oct 19, 2020 via email

@dirmeier

Hey @bbbales2, can I give you a hand with anything here? It would be nice if this documentation were available in the official reference.
Cheers, Simon

@avehtari avehtari left a comment

looks good to me (and sorry for missing the review request)

@WardBrian

@bbbales2 - do you mind if I update the merge on this for you and add the 'Since version x' text to the functions? I'd like to merge this before #439

@bbbales2

Sounds good, take it away!

@WardBrian WardBrian linked an issue Nov 23, 2021 that may be closed by this pull request

@rok-cesnovar rok-cesnovar left a comment

The changes since Aki’s approval look good to me! Thanks Ben and Brian.

@rok-cesnovar rok-cesnovar merged commit e0564d3 into master Nov 23, 2021
@rok-cesnovar rok-cesnovar deleted the feature/gp_function_reference_docs branch November 23, 2021 14:18
@WardBrian WardBrian mentioned this pull request Dec 10, 2021
Development

Successfully merging this pull request may close these issues.

update manual to include new GP covariance functions

7 participants