A few years ago, a colleague and I were working on implementing the multivariate student-t distribution as a part of a new model. We were looking to implement the non-centered parameterization. Luckily enough, Andre Pfeuffer had shared the following solution on the Stan forums that rewrites the multivariate student-t in terms of inverse Chi-Square distribution:
This works well! But! This parameterization requires that you pass in the degrees of freedom, , as data.1 If you instead want to estimate as a part of fitting the model, you’ll need an alternative parameterization.
1 You could technically simply model as a parameter with this specification, but then you end up with a centered prior over , which defeats the purpose of implementing a non-centered parameterization.
Luckily enough for you, dear reader, I’ve done this work for you. If you want to skip the derivation and get straight to estimating, you can plop this function into your Stan model. What follows below is the derivation that I shared with the Stan forums.
real multivariate_t_scale_lpdf(vector x,real nu) {int N = size(x);vector[N] xx = nu/(x^2);real lp = 0.0;for (n in1:N) { lp += log(2) + log(nu) + chi_square_lpdf(xx[n] | nu) - 3*log(x[n]); }return lp;}
Deriving a multivariate student-t
The goal is to replace sqrt(nu / u) in Andre’s original solution with a new scale parameter, , and derive some density function for that takes as an input. From some tinkering, I found that the quantile function for the outcome of interest is
where is the quantile function of the Chi-Square distribution.
I didn’t exactly derive this quantile function robustly — it was a game of “guess and check.” If I simulate the outcome of interest, then compare the quantile distribution of the simulated values against my derived quantile distribution, the results line up well enough for me to say that this is the right function.
Code
library(tidyverse)library(riekelib)# simulate outcome w/df = 7nu <-7raw <-sqrt(nu/rchisq(1e4, nu))# quantiles for checking simulated results against empiricalsim <-tibble(p =seq(from =0, to =1, length.out =100),q =quantile(raw, probs =seq(from =0, to =1, length.out =100)))# check the quantile functiontibble(p =seq(from =0, to =1, length.out =1000)) %>%mutate(q =sqrt(nu/qchisq(1- p, nu))) %>%ggplot(aes(x = p,y = q)) +geom_line() +geom_point(data = sim,color ="royalblue") +theme_rieke()
We want to get on its own so that we can eventually derive a density, . Some rearranging yields the following:
Setting and recalling that the quantile and cumulative functions are inverses lets us solve for explicitly. Here, is the cumulative distribution function for the Chi-Square distribution — this gives us a cumulative distribution function for .
The density function is just the derivative of the CDF, so by virtue of the chain rule, we get the following if we take the derivative of our CDF with respect to :
where is the density function of the Chi-Square distribution and is the derivative of with respect to .
Sub in for our temporary variable, , finish out the derivative, do some simplification, and voila! A density function for the multivariate student-t scale drops out!
Comparing the empirical density of the simulated values against the derived density function again yields an overlap that is, to me, good enough to call correct.