Instructions: Hand the sheet in to your instructor by the end of lab. If you don’t finish in lab, submit it on Gradescope by 9 AM on the Monday of the next lab.
This lab is meant to get you more comfortable with numpy and start building some intuition for Gaussians and conditional distributions. You can find a Colab version of the required code here.
By the end of the lab, students should:
1. Be familiar with a linear-algebra-capable library (e.g. numpy) and use it to carry out a calculation.
2. Understand multivariate Gaussian distributions.
3. Have intuition for how covariances translate to conditional distributions.
To start, we’re going to focus on three different 2x2 covariance matrices for a multivariate Gaussian and plot the probability densities they correspond to. Let’s set our three covariance matrices and mean.
from typing import Tuple
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
mu_all = np.zeros(2)
cov_one = np.array([[1.0, 0.5], [0.5, 1.0]])
cov_two = np.array([[1.0, 0.0], [0.0, 1.0]])
cov_three = np.array([[1.0, -0.5], [-0.5, 1.0]])
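If you want to see the densities before writing any code of your own, here is a minimal contour-plot sketch (an assumption about what the Colab notebook does, not part of the required code). It uses scipy’s built-in density purely for visualization:

# A minimal plotting sketch. scipy's pdf is used here only for visualization;
# the lab still asks you to implement your own version below.
xs, ys = np.meshgrid(np.linspace(-3.0, 3.0, 100), np.linspace(-3.0, 3.0, 100))
grid = np.dstack((xs, ys))  # shape (100, 100, 2): one (x, y) pair per pixel
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, cov, name in zip(axes, [cov_one, cov_two, cov_three],
                         ["cov_one", "cov_two", "cov_three"]):
    density = stats.multivariate_normal(mean=mu_all, cov=cov).pdf(grid)
    ax.contour(xs, ys, density)
    ax.set_title(name)
plt.show()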
Multivariate Gaussian PDF implementation.
To gain intuition, implement this from scratch instead of using a library (e.g., scipy) that calculates the pdf for you.
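As a reminder, the density of a k-dimensional Gaussian with mean $\mu$ and covariance $\Sigma$ is

$$p(x) = \frac{1}{(2\pi)^{k/2}\,\lvert\Sigma\rvert^{1/2}} \exp\!\left(-\frac{1}{2}(x - \mu)^{\top}\Sigma^{-1}(x - \mu)\right),$$

which is the quantity your function should return.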
Write the multivariate_gaussian_pdf function that calculates the probability density function of a multivariate Gaussian:
def multivariate_gaussian_pdf(x_vec: np.ndarray,
                              mu_vec: np.ndarray,
                              covariance_matrix: np.ndarray) -> float:
    """Calculates the probability density function of a multivariate Gaussian.

    Args:
        x_vec: Vector at which to evaluate the pdf.
        mu_vec: Mean of distribution.
        covariance_matrix: Covariance matrix of the distribution.

    Returns:
        Probability density function value at location x_vec.

    Notes:
        You can use any library you want to carry out the linear algebra
        operations (determinants, matrix multiplication, inversion), but you
        cannot use a library (e.g. scipy) that calculates the pdf for you.
        The function should work for any dimension of multivariate Gaussian,
        not just for the 2-dimensional case. Some functions you may find
        useful:
        - np.linalg.det
        - np.linalg.inv
        - np.exp
        - np.matmul
    """
Marginal and conditional distributions. Now let’s see how conditioning on or marginalizing over a specific variable of our multivariate Gaussian changes the distribution. Let’s start by writing down the marginal / conditional mean and covariance matrix from class. To keep things simple, we’ll marginalize over / condition on just a single variable.
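For reference, here is one standard statement of those results (double-check against your class notes). Permute the variables so the ones you keep, $x_a$, come first and the variable you marginalize over or condition on, $x_b$, comes last, and partition

$$\mu = \begin{pmatrix}\mu_a \\ \mu_b\end{pmatrix}, \qquad \Sigma = \begin{pmatrix}\Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb}\end{pmatrix}.$$

Marginalizing gives $x_a \sim \mathcal{N}(\mu_a, \Sigma_{aa})$, while conditioning on $x_b$ gives

$$\mu_{a \mid b} = \mu_a + \Sigma_{ab}\Sigma_{bb}^{-1}(x_b - \mu_b), \qquad \Sigma_{a \mid b} = \Sigma_{aa} - \Sigma_{ab}\Sigma_{bb}^{-1}\Sigma_{ba}.$$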
def marginal_mean_covariance(mu_vec: np.ndarray,
                             covariance_matrix: np.ndarray,
                             marginal_index: int
                             ) -> Tuple[np.ndarray, np.ndarray]:
    """Calculates the marginalized mean and covariance matrix of a
    multivariate Gaussian.

    Args:
        mu_vec: Mean of distribution.
        covariance_matrix: Covariance matrix of the distribution.
        marginal_index: Variable index to marginalize over.

    Returns:
        Mean and covariance of the marginalized Gaussian.

    Notes:
        The easiest way to do this is to permute the vector and matrix so that
        marginal_index is the last index. Some functions you may find useful
        beyond the ones you've already used:
        - np.delete
    """
def conditioned_mean_covariance(mu_vec: np.ndarray,
                                covariance_matrix: np.ndarray,
                                condition_index: int,
                                condition_value: float
                                ) -> Tuple[np.ndarray, np.ndarray]:
    """Calculates the conditioned mean and covariance matrix of a multivariate
    Gaussian.

    Args:
        mu_vec: Mean of distribution.
        covariance_matrix: Covariance matrix of the distribution.
        condition_index: Variable index to condition on.
        condition_value: Value of the variable to condition on.

    Returns:
        Mean and covariance of the conditioned Gaussian.

    Notes:
        The easiest way to do this is to permute the vector and matrix so that
        condition_index is the last index. Some functions you may find useful
        beyond the ones you've already used:
        - np.delete
        - np.squeeze
    """
Let’s consider a three-dimensional Gaussian and see how marginalizing changes the distribution.
mu_four = np.array([0.2, 0.3, -0.2])
cov_four = np.array([[1.0, 0.5, 0.2], [0.5, 1.0, 0.5], [0.2, 0.5, 1.0]])
Intuition building. Stop for a second and think about what you expect when you condition on x_2:
a) If we condition on a large positive value of x_2, how should that change the means of x_1 and x_3?
b) If we condition on x_2, should x_1 and x_3 become more strongly correlated or anti-correlated?
c) How should x_3 change if we increase x_1 while conditioning on x_2?
Test your implementation. Use your functions to calculate the following:
a) Calculate the marginal mean and covariance when marginalizing over x_2 (index=1):
my_mean, my_covariance = marginal_mean_covariance(
    mu_four, cov_four, marginal_index=1)
Your results:
b) Calculate the conditional mean and covariance when conditioning on x_2 = 1.3 (index=1):
my_mean, my_covariance = conditioned_mean_covariance(
    mu_four, cov_four, condition_index=1, condition_value=1.3)
Your results:
Analysis questions. For this question, you’ll need the visualizations from the associated code notebook. Based on those visualizations and your calculations:
a) Compare the three 2x2 covariance matrices (cov_one, cov_two, cov_three). How do the off-diagonal elements affect the shape of the probability density contours?
b) When you marginalize over x_3 in the 3D case, how does the resulting 2D covariance matrix compare to the original upper-left 2x2 block of the 3D matrix?
c) How does conditioning on different values of x_2 affect the correlation between x_1 and x_3? Explain what you observe.