Distributions Guide

This section provides detailed information about all probability distributions available in greybox.

Overview

greybox provides a comprehensive set of probability distributions for modeling different types of data. Each distribution family includes:

Density function (d*) - Probability density function (PDF)
Cumulative distribution function (p*) - CDF
Quantile function (q*) - Inverse CDF
Random generator (r*) - Random sample generation
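
As a rough illustration of how the four roles relate, the sketch below uses Python's standard-library statistics.NormalDist as a stand-in rather than greybox itself. The mapping to greybox names is an assumption based on the prefix convention above; of the normal family, only dnorm and pnorm are documented in this guide.

```python
from statistics import NormalDist

# Stand-in for a normal family with loc=0 and scale=1 (not the greybox API).
dist = NormalDist(mu=0.0, sigma=1.0)

density = dist.pdf(1.0)             # role of d*: density at a point
prob = dist.cdf(1.0)                # role of p*: P[X <= 1.0]
quantile = dist.inv_cdf(prob)       # role of q*: inverts the CDF, recovers ~1.0
sample = dist.samples(5, seed=42)   # role of r*: five random draws

print(density, prob, quantile, len(sample))
```

The quantile function undoes the CDF, so q*(p*(x)) should return x up to floating-point error.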

Common Parameters

Most distribution functions share common parameters:

loc : Location parameter (often the mean). For the normal distribution, loc = μ (mean).
scale : Scale parameter. For the normal distribution, scale = σ (standard deviation).
log : If True, return the logarithm of the density or probability
lower_tail : If True (default), probabilities are P[X ≤ x]; otherwise, P[X > x]
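
The effect of the log and lower_tail flags can be sketched with a small standalone function. normal_cdf below is a hypothetical helper written for illustration, not greybox code; it only mirrors the flag semantics described above for a normal CDF.

```python
import math

def normal_cdf(q, loc=0.0, scale=1.0, log=False, lower_tail=True):
    """Hypothetical helper mirroring the documented flags; not greybox code."""
    z = (q - loc) / (scale * math.sqrt(2.0))
    # erfc is evaluated for the requested tail directly, which keeps
    # precision when that tail probability is very small.
    p = 0.5 * math.erfc(-z) if lower_tail else 0.5 * math.erfc(z)
    return math.log(p) if log else p

lower = normal_cdf(1.5)                     # P[X <= 1.5]
upper = normal_cdf(1.5, lower_tail=False)   # P[X > 1.5]
log_lower = normal_cdf(1.5, log=True)       # log of P[X <= 1.5]

print(lower, upper, log_lower)
```

The two tails sum to one, and exponentiating the log result recovers the plain probability.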

Continuous Univariate Distributions

Normal (Gaussian)

dnorm(q, loc=0.0, scale=1.0, log=False)

Normal (Gaussian) distribution density.

For the normal distribution, loc is the mean (μ) and scale is the standard deviation (σ).

Parameters:

q – Value(s) at which to evaluate
loc – Location parameter (mean, μ)
scale – Scale parameter (standard deviation, σ)
log – If True, return log density

Returns:

Density at q

Example:

from greybox import dnorm
dnorm(0, loc=0, scale=1)  # ~0.3989

Laplace

dlaplace(q, loc=0, scale=1, log=False)

Laplace (double exponential) distribution density.

Parameters:

q – Value(s) at which to evaluate
loc – Location parameter (median)
scale – Scale parameter
log – If True, return log density

Returns:

Density at q
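
For reference, the textbook Laplace density is exp(-|q - loc| / scale) / (2 * scale). The sketch below implements that formula with the standard library only. It is assumed, not verified, that dlaplace uses the same scale convention; laplace_pdf is a hypothetical name.

```python
import math

def laplace_pdf(q, loc=0.0, scale=1.0, log=False):
    """Textbook Laplace density; assumed (not verified) to match dlaplace."""
    log_density = -abs(q - loc) / scale - math.log(2.0 * scale)
    return log_density if log else math.exp(log_density)

print(laplace_pdf(0.0))             # peak of the standard Laplace density
print(laplace_pdf(2.0, log=True))   # log density in the right tail
```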

S Distribution

ds(q, loc=0, scale=1, log=False)

S-distribution density - a heavy-tailed distribution.

Parameters:

q – Value(s) at which to evaluate
loc – Location parameter
scale – Scale parameter
log – If True, return log density

Returns:

Density at q
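
To give a feel for the tail behaviour, here is a minimal sketch of an S density. It assumes the form exp(-sqrt(|q - loc|) / scale) / (4 * scale^2), which integrates to one; whether ds uses exactly this parameterization is an assumption, and s_pdf is a hypothetical name.

```python
import math

def s_pdf(q, loc=0.0, scale=1.0, log=False):
    """Sketch of an S density under the assumed parameterization."""
    log_density = -math.sqrt(abs(q - loc)) / scale - math.log(4.0 * scale**2)
    return log_density if log else math.exp(log_density)

# The square root in the exponent makes the tails decay far more slowly
# than those of the Laplace density exp(-|q|) / 2.
print(s_pdf(25.0), math.exp(-25.0) / 2.0)
```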

Generalized Normal

dgnorm(q, loc=0, scale=1, shape=1, log=False)

Generalized Normal distribution density.

Parameters:

q – Value(s) at which to evaluate
loc – Location parameter
scale – Scale parameter
shape – Shape parameter (controls tail weight):
shape=1: Laplace
shape=2: Normal
shape<2: Heavy tails
shape>2: Light tails
log – If True, return log density

Returns:

Density at q

Example:

from greybox import dgnorm
# Normal-like
dgnorm(0, loc=0, scale=1, shape=2)
# Heavy-tailed
dgnorm(5, loc=0, scale=1, shape=1)
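
The special cases listed for shape can be checked against the textbook generalized normal density, shape / (2 * scale * Γ(1/shape)) * exp(-(|q - loc| / scale)^shape). The sketch below assumes that parameterization (gnorm_pdf is a hypothetical name, not the greybox function). Under it, shape=2 gives a normal density whose standard deviation is scale / sqrt(2) rather than scale.

```python
import math
from statistics import NormalDist

def gnorm_pdf(q, loc=0.0, scale=1.0, shape=2.0):
    """Textbook generalized normal density (assumed parameterization)."""
    z = abs(q - loc) / scale
    norm_const = shape / (2.0 * scale * math.gamma(1.0 / shape))
    return norm_const * math.exp(-(z ** shape))

x = 0.7
laplace_value = math.exp(-abs(x)) / 2.0                       # Laplace, scale=1
normal_value = NormalDist(0.0, 1.0 / math.sqrt(2.0)).pdf(x)   # sd = scale / sqrt(2)

print(gnorm_pdf(x, shape=1.0), laplace_value)
print(gnorm_pdf(x, shape=2.0), normal_value)
```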

Logistic

dlogis(q, loc=0, scale=1, log=False)

Logistic distribution density.

Parameters:

q – Value(s) at which to evaluate
loc – Location parameter
scale – Scale parameter
log – If True, return log density

Returns:

Density at q

Student’s t

dt(q, df, loc=0, scale=1, log=False)

Student’s t distribution density.

Parameters:

q – Value(s) at which to evaluate
df – Degrees of freedom
loc – Location parameter
scale – Scale parameter
log – If True, return log density

Returns:

Density at q
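
A location-scale Student's t density can be written directly with log-gamma functions. The sketch below is a standalone illustration under the usual convention that (q - loc) / scale follows a standard t with df degrees of freedom; it is assumed, not confirmed, that dt follows the same convention, and t_pdf is a hypothetical name.

```python
import math

def t_pdf(q, df, loc=0.0, scale=1.0, log=False):
    """Location-scale Student's t density computed on the log scale."""
    z = (q - loc) / scale
    log_density = (
        math.lgamma((df + 1.0) / 2.0)
        - math.lgamma(df / 2.0)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
        - (df + 1.0) / 2.0 * math.log1p(z * z / df)
    )
    return log_density if log else math.exp(log_density)

# With df=1 this is the Cauchy density; with large df it approaches the normal.
print(t_pdf(0.0, df=1), t_pdf(0.5, df=1e6))
```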

Asymmetric Laplace

dalaplace(q, loc=0, scale=1, alpha=0.5, log=False)

Asymmetric Laplace distribution density for quantile regression.

Parameters:

q – Value(s) at which to evaluate
loc – Location parameter
scale – Scale parameter
alpha – Asymmetry parameter (0 < alpha < 1):
alpha < 0.5: Right-skewed
alpha > 0.5: Left-skewed
log – If True, return log density

Returns:

Density at q
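
The link to quantile regression can be made concrete: under the common parameterization sketched below, loc is the alpha-quantile of the distribution, so P[X ≤ loc] = alpha. Both functions are hypothetical illustrations, and the assumption that dalaplace uses this exact form is not verified here.

```python
import math

def alaplace_pdf(q, loc=0.0, scale=1.0, alpha=0.5):
    """Asymmetric Laplace density (assumed parameterization)."""
    indicator = 1.0 if q <= loc else 0.0
    return (alpha * (1.0 - alpha) / scale
            * math.exp(-(q - loc) / scale * (alpha - indicator)))

def alaplace_cdf(q, loc=0.0, scale=1.0, alpha=0.5):
    """Closed-form CDF obtained by integrating the density above."""
    if q <= loc:
        return alpha * math.exp((1.0 - alpha) * (q - loc) / scale)
    return 1.0 - (1.0 - alpha) * math.exp(-alpha * (q - loc) / scale)

# The probability mass below loc equals alpha.
print(alaplace_cdf(3.0, loc=3.0, scale=2.0, alpha=0.1))
```

With alpha < 0.5 the density decays slowly to the right of loc and quickly to the left, which is the right-skewed case described above.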

Log-Transformed Distributions

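These distributions are defined through a transformation of the response variable: the logarithm for the log-distributions, and the Box-Cox transformation for the Box-Cox Normal. All of them require positive values.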

Log-Normal

dlnorm(q, loc=0, scale=1, log=False)

Log-Normal distribution density.

loc is the mean of the underlying normal on the log scale (meanlog), scale is the corresponding standard deviation (sdlog).

Parameters:

q – Value(s) - must be positive
loc – Mean of the underlying normal distribution (on log scale)
scale – Standard deviation of the underlying normal distribution
log – If True, return log density

Returns:

Density at q

Note: Suitable for positive-valued, right-skewed data.
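
The log-normal density follows from a change of variables: evaluate the normal density at log(q) and divide by q (the Jacobian of the log transform). The sketch below shows this with the standard library; lnorm_pdf is a hypothetical name, not the greybox function.

```python
import math
from statistics import NormalDist

def lnorm_pdf(q, loc=0.0, scale=1.0):
    """Log-normal density via the change of variables y = log(q)."""
    if q <= 0:
        return 0.0  # no mass at or below zero
    return NormalDist(loc, scale).pdf(math.log(q)) / q

print(lnorm_pdf(1.0), lnorm_pdf(math.e, loc=1.0))
```

The same construction (base density at log(q), divided by q) presumably underlies the other log-distributions in this section, with the normal replaced by the Laplace, S, or generalized normal density.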

Log-Laplace

dllaplace(q, loc=0, scale=1, log=False)

Log-Laplace distribution density.

Parameters:

q – Value(s) - must be positive
loc – Location parameter (of log values)
scale – Scale parameter
log – If True, return log density

Returns:

Density at q

Log-S

dls(q, loc=0, scale=1, log=False)

Log-S distribution density.

Parameters:

q – Value(s) - must be positive
loc – Location parameter (of log values)
scale – Scale parameter
log – If True, return log density

Returns:

Density at q

Log-Generalized Normal

dlgnorm(q, loc=0, scale=1, shape=2, log=False)

Log-Generalized Normal distribution density.

Parameters:

q – Value(s) - must be positive
loc – Location parameter (of log values)
scale – Scale parameter
shape – Shape parameter
log – If True, return log density

Returns:

Density at q

Box-Cox Normal

dbcnorm(q, loc=0, scale=1, lambda_bc=0, log=False)

Box-Cox Normal distribution density.

Parameters:

q – Value(s) - must be positive
loc – Location parameter
scale – Scale parameter
lambda_bc – Box-Cox transformation parameter
log – If True, return log density

Returns:

Density at q
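
A Box-Cox normal density can be sketched as the normal density of the transformed value multiplied by the Jacobian q^(lambda_bc - 1). The code below assumes that form and ignores the truncation of the underlying normal that arises when lambda_bc is not zero; both function names are hypothetical, and the match with dbcnorm is not verified.

```python
import math
from statistics import NormalDist

def boxcox(x, lambda_bc):
    """Box-Cox transform; reduces to log(x) when lambda_bc is 0."""
    if lambda_bc == 0:
        return math.log(x)
    return (x ** lambda_bc - 1.0) / lambda_bc

def bcnorm_pdf(q, loc=0.0, scale=1.0, lambda_bc=0.0):
    """Normal density of the transformed value times the Jacobian."""
    if q <= 0:
        return 0.0
    jacobian = q ** (lambda_bc - 1.0)
    return NormalDist(loc, scale).pdf(boxcox(q, lambda_bc)) * jacobian

# lambda_bc=0 reproduces the log-normal; lambda_bc=1 is a normal shifted by one.
print(bcnorm_pdf(2.0, lambda_bc=0.0), bcnorm_pdf(2.0, lambda_bc=1.0))
```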

Folded and Rectified Distributions

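These distributions model non-negative data. The folded normal is the distribution of |X| and the rectified normal is the distribution of max(0, X), where X is normally distributed; only the rectified normal has a point mass at zero.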

Folded Normal

dfnorm(q, loc=0, scale=1, log=False)

Folded Normal distribution density.

Parameters:

q – Value(s) - must be non-negative
loc – Location parameter of underlying normal
scale – Scale parameter of underlying normal
log – If True, return log density

Returns:

Density at q
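
If X is normal with parameters loc and scale, the density of |X| at q ≥ 0 is the sum of the normal density at q and at -q. A minimal standalone sketch (fnorm_pdf is a hypothetical name, not the greybox function):

```python
from statistics import NormalDist

def fnorm_pdf(q, loc=0.0, scale=1.0):
    """Density of |X| for X ~ Normal(loc, scale)."""
    if q < 0:
        return 0.0
    base = NormalDist(loc, scale)
    # Both +q and -q in the underlying normal fold onto the same value.
    return base.pdf(q) + base.pdf(-q)

# With loc=0 this is the half-normal: twice the normal density on q >= 0.
print(fnorm_pdf(1.0), 2 * NormalDist().pdf(1.0))
```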

Rectified Normal

drectnorm(q, loc=0, scale=1, log=False)

Rectified Normal distribution density.

Parameters:

q – Value(s) - must be non-negative
loc – Location parameter of underlying normal
scale – Scale parameter of underlying normal
log – If True, return log density

Returns:

Density at q
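
A rectified normal variable is max(0, X) for normal X, so all the probability that X puts below zero collapses onto the point zero. The sketch below returns that point mass at q == 0 and the ordinary normal density for q > 0. Returning the mass at zero is one possible convention and is an assumption here, not a statement about what drectnorm does; both names are hypothetical.

```python
from statistics import NormalDist

def rectnorm_zero_mass(loc=0.0, scale=1.0):
    """P[max(0, X) == 0] = P[X <= 0] for X ~ Normal(loc, scale)."""
    return NormalDist(loc, scale).cdf(0.0)

def rectnorm_pdf(q, loc=0.0, scale=1.0):
    """Mixed 'density': point mass at zero, normal density above zero."""
    if q < 0:
        return 0.0
    if q == 0:
        return rectnorm_zero_mass(loc, scale)
    return NormalDist(loc, scale).pdf(q)

# With loc=0, half of the probability sits exactly at zero.
print(rectnorm_zero_mass(), rectnorm_pdf(1.0, loc=1.0))
```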

Distributions for Positive Values

Inverse Gaussian

dinvgauss(q, loc=1, scale=1, log=False)

Inverse Gaussian distribution density.

Parameters:

q – Value(s) - must be positive
loc – Mean parameter
scale – Scale parameter
log – If True, return log density

Returns:

Density at q

Gamma

dgamma(q, shape=1, scale=1, log=False)

Gamma distribution density.

Parameters:

q – Value(s) - must be positive
shape – Shape parameter (often denoted alpha)
scale – Scale parameter (often denoted theta)
log – If True, return log density

Returns:

Density at q

Exponential

dexp(q, loc=0, scale=1, log=False)

Exponential distribution density.

Parameters:

q – Value(s) - must be non-negative
loc – Location parameter
scale – Scale parameter (1/rate)
log – If True, return log density

Returns:

Density at q

Chi-Squared

dchi2(q, df, log=False)

Chi-squared distribution density.

Parameters:

q – Value(s) - must be non-negative
df – Degrees of freedom
log – If True, return log density

Returns:

Density at q
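
The gamma, exponential, and chi-squared densities are closely related: an exponential with a given scale is a gamma with shape=1, and a chi-squared with df degrees of freedom is a gamma with shape=df/2 and scale=2. The sketch below checks both identities using a textbook gamma density; gamma_pdf is a hypothetical name, not the greybox function, and the exponential is taken with loc=0.

```python
import math

def gamma_pdf(q, shape=1.0, scale=1.0, log=False):
    """Textbook gamma density in the shape/scale parameterization."""
    if q <= 0:
        return -math.inf if log else 0.0
    log_density = ((shape - 1.0) * math.log(q) - q / scale
                   - math.lgamma(shape) - shape * math.log(scale))
    return log_density if log else math.exp(log_density)

x = 1.7
exponential_value = math.exp(-x / 2.0) / 2.0            # exponential, scale=2
chi2_df3_value = (math.sqrt(x) * math.exp(-x / 2.0)
                  / (2.0 ** 1.5 * math.gamma(1.5)))     # chi-squared, df=3

print(gamma_pdf(x, shape=1.0, scale=2.0), exponential_value)
print(gamma_pdf(x, shape=1.5, scale=2.0), chi2_df3_value)
```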

Count Distributions

Poisson

dpois(q, loc, log=False)

Poisson distribution probability mass function.

Parameters:

q – Value(s) - must be non-negative integers
loc – Mean/lambda parameter
log – If True, return log probability

Returns:

Probability mass at q
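
The Poisson mass function is usually evaluated on the log scale to avoid overflow in the factorial. The following standalone sketch uses loc as the mean, as in the signature above; pois_pmf is a hypothetical name, and it assumes loc > 0.

```python
import math

def pois_pmf(q, loc, log=False):
    """Poisson PMF with mean loc (> 0), computed via log-gamma."""
    log_prob = q * math.log(loc) - loc - math.lgamma(q + 1.0)
    return log_prob if log else math.exp(log_prob)

probs = [pois_pmf(k, 2.0) for k in range(100)]
mean = sum(k * p for k, p in enumerate(probs))

print(probs[0], sum(probs), mean)
```

The probabilities sum to one and their mean recovers loc, up to floating-point error.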

Negative Binomial

dnbinom(q, loc=1, size=1, log=False)

Negative Binomial distribution probability mass function.

Parameters:

q – Value(s) - must be non-negative integers
loc – Mean parameter
size – Dispersion parameter
log – If True, return log probability

Returns:

Probability mass at q

Geometric

dgeom(q, prob, log=False)

Geometric distribution probability mass function.

Parameters:

q – Value(s) - must be non-negative integers
prob – Probability of success
log – If True, return log probability

Returns:

Probability mass at q
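
The negative binomial and geometric mass functions above are linked. In the mean/size parameterization, the success probability is size / (size + loc), and size=1 reduces to a geometric distribution counting failures before the first success. The sketch below assumes that convention (the same one R uses for its mu and size arguments); nbinom_pmf is a hypothetical name.

```python
import math

def nbinom_pmf(q, loc=1.0, size=1.0):
    """Negative binomial PMF parameterized by mean (loc) and dispersion (size)."""
    prob = size / (size + loc)
    log_prob = (math.lgamma(q + size) - math.lgamma(size) - math.lgamma(q + 1.0)
                + size * math.log(prob) + q * math.log1p(-prob))
    return math.exp(log_prob)

probs = [nbinom_pmf(k, loc=3.0, size=2.0) for k in range(500)]
mean = sum(k * p for k, p in enumerate(probs))

# size=1 with mean 3 matches a geometric PMF with prob = 1 / (1 + 3).
geom_value = 0.25 * 0.75 ** 4

print(mean, nbinom_pmf(4, loc=3.0, size=1.0), geom_value)
```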

Binomial

dbinom(q, size, prob, log=False)

Binomial distribution probability mass function.

Parameters:

q – Value(s) - must be integers between 0 and size
size – Number of trials
prob – Probability of success
log – If True, return log probability

Returns:

Probability mass at q
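
For comparison, the binomial mass function can be written in a few lines with math.comb. This is a standalone sketch, not the greybox implementation; binom_pmf is a hypothetical name.

```python
import math

def binom_pmf(q, size, prob):
    """Binomial PMF: probability of q successes in size trials."""
    if q < 0 or q > size:
        return 0.0
    return math.comb(size, q) * prob**q * (1.0 - prob) ** (size - q)

probs = [binom_pmf(k, 10, 0.3) for k in range(11)]
print(binom_pmf(2, 4, 0.5), sum(probs))
```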

Binary/Bounded Distributions

Beta

dbeta(q, a, b, log=False)

Beta distribution density.

Parameters:

q – Value(s) - must be in [0, 1]
a – First shape parameter (alpha)
b – Second shape parameter (beta)
log – If True, return log density

Returns:

Density at q
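
The beta density is q^(a-1) * (1-q)^(b-1) / B(a, b), where B is the beta function. The sketch below evaluates it on the log scale. It only handles the open interval (0, 1) to avoid log(0) at the endpoints, which is a simplification of this illustration; beta_pdf is a hypothetical name.

```python
import math

def beta_pdf(q, a, b):
    """Beta density on the open interval (0, 1); endpoints are left out."""
    if not 0.0 < q < 1.0:
        raise ValueError("this sketch only handles 0 < q < 1")
    log_beta_fn = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_density = ((a - 1.0) * math.log(q) + (b - 1.0) * math.log1p(-q)
                   - log_beta_fn)
    return math.exp(log_density)

# a = b = 1 is the uniform density; a = b = 2 peaks at q = 0.5.
print(beta_pdf(0.3, 1.0, 1.0), beta_pdf(0.5, 2.0, 2.0))
```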

Logit-Normal

dlogitnorm(q, loc=0, scale=1, log=False)

Logit-Normal distribution density.

Parameters:

q – Value(s) - must be in (0, 1)
loc – Mean of underlying normal (logit scale)
scale – Standard deviation (logit scale)
log – If True, return log density

Returns:

Density at q
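
The logit-normal density follows from the same change-of-variables idea as the log-normal: evaluate the normal density at logit(q) and divide by q * (1 - q), the Jacobian of the logit transform. A standalone sketch (logitnorm_pdf is a hypothetical name, not the greybox function):

```python
import math
from statistics import NormalDist

def logitnorm_pdf(q, loc=0.0, scale=1.0):
    """Logit-normal density via the change of variables y = logit(q)."""
    if not 0.0 < q < 1.0:
        return 0.0
    logit = math.log(q / (1.0 - q))
    return NormalDist(loc, scale).pdf(logit) / (q * (1.0 - q))

print(logitnorm_pdf(0.5), logitnorm_pdf(0.2), logitnorm_pdf(0.8))
```

With loc=0 the density is symmetric around 0.5, so the values at 0.2 and 0.8 coincide.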

CDF-Based Distributions

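These are cumulative distribution functions rather than densities. They are used where the likelihood is built from the CDF, as in logit and probit models for binary responses.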

Logistic CDF

plogis(q, loc=0, scale=1, log=False, lower_tail=True)

Logistic cumulative distribution function.

Parameters:

q – Value(s) at which to evaluate
loc – Location parameter
scale – Scale parameter
log – If True, return log probability
lower_tail – If True, return P[X ≤ q]

Returns:

CDF value at q

Probit (Normal CDF)

pnorm(q, loc=0, scale=1, log=False, lower_tail=True)

Normal cumulative distribution function.

Parameters:

q – Value(s) at which to evaluate
loc – Location parameter (mean)
scale – Scale parameter (standard deviation)
log – If True, return log probability
lower_tail – If True, return P[X ≤ q]

Returns:

CDF value at q
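
To show what building a likelihood from a CDF can look like, the sketch below computes a Bernoulli log-likelihood for binary data, with the success probability given by a CDF applied to a linear predictor. The logistic CDF gives a logit model and the normal CDF gives a probit model. All names here (logistic_cdf, bernoulli_loglik, and the toy data) are hypothetical and for illustration only; this is not how greybox is known to implement it.

```python
import math
from statistics import NormalDist

def logistic_cdf(q, loc=0.0, scale=1.0):
    """Logistic CDF: 1 / (1 + exp(-(q - loc) / scale))."""
    return 1.0 / (1.0 + math.exp(-(q - loc) / scale))

def bernoulli_loglik(y, eta, link_cdf):
    """Sum of log P[y_i] where P[y_i = 1] = link_cdf(eta_i)."""
    total = 0.0
    for y_i, eta_i in zip(y, eta):
        p = link_cdf(eta_i)
        total += math.log(p) if y_i == 1 else math.log1p(-p)
    return total

y = [1, 0, 1, 1]                # toy binary responses
eta = [0.8, -1.2, 0.3, 2.0]     # toy linear predictor values

logit_ll = bernoulli_loglik(y, eta, logistic_cdf)
probit_ll = bernoulli_loglik(y, eta, NormalDist().cdf)
print(logit_ll, probit_ll)
```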

Summary

greybox supports the distributions described above. For each distribution family, use the appropriate prefix:

d* - Probability density/mass function (PDF/PMF)
p* - Cumulative distribution function (CDF)
q* - Quantile function (inverse CDF)
r* - Random number generation

The choice of distribution depends on the nature of your data:

Continuous, symmetric: Normal, Laplace, Logistic, Student’s t
Heavy tails: Laplace, S, Generalized Normal, Student’s t
Positive, right-skewed: Log-Normal, Gamma, Inverse Gaussian
Count data: Poisson, Negative Binomial, Geometric, Binomial
Proportions: Beta, Logit-Normal
Non-negative (Rectified Normal allows exact zeros): Folded Normal, Rectified Normal