Welcome to excee’s documentation!
excee is (yet another) library for analyzing and plotting sampled distributions, in particular from Markov chain Monte Carlo simulations.
This documentation is a work in progress.
Yet another “why yet another MCMC plotting library” explainer
excee is another in a long line of libraries for visualizing high-dimensional, sampled distributions.
Its defining design feature is its use of xarray data structures, which carry enough metadata to remove numerous pain points of working with plain numpy arrays (selecting variables by name rather than index, specifying labels via attrs rather than ordered lists, and enabling per-variable configuration via dictionary input) without introducing custom classes.
xarray’s data structures, aside from having their own slate of features (including built-in serialization to and from disk and semantic indexing; see Why xarray?), remain compatible with almost the entire Scientific Python ecosystem.
For the uninitiated, xarray.Dataset, the main workhorse for excee, could be thought of as a dictionary of numpy arrays that still acts like an array.
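As a minimal sketch of that picture, the following builds a small Dataset of fake samples using only xarray and numpy. The variable names, the dimension names (chain, draw), and the choice of the long_name attribute for labels are illustrative assumptions, not something excee is known to require.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Hypothetical samples: 4 chains of 500 draws each for two parameters.
samples = xr.Dataset(
    {
        "omega_m": (("chain", "draw"), rng.normal(0.3, 0.02, size=(4, 500))),
        "h": (("chain", "draw"), rng.normal(0.7, 0.01, size=(4, 500))),
    }
)

# Labels travel with the data as attributes instead of a separate ordered list.
# (long_name is an xarray plotting convention; its use here is an assumption.)
samples["omega_m"].attrs["long_name"] = r"$\Omega_m$"
samples["h"].attrs["long_name"] = r"$h$"

# Dictionary-like: select a variable by name rather than by column index.
omega_m = samples["omega_m"]

# Array-like: reductions and indexing act on every variable at once.
means = samples.mean()               # Dataset holding one scalar per variable
first_chain = samples.isel(chain=0)  # Dataset whose variables have dims (draw,)
```

Here samples["omega_m"] is an xarray.DataArray wrapping an ordinary numpy array, so numpy functions continue to work on it directly.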
Several additional considerations motivate excee’s design and features:
- For posterior plots, it originally wrapped corner. Eventually, new features required a reimplementation that remains heavily inspired by corner. These include:
  - Panel layouts other than the classic isosceles triangle. compare_2d_dists() and plot_joint_dist() allow specifying row and column variables separately and even support completely arbitrary arrangements (irregular plot grids).
  - Smoothing via kernel density estimation that avoids the most common sources of bias, in particular from parameter boundaries (which naive smoothing bleeds across) and correlations between variables in 2D (which naive smoothing artificially isotropizes).
  - Automatic boundary detection from samples with a simple but effective coefficient of L-variation threshold, used both for boundary handling in kernel density estimation and for (optionally) automated credible interval summaries (choosing between one- and two-sided summaries).
  - Built-in support for comparing multiple distributions on the same axes with compare_2d_dists().
  - One-dimensional distribution plots, in standard style (compare_1d_dists()) and as “violin” plots (compare_violin()).
- Some libraries with similar feature sets tend to oversmooth by default, which can deceive the user into thinking that their samples have converged sufficiently. In the spirit of “crappy contours” being an extremely useful diagnostic tool, joint distribution plots display unsmoothed distributions by default. To leave little excuse for not checking that smoothed density estimates faithfully represent the underlying sample, test_smoothing(), a thin wrapper of compare_2d_dists(), makes the check trivial.
- By centralizing on xarray for its data format and avoiding custom containers for matplotlib figures, excee introduces no library-specific classes or structures for users to learn and interface with, just a simple, functional API.
- Through simple design and effective use of functionality already implemented in numpy and scipy, the library itself is dramatically simpler and a fraction of the size of some others.
- Beyond posterior plots, excee provides diagnostic utilities to assess convergence, including (performance-optimized) autocorrelation analysis and diagnostic plots as well as other statistics.
- The very fact that this explainer is a cliché might suggest that (some of) the workflow these libraries address is too user-specific for a general-purpose library. Nevertheless, excee attempts to impose a minimum of stylistic choices, leaving as much plot styling accessible (via propagated keyword argument dictionaries) as possible.
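To make the boundary bias mentioned in the list above concrete, here is a rough, numpy-only sketch of a one-dimensional Gaussian kernel density estimate with and without reflection about a known lower boundary. Reflection is one standard boundary correction and is used here purely for illustration; it is not necessarily what excee does internally, and the function name gaussian_kde_1d, its signature, and the fixed bandwidth are all hypothetical.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth, lower=None):
    """Gaussian KDE evaluated on `grid`, optionally reflected about `lower`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)

    def kernel_sum(points):
        # Sum of unnormalized Gaussian kernels centered on `points`.
        z = (grid[:, None] - points[None, :]) / bandwidth
        return np.exp(-0.5 * z**2).sum(axis=1)

    density = kernel_sum(samples)
    if lower is not None:
        # Mirror the samples about the boundary so that kernel mass which
        # would leak below `lower` is folded back into the allowed region.
        density = density + kernel_sum(2 * lower - samples)
        density = np.where(grid >= lower, density, 0.0)

    return density / (samples.size * bandwidth * np.sqrt(2 * np.pi))


rng = np.random.default_rng(1)
draws = rng.exponential(scale=1.0, size=5000)  # true density: exp(-x), x >= 0
grid = np.linspace(0.0, 5.0, 501)

naive = gaussian_kde_1d(draws, grid, bandwidth=0.2)
reflected = gaussian_kde_1d(draws, grid, bandwidth=0.2, lower=0.0)
```

The true density equals 1 at the boundary x = 0. The naive estimate falls well short of that because roughly half of each nearby kernel spills into x < 0, while the reflected estimate recovers most of the missing mass. This sketch addresses only the boundary bias, not the correlation bias in 2D.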
Table of contents
API Reference: