Flipping some notation, suppose we have a joint distribution $\pi(\theta, y)$ (i.e., a static problem, no observations). Consider the idea that $\theta$ parameterizes the likelihood $p(y \mid \theta)$ (again, in a somewhat frequentist sense). Then, we may end up sampling $y^{(i,j)} \sim p(y \mid \theta^{(i)})$, i.e., have an ensemble of samples of $y$ for each choice of $\theta^{(i)}$ (each ensemble representing a distribution). We may want to take some kind of generative modeling (or surrogate) approach to learning $p(y \mid \theta)$ from these samples. One method would be to throw out all but one $y$ for each $\theta^{(i)}$ (i.e., just take joint samples $(\theta^{(i)}, y^{(i)})$), but this wastes precious information.

Another idea, perhaps a little more "out-there," would be to learn maps $T_{w^{(i)}}$, one per ensemble, each transporting its conditional $p(y \mid \theta^{(i)})$ to a standard Gaussian (where every map is parameterized identically). Each map is then determined by some vector $w^{(i)}$, which is, in some sense, ideally a kind of summary statistic of its distribution. This leaves us with pairs $(\theta^{(i)}, w^{(i)})$. If we learn a second map $S$ (not necessarily with the same parameterization as $T$) transporting the distribution of $(\theta, w)$ to a Gaussian, then for a given parameter $\theta$ we could generate samples $w \sim p(w \mid \theta)$, i.e., realizations of the map $T_w$, which in turn would allow us to plug standard-Gaussian samples $z$ into $T_w^{-1}$ and generate realizations of $y \sim p(y \mid \theta)$.
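To make the two-level pipeline concrete, here is a minimal sketch under strong simplifying assumptions: both $T_w$ and $S$ are taken to be affine (Gaussian) transports, so $w$ is literally an empirical mean plus a Cholesky factor, and sampling $w \mid \theta$ reduces to Gaussian conditioning. The toy likelihood and all names (`fit_w`, `sample_y_given_theta`) are illustrative, not from the text; in practice $T_w$ would likely be a normalizing flow and $S$ a conditional generative model over $(\theta, w)$.

```python
# Minimal sketch, assuming affine transports at both levels; the toy
# likelihood and all function names here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d_theta, d_y, n_theta, n_ens = 2, 3, 200, 500

def sample_ensemble(theta, n):
    """Toy likelihood p(y | theta): mean and scale both depend on theta."""
    mu = np.array([theta[0], theta[0] + theta[1], theta[1]])
    return mu + (1.0 + 0.5 * theta[1] ** 2) * rng.normal(size=(n, d_y))

# Level 1: per-ensemble map T_w(y) = L^{-1}(y - m) sending p(y | theta^(i))
# to N(0, I); the vector w stacks the empirical mean m and the lower-triangular
# Cholesky factor L of the empirical covariance (a literal summary statistic).
def fit_w(Y):
    m = Y.mean(axis=0)
    L = np.linalg.cholesky(np.cov(Y.T) + 1e-8 * np.eye(d_y))
    return np.concatenate([m, L[np.tril_indices(d_y)]])

thetas = rng.normal(size=(n_theta, d_theta))
W = np.stack([fit_w(sample_ensemble(t, n_ens)) for t in thetas])

# Level 2: map S sending the distribution of (theta, w) to a Gaussian. A joint
# Gaussian fit over (theta, w) is exactly such an affine map, and it lets us
# condition analytically to draw w | theta.
TW = np.hstack([thetas, W])
mu_tw, C = TW.mean(axis=0), np.cov(TW.T)
Ctt, Ctw = C[:d_theta, :d_theta], C[:d_theta, d_theta:]
Cwt, Cww = C[d_theta:, :d_theta], C[d_theta:, d_theta:]

def sample_y_given_theta(theta, n):
    # Draw w | theta from the conditional Gaussian under the level-2 fit.
    mu_w = mu_tw[d_theta:] + Cwt @ np.linalg.solve(Ctt, theta - mu_tw[:d_theta])
    cov_w = Cww - Cwt @ np.linalg.solve(Ctt, Ctw)
    w = rng.multivariate_normal(mu_w, 0.5 * (cov_w + cov_w.T))
    # Unpack w into (m, L) and push z ~ N(0, I) through T_w^{-1}(z) = m + L z.
    m, L = w[:d_y], np.zeros((d_y, d_y))
    L[np.tril_indices(d_y)] = w[d_y:]
    return m + rng.normal(size=(n, d_y)) @ L.T

print(sample_y_given_theta(np.array([0.5, -1.0]), n=5))
```

Note that with affine maps the "summary statistic" reading of $w$ is exact (it is the sufficient statistic of a Gaussian fit to the ensemble); with a flow-based $T_w$, $w$ would instead be network weights, and the second-level map $S$ would have to cope with their much higher dimension.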