
Questions tagged [kullback-leibler]

An asymmetric measure of dissimilarity between probability distributions. It can be interpreted as the expected value of the log-likelihood ratio under the alternative hypothesis.
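As a quick illustration of the definition above, here is a minimal discrete-case sketch (the distributions `p` and `q` are made-up examples, not from any question below):

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_x p(x) * log(p(x) / q(x)) for discrete distributions.

    Conventions: terms with p(x) = 0 contribute 0; the divergence is
    infinite if q(x) = 0 while p(x) > 0.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

p = [0.5, 0.4, 0.1]
q = [0.3, 0.3, 0.4]

# KL is non-negative, and zero when the distributions coincide ...
assert kl_divergence(p, p) == 0.0
# ... and, unlike a metric, it is asymmetric:
assert kl_divergence(p, q) != kl_divergence(q, p)
```

Many of the questions under this tag hinge on exactly these two properties (non-negativity and asymmetry).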

3 votes
2 answers
142 views

I'm having a fundamental disconnect between my intuition for KL divergence and the standard notation $D_{KL}(P \parallel Q)$. My intuition, which I believe is correct, is based on "excess ...
asked by Charlie Parker
0 votes
0 answers
61 views

For the following binary hypothesis testing problem $$ \begin{aligned} H_0: \boldsymbol{y} \sim f(\boldsymbol{y} | H_0)\\ H_1: \boldsymbol{y} \sim f(\boldsymbol{y} | H_1) \end{aligned} $$ where $\...
asked by colter
13 votes
1 answer
576 views

Not a technical question, more of a curiosity from someone outside Statistics/Probability. The paper by Berk (1966), "Limiting Behavior of Posterior Distributions when the Model is Incorrect" ...
asked by Joao Francisco Cabral Perez
4 votes
1 answer
125 views

I learned from MIT OCW course 18.650 that we need i.i.d. samples to derive the MLE from KL divergence. But in the GLM framework the catch is that we model the mean of the selected distribution, basically ...
asked by Kavalali
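The MLE/KL connection asked about in the excerpt above can be checked numerically: for i.i.d. data, minimizing $D_{KL}(\hat p_{\text{emp}} \parallel p_\theta)$ over $\theta$ is the same as maximizing the average log-likelihood, since the two differ only by the $\theta$-free entropy of the empirical distribution. A minimal Bernoulli sketch (the data are made up):

```python
import math

# Made-up i.i.d. Bernoulli data: 7 ones out of 10 observations.
data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
p_hat = sum(data) / len(data)  # empirical frequency of 1s

def kl_emp_vs_model(theta):
    """D_KL(empirical distribution || Bernoulli(theta))."""
    return (p_hat * math.log(p_hat / theta)
            + (1 - p_hat) * math.log((1 - p_hat) / (1 - theta)))

def avg_log_lik(theta):
    """Average log-likelihood of the data under Bernoulli(theta)."""
    return sum(math.log(theta) if x else math.log(1 - theta)
               for x in data) / len(data)

# On a grid, the KL minimizer and the likelihood maximizer coincide,
# and both equal the empirical mean (the MLE).
grid = [i / 100 for i in range(1, 100)]
theta_kl = min(grid, key=kl_emp_vs_model)
theta_ml = max(grid, key=avg_log_lik)
assert theta_kl == theta_ml == 0.7
```

This only illustrates the i.i.d. case; how the argument carries over to the GLM setting is precisely what the question asks.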
1 vote
0 answers
93 views

My question is about the deep-learning paradigm: I do not get where the cost functions come from. For example, for a classification task, are we treating the encoder as the expected value of ...
asked by Kavalali
4 votes
0 answers
120 views

Exact hierarchical decomposition of KL divergence into marginals and higher-order interactions. In the standard set-up, you compare a joint distribution $P(X_1,\dots,X_k)$ to an independent ...
asked by Will
0 votes
0 answers
22 views

I came across an article that stated the following: "However, from this discussion, mutual information is not equivalent to Kullback–Leibler divergence." I assume only one interpretation can be correct ...
asked by anna6931
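For the excerpt above, the standard identity is $I(X;Y) = D_{KL}(P_{XY} \parallel P_X P_Y)$: mutual information is a particular KL divergence, namely from the joint to the product of marginals. A small numerical check on a made-up 2x2 joint table:

```python
import math

def entropy(dist):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

# Made-up joint distribution of two binary variables (rows: x, columns: y).
joint = [[0.3, 0.2],
         [0.1, 0.4]]

px = [sum(row) for row in joint]                              # marginal of X
py = [sum(joint[x][y] for x in range(2)) for y in range(2)]   # marginal of Y

# Mutual information via entropies: I(X;Y) = H(X) + H(Y) - H(X,Y).
mi = entropy(px) + entropy(py) - entropy([joint[x][y]
                                          for x in range(2) for y in range(2)])

# KL divergence from the joint to the product of the marginals.
kl = sum(joint[x][y] * math.log(joint[x][y] / (px[x] * py[y]))
         for x in range(2) for y in range(2))

assert abs(mi - kl) < 1e-12  # the two quantities coincide
```

So MI "is" a KL divergence for this specific pair of arguments; it is of course not the same object as $D_{KL}(P \parallel Q)$ for arbitrary $P, Q$.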
3 votes
1 answer
182 views

I'm interested in estimating $D_\mathrm{KL}(q \parallel p) = \int q(x) \log \frac{q(x)}{p(x)}\,\mathrm dx$, where $p$ is a multivariate Gaussian and $q$ is an implicit distribution parameterized by a ...
asked by Kaiwen
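When the density of $q$ is available in closed form (which is exactly what the implicit-distribution setting in the question lacks), $D_{KL}(q \parallel p)$ can be estimated by Monte Carlo as the sample average of $\log\frac{q(x)}{p(x)}$ over draws $x \sim q$. A 1-D Gaussian sketch where the exact value is known, so the estimate can be checked (the two distributions are made-up examples):

```python
import math
import random

random.seed(0)  # deterministic draws for reproducibility

def log_normal_pdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

# q = N(0, 1), p = N(1, 2^2): made-up example distributions.
mu_q, s_q = 0.0, 1.0
mu_p, s_p = 1.0, 2.0

# Monte Carlo estimate: average of log q(x) - log p(x) over samples x ~ q.
n = 200_000
est = sum(log_normal_pdf(x, mu_q, s_q) - log_normal_pdf(x, mu_p, s_p)
          for x in (random.gauss(mu_q, s_q) for _ in range(n))) / n

# Closed form for two univariate Gaussians:
# D_KL = log(s_p/s_q) + (s_q^2 + (mu_q - mu_p)^2) / (2 s_p^2) - 1/2
exact = math.log(s_p / s_q) + (s_q**2 + (mu_q - mu_p)**2) / (2 * s_p**2) - 0.5
assert abs(est - exact) < 0.01
```

For an implicit $q$ (samples only, no density), this naive estimator is unavailable, which is what makes the question above nontrivial.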
0 votes
0 answers
91 views

I am reading the paper "Complex-Valued Variational Autoencoder: A Novel Deep Generative Model for Direct Representation of Complex Spectra". In this paper, the authors calculate the KL ...
asked by Jiatong LI
1 vote
1 answer
135 views

Assume $(E,\mathcal E,\lambda)$ is a $\sigma$-finite measure space and $\nu$ is a probability measure on $(E,\mathcal E)$ with $\nu\ll\lambda$. Furthermore, assume that $\mu=\sum_{i=0}^{n-1}\delta_{...
asked by 0xbadf00d
1 vote
1 answer
165 views

Suppose that we have two independent identically distributed samples. The first sample looks like $x_1 , \ldots, x_n$ with $x_i \in \mathbb{R}^d$ for every $i$. The second sample looks like $y_1, \...
asked by 温泽海
0 votes
0 answers
63 views

I would like to know the following, which has been stated in some literature but never explicitly proved. Consider a setup consisting of a binary vector of random variables of length n, say $\vec{v}=(...
asked by chemo
1 vote
1 answer
394 views

I am comparing the similarity between multiple distributions based on the output of different machine-learning models. I am applying the generalised JS divergence (wiki): $$ JSD_{\pi_1,...,\pi_n}(p_1,....
asked by Edi
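The generalised JS divergence referenced in the excerpt above can be written as $JSD_{\pi}(p_1,\dots,p_n) = H\!\left(\sum_i \pi_i p_i\right) - \sum_i \pi_i H(p_i)$: the entropy of the mixture minus the mixture of the entropies. A minimal discrete sketch (the distributions and weights are made up):

```python
import math

def entropy(p):
    """Shannon entropy H(p) in nats of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def gen_jsd(weights, dists):
    """Generalised Jensen-Shannon divergence:
    H(sum_i w_i p_i) - sum_i w_i H(p_i)."""
    mixture = [sum(w * p[k] for w, p in zip(weights, dists))
               for k in range(len(dists[0]))]
    return entropy(mixture) - sum(w * entropy(p) for w, p in zip(weights, dists))

dists = [[0.7, 0.2, 0.1],
         [0.1, 0.6, 0.3],
         [0.3, 0.3, 0.4]]
weights = [0.5, 0.25, 0.25]

jsd = gen_jsd(weights, dists)
# By concavity of entropy, this is non-negative, always finite,
# and zero when all the distributions are identical.
assert jsd >= 0
assert abs(gen_jsd([1 / 3] * 3, [dists[0]] * 3)) < 1e-12
```

Unlike KL, this quantity is symmetric in the distributions (for matching weights) and bounded, which is one reason it is popular for comparing model outputs.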
1 vote
0 answers
133 views

Unlike a real distance measure, relative entropy is not symmetric in the sense that $D(p(x)∥q(x)) \neq D(q(x)∥p(x))$. It turns out that many information measures can be expressed by relative entropies....
asked by 허정윤
1 vote
0 answers
29 views

I read this question Why do we use Kullback-Leibler divergence rather than cross entropy in the t-SNE objective function? and I cannot fully understand the answer. If we're using KL divergence for the ...
asked by COTHE
