
Questions tagged [loss-functions]

A function used to quantify the difference between observed data and predicted values according to a model. Minimization of loss functions is a way to estimate the parameters of the model.

7 votes · 1 answer · 171 views

I’m trying to understand the common assumptions in machine-learning optimization theory, where a “well-behaved” loss function is often required to be both L-Lipschitz and β-smooth (i.e., have β-...
asked by Antonios Sarikas
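As a concrete illustration of those two properties (a numerical sanity check of my own, not from the question): the logistic loss $\ell(z) = \log(1 + e^{-z})$ is 1-Lipschitz (its derivative is bounded by 1 in absolute value) and 1/4-smooth (its second derivative is bounded by 1/4), which can be checked on a grid:

```python
import numpy as np

# Logistic loss l(z) = log(1 + exp(-z)).
# l'(z)  = -1 / (1 + exp(z))            -> bounded in (-1, 0), so 1-Lipschitz.
# l''(z) = exp(z) / (1 + exp(z))**2     -> bounded by 1/4, so 1/4-smooth.
z = np.linspace(-50, 50, 100_001)
grad = -1.0 / (1.0 + np.exp(z))
hess = np.exp(z) / (1.0 + np.exp(z)) ** 2

assert np.all(np.abs(grad) <= 1.0)       # Lipschitz bound holds everywhere
assert np.all(hess <= 0.25 + 1e-12)      # smoothness bound, tight at z = 0
```

The bound on the second derivative is attained at $z = 0$, where $\ell''(0) = 1/4$; by contrast, the squared loss is $\beta$-smooth but not Lipschitz on all of $\mathbb{R}$.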
0 votes · 0 answers · 16 views

I am using the gradient boosting regressor from scikit-learn with squared error as the loss function. Then I want to plot the training set vs. test set curve. Based on what I read, it is used to see the ...
asked by Ocean
0 votes · 0 answers · 42 views

A key element in Bayesian neural networks is finding the probability of a set of weights, so that it can be applied to Bayes' rule. I cannot think of many ways of doing this, for P(w) (also sometimes ...
asked by user494234
2 votes
1 answer
125 views

PREMISES: this question likely arises from my very basic knowledge of the field. Please, be very detailed in the answer, even it can seem that some facts are trivial. Also, sorry for my poor english. ...
2by2is2mod2's user avatar
0 votes · 0 answers · 70 views

For different models with the same batch sizes, the start loss and the loss after the steep part would be very similar. Is that normal? With bigger batch sizes the axis gets scaled, but the graph still has the same ...
asked by Darius
4 votes · 1 answer · 316 views

Given two tensors $x$ and $y$, both of shape $(N,n)$ ($N$ being the number of samples and $n$ the number of dimensions of each sample), the MSE loss is (according to what I think): $$ \mathrm{MSE}(x,y)=...
asked by xuanphong
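Worth noting for this kind of question: whether the sum of squared differences is divided by $N$, by $n$, or by $N \cdot n$ is a convention, and libraries differ (PyTorch's `MSELoss` with `reduction='mean'` averages over all $N \cdot n$ entries). A small NumPy check of the all-entries convention, with arbitrary shapes:

```python
import numpy as np

N, n = 4, 3
rng = np.random.default_rng(0)
x = rng.normal(size=(N, n))
y = rng.normal(size=(N, n))

# One common convention: average the squared error over all N*n entries.
mse_all = np.mean((x - y) ** 2)

# Equivalent two-step view: mean over dimensions per sample, then over samples.
mse_per_sample = np.mean((x - y) ** 2, axis=1)   # shape (N,)
assert np.isclose(mse_all, mse_per_sample.mean())
```

The two-step computation agrees with the flat mean exactly because every entry gets the same weight $1/(Nn)$ either way.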
0 votes · 0 answers · 55 views

Given these two target representations for the same underlying data: Target A: Minority class samples (Cluster 5) isolated in the distribution tail, majority class samples (Clusters 3+6) shifted toward ...
asked by n0rdp0l
5 votes · 1 answer · 172 views

Currently I am dealing with time-series data concerning the power consumption of machines. Therefore, all target variables technically range from zero to infinity ($y \in [0, \infty)$). The data ...
asked by Blindschleiche
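One frequently suggested option for nonnegative targets like these is to fit on a log-transformed scale; scikit-learn's `TransformedTargetRegressor` wraps this up so the squared-error loss acts on `log1p(y)` and predictions are mapped back with `expm1`. A minimal sketch (the regressor and data here are placeholders, not from the question):

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=200, noise=5.0, random_state=0)
y = np.abs(y)  # force nonnegative targets, mimicking power-consumption data

# Fit Ridge on log1p(y), so squared error penalizes multiplicative deviations;
# predictions are mapped back to the original scale with expm1.
model = TransformedTargetRegressor(regressor=Ridge(),
                                   func=np.log1p, inverse_func=np.expm1)
model.fit(X, y)
pred = model.predict(X[:5])
```

Alternatives with the same motivation include Poisson/Tweedie losses, which model nonnegative targets directly rather than through a transform.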
1 vote · 0 answers · 93 views

My question is about the paradigm of deep learning: I do not get where the cost functions come from. For example, for a classification task, are we treating the encoder as the expected value of ...
asked by Kavalali
4 votes · 2 answers · 511 views

Many textbooks on the theory of machine learning state that statistical decision theory provides the basis for comparing ML algorithms. In statistical decision theory, decision rules are compared ...
asked by dcoccjcz
0 votes · 0 answers · 80 views

I want to predict an angular parameter ($\phi$) from some signal using a CNN. Due to the architecture of my code, the regression is done on the two targets ($\cos\phi$, $\sin\phi$). I created a model ...
asked by Neinstein
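A detail that often comes up with this ($\cos\phi$, $\sin\phi$) parametrization: the angle is recovered from the two predicted targets with `atan2`, which handles all four quadrants, and normalizing the pair first projects it back onto the unit circle. A sketch with made-up "network outputs" (the noise values are arbitrary):

```python
import numpy as np

# Hypothetical predicted pair for the two regression targets, with noise
# standing in for network error around a true angle of phi = 2.5 rad.
phi = 2.5
c_hat = np.cos(phi) + 0.01
s_hat = np.sin(phi) - 0.02

# Project (c_hat, s_hat) back onto the unit circle, then recover the angle.
norm = np.hypot(c_hat, s_hat)
phi_hat = np.arctan2(s_hat / norm, c_hat / norm)  # close to the true 2.5
```

Strictly, `arctan2` is scale-invariant, so the normalization does not change `phi_hat`; it matters only if the (cos, sin) pair itself is reused downstream.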
8 votes · 5 answers · 1k views

The standard objective function when training a logistic regression model is: Minimize Negative Log-Likelihood. This form makes it easier to optimize, but it is mathematically equivalent to the more ...
asked by Sam
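The equivalence behind that question is easy to verify numerically: since $-\log$ is strictly decreasing, minimizing the negative log-likelihood and maximizing the likelihood pick the same parameters, and $\exp(-\mathrm{NLL})$ recovers the likelihood exactly. A small check on made-up labels and predicted probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=10)           # binary labels
p = rng.uniform(0.05, 0.95, size=10)      # predicted P(y = 1)

# Negative log-likelihood (a.k.a. log loss; summed here rather than averaged).
nll = -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# The likelihood itself: product of per-sample probabilities.
likelihood = np.prod(np.where(y == 1, p, 1 - p))

# exp(-NLL) equals the likelihood, so the two objectives share their optimum.
assert np.isclose(np.exp(-nll), likelihood)
```

Averaging the NLL over samples instead of summing rescales the objective by $1/N$ and again leaves the minimizer unchanged.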
4 votes · 1 answer · 141 views

This answer describes two loss functions for Bayesian credible intervals, each of which is minimized by a particular kind of interval. I am curious whether there exists a loss function on credible ...
asked by Adam L. Taylor
1 vote · 0 answers · 69 views

This is from another question here. The theorem below is from Lambert's paper on forecasting (Elicitation and Evaluation of Statistical Forecasts): $\textbf{Proposition}\quad 1:$ Let $(\Theta = \{\...
asked by Oliver Queen
1 vote · 0 answers · 29 views

Let us say we have an i.i.d. sample of data from a random variable $X$. Suppose an agent must guess the value $x$ of $X$ that will be generated next. The guess is $\hat x$. They will make an error $e:=...
asked by Richard Hardy
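The textbook answer to this kind of question is that the optimal guess depends on the loss: squared error is minimized by the mean, absolute error by the median. That is easy to see empirically by scanning candidate guesses over a skewed sample (the exponential distribution and grid below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.exponential(scale=2.0, size=10_000)  # skewed i.i.d. sample

# Empirical risk of each candidate guess under the two losses.
guesses = np.linspace(0.0, 6.0, 601)
sq_risk = [np.mean((sample - g) ** 2) for g in guesses]
abs_risk = [np.mean(np.abs(sample - g)) for g in guesses]

# Squared error is minimized near the sample mean (about 2 here);
# absolute error near the sample median (about 2*ln 2, roughly 1.39).
best_sq = guesses[np.argmin(sq_risk)]
best_abs = guesses[np.argmin(abs_risk)]
```

Because the distribution is skewed, the two minimizers differ visibly, which is exactly why the choice of loss function matters for point prediction.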
