# How does scaled conjugate gradient work in neural network training? Comparison with gradient descent

Cross Validated Asked by Johanna on December 9, 2020

I am a beginner in machine learning, and I would like to ask whether someone could explain in simple terms how the scaled conjugate gradient method works in neural network training, especially in comparison with gradient descent, which I already understand.

I know the exact steps for training a neural network with gradient descent, but for scaled conjugate gradient I can only find explanations that are far too advanced for me to follow yet.
