Neural Network Basic Terminology
Let's go over some basic neural network terms:
Activations - Numbers that are calculated (by both linear and non-linear layers).
Parameters - Numbers that are randomly initialized and then optimized (that is, the numbers that define the model).
ReLU - Function that returns 0 for negative numbers and leaves positive numbers unchanged.
Mini-batch - A small group of inputs and labels gathered together in two arrays. A gradient descent step is computed on this batch (rather than on a whole epoch).
Forward pass - Applying the model to some input and computing the predictions.
Loss - A value that represents how well (or badly) our model is doing.
Gradient - The derivative of the loss with respect to some parameter of the model.
Backward pass - Computing the gradients of the loss with respect to all model parameters.
Gradient descent - Taking a step in the direction opposite to the gradients to make the model parameters a little bit better.
Learning rate - The size of the step we take when applying SGD (stochastic gradient descent) to update the parameters of the model.
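
To see how these terms fit together, here is a minimal sketch in plain Python (no libraries). Everything in it is an assumption made for illustration: the tiny two-layer model, the made-up data where each label is twice the input, the mean squared error loss, and names such as `params`, `forward`, and `loss_and_gradients`. It is not how any particular framework implements training, only one small way to show each term in action.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def relu(z):
    """ReLU: returns 0 for negative numbers, leaves positive numbers unchanged."""
    return z if z > 0 else 0.0


# Parameters: randomly initialized here, optimized by the loop below.
params = {
    "w1": random.uniform(0.5, 1.5), "b1": 0.0,
    "w2": random.uniform(0.5, 1.5), "b2": 0.0,
}


def forward(params, x):
    """Forward pass: apply the model to one input and return the prediction."""
    z = params["w1"] * x + params["b1"]   # activation from a linear layer
    h = relu(z)                           # activation from a non-linear layer
    pred = params["w2"] * h + params["b2"]
    return pred, z, h


def loss_and_gradients(params, xs, ys):
    """Loss (mean squared error, an assumed choice) on a batch, plus the backward pass."""
    n = len(xs)
    loss = 0.0
    grads = {name: 0.0 for name in params}
    for x, y in zip(xs, ys):
        pred, z, h = forward(params, x)
        err = pred - y
        loss += err ** 2 / n
        # Backward pass: chain rule from the loss back to every parameter.
        d_pred = 2 * err / n
        grads["w2"] += d_pred * h
        grads["b2"] += d_pred
        # ReLU lets the gradient through only where its input was positive.
        d_z = d_pred * params["w2"] if z > 0 else 0.0
        grads["w1"] += d_z * x
        grads["b1"] += d_z
    return loss, grads


# Made-up data for illustration: each label is twice its input.
data = [(i / 10, 2 * i / 10) for i in range(1, 11)]
all_xs = [x for x, _ in data]
all_ys = [y for _, y in data]

learning_rate = 0.05  # size of each update step
batch_size = 5        # each mini-batch holds 5 inputs and 5 labels

initial_loss, _ = loss_and_gradients(params, all_xs, all_ys)

for epoch in range(200):                  # one epoch = one pass over all the data
    random.shuffle(data)
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        xs = [x for x, _ in batch]        # mini-batch inputs
        ys = [y for _, y in batch]        # mini-batch labels
        _, grads = loss_and_gradients(params, xs, ys)
        # Gradient descent: step opposite to the gradient, scaled by the learning rate.
        for name in params:
            params[name] -= learning_rate * grads[name]

final_loss, _ = loss_and_gradients(params, all_xs, all_ys)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.6f}")
```

Running it prints the loss on the full data before and after training; the second number should be smaller. Try changing `learning_rate` or `batch_size` to see how the step size and the mini-batch size affect training.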