ML-Cheat-Sheet

Basic Rules of Differentiation

Basic Rules

  • Constant Rule: $\frac{d}{dx}\,c = 0$

  • Power Rule: $\frac{d}{dx}\,x^n = n\,x^{n-1}$

  • Linear Combination: $\frac{d}{dx}\,\bigl[a\,f(x) + b\,g(x)\bigr] = a\,f'(x) + b\,g'(x)$

  • Product Rule: $\frac{d}{dx}\,\bigl[f(x)\,g(x)\bigr] = f'(x)\,g(x) + f(x)\,g'(x)$

  • Quotient Rule: $\frac{d}{dx}\,\frac{f(x)}{g(x)} = \frac{f'(x)\,g(x) - f(x)\,g'(x)}{g(x)^2}$

  • Chain Rule: $\frac{d}{dx}\,f(g(x)) = f'(g(x))\,g'(x)$

  • Exponential: $\frac{d}{dx}\,e^x = e^x$ | $\frac{d}{dx}\,a^x = a^x \ln a$

  • Logarithmic: $\frac{d}{dx}\,\ln x = \frac{1}{x}$ | $\frac{d}{dx}\,\log_a x = \frac{1}{x \ln a}$
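
A worked example, added here for illustration: applying the chain rule to the sigmoid $\sigma(x) = \frac{1}{1 + e^{-x}}$, a result reused in the Logistic Regression section below.

$$\frac{d}{dx}\,\sigma(x) = \frac{d}{dx}\,\left(1 + e^{-x}\right)^{-1} = -\left(1 + e^{-x}\right)^{-2} \cdot \left(-e^{-x}\right) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}} = \sigma(x)\,\bigl(1 - \sigma(x)\bigr)$$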

Linear Regression

1. Hypothesis

$h_\theta(x) = \theta^T x = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n$

2. Cost Function

Mean Squared Error (MSE), with the conventional $\frac{1}{2}$ factor to keep the gradient clean:

$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right)^2$

3. Optimization

  • Gradient Descent: $\theta_j := \theta_j - \alpha\,\frac{1}{m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right) x_j^{(i)}$ (repeat until convergence)
  • Normal Equation: $\theta = \left( X^T X \right)^{-1} X^T y$
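
A minimal sketch of both optimizers, added for illustration (function names and the toy data are my own, not from the sheet); X is assumed to carry a leading column of ones for the intercept:

```python
import numpy as np

def normal_equation(X, y):
    # theta = (X^T X)^{-1} X^T y, computed via a least-squares solve for stability
    return np.linalg.lstsq(X, y, rcond=None)[0]

def gradient_descent(X, y, alpha=0.1, iters=1000):
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / m  # gradient of (1/2m) * sum of squared errors
        theta -= alpha * grad
    return theta

# Toy usage: recover y = 1 + 2x
X = np.c_[np.ones(5), np.arange(5.0)]
y = 1 + 2 * np.arange(5.0)
print(normal_equation(X, y))   # ~ [1. 2.]
print(gradient_descent(X, y))  # ~ [1. 2.]
```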

Logistic Regression

1. Hypothesis

$h_\theta(x) = \sigma\!\left(\theta^T x\right) = \frac{1}{1 + e^{-\theta^T x}}$

  • Prediction Rule:
    • Predict $y = 1$ if $h_\theta(x) \ge 0.5$, otherwise predict $y = 0$.

2. Cost Function

Log Loss: $J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta\!\left(x^{(i)}\right) + \left(1 - y^{(i)}\right) \log\!\left(1 - h_\theta\!\left(x^{(i)}\right)\right) \right]$

3. Optimization

  • Gradient Descent: $\theta_j := \theta_j - \alpha\,\frac{1}{m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right) x_j^{(i)}$ (same update form as linear regression, but with the sigmoid hypothesis)
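
A minimal sketch of this update, added for illustration (function names are my own); X is again assumed to include an intercept column of ones:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gradient_descent(X, y, alpha=0.1, iters=2000):
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m  # gradient of the log loss
        theta -= alpha * grad
    return theta

def predict(X, theta):
    # prediction rule: y = 1 if h(x) >= 0.5, else 0
    return (sigmoid(X @ theta) >= 0.5).astype(int)
```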

4. Sigmoid Properties

  • Output: $\sigma(z) \in (0, 1)$
  • Derivative: $\sigma'(z) = \sigma(z)\,\bigl(1 - \sigma(z)\bigr)$

Ridge Regression

Loss Function

Adds an $L_2$ penalty to the squared-error loss to prevent overfitting:

$J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \right]$

  • $\lambda$: Regularization parameter. Higher values shrink the weights $\theta$ toward zero.

Optimization

  • Closed-form Solution: $\theta = \left( X^T X + \lambda I \right)^{-1} X^T y$
  • Gradient Descent: $\theta_j := \theta_j \left( 1 - \alpha\,\frac{\lambda}{m} \right) - \alpha\,\frac{1}{m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right) x_j^{(i)}$
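
A minimal sketch of the closed-form solution, added for illustration (the function name is my own); in practice the intercept column is usually left out of the penalty:

```python
import numpy as np

def ridge_closed_form(X, y, lam=1.0):
    # theta = (X^T X + lambda * I)^{-1} X^T y, via a linear solve rather than an explicit inverse
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
```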

Bayesian Classification

Dataset

  • Training set $\left\{ \left( x^{(i)}, y^{(i)} \right) \right\}_{i=1}^{m}$, with feature vectors $x = (x_1, \dots, x_n)$ and class labels $y \in \{c_1, \dots, c_K\}$

Posterior Probability

The probability of class $c_k$ given input $x$, by Bayes' rule:

$P(c_k \mid x) = \frac{P(x \mid c_k)\,P(c_k)}{P(x)} \;\propto\; P(x \mid c_k)\,P(c_k)$

If features are conditionally independent given the class (the naive Bayes assumption):

$P(c_k \mid x) \;\propto\; P(c_k) \prod_{j=1}^{n} P(x_j \mid c_k)$
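
One common instantiation is Gaussian naive Bayes, where each per-class likelihood $P(x_j \mid c_k)$ is a univariate Gaussian. A minimal sketch, added for illustration (function names are my own assumptions, not from the sheet):

```python
import numpy as np

def fit_gaussian_nb(X, y):
    # per class: prior P(c_k), per-feature mean, per-feature variance
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (np.mean(y == c), Xc.mean(axis=0), Xc.var(axis=0) + 1e-9)
    return params

def predict_gaussian_nb(x, params):
    # argmax_c  log P(c) + sum_j log N(x_j; mu_{c,j}, var_{c,j})
    def score(c):
        prior, mu, var = params[c]
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        return np.log(prior) + log_lik
    return max(params, key=score)
```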

SVM

Hard SVM
Hyperplane: $w^T x + b = 0$
Constraint: $y_i \left( w^T x_i + b \right) \ge 1$ for all $i$
Goal: $\min_{w,\,b}\; \frac{1}{2}\|w\|^2$ s.t. $y_i \left( w^T x_i + b \right) \ge 1$
Lagrangian: $L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_i \alpha_i \left[ y_i \left( w^T x_i + b \right) - 1 \right]$, $\quad \alpha_i \ge 0$
Partial derivatives: $\frac{\partial L}{\partial w} = w - \sum_i \alpha_i y_i x_i = 0$, $\quad \frac{\partial L}{\partial b} = -\sum_i \alpha_i y_i = 0$
Solution: $w = \sum_i \alpha_i y_i x_i$ and $\sum_i \alpha_i y_i = 0$
Lagrangian becomes (dual): $\max_{\alpha}\; \sum_i \alpha_i - \frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j\, x_i^T x_j$ s.t. $\alpha_i \ge 0$ and $\sum_i \alpha_i y_i = 0$
Weight vector: $w = \sum_i \alpha_i y_i x_i$
Bias: $b = y_s - w^T x_s$ for any support vector $x_s$ (any $i$ with $\alpha_i > 0$)

Soft SVM
Hyperplane: $w^T x + b = 0$
Constraint: $y_i \left( w^T x_i + b \right) \ge 1 - \xi_i$, $\quad \xi_i \ge 0$
Goal: $\min_{w,\,b,\,\xi}\; \frac{1}{2}\|w\|^2 + C \sum_i \xi_i$ s.t. the constraints above
Lagrangian: $L(w, b, \xi, \alpha, \mu) = \frac{1}{2}\|w\|^2 + C \sum_i \xi_i - \sum_i \alpha_i \left[ y_i \left( w^T x_i + b \right) - 1 + \xi_i \right] - \sum_i \mu_i \xi_i$
Partial Derivatives: $\frac{\partial L}{\partial w} = w - \sum_i \alpha_i y_i x_i = 0$, $\quad \frac{\partial L}{\partial b} = -\sum_i \alpha_i y_i = 0$, $\quad \frac{\partial L}{\partial \xi_i} = C - \alpha_i - \mu_i = 0$
Solution: $w = \sum_i \alpha_i y_i x_i$, $\quad \sum_i \alpha_i y_i = 0$, $\quad \alpha_i = C - \mu_i$
Dual Problem: $\max_{\alpha}\; \sum_i \alpha_i - \frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j\, x_i^T x_j$
s.t. $0 \le \alpha_i \le C$ and $\sum_i \alpha_i y_i = 0$
Weight vector: $w = \sum_i \alpha_i y_i x_i$
Bias: $b = y_s - w^T x_s$ for any support vector $x_s$ with $0 < \alpha_s < C$
The reason that $\xi$ disappears: the slack variables vanish from the dual problem because they are handled implicitly through the Lagrange multipliers $\alpha_i$ and $\mu_i$.
Taking the derivative of the Lagrangian with respect to $\xi_i$ gives $C - \alpha_i - \mu_i = 0$; since $\mu_i \ge 0$, this bounds $\alpha_i$ by $C$, and the $\xi_i$ terms cancel out of the objective.
Consequently, the slack variables do not explicitly appear in the dual formulation. Instead, the dual problem balances maximizing the margin against allowing misclassification through the box constraint $0 \le \alpha_i \le C$.

Kernel SVM
Hyperplane: $w^T \phi(x) + b = 0$, where $\phi$ maps inputs into a (possibly high-dimensional) feature space
Constraint: $y_i \left( w^T \phi(x_i) + b \right) \ge 1 - \xi_i$, $\quad \xi_i \ge 0$
Goal: $\min_{w,\,b,\,\xi}\; \frac{1}{2}\|w\|^2 + C \sum_i \xi_i$
Lagrangian (Dual): $\max_{\alpha}\; \sum_i \alpha_i - \frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j\, K(x_i, x_j)$
s.t. $0 \le \alpha_i \le C$ and $\sum_i \alpha_i y_i = 0$
Weight vector: $w = \sum_i \alpha_i y_i\, \phi(x_i)$ (never formed explicitly; only kernel evaluations are needed)
Decision Function: $f(x) = \operatorname{sign}\!\left( \sum_i \alpha_i y_i\, K(x_i, x) + b \right)$
Bias: $b = y_s - \sum_i \alpha_i y_i\, K(x_i, x_s)$ for any support vector $x_s$ with $0 < \alpha_s < C$
Kernel Functions:
Linear: $K(x, z) = x^T z$
Polynomial: $K(x, z) = \left( x^T z + c \right)^d$
Gaussian (RBF): $K(x, z) = \exp\!\left( -\frac{\|x - z\|^2}{2\sigma^2} \right)$
Sigmoid: $K(x, z) = \tanh\!\left( \kappa\, x^T z + c \right)$
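
A minimal sketch of the decision function, added for illustration (names are my own); it assumes the dual multipliers alphas, labels y_sv, support vectors X_sv, and bias b have already been obtained by solving the dual problem with some QP solver:

```python
import numpy as np

def rbf_kernel(x, z, sigma=1.0):
    # Gaussian (RBF) kernel K(x, z) = exp(-||x - z||^2 / (2 sigma^2))
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

def decision_function(x, alphas, y_sv, X_sv, b, kernel=rbf_kernel):
    # f(x) = sign( sum_i alpha_i * y_i * K(x_i, x) + b )
    s = sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alphas, y_sv, X_sv))
    return np.sign(s + b)
```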

MLE and MAP

MLE

Construct the likelihood function: the joint distribution $L(\theta) = \prod_{i=1}^{n} p(x_i \mid \theta)$
Take the logarithm to simplify the computation: $\ell(\theta) = \log L(\theta) = \sum_{i=1}^{n} \log p(x_i \mid \theta)$
Differentiate, set the derivative to 0, and solve for $\hat{\theta}_{\mathrm{MLE}} = \arg\max_{\theta}\, \ell(\theta)$
Verify the extremum: confirm it is a maximum, e.g. via the second derivative.
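
A short worked example, added for illustration: MLE of a Bernoulli parameter $p$ from $n$ independent flips with $k$ heads.

$$L(p) = p^{k}(1 - p)^{n - k}, \qquad \ell(p) = k \log p + (n - k) \log(1 - p)$$

$$\frac{d\ell}{dp} = \frac{k}{p} - \frac{n - k}{1 - p} = 0 \;\Longrightarrow\; \hat{p}_{\mathrm{MLE}} = \frac{k}{n}$$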

MAP

Combine the prior to construct the posterior probability: $p(\theta \mid x) \propto p(x \mid \theta)\, p(\theta)$
Take the log-posterior: $\log p(\theta \mid x) = \log p(x \mid \theta) + \log p(\theta) + \text{const}$
Differentiate, set the derivative to 0, and solve for $\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta} \left[ \log p(x \mid \theta) + \log p(\theta) \right]$
Verify the extremum: make sure the maximum has been found.
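
Continuing the same illustrative example with a $\mathrm{Beta}(a, b)$ prior on $p$:

$$\log p(p \mid \text{data}) = (k + a - 1)\log p + (n - k + b - 1)\log(1 - p) + \text{const}$$

$$\frac{k + a - 1}{p} - \frac{n - k + b - 1}{1 - p} = 0 \;\Longrightarrow\; \hat{p}_{\mathrm{MAP}} = \frac{k + a - 1}{n + a + b - 2}$$

With a uniform prior ($a = b = 1$), the MAP estimate reduces to the MLE.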