Data Science (7)
-
[8] Understanding K-Nearest Neighbors (KNN)
KNN is a typical 'lazy learner'. That is, instead of learning a discriminant function from the training data, it learns by simply storing the training dataset in memory. Thanks to this, there is no cost in the training phase; instead, the computational cost in the prediction phase is high. Memory-based classifiers have the advantage of being able to adapt immediately ..
2021.01.17 -
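The lazy-learner idea above can be sketched with a minimal from-scratch KNN. This is a toy illustration with made-up data, not the code from the post:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # "Training" is just storing the data; all the work happens at prediction time.
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distance to every stored sample
    nearest = np.argsort(dists)[:k]               # indices of the k closest neighbors
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]              # majority vote among the neighbors

# Toy dataset: two clusters, labeled 0 and 1
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.05, 0.1])))  # → 0
```

Because every prediction scans the whole stored dataset, the prediction cost grows with the number of training samples, which is exactly the trade-off described above.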
[7] About RandomForest
Random forest can be thought of as an ensemble of decision trees. Because individual decision trees suffer from high variance, the goal of random forest is to average multiple decision trees to improve generalization performance and reduce the risk of overfitting. Learning process for a random forest: 1. Draw n random bootstrap samples from the training set, sampling with replacement. 2. Learn the de..
2021.01.07 -
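Step 1 above, bootstrap sampling with replacement, can be sketched as follows (a toy NumPy illustration; the dataset size and seed are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10  # size of the (hypothetical) training set

# Draw a bootstrap sample of size n *with replacement* from the training set.
indices = rng.integers(0, n, size=n)

# Because sampling is with replacement, some samples repeat and others are
# left out entirely; the left-out "out-of-bag" samples can serve as a
# built-in validation set for the tree trained on this bootstrap sample.
oob = np.setdiff1d(np.arange(n), indices)
print("bootstrap:", np.sort(indices))
print("out-of-bag:", oob)
```

Each tree in the forest gets its own bootstrap sample like this, and the final prediction averages (or majority-votes) over all trees.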
[6] Information gain and impurity in decision trees
Decision trees are so named because they take the form of a tree that classifies classes through a series of criteria. The splitting criterion of a decision tree is information gain, which is determined based on impurity. As the name suggests, impurity is an indicator of how mixed the classes within a node are. \( IG(D_p,f) = I(D_p) - \sum_{j=1}^{m}\frac{N_j}{N_p}I(D_j) \) Info..
2021.01.06 -
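The formula above can be computed directly. A minimal sketch using Gini impurity for \(I\) (one common choice; the toy labels are made up):

```python
import numpy as np

def gini(y):
    # Impurity: 1 - sum of squared class proportions in the node.
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def information_gain(parent, children):
    # IG(D_p, f) = I(D_p) - sum_j (N_j / N_p) * I(D_j)
    n_p = len(parent)
    return gini(parent) - sum(len(c) / n_p * gini(c) for c in children)

parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])         # perfectly mixed: gini = 0.5
left = np.array([0, 0, 0, 1])                        # mostly class 0
right = np.array([0, 1, 1, 1])                       # mostly class 1
print(information_gain(parent, [left, right]))       # → 0.125
```

A split that separates the classes more cleanly yields purer children and therefore a larger information gain; the tree chooses the split that maximizes it.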
[5] Solving Non-linear Problems with Kernel SVM
Algorithms such as linear SVM and logistic regression cannot separate classes that are only nonlinearly separable. Kernel methods using a mapping function \(\phi\) can solve such nonlinear problems. Using the mapping function, nonlinear combinations of the original features can be projected into a high-dimensional space where they become linearly separable, and a separating hyperplane is found there and..
2021.01.04 -
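The mapping idea can be sketched with an explicit second-degree polynomial map (a toy example with made-up points; real kernel SVMs compute inner products in this space implicitly via the kernel trick rather than mapping explicitly):

```python
import numpy as np

def phi(X):
    # Explicit mapping (x1, x2) -> (x1^2, sqrt(2)*x1*x2, x2^2).
    # Classes separated by a circle in 2-D become linearly separable here.
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

# Two classes separated by the unit circle — not linearly separable in 2-D.
inner = np.array([[0.1, 0.2], [-0.3, 0.1], [0.2, -0.2]])
outer = np.array([[1.5, 0.0], [0.0, -1.6], [1.2, 1.2]])

# In the mapped space, the sum of the first and last coordinates equals
# x1^2 + x2^2, so the plane z1 + z3 = 1 separates the two classes.
print(phi(inner)[:, 0] + phi(inner)[:, 2])  # all < 1
print(phi(outer)[:, 0] + phi(outer)[:, 2])  # all > 1
```

The hyperplane found in the mapped space corresponds to a circular decision boundary back in the original feature space.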
[3] Logistic Regression Principles
Logistic regression uses the sigmoid function as its activation function and the log-likelihood function as its cost function. Activation function Odds: the ratio of the probability that a particular event occurs to the probability that it does not. $$\frac{P}{(1 - P)}$$ Where P is the probability of the positive sample, i.e., the probability that the target to be predicted will occur. The logit function is usually defined by ..
2020.12.31 -
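The relationship between the sigmoid, the odds, and the logit can be sketched in a few lines (a minimal illustration, not the post's code):

```python
import numpy as np

def sigmoid(z):
    # Maps any real net input z to a probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    # Log of the odds p / (1 - p); the inverse of the sigmoid.
    return np.log(p / (1.0 - p))

p = sigmoid(0.0)
print(p)                       # → 0.5 (net input 0 gives odds of 1, logit of 0)
print(np.isclose(logit(p), 0.0))  # → True: logit undoes the sigmoid
```

Because the logit is the inverse of the sigmoid, modeling the logit as a linear function of the features is exactly what makes the predicted probability follow the sigmoid curve.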
[2] Implement Adaline (adaptive linear neuron)
This post presents code from a notebook that won a bronze medal on Kaggle. If you are interested, please refer to the notebook: www.kaggle.com/choihanbin/predict-titanic-survival-by-adaptive-linear-neuron In the previous posting, we looked at the perceptron. This time we're going to talk about Adaline, a slightly modified version of it. 2020/12/31 - [English/Machine learning algorithm] - [1] Try implementing ..
2020.12.31