k-Nearest Neighbors
K-Nearest Neighbors (kNN) is a simple, instance-based learning method used for classification and regression. It makes predictions for new data points by examining the labels of the k most similar examples in the training data, where similarity is defined by a distance metric in feature space. kNN is non-parametric and lazy: there is no explicit training phase that builds a model; instead, all training instances are stored and the prediction is computed at query time.
To predict a label for a new instance, the distances to all training instances are computed, the k nearest instances are selected, and the prediction is the majority class among their labels (classification) or the average of their target values (regression). Ties can be broken by adjusting k or by weighting votes by inverse distance.
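The procedure above can be sketched in a few lines of pure Python. This is a minimal illustration, not a library implementation; the function name knn_predict and the toy data are made up for the example.

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    # Compute the Euclidean distance from the query point to every training point.
    dists = [math.dist(x, xi) for xi in X_train]
    # Select the indices of the k closest training instances.
    nearest = sorted(range(len(dists)), key=lambda i: dists[i])[:k]
    # Majority vote among the labels of those k neighbors.
    labels = [y_train[i] for i in nearest]
    return Counter(labels).most_common(1)[0][0]

# Toy dataset: two clusters with labels "a" and "b".
X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
y = ["a", "a", "b", "b"]
print(knn_predict(X, y, (0.05, 0.1), k=3))  # → "a"
```

For regression, the majority vote would simply be replaced by the mean of the neighbors' numeric targets.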
Common choices of distance include Euclidean, Manhattan, and other Minkowski metrics; feature scaling is important because features with larger numeric ranges otherwise dominate the distance and drown out the contribution of small-range features.
Computationally, naïve kNN requires O(n d) distance calculations per prediction, where n is the number of training instances and d is the number of features. Index structures such as k-d trees and ball trees, or approximate nearest-neighbor search, can reduce query time substantially, particularly in low to moderate dimensions.
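The per-query cost can be seen directly in a brute-force search: one prediction evaluates n distances of O(d) each. A small optimization, shown here with heapq.nsmallest, keeps only the best k candidates instead of fully sorting all n distances (the synthetic data and sizes are arbitrary):

```python
import heapq
import math
import random

random.seed(0)
n, d, k = 10_000, 16, 5
X = [[random.random() for _ in range(d)] for _ in range(n)]
q = [random.random() for _ in range(d)]

# One query still touches all n*d coordinates (n distances, each O(d)),
# but nsmallest avoids the O(n log n) full sort, tracking only k items.
neighbors = heapq.nsmallest(k, range(n), key=lambda i: math.dist(q, X[i]))
print(neighbors)
```

Tree-based or approximate indexes go further by pruning most of the n distance evaluations entirely, at the cost of a preprocessing step.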
Applications include image and text classification, recommender systems, anomaly detection, and imputation of missing values. Limitations include high memory use (the entire training set must be stored), slow prediction on large datasets, sensitivity to irrelevant features, and degraded accuracy in high dimensions, where distances between points become less informative.