kNearestNeighbors

K-Nearest Neighbors (kNN) is a simple, instance-based learning method used for classification and regression. It makes predictions for new data points by examining the labels of the k most similar examples in the training data, where similarity is defined by a distance metric in feature space. kNN is non-parametric and lazy: there is no explicit training phase that builds a model; instead, all training instances are stored and the prediction is computed at query time.

To predict a label for a new instance, the distances to all training instances are computed, the k closest ones are selected, and a prediction is made by majority vote for classification or by averaging the responses for regression. In weighted variants, nearer neighbors contribute more to the prediction, often using inverse-distance weights.

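A minimal brute-force sketch of this procedure is given below, assuming NumPy and small in-memory arrays; the function name `knn_predict` and its parameters are illustrative rather than taken from any particular library.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=5, weighted=False, regression=False):
    """Illustrative brute-force kNN prediction for a single query point."""
    # Euclidean distances from the query to every training instance.
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest training instances.
    nn = np.argsort(dists)[:k]
    if regression:
        if weighted:
            # Inverse-distance weights; the small epsilon avoids division by zero.
            w = 1.0 / (dists[nn] + 1e-12)
            return np.average(y_train[nn], weights=w)
        return y_train[nn].mean()
    if weighted:
        # Accumulate inverse-distance weight per class label.
        votes = {}
        for i in nn:
            votes[y_train[i]] = votes.get(y_train[i], 0.0) + 1.0 / (dists[i] + 1e-12)
        return max(votes, key=votes.get)
    # Unweighted majority vote over the k nearest labels.
    return Counter(y_train[nn]).most_common(1)[0][0]

# Toy usage: two 2-D classes.
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.2, 0.1]), k=3))  # expected: 0
```
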
Common choices of distance include Euclidean, Manhattan, and other Minkowski metrics; feature scaling is important because kNN relies on distances in the original feature space, so features with large numeric ranges can dominate the metric. Categorical features may require encoding, and high-dimensional data can degrade performance (the curse of dimensionality). The choice of k balances bias and variance: small k is sensitive to noise, while large k can smooth over local structure.

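The effect of scaling can be seen in a small sketch (again assuming NumPy; the `minkowski` helper is illustrative): two features measured on very different scales give distances dominated by the larger-range feature until both are standardized.

```python
import numpy as np

def minkowski(a, b, p=2):
    """Minkowski distance: p=2 is Euclidean, p=1 is Manhattan."""
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

# Two features on very different scales: income (currency units) and age (years).
X = np.array([[50_000.0, 25.0],
              [52_000.0, 60.0],
              [90_000.0, 26.0]])

# Without scaling, the income axis dominates every distance.
print(minkowski(X[0], X[1]), minkowski(X[0], X[2]))

# Standardize each feature to zero mean and unit variance before measuring distance.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
print(minkowski(Xs[0], Xs[1]), minkowski(Xs[0], Xs[2]))
```
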
Computationally, naïve kNN requires O(nd) work per prediction, where n is the number of training examples and d is the feature dimension, since the query must be compared against every stored instance. Spatial indexing structures such as k-d trees and ball trees, as well as approximate nearest-neighbor methods, can reduce this cost.

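As a sketch of the difference, the snippet below compares brute-force search with a k-d tree query using SciPy's `cKDTree` (SciPy is an assumed dependency, not something the text prescribes); both return the same neighbors, but the tree avoids scanning every point on each query.

```python
import numpy as np
from scipy.spatial import cKDTree  # assumed dependency, used only for illustration

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 3))   # n = 10,000 training points, d = 3
query = rng.normal(size=3)

# Brute force: n distance computations of O(d) each, i.e. O(nd) per query.
dists = np.linalg.norm(X - query, axis=1)
brute_idx = np.argsort(dists)[:5]

# k-d tree: build once, then answer each query by visiting only part of the data
# (roughly logarithmic in n for low-dimensional data).
tree = cKDTree(X)
tree_dist, tree_idx = tree.query(query, k=5)

# Both approaches find the same 5 nearest neighbors.
print(sorted(map(int, brute_idx)), sorted(map(int, tree_idx)))
```
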
Applications include image and text classification, recommender systems, anomaly detection, and imputation. Limitations include high memory requirements, sensitivity to irrelevant features, and poorly defined decision boundaries for some problems.

See also: nearest centroid, distance metrics, and kernel methods.
