The K-Nearest Neighbors algorithm (or kNN) can be used to solve both classification and regression problems. Simple and easy-to-implement, this supervised machine learning algorithm is one of the most widely used, from genetics to Netflix recommendations.
The kNN algorithm classifies a data point according to the classifications of its neighbors. It assumes that similar things exist in close proximity, and it is useful only to the extent that this assumption holds.
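To make that intuition concrete, here is a minimal sketch of the neighbor vote in Python. The toy points, labels, the Euclidean distance, and the choice of k = 3 are illustrative assumptions, not part of any particular library or of the original method's specification.

```python
from collections import Counter
import math


def knn_classify(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors and return the most common label among them.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, made-up data: two small clusters of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_classify(points, labels, query=(1.1, 0.9), k=3))  # prints "A"
```

In practice, k is usually a small odd number (to reduce the chance of tied votes), and the distance metric is chosen to suit the data.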
The origins of kNN can be found in research carried out for the U.S. armed forces. Evelyn Fix (1904–1965) was a mathematician and statistician who taught statistics at Berkeley. Joseph Lawson Hodges Jr. (1922–2000) was a fellow statistician at Berkeley who had worked with the Twentieth Air Force of the U.S. Army Air Forces from 1944. Putting their brilliant minds together, they wrote a technical report for the United States Air Force (USAF) in 1951. It introduced a non-parametric method of discriminant analysis. However, the paper was never officially published, most likely for confidentiality reasons in the wake of World War II.
Championing Classification
This was the first known introduction of non-parametric classification, and the method would later become the famous kNN algorithm. In 1967, Thomas Cover and Peter Hart proved an upper bound on the error rate of multiclass kNN classification. Since then, many improvements have been made and new approaches continue to emerge.
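For reference, the bound they proved is usually stated in terms of the Bayes error rate. The form below is the commonly cited version for an M-class problem, given here as a paraphrase rather than a quotation of the paper, where R is the asymptotic error rate of the nearest neighbor rule and R* is the Bayes (optimal) error rate.

```latex
R^{*} \le R \le R^{*}\left(2 - \frac{M}{M-1}\,R^{*}\right)
```

In other words, with unlimited data, the simple nearest neighbor rule makes at most roughly twice as many errors as the best possible classifier.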
“Birds of a feather flock together.”
Today, the kNN algorithm is often used as a benchmark for more complex classifiers, such as Support Vector Machines (SVMs) and Artificial Neural Networks (ANNs). Despite its simplicity, kNN fares better than many more powerful classifiers. It proves useful in areas ranging from genetics to data compression and economic forecasting. The kNN algorithm is also used in video recognition, image recognition, handwriting recognition, and speech recognition, as well as in the reading and viewing recommendations that companies like Amazon and Netflix make for their customers.
Key Dates
- 1951: Non-Parametric Pattern Classification
Non-parametric pattern classification, which would later become kNN, is first introduced in an unpublished technical report for the United States Air Force.
- 1967: Nearest Neighbor Pattern Classification
Thomas Cover and Peter Hart publish “Nearest Neighbor Pattern Classification,” which proves an upper bound on the error rate of multiclass kNN classification. The paper paves the way for new studies on rejection.
- 1985: Fuzzy kNN
James Keller develops a “fuzzy” version of the kNN algorithm. With a lower error rate, it compares well against more sophisticated pattern recognition procedures.