Similarity score
In sabermetrics and APBRmetrics, similarity scores are a method of comparing baseball and basketball players (usually in MLB or the NBA) to other players, with the intent of discovering which historical players are most similar to a given player.

Similarity scores are among the many original sabermetric concepts first introduced by Bill James. James initially created the concept as a way to compare non-Hall of Fame players with players in the Hall, either to see who was on track to make the Hall of Fame or to determine whether any eligible players had been snubbed by the selection committee. For example, if the players most similar to a non-Hall of Famer were all in the Hall of Fame, one could argue that the player should be in the Hall as well.

More recently, similarity scores have been used to project players' career paths and statistics. The logic is simple: players often follow career trajectories similar to those of their closest comparables, so how the historically similar players performed after the active player's current age should be a good predictor of that player's future production. One example is Football Outsiders' finding that all but the highest-caliber wide receivers suffer a marked decline after their seventh season in the NFL, a pattern that held for the receivers selected in the 1996 NFL Draft, whose production collectively slipped.
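
The projection logic described above can be sketched as a similarity-weighted average of what the closest comparables did next. This is only an illustration under stated assumptions, not the method of any system named in this article: the `Comparable` record, the `project_next_season` function, the choice of weighting by raw similarity, and all of the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Comparable:
    """A historical player judged similar to the active player (hypothetical record)."""
    name: str
    similarity: float         # higher means more similar; the scale is arbitrary
    next_season_value: float  # production in the season after the matching age


def project_next_season(comparables, top_n=3):
    """Similarity-weighted average of what the closest comparables did next."""
    if not comparables:
        raise ValueError("need at least one comparable player")
    # Keep only the top_n most similar historical players.
    closest = sorted(comparables, key=lambda c: c.similarity, reverse=True)[:top_n]
    total_weight = sum(c.similarity for c in closest)
    if total_weight <= 0:
        raise ValueError("similarity weights must sum to a positive number")
    return sum(c.similarity * c.next_season_value for c in closest) / total_weight


# Made-up comparables, for illustration only.
comps = [
    Comparable("Player A", similarity=900.0, next_season_value=20.0),
    Comparable("Player B", similarity=600.0, next_season_value=10.0),
    Comparable("Player C", similarity=100.0, next_season_value=50.0),
]
projection = project_next_season(comps, top_n=2)
```

Restricting the average to the top few comparables keeps weakly similar players, such as the hypothetical Player C here, from pulling the projection toward an unrepresentative outcome.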

Many baseball analysts have augmented James' method over the years or devised their own systems for measuring similarity. Baseball Prospectus employs a projection system developed by Nate Silver, known as PECOTA, which applies nearest neighbor analysis to calculate similarities between players from different eras. Pro Football Prospectus (written by Football Outsiders) has its own system (dubbed "KUBIAK" after longtime Broncos backup quarterback Gary Kubiak) for projecting future performance. John Hollinger developed a similar system for basketball players in his Pro Basketball Forecast series of books, and several APBRmetricians have expanded on his methodology. Similarity scores are also used extensively in many statistical forecasting programs.
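
As a rough illustration of nearest neighbor analysis across eras, the sketch below standardizes each statistic within a player's own era and then ranks other players by Euclidean distance. It is a generic example, not PECOTA or any other system named here, whose details this article does not give. The z-score standardization, the two-statistic profile, the player names, and all of the numbers are assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def standardize_within_era(players):
    """Convert each raw stat to a z-score relative to the player's own era."""
    by_era = {}
    for p in players:
        by_era.setdefault(p["era"], []).append(p)
    z_scores = {}
    for group in by_era.values():
        n_stats = len(group[0]["stats"])
        columns = [[p["stats"][i] for p in group] for i in range(n_stats)]
        mus = [mean(col) for col in columns]
        # Fall back to 1.0 when a stat does not vary, to avoid dividing by zero.
        sigmas = [pstdev(col) or 1.0 for col in columns]
        for p in group:
            z_scores[p["name"]] = [
                (x - m) / s for x, m, s in zip(p["stats"], mus, sigmas)
            ]
    return z_scores


def nearest_neighbors(target, players, k=1):
    """Names of the k players closest to `target` in era-adjusted space."""
    z = standardize_within_era(players)
    others = [name for name in z if name != target]
    return sorted(others, key=lambda name: math.dist(z[target], z[name]))[:k]


# Hypothetical players: stats are (home runs, batting average x 1000).
players = [
    {"name": "Old Star", "era": "high-offense", "stats": [40, 340]},
    {"name": "Old Regular", "era": "high-offense", "stats": [25, 300]},
    {"name": "Old Reserve", "era": "high-offense", "stats": [10, 260]},
    {"name": "Modern Star", "era": "low-offense", "stats": [28, 300]},
    {"name": "Modern Regular", "era": "low-offense", "stats": [18, 270]},
    {"name": "Modern Reserve", "era": "low-offense", "stats": [8, 240]},
]
closest = nearest_neighbors("Modern Star", players, k=1)
```

In this made-up data, the raw line of "Modern Star" looks most like that of "Old Regular", but after era adjustment the nearest neighbor is "Old Star", since both stand equally far above their own era's average.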

External links

  • Baseball Reference, which employs a similarity method much like James' original method
  • Basketball-Reference.com, which features a complex similarity-score system for NBA players
  • Football Outsiders
  • Baseball Prospectus, which uses similarity scores in PECOTA that are calculated in a way that differs significantly from James' method.
  • Ken Pomeroy of Basketball Prospectus, who uses similarity scores for college basketball players.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 