CrimeStat
CrimeStat is a Windows-based spatial statistics program that conducts spatial and statistical analysis and is designed to interface with a geographic information system (GIS). The program is developed by Ned Levine & Associates, with funding from the National Institute of Justice (NIJ). The program and manual are distributed for free by NIJ.
CrimeStat performs spatial analysis on objects located in a GIS. The objects can be points (e.g., events, locations), zones (e.g., blocks, traffic analysis zones, cities) or lines (e.g., street segments). The program can analyze the distribution of the objects, identify hot spots, indicate spatial autocorrelation, monitor the interaction of events in space and time, and model travel behavior. There is a regression module for non-linear spatial modeling. Some of its tools are specific to crime analysis; others can be applied in many fields. There are 55 statistical routines in the program.
History
CrimeStat has been developed since the mid-1990s. The first prototype was a Unix-based C++ program called Pointstat that was developed to analyze motor vehicle crashes in Honolulu. In 1996, the National Institute of Justice funded the first version of CrimeStat and the early Pointstat routines were folded into the program. The release history is:
- Version 1.0 (August 1999)
- Release 1.1 (July 2000)
- Version 2.0 (May 2002)
- Version 3.0 (November 2004)
- Release 3.1 (March 2007)
- Release 3.2 (October 2009)
- Release 3.3 (July 2010)
Functionality
The program is divided into three main parts:
- Data Setup
- Statistical Routines
- Output
Data Setup
CrimeStat can read both attribute and GIS files but requires that all datasets have geographical coordinates assigned for the objects. The basic file format is dBase (dbf), but shape (shp) and ASCII text files can also be read. The program requires a Primary File, and many routines also use a Secondary File. CrimeStat uses three coordinate systems:
- Spherical (longitude, latitude)
- Projected
- Directional (angles)
Distance can be measured as direct, indirect (Manhattan) or on a network (which also allows travel time or speed to be used). Distance units are decimal degrees for spherical coordinates and feet, meters, miles, kilometers, or nautical miles for projected coordinates. The program can create reference grids. Several routines also use the area of the geographical region for their calculations.
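The difference between direct and indirect (Manhattan) distance can be illustrated with a minimal Python sketch. It is not CrimeStat's own code: the function names are hypothetical, and the haversine formula and the mean Earth radius used for spherical coordinates are assumptions of this example, since the formula the program applies is not stated here.

```python
import math

# Mean Earth radius in kilometers; an assumption of this sketch.
EARTH_RADIUS_KM = 6371.0088


def direct_distance(x1, y1, x2, y2):
    """Straight-line (Euclidean) distance between two projected points."""
    return math.hypot(x2 - x1, y2 - y1)


def indirect_distance(x1, y1, x2, y2):
    """Manhattan distance: the sum of the absolute coordinate differences."""
    return abs(x2 - x1) + abs(y2 - y1)


def great_circle_km(lon1, lat1, lon2, lat2):
    """Haversine great-circle distance for coordinates in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

For two projected points three units apart on one axis and four on the other, the direct distance is 5 and the indirect distance is 7, both in the units of the projected coordinates.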
Statistical Routines
The statistical routines are grouped into three categories:
- Spatial description
- Spatial modeling
- Crime travel demand
The spatial description routines include:
- Spatial distribution statistics (mean center, standard deviation ellipse, center of minimum distance, median center, directional mean, convex hull)
- Spatial autocorrelation statistics for zonal data (Moran’s “I”, Geary’s “C”, Getis-Ord Global “G”, Moran correlogram, Geary correlogram, Getis-Ord correlogram)
- Distance-based statistics among points (nearest neighbor analysis, Ripley’s “K”, the allocation and summation of Primary File points to Secondary File points, and various distance calculation matrices)
- Hot spot statistics for points, zones or lines. CrimeStat has eight routines for hot spot identification:
  - Mode (points and lines)
  - Fuzzy mode (points and lines)
  - Nearest neighbor hierarchical clustering (Nnh; points and lines)
  - Risk-adjusted nearest neighbor hierarchical clustering (Rnnh; points and lines)
  - The Spatial and Temporal Analysis of Crime routine (STAC; points and lines)
  - K-means clustering (points and lines)
  - Anselin’s Local Moran (zones)
  - Local Getis-Ord “G” (zones)
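Two of the simpler statistics in this list, the mean center and nearest neighbor analysis, can be sketched in a few lines of Python. The sketch is illustrative only and is not taken from CrimeStat: the function names are hypothetical, and the expected distance uses the standard formula for a random point pattern, 0.5 / sqrt(n / area), without any edge correction.

```python
import math


def mean_center(points):
    """Arithmetic mean of the X and Y coordinates of a set of points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def mean_nearest_neighbor_distance(points):
    """Average distance from each point to its closest other point."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)


def nearest_neighbor_index(points, area):
    """Ratio of the observed mean nearest neighbor distance to the
    distance expected for a random pattern of the same density."""
    observed = mean_nearest_neighbor_distance(points)
    expected = 0.5 / math.sqrt(len(points) / area)
    return observed / expected


incidents = [(0, 0), (1, 0), (0, 1), (1, 1)]
center = mean_center(incidents)
index = nearest_neighbor_index(incidents, area=4.0)
```

An index below 1 suggests that points are closer together than a random pattern would produce (clustering), while an index above 1 suggests dispersion. The index depends on the area supplied, which is one reason the area of the study region matters to such routines.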
Monte Carlo simulations can be run on many routines to estimate credible intervals. The Nnh, Rnnh, STAC and K-means routines allow the hot spots to be output as either convex hulls or ellipses.
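The general idea of a Monte Carlo simulation for a spatial statistic can be sketched as follows. This is a simplified illustration under stated assumptions, not CrimeStat's procedure: points are scattered uniformly over a rectangle, the statistic is the mean nearest neighbor distance, and the function names, the default of 200 simulations and the percentile rule are all choices made for this example.

```python
import math
import random


def mean_nn_distance(points):
    """Average distance from each point to its closest other point."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)


def simulate_interval(n_points, width, height, n_sims=200,
                      lower=2.5, upper=97.5, seed=0):
    """Approximate interval for the statistic under a random pattern.

    Each simulation scatters n_points uniformly over a width-by-height
    rectangle and records the statistic; the interval is read from the
    sorted simulated values at the requested percentiles.
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        pts = [(rng.uniform(0, width), rng.uniform(0, height))
               for _ in range(n_points)]
        sims.append(mean_nn_distance(pts))
    sims.sort()

    def percentile(p):
        # Nearest-rank style lookup, clamped to the valid index range.
        idx = round(p / 100 * (len(sims) - 1))
        return sims[min(len(sims) - 1, max(0, idx))]

    return percentile(lower), percentile(upper)


low, high = simulate_interval(n_points=30, width=10.0, height=10.0,
                              n_sims=100, seed=1)
```

An observed value that falls outside the simulated interval would be unusual for a random arrangement of the same number of points in the same region.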
The spatial modeling routines include:
- Single kernel density interpolation for examining variation over a region of a single variable
- Dual kernel density interpolation of two variables (e.g., a set of events in relation to a population ‘at risk’)
- Head Bang routine for smoothing zonal data
- Interpolated Head Bang surface that interpolates the Head Bang estimates to a grid
- Knox and Mantel indexes that identify the interaction between space and time in events
- Correlated Walk Analysis, based on random walk theory, which models the sequential behavior of a serial offender in space and time and makes a prediction about the next event
- Journey-to-crime analysis for modeling the likely origin of a serial offender based on the location of prior events committed by the offender (geographic profiling)
- Bayesian Journey-to-crime, an empirical Bayes method that integrates the Journey-to-crime estimate with information on the residence location of other serial offenders who committed crimes in the same places to produce an updated estimate. The diagnostic routine compares this estimate with its components in predicting the residence location for multiple serial offenders
- Bayesian Journey-to-crime estimation, which applies the Bayesian Journey-to-crime method to estimate the location of one serial offender
- Spatial regression. The models include Ordinary Least Squares and Poisson-based models, the latter of which are more appropriate for skewed data such as crime. Currently, the Poisson-based models are MLE Poisson, MLE Poisson with Linear Correction (NB1), MLE Poisson with Gamma dispersion (negative binomial), MCMC Poisson-Gamma, and MCMC Poisson-Gamma with a conditional spatial autoregression (CAR) adjustment.
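Single kernel density interpolation, the first routine in this list, can be illustrated with a short Python sketch. It assumes a normal (Gaussian) kernel with a fixed bandwidth evaluated at the center of each cell of a reference grid; the function name and parameters are hypothetical, and the kernel functions and bandwidth options CrimeStat actually offers are not described here.

```python
import math


def kernel_density_grid(points, x_min, y_min, cell_size,
                        n_cols, n_rows, bandwidth):
    """Estimate a density surface on a regular grid.

    Every point contributes a bivariate normal kernel with the given
    bandwidth; the value of a cell is the sum of all contributions at
    the cell center. Returns a list of rows, each a list of densities.
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    grid = []
    for row in range(n_rows):
        cy = y_min + (row + 0.5) * cell_size
        grid_row = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size
            density = 0.0
            for px, py in points:
                d2 = (cx - px) ** 2 + (cy - py) ** 2
                density += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
            grid_row.append(density)
        grid.append(grid_row)
    return grid


surface = kernel_density_grid([(0.5, 0.5)], x_min=0.0, y_min=0.0,
                              cell_size=1.0, n_cols=2, n_rows=2,
                              bandwidth=1.0)
```

The cell that contains the point receives the highest value and the estimate falls off with distance. A dual kernel estimate relates two such surfaces, for example events and a population at risk, cell by cell; how the two surfaces are combined is a further choice that this sketch leaves open.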
The Crime Travel Demand module models crime travel over a metropolitan area. It is an application of travel demand modeling to crime or other rare events. The purpose is to calibrate the travel behavior of a large number of offenders in committing crimes as a basis for modeling alternative interventions by law enforcement. The data required must include a large number of events where both the crime location and the origin (residence) location of the offenders are known and are allocated to zones (e.g., blocks, traffic analysis zones).
All the routines in CrimeStat can be estimated either with distance (direct or indirect/Manhattan) or on a network using actual impedance estimates (e.g., travel time or speed).
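The data requirement described above, events with a known origin zone and crime zone, amounts to an origin-destination trip matrix. The following Python sketch shows one way such a matrix could be tallied; the zone identifiers and function names are hypothetical, and it illustrates only the data structure, not the calibration steps of the Crime Travel Demand module.

```python
from collections import Counter


def trip_matrix(events):
    """Count crime trips for each (origin zone, destination zone) pair.

    `events` is an iterable of (origin_zone, crime_zone) tuples, one
    per incident with a known offender residence.
    """
    return Counter((origin, destination) for origin, destination in events)


def zone_totals(matrix):
    """Total trips leaving each origin zone and arriving in each
    destination zone."""
    origins, destinations = Counter(), Counter()
    for (origin, destination), trips in matrix.items():
        origins[origin] += trips
        destinations[destination] += trips
    return origins, destinations


# Hypothetical zone labels for illustration.
events = [("A", "B"), ("A", "B"), ("B", "B"), ("C", "A")]
matrix = trip_matrix(events)
origins, destinations = zone_totals(matrix)
```

In this toy example, zone A is the origin of two trips and zone B is the destination of three, including one trip that starts and ends in zone B.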
Output
CrimeStat has three different types of output:
- Screen output that displays the results once the calculations are finished. These can be saved to a text file.
- Non-graphical output for many routines in either dBase DBF or ASCII text format.
- Graphical output for many routines to allow the calculated objects to be displayed in a GIS. Currently, the graphical output formats include Esri SHP, MapInfo MIF/MID, Surfer for Windows DAT, and ASCII text formats.
Manual
All the routines are documented and illustrated in a manual. The current version (3.3) includes 17 chapters from version 3.0 and two update chapters. As with the program, the manual is distributed for free at http://www.icpsr.umich.edu/CrimeStat.
Shortcomings
Unlike some other spatial statistics programs, CrimeStat has no mapping capabilities and must be used with GIS software. Some users have found that the graphical user interface is difficult to understand and inconsistent between routines. Because CrimeStat analyzes points in most routines, its results are not always consistent with those of software that analyzes areas (e.g., GeoDa). Finally, while the manual is comprehensive and well written, its size may be daunting to new users of spatial statistics.
Ancillary CrimeStat Development
In addition to the development of the CrimeStat program, all the routines through version 2.0 plus the spatial autocorrelation routines have been converted into .NET libraries for use in third-party applications. Version 1.0 of the CrimeStat Libraries was released in August 2010 and is available on the CrimeStat web page.
NIJ has also run CrimeStat training courses for crime analysts. These are done on a periodic basis. At the NIJ Crime Mapping Research Conferences, held approximately every year and a half, workshops are conducted on various CrimeStat topics.
In addition to the CrimeStat program, NIJ has sponsored the development of a CrimeStat III User Workbook for crime analysts http://www.icpsr.umich.edu/CrimeStat/workbook.html and is currently developing a CrimeStat Analyst program that implements the most basic CrimeStat routines.
Reviews and Examples
Reviews and examples of CrimeStat cover its application to crime analysis as well as its use outside of crime analysis.
Use of CrimeStat by Baltimore County Police Analysts
Baltimore County Police analysts use CrimeStat to perform various spatial analyses. The primary responsibility of police analysts in Baltimore County is to identify and address existing or anticipated crime problems. Police analysts use “hot spot analysis” in CrimeStat to identify areas within the county having high concentrations of crime.
Another example involves the department’s Data Driven Approaches to Crime and Traffic Safety (DDACTS) program. Police analysts used Nearest Neighbor Hierarchical Spatial clustering to identify areas having high concentrations of crime and traffic accidents. Analysts found that the two cluster groups, crime and accidents, tended to overlap in many areas of the county. The County’s DDACTS program was initiated to increase police presence in the target areas. Preliminary results have been encouraging, with most targeted crimes and traffic accidents dropping in DDACTS areas. The Department’s DDACTS program has since become a model nationwide with the support of the National Highway Traffic Safety Administration.
Finally, police analysts have used CrimeStat’s Journey to Crime and Bayesian Journey to Crime Estimation models to identify a serial offender’s activity space. Once an offender’s activity space has been identified, police analysts examine information captured from other police sources such as traffic stops, Field Interview Reports, and License Plate Readers to determine whether contact was made with a potential offender. Police have also used CrimeStat’s Crime Travel Demand model to identify road networks used by drivers under the influence (DUI). Roadways identified by the Crime Travel Demand model were targeted for interdiction programs by the department’s DUI Enforcement Team. Similar weighted road networks have been used in conjunction with Journey to Crime models to improve identification of an offender’s activity space.
Further Reading
- Levine, N. (2008). “CrimeStat: a spatial statistical program for the analysis of crime incidents”. Shekhar, S. and Xiong, H. (eds), Encyclopedia of Geographic Information Science. Springer. 187-193.
- Levine, N. (2006). “Crime mapping and the CrimeStat program”. Geographical Analysis. 38 (1), 41-55.