The NYPD’s Stop-and-Frisk policy has been controversial, to say the least. Proponents of the policy point to data showing reduced crime rates since the United States Supreme Court established the legal basis for the practice in Terry v. Ohio (1968).

But then there are things like this ...

Process and Analysis

Curious to analyze the data for myself, I took Stop-and-Frisk data from the New York Civil Liberties Union and started a cluster analysis.

I started by finding the minimum and maximum numbers of frisks, searches, and arrests for each race. After describing the variables individually, I analyzed them in pairs to look for interesting correlations. With the one- and two-dimensional analysis complete, I brushed the data in GGobi and clustered it. Once I had found patterns, I visualized the data in Processing, building classifiers that could then be used to predict categorical outcomes.
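The first two steps above (per-group minimum and maximum, then pairwise correlation) can be sketched roughly as follows. This is only an illustration in plain Python, not the GGobi and Processing workflow I actually used. The group labels, the row layout, and every number in `ROWS` are invented placeholders, not NYCLU figures, and the helper names `min_max_by_group` and `pearson` are hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Placeholder rows: (group, frisks, searches, arrests) for a hypothetical precinct.
# These values are invented for illustration and are NOT NYCLU figures.
ROWS = [
    ("group_a", 120, 30, 12),
    ("group_a", 200, 45, 18),
    ("group_a", 90, 20, 10),
    ("group_b", 15, 5, 4),
    ("group_b", 25, 8, 6),
    ("group_b", 10, 3, 3),
]
FIELDS = ("frisks", "searches", "arrests")


def min_max_by_group(rows):
    """One-dimensional step: (min, max) of each field within each group."""
    columns = defaultdict(lambda: defaultdict(list))
    for group, *values in rows:
        for field, value in zip(FIELDS, values):
            columns[group][field].append(value)
    return {
        group: {field: (min(vals), max(vals)) for field, vals in fields.items()}
        for group, fields in columns.items()
    }


def pearson(xs, ys):
    """Two-dimensional step: Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


summary = min_max_by_group(ROWS)
frisks = [row[1] for row in ROWS]
arrests = [row[3] for row in ROWS]
r_frisk_arrest = pearson(frisks, arrests)

print(summary["group_a"]["frisks"])   # (min, max) frisks for the placeholder group
print(round(r_frisk_arrest, 3))       # correlation between frisks and arrests
```

Running the same two helpers over every pair of variables gives a quick picture of where to look before brushing and clustering.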

The visualization I produced stands out for its use of negative space to show the difference between racial clusters. In the cluster for Blacks, the negative space between frisks and arrests is large. In the cluster for Asians, the difference between frisks and actual arrests is nearly nonexistent by comparison. This difference was statistically significant and is evidence that racial profiling plays a huge role in the NYPD’s Stop-and-Frisk actions.
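The quantity that the negative space encodes is the gap between frisks and arrests within each cluster. A minimal sketch of that calculation is below. The totals are invented placeholders rather than NYCLU figures, the function name `frisk_arrest_gap` is hypothetical, and the sketch does not reproduce the significance test.

```python
# Placeholder totals per group; invented for illustration, NOT NYCLU figures.
TOTALS = {
    "group_a": {"frisks": 410, "arrests": 40},
    "group_b": {"frisks": 50, "arrests": 13},
}


def frisk_arrest_gap(totals):
    """Per group: absolute gap and the share of frisks that did not end in arrest."""
    result = {}
    for group, counts in totals.items():
        gap = counts["frisks"] - counts["arrests"]
        # Guard against an empty group so the share is always defined.
        share = gap / counts["frisks"] if counts["frisks"] else 0.0
        result[group] = {"gap": gap, "gap_share": share}
    return result


gaps = frisk_arrest_gap(TOTALS)
for group, values in gaps.items():
    print(group, values["gap"], round(values["gap_share"], 2))
```

Comparing `gap_share` rather than the raw `gap` keeps groups with very different frisk counts on the same scale.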


Awarded the Visualized/Tableau Data Communications Fellowship

Acknowledged as a “future innovator of data communication for making otherwise complex data easy to understand for the masses”.

Selection was based on the following three criteria.

• social importance of the information
• complexity of the data communicated
• overall clarity of the data communication 

My final data visualization focused on the NYPD’s controversial Stop-and-Frisk practices and made their inefficiency and potentially habitual racial profiling more transparent.