Update on the Historical OIS Data I've Been Pulling Together


Roughly 2 years ago I started building a database of fatal officer-involved shootings (OIS) at the agency-year level, going back in time as far as possible. If you’re not familiar with the OIS data landscape, here’s the gist:

The former go back many decades, the latter only to 2013 or so (2000 in the case of Fatal Encounters, though the early 2000s data are likely less reliable than ~2013 onward).

What this means is that, unfortunately, we just don’t really know how current OIS trends square with more historical data, as we do in the case of crime data. That doesn’t stop researchers from trying, though. And of course there have been headlines the last few Januarys about how the year before was a “record high” for police killings. The headlines rarely convey that the “record” only goes back about 12 years. So, using Washington Post (WAPO) data, we can visualize trends like this, going back to 2015:

Or like this:

But if we want to go further back in time than that, we can’t, at least not with good national-level data. We can with a few of the largest agencies, though:

But in the end, we just don’t know how current national trends compare to historical national trends.

So if you check the ReadMe file on my GitHub repo, you can catch up with what I set out to do 2 years ago. I started with Geller & Scott’s book, Deadly Force: What We Know, which includes historical OIS data (i.e., as far back as 1970 in some cases) for 11 major cities. Then I tried to track down data for the ~100 largest U.S. cities. Usually this involved: (1) checking their websites for open data or annual reports, (2) looking for other published reports (e.g., by academics, Attorneys General, or groups like the ACLU), and (3) looking for other data compiled by journalists, often at the city (e.g., Texas Tribune) or state level (e.g., Tampa Bay Times, Salt Lake Tribune, Honolulu Civil Beat). I didn’t have much luck finding data from before 2000, but I did end up with at least some pre-2015 data for 417 agencies.

This past week I made some updates to the data that I hope will make it more useful to anyone who might be interested. Originally, I structured the dataset so that the first column was a “year” variable that spanned 53 years, 1970 to 2022. Each subsequent column was an agency, following a “jurisdiction_state” naming convention. This obviously isn’t useful for much, and it’s quite messy. So I wrote an R script that does the following:

  1. Imports WAPO’s incident-level and agency-level datasets, and creates a new dataset at the agency-year level for all agencies that WAPO says were involved in at least one shooting from 2015 to 2024 (3,211 agencies x 10 years = 32,110 rows).

  2. Imports my dataset with 417 agencies and reshapes it into agency-year format (417 agencies x 53 years = 22,101 rows).

  3. Assigns ORI codes to almost all of the 417 agencies in my dataset (there were a few I couldn’t track down - I’ll keep working on that).

  4. Merges my dataset with the WAPO agency-year dataset (from Step 1) on oricodes (this ends up dropping some federal agencies in the WAPO data).
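The steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the R script in the repository. The toy records, the agency column names, the ORI codes, and the field names `ori` and `year` are all made up for the example. Keeping unmatched rows from both sides (a full outer join on ORI code and year) is also an assumption, since the join type isn't specified above.

```python
from collections import Counter

# Step 1 (sketch): collapse incident-level records into agency-year counts,
# filling in zero for agency-years with no incidents. Records are hypothetical.
wapo_incidents = [
    {"ori": "AAA0001", "year": 2015},
    {"ori": "AAA0001", "year": 2015},
    {"ori": "AAA0001", "year": 2016},
    {"ori": "BBB0002", "year": 2016},
]
incident_counts = Counter((r["ori"], r["year"]) for r in wapo_incidents)
wapo_years = range(2015, 2017)  # toy window; the post uses 2015 to 2024
wapo_agencies = sorted({r["ori"] for r in wapo_incidents})
wapo_long = {
    (ori, year): incident_counts[(ori, year)]
    for ori in wapo_agencies
    for year in wapo_years
}

# Step 2 (sketch): reshape a wide table (one row per year, one column per
# agency) into long agency-year rows. None marks a year with no data found.
wide_rows = [
    {"year": 2014, "springfield_xx": 3, "shelbyville_xx": None},
    {"year": 2015, "springfield_xx": 1, "shelbyville_xx": 0},
]

def wide_to_long(rows):
    """Return one dict per agency-year from rows keyed by year."""
    long_rows = []
    for row in rows:
        for agency, count in row.items():
            if agency == "year":
                continue
            long_rows.append(
                {"agency": agency, "year": row["year"], "total_shootings_nix": count}
            )
    return long_rows

nix_long = wide_to_long(wide_rows)

# Step 3 (sketch): hypothetical lookup from agency column name to ORI code.
ori_lookup = {"springfield_xx": "AAA0001", "shelbyville_xx": "BBB0002"}

# Step 4 (sketch): merge on (ORI code, year), keeping rows from both sides.
merged = {}
for row in nix_long:
    ori = ori_lookup.get(row["agency"])
    if ori is None:
        continue  # agencies without an ORI code cannot be matched
    merged[(ori, row["year"])] = {
        "total_shootings_nix": row["total_shootings_nix"],
        "total_shootings_wapo": None,
    }
for key, count in wapo_long.items():
    empty = {"total_shootings_nix": None, "total_shootings_wapo": None}
    merged.setdefault(key, empty)["total_shootings_wapo"] = count

for key in sorted(merged):
    print(key, merged[key])
```

In this toy example, the two sources only overlap in 2015, which is the only year where the two counts can be compared side by side.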

Going forward, this will keep my dataset up-to-date as long as WAPO keeps updating their files. In some cases, it also allows me to see how well data I collected from other sources compares to WAPO’s data (e.g., comparing total_shootings_nix to total_shootings_wapo). And having the file in agency-year format will, I hope, make it easier to work with. For example, I created the following plot to try to convey agency-level trends over time while also being transparent about how much smaller the sample gets the further back in time we go. The solid lines represent the average number of fatal OIS observed among that year’s sample of agencies, while the lower/upper bounds of the shaded regions represent the minimum/maximum observed number of fatal OIS among agencies in that year’s sample. So in 1971, I found data for four agencies: NYPD (93 fatal OIS), Philadelphia PD (10), Indianapolis PD (2), and Dallas PD (6). The average for those 4 agencies is about 28 (111 / 4 = 27.75) - that’s what the solid line represents.

Right-click the figure and open it in a new tab if it’s too small.

[Figure: fois_trends]

So I’ve updated my GitHub repository to include the ORI codes I dug up and the R script I wrote to merge my data with WAPO’s. At the end of the R script you should also be able to recreate the figure above.
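For anyone who wants to see the per-year summary behind the figure without opening the R script, here is a small sketch in plain Python. It is not the code from the repository. The row layout and the names `fatal_ois` and `summarize_by_year` are assumptions for illustration. The four 1971 counts are the ones reported above.

```python
from statistics import mean

# Agency-year rows for 1971, using the four counts reported in the post.
# The dict layout and field names are illustrative assumptions.
rows = [
    {"agency": "NYPD", "year": 1971, "fatal_ois": 93},
    {"agency": "Philadelphia PD", "year": 1971, "fatal_ois": 10},
    {"agency": "Indianapolis PD", "year": 1971, "fatal_ois": 2},
    {"agency": "Dallas PD", "year": 1971, "fatal_ois": 6},
]

def summarize_by_year(rows):
    """Return sample size, mean, min, and max of fatal OIS for each year."""
    counts_by_year = {}
    for row in rows:
        if row["fatal_ois"] is None:
            continue  # skip agency-years with no data
        counts_by_year.setdefault(row["year"], []).append(row["fatal_ois"])
    return {
        year: {
            "n_agencies": len(counts),
            "mean": mean(counts),  # the solid line
            "min": min(counts),    # lower bound of the shaded region
            "max": max(counts),    # upper bound of the shaded region
        }
        for year, counts in sorted(counts_by_year.items())
    }

summary = summarize_by_year(rows)
print(summary[1971])
```

For 1971 this gives a mean of 27.75 across 4 agencies, with a minimum of 2 and a maximum of 93. Reporting `n_agencies` alongside the mean is one way to keep the shrinking sample visible in the earlier years.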

This remains a work in progress, and any mistakes are mine. If you do spot anything, please let me know and I can make any necessary fixes.

Justin Nix
Associate Professor of Criminology and Criminal Justice

My research interests include police legitimacy, procedural justice, and officer-involved shootings.
