Resistance Lab Datasets

This repository contains both the datasets we have collected and the scripts used to collect them.

Layout of this repository

Main folders: these contain the source data, a cleaned-up version of it, and a folder for each distinct output.

  • source_data: raw data sources from government and other state agency websites
  • cleaned_data: source files tidied into cleaner formats for easier comparison
  • analysis: a set of folders with workspace environments for each specific output
    • analysis/0001-use-of-force: one directory per use case; the first is shown here as an example.
      • analysis/0001-use-of-force/README.md: Description of this analysis, where it’s used, author info, etc.
      • analysis/0001-use-of-force/Makefile: A makefile for generating this analysis (run make)
      • analysis/0001-use-of-force/force-mappings.csv: Mappings to make source metadata more descriptive and easier to read
      • analysis/0001-use-of-force/use-of-force.py: Script to create the outputs in this directory
      • analysis/0001-use-of-force/**/*: Outputs generated by script
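
As a rough illustration of how a mappings file such as force-mappings.csv might be applied, here is a minimal sketch. The column names (code, description), the sample values, and both helper functions are assumptions made for this example; they are not taken from use-of-force.py.

```python
import csv
import io


def load_mappings(csv_text):
    """Read a two-column mapping table into a dict keyed by source code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # "code" and "description" are assumed column names, not the real schema.
    return {row["code"]: row["description"] for row in reader}


def apply_mappings(rows, column, mappings):
    """Replace terse codes in `column` with descriptions; unknown codes are kept as-is."""
    return [{**row, column: mappings.get(row[column], row[column])} for row in rows]


# Hypothetical sample data, for illustration only.
MAPPINGS_CSV = "code,description\nA,Example category A\nB,Example category B\n"
records = [{"force": "A", "count": "3"}, {"force": "ZZ", "count": "1"}]

mapped = apply_mappings(records, "force", load_mappings(MAPPINGS_CSV))
print(mapped)
```

Leaving unmapped codes unchanged, rather than raising an error, is one possible design choice: it keeps the output complete while making gaps in the mappings file easy to spot.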

Utility folders

  • .github: actions to test the pipelines
  • bibliography: BibTeX files
  • pipelines: scripts that populate the source directory and clean the data (run make pipelines to run them all)
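
To give a feel for what a cleaning step between source_data and cleaned_data could look like, here is a small sketch using only the standard library. The function names, the tidying rules, and the sample rows are hypothetical; the real pipelines may work quite differently.

```python
import csv
import io
import re


def tidy_header(name):
    """Lower-case a header and collapse runs of other characters into underscores."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def clean_csv(raw_text):
    """Return cleaned CSV text: tidy headers, trimmed cells, blank rows dropped."""
    reader = csv.reader(io.StringIO(raw_text))
    header = [tidy_header(h) for h in next(reader)]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in reader:
        cells = [cell.strip() for cell in row]
        if any(cells):  # skip rows where every cell is empty
            writer.writerow(cells)
    return out.getvalue()


# Hypothetical raw input, for illustration only.
RAW = "Police Force , Financial Year\n Example Force ,2019/20\n,\n"
cleaned = clean_csv(RAW)
print(cleaned)
```

In a real pipeline the input would be read from a file under source_data and the result written under cleaned_data; strings are used here so the sketch runs on its own.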

Contributing

If you have relevant datasets then we would like to include them here. We expect datasets to:

  • Be automated where possible, with a script in the scripts directory
  • Come with Great Expectations test suites
  • Be well documented with README files

Feel free to open a ticket or email kim@resistancelab.network with any questions.

Testing

Tests are written with Great Expectations, which requires a recent version of Python. The remaining dependencies can then be installed as follows:

  1. Run python3 -m venv venv && source venv/bin/activate to create a virtual environment
  2. Install the dependencies with pip3 install -r requirements.txt
  3. Run great_expectations init to create any missing directories

To create a test suite for your new dataset run great_expectations suite new

To edit a test suite, run great_expectations suite edit police-population.warning

To run the tests and show the results run great_expectations docs build