Resistance Lab Datasets
This repository contains both the datasets we have collected and the scripts used to collect them.
Layout of this repository
The main folders hold the raw source data, a cleaned-up version of it, and one folder for each distinct output.
- source_data: raw data sources from government and other state agency websites
- cleaned_data: source files tidied into cleaner formats for easier comparison
- analysis: a set of folders with workspace environments for each specific output
analysis/0001-use-of-force: One directory per use case; this first one is given as an example.
analysis/0001-use-of-force/README.md: Description of this analysis, where it’s used, author info, etc.
analysis/0001-use-of-force/Makefile: A makefile for generating this analysis (
analysis/0001-use-of-force/force-mappings.csv: Mappings to make source metadata more descriptive and easier to read
analysis/0001-use-of-force/use-of-force.py: Script to create the outputs in this directory
analysis/0001-use-of-force/**/*: Outputs generated by script
- .github: actions to test the pipelines
- bibliography: BibTeX files
- pipelines: populates source directory, cleans data (run make pipelines to run them all)
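The force-mappings.csv entry above describes mappings that make source metadata more descriptive. A minimal sketch of how such a mapping file might be applied is shown here. The column names (code, description, force_code) and the sample rows are hypothetical, and the real use-of-force.py may work differently.

```python
import csv
import io

# Hypothetical mapping file contents: the real force-mappings.csv may use
# different column names and values.
MAPPINGS_CSV = """code,description
1,Example tactic A
2,Example tactic B
"""

# Hypothetical source rows that use terse codes.
SOURCE_CSV = """incident_id,force_code
A100,1
A101,2
A102,9
"""


def load_mappings(handle):
    """Return a dict from source code to readable description."""
    return {row["code"]: row["description"] for row in csv.DictReader(handle)}


def apply_mappings(rows, mappings, column):
    """Yield copies of rows with `column` replaced by its description.

    Codes that have no mapping are left unchanged.
    """
    for row in rows:
        updated = dict(row)
        updated[column] = mappings.get(row[column], row[column])
        yield updated


mappings = load_mappings(io.StringIO(MAPPINGS_CSV))
source_rows = csv.DictReader(io.StringIO(SOURCE_CSV))
readable = list(apply_mappings(source_rows, mappings, "force_code"))

for row in readable:
    print(row["incident_id"], row["force_code"])
```

Keeping the mappings in a CSV rather than in the script means they can be reviewed and edited without touching code.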
If you have relevant datasets then we would like to include them here. We expect datasets to:
- Be automated where possible, with a script in the
- Come with Great Expectations test suites
- Be well documented with README files
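As an illustration of the first expectation, an automated cleaning step could look roughly like the sketch below. It is not taken from this repository: the file name example.csv, the function clean_csv, and the cleaning rules are all assumptions, and the demonstration runs in a temporary directory rather than the real source_data and cleaned_data folders.

```python
import csv
import tempfile
from pathlib import Path


def clean_csv(source_path, cleaned_path):
    """Copy a CSV, normalising header names and trimming whitespace in cells.

    Returns the number of data rows written. Rows whose cells are all blank
    are dropped.
    """
    with open(source_path, newline="", encoding="utf-8") as src:
        rows = list(csv.reader(src))
    if not rows:
        raise ValueError(f"{source_path} is empty")

    header = [name.strip().lower().replace(" ", "_") for name in rows[0]]
    body = [
        [cell.strip() for cell in row]
        for row in rows[1:]
        if any(cell.strip() for cell in row)
    ]

    cleaned_path.parent.mkdir(parents=True, exist_ok=True)
    with open(cleaned_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(body)
    return len(body)


# Demonstration in a temporary directory so nothing in the repository is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "source_data" / "example.csv"  # hypothetical file name
    cleaned = root / "cleaned_data" / "example.csv"
    source.parent.mkdir(parents=True)
    source.write_text(
        "Force Name , Population\n Example Force , 1000 \n,\n", encoding="utf-8"
    )
    row_count = clean_csv(source, cleaned)
    cleaned_lines = cleaned.read_text(encoding="utf-8").splitlines()

print(row_count, cleaned_lines)
```

Writing the cleaned file from a script, rather than editing it by hand, means the result can be regenerated whenever the source changes.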
Feel free to open a ticket or email firstname.lastname@example.org with any questions.
Tests are provided using Great Expectations. You will need a recent version of Python installed to use this. The rest of the dependencies can then be installed with:
- Run python3 -m venv venv && source venv/bin/activate to create a virtual environment
- Install the dependencies with pip3 install -r requirements.txt
- Run great_expectations init to create any missing directories
To create a test suite for your new dataset run
great_expectations suite new
To edit a test suite run
great_expectations suite edit police-population.warning
To run the tests and show the results run
great_expectations docs build
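For readers new to this kind of testing, a suite is essentially a list of declarative checks run against a dataset. The sketch below mimics two such checks in plain Python. It deliberately does not use the Great Expectations API, and the column names and rows are hypothetical.

```python
# Hypothetical rows standing in for a cleaned dataset.
rows = [
    {"force": "Example Force A", "population": "1000"},
    {"force": "Example Force B", "population": ""},
]


def check_columns(rows, expected):
    """Return True when every row has exactly the expected columns."""
    return all(set(row) == set(expected) for row in rows)


def check_not_blank(rows, column):
    """Return the indexes of rows where `column` is missing or blank."""
    return [
        index
        for index, row in enumerate(rows)
        if not (row.get(column) or "").strip()
    ]


results = {
    "columns_match": check_columns(rows, ["force", "population"]),
    "blank_population_rows": check_not_blank(rows, "population"),
}
print(results)
```

A real suite stores checks like these alongside the data so they run the same way on every update.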