Dataset of COVID-19 containment and mitigation measures v0.2

Background

The dataset attempts to cover all measures of national significance intended to reduce the transmission of COVID-19, in all the world's nations.

Each measure in the database has entries on:

  • Country (and state for the US)
  • Textual description of the measure
  • Start date of measure
  • End date (if available)
  • URL to source of more information
  • Systematic keyword labels (e.g. "travel ban" or "hygiene enforcement")
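
As a rough illustration of the entry structure listed above, one record could be modelled as in the sketch below. The class name, the field names, and the rule that a missing end date means "still in force" are all assumptions made for this example; they are not the column names or semantics of the published files.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Measure:
    """One containment or mitigation measure (hypothetical field names)."""

    country: str
    description: str
    start_date: date
    source_url: str
    state: Optional[str] = None      # only filled in for US entries
    end_date: Optional[date] = None  # None when no end date is available
    keywords: List[str] = field(default_factory=list)

    def is_active(self, on: date) -> bool:
        """Return True if the measure is in force on the given day.

        Assumption: an entry without an end date is treated as open-ended.
        """
        if on < self.start_date:
            return False
        return self.end_date is None or on <= self.end_date
```

Keeping the end date optional mirrors the "if available" qualifier in the list, so an analysis has to decide explicitly how to treat open-ended measures.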

This is a work in progress.

The completion state of each country can be seen here. A detailed description of the main tags in the database, and of the preprocessing used to produce the sanitized version, can be found here.

As of April 6th we have about 1600 entries spanning 117 countries. Countries are prioritised on the basis of population and number of confirmed cases. The dataset is likely to be biased to some degree, because information is easier to access for some countries than for others.


Download the data

A preprocessed version of the dataset, combined with Johns Hopkins case and death data, is available as a .csv file here. Refer to the detailed tag descriptions linked above for information about the columns in this file, or see the dataset's Kaggle page for a briefer description.

Raw data is available as a .csv file here. The raw data contains some information that cannot be recovered from the sanitized file, but getting it ready for analysis takes substantially more effort.
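
Once one of the .csv files is downloaded, a first sanity check might be to count entries per country. The sketch below uses only the Python standard library. The function name and the default column name "Country" are assumptions for illustration; check the tag descriptions for the actual column names before relying on it.

```python
import csv
from collections import Counter
from typing import TextIO


def count_entries_by_country(handle: TextIO, country_column: str = "Country") -> Counter:
    """Count rows per country in an open CSV file.

    The column name is an assumption; pass the real one if it differs.
    Rows with an empty or missing country value are skipped.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or country_column not in reader.fieldnames:
        raise KeyError(f"column {country_column!r} not found in CSV header")

    counts: Counter = Counter()
    for row in reader:
        country = (row[country_column] or "").strip()
        if country:
            counts[country] += 1
    return counts
```

To use it, open the downloaded file with open(path, newline="", encoding="utf-8"), where path is wherever you saved the file, and pass the handle to the function. Calling most_common() on the result then gives a quick view of which countries have the most entries, which may help in judging the coverage bias mentioned above.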

(Last updated April 6th. To get the most recent raw data, make a copy of the Notion database, and then click "Export".)


Explore the data

You can get an overview of the data, and sort it by country, by response measure, or by a calendar view, in the Notion database.


Modelling the effectiveness of containment measures

We are currently pursuing a machine learning project to model the effectiveness of various measures in reducing COVID-19 transmission, based on this dataset and on empirically observed reproduction rates.

The results will be used to improve our main epidemiological simulations.

If you are working on similar projects, we'd be keen to discuss opportunities for collaboration. Reach out using the form below. We also welcome enquiries if you're using the dataset for another purpose, such as econometric modelling.


Contributions and questions

If you'd like to volunteer to extend the dataset, or have any questions about usage, please reach out using this form.


Usage

The dataset is licensed under the MIT License.