Tools for outbreak analytics infrastructures
Beyond the availability of data and methods, and the use of good practices for reproducible science, the outbreak response context poses a number of practical challenges for data analysis. In this lecture, we introduce tools which can help address some of these challenges, and create robust, efficient, and more easily deployable data analytics pipelines using R.
Slides
Click on the image below to access the slides:
Related packages
linelist
linelist provides data cleaning tools which address most of the common problems in epidemiological data. While tailored for case data (hence the name linelist), it is very general and will likely be useful in many other contexts. Its main features include:
- data standardisation: ensures consistent capitalisation and separators, and replaces non-ASCII characters with their closest ASCII match
- date guessing: automatically detects dates, identifies their formats, and performs the required conversions
- dictionary-based cleaning: applies cleaning rules to fix typos and recode variables according to a user-defined dictionary
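The three features listed above can be illustrated with a minimal base R sketch. This is not the linelist implementation: the helpers `standardise_text()`, `guess_date()` and `recode_with_dictionary()`, the list of date formats, and the example data are all hypothetical and only show the general idea behind each step.

```r
# Hypothetical helper: lower case, "_" as the only separator, ASCII where possible.
standardise_text <- function(x) {
  # Transliteration is platform-dependent; keep the input if it fails.
  ascii <- iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")
  x <- ifelse(is.na(ascii), x, ascii)
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)   # any other run of characters becomes "_"
  gsub("^_+|_+$", "", x)            # drop leading and trailing separators
}

# Hypothetical helper: try a few candidate formats in turn; the first one
# that parses an entry wins, and entries matching none of them stay NA.
guess_date <- function(x, formats = c("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")) {
  out <- as.Date(rep(NA_character_, length(x)))
  for (fmt in formats) {
    todo <- is.na(out)
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

# Hypothetical helper: replace values found in a named character vector
# (names = values to fix, elements = replacements); other values are kept.
recode_with_dictionary <- function(x, dictionary) {
  hit <- x %in% names(dictionary)
  x[hit] <- unname(dictionary[x[hit]])
  x
}

# Made-up example data with messy names, mixed date formats and typos
raw <- data.frame(
  "Date of Onset" = c("2018-01-15", "17/01/2018", "not recorded"),
  "Gender" = c("F", "male", "Femal"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

clean <- raw
names(clean) <- standardise_text(names(clean))
clean$date_of_onset <- guess_date(clean$date_of_onset)
gender_dictionary <- c(f = "female", femal = "female", m = "male")
clean$gender <- recode_with_dictionary(standardise_text(clean$gender),
                                       gender_dictionary)
```

Trying formats in a fixed order is a deliberate simplification: ambiguous entries such as 01/02/2018 are read with whichever matching format comes first, so the order should reflect how the data were actually recorded.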
For more information on this package, go to: https://www.repidemicsconsortium.org/linelist.
To install this package, type:
remotes::install_github("reconhub/linelist")
reportfactory
The reportfactory provides an infrastructure for handling multiple Rmd reports which need regular updating.
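To make the idea of regularly updated reports more concrete, here is a minimal base R sketch of one way to plan dated outputs for a folder of Rmd files. It does not use the reportfactory API: the function `plan_outputs()`, the folder names `report_sources` and `report_outputs`, and the example report names are assumptions made for illustration only.

```r
# Hypothetical helper: list every Rmd source and the dated folder where
# its compiled output would go (one folder per report, one per date).
plan_outputs <- function(source_dir, output_dir, date = Sys.Date()) {
  rmd_files <- list.files(source_dir, pattern = "\\.Rmd$",
                          recursive = TRUE, full.names = TRUE)
  report_names <- tools::file_path_sans_ext(basename(rmd_files))
  data.frame(
    source = rmd_files,
    output_dir = file.path(output_dir, report_names,
                           format(date, "%Y-%m-%d")),
    stringsAsFactors = FALSE
  )
}

# Build a throwaway factory in a temporary folder with two empty reports
factory <- file.path(tempdir(), "example_factory")
sources <- file.path(factory, "report_sources")
dir.create(sources, recursive = TRUE, showWarnings = FALSE)
file.create(file.path(sources, c("epicurve.Rmd", "contacts.Rmd")))

plan <- plan_outputs(sources, file.path(factory, "report_outputs"),
                     date = as.Date("2018-06-01"))
```

Keeping one output folder per report and per date means that re-running a report on a later day never overwrites earlier results, which helps when figures from a previous situation report need to be traced back.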
For more information on this package, go to: https://github.com/reconhub/reportfactory.
To install this package, type:
remotes::install_github("reconhub/reportfactory")
rmarkdown
rmarkdown extends the capabilities of knitr with a more diverse set of outputs generated from Rmd files, including PDF documents, article templates, PDF or HTML slides, or even web applications.
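As a small illustration, the sketch below writes a minimal Rmd file to a temporary location and renders it to HTML with `rmarkdown::render()`. The helper `write_minimal_rmd()` and the report content are hypothetical; rendering is only attempted if rmarkdown and pandoc are available on the machine.

```r
# Hypothetical helper: write a minimal Rmd file. The chunk delimiter is
# built with strrep() so the example contains no literal code fence.
write_minimal_rmd <- function(path) {
  fence <- strrep("`", 3)
  writeLines(c(
    "---",
    "title: \"Minimal report\"",
    "---",
    "",
    "Summary of the built-in cars data:",
    "",
    paste0(fence, "{r}"),
    "summary(cars)",
    fence
  ), path)
  invisible(path)
}

rmd_file <- write_minimal_rmd(tempfile(fileext = ".Rmd"))

# Only render if both rmarkdown and pandoc can be found
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  # render() returns the path of the compiled document
  html_file <- rmarkdown::render(rmd_file,
                                 output_format = "html_document",
                                 quiet = TRUE)
}
```

Changing `output_format` (for instance to `"pdf_document"`, which additionally needs a LaTeX installation) produces a different output type from the same source file.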
More information on rmarkdown is available from: http://rmarkdown.rstudio.com/.
To install this package, type:
install.packages("rmarkdown")
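The GitHub installs in this section rely on the remotes package being present. As a convenience, the following sketch checks which packages are still missing before anything is installed; the helper `missing_packages()` is hypothetical, and the list of package names simply mirrors the packages mentioned on this page.

```r
# Hypothetical helper: return the packages that cannot be loaded.
# Note that requireNamespace() loads the namespace of installed packages
# as a side effect, without attaching them.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("remotes", "rmarkdown", "linelist", "reportfactory")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Packages still to install: ", paste(to_install, collapse = ", "))
}
```

The sketch only reports what is missing and leaves the installation itself to the commands given for each package.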
Other resources
Golden rules for writing analysis reports
These golden rules list several coding and statistical practices aimed at improving readability and robustness of analysis reports. Click on this link to download the current version, or visit this page for more information.
Report factory templates
This repository provides templates of report factories based on existing factories. Visit the GitHub project for more information.
R4epi templates
The R4epi project provides several templates for epidemiological data analysis. Visit their website for more information.
About this document
Contributors
- Thibaut Jombart: initial version
Contributions are welcome via pull requests. The source files include:
Legal stuff
License: CC-BY
Copyright: Thibaut Jombart, 2017