Data is the central currency of science, but the nature of scientific data has changed dramatically with the rapid pace of technology. – Ted Hart (@emhart), et al., Ten Simple Rules for Digital Data Storage. PLoS Comput Biol 12(10): e1005097. doi:10.1371/journal.pcbi.1005097
Access to the computational steps taken to process data and generate findings is as important as access to data themselves. – Victoria Stodden, et al., Enhancing reproducibility for computational methods. Science, 09 Dec 2016: 1240-1241. doi:10.1126/science.aah6168
The Data Archive provides data services tied to the data life cycle. The creation, use, manipulation, communication, visualization and reuse of data increasingly depend on some form of software. Detailed knowledge of how to set up, munge, and make sense of data using code or tools is increasingly a distinguishing factor in good research. We want to promote best practices in data management and coding, and to support the larger movements toward research transparency and reproducible research. Our services are grounded in, and support, emerging good practices and standards for data and code.
Data
DataONE Data Life Cycle
We can help researchers in the following areas:
Plan
- Review data sources
- Help investigate archiving issues, costs, consent and disclosure risks
- Create a data management plan
Collect/Assure
- Project management and file organization
- QA for collected data and acquisition workflows
- Provide consultation on restricted data (access and security)
Describe
- Help with filenames, standard terminology, data dictionaries
- Document analysis and file manipulations
- Help identify appropriate standards
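As an illustration of what a data dictionary can look like in practice, here is a minimal Python sketch that drafts one from a CSV file. The function names `build_data_dictionary` and `guess_type`, the sample columns, and the fields recorded for each variable are all hypothetical choices made for this example, not a standard the Data Archive prescribes. The description column is left blank for the researcher to fill in.

```python
import csv
import io


def guess_type(values):
    """Guess 'integer', 'number', or 'text' from a list of non-empty strings."""
    if not values:
        return "unknown"
    for label, cast in (("integer", int), ("number", float)):
        try:
            for v in values:
                cast(v)
            return label
        except ValueError:
            continue
    return "text"


def build_data_dictionary(csv_text):
    """Return one draft dictionary entry per column of a CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    fieldnames = reader.fieldnames or []
    entries = []
    for name in fieldnames:
        values = [row[name] for row in rows]
        non_missing = [v for v in values if v not in ("", None)]
        entries.append({
            "variable": name,
            "type": guess_type(non_missing),
            "n_missing": len(values) - len(non_missing),
            "example": non_missing[0] if non_missing else "",
            "description": "",  # to be written by the researcher
        })
    return entries


# Hypothetical sample data, for illustration only.
SAMPLE = "respondent_id,age,region\n1,34,North\n2,,South\n3,51,\n"
dictionary = build_data_dictionary(SAMPLE)
for entry in dictionary:
    print(entry)
```

A draft like this is only a starting point: the guessed types should be checked, and the descriptions, units and coding schemes still need to come from someone who knows the data.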
Preserve
- Help with backing up your data
- Help with deciding what data to preserve
- Help with choosing stable file formats
- Help identifying suitable repositories
Integration
- Consider the compatibility of data from different sources
- Document the steps
- Capture the provenance of your sources
Analysis
- Help identifying appropriate software
- Consultation on coding best practices
Software
We provide consultation on a number of programming and data analysis tools, including R, Python, Stata, SPSS, SAS and OpenRefine.
We can help with the following issues:
- Reading data into a tool
- Data cleaning, reshaping or transformation
- Understanding outputs of your tool (including errors)
- Data visualization
- Web-scraping and consuming APIs
- Code quality reviews
- Project management and version control
- Writing reusable code and not repeating yourself
- Reproducible research practices
- Open and transparent science
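To give a flavour of the reshaping item in the list above, the following minimal Python sketch turns a "wide" table (one column per year) into a "long" one (one row per country and year). The helper name `wide_to_long`, its parameters, and the sample data are hypothetical and exist only for illustration. It uses only the standard library; dedicated data analysis packages offer more complete versions of the same operation.

```python
import csv
import io


def wide_to_long(rows, id_column, variable_name="variable", value_name="value"):
    """Turn one row per unit into one row per unit and measurement."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row
            long_rows.append({
                id_column: row[id_column],
                variable_name: column,
                value_name: value,
            })
    return long_rows


# Hypothetical sample data, for illustration only.
WIDE = "country,y2019,y2020\nA,10,12\nB,7,9\n"
wide_rows = list(csv.DictReader(io.StringIO(WIDE)))
long_rows = wide_to_long(wide_rows, id_column="country", variable_name="year")
for row in long_rows:
    print(row)
```

Keeping a step like this in a script rather than doing it by hand means the transformation is documented and can be rerun whenever the source data change.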