Data Management

Data Management Resources


Finding, organizing, publishing, transporting, collaborating on, and preserving your data.

Using the resources below, Research Computing can help with your research data management needs.

  • The UCR Library

    The UCR Library serves as an information commons and intellectual center for the campus and is the focal point for research and study at UCR.


  • Collaboration

    Open Science Framework (OSF)

    OSF is a free and open-source project management tool that supports researchers throughout their entire project lifecycle. As a collaboration tool, OSF helps research teams work on projects privately or make the entire project publicly accessible for broad dissemination. As a workflow system, OSF enables connections to the many products researchers already use, streamlining their process, and increasing efficiency. Researchers use OSF to collaborate, document, archive, share, and register research projects, materials, and data. OSF is the flagship product of the non-profit Center for Open Science.


    Synapse: a powerful new online research tool.

    • Organize your digital research assets.
      • Access Your Data Anywhere
        • Synapse provides APIs to store or access your data from the web or programmatically via one of our supported analytical clients (R, Python, Command Line).
      • Query Structured Data
        • Use Synapse Tables to query structured data right from your web browser or from any analytical client.
    • Get credit for your research.
      • Record Provenance
        • Synapse provides tools to record and display Provenance for each step of your analysis.
      • Mint a DOI
        • A digital object identifier (DOI) provides a persistent and easy way to reference your digital assets in publications — including data, code, or analysis results.
    • Collaborate with your colleagues and the public.
      • Communicate Your Findings
        • Use the Synapse Wiki services to communicate your findings by embedding rich content such as images, Tables, Provenance, and LaTeX equations.
      • Share Your Research
        • New Synapse Projects are private by default — share with your colleagues, collaborators, or make your work public! Create Synapse Teams to manage your collaborations.
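
    A DOI becomes a persistent link once it is appended to the https://doi.org/ resolver. The sketch below shows that step with a deliberately loose shape check. The function name doi_to_url and the sample identifier are illustrative assumptions, not part of Synapse or any DOI library.

```python
import re
from urllib.parse import quote

# Loose shape check (an assumption for this sketch): "10.", a registrant
# code of 4 to 9 digits, "/", then a non-empty suffix without whitespace.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver URL for a DOI string."""
    doi = doi.strip()
    # Accept the common "doi:" prefix.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI-shaped string: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return "https://doi.org/" + quote(doi, safe="/")


if __name__ == "__main__":
    # 10.1234/example.dataset.v1 is a made-up identifier for illustration.
    print(doi_to_url("doi:10.1234/example.dataset.v1"))
```

    Citing the resolver URL rather than a repository-specific address keeps the reference valid if the hosting location changes.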


    The All In One Repository
    A home for papers, FAIR data and non-traditional research outputs that is easy to use and ready now


    The easy to use, online, collaborative LaTeX editor

    Open Source Data Management Software iRODS

    iRODS: Data Management Model
    iRODS provides eight packaged capabilities, each of which can be selectively deployed and configured (usually into known patterns). These patterns and capabilities represent the most common use cases as identified by community participation and reporting.

    The model contains eight capabilities which can be combined into interesting patterns:

    • Data to Compute is Automated Ingest + Tiering + Additional Policy
    • Compute to Data is Sorting Policy + Job Routing Policy
    • Synchronization is Automated Ingest + Sync Policy
    • Data Transfer Nodes is Cache Management Policy + Replication Policy
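
    One way to read the list above is as a lookup from each pattern to the parts it is built from. The sketch below is plain bookkeeping in Python, not iRODS code or an iRODS API. The component names are copied from the list, and the helper names available_patterns and missing_components are hypothetical.

```python
# Pattern names and their parts, copied from the list above.
PATTERNS = {
    "Data to Compute": {"Automated Ingest", "Tiering", "Additional Policy"},
    "Compute to Data": {"Sorting Policy", "Job Routing Policy"},
    "Synchronization": {"Automated Ingest", "Sync Policy"},
    "Data Transfer Nodes": {"Cache Management Policy", "Replication Policy"},
}


def available_patterns(deployed):
    """Return, sorted by name, the patterns whose parts are all deployed."""
    deployed = set(deployed)
    return sorted(name for name, parts in PATTERNS.items() if parts <= deployed)


def missing_components(pattern, deployed):
    """Return, sorted by name, the parts of `pattern` not yet deployed."""
    return sorted(PATTERNS[pattern] - set(deployed))


if __name__ == "__main__":
    deployed = {"Automated Ingest", "Sync Policy", "Tiering"}
    print(available_patterns(deployed))
    print(missing_components("Data to Compute", deployed))
```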
  • Security

    Data Security Plan Template for use at UCR.

    Protection Level Classification of Information and IT Resources at UC, established by UCOP

    UC AWS EA and BAA 

    UC Cloud Contract and Guidance 

  • Data Movement

    Globus: move, share, and discover data via a single interface. Whether your files live on a supercomputer, lab cluster, tape archive, public cloud, or your laptop, you can manage this data from anywhere, using your existing identities, via just a web browser.

    AWS DataSync makes it simple and fast to move large amounts of data online between on-premises storage and Amazon S3, Amazon Elastic File System (Amazon EFS), or Amazon FSx for Windows File Server. Manual tasks related to data transfers can slow down migrations and burden IT operations. DataSync simplifies and automates these transfers and, according to AWS, moves data up to 10x faster than open-source tools.

    AWS Content Delivery: Amazon CloudFront is a fast content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to customers globally with low latency and high transfer speeds, all within a developer-friendly environment. CloudFront is integrated with AWS, both through physical locations that are directly connected to the AWS global infrastructure and through other AWS services.

    Git Large File Storage: an open source Git extension for versioning large files. Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets, and graphics with text pointers inside Git, while storing the file contents on a remote server such as GitHub.com or GitHub Enterprise.
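
    To make the "text pointer" idea concrete, here is a minimal sketch of what such a pointer contains, following the published Git LFS pointer format: a version line, a SHA-256 object ID, and the size in bytes. The function name lfs_pointer and the sample bytes are illustrative assumptions; in practice the Git LFS client writes pointers for you once a path is tracked with `git lfs track`.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Build the text of a Git LFS pointer for the given file contents."""
    # The object ID is the SHA-256 digest of the real file contents.
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


if __name__ == "__main__":
    # Stand-in bytes for a large dataset file (illustrative only).
    print(lfs_pointer(b"col_a,col_b\n1,2\n"), end="")
```

    Git stores only this small text file in history, so a multi-gigabyte dataset still adds just three short lines to the repository itself.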

    DAT: Dat is a protocol for sharing data between computers. By making sure changes in data are transparent, everyone receives only the data they want, and by connecting computers directly (rather than using a cloud server), Dat powers communities building the next-generation Web.


  • Finding Data

    Re3Data: a global registry of research data repositories from different academic disciplines.

    FAIRsharing: a curated, informative, and educational resource on data and metadata standards, inter-related to databases and data policies.

    AWS Registry of Open Data: This registry exists to help people discover and share datasets that are available via AWS resources.

    Google Dataset Search: Dataset Search enables users to find datasets stored across the Web through a simple keyword search. The tool surfaces information about datasets hosted in thousands of repositories across the Web, making these datasets universally accessible and useful.

    Azure Open Datasets: curated open data made easily accessible on Azure.

    Gartner Research Portal: Founded in 1979, Gartner describes itself as the leading research and advisory company. It has expanded well beyond its flagship technology research to provide senior leaders across the enterprise with the business insights, advice, and tools they need to achieve their mission-critical priorities.

  • Research Data Management

    Video Link  - Introduction to Data Management

    Research Data Can Take Many Forms

    • Social Science Data
      • Survey responses
      • Focus groups
      • Individual interviews
      • Economic indicators
      • Demographics
      • Opinion polling
    • Hard Science Data
      • Measurements generated by sensors/laboratory instruments
      • Computer modeling
      • Simulations
      • Observations and/or field studies
      • Specimens
    • Data Forms Used By Both
      • Images
      • Video
      • Mapping/GIS data
      • Numerical measurements


    Research Data Categories
    • Collecting or Creating Raw Data
    • Processing Data
    • Analyzing Data
    • Finalizing/Publishing Data
    • Preserving or Archiving Data
    • Sharing Data
    • Re-Using Data 

    Research Data Life Cycle

    DataONE Data Life Cycle

    • Plan: description of the data that will be compiled, and how the data will be managed and made accessible throughout its lifetime
    • Collect: observations are made either by hand or with sensors or other instruments and the data are placed into digital form
    • Assure: the quality of the data is assured through checks and inspections
    • Describe: data are accurately and thoroughly described using the appropriate metadata standards
    • Preserve: data are submitted to an appropriate long-term archive (i.e. data center)
    • Discover: potentially useful data are located and obtained, along with the relevant information about the data (metadata)
    • Integrate: data from disparate sources are combined to form one homogeneous set of data that can be readily analyzed
    • Analyze: data are analyzed

    Data Policies

    • Data ownership and intellectual property
    • Data management and stewardship responsibilities
    • Public access, data sharing & dissemination
    • Retention

    Federal Policies

    February 2013: The Office of Science and Technology Policy at the White House issued a directive that federal agencies with more than $100 million in research and development spending design plans to make the results of federally funded research freely available to the public.

    May 2013: President Obama issued an Executive Order “Making Open and Machine Readable the New Default for Government Information”

    For the complete history and current direction of federal open access policies, see SPARC’s Data Sharing Requirements by Federal Agency.

    NSF Data Management and Sharing Plans

    • The types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project.
    • The standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies).
    • Policies for access and sharing, including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements.
    • Policies and provisions for re-use, re-distribution, and the production of derivatives.
    • Plans for archiving data, samples, and other research products, and for preservation of access to them.
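
    The five elements above can double as a completeness checklist while drafting a plan. The sketch below is one hypothetical way to track them. The class name, field names, and sample text are assumptions made for illustration and are not an NSF format.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """One free-text field per element listed above (names are hypothetical)."""
    products: str = ""            # types of data, samples, software, materials
    standards: str = ""           # data and metadata format and content standards
    access_and_sharing: str = ""  # access policies, privacy and IP protections
    reuse: str = ""               # re-use, re-distribution, derivatives
    archiving: str = ""           # archiving and preservation of access


def missing_sections(plan: DataManagementPlan) -> list:
    """Return the names of sections that are still empty, in declaration order."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


if __name__ == "__main__":
    draft = DataManagementPlan(
        products="Sensor readings and analysis scripts.",
        archiving="Deposit in a long-term repository at project end.",
    )
    print(missing_sections(draft))
```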
  • Unified Medical Language System (UMLS)

    The National Library of Medicine developed the Unified Medical Language System (UMLS), which provides:

    • Access to Terminology Data
    • A Common Data Model for Terminologies
    • Interoperability through Synonymy


    Free MeSH Tools (NLP Tools)

    • MetaMap - A Tool For Recognizing UMLS Concepts in Text
    • MeSH on Demand - MeSH on Demand identifies MeSH® terms in your submitted text