Research Computing - Kostas Lab

Computing Resources


Here we have information on computing resources that can be used to conduct research at UCR. Depending on the nature of your research and the computing resources required, the options below can satisfy most needs. Most research computing workloads and associated workflows can be served by UCR's High-Performance Computing Center, which offers an outstanding GPU-enabled computing cluster and GPFS high-speed parallel storage. Other computing operations and workflows may be better suited to one of the other resources listed below.

Contact research-computing@ucr.edu to discuss options and get started.

  • UCR's High-Performance Computing Center

    UCR's High-Performance Computing Center (HPCC) provides state-of-the-art research computing infrastructure and training accessible to all UCR researchers and affiliates at low cost. This includes access to the shared HPC resources and services summarized below. The main advantage of a shared research computing environment is access to a much larger HPC infrastructure (with thousands of CPUs/GPUs and many PBs of directly attached storage) than what smaller clusters of individual research groups could afford, while also providing a long-term sustainability plan and professional systems administrative support.

    • Multipurpose cluster optimized for parallel, non-parallel and big data computing
    • Access to >1000 software tools, packages, and community databases


  • AWS

    Web servers, web apps, servers, machine learning, databases, data lakes, high-performance and scientific computing, and more.

    • EC2 Instances
      • Launch what you need when you need it (see the sketch after this list).
      • Pricing
      • FAQ
    • Lightsail (Recommended for web apps and websites)
      • Everything you need to jumpstart your project on AWS—compute, storage, and networking for a predictable price.
      • Pricing
      • FAQ
    • Elastic Beanstalk
      • Manage applications in the AWS cloud. Once you upload your application, Elastic Beanstalk automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling, and application health monitoring.
      • Pricing
      • FAQ
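
    For researchers who prefer to script their cloud usage, here is a minimal sketch of launching a single on-demand EC2 instance with the boto3 Python SDK, as referenced in the EC2 item above. The region, AMI ID, instance type, key pair, and tag values are placeholder assumptions to be replaced with values from your own AWS account.

    # Minimal sketch: launch and tag one on-demand EC2 instance with boto3.
    # The region, AMI ID, instance type, and key pair below are placeholders --
    # substitute values appropriate for your account and workload.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-west-2")  # assumed region

    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder AMI ID
        InstanceType="t3.micro",           # small instance type for testing
        KeyName="my-keypair",              # placeholder key pair for SSH access
        MinCount=1,
        MaxCount=1,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Project", "Value": "research-computing-demo"}],
        }],
    )

    instance_id = response["Instances"][0]["InstanceId"]
    print(f"Launched instance {instance_id}")

    # Remember to terminate the instance when finished to avoid ongoing charges:
    # ec2.terminate_instances(InstanceIds=[instance_id])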
  • Nautilus Cluster - Pacific Research Platform

    The Pacific Research Platform
    The PRP is a partnership of more than 50 institutions, led by researchers at UC San Diego and UC Berkeley, and includes the National Science Foundation, the Department of Energy, and multiple research universities in the US and around the world. The PRP builds on the optical backbone of Pacific Wave, a joint project of CENIC and the Pacific Northwest GigaPOP (PNWGP), to create a seamless research platform that encourages collaboration on a broad range of data-intensive fields and projects.

    Nautilus Cluster

    Nautilus is a heterogeneous, distributed cluster, with computational resources of various shapes and sizes made available by research institutions spanning multiple continents. Check out the Cluster Map to see where the nodes are located.

    This is a free resource at the moment and can be used to run many types of machine learning and research workloads.


  • XSEDE

    The Extreme Science and Engineering Discovery Environment (XSEDE) is a single virtual system that scientists can use to interactively share computing resources, data, and expertise. People around the world use these resources and services — things like supercomputers, collections of data, and new tools — to improve our planet.

    If you are a US-based researcher and currently use, or want to use, advanced research computing resources and services, XSEDE can help. Whether you intend to use XSEDE-allocated resources or resources elsewhere, the XSEDE program works to make such resources easier to use and help more people use them. Once you are ready to take the next step, you can become an XSEDE user in a matter of minutes and be on your way to taking your computational activities to the next level.

    Science Gateways (free)

    A Science Gateway is a community-developed set of tools, applications, and data that are integrated via a portal or a suite of applications, usually in a graphical user interface, that is further customized to meet the needs of a specific community. Gateways enable entire communities of users associated with a common discipline to use national resources through a common interface that is configured for optimal use. Researchers can focus on their scientific goals and less on assembling the cyberinfrastructure they require. Gateways can also foster collaborations and the exchange of ideas among researchers.

    • Using Science Gateways
      • Gateways are independent projects, each with its own guidelines for access. Most gateways are available for use by anyone, although they usually target a particular research audience. XSEDE Science Gateways are portals to computational and data services and resources across a wide range of science domains for researchers, engineers, educators, and students. Depending on the needs of the communities, a gateway may provide any of the following features:
        • High-performance computation resources
        • Workflow tools
        • General or domain-specific analytic and visualization software
        • Collaborative interfaces
        • Job submission tools
        • Education modules
  • Open Science Grid


    The OSG facilitates access to distributed high-throughput computing for research in the US. The resources accessible through the OSG are contributed by the community, organized by the OSG, and governed by the OSG consortium. In the last 12 months, the OSG has provided more than 1.2 billion CPU hours to researchers across a wide variety of projects.

    The Open Science Grid consists of computing and storage elements at over 100 individual sites spanning the United States. These sites, primarily at universities and national labs, range in size from a few hundred to tens of thousands of CPU cores.

    What kind of computational tasks are likely accelerated on the Open Science Grid?

    Jobs run on the OSG will be able to execute on servers at numerous remote physical clusters, making OSG an ideal environment for computational problems that can be executed as numerous, independent tasks that are individually relatively small and short (see below). Please consider the following guidelines:

    1. Independent compute tasks using up to 8 cores (ideally 1 core each), less than 8 GB of memory (RAM) per core, at most 1 GPU, and running for 1-12 hours. Additional capabilities for COVID-19 research are currently available, with up to 48 hours of runtime per job; please contact the support listed below for more information about these capabilities. Application-level checkpointing can be implemented for longer-running work, for example by having the application write out state and restart files (see the sketch after this list). Workloads with independent jobs of 1 core and less than 1 GB of RAM are ideal, achieving up to thousands of concurrently running jobs and hundreds of thousands of CPU hours per day; jobs using several cores and/or several GB of RAM will likely see hundreds of concurrently running jobs.
    2. Compute sites in the OSG can be configured to use pre-emption, which means jobs can be automatically killed if higher priority jobs enter the system. Pre-empted jobs will restart on another site, but it is important that the jobs can handle multiple restarts and/or complete in less than 12 hours.
    3. Software dependencies can be staged with the job, distributed via containers, or installed on the read-only distributed OASIS filesystem (which can also support software modules). Statically-linked binaries are ideal. However, dynamically linked binaries with standard library dependencies, built for 64-bit Red Hat Enterprise Linux (RHEL) version 6 or 7 will also work. OSG can support some licensed software (like Matlab, Matlab-Simulink, etc.) where compilation allows execution without a license, or where licenses still accommodate multiple jobs and are not node-locked.
    4. Input and output data for each job should be less than 20 GB so that they can be pulled in by the job, processed, and pushed back to the submit node. Note that the OSG Virtual Cluster does not currently have a globally shared file system, so jobs that depend on one will not work. Projects with many TBs of data can be distributed with significant scalability, beyond the capacity of a single cluster, if subsets of the data are accessed across numerous jobs.
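
    Guideline 1 above mentions application-level checkpointing for longer-running work. Below is a minimal Python sketch of that write-state-and-restart pattern, assuming a simple iterative workload: the job periodically saves its progress to a file and, when restarted (for example after pre-emption), resumes from the most recent checkpoint. The file name, step counts, and work loop are illustrative placeholders, not an OSG-provided interface.

    # Minimal sketch of application-level checkpointing: the job saves its state
    # periodically and resumes from the last checkpoint after a restart
    # (for example, after pre-emption). File names and the work loop are
    # illustrative placeholders, not an OSG-provided interface.
    import json
    import os

    CHECKPOINT_FILE = "checkpoint.json"   # transferred back with job output
    TOTAL_STEPS = 100_000
    CHECKPOINT_EVERY = 1_000

    def load_checkpoint():
        """Resume from a previous run if a checkpoint file is present."""
        if os.path.exists(CHECKPOINT_FILE):
            with open(CHECKPOINT_FILE) as f:
                return json.load(f)
        return {"step": 0, "partial_result": 0.0}

    def save_checkpoint(state):
        """Write state atomically so a killed job never leaves a corrupt file."""
        tmp = CHECKPOINT_FILE + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, CHECKPOINT_FILE)

    state = load_checkpoint()
    for step in range(state["step"], TOTAL_STEPS):
        state["partial_result"] += step * 1e-6   # placeholder for real work
        state["step"] = step + 1
        if state["step"] % CHECKPOINT_EVERY == 0:
            save_checkpoint(state)

    save_checkpoint(state)
    print("final result:", state["partial_result"])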

    The following are examples of computations that are a great match for OSG:

    1. parameter sweeps, parameter optimizations, statistical model optimizations, etc., as pertain to many machine learning approaches (see the sketch after this list)
    2. molecular docking and other simulations with numerous starting systems and/or configurations
    3. image processing (including medical images with non-restricted data), satellite images, etc.
    4. many genomics/bioinformatics tasks where numerous reads, samples, genes, etc., might be analyzed independent of one another before bringing results together
    5. text analysis

    And many others!
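
    To make example 1 concrete, here is a hypothetical Python sketch of a parameter sweep structured as the kind of small, independent tasks OSG handles well: each invocation processes one parameter combination, selected by a task index from the command line, and writes its own result file, so every combination can run as a separate job and results can be gathered afterwards. The parameter grid, score function, and file names are placeholders, not part of any OSG tooling.

    # Hypothetical parameter sweep split into independent single-core tasks.
    # Each invocation handles one (learning_rate, regularization) pair and writes
    # its own result file, so every pair can run as a separate OSG job.
    # The parameter names and score() function are illustrative placeholders.
    import itertools
    import json
    import sys

    LEARNING_RATES = [0.001, 0.01, 0.1]
    REGULARIZATIONS = [0.0, 0.1, 1.0]
    GRID = list(itertools.product(LEARNING_RATES, REGULARIZATIONS))

    def score(lr, reg):
        """Placeholder for the real model evaluation run by each task."""
        return 1.0 / (1.0 + abs(lr - 0.01) + reg)

    # The task index would typically come from the submission system,
    # e.g. one job per grid point: python sweep.py <task_id>
    task_id = int(sys.argv[1]) if len(sys.argv) > 1 else 0
    lr, reg = GRID[task_id]

    result = {"task_id": task_id, "learning_rate": lr,
              "regularization": reg, "score": score(lr, reg)}
    with open(f"result_{task_id}.json", "w") as f:
        json.dump(result, f)
    print(result)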

    The following are examples of computations that are not good matches for OSG:

    1. Tightly coupled computations, for example, MPI-based multi-node communication, will not work well on OSG due to the distributed nature of the infrastructure.
    2. Computations requiring a shared filesystem will not work, as there is no shared filesystem between the different clusters on OSG.
    3. Computations requiring complex software deployments or restrictive licensing are not a good fit. There is limited support for distributing software to the compute clusters, but for complex software (though containers may be helpful!), or licensed software, deployment can be a major task.
  • Secure Computing

    Data Security Plan Template for use at UCR.

    Protection Level Classification of Information and IT Resources at UC, established by UCOP

    HPCC

    • The recommended resource on campus for most computational research workflows.
    • It can accommodate up to and including P3 data security controls.
    • Website

    UC System Secure Storage and Compute

    Cloud

    Sherlock

    Workstation/Server

    • It can accommodate up to and including P4 data as long as all security controls are in place.
    • It can be a physical workstation/server or a virtual machine, as long as all security controls are in place.
    • Workstations can access secure cloud storage.
    • Workstations can also be provisioned in the cloud.
    • It can be Windows or Linux.
    • The normal limit is roughly 64 CPU cores running 24/7.
    • Remote GUI access is available via RDP, VNC, or X11 forwarding.
    • Requires full-disk encryption (BitLocker) and event logging enabled (Logging Guidelines).
  • Quantum Computing


    • Amazon Braket
      • Explore and experiment with quantum computing
      • Amazon Braket is a fully managed quantum computing service that helps researchers and developers get started with the technology to accelerate research and discovery. Amazon Braket provides a development environment for you to explore and build quantum algorithms, test them on quantum circuit simulators, and run them on different quantum hardware technologies (see the sketch after this list).
      • Features
      • Pricing
      • FAQ
      • How it Works
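
      To illustrate the build-test-run workflow described above, here is a minimal sketch using the Amazon Braket Python SDK's local simulator to prepare and sample a two-qubit Bell state. Running on managed simulators or quantum hardware would instead use an AwsDevice and may incur charges, so treat this as a starting point rather than a production recipe.

      # Minimal sketch: build a two-qubit Bell-state circuit and sample it with
      # the Amazon Braket SDK's free local simulator. Running on managed
      # simulators or QPUs would use braket.aws.AwsDevice and may incur charges.
      from braket.circuits import Circuit
      from braket.devices import LocalSimulator

      # Hadamard on qubit 0, then CNOT 0 -> 1 entangles the pair.
      bell = Circuit().h(0).cnot(0, 1)

      device = LocalSimulator()
      result = device.run(bell, shots=1000).result()

      # Expect roughly equal counts of '00' and '11'.
      print(result.measurement_counts)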

         

  • Distributed Computing

    A distributed computer system consists of multiple software components that are on multiple computers but run as a single system. The computers that are in a distributed system can be physically close together and connected by a local network, or they can be geographically distant and connected by a wide area network. A distributed system can consist of any number of possible configurations, such as mainframes, personal computers, workstations, minicomputers, and so on. The goal of distributed computing is to make such a network work as a single computer.

    • World Community Grid
      • World Community Grid enables anyone with a computer, smartphone or tablet to donate their unused computing power to advance cutting-edge scientific research on topics related to health, poverty, and sustainability. Through the contributions of over 650,000 individuals and 460 organizations, World Community Grid has supported 31 research projects to date, including searches for more effective treatments for cancer, HIV/AIDS, and neglected tropical diseases. Other projects are looking for low-cost water filtration systems and new materials for capturing solar energy efficiently.
      • How it Works
      • Active Research
      • Completed Research
      • Submit a Proposal
    • BOINC
      • From UC Berkeley
      • BOINC lets you help cutting-edge science research using your computer (Windows, Mac, Linux) or Android device. BOINC downloads scientific computing jobs to your computer and runs them invisibly in the background. It's easy and safe.
      • About 30 science projects use BOINC; examples include Rosetta@home, Einstein@Home, and IBM World Community Grid. These projects investigate diseases, study global warming, discover pulsars, and do many other types of scientific research.
      • You can participate in either of two ways:
        • Join Science United
          • To contribute to science areas (biomedicine, physics, astronomy, and so on) use Science United. Your computer will do work for current and future projects in those areas.
        • Download BOINC
          • To contribute to specific projects, download BOINC, and follow the directions.
      • High-Throughput Computing with BOINC
        • BOINC is a platform for high-throughput computing on a large scale (thousands or millions of computers). It can be used for volunteer computing (using consumer devices) or grid computing (using organizational resources). It supports virtualized, parallel, and GPU-based applications.
        • BOINC is distributed under the LGPL open source license. It can be used for commercial purposes, and applications need not be open source.
        • Computing with BOINC
        • Technical Documentation
    • Folding @ Home
      • While you keep going with your everyday activities, your computer will be working to help us find cures for diseases like cancer, ALS, Parkinson’s, Huntington’s, Influenza, and many others.
      • COVID-19
      • Current Projects
      • FAQ
    • Climateprediction.net
    • OpenScientist
      • List of Recommended Distributed Computing Projects
  • CloudBank for NSF Grants


    Managed Service to Simplify Cloud Access for Computer Science Research and Education

    CloudBank

    The University of California, San Diego's San Diego Supercomputer Center and Information Technology Services Division, the University of Washington's eScience Institute, and the University of California, Berkeley's Division of Data Science will develop and operate CloudBank, a cloud access entity that will help the computer science community access and use public clouds for research and education by delivering a set of managed services designed to simplify access to public clouds.

    CloudBank will provide on-ramp support that reduces researcher cloud adoption pain points such as managing cost, translating and upgrading research computing environments to an appropriate cloud platform, and learning cloud-based technologies that accelerate and expand research. It will be complemented by a cloud usage system that gives NSF-funded researchers the ability to easily grant permissions to research group members and students, set spending limits, and recover unused cloud credits. These systems will support multiple cloud vendors and will be accessed via an intuitive, easy-to-use user portal that gives users a single point of entry to these functions.

     

  • NIH STRIDES Initiative


    The NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative allows NIH to explore the use of cloud environments to streamline NIH data use by partnering with commercial providers. NIH’s STRIDES Initiative provides cost-effective access to industry-leading partners to help advance biomedical research. These partnerships enable access to rich datasets and advanced computational infrastructure, tools, and services.

    The STRIDES Initiative is one of many NIH-wide efforts to implement the NIH Strategic Plan for Data Science, which provides a roadmap for modernizing the NIH-funded biomedical data science ecosystem.

    By leveraging the STRIDES Initiative, NIH and NIH-funded institutions can begin to create a robust, interconnected ecosystem that breaks down silos related to generating, analyzing, and sharing research data. NIH-funded researchers with an active NIH award may take advantage of the STRIDES Initiative for their NIH-funded research projects. Eligible investigators include awardees of NIH contracts, other transaction agreements, grants, cooperative agreements, and other agreements.

    Benefits of using the STRIDES Initiative as a vehicle to access STRIDES Initiative partners include:

    • Discounts on STRIDES Initiative partner services: favorable pricing on computing, storage, and related cloud services for NIH Institutes, Centers, and Offices (ICOs) and NIH-funded institutions.
    • Professional services: access to professional service consultations and technical support from the STRIDES Initiative partners.
    • Training: access to training for researchers, data owners, and others to help ensure optimal use of available tools and technologies.
    • Potential collaborative engagements: opportunities to explore methods and approaches that may advance NIH's biomedical research objectives (with scope and milestones of engagements agreed upon separately).

    At this time, the STRIDES Initiative supports programs and projects that want to prepare, migrate, upload, and compute on data in the cloud. In the future, the ability to access data across NIH and NIH-funded institutions from various research domain repositories will become available.

  • nanoHUB - Free platform for computational research

    nanoHUB.org is the premier open and free platform for computational research, education, and collaboration in nanotechnology, materials science, and related fields. Our site hosts a rapidly growing collection of simulation tools that run in the cloud and are accessible through a web browser. In addition, nanoHUB provides online presentations, nanoHUB-U short courses, animations, teaching materials, and more. These resources instruct users about our simulation tools as well as general nanoelectronics, materials science, photonics, data science, and other topics. A good starting page for those new to nanoHUB is the Education page.

    Our site offers researchers a venue to explore, collaborate, and publish content as well. Many of these collaborative efforts occur via workspaces, user groups, and projects. Uncertainty Quantification (UQ) is now automatically available for most nanoHUB tools and adds powerful analytical and predictive capabilities for researchers.


  • Chem Compute Org - Free computational chemistry software

    chemcompute.org provides computational chemistry software for undergraduate teaching and research, without the hassle of compiling, installing, and maintaining software and hardware. Log in or register to get full access to the system, or learn more about using Chem Compute in your class teaching.

    Learn More:

    • About Video
    • GAMESS - The General Atomic and Molecular Electronic Structure System, a quantum chemistry package.
    • TINKER - A molecular dynamics package from the Jay Ponder Lab.
    • JupyterHub and Psi4 - Analyze data and run quantum calculations in Python.
    • NAMD - A molecular dynamics package from the Theoretical and Computational Biophysics Group at the University of Illinois Urbana-Champaign.
  • QUBES - Free modeling and statistical software through the browser

    QUBES is a community of math and biology educators who share resources and methods for preparing students to use quantitative approaches to tackle real, complex, biological problems.

    Users run free modeling and statistical software through the browser, eliminating the need to purchase or install software locally. Instructors can customize activities and datasets to fit their courses, minimizing logistical barriers between students and the course concepts.
