Science is not composed of isolated groups of practitioners; it is an interconnected network of communities of practice whose members move fluidly between them. Infrastructure for scientific research and collaboration should leverage this structure to make science more productive and inclusive. NASA, along with many other scientific entities, has started to adopt practices consistent with this natural structure of contemporary science. Communities such as Project Jupyter and Pangeo have pioneered a model for the inclusive, interconnected, and data-intensive practices of the future through cloud-based JupyterHub workflows. However, substantial barriers still keep individual users from moving from their local systems to the cloud to accomplish their research goals, including cloud cost opacity, infrastructure deployment complexity, and a general lack of community awareness and knowledge. We can overcome these barriers by building upon existing cloud-workflow models and creating infrastructure that allows researchers to seamlessly move their workflows wherever they can do their best work.
CryoCloud is a NASA-funded project that optimizes and expands this cloud-based model through a managed computing platform. The project adapts the cloud environment to the current needs of researchers, provides hackathon-style training workshops led by cloud and community experts, and works towards developing new open-source tools for collaborative, open-science research. This cycle of interconnected practice, research, and development helps us better understand the evolving needs of researchers working in this manner, and thus adapt our tools to facilitate the growth of multi-community infrastructure.
Our community cloud ecosystem leverages a structured partnership between organizers from the NASA Cryosphere community and the International Interactive Computing Collaboration (2i2c) team. As a team, the Cryosphere community representatives and 2i2c communicate about cloud issues and development, co-create open-source software and guidance content for users, and build best practices for infrastructure management. Community representatives serve as experts in community goals and dynamics, bridging between users and 2i2c and delivering training for users. 2i2c operates and develops the community-specific cloud infrastructure, develops and improves the open-source tools behind it, and guides the Cryosphere communities in their use of these tools. We use the NASA Cryosphere research community as a use case for developing training and tools that help related communities transition to an interconnected cloud workspace. This work also builds the community and technical knowledge needed to facilitate NASA’s open-source, interconnected, and science-accelerated vision of the future. From large undergraduate classes that can be cloud-taught for cents per student per semester to code implementations that reduce dataset read-in and compute times by two orders of magnitude, we share examples of how these cloud-based tools make scientific computing more intuitive, cost- and time-efficient, and open for all.