About GeoLab

Our Motivation

The nature of scientific discovery and collaboration is changing right in front of us, in ways that continue to astonish and inspire us. The ways in which research is conducted, data are collected, and results are communicated to public and private stakeholders are rapidly evolving. Broadly, we see science trending away from isolated, privileged institutions toward distributed networks of communities whose individual members move dynamically from one group to another. With GeoLab, we seek to leverage that new structure to make geophysical science more transparent, productive, and repeatable.

One of the key objectives of EarthScope’s GeoLab is to demonstrate how cloud infrastructure and services can facilitate high-quality, repeatable geophysical research. We are acutely aware of the barriers facing researchers who want to transition from the traditional practice of downloading and managing large amounts of data on local workstations to a cloud-based workflow. The complexity of deploying and maintaining infrastructure, uncertainty about cost-control measures, unfamiliarity with commercial cloud services, and a general lack of awareness of community standards and knowledge all contribute to hesitancy in adopting cloud technologies for research.

Following the welcoming, interconnected practices of open-science communities like Pangeo, which have pioneered data-intensive JupyterHub workflows in the cloud, we seek to overcome these barriers by building on existing open-source infrastructure and software, making it easier for researchers to move their workflows into the cloud.

What is GeoLab?

GeoLab is a JupyterHub operated by EarthScope Consortium, deployed on Amazon Web Services (AWS), and managed by the International Interactive Computing Collaboration (2i2c).

GeoLab provides customizable, cloud-based compute environments to geophysical researchers and educators for data analysis and visualization. Creating identical compute environments is easy, ensuring that software versions are consistent from one user to another. Since GeoLab operates in the cloud, anyone with a reliable internet connection can use it.
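To illustrate what consistent software versions mean in practice, the following is a minimal sketch of how two users might compare the packages installed in their sessions. The function names `snapshot_environment` and `diff_environments` are hypothetical helpers written for this example; they are not part of GeoLab or of any GeoLab tooling, and the sketch assumes only the Python standard library.

```python
from importlib import metadata


def snapshot_environment():
    """Return {package_name: version} for every installed distribution."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            versions[name.lower()] = dist.version
    return versions


def diff_environments(mine, theirs):
    """Return {package: (my_version, their_version)} where the two differ.

    A version of None means the package is absent from that environment.
    """
    differences = {}
    for name in sorted(set(mine) | set(theirs)):
        a, b = mine.get(name), theirs.get(name)
        if a != b:
            differences[name] = (a, b)
    return differences
```

Each user could call `snapshot_environment()` in their own session and exchange the resulting dictionaries; an empty result from `diff_environments` would indicate that both sessions report the same package versions.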

One of the primary benefits of GeoLab is that it runs adjacent to the NSF GAGE and SAGE Data Repositories in AWS (us-east-2). As a result, users can take advantage of low-latency, high-throughput access methods to analyze large volumes of data.

GeoLab has been designed with the analysis of geodetic and seismological data in mind, but it is not limited to these domains. Any research group that wants to work with large geophysical datasets, or that would prefer not to maintain its own complicated compute environment, could benefit from working in GeoLab.

Hub Management

The GeoLab platform is built and maintained by 2i2c, a non-profit organization that specializes in using open-source tools to design and operate JupyterHubs for institutions supporting research and education. Their community service model, which brings a community of users together in a shared compute instance focused on cutting-edge data science, is illustrated below.

With their Right to Replicate policy, 2i2c ensures that GeoLab remains flexible with respect to commercial cloud providers and avoids the potential for vendor lock-in.

EarthScope and 2i2c work together to build resources that allow research and education communities to take advantage of cloud computing, data-intensive processing without data transfer, repeatable analysis, and more.

2i2c Service Model

Help

GeoLab is a growing platform and community. We value your questions and feedback.

Please reach us at data-help@earthscope.org if you need assistance.

We welcome your thoughts on the future of GeoLab; please share them by filling out our Community Feedback Form.