boom-lab/crocolake-python

CrocoLake-Python

CrocoLake-Python is a collection of Jupyter notebooks that show how to interface with CrocoLake and Argo's parquet databases using Python.

Table of Contents

  1. Usage
  2. Examples
  3. Databases
  4. Contact

Usage

python

Create your local environment (this creates a folder in your current directory):

    virtualenv crocolake

Activate the new Python environment:

    source crocolake/bin/activate

Install the required packages:

    pip install .

Launch JupyterLab to access the notebooks:

    jupyter lab

If you don't have virtualenv on your machine, you can install it with:

    pip install virtualenv

conda

Create the local environment:

    conda env create -f environment.yml

Activate the environment:

    conda activate crocolake

Launch JupyterLab to access the notebooks:

    jupyter lab
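
Before opening the notebooks, it can help to confirm that the active environment actually contains the libraries this README mentions. The snippet below is a minimal sketch and not part of the repository: `missing_packages` is a hypothetical helper, and the package list (`pyarrow`, `dask`, `jupyterlab`) is an assumption based on the tools named here, not a copy of the project's real dependency list.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Assumed package list, taken from the tools this README mentions.
# The project's actual requirements may differ.
ASSUMED_PACKAGES = ["pyarrow", "dask", "jupyterlab"]


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```

Run it with the environment activated; an empty result suggests the install step completed for these packages.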

Notebooks

You can then launch any notebook from the notebooks folder and execute it. Each example needs a specific dataset, and it contains code to download it to your local machine.

Note that there are two ways to load parquet datasets into a dataframe in Python: with pyarrow and with dask. Example 1 and Example 2 show both, while the other examples use the one that in my experience is most efficient (i.e. dask).
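
As a rough illustration of the two loading routes, the sketch below reads the same parquet dataset with pyarrow and with dask, using a shared geographic filter. It is not taken from the notebooks: the function names are hypothetical, and the column names `LATITUDE` and `LONGITUDE` are assumptions about the dataset schema, so check the notebooks for the actual names.

```python
from typing import List, Optional, Tuple

# One filter is a (column, operator, value) tuple; a list of them is combined with AND.
Filter = Tuple[str, str, float]


def bbox_filters(lat_min: float, lat_max: float,
                 lon_min: float, lon_max: float) -> List[Filter]:
    """Build a bounding-box filter list (column names are assumed, not confirmed)."""
    return [
        ("LATITUDE", ">=", lat_min),
        ("LATITUDE", "<=", lat_max),
        ("LONGITUDE", ">=", lon_min),
        ("LONGITUDE", "<=", lon_max),
    ]


def load_with_pyarrow(path: str, columns: Optional[List[str]],
                      filters: List[Filter]):
    """Read the selection eagerly into memory and return a pandas DataFrame."""
    import pyarrow.parquet as pq  # imported lazily so the helper above has no dependencies

    table = pq.read_table(path, columns=columns, filters=filters)
    return table.to_pandas()


def load_with_dask(path: str, columns: Optional[List[str]],
                   filters: List[Filter]):
    """Return a lazy dask DataFrame; call .compute() to materialize it."""
    import dask.dataframe as dd  # imported lazily for the same reason

    return dd.read_parquet(path, columns=columns, filters=filters)
```

For example, `load_with_dask(path, ["LATITUDE", "LONGITUDE"], bbox_filters(30, 50, -80, -50))` would build a lazy selection for a box roughly covering the North West Atlantic, where `path` is wherever you downloaded the dataset. The dask route defers the read until `.compute()` is called, which is useful when the dataset is larger than memory.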

Examples

The notebooks folder contains five examples/tutorials:

  1. Example 1 shows how to make a map of dissolved oxygen content in the North West Atlantic;
  2. Example 2 shows how to make a map of temperature measurements in the North West Atlantic, including information about the source (Argo, GLODAP, or Spray Gliders);
  3. Example 3 shows how to make temperature-salinity plots from Argo QC-ed measurements;
  4. Example 4 shows how to make an animation of Argo's fleet growth over time on a world map;
  5. Example 5 shows how to make a map of dissolved oxygen measurements in the Pacific off the coast of California.

Databases

The following databases are currently available:

  • CrocoLake: contains the best available data from Argo, GLODAP, and Spray Gliders. More details here. This example uses CrocoLake.
  • Argo 'QC': contains the best available data, that is, real-time values are reported only when delayed values are not available. This is the same version used in CrocoLake, and here you can find more details on how it is generated. This example uses Argo 'QC'.
  • Argo 'ALL': contains all real-time and adjusted variables as reported in the core ('<PLATFORM_NUMBER>_prof.nc') and synthetic ('<PLATFORM_NUMBER>_Sprof.nc') profile files, for the physical and biogeochemical versions respectively. Both this and this example use Argo 'ALL'.

Each database comes in 'PHY' and 'BGC' versions.
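
The "best available" rule described for Argo 'QC' can be sketched in a few lines. This is only an illustration of the selection rule as stated above, not the code that generates the database; `best_available` and `is_missing` are hypothetical names, and missing values are assumed to be represented as `None` or NaN.

```python
import math
from typing import List, Optional, Sequence


def is_missing(value: Optional[float]) -> bool:
    """Treat None and NaN as 'not available' (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def best_available(delayed: Sequence[Optional[float]],
                   real_time: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Prefer the delayed value; fall back to the real-time value only when it is missing."""
    if len(delayed) != len(real_time):
        raise ValueError("delayed and real_time must have the same length")
    return [rt if is_missing(d) else d for d, rt in zip(delayed, real_time)]


# Made-up numbers: the second and third delayed values are missing,
# so the real-time values are used in those positions.
print(best_available([10.1, None, float("nan")], [10.0, 11.0, 12.0]))
```

With pandas, the same rule is usually a single `combine_first` call on the delayed column.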

Contact

For any questions, bugs, missing information, etc., open an issue or get in touch!
