{
"cells": [
{
"cell_type": "markdown",
"id": "e5bddb1b",
"metadata": {
"execution": {}
},
"source": [
"[](https://colab.research.google.com/github/neuromatch/climate-course-content/blob/main/tutorials/W1D1_ClimateSystemOverview/student/W1D1_Tutorial5.ipynb) "
]
},
{
"cell_type": "markdown",
"id": "BSjO7xX42sEH",
"metadata": {
"execution": {}
},
"source": [
"# Tutorial 5: Xarray Data Analysis and Climatology\n",
"\n",
"**Week 1, Day 1, Climate System Overview**\n",
"\n",
"**Content creators:** Sloane Garelick, Julia Kent\n",
"\n",
"**Content reviewers:** Katrina Dobson, Younkap Nina Duplex, Danika Gupta, Maria Gonzalez, Will Gregory, Nahid Hasan, Paul Heubel, Sherry Mi, Beatriz Cosenza Muralles, Jenna Pearson, Agustina Pesce, Chi Zhang, Ohad Zivan\n",
"\n",
"**Content editors:** Paul Heubel, Jenna Pearson, Chi Zhang, Ohad Zivan\n",
"\n",
"**Production editors:** Wesley Banfield, Paul Heubel, Jenna Pearson, Konstantine Tsafatinos, Chi Zhang, Ohad Zivan\n",
"\n",
"**Our 2024 Sponsors:** CMIP, NFDI4Earth"
]
},
{
"cell_type": "markdown",
"id": "e90a481e-8dd8-4d05-a5a1-a612f89cd637",
"metadata": {
"execution": {}
},
"source": [
"## \n",
"\n",
"Pythia credit: Rose, B. E. J., Kent, J., Tyle, K., Clyne, J., Banihirwe, A., Camron, D., May, R., Grover, M., Ford, R. R., Paul, K., Morley, J., Eroglu, O., Kailyn, L., & Zacharias, A. (2023). Pythia Foundations (Version v2023.05.01) https://zenodo.org/record/8065851\n",
"\n",
"## \n"
]
},
{
"cell_type": "markdown",
"id": "z99xmBTDi3JS",
"metadata": {
"execution": {}
},
"source": [
"# Tutorial Objectives\n",
"\n",
"*Estimated timing of tutorial:* 25 minutes \n",
"\n",
"Global climate can vary on long timescales, but it's also important to understand seasonal variations. For example, seasonal variations in precipitation associated with the migration of the [Intertropical Convergence Zone (ITCZ)](https://glossary.ametsoc.org/wiki/Intertropical_convergence_zone#:~:text=(Also%20called%20ITCZ%2C%20equatorial%20convergence,and%20Northern%20Hemispheres%2C%20respectively).) and monsoon systems occur in response to seasonal changes in temperature. In this tutorial, we will use data analysis tools in Xarray to explore the seasonal climatology of global temperature. Specifically, in this tutorial, we'll use the `.groupby()` operation in Xarray, which involves the following steps:\n",
"\n",
"- **Split**: group data by value (e.g., month).\n",
"- **Apply**: compute some function (e.g., aggregate) within the individual groups.\n",
"- **Combine**: merge the results of these operations into an output dataset."
]
},
{
"cell_type": "markdown",
"id": "0af7bee1-3de3-453a-8ae8-bcd7910b4266",
"metadata": {
"execution": {},
"tags": []
},
"source": [
"# Setup\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9226796c-2eca-44c2-9dc7-5cf2ba93836d",
"metadata": {
"execution": {}
},
"outputs": [],
"source": [
"# installations ( uncomment and run this cell ONLY when using google colab or kaggle )\n",
"#!pip install pythia_datasets"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "06073287-7bdb-45b5-9cec-8cdf123adb49",
"metadata": {
"execution": {},
"executionInfo": {
"elapsed": 2358,
"status": "ok",
"timestamp": 1681572562093,
"user": {
"displayName": "Sloane Garelick",
"userId": "04706287370408131987"
},
"user_tz": 240
},
"tags": []
},
"outputs": [],
"source": [
"# imports\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import xarray as xr\n",
"from pythia_datasets import DATASETS\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install and import feedback gadget\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2ad30cf1-5467-4010-ac68-5046fb316018",
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Install and import feedback gadget\n",
"\n",
"!pip3 install vibecheck datatops --quiet\n",
"\n",
"from vibecheck import DatatopsContentReviewContainer\n",
"def content_review(notebook_section: str):\n",
" return DatatopsContentReviewContainer(\n",
" \"\", # No text prompt\n",
" notebook_section,\n",
" {\n",
" \"url\": \"https://pmyvdlilci.execute-api.us-east-1.amazonaws.com/klab\",\n",
" \"name\": \"comptools_4clim\",\n",
" \"user_key\": \"l5jpxuee\",\n",
" },\n",
" ).render()\n",
"\n",
"\n",
"feedback_prefix = \"W1D1_T5\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Figure Settings\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "72e04965-e982-444d-b3da-4e1e639c6899",
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Figure Settings\n",
"import ipywidgets as widgets # interactive display\n",
"\n",
"%config InlineBackend.figure_format = 'retina'\n",
"plt.style.use(\n",
" \"https://raw.githubusercontent.com/neuromatch/climate-course-content/main/cma.mplstyle\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Video 1: Terrestrial Temperature and Rainfall\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "21725d4b-ee68-42aa-af76-70392b4ab6ac",
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"remove-input"
]
},
"outputs": [],
"source": [
"# @title Video 1: Terrestrial Temperature and Rainfall\n",
"\n",
"from ipywidgets import widgets\n",
"from IPython.display import YouTubeVideo\n",
"from IPython.display import IFrame\n",
"from IPython.display import display\n",
"\n",
"\n",
"class PlayVideo(IFrame):\n",
" def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n",
" self.id = id\n",
" if source == 'Bilibili':\n",
" src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n",
" elif source == 'Osf':\n",
" src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n",
" super(PlayVideo, self).__init__(src, width, height, **kwargs)\n",
"\n",
"\n",
"def display_videos(video_ids, W=400, H=300, fs=1):\n",
" tab_contents = []\n",
" for i, video_id in enumerate(video_ids):\n",
" out = widgets.Output()\n",
" with out:\n",
" if video_ids[i][0] == 'Youtube':\n",
" video = YouTubeVideo(id=video_ids[i][1], width=W,\n",
" height=H, fs=fs, rel=0)\n",
" print(f'Video available at https://youtube.com/watch?v={video.id}')\n",
" else:\n",
" video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n",
" height=H, fs=fs, autoplay=False)\n",
" if video_ids[i][0] == 'Bilibili':\n",
" print(f'Video available at https://www.bilibili.com/video/{video.id}')\n",
" elif video_ids[i][0] == 'Osf':\n",
" print(f'Video available at https://osf.io/{video.id}')\n",
" display(video)\n",
" tab_contents.append(out)\n",
" return tab_contents\n",
"\n",
"\n",
"video_ids = [('Youtube', 'SyvFyT3jVM8'), ('Bilibili', 'BV1PhbyeaEMk')]\n",
"tab_contents = display_videos(video_ids, W=730, H=410)\n",
"tabs = widgets.Tab()\n",
"tabs.children = tab_contents\n",
"for i in range(len(tab_contents)):\n",
" tabs.set_title(i, video_ids[i][0])\n",
"display(tabs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Submit your feedback\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a2371121-637e-4517-9db2-1c5faab01348",
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Submit your feedback\n",
"content_review(f\"{feedback_prefix}_Terrestrial_Temperature_Rainfall_Video\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1bfcf3bd-6805-4cfb-90c4-a9067f6ce91c",
"metadata": {
"cellView": "form",
"execution": {},
"pycharm": {
"name": "#%%\n"
},
"tags": [
"remove-input"
]
},
"outputs": [],
"source": [
"# @markdown\n",
"from ipywidgets import widgets\n",
"from IPython.display import IFrame\n",
"\n",
"link_id = \"9z6km\"\n",
"\n",
"print(f\"If you want to download the slides: https://osf.io/download/{link_id}/\")\n",
"IFrame(src=f\"https://mfr.ca-1.osf.io/render?url=https://osf.io/{link_id}/?direct%26mode=render%26action=download%26mode=render\", width=854, height=480)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Submit your feedback\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c28bb25a-1852-4477-b932-41fdecbe42ef",
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Submit your feedback\n",
"content_review(f\"{feedback_prefix}_Terrestrial_Temperature_Rainfall_Slides\")"
]
},
{
"cell_type": "markdown",
"id": "8704803f-300d-4631-a2fa-f62d18726d1c",
"metadata": {
"execution": {},
"tags": []
},
"source": [
"# Section 1: GroupBy: Split, Apply, Combine\n",
"\n",
"Simple aggregations (as we learned in the previous Tutorial 4) can give a useful summary of our dataset, but often we would prefer to aggregate conditionally on some coordinate labels or groups. Xarray provides the so-called `.groupby()` operation which enables the **split-apply-combine** workflow on Xarray DataArrays and Datasets. The split-apply-combine operation is illustrated in this figure from [Project Pythia](https://foundations.projectpythia.org/core/xarray/computation-masking.html):\n",
"\n",
"
\n",
"\n",
"- The **split** step involves breaking up and grouping an Xarray Dataset or DataArray depending on the value of the specified group key.\n",
"- The **apply** step involves computing some function, usually an aggregate, transformation, or filtering, within the individual groups.\n",
"- The **combine** step merges the results of these operations into an output Xarray Dataset or DataArray.\n",
"\n",
"We are going to use `.groupby()` to remove the seasonal cycle (\"climatology\") from our dataset, which will allow us to better observe long-term trends in the data. See the [xarray groupby user guide](https://xarray.pydata.org/en/stable/user-guide/groupby.html) for more examples of what `.groupby()` can take as an input."
]
},
{
"cell_type": "markdown",
"id": "9719db5b-e645-4815-b8df-d454fa7703e7",
"metadata": {
"execution": {}
},
"source": [
"Let's start by loading the same data that we used in the previous tutorial (monthly SST data from CESM2):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7837f8bd-da89-4718-ab02-d5107576d2d6",
"metadata": {
"execution": {},
"executionInfo": {
"elapsed": 388,
"status": "ok",
"timestamp": 1681573026385,
"user": {
"displayName": "Sloane Garelick",
"userId": "04706287370408131987"
},
"user_tz": 240
},
"tags": []
},
"outputs": [],
"source": [
"filepath = DATASETS.fetch(\"CESM2_sst_data.nc\")\n",
"ds = xr.open_dataset(filepath)\n",
"ds"
]
},
{
"cell_type": "markdown",
"id": "713cc8d8-7374-4c5b-be61-aec4b5b0ffe6",
"metadata": {
"execution": {}
},
"source": [
"Then, let's select a gridpoint closest to a specified lat-lon (in this case let's select 50ºN, 310ºE), and plot a time series of SST at that point (recall that we learned this in Tutorial 2). The annual cycle will be quite pronounced. Note that we are using the `nearest` method to find the points in our datasets closest to the lat-lon values we specify. What this returns may not match these inputs exactly."
]
},
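{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"To illustrate how `method=\"nearest\"` behaves before applying it to the model data, here is a minimal sketch on a hypothetical 3° grid. The array `demo` is made up for illustration and is not the CESM2 grid. The requested point is not on this grid, so `.sel()` returns the closest grid point instead:\n",
"\n",
"```python\n",
"import numpy as np\n",
"import xarray as xr\n",
"\n",
"# Hypothetical 3-degree grid, for illustration only\n",
"lat = np.arange(-90.0, 91.0, 3.0)\n",
"lon = np.arange(0.0, 360.0, 3.0)\n",
"demo = xr.DataArray(\n",
"    np.zeros((lat.size, lon.size)),\n",
"    coords={'lat': lat, 'lon': lon},\n",
"    dims=('lat', 'lon'),\n",
")\n",
"\n",
"# 50N, 310E is not on this grid, so the nearest grid point is returned\n",
"point = demo.sel(lat=50, lon=310, method='nearest')\n",
"print(float(point.lat), float(point.lon))  # 51.0 309.0\n",
"```"
]
},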
{
"cell_type": "code",
"execution_count": null,
"id": "c0348ee8-6e9b-4f50-a844-375ae00d2771",
"metadata": {
"execution": {},
"executionInfo": {
"elapsed": 959,
"status": "ok",
"timestamp": 1681573031714,
"user": {
"displayName": "Sloane Garelick",
"userId": "04706287370408131987"
},
"user_tz": 240
},
"tags": []
},
"outputs": [],
"source": [
"ds.tos.sel(\n",
" lon=310, lat=50, method=\"nearest\"\n",
").plot() # time range is 2000-01-15 to 2014-12-15"
]
},
{
"cell_type": "markdown",
"id": "e732cd9b",
"metadata": {
"execution": {}
},
"source": [
"This plot shows changes in monthly SST between 2000-01-15 and 2014-12-15. The annual cycle of SST change is apparent in this figure, but to understand the climatology of this region, we need to calculate the average SST for each month over this period. The first step is to split the data into groups based on month."
]
},
{
"cell_type": "markdown",
"id": "d1505625-cbcd-495b-a15f-8824e455415b",
"metadata": {
"execution": {}
},
"source": [
"## Section 1.1: Split\n",
"\n",
"Let's group data by month, i.e. all Januaries in one group, all Februaries in one group, etc.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6e4fb25e-165f-4350-a93d-46a344f2d175",
"metadata": {
"execution": {},
"executionInfo": {
"elapsed": 160,
"status": "ok",
"timestamp": 1681572674597,
"user": {
"displayName": "Sloane Garelick",
"userId": "04706287370408131987"
},
"user_tz": 240
},
"tags": []
},
"outputs": [],
"source": [
"ds.tos.groupby(ds.time.dt.month)"
]
},
{
"cell_type": "markdown",
"id": "5d176ad8-15f1-4ecc-ab3e-898cef3b4e18",
"metadata": {
"execution": {}
},
"source": [
"