{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuromatch/climate-course-content/blob/main/tutorials/W2D1_AnEnsembleofFutures/student/W2D1_Tutorial4.ipynb) "
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"# Tutorial 4: Synthesising & Interpreting Diverse Data Sources\n",
"\n",
"**Week 2, Day 1, An Ensemble of Futures**\n",
"\n",
"**Content creators:** Brodie Pearson, Julius Busecke, Tom Nicholas\n",
"\n",
"**Content reviewers:** Mujeeb Abdulfatai, Nkongho Ayuketang Arreyndip, Jeffrey N. A. Aryee, Younkap Nina Duplex, Sloane Garelick, Paul Heubel, Zahra Khodakaramimaghsoud, Peter Ohue, Jenna Pearson, Abel Shibu, Derick Temfack, Peizhen Yang, Cheng Zhang, Chi Zhang, Ohad Zivan\n",
"\n",
"**Content editors:** Paul Heubel, Jenna Pearson, Ohad Zivan, Chi Zhang\n",
"\n",
"**Production editors:** Wesley Banfield, Paul Heubel, Jenna Pearson, Konstantine Tsafatinos, Chi Zhang, Ohad Zivan\n",
"\n",
"**Our 2024 Sponsors:** CMIP, NFDI4Earth"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"# Tutorial Objectives\n",
"\n",
"*Estimated timing for tutorial:* 40 minutes\n",
"\n",
"In this tutorial, we will synthesize scientific knowledge from various sources and use this diverse information to validate and contextualize CMIP6 simulations. By the end of this tutorial, you will be able to \n",
"- Create a time series of global mean sea surface temperature from observations, models, and proxy data;\n",
"- Use this data to validate and contextualize climate models, as well as to provide a holistic picture of Earth's past and future climate evolution."
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"# Setup\n",
"\n",
" \n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"executionInfo": {
"elapsed": 97497,
"status": "ok",
"timestamp": 1684265107976,
"user": {
"displayName": "Brodie Pearson",
"userId": "05269028596972519847"
},
"user_tz": 420
},
"tags": [
"colab"
]
},
"outputs": [],
"source": [
"# installations ( uncomment and run this cell ONLY when using google colab or kaggle )\n",
"\n",
"# !pip install condacolab &> /dev/null\n",
"# import condacolab\n",
"# condacolab.install()\n",
"\n",
"# # Install all packages in one call (+ use mamba instead of conda), this must in one line or code will fail\n",
"# !mamba install xarray-datatree intake-esm gcsfs xmip aiohttp nc-time-axis cf_xarray xarrayutils &> /dev/null"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"tags": []
},
"outputs": [],
"source": [
"# imports\n",
"import intake\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import xarray as xr\n",
"\n",
"from xmip.preprocessing import combined_preprocessing\n",
"from xmip.postprocessing import _parse_metric\n",
"from datatree import DataTree"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install and import feedback gadget\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Install and import feedback gadget\n",
"\n",
"!pip3 install vibecheck datatops --quiet\n",
"\n",
"from vibecheck import DatatopsContentReviewContainer\n",
"def content_review(notebook_section: str):\n",
" return DatatopsContentReviewContainer(\n",
" \"\", # No text prompt\n",
" notebook_section,\n",
" {\n",
" \"url\": \"https://pmyvdlilci.execute-api.us-east-1.amazonaws.com/klab\",\n",
" \"name\": \"comptools_4clim\",\n",
" \"user_key\": \"l5jpxuee\",\n",
" },\n",
" ).render()\n",
"\n",
"\n",
"feedback_prefix = \"W2D1_T4\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Figure settings\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Figure settings\n",
"import ipywidgets as widgets # interactive display\n",
"\n",
"plt.style.use(\n",
" \"https://raw.githubusercontent.com/neuromatch/climate-course-content/main/cma.mplstyle\"\n",
")\n",
"\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Helper functions\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Helper functions\n",
"\n",
"def global_mean(ds: xr.Dataset) -> xr.Dataset:\n",
" \"\"\"Global average, weighted by the cell area\"\"\"\n",
" return ds.weighted(ds.areacello.fillna(0)).mean([\"x\", \"y\"], keep_attrs=True)\n",
"\n",
"\n",
"# calculate anomaly to reference period\n",
"def datatree_anomaly(dt):\n",
" dt_out = DataTree()\n",
" for model, subtree in dt.items():\n",
" # for the coding exercise, ellipses will go after sel on the following line\n",
" ref = dt[model][\"historical\"].ds.sel(time=slice(\"1950\", \"1980\")).mean()\n",
" dt_out[model] = subtree - ref\n",
" return dt_out\n",
"\n",
"\n",
"def plot_historical_ssp126_combined(dt):\n",
" for model in dt.keys():\n",
" datasets = []\n",
" for experiment in [\"historical\", \"ssp126\"]:\n",
" datasets.append(dt[model][experiment].ds.tos)\n",
"\n",
" da_combined = xr.concat(datasets, dim=\"time\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Video 1: Historical Context for Future Projections\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"remove-input"
]
},
"outputs": [],
"source": [
"# @title Video 1: Historical Context for Future Projections\n",
"\n",
"from ipywidgets import widgets\n",
"from IPython.display import YouTubeVideo\n",
"from IPython.display import IFrame\n",
"from IPython.display import display\n",
"\n",
"\n",
"class PlayVideo(IFrame):\n",
" def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n",
" self.id = id\n",
" if source == 'Bilibili':\n",
" src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n",
" elif source == 'Osf':\n",
" src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n",
" super(PlayVideo, self).__init__(src, width, height, **kwargs)\n",
"\n",
"\n",
"def display_videos(video_ids, W=400, H=300, fs=1):\n",
" tab_contents = []\n",
" for i, video_id in enumerate(video_ids):\n",
" out = widgets.Output()\n",
" with out:\n",
" if video_ids[i][0] == 'Youtube':\n",
" video = YouTubeVideo(id=video_ids[i][1], width=W,\n",
" height=H, fs=fs, rel=0)\n",
" print(f'Video available at https://youtube.com/watch?v={video.id}')\n",
" else:\n",
" video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n",
" height=H, fs=fs, autoplay=False)\n",
" if video_ids[i][0] == 'Bilibili':\n",
" print(f'Video available at https://www.bilibili.com/video/{video.id}')\n",
" elif video_ids[i][0] == 'Osf':\n",
" print(f'Video available at https://osf.io/{video.id}')\n",
" display(video)\n",
" tab_contents.append(out)\n",
" return tab_contents\n",
"\n",
"video_ids = [('Youtube', 'UdTO5stHoGk'), ('Bilibili', 'BV1mnGDeyEJg')]\n",
"tab_contents = display_videos(video_ids, W=730, H=410)\n",
"tabs = widgets.Tab()\n",
"tabs.children = tab_contents\n",
"for i in range(len(tab_contents)):\n",
" tabs.set_title(i, video_ids[i][0])\n",
"display(tabs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Submit your feedback\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Submit your feedback\n",
"content_review(f\"{feedback_prefix}_Historical_Context_for_Future_Projections_Video\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"pycharm": {
"name": "#%%\n"
},
"tags": [
"remove-input"
]
},
"outputs": [],
"source": [
"# @markdown\n",
"from ipywidgets import widgets\n",
"from IPython.display import IFrame\n",
"\n",
"link_id = \"vw259\"\n",
"\n",
"print(f\"If you want to download the slides: https://osf.io/download/{link_id}/\")\n",
"IFrame(src=f\"https://mfr.ca-1.osf.io/render?url=https://osf.io/{link_id}/?direct%26mode=render%26action=download%26mode=render\", width=854, height=480)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Submit your feedback\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Submit your feedback\n",
"content_review(f\"{feedback_prefix}_Historical_Context_for_Future_Projections_Slides\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"# Section 1: Reproduce Global Mean SST for Historical and Future Scenario Experiments\n",
"\n",
"We are now going to reproduce the plot you created in Tutorial 3, which showed the likely range of CMIP6 simulated global mean sea surface temperature for historical and future scenarios (*SSP1-2.6* and *SSP5-8.5*) experiments from a *multi-model ensemble*. However, now we will add an additional dataset called *HadISST* which is an observational dataset spanning back to the year 1870. Later in the tutorial, we will also include the paleo data you saw in the previous mini-lecture.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"## Section 1.1: Load CMIP6 SST Data from Several Models using `xarray`\n",
"\n",
"Let's load the five different CMIP6 models again for the three CMIP6 experiments.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"tags": []
},
"outputs": [],
"source": [
"col = intake.open_esm_datastore(\n",
" \"https://storage.googleapis.com/cmip6/pangeo-cmip6.json\"\n",
") # open an intake catalog containing the Pangeo CMIP cloud data\n",
"\n",
"# pick our five example models\n",
"# there are many more to test out! Try executing `col.df['source_id'].unique()` to get a list of all available models\n",
"source_ids = [\"IPSL-CM6A-LR\", \"GFDL-ESM4\", \"ACCESS-CM2\", \"MPI-ESM1-2-LR\", \"TaiESM1\"]\n",
"experiment_ids = [\"historical\", \"ssp126\", \"ssp585\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"executionInfo": {
"elapsed": 17912,
"status": "ok",
"timestamp": 1684265194265,
"user": {
"displayName": "Brodie Pearson",
"userId": "05269028596972519847"
},
"user_tz": 420
},
"tags": []
},
"outputs": [],
"source": [
"# from the full `col` object, create a subset using facet search\n",
"cat = col.search(\n",
" source_id=source_ids,\n",
" variable_id=\"tos\",\n",
" member_id=\"r1i1p1f1\",\n",
" table_id=\"Omon\",\n",
" grid_label=\"gn\",\n",
" experiment_id=experiment_ids,\n",
" require_all_on=[\n",
" \"source_id\"\n",
" ], # make sure that we only get models which have all of the above experiments\n",
")\n",
"\n",
"# convert the sub-catalog into a datatree object, by opening each dataset into an xarray.Dataset (without loading the data)\n",
"kwargs = dict(\n",
" preprocess=combined_preprocessing, # apply xMIP fixes to each dataset\n",
" xarray_open_kwargs=dict(\n",
" use_cftime=True\n",
" ), # ensure all datasets use the same time index\n",
" storage_options={\n",
" \"token\": \"anon\"\n",
" }, # anonymous/public authentication to google cloud storage\n",
")\n",
"\n",
"cat.esmcat.aggregation_control.groupby_attrs = [\"source_id\", \"experiment_id\"]\n",
"dt = cat.to_datatree(**kwargs)\n",
"\n",
"cat_area = col.search(\n",
" source_id=source_ids,\n",
" variable_id=\"areacello\", # for the coding exercise, ellipses will go after the equals on this line\n",
" member_id=\"r1i1p1f1\",\n",
" table_id=\"Ofx\", # for the coding exercise, ellipses will go after the equals on this line\n",
" grid_label=\"gn\",\n",
" experiment_id=[\n",
" \"historical\"\n",
" ], # for the coding exercise, ellipses will go after the equals on this line\n",
" require_all_on=[\"source_id\"],\n",
")\n",
"\n",
"cat_area.esmcat.aggregation_control.groupby_attrs = [\"source_id\", \"experiment_id\"]\n",
"dt_area = cat_area.to_datatree(**kwargs)\n",
"\n",
"dt_with_area = DataTree()\n",
"\n",
"for model, subtree in dt.items():\n",
" metric = dt_area[model][\"historical\"].ds[\"areacello\"]\n",
" dt_with_area[model] = subtree.map_over_subtree(_parse_metric, metric)\n",
"\n",
"# average every dataset in the tree globally\n",
"dt_gm = dt_with_area.map_over_subtree(global_mean)\n",
"\n",
"dt_gm_anomaly = datatree_anomaly(dt_gm)"
]
},
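{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"*Aside:* the CMIP6 global means above are weighted by each model's cell-area variable (`areacello`), while the HadISST observations in the next exercise sit on a regular longitude/latitude grid, where cell area is proportional to the cosine of latitude. The following minimal sketch (using an idealized, hypothetical SST profile, not real data) shows why area weighting matters:\n",
"\n",
"```python\n",
"import numpy as np\n",
"import xarray as xr\n",
"\n",
"# idealized 1-degree latitude grid and a toy SST profile (warm equator, cold poles)\n",
"lat = xr.DataArray(np.arange(-89.5, 90.0, 1.0), dims='latitude', name='latitude')\n",
"sst = 30.0 * np.cos(np.deg2rad(lat))\n",
"\n",
"# on a regular lon/lat grid, cell area is ~ cos(latitude)\n",
"weights = np.cos(np.deg2rad(lat))\n",
"\n",
"unweighted = float(sst.mean('latitude'))                  # over-counts the cold poles\n",
"weighted = float(sst.weighted(weights).mean('latitude'))  # area-faithful mean (larger here)\n",
"```\n",
"\n",
"The unweighted mean treats every 1° latitude band as equally large, over-counting the small, cold polar cells; the area-weighted mean is the one comparable to the `areacello`-weighted CMIP6 averages."
]
},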
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"\n",
"### Coding Exercise 1.1\n",
"\n",
"Complete the following code to:\n",
"\n",
"\n",
"1. Calculate a time series of the global mean sea surface temperature (GMSST) from the HadISST dataset\n",
"2. Subtract a base period from the HadISST GMSST time series. Use the same base period as the CMIP6 time series you are comparing against. "
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"execution": {},
"executionInfo": {
"elapsed": 114,
"status": "error",
"timestamp": 1684265198522,
"user": {
"displayName": "Brodie Pearson",
"userId": "05269028596972519847"
},
"user_tz": 420
}
},
"source": [
"```python\n",
"#################################################\n",
"## TODO for students: Add HadISST (observation-based) dataset to the previous CMIP ensemble plot. ##\n",
"# Please remove the following line of code once you have completed the exercise:\n",
"raise NotImplementedError(\"Student exercise: Add HadISST (observation-based) dataset to the previous CMIP ensemble plot.\")\n",
"#################################################\n",
"\n",
"fig, ax = plt.subplots()\n",
"for experiment, color in zip([\"historical\", \"ssp126\", \"ssp585\"], [\"C0\", \"C1\", \"C2\"]):\n",
" datasets = []\n",
" for model in dt_gm_anomaly.keys():\n",
" # calculate annual mean\n",
" annual_sst = (\n",
" dt_gm_anomaly[model][experiment]\n",
" .ds.tos.coarsen(time=12)\n",
" .mean()\n",
" .assign_coords(source_id=model)\n",
" .load()\n",
" )\n",
" datasets.append(\n",
" annual_sst.sel(time=slice(None, \"2100\")).load()\n",
") # the French model (IPSL-CM6A-LR) has a long-running member for SSP1-2.6\n",
" # concatenate all along source_id dimension\n",
" da = xr.concat(datasets, dim=\"source_id\", join=\"override\").squeeze()\n",
" # compute ensemble mean and draw time series\n",
" da.mean(\"source_id\").plot(color=color, label=experiment, ax=ax)\n",
" # extract time coordinates\n",
" x = da.time.data\n",
" # calculate the lower and upper bound of the likely range\n",
" da_lower = da.squeeze().quantile(0.17, dim=\"source_id\")\n",
" da_upper = da.squeeze().quantile(0.83, dim=\"source_id\")\n",
" # shade via quantile boundaries\n",
" ax.fill_between(x, da_lower, da_upper, alpha=0.5, color=color)\n",
"\n",
"\n",
"# but now add observations (https://pangeo-forge.org/dashboard/feedstock/43)\n",
"store = \"https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/HadISST-feedstock/hadisst.zarr\"\n",
"ds_obs = xr.open_dataset(store, engine=\"zarr\", chunks={}).convert_calendar(\n",
" \"standard\", use_cftime=True\n",
")\n",
"# mask missing values\n",
"ds_obs = ds_obs.where(ds_obs > -1000)\n",
"weights = np.cos(\n",
" np.deg2rad(ds_obs.latitude)\n",
") # In a regular lon/lat grid, area is ~cos(latitude)\n",
"# calculate weighted global mean for observations\n",
"sst_obs_gm = ...\n",
"# calculate anomaly for observations\n",
"sst_obs_gm_anomaly = ...\n",
"\n",
"sst_obs_gm_anomaly.coarsen(time=12, boundary=\"trim\").mean().plot(\n",
" color=\"0.3\", label=\"Observations\", ax=ax\n",
")\n",
"ax.set_ylabel(\"Global Mean SST with respect to 1950-1980 (°C)\")\n",
"ax.set_xlabel(\"Time (years)\")\n",
"ax.legend()\n",
"\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {},
"executionInfo": {
"elapsed": 66611,
"status": "ok",
"timestamp": 1684265266986,
"user": {
"displayName": "Brodie Pearson",
"userId": "05269028596972519847"
},
"user_tz": 420
},
"tags": []
},
"outputs": [],
"source": [
"# to_remove solution\n",
"\n",
"fig, ax = plt.subplots()\n",
"for experiment, color in zip([\"historical\", \"ssp126\", \"ssp585\"], [\"C0\", \"C1\", \"C2\"]):\n",
" datasets = []\n",
" for model in dt_gm_anomaly.keys():\n",
" # calculate annual mean\n",
" annual_sst = (\n",
" dt_gm_anomaly[model][experiment]\n",
" .ds.tos.coarsen(time=12)\n",
" .mean()\n",
" .assign_coords(source_id=model)\n",
" .load()\n",
" )\n",
" datasets.append(\n",
" annual_sst.sel(time=slice(None, \"2100\")).load()\n",
") # the French model (IPSL-CM6A-LR) has a long-running member for SSP1-2.6\n",
" # concatenate all along source_id dimension\n",
" da = xr.concat(datasets, dim=\"source_id\", join=\"override\").squeeze()\n",
" # compute ensemble mean and draw time series\n",
" da.mean(\"source_id\").plot(color=color, label=experiment, ax=ax)\n",
" # extract time coordinates\n",
" x = da.time.data\n",
" # calculate the lower and upper bound of the likely range\n",
" da_lower = da.squeeze().quantile(0.17, dim=\"source_id\")\n",
" da_upper = da.squeeze().quantile(0.83, dim=\"source_id\")\n",
" # shade via quantile boundaries\n",
" ax.fill_between(x, da_lower, da_upper, alpha=0.5, color=color)\n",
"\n",
"\n",
"# but now add observations (https://pangeo-forge.org/dashboard/feedstock/43)\n",
"store = \"https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/HadISST-feedstock/hadisst.zarr\"\n",
"ds_obs = xr.open_dataset(store, engine=\"zarr\", chunks={}).convert_calendar(\n",
" \"standard\", use_cftime=True\n",
")\n",
"# mask missing values\n",
"ds_obs = ds_obs.where(ds_obs > -1000)\n",
"weights = np.cos(\n",
" np.deg2rad(ds_obs.latitude)\n",
") # In a regular lon/lat grid, area is ~cos(latitude)\n",
"# calculate weighted global mean for observations\n",
"sst_obs_gm = ds_obs.sst.weighted(weights).mean([\"longitude\", \"latitude\"])\n",
"# calculate anomaly for observations\n",
"sst_obs_gm_anomaly = sst_obs_gm - sst_obs_gm.sel(time=slice(\"1950\", \"1980\")).mean()\n",
"\n",
"sst_obs_gm_anomaly.coarsen(time=12, boundary=\"trim\").mean().plot(\n",
" color=\"0.3\", label=\"Observations\", ax=ax\n",
")\n",
"ax.set_ylabel(\"Global Mean SST with respect to 1950-1980 (°C)\")\n",
"ax.set_xlabel(\"Time (years)\")\n",
"ax.legend()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Submit your feedback\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"# @title Submit your feedback\n",
"content_review(f\"{feedback_prefix}_Coding_Exercise_1_1\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"### Questions 1.1 Climate Connection\n",
"\n",
"Now that you have a modern and projected time series containing models and observations,\n",
"1. What context and/or validation of the simulations does this information provide?\n",
"2. What additional context/validation can you glean by also considering the paleo proxy information in the figure below? (This figure was shown in the last video)\n",
"\n",
"Note the paleo periods on this figure represent the Mid-Pleiocene Warm Period (MPWP), the Last Inter-glacial (LIG) and the Last Glacial Maximum (LGM)\n",
"\n",
"