Examples

Some notes before we start

This version of poets supports resampling to a daily, weekly, monthly or dekadal interval, with values at the period end. Dekadal resolution provides 3 values per month at day 10, day 20 and the last day of the month.
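
To make the dekadal convention concrete, the following minimal sketch computes the three period-end dates of a month. The helper dekad_end_dates is a hypothetical name used only for illustration and is not part of poets.

```python
import calendar
from datetime import datetime


def dekad_end_dates(year, month):
    """Return the three dekad end dates of a month (day 10, day 20, last day)."""
    # monthrange returns (weekday of first day, number of days in the month)
    last_day = calendar.monthrange(year, month)[1]
    return [datetime(year, month, day) for day in (10, 20, last_day)]


# February 2000 falls in a leap year, so the last dekad ends on day 29.
for date in dekad_end_dates(2000, 2):
    print(date.strftime('%Y-%m-%d'))
```

The last dekad is therefore between 8 and 11 days long, depending on the month.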

If you want your data to be resampled and clipped to one or multiple countries, you need to provide a list containing the FIPS country code of each country. ['AA', 'AC'], for example, would represent Aruba and Antigua and Barbuda. If you want to use your own shapefile, please read Usage of custom shapefiles. If you want the data to be resampled globally, just skip the regions parameter.

In our example, we want to resample MODIS Land Surface Temperature data over the area of Austria to a resolution of 0.1° on a dekadal basis.

But now let’s start...

Setting up a Poet base class

To be able to use poets, a poets.poet.Poet class must be initialized. You can find a description of the class parameters and attributes here: poets.poet.Poet.

In[1]:

import os
from datetime import datetime
from poets.poet import Poet

# poets attributes:
rootpath = os.path.join('D:\\', 'Test')
regions = ['AU'] # clipping to Austria
spatial_resolution = 0.1
temporal_resolution = 'dekad'
start_date = datetime(2000, 1, 1)
nan_value = -99

# initializing Poet class:
p = Poet(rootpath, regions, spatial_resolution, temporal_resolution,
         start_date, nan_value)

Usage of custom shapefiles

Resampling and clipping data to regions specified in a custom shapefile is available since v0.3.1. The path to the custom shapefile must be set with the poets.poet.Poet shapefile parameter. The shapefile itself must contain one attribute which holds a unique ID or code; this attribute is used to select the desired region/area with the poets.poet.Poet regions parameter. The shapefile must be given in WGS 84.

The following example extends the code from In[1] with the shapefile parameter. In this case, the shapefile is stored locally at D:\Shapefiles\shapefile1.shp, and we want to clip the data to region1 and region2. Please note that the file suffix ".shp" MUST NOT be included in the shapefile parameter!

In[2]:

# use custom shapefile:
shapefile = os.path.join('D:\\', 'Shapefiles', 'shapefile1')
regions = ['region1', 'region2']

# initializing Poet class:
p = Poet(rootpath, regions, spatial_resolution, temporal_resolution,
         start_date, nan_value, shapefile=shapefile)
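
If the shapefile path comes from a file dialog or a configuration file, it will usually still carry the ".shp" suffix. A minimal sketch of removing it before passing the path to Poet could look like this. The helper strip_shp_suffix is a hypothetical name and is not provided by poets.

```python
import os


def strip_shp_suffix(path):
    """Return path without a trailing '.shp' suffix (case-insensitive)."""
    base, ext = os.path.splitext(path)
    # leave the path untouched if it has no suffix or a different one
    return base if ext.lower() == '.shp' else path


# hypothetical path, matching the example above
full_path = os.path.join('D:\\', 'Shapefiles', 'shapefile1.shp')
shapefile = strip_shp_suffix(full_path)
print(shapefile)
```

The resulting value can then be used as the shapefile argument in the same way as in In[2].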

Adding a source

Data sources can be added via the poets.poet.Poet.add_source method. To extract the date from the filename of an image file, it is necessary to provide the position of each date field within the filename with the parameter filedate. We know that this is inconvenient, and we promise to work on a more convenient method.

In[3]:

# source attributes:
name = 'MODIS_LST'
filename = "MOD11C1_D_LSTDA_{YYYY}_{MM}-{DD}.png"
filedate = {'YYYY': (16, 20), 'MM': (21, 23), 'DD': (24, 26)}
temp_res = 'daily'
host = "neoftp.sci.gsfc.nasa.gov"
protocol = 'FTP'
directory = "/gs/MOD11C1_D_LSTDA/"
begin_date = datetime(2000, 1, 1)
nan_value = 255

# initializing the data source:
p.add_source(name, filename, filedate, temp_res, host, protocol,
             directory=directory, begin_date=begin_date,
             nan_value=nan_value)
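
To illustrate how the filedate positions relate to the filename, here is a minimal sketch that slices the date fields out of a filename. It assumes the positions are zero-based (start, stop) indices as in Python slicing, which is consistent with the pattern above. The function date_from_filename and the sample filename are hypothetical and not part of poets.

```python
from datetime import datetime


def date_from_filename(filename, filedate):
    """Slice year, month and day out of filename using (start, stop) positions."""
    parts = {key: int(filename[start:stop])
             for key, (start, stop) in filedate.items()}
    return datetime(parts['YYYY'], parts['MM'], parts['DD'])


filedate = {'YYYY': (16, 20), 'MM': (21, 23), 'DD': (24, 26)}

# hypothetical file name following MOD11C1_D_LSTDA_{YYYY}_{MM}-{DD}.png
print(date_from_filename('MOD11C1_D_LSTDA_2000_05-31.png', filedate))
```

The prefix "MOD11C1_D_LSTDA_" is 16 characters long, which is why the year starts at position 16.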

Download and resample data

After setting up a Poet base class and adding some sources, we can now start downloading and resampling the data by calling the poets.poet.Poet.fetch_data method.

In[4]:

p.fetch_data()

That’s it! The poets.poet.Poet.fetch_data method will now go through all defined sources and start downloading and resampling the data. The resampled NetCDF file will be saved to the DATA folder within the rootpath as defined above.

The fetch_data method will download data starting from the begin_date as defined in the source up to the current date. If you want to download and resample only a specific time period, you can do so by calling the method with the parameters begin and end.

In[5]:

# Download and resample data for January 2000:
p.fetch_data(begin=datetime(2000,1,1), end=datetime(2000,1,31))

# Download and resample data from 2005 on:
p.fetch_data(begin=datetime(2005,1,1))

# Download and resample data until the end of 2004:
p.fetch_data(end=datetime(2004,12,31))

By default, downloaded raw data will be kept in the TMP folder. However, if you do not need this data, you can delete it by setting the delete_rawdata flag as follows:

In[6]:

# Delete rawdata after resampling
p.fetch_data(delete_rawdata=True)

Download and resample data from sources individually

The poets.poet.Poet.fetch_data method downloads and resamples data from all sources. However, if you want to fetch data from only one source, you can do so by calling the poets.io.source_base.BasicSource.download_and_resample method. This method can be called from the poets.poet.Poet class by accessing the source as follows:

In[7]:

p.sources['MODIS_LST'].download_and_resample()

You can use the parameters begin, end and delete_rawdata as described in Download and resample data.

Download only

If you only want to download the data without resampling it, you can do so by calling the poets.io.source_base.BasicSource.download method. You can use the parameters begin, end and delete_rawdata as described in Download and resample data.

In[8]:

# for all sources:
p.download()

# for an individual source:
p.sources['MODIS_LST'].download()

Resampling only

If you already downloaded data manually and only want to resample it, you can do so by calling the poets.io.source_base.BasicSource.resample method. You can use the parameters begin, end and delete_rawdata as described in Download and resample data.

In[9]:

# for all sources:
p.resample()

# for an individual source:
p.sources['MODIS_LST'].resample()

Finding and closing gaps in data

Data may be temporarily unavailable at a data repository, which can result in gaps in the resampled data. poets can detect and attempt to fill these gaps with poets.poet.Poet.fill_gaps for all sources, or with poets.io.source_base.BasicSource.fill_gaps for an individual source.

In[10]:

# for all sources:
p.fill_gaps()

# for an individual source:
p.sources['MODIS_LST'].fill_gaps()
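
To clarify what a gap means for a daily source, the following sketch lists the days of a period for which no data is available. It is only an illustration of the idea and not the way poets implements fill_gaps. The function find_gaps and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def find_gaps(available, begin, end):
    """Return every day between begin and end (inclusive) missing from available."""
    have = set(available)
    n_days = (end - begin).days + 1
    expected = (begin + timedelta(days=i) for i in range(n_days))
    return [day for day in expected if day not in have]


# hypothetical list of days for which data was successfully downloaded
available = [datetime(2000, 1, 1), datetime(2000, 1, 2), datetime(2000, 1, 5)]

for day in find_gaps(available, datetime(2000, 1, 1), datetime(2000, 1, 5)):
    print(day.date())
```

In this sketch, 3 and 4 January 2000 are reported as gaps.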

Reading and plotting images

This example shows how to read and plot images from the resampled NetCDF file. It assumes that we have already set up a Poet class as given in Setting up a Poet base class and added the source MODIS_LST as given in Adding a source. For plotting, we will use Matplotlib.

Reading the image can be done with the poets.poet.Poet.read_image method. Please note that this method returns the image as a numpy array together with the longitudes and latitudes of the image and its metadata.

In[11]:

import matplotlib.pyplot as plt

image, lon, lat, metadata = p.read_image('MODIS_LST', datetime(2000, 5, 31))

plt.figure()
plt.imshow(image)
plt.show()

In this example we plot the image for the last dekad in May 2000.

_images/read_img_austria.png

If poets is set up for multiple regions and/or the defined source has multiple variables, the parameters region and variable must be set. See poets.poet.Poet.read_image for more information.

Reading and plotting time series

This example shows how to read and plot time series data from the resampled NetCDF file. It assumes that we have already set up a Poet class as given in Setting up a Poet base class and added the source MODIS_LST as given in Adding a source. For plotting, we will use Matplotlib.

Reading the time series can be done with the poets.poet.Poet.read_timeseries method. You can read time series only for gridpoints within the defined region(s). To get a list of available gridpoints, you can call poets.poet.Poet.get_gridpoints. It is also possible to call the read_timeseries method with longitude/latitude values. In this case, the location parameter must be given as a (longitude, latitude) tuple.

In[12]:

import matplotlib.pyplot as plt

# Get a list of valid gridpoints
gridpoints = p.get_gridpoints()

# Reading the time series for point 1632
ts = p.read_timeseries('MODIS_LST', 1632)

# Reading the time series with given lon/lat values:
ts = p.read_timeseries('MODIS_LST', (15.391416550, 48.497042624))

# Plot time series
ts.plot()
plt.show()

And this is what the result will look like:

_images/read_ts_austria.png

If poets is set up for multiple regions and/or the defined source has multiple variables, the parameters region and variable must be set. See poets.poet.Poet.read_timeseries for more information.

Web Interface

Once poets is set up and the data is downloaded and resampled, it is time to check out the built-in web interface. All you have to do is call the poets.poet.Poet.start_app method.

In[13]:

p.start_app()

By default, the app will run on host 127.0.0.1 and port 5000. However, other values can be set with the keywords host and port.

In[14]:

p.start_app(host='111.222.3.44', port=1234)

Using colorbars and units

The default colorbar used for displaying images is Matplotlib's jet. You can choose any other Matplotlib colormap by setting the colorbar parameter of poets.poet.Poet.add_source. Further, if the physical unit of the dataset is not given in its metadata, you can set the unit manually with the unit parameter. In our example, the unit would be degree Celsius.

In[15]:

# setting the colorbar and unit for a source:
p.add_source(..., colorbar='Blues', unit='degree celsius')

Scaling data

By default, poets scales data if a corresponding parameter is given in the metadata of the source data. Sometimes this information is missing although the data is scaled. In our example, each MODIS_LST file contains values between 0 and 255, where 255 represents the NaN value. In this case, we need to set the parameters nan_value and data_range when adding a source with poets.poet.Poet.add_source. Further, we need to set the valid_range parameter to scale the dataset to its actual value range between -25°C and 45°C.

In[16]:

p.add_source(..., nan_value=255, data_range=(0, 254), valid_range=(-25, 45))
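
The effect of these parameters can be sketched as a linear mapping from data_range onto valid_range, with the NaN value masked out. This assumes a simple linear scaling, and the exact formula used inside poets may differ. The function scale_value is a hypothetical helper and not part of poets.

```python
import math


def scale_value(raw, nan_value=255, data_range=(0, 254), valid_range=(-25, 45)):
    """Map raw linearly from data_range onto valid_range; nan_value becomes NaN."""
    if raw == nan_value:
        return math.nan
    d_min, d_max = data_range
    v_min, v_max = valid_range
    return v_min + (raw - d_min) * (v_max - v_min) / (d_max - d_min)


# lower bound, midpoint, upper bound and the NaN marker
for raw in (0, 127, 254, 255):
    print(raw, scale_value(raw))
```

Under this assumption, the raw values 0, 127 and 254 map to -25.0, 10.0 and 45.0 °C, and 255 is treated as missing.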

Using custom region names

To display custom region names instead of the FIPS code, the name of each region can be overwritten with the poets.poet.Poet region_names parameter. This parameter must be given as a list of the same length as the regions parameter.

Complete Example

In[17]:

import os
from datetime import datetime
from poets.poet import Poet

# poets attributes:
rootpath = os.path.join('D:\\', 'Test') # Wherever the data should be stored
regions = ['AU'] # clipping to Austria
region_names = ['Austria']
spatial_resolution = 0.1
temporal_resolution = 'dekad'
start_date = datetime(2000, 1, 1)
nan_value = -99

# initializing Poet class:
p = Poet(rootpath, regions, spatial_resolution, temporal_resolution,
         start_date, nan_value, region_names=region_names)

# setting source attributes:
name = 'MODIS_LST'
filename = "MOD11C1_D_LSTDA_{YYYY}_{MM}-{DD}.png"
filedate = {'YYYY': (16, 20), 'MM': (21, 23), 'DD': (24, 26)}
temp_res = 'daily'
host = "neoftp.sci.gsfc.nasa.gov"
protocol = 'FTP'
directory = "/gs/MOD11C1_D_LSTDA/"
begin_date = datetime(2000, 2, 24)
nan_value = 255
data_range = (0, 254)
valid_range = (-25, 45)
unit = "degree Celsius"

# adding the source
p.add_source(name, filename, filedate, temp_res, host, protocol,
             directory=directory, begin_date=begin_date,
             nan_value=nan_value, valid_range=valid_range,
             data_range=data_range, unit=unit)


# get the data (in this example from the beginning of 2014)
p.fetch_data(begin=datetime(2014,1,1))


# start the web interface
p.start_app()