poets.io package

Submodules

poets.io.download module

Provides download functions for FTP/SFTP, HTTP and local data sources.

poets.io.download.download_ftp(download_path, host, directory, filedate, port=21, username='', password='', dirstruct=None, ffilter='', begin=None, end=None)[source]

Downloads data via FTP.

download_path : str
Path where to save the downloaded files.
host : str
Link to host.
directory : str
Path to data on host.
filedate : dict
Dict which points to the date fields in the filename
port : int, optional
Port to host, defaults to 21.
username : str, optional
Username for source, defaults to empty str.
password : str, optional
Password for source, defaults to empty str.
dirstruct : list of str, optional
Folder structure on host, each list element represents a subdirectory.
ffilter : str, optional
Used for filtering files on a server, defaults to empty str.
begin : datetime, optional
Set either to first date of remote repository or date of last file in local repository.
end : datetime, optional
Date until which data should be downloaded.
bool
True if data is available, False if not.
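The role of dirstruct can be illustrated with a small, standalone sketch. It assumes that each list element is either a date placeholder or a fixed folder name; the placeholder tokens `YYYY`, `MM` and `DD` and the helper `remote_dir` are hypothetical and are not part of poets.

```python
import posixpath
from datetime import datetime

# Hypothetical placeholder tokens; the tokens poets actually accepts may differ.
TOKEN_FORMATS = {"YYYY": "%Y", "MM": "%m", "DD": "%d"}


def remote_dir(directory, dirstruct, date):
    """Build the remote directory for `date` (illustrative helper only)."""
    parts = [directory]
    for level in dirstruct or []:
        fmt = TOKEN_FORMATS.get(level)
        # Levels that are not date tokens are kept as fixed folder names.
        parts.append(date.strftime(fmt) if fmt else level)
    return posixpath.join(*parts)


print(remote_dir("/pub/data", ["YYYY", "MM"], datetime(2015, 8, 13)))
```

With these assumptions, a source organised in yearly and monthly folders resolves to one directory per month.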
poets.io.download.download_http(download_path, host, directory, filename, filedate, dirstruct, ffilter=None, begin=None, end=datetime.datetime(2015, 8, 13, 9, 36, 17, 127286))[source]

Downloads data via HTTP.

download_path : str
Path where to save the downloaded files.
host : str
Link to host.
directory : str
Path to data on host.
filename : str
Structure/convention of the file name.
filedate : dict
Dict which points to the date fields in the filename.
dirstruct : list of str
Folder structure on host, each list element represents a subdirectory.
ffilter : str, optional
Used for filtering files on a server, defaults to None.
begin : datetime, optional
Set either to first date of remote repository or date of last file in local repository.
end : datetime, optional
Date until which data should be downloaded.
bool
True if data is available, False if not.
poets.io.download.download_local(download_path, directory, filedate, dirstruct=None, ffilter='', begin=datetime.datetime(1900, 1, 1, 0, 0), end=datetime.datetime(2015, 8, 13, 9, 36, 17, 127300))[source]

Downloads data from a local path.

download_path : str
Path where to save the downloaded files.
directory : str
Path to locally stored data.
filedate : dict
Dict which points to the date fields in the filename.
dirstruct : list of str, optional
Folder structure in directory, each list element represents a subdirectory.
ffilter : str, optional
Used for filtering files in the directory, defaults to empty str.
begin : datetime, optional
Set either to first date of remote repository or date of last file in local repository, defaults to datetime(1900, 1, 1).
end : datetime, optional
Date until which data should be downloaded, defaults to datetime.now().
bool
True if data is available, False if not.
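How ffilter, begin and end narrow down the files of a local source could look roughly like the sketch below. It is not the poets implementation: the helper `list_local_files` is hypothetical, and it assumes the date is stored as `YYYYMMDD` at a fixed position in the file name, passed as a `slice`.

```python
import os
from datetime import datetime


def list_local_files(directory, date_slice, ffilter="", begin=None, end=None):
    """Return sorted paths under `directory` whose file-name date is in range."""
    begin = begin or datetime(1900, 1, 1)
    end = end or datetime.now()
    selected = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if ffilter not in name:
                continue  # name does not contain the filter pattern
            try:
                date = datetime.strptime(name[date_slice], "%Y%m%d")
            except ValueError:
                continue  # no parsable date at the assumed position
            if begin <= date <= end:
                selected.append(os.path.join(root, name))
    return sorted(selected)
```

For a file such as `data_20150813.nc`, the call would use `date_slice=slice(5, 13)`.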
poets.io.download.download_sftp(download_path, host, directory, port, username, password, filedate, dirstruct=None, ffilter='', begin=None, end=None)[source]

Downloads data via SFTP.

download_path : str
Path where to save the downloaded files.
host : str
Link to host.
directory : str
Path to data on host.
port : int
Port to host.
username : str
Username for source.
password : str
Password for source.
filedate : dict
Dict which points to the date fields in the filename.
dirstruct : list of str, optional
Folder structure on host, each list element represents a subdirectory.
ffilter : str, optional
Used for filtering files on a server, defaults to empty str.
begin : datetime, optional
Set either to first date of remote repository or date of last file in local repository.
end : datetime, optional
Date until which data should be downloaded.
bool
True if data is available, False if not.
poets.io.download.filesInDir_ftp(path, ftp, filedate, begin, end, filelist)[source]

List all files in directory and subdirectories on an FTP server.

path : str
Path to data on host.
ftp : ftplib connection
Connection to Server.
filedate : dict
Dict which points to the date fields in the filename.
begin : datetime
Date from which to download data.
end : datetime
Date until which to download data.
filelist : list of str
List of filepaths or empty list.
filelist : list
List containing all files in directory and subdirectories.
poets.io.download.filesInDir_sftp(path, sftp, filedate, begin, end, filelist)[source]

List all files in directory and subdirectories on an SFTP server.

path : str
Path to data on host.
sftp : paramiko Transport
Connection to Server.
filedate : dict
Dict which points to the date fields in the filename.
begin : datetime
Date from which to download data.
end : datetime
Date until which to download data.
filelist : list of str
List of filepaths or empty list.
filelist : list
List containing all files in directory and subdirectories.
poets.io.download.get_file_date(fname, fdate)[source]

Gets the date from a file name.

fname : str
Filename.
fdate : dict
Structure of the date in the filename; dict which points to the date fields in the filename.
datetime
Date and, if given, time from filename.
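A minimal sketch of this kind of lookup is given below. It assumes that the dict maps field names to `(start, stop)` index pairs; the keys `YYYY`, `MM`, `DD`, `hh` and `mm` and the helper `parse_file_date` are illustrative assumptions, not the poets API.

```python
from datetime import datetime


def parse_file_date(fname, fdate):
    """Extract a datetime from `fname` using (start, stop) index pairs."""

    def field(key, default):
        # Fields missing from the dict fall back to a default value.
        if key not in fdate:
            return default
        start, stop = fdate[key]
        return int(fname[start:stop])

    return datetime(
        field("YYYY", 1900), field("MM", 1), field("DD", 1),
        field("hh", 0), field("mm", 0),
    )


fdate = {"YYYY": (5, 9), "MM": (9, 11), "DD": (11, 13)}
print(parse_file_date("data_20150813.nc", fdate))
```

Time fields are optional in this sketch, which mirrors the "date and, if given, time" wording of the return value.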

poets.io.fileformats module

poets.io.fileformats.check_supported(filename)[source]

Checks if file is in supported format.

filename : str
Filename or filepath.
bool
True if supported, False if not.
poets.io.fileformats.select_file(filelist)[source]

Selects a file out of a list of files, based on their extension.

filelist : list of str
List containing filepaths.
filename : str
Filepath of selected file.
IOError :
If filelist contains no supported file format.
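Extension-based selection of this kind can be sketched as follows. The extension list and its order of preference are assumptions made for the example; the formats poets actually supports are not listed in this section, and `is_supported` and `pick_file` are hypothetical names.

```python
import os

# Assumed extensions, ordered by preference; not taken from poets.
SUPPORTED_EXTENSIONS = (".nc", ".tif", ".png")


def is_supported(filename):
    """Return True if the file extension is in the assumed list."""
    return os.path.splitext(filename)[1].lower() in SUPPORTED_EXTENSIONS


def pick_file(filelist):
    """Return the first path matching the most preferred extension."""
    for ext in SUPPORTED_EXTENSIONS:
        for path in filelist:
            if os.path.splitext(path)[1].lower() == ext:
                return path
    raise IOError("no supported file format in file list")
```

Iterating over the extensions first, rather than over the files, is what makes the preference order take effect.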

poets.io.source_base module

class poets.io.source_base.BasicSource(name, filename, filedate, temp_res, rootpath, host, protocol, username=None, password=None, port=22, directory=None, dirstruct=None, regions=None, begin_date=None, ffilter=None, colorbar='jet', variables=None, nan_value=None, valid_range=None, unit=None, dest_nan_value=-99, dest_regions=None, dest_sp_res=0.25, dest_temp_res='dekad', dest_start_date=datetime.datetime(2000, 1, 1, 0, 0), data_range=None, src_file=None, labels=None, xticks=None)[source]

Bases: object

Base Class for data sources.

name : str
Name of the data source.
filename : str
Structure/convention of the file name.
filedate : dict
Positions of the date fields in the filename, each given as a tuple.
temp_res : str
Temporal resolution of the source.
rootpath : str
Root path where all data will be stored.
host : str
Link to data host.
protocol : str
Protocol for data transfer.
username : str, optional
Username for data access.
password : str, optional
Password for data access.
port : int, optional
Port to data host, defaults to 22.
directory : str, optional
Path to data on host.
dirstruct : list of strings, optional
Structure of source directory, each list item represents a subdirectory.
regions : list of str, optional
List of regions where data from source is available. Uses all regions specified in dest_regions if not set.
begin_date : datetime, optional
Date from which data is available.
variables : string or list of strings, optional
Variables used from data source, defaults to [‘dataset’].
nan_value : int, float, optional
NaN value of the original data as given by the data provider.
valid_range : tuple of int or float, optional
Valid range of data, given as (minimum, maximum).
data_range : tuple of int or float, optional
Range of the values as given in the raw data (minimum, maximum). Will be scaled to valid_range.
ffilter : str, optional
Pattern that appears in the filename. Can be used to filter out unneeded files if multiple files per date are provided.
colorbar : str, optional
Colorbar to use, use one from http://matplotlib.org/examples/color/colormaps_reference.html, defaults to jet.
labels : list, optional
Custom tick-labels for the legend in the web-app; must have same dimension as xticks and only works if xticks is set; Defaults to None.
xticks : list of int or float, optional
Custom tick locations for the legend in the web-app; must have same dimension as labels and only works if labels is set; Defaults to None.
unit : str, optional
Unit of dataset for displaying in legend. Does not have to be set if unit is specified in input file metadata. Defaults to None.
dest_nan_value : int, float, optional
NaN value in the final NetCDF file.
dest_regions : list of str, optional
Regions of interest where data should be resampled to.
dest_sp_res : int, float, optional
Spatial resolution of the destination NetCDF file, defaults to 0.25 degree.
dest_temp_res : string, optional
Temporal resolution of the destination NetCDF file, possible values: (‘day’, ‘week’, ‘dekad’, ‘month’), defaults to dekad.
dest_start_date : datetime, optional
Start date of the destination NetCDF file, defaults to 2000-01-01.
src_file : dict of str, optional
Path to file that contains source. Uses default NetCDF file if None. Key of dict must be regions as set in regions attribute.
name : str
Name of the data source.
filename : str
Structure/convention of the file name.
filedate : dict
Positions of the date fields in the filename, each given as a tuple.
temp_res : str
Temporal resolution of the source.
host : str
Link to data host.
protocol : str
Protocol for data transfer.
username : str
Username for data access.
password : str
Password for data access.
port : int
Port to data host.
directory : str
Path to data on host.
dirstruct : list of strings
Structure of source directory, each list item represents a subdirectory.
regions : list of str
List of regions where data from source is available.
begin_date : datetime
Date from which data is available.
ffilter : str
Pattern that appears in filename.
colorbar : str, optional
Colorbar to use.
labels : list
Custom tick-labels for the legend in the web-app.
xticks : list of int or float
Custom tick locations for the legend in the web-app.
unit : str
Unit of dataset for displaying in legend.
variables : list of strings
Variables used from data source.
nan_value : int, float
Not a number value of the original data as given by the data provider.
valid_range : tuple of int or float
Valid range of data, given as (minimum, maximum).
data_range : tuple of int or float
Range of the values as given in the raw data (minimum, maximum).
dest_nan_value : int, float, optional
NaN value in the final NetCDF file.
tmp_path : str
Path where temporary files are stored.
rawdata_path : str
Path where original files are stored.
data_path : str
Path where resampled NetCDF file is stored.
dest_regions : list of str
Regions of interest where data is resampled to.
dest_sp_res : int, float
Spatial resolution of the destination NetCDF file.
dest_temp_res : string
Temporal resolution of the destination NetCDF file.
dest_start_date : datetime.datetime
First date of the dataset in the destination NetCDF file.
src_file : str, list of str
Path to file that contains source.
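The relation between data_range and valid_range can be pictured as a linear rescaling. The sketch below assumes a simple linear mapping, which this section does not confirm; `rescale` is a hypothetical helper and not a poets function.

```python
def rescale(value, data_range, valid_range):
    """Map `value` linearly from data_range onto valid_range (assumed behaviour)."""
    src_min, src_max = data_range
    dst_min, dst_max = valid_range
    fraction = (value - src_min) / float(src_max - src_min)
    return dst_min + fraction * (dst_max - dst_min)


# Raw values stored as 0..250 mapped onto a valid range of 0..1.
print(rescale(125, (0, 250), (0.0, 1.0)))
```

Under this assumption, raw data stored as integers can be presented in physical units without changing the source files.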
check_variable(variable)[source]

Checks if a variable exists in a source and returns its correct name.

variable : str
Variable to check.
varname : str
Name of the variable in the source.
download(download_path=None, begin=None, end=None)[source]

Downloads data.

begin : datetime, optional
Start date of download, defaults to None.
end : datetime, optional
End date of download, defaults to None.
check : bool, string
True if download was complete, False if no data were available, Error Message if connection to resource failed.
download_and_resample(download_path=None, begin=None, end=None, delete_rawdata=False, shapefile=None)[source]

Downloads and resamples data.

download_path : str
Path where to save the downloaded files.
begin : datetime.date, optional
Set either to first date of remote repository or date of last file in local repository.
end : datetime.date, optional
Set to today if none given.
delete_rawdata : bool, optional
Original files will be deleted from rawdata_path if set True.
shapefile : str, optional
Path to shape file, uses “world country admin boundary shapefile” by default.
fill_gaps(begin=None, end=None)[source]

Detects gaps in data and tries to fill them by downloading and resampling the data within these periods.

begin : datetime
Begin date of interval to check, defaults to None.
end : datetime
End date of interval to check, defaults to None.
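Gap detection of the kind described here can be sketched for the daily case as follows. This is an illustration only, assuming the available dates are already known as a list; `find_gaps` is a hypothetical helper and does not download or resample anything.

```python
from datetime import date, timedelta


def find_gaps(available, begin, end):
    """Return (first_missing, last_missing) pairs for daily data."""
    have = set(available)
    gaps, start = [], None
    day = begin
    while day <= end:
        if day not in have:
            if start is None:
                start = day  # a new gap opens here
        elif start is not None:
            gaps.append((start, day - timedelta(days=1)))
            start = None
        day += timedelta(days=1)
    if start is not None:
        gaps.append((start, end))  # gap runs to the end of the interval
    return gaps
```

Each returned pair could then serve as the begin and end of a follow-up download and resampling run.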
get_variables()[source]

Gets all variables given in the NetCDF file.

variables : list of str
Variables given in the NetCDF file.
read_img(date, region=None, variable=None, scaled=True)[source]

Gets an image from the NetCDF file for a certain date.

date : datetime
Date of the image.
region : str, optional
Region of interest, set to first defined region if not set.
variable : str, optional
Variable to display, selects the first available variable if None.
scaled : bool, optional
If true, data will be scaled to a predefined range; if false, data will be shown as given in rawdata file; defaults to True.
img : numpy.ndarray
Image of selected date.
lon : numpy.array
Array with longitudes.
lat : numpy.array
Array with latitudes.
metadata : dict
Dictionary containing metadata of the variable.
read_ts(location, region=None, variable=None, shapefile=None, scaled=True)[source]

Gets timeseries from netCDF file for a gridpoint.

location : int or tuple of floats
Either the grid point index as an integer or longitude/latitude as a tuple.
region : str, optional
Region of interest, set to first defined region if not set.
variable : str, optional
Variable to display, selects all available variables if None.
shapefile : str, optional
Path to custom shapefile.
scaled : bool, optional
If true, data will be scaled to a predefined range; if false, data will be shown as given in rawdata file; defaults to True.
df : pd.DataFrame
Timeseries for selected variables.
resample(begin=None, end=None, delete_rawdata=False, shapefile=None, stepwise=True)[source]

Resamples source data to given spatial and temporal resolution.

Writes resampled images into a netCDF data file. Deletes original files if flag delete_rawdata is set True.

begin : datetime
Start date of resampling.
end : datetime
End date of resampling.
delete_rawdata : bool
Original files will be deleted from rawdata_path if set ‘True’.
shapefile : str, optional
Path to shape file, uses “world country admin boundary shapefile” by default.
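For the 'dekad' temporal resolution, a common convention splits each month into three periods ending on the 10th, the 20th and the last day of the month. The sketch below assumes that convention; whether poets timestamps its dekads exactly this way is not stated in this section, and `dekad_end_dates` is a hypothetical helper.

```python
import calendar
from datetime import date


def dekad_end_dates(year, month):
    """Return the three dekad end dates of a month (assumed convention)."""
    last_day = calendar.monthrange(year, month)[1]
    return [
        date(year, month, 10),
        date(year, month, 20),
        date(year, month, last_day),
    ]


print(dekad_end_dates(2015, 2))
```

The third dekad therefore varies in length between 8 and 11 days, depending on the month.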

poets.io.unpack module

Module for unpacking compressed archives. Based on pyunpack and patool.

poets.io.unpack.check_compressed(filepath)[source]

Checks if a file is compressed using the file extension.

filepath : string
Path to input file.
boolean
True if compressed, False if not.
poets.io.unpack.flatten(outpath)[source]

Flattens directory structure.

outpath : str
Directory to flatten.
OSError :
If file cannot be moved.
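Flattening a directory tree can be sketched with the standard library alone. The helper `flatten_dir` below is an illustrative stand-in, not the poets implementation, and it does not handle name collisions between files from different subdirectories.

```python
import os
import shutil


def flatten_dir(outpath):
    """Move every file below `outpath` up into `outpath`, then drop subdirs."""
    # Walk bottom-up so that subdirectories are empty when they are removed.
    for root, _dirs, files in os.walk(outpath, topdown=False):
        if root != outpath:
            for name in files:
                shutil.move(os.path.join(root, name), os.path.join(outpath, name))
            os.rmdir(root)
```

As in the documented function, a file that cannot be moved surfaces as an OSError.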
poets.io.unpack.unpack(filepath, outpath=None)[source]

Unpacks compressed archives and files recursively and flattens the output.

filepath : str
Path to zipped archive.
outpath : str
Path where decompressed files will be stored.
flatten : bool, optional
If True, output dir will be flattened.
IOError :
If input file does not exist.
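The basic unpacking step can be approximated with the standard library instead of pyunpack and patool. The sketch below is a simplified stand-in under that assumption: `unpack_archive` is a hypothetical helper that handles only the formats known to `shutil`, and it neither recurses into nested archives nor flattens the output.

```python
import os
import shutil


def unpack_archive(filepath, outpath=None):
    """Unpack `filepath` into `outpath` (defaults to the archive's folder)."""
    if not os.path.isfile(filepath):
        raise IOError("input file does not exist: %s" % filepath)
    outpath = outpath or os.path.dirname(os.path.abspath(filepath))
    shutil.unpack_archive(filepath, outpath)
    return outpath
```

The explicit existence check reproduces the documented IOError for a missing input file.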

Module contents