obsplus.WaveBank

class obsplus.WaveBank(base_path='.', path_structure=None, name_structure=None, cache_size=5, format='mseed', ext=None, executor=None)

A class to interact with a directory of waveform files.

WaveBank recursively reads each file in a directory and creates an index to allow the files to be efficiently queried.

Implements a superset of the WaveformClient interface.

Parameters:

base_path (str) – The path to the directory containing waveform files. If it does not exist, an empty directory will be created.
path_structure (str) –
Defines the directory structure used when putting waveforms into the bank. Directory levels are separated by /, regardless of operating system. The following words can be used in curly braces as data-specific variables:

year, month, day, julday, hour, minute, second, network, station, location, channel, time

Example: streams/{year}/{month}/{day}/{network}/{station}
If no structure is provided, it will be read from the index; if no index exists, the default is {net}/{sta}/{chan}/{year}/{month}/{day}
name_structure (str) – The same as path_structure but for the file name. Supports the same variables but requires a period as the separation character. The default extension (.mseed) will be added. The default is {time}. Example: {seedid}.{time}
cache_size (int) – The number of queries to store. Avoids having to read the index of the bank multiple times for queries involving the same start and end times.
format (str) – The expected format for the waveform files. Any format supported by obspy.read is permitted. The default is mseed. Other formats will be tried after the default parser fails.
ext (str or None) – The extension of the waveform files. If provided, only files with this extension will be read.
executor (Optional[Executor]) – An executor with the same interface as concurrent.futures.Executor. The map method of the executor will be used for reading files and updating indices.
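
The way a path_structure or name_structure template expands can be sketched with plain string formatting. This is a minimal illustration, not the WaveBank implementation: the helper `expand_structure`, the zero-padding of the date fields, and the rendering of `{time}` are all assumptions made for the example, and the SEED id and start time are made-up values.

```python
from datetime import datetime


def expand_structure(structure, seed_id, time):
    """Fill a curly-brace template with values from one trace (hypothetical helper)."""
    network, station, location, channel = seed_id.split(".")
    fields = {
        # Zero-padded date and time parts; the padding is an assumption.
        "year": f"{time.year:04d}",
        "month": f"{time.month:02d}",
        "day": f"{time.day:02d}",
        "julday": f"{time.timetuple().tm_yday:03d}",
        "hour": f"{time.hour:02d}",
        "minute": f"{time.minute:02d}",
        "second": f"{time.second:02d}",
        "network": network,
        "station": station,
        "location": location,
        "channel": channel,
        # How {time} is rendered is an assumption for this sketch.
        "time": time.strftime("%Y-%m-%dT%H-%M-%S"),
        "seedid": seed_id,
    }
    return structure.format(**fields)


start = datetime(2017, 9, 18, 4, 30, 0)  # made-up trace start time
directory = expand_structure(
    "streams/{year}/{month}/{day}/{network}/{station}", "UU.NOQ..HHZ", start
)
# The name uses periods as separators; the default extension is appended.
file_name = expand_structure("{seedid}.{time}", "UU.NOQ..HHZ", start) + ".mseed"
print(directory)
print(file_name)
```

Keeping the directory template and the file-name template separate mirrors the two parameters above: the first decides where a file lands, the second what it is called.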

Examples

>>> # --- Create a `WaveBank` from a path to a directory with waveform files.
>>> import obsplus
>>> import obspy
>>> waveform_path = obsplus.copy_dataset('default_test').waveform_path
>>> # init a WaveBank and index the files.
>>> wbank = obsplus.WaveBank(waveform_path).update_index()

>>> # --- Retrieve a stream object from the bank.
>>> # Load all Z component data (don't do this for large datasets!)
>>> st = wbank.get_waveforms(channel='*Z')
>>> assert isinstance(st, obspy.Stream) and len(st) == 1

>>> # --- Read the index used by WaveBank as a DataFrame.
>>> df = wbank.read_index()
>>> assert len(df) == 3, 'there should be 3 traces in the bank.'

>>> # --- Get availability of archive as dataframe
>>> avail = wbank.get_availability_df()

>>> # --- Get table of gaps in the archive
>>> gaps_df = wbank.get_gaps_df()

>>> # --- yield 5 sec contiguous streams with 1 sec overlap (6 sec total)
>>> # get input parameters
>>> t1, t2 = avail.iloc[0]['starttime'], avail.iloc[0]['endtime']
>>> kwargs = dict(starttime=t1, endtime=t2, duration=5, overlap=1)
>>> # init list for storing output
>>> out = []
>>> for st in wbank.yield_waveforms(**kwargs):
...     out.append(st)
>>> assert len(out) == 6

>>> # --- Put a new stream into the bank
>>> # get a stream from another dataset
>>> ds = obsplus.load_dataset('bingham_test')
>>> query_kwargs = dict(station='NOQ', channel='*Z')
>>> new_st = ds.waveform_client.get_waveforms(**query_kwargs)
>>> assert len(new_st)
>>> wbank.put_waveforms(new_st)
>>> st2 = wbank.get_waveforms(channel='*Z')
>>> assert len(st2) > 1, 'the bank should now also return the new Z traces.'
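
The duration and overlap arithmetic in the yield_waveforms example above can be sketched without any waveform data. This is an illustration under stated assumptions, not the library's code: `iter_windows` is a hypothetical helper, the overlap is assumed to be appended to the end of each window, the last window is assumed to be clipped to the end time, and the 30 second span is a made-up value.

```python
def iter_windows(start, end, duration, overlap=0.0):
    """Yield (t1, t2) pairs covering [start, end] (hypothetical helper).

    Each window begins `duration` seconds after the previous one and is
    `duration + overlap` seconds long, clipped to `end`.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    t1 = start
    while t1 < end:
        yield t1, min(t1 + duration + overlap, end)
        t1 += duration


# A made-up 30 second span, split into 5 second windows with 1 second overlap.
windows = list(iter_windows(0.0, 30.0, duration=5, overlap=1))
for t1, t2 in windows:
    print(t1, t2)
```

Under these assumptions every window except the last is 6 seconds long, which matches the "6 sec total" noted in the example comment.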

__init__(base_path='.', path_structure=None, name_structure=None, cache_size=5, format='mseed', ext=None, executor=None)

Methods

`__init__`([base_path, path_structure, ...])
`availability`([network, station, location, ...])	Get availability for a given group of instruments.
`clear_cache`()	Clear the index cache if the bank is using one.
`ensure_bank_path_exists`([create])	Ensure the bank_path exists else raise a BankDoesNotExistError.
`get_availability_df`(*args, **kwargs)	Return a dataframe specifying the availability of the archive.
`get_gaps_df`(*args[, min_gap])	Return a dataframe containing an entry for every gap in the archive.
`get_progress_bar`([bar])	Return a progress bar instance based on bar parameter.
`get_segments_df`(*args, **kwargs)	Return a dataframe of contiguous segments for the selected channels.
`get_service_version`()	Return the version of obsplus used to create index.
`get_uptime_df`(*args, **kwargs)	Return a dataframe with uptime stats for selected channels.
`get_waveforms`([network, station, location, ...])	Get waveforms from the bank.
`get_waveforms_bulk`(bulk[, index])	Get a large number of waveforms with a bulk request.
`load_example_bank`([dataset, path])	Create an example bank which is safe to modify.
`put_waveforms`(stream[, name, update_index])	Add the waveforms in a stream to the bank.
`read_index`([network, station, location, ...])	Return a dataframe of the index, optionally applying filters.
`update_index`([bar, paths])	Iterate files in bank and add any modified since last update to index.
`yield_waveforms`([network, station, ...])	Yield time-series segments.
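
As a rough illustration of what a gap table such as the one from `get_gaps_df` describes, the sketch below finds gaps between time segments of a single channel. It is not the library's algorithm: `find_gaps` is a hypothetical helper, the segments are made-up numbers in seconds, and treating `min_gap` as a strict lower bound on the reported gap length is an assumption.

```python
def find_gaps(segments, min_gap=0.0):
    """Return (gap_start, gap_end) pairs between segments (hypothetical helper).

    `segments` is an iterable of (start, end) pairs for one channel.
    Only gaps longer than `min_gap` are reported.
    """
    ordered = sorted(segments)
    gaps = []
    if not ordered:
        return gaps
    covered_until = ordered[0][1]
    for start, end in ordered[1:]:
        if start - covered_until > min_gap:
            gaps.append((covered_until, start))
        # Overlapping segments extend coverage rather than create gaps.
        covered_until = max(covered_until, end)
    return gaps


# Made-up segments, in seconds, for one channel.
segments = [(0, 10), (10, 20), (25, 30), (27, 40), (41, 50)]
all_gaps = find_gaps(segments)
long_gaps = find_gaps(segments, min_gap=2)
print(all_gaps)
print(long_gaps)
```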

Attributes

`bank_path`
`buffer`
`columns_no_path`
`executor`
`ext`
`hdf_kwargs`	A dict of hdf_kwargs to pass to PyTables
`index_columns`
`index_ints`
`index_name`
`index_path`	Return the expected path to the index file.
`index_str`
`last_updated`	Get the last time (UTC) that the bank was updated.
`last_updated_timestamp`	Return the last modified time stored in the index, else None.
`metadata_columns`
`min_itemsize`
`name_structure`
`namespace`
`path_structure`
