{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# WaveBank\n",
"`WaveBank` is an in-process database for accessing seismic time-series data. Any directory structure containing ObsPy-readable waveforms can be used as the data source. `WaveBank` uses a simple indexing scheme and the [Hierarchical Data Format](https://en.wikipedia.org/wiki/Hierarchical_Data_Format) to keep track of each `Trace` in the directory. Without `WaveBank` (or a similar program) applications have to implement their own data organization and access logic, which is tedious and clutters up application code. `WaveBank` provides a better way. \n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load Example Data\n",
"This tutorial will demonstrate the use of `WaveBank` on two different [obsplus datasets](../datasets/datasets.ipynb). \n",
"\n",
"The first dataset, [Crandall Canyon](https://en.wikipedia.org/wiki/Crandall_Canyon_Mine), contains only event waveform files. The second contains only continuous data from two TA stations. We start by loading these datasets, making temporary copies, and getting paths to their waveform directories."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:19:58.688668Z",
"iopub.status.busy": "2024-02-28T22:19:58.688492Z",
"iopub.status.idle": "2024-02-28T22:20:00.877905Z",
"shell.execute_reply": "2024-02-28T22:20:00.877328Z"
}
},
"outputs": [],
"source": [
"%%capture\n",
"import obsplus\n",
"\n",
"# ensure the datasets are downloaded, then copy them to temporary\n",
"# directories so no accidental changes are made to the originals\n",
"crandall_dataset = obsplus.load_dataset('crandall_test').copy()\n",
"ta_dataset = obsplus.load_dataset('ta_test').copy()\n",
"\n",
"# get path to waveform directories\n",
"crandall_path = crandall_dataset.waveform_path\n",
"ta_path = ta_dataset.waveform_path"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:00.880847Z",
"iopub.status.busy": "2024-02-28T22:20:00.880262Z",
"iopub.status.idle": "2024-02-28T22:20:00.885826Z",
"shell.execute_reply": "2024-02-28T22:20:00.885263Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"PosixPath('/home/runner/opsdata/crandall_test/waveforms')"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"crandall_path"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a WaveBank object\n",
"To create a `WaveBank` instance, simply pass the class a path to the waveform directory."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:00.912860Z",
"iopub.status.busy": "2024-02-28T22:20:00.912423Z",
"iopub.status.idle": "2024-02-28T22:20:00.927601Z",
"shell.execute_reply": "2024-02-28T22:20:00.927162Z"
}
},
"outputs": [],
"source": [
"bank = obsplus.WaveBank(crandall_path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Calling the `update_index` method on the bank ensures the index is up-to-date. It iterates through all files with timestamps later than the last time `update_index` was run.\n",
"\n",
"Note: If the index has not yet been created or new files have been added, `update_index` needs to be called."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:00.929891Z",
"iopub.status.busy": "2024-02-28T22:20:00.929709Z",
"iopub.status.idle": "2024-02-28T22:20:00.944738Z",
"shell.execute_reply": "2024-02-28T22:20:00.944248Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"WaveBank(base_path=/home/runner/opsdata/crandall_test/waveforms)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bank.update_index()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get waveforms\n",
"\n",
"Waveforms can be retrieved from the directory with the `get_waveforms` method. This method has the same signature as the ObsPy client `get_waveforms` methods, so they can be used interchangeably:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:00.946907Z",
"iopub.status.busy": "2024-02-28T22:20:00.946581Z",
"iopub.status.idle": "2024-02-28T22:20:01.014708Z",
"shell.execute_reply": "2024-02-28T22:20:01.014133Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"5 Trace(s) in Stream:\n",
"TA.O15A..BHE | 2007-08-06T01:44:48.000000Z - 2007-08-06T01:45:48.000000Z | 40.0 Hz, 2401 samples\n",
"TA.O15A..BHN | 2007-08-06T01:44:47.999998Z - 2007-08-06T01:45:47.999998Z | 40.0 Hz, 2401 samples\n",
"TA.O15A..BHZ | 2007-08-06T01:44:47.999998Z - 2007-08-06T01:45:47.999998Z | 40.0 Hz, 2401 samples\n",
"TA.O16A..BHE | 2007-08-06T01:44:48.000000Z - 2007-08-06T01:45:48.000000Z | 40.0 Hz, 2401 samples\n",
"TA.O16A..BHN | 2007-08-06T01:44:48.000000Z - 2007-08-06T01:45:48.000000Z | 40.0 Hz, 2401 samples\n"
]
}
],
"source": [
"import obspy\n",
"\n",
"t1 = obspy.UTCDateTime('2007-08-06T01-44-48')\n",
"t2 = t1 + 60\n",
"st = bank.get_waveforms(starttime=t1, endtime=t2)\n",
"print(st[:5])  # print first 5 traces"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`WaveBank` can filter on channels, locations, stations, networks, etc. using Unix shell-style wildcards or regular expressions."
]
},
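{
"cell_type": "markdown",
"metadata": {},
"source": [
"These filters behave like Python's standard-library `fnmatch` (Unix shell-style) wildcards. A minimal sketch with made-up station and channel names:\n",
"\n",
"```python\n",
"from fnmatch import fnmatch\n",
"\n",
"stations = ['O15A', 'O16A', 'P17A', 'CTU']\n",
"channels = ['BHE', 'BHN', 'BHZ', 'HHZ']\n",
"\n",
"# 'O1??' matches four-character names starting with 'O1'\n",
"print([s for s in stations if fnmatch(s, 'O1??')])  # ['O15A', 'O16A']\n",
"\n",
"# 'BH[NE]' matches BHN and BHE but not BHZ\n",
"print([c for c in channels if fnmatch(c, 'BH[NE]')])  # ['BHE', 'BHN']\n",
"```"
]
},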
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.017190Z",
"iopub.status.busy": "2024-02-28T22:20:01.016827Z",
"iopub.status.idle": "2024-02-28T22:20:01.031510Z",
"shell.execute_reply": "2024-02-28T22:20:01.030986Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"5 Trace(s) in Stream:\n",
"UU.CTU..HHE | 2007-08-06T01:44:47.994000Z - 2007-08-06T01:45:47.994000Z | 100.0 Hz, 6001 samples\n",
"UU.CTU..HHN | 2007-08-06T01:44:47.994000Z - 2007-08-06T01:45:47.994000Z | 100.0 Hz, 6001 samples\n",
"UU.CTU..HHZ | 2007-08-06T01:44:47.994000Z - 2007-08-06T01:45:47.994000Z | 100.0 Hz, 6001 samples\n",
"UU.MPU..HHE | 2007-08-06T01:44:47.992000Z - 2007-08-06T01:45:47.992000Z | 100.0 Hz, 6001 samples\n",
"UU.MPU..HHN | 2007-08-06T01:44:47.992000Z - 2007-08-06T01:45:47.992000Z | 100.0 Hz, 6001 samples\n"
]
}
],
"source": [
"st2 = bank.get_waveforms(network='UU', starttime=t1, endtime=t2)\n",
"\n",
"# ensure only UU traces were returned\n",
"for tr in st2:\n",
" assert tr.stats.network == 'UU'\n",
"\n",
"print(st2[:5]) # print first 5 traces"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.033886Z",
"iopub.status.busy": "2024-02-28T22:20:01.033469Z",
"iopub.status.idle": "2024-02-28T22:20:01.045000Z",
"shell.execute_reply": "2024-02-28T22:20:01.044472Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"6 Trace(s) in Stream:\n",
"TA.O15A..BHE | 2007-08-06T01:44:48.000000Z - 2007-08-06T01:45:48.000000Z | 40.0 Hz, 2401 samples\n",
"TA.O15A..BHN | 2007-08-06T01:44:47.999998Z - 2007-08-06T01:45:47.999998Z | 40.0 Hz, 2401 samples\n",
"TA.O16A..BHE | 2007-08-06T01:44:48.000000Z - 2007-08-06T01:45:48.000000Z | 40.0 Hz, 2401 samples\n",
"TA.O16A..BHN | 2007-08-06T01:44:48.000000Z - 2007-08-06T01:45:48.000000Z | 40.0 Hz, 2401 samples\n",
"TA.O18A..BHE | 2007-08-06T01:44:47.999998Z - 2007-08-06T01:45:47.999998Z | 40.0 Hz, 2401 samples\n",
"TA.O18A..BHN | 2007-08-06T01:44:47.999998Z - 2007-08-06T01:45:47.999998Z | 40.0 Hz, 2401 samples\n"
]
}
],
"source": [
"st = bank.get_waveforms(starttime=t1, endtime=t2, station='O1??', channel='BH[NE]')\n",
"\n",
"# test returned traces\n",
"for tr in st:\n",
" assert tr.stats.starttime >= t1 - .00001\n",
" assert tr.stats.endtime <= t2 + .00001\n",
" assert tr.stats.station.startswith('O1')\n",
" assert tr.stats.channel.startswith('BH')\n",
" assert tr.stats.channel[-1] in {'N', 'E'}\n",
"\n",
"print(st)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"WaveBank also has a `get_waveforms_bulk` method for efficiently retrieving a large number of streams. "
]
},
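{
"cell_type": "markdown",
"metadata": {},
"source": [
"In practice the bulk argument list is usually built programmatically. A sketch (the seed ids and float times below are stand-ins; real requests would use `obspy.UTCDateTime` objects):\n",
"\n",
"```python\n",
"t1, t2 = 0.0, 60.0  # stand-ins for obspy.UTCDateTime objects\n",
"seed_ids = [('TA', 'O15A', '', 'BHZ'), ('UU', 'SRU', '', 'HHZ')]\n",
"\n",
"# one (network, station, location, channel, starttime, endtime) per request\n",
"args = [(net, sta, loc, cha, t1, t2) for net, sta, loc, cha in seed_ids]\n",
"\n",
"print(len(args))  # 2\n",
"print(args[0])    # ('TA', 'O15A', '', 'BHZ', 0.0, 60.0)\n",
"```"
]
},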
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.047286Z",
"iopub.status.busy": "2024-02-28T22:20:01.046951Z",
"iopub.status.idle": "2024-02-28T22:20:01.103820Z",
"shell.execute_reply": "2024-02-28T22:20:01.103241Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2 Trace(s) in Stream:\n",
"TA.O15A..BHZ | 2007-08-06T01:44:42.999998Z - 2007-08-06T01:45:42.999998Z | 40.0 Hz, 2401 samples\n",
"UU.SRU..HHZ | 2007-08-06T01:44:47.995000Z - 2007-08-06T01:45:47.995000Z | 100.0 Hz, 6001 samples\n"
]
}
],
"source": [
"args = [ # in practice this list may contain hundreds or thousands of requests\n",
" ('TA', 'O15A', '', 'BHZ', t1 - 5, t2 - 5,),\n",
" ('UU', 'SRU', '', 'HHZ', t1, t2,),\n",
"]\n",
"st = bank.get_waveforms_bulk(args)\n",
"print(st)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Yield waveforms\n",
"The Bank class also provides a generator for iterating over large amounts of continuous waveform data. The following example shows how to get streams of one-hour duration with one minute of overlap between slices. \n",
"\n",
"The first step is to create a bank on a dataset which has continuous data. The example below will use the TA dataset."
]
},
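{
"cell_type": "markdown",
"metadata": {},
"source": [
"The slicing can be sketched with plain arithmetic. This assumes each window advances by `duration` seconds and is extended by `overlap`; the exact windowing of `yield_waveforms` may differ slightly at the archive edges:\n",
"\n",
"```python\n",
"duration, overlap = 3600, 60  # seconds\n",
"day = 24 * 3600\n",
"\n",
"starts = list(range(0, day, duration))\n",
"windows = [(start, start + duration + overlap) for start in starts]\n",
"\n",
"print(len(windows))  # 24\n",
"print(windows[0])    # (0, 3660)\n",
"```"
]
},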
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.106374Z",
"iopub.status.busy": "2024-02-28T22:20:01.106006Z",
"iopub.status.idle": "2024-02-28T22:20:01.119967Z",
"shell.execute_reply": "2024-02-28T22:20:01.119503Z"
}
},
"outputs": [],
"source": [
"ta_bank = obsplus.WaveBank(ta_path)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.122291Z",
"iopub.status.busy": "2024-02-28T22:20:01.121859Z",
"iopub.status.idle": "2024-02-28T22:20:01.542088Z",
"shell.execute_reply": "2024-02-28T22:20:01.541461Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"got 6 streams from 2007-02-15T00:00:09.999998Z to 2007-02-15T01:00:59.999998Z\n",
"got 6 streams from 2007-02-15T00:59:59.999998Z to 2007-02-15T02:00:59.999998Z\n",
"got 6 streams from 2007-02-15T01:59:59.999998Z to 2007-02-15T03:00:59.999998Z\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"got 6 streams from 2007-02-15T02:59:59.999998Z to 2007-02-15T04:00:59.999998Z\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"got 6 streams from 2007-02-15T03:59:59.999998Z to 2007-02-15T05:00:59.999998Z\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"got 6 streams from 2007-02-15T04:59:59.999998Z to 2007-02-15T06:00:59.999998Z\n",
"got 6 streams from 2007-02-15T05:59:59.999998Z to 2007-02-15T07:00:59.999998Z\n",
"got 6 streams from 2007-02-15T06:59:59.999998Z to 2007-02-15T08:00:59.999998Z\n",
"got 6 streams from 2007-02-15T07:59:59.999998Z to 2007-02-15T09:00:59.999998Z\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"got 6 streams from 2007-02-15T08:59:59.999998Z to 2007-02-15T10:00:59.999998Z\n",
"got 6 streams from 2007-02-15T09:59:59.999998Z to 2007-02-15T11:00:59.999998Z\n",
"got 6 streams from 2007-02-15T10:59:59.999998Z to 2007-02-15T12:00:59.999998Z\n",
"got 6 streams from 2007-02-15T11:59:59.999998Z to 2007-02-15T13:00:59.999998Z\n",
"got 6 streams from 2007-02-15T12:59:59.999998Z to 2007-02-15T14:00:59.999998Z\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"got 6 streams from 2007-02-15T13:59:59.999998Z to 2007-02-15T15:00:59.999998Z\n",
"got 6 streams from 2007-02-15T14:59:59.999998Z to 2007-02-15T16:00:59.999998Z\n",
"got 6 streams from 2007-02-15T15:59:59.999998Z to 2007-02-15T17:00:59.999998Z\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"got 6 streams from 2007-02-15T16:59:59.999998Z to 2007-02-15T18:00:59.999998Z"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"got 6 streams from 2007-02-15T17:59:59.999998Z to 2007-02-15T19:00:59.999998Z\n",
"got 6 streams from 2007-02-15T18:59:59.999998Z to 2007-02-15T20:00:59.999998Z\n",
"got 6 streams from 2007-02-15T19:59:59.999998Z to 2007-02-15T21:00:59.999998Z\n",
"got 6 streams from 2007-02-15T20:59:59.999998Z to 2007-02-15T22:00:59.999998Z\n",
"got 6 streams from 2007-02-15T21:59:59.999998Z to 2007-02-15T23:00:59.999998Z\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"got 6 streams from 2007-02-15T22:59:59.999998Z to 2007-02-16T00:00:59.999998Z\n"
]
}
],
"source": [
"# iterate over one day of continuous TA data\n",
"ta_t1 = obspy.UTCDateTime('2007-02-15')\n",
"ta_t2 = obspy.UTCDateTime('2007-02-16')\n",
"\n",
"for st in ta_bank.yield_waveforms(starttime=ta_t1, endtime=ta_t2, duration=3600, overlap=60):\n",
" print(f'got {len(st)} streams from {st[0].stats.starttime} to {st[0].stats.endtime}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Put waveforms\n",
"Files can be added to the bank by passing a stream or trace to the `bank.put_waveforms` method. `WaveBank` does not merge files, so overlapping data may result if care is not taken."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.544595Z",
"iopub.status.busy": "2024-02-28T22:20:01.544222Z",
"iopub.status.idle": "2024-02-28T22:20:01.587175Z",
"shell.execute_reply": "2024-02-28T22:20:01.586596Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 Trace(s) in Stream:\n",
"\n"
]
}
],
"source": [
"# show that no data for RJOB is in the bank\n",
"st = bank.get_waveforms(station='RJOB')\n",
"\n",
"assert len(st) == 0\n",
"\n",
"print(st)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.589428Z",
"iopub.status.busy": "2024-02-28T22:20:01.589242Z",
"iopub.status.idle": "2024-02-28T22:20:01.748728Z",
"shell.execute_reply": "2024-02-28T22:20:01.748167Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3 Trace(s) in Stream:\n",
"BW.RJOB..EHE | 2009-08-24T00:20:03.000000Z - 2009-08-24T00:20:32.990000Z | 100.0 Hz, 3000 samples\n",
"BW.RJOB..EHN | 2009-08-24T00:20:03.000000Z - 2009-08-24T00:20:32.990000Z | 100.0 Hz, 3000 samples\n",
"BW.RJOB..EHZ | 2009-08-24T00:20:03.000000Z - 2009-08-24T00:20:32.990000Z | 100.0 Hz, 3000 samples\n"
]
}
],
"source": [
"# add the default stream to the archive (which contains data for RJOB)\n",
"bank.put_waveforms(obspy.read())\n",
"st_out = bank.get_waveforms(station='RJOB')\n",
"\n",
"# test output\n",
"assert len(st_out)\n",
"for tr in st_out:\n",
" assert tr.stats.station == 'RJOB'\n",
"\n",
"\n",
"print(st_out)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Availability\n",
"`WaveBank` can be used to get the availability of data. The output can be either a dataframe or a list of tuples in the form [(network, station, location, channel, min_starttime, max_endtime), ...]."
]
},
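{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first four elements of each tuple form a seed id, so they can be joined directly. A sketch (the availability tuples below are hypothetical):\n",
"\n",
"```python\n",
"# hypothetical availability tuples: (net, sta, loc, cha, start, end)\n",
"availability = [\n",
"    ('TA', 'O15A', '', 'BHE', 0.0, 100.0),\n",
"    ('TA', 'O16A', '', 'BHE', 0.0, 100.0),\n",
"]\n",
"\n",
"seed_ids = ['.'.join(row[:4]) for row in availability]\n",
"print(seed_ids)  # ['TA.O15A..BHE', 'TA.O16A..BHE']\n",
"```"
]
},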
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.751180Z",
"iopub.status.busy": "2024-02-28T22:20:01.750809Z",
"iopub.status.idle": "2024-02-28T22:20:01.799647Z",
"shell.execute_reply": "2024-02-28T22:20:01.799162Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
" network station location channel starttime \\\n",
"0 TA O15A BHE 2007-08-06 01:44:38.825000 \n",
"1 TA O16A BHE 2007-08-06 01:44:38.825000 \n",
"2 TA O18A BHE 2007-08-06 01:44:38.824998 \n",
"3 TA R16A BHE 2007-08-07 02:04:54.500000 \n",
"4 TA R17A BHE 2007-08-06 01:44:38.825000 \n",
"\n",
" endtime \n",
"0 2007-08-07 21:43:51.124998 \n",
"1 2007-08-07 21:43:51.125000 \n",
"2 2007-08-07 21:43:51.125000 \n",
"3 2007-08-07 21:43:51.125000 \n",
"4 2007-08-07 21:43:51.125000 "
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# get a dataframe of availability by seed ids and timestamps\n",
"bank.get_availability_df(channel='BHE', station='[OR]*')"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.801946Z",
"iopub.status.busy": "2024-02-28T22:20:01.801586Z",
"iopub.status.idle": "2024-02-28T22:20:01.814908Z",
"shell.execute_reply": "2024-02-28T22:20:01.814310Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[('TA',\n",
" 'O15A',\n",
" '',\n",
" 'BHE',\n",
" 2007-08-06T01:44:38.825000Z,\n",
" 2007-08-07T21:43:51.124998Z),\n",
" ('TA',\n",
" 'O16A',\n",
" '',\n",
" 'BHE',\n",
" 2007-08-06T01:44:38.825000Z,\n",
" 2007-08-07T21:43:51.125000Z),\n",
" ('TA',\n",
" 'O18A',\n",
" '',\n",
" 'BHE',\n",
" 2007-08-06T01:44:38.824998Z,\n",
" 2007-08-07T21:43:51.125000Z),\n",
" ('TA',\n",
" 'R16A',\n",
" '',\n",
" 'BHE',\n",
" 2007-08-07T02:04:54.500000Z,\n",
" 2007-08-07T21:43:51.125000Z),\n",
" ('TA',\n",
" 'R17A',\n",
" '',\n",
" 'BHE',\n",
" 2007-08-06T01:44:38.825000Z,\n",
" 2007-08-07T21:43:51.125000Z)]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# get list of tuples of availability\n",
"bank.availability(channel='BHE', station='[OR]*')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get Gaps and uptime\n",
"`WaveBank` can return a dataframe of missing data with the `get_gaps_df` method, and a dataframe of reliability statistics with the `get_uptime_df` method. These are useful for assessing the completeness of an archive of continuous data."
]
},
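{
"cell_type": "markdown",
"metadata": {},
"source": [
"The availability statistic reported by `get_uptime_df` is uptime divided by total duration. A sketch of that arithmetic with `datetime.timedelta` (the values here are made up):\n",
"\n",
"```python\n",
"from datetime import timedelta\n",
"\n",
"duration = timedelta(days=10)\n",
"gap_duration = timedelta(hours=6)\n",
"uptime = duration - gap_duration\n",
"\n",
"availability = uptime / duration  # timedelta division yields a float\n",
"print(availability)  # 0.975\n",
"```"
]
},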
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.817188Z",
"iopub.status.busy": "2024-02-28T22:20:01.816836Z",
"iopub.status.idle": "2024-02-28T22:20:01.837758Z",
"shell.execute_reply": "2024-02-28T22:20:01.837275Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
" network station location channel starttime \\\n",
"0 TA O15A BHE 2007-08-06 01:45:48.799998 \n",
"1 TA O15A BHE 2007-08-06 08:49:39.999998 \n",
"2 TA O15A BHE 2007-08-06 10:48:25.599998 \n",
"3 TA O15A BHE 2007-08-07 02:06:04.474998 \n",
"4 TA O15A BHE 2007-08-07 02:15:24.074998 \n",
"\n",
" endtime sampling_period \\\n",
"0 2007-08-06 08:48:30.024998 0 days 00:00:00.025000 \n",
"1 2007-08-06 10:47:15.624998 0 days 00:00:00.025000 \n",
"2 2007-08-07 02:04:54.499998 0 days 00:00:00.025000 \n",
"3 2007-08-07 02:14:14.100000 0 days 00:00:00.025000 \n",
"4 2007-08-07 03:44:08.474998 0 days 00:00:00.025000 \n",
"\n",
" path gap_duration \n",
"0 TA.O15A..BHE__20070806T014438Z__20070806T01454... 0 days 07:02:41.225000 \n",
"1 TA.O15A..BHE__20070806T084830Z__20070806T08494... 0 days 01:57:35.625000 \n",
"2 TA.O15A..BHE__20070806T104715Z__20070806T10482... 0 days 15:16:28.900000 \n",
"3 TA.O15A..BHE__20070807T020454Z__20070807T02060... 0 days 00:08:09.625002 \n",
"4 TA.O15A..BHE__20070807T021414Z__20070807T02152... 0 days 01:28:44.400000 "
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bank.get_gaps_df(channel='BHE', station='O*').head()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.839963Z",
"iopub.status.busy": "2024-02-28T22:20:01.839603Z",
"iopub.status.idle": "2024-02-28T22:20:01.959898Z",
"shell.execute_reply": "2024-02-28T22:20:01.959250Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
" network station location channel starttime \\\n",
"0 TA M11A VHE 2007-02-15 00:00:09.999998 \n",
"1 TA M11A VHN 2007-02-15 00:00:09.999998 \n",
"2 TA M11A VHZ 2007-02-15 00:00:09.999998 \n",
"3 TA M14A VHE 2007-02-15 00:00:00.000003 \n",
"4 TA M14A VHN 2007-02-15 00:00:00.000003 \n",
"5 TA M14A VHZ 2007-02-15 00:00:00.000004 \n",
"\n",
" endtime duration gap_duration uptime \\\n",
"0 2007-02-24 23:59:59.999998 9 days 23:59:50 0 days 9 days 23:59:50 \n",
"1 2007-02-24 23:59:59.999998 9 days 23:59:50 0 days 9 days 23:59:50 \n",
"2 2007-02-24 23:59:59.999998 9 days 23:59:50 0 days 9 days 23:59:50 \n",
"3 2007-02-25 00:00:00.000003 10 days 00:00:00 0 days 10 days 00:00:00 \n",
"4 2007-02-25 00:00:00.000003 10 days 00:00:00 0 days 10 days 00:00:00 \n",
"5 2007-02-25 00:00:00.000004 10 days 00:00:00 0 days 10 days 00:00:00 \n",
"\n",
" availability \n",
"0 1.0 \n",
"1 1.0 \n",
"2 1.0 \n",
"3 1.0 \n",
"4 1.0 \n",
"5 1.0 "
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ta_bank.get_uptime_df()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Read index\n",
"`WaveBank` can return a dataframe of the index with the `read_index` method, although in most cases this shouldn't be needed."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"execution": {
"iopub.execute_input": "2024-02-28T22:20:01.962302Z",
"iopub.status.busy": "2024-02-28T22:20:01.961950Z",
"iopub.status.idle": "2024-02-28T22:20:01.971451Z",
"shell.execute_reply": "2024-02-28T22:20:01.970969Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
" network station location channel starttime \\\n",
"0 TA M11A VHN 2007-02-19 14:59:59.999998 \n",
"1 TA M14A VHN 2007-02-19 15:00:00.000003 \n",
"2 TA M11A VHN 2007-02-15 23:59:59.999998 \n",
"3 TA M14A VHN 2007-02-16 00:00:00.000003 \n",
"4 TA M11A VHN 2007-02-20 15:59:59.999998 \n",
"\n",
" endtime sampling_period \\\n",
"0 2007-02-19 15:59:59.999998 0 days 00:00:10 \n",
"1 2007-02-19 16:00:00.000003 0 days 00:00:10 \n",
"2 2007-02-16 00:59:59.999998 0 days 00:00:10 \n",
"3 2007-02-16 01:00:00.000003 0 days 00:00:10 \n",
"4 2007-02-20 16:59:59.999998 0 days 00:00:10 \n",
"\n",
" path \n",
"0 TA/M11A/VHN/2007-02-19T15-00-00.mseed \n",
"1 TA/M11A/VHN/2007-02-19T15-00-00.mseed \n",
"2 TA/M11A/VHN/2007-02-16T00-00-00.mseed \n",
"3 TA/M11A/VHN/2007-02-16T00-00-00.mseed \n",
"4 TA/M11A/VHN/2007-02-20T16-00-00.mseed "
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ta_bank.read_index().head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Similar Projects\n",
"`WaveBank` is a useful tool, but it may not be a good fit for every application. Check out the following items as well:\n",
"\n",
"ObsPy has a way to visualize the availability of waveform data in a directory using [obspy-scan](https://docs.obspy.org/tutorial/code_snippets/visualize_data_availability_of_local_waveform_archive.html). If you prefer a graphical alternative to working with `DataFrame`s, this might be for you.\n",
"\n",
"ObsPy also has a [filesystem client](https://docs.obspy.org/master/packages/autogen/obspy.clients.filesystem.sds.Client.html#obspy.clients.filesystem.sds.Client) for working with SeisComP-structured archives.\n",
"\n",
"[IRIS](https://www.iris.edu/hq/) released a miniSEED indexing program called [mseedindex](https://github.com/iris-edu/mseedindex), which has an [ObsPy API](https://github.com/obspy/obspy/pull/2206)."
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}