Event Bank

The EventBank class is used to interact with a local directory of event files. Its get_events method is compatible with the get_events method of the FDSN client in obspy. Additionally, it provides several useful features for managing events on disk.

Quickstart

[1]:
import obspy
import obsplus
[2]:
%%capture
# make sure the dataset is downloaded and suppress output.
obsplus.load_dataset('crandall_test')
[3]:
# copy the Crandall dataset to a temporary directory
crandall = obsplus.copy_dataset('crandall_test')

# path to directory where events are stored
event_path = crandall.event_path

# init an EventBank instance
bank = obsplus.EventBank(event_path)

# ensure index is up-to-date
bank.update_index()
[3]:
EventBank(base_path=/tmp/tmpfpqpe12v/crandall_test/events)

Accessing the index

The index can be accessed directly to get a summary of the events contained in the archive. Depending on the task, it may be more natural to work with the index dataframe rather than the obspy catalog objects directly.

[4]:
index = bank.read_index()
index
[4]:
time latitude longitude depth magnitude event_description associated_phase_count azimuthal_gap event_id horizontal_uncertainty ... standard_error used_phase_count station_count vertical_uncertainty updated author agency_id creation_time version path
0 2007-08-06 08:48:40.010 39.4635 -111.2277 410.0 4.37 LR 0.0 NaN smi:local/248839 NaN ... 1.7356 0.0 134.0 NaN 2024-02-28 22:19:00.794070528 DC NIOSH 2018-10-10 20:33:13.618111 2007/08/06/2007-08-06T08-48-40_48839.xml
1 2007-08-07 02:14:24.080 39.4632 -111.2230 4180.0 1.26 LR 0.0 NaN smi:local/248883 NaN ... 0.8834 0.0 14.0 NaN 2024-02-28 22:19:00.794070528 DC NIOSH 2018-10-10 21:10:26.864045 2007/08/07/2007-08-07T02-14-24_48883.xml
2 2007-08-07 03:44:18.470 39.4625 -111.2152 4160.0 1.45 LR 0.0 NaN smi:local/248887 NaN ... 0.5716 0.0 15.0 NaN 2024-02-28 22:19:00.798070784 DC NIOSH 2018-10-10 21:10:27.576204 2007/08/07/2007-08-07T03-44-18_48887.xml
3 2007-08-07 07:13:05.760 39.4605 -111.2242 3240.0 2.24 0.0 NaN smi:local/248891 NaN ... 0.9901 0.0 35.0 NaN 2024-02-28 22:19:00.802070528 NaT 2007/08/07/2007-08-07T07-13-05_48891.xml
4 2007-08-07 02:05:04.490 39.4648 -111.2255 1790.0 2.08 LR 0.0 NaN smi:local/248882 NaN ... 0.9935 0.0 35.0 NaN 2024-02-28 22:19:00.806070528 DC NIOSH 2018-10-10 21:15:19.190404 2007/08/07/2007-08-07T02-05-04_48882.xml
5 2007-08-06 10:47:25.600 39.4615 -111.2317 2050.0 1.57 LR 0.0 NaN smi:local/248843 NaN ... 0.8237 0.0 29.0 NaN 2024-02-28 22:19:00.810070528 DC NIOSH 2018-10-10 20:33:27.110914 2007/08/06/2007-08-06T10-47-25_48843.xml
6 2007-08-07 21:42:51.130 39.4627 -111.2200 4620.0 1.65 0.0 NaN smi:local/248925 NaN ... 0.5704 0.0 19.0 NaN 2024-02-28 22:19:00.814070784 DC NIOSH 2018-10-11 22:08:54.236916 2007/08/07/2007-08-07T21-42-51_48925.xml
7 2007-08-06 01:44:48.810 39.4617 -111.2378 6570.0 1.78 LR 0.0 NaN smi:local/248828 NaN ... 0.8936 0.0 23.0 NaN 2024-02-28 22:19:00.818070528 DC NIOSH 2018-10-10 20:26:49.642650 2007/08/06/2007-08-06T01-44-48_48828.xml

8 rows × 28 columns

The index contains the following columns:

[5]:
print(index.columns)
Index(['time', 'latitude', 'longitude', 'depth', 'magnitude',
       'event_description', 'associated_phase_count', 'azimuthal_gap',
       'event_id', 'horizontal_uncertainty', 'local_magnitude',
       'moment_magnitude', 'duration_magnitude', 'magnitude_type',
       'p_phase_count', 's_phase_count', 'p_pick_count', 's_pick_count',
       'standard_error', 'used_phase_count', 'station_count',
       'vertical_uncertainty', 'updated', 'author', 'agency_id',
       'creation_time', 'version', 'path'],
      dtype='object')
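
Because the index is an ordinary pandas dataframe, standard pandas operations apply to it. The sketch below does not touch a real bank: it builds a small stand-in dataframe (a hypothetical substitute for the result of bank.read_index()) from three rows of the table above, keeping only three of the 28 columns, and then filters and groups it.

```python
import pandas as pd

# Hypothetical stand-in for bank.read_index(); values copied from rows 0, 1 and 3 above.
index = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2007-08-06 08:48:40.010",
                "2007-08-07 02:14:24.080",
                "2007-08-07 07:13:05.760",
            ]
        ),
        "magnitude": [4.37, 1.26, 2.24],
        "event_id": ["smi:local/248839", "smi:local/248883", "smi:local/248891"],
    }
)

# Keep events with magnitude >= 2 and order them chronologically.
large = index[index["magnitude"] >= 2].sort_values("time")

# Count how many events occurred on each calendar day.
per_day = index.groupby(index["time"].dt.date).size()

print(large[["time", "magnitude", "event_id"]])
print(per_day)
```

The same expressions should work on the full index, since they rely only on the time, magnitude and event_id columns listed above.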

Get events

The EventBank can be used to get obspy event objects based on query parameters.

[6]:
catalog = bank.get_events(minmagnitude=2)
[7]:
print(catalog)
3 Event(s) in Catalog:
2007-08-06T08:48:40.010000Z | +39.464, -111.228 | 4.2  mb
2007-08-07T07:13:05.760000Z | +39.461, -111.224 | 2.55 ml
2007-08-07T02:05:04.490000Z | +39.465, -111.225 | 2.44 ml
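
The query parameters follow the FDSN naming used by obspy (starttime, endtime, minmagnitude, maxmagnitude, and so on). As a rough illustration of how such keyword arguments narrow a selection, the toy function below applies them to a list of plain dictionaries. It is only a sketch of the idea, not how EventBank is implemented; select_events and ROWS are hypothetical names, and the values are taken from the index table above.

```python
from datetime import datetime

# Hypothetical rows standing in for the index; values come from the table above.
ROWS = [
    {"event_id": "smi:local/248839", "time": datetime(2007, 8, 6, 8, 48, 40), "magnitude": 4.37},
    {"event_id": "smi:local/248883", "time": datetime(2007, 8, 7, 2, 14, 24), "magnitude": 1.26},
    {"event_id": "smi:local/248891", "time": datetime(2007, 8, 7, 7, 13, 5), "magnitude": 2.24},
    {"event_id": "smi:local/248882", "time": datetime(2007, 8, 7, 2, 5, 4), "magnitude": 2.08},
]


def select_events(rows, starttime=None, endtime=None, minmagnitude=None, maxmagnitude=None):
    """Return the rows that satisfy every constraint that was supplied."""
    selected = []
    for row in rows:
        if starttime is not None and row["time"] < starttime:
            continue
        if endtime is not None and row["time"] > endtime:
            continue
        if minmagnitude is not None and row["magnitude"] < minmagnitude:
            continue
        if maxmagnitude is not None and row["magnitude"] > maxmagnitude:
            continue
        selected.append(row)
    return selected


# Constraints combine with a logical AND: magnitude >= 2 and on or after 2007-08-07.
matches = select_events(ROWS, minmagnitude=2, starttime=datetime(2007, 8, 7))
print([row["event_id"] for row in matches])
```

Omitted parameters impose no constraint, which mirrors the way an unfiltered get_events call returns every event.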

Put events

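Events can be saved to disk using the put_events method. If an event with the same resource_id already exists in the bank, it will be overwritten. In the example below, obspy.read_events() is called without arguments, which returns obspy's three-event example catalog.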

[8]:
print(f'The bank has {len(bank.read_index())} events before put_events call.')
The bank has 8 events before put_events call.
[9]:
bank.put_events(obspy.read_events())
print(f'The bank has {len(bank.read_index())} events after the put_events call.')
The bank has 11 events after the put_events call.

Organizing event directories

EventBank can also be used to (re)organize event directories. By default, events are saved in the following structure: {year/month/day/year-month-dayThour-minute-second_short_id.xml}, where “short_id” means the last 5 characters of the event id. For example, the event smi:local/248839 is stored at 2007/08/06/2007-08-06T08-48-40_48839.xml.
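
To make the default layout concrete, the following sketch rebuilds the first path listed in the index from an event time and an event id. It is an illustration only and not the code obsplus uses internally; default_event_path is a hypothetical helper, and the underscore before the short id follows the paths shown in the index output.

```python
from datetime import datetime


def default_event_path(event_time: datetime, event_id: str) -> str:
    """Build a relative path in the default year/month/day layout (illustrative only)."""
    # The short id is the last 5 characters of the event id.
    short_id = event_id[-5:]
    # One directory level each for year, month and day.
    directory = event_time.strftime("%Y/%m/%d")
    # The file name repeats the date and adds the time, with hyphens instead of colons.
    name = event_time.strftime("%Y-%m-%dT%H-%M-%S")
    return f"{directory}/{name}_{short_id}.xml"


path = default_event_path(datetime(2007, 8, 6, 8, 48, 40), "smi:local/248839")
print(path)
```

For this input the function returns 2007/08/06/2007-08-06T08-48-40_48839.xml, which matches the path of the first event in the index.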

The directory structure can be changed, for example to {year/month/short_id.xml}. To do so, create a second bank with different path_structure and name_structure arguments and copy the events from the first bank into it, as the following code demonstrates.

[10]:
from pathlib import Path

import tempfile

temp_dir = Path(tempfile.mkdtemp())

kwargs = dict(
    path_structure="{year}/{month}",
    name_structure="{event_id_short}",
)
[11]:
print(bank.get_events())
11 Event(s) in Catalog:
2007-08-06T08:48:40.010000Z | +39.464, -111.228 | 4.2  mb
2007-08-07T02:14:24.080000Z | +39.463, -111.223 | 1.17 ml
...
2012-04-04T14:18:37.000000Z | +39.342,  +41.044 | 4.3  ML | manual
2012-04-04T14:08:46.000000Z | +38.017,  +37.736 | 3.0  ML | manual
To see all events call 'print(CatalogObject.__str__(print_all=True))'
[12]:
bank2 = obsplus.EventBank(temp_dir, **kwargs)
bank2.put_events(bank)
print(bank2.read_index()['path'])
0     2007/08/48839.xml
1     2007/08/48883.xml
2     2007/08/48887.xml
3     2007/08/48891.xml
4     2007/08/48882.xml
5     2007/08/48843.xml
6     2007/08/48925.xml
7     2007/08/48828.xml
8     2012/04/00041.xml
9     2012/04/00038.xml
10    2012/04/00039.xml
Name: path, dtype: object

Notes

Unlike the WaveBank, which uses HDF5 to index waveforms, EventBank uses a SQLite database, which is more suitable for frequent updates and CRUD usage patterns.
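
The kind of update pattern that SQLite handles well can be sketched with Python's built-in sqlite3 module. The table layout below is a simplified assumption for illustration and is not the schema obsplus actually uses; put_row is a hypothetical helper, and the changed magnitude of 1.30 is an invented value used only to show the overwrite.

```python
import sqlite3

# In-memory database with a simplified, hypothetical index table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id TEXT PRIMARY KEY, magnitude REAL, path TEXT)"
)


def put_row(event_id, magnitude, path):
    """Insert a row, replacing any existing row that has the same event_id."""
    conn.execute(
        "INSERT OR REPLACE INTO events VALUES (?, ?, ?)",
        (event_id, magnitude, path),
    )
    conn.commit()


put_row("smi:local/248839", 4.37, "2007/08/06/2007-08-06T08-48-40_48839.xml")
put_row("smi:local/248883", 1.26, "2007/08/07/2007-08-07T02-14-24_48883.xml")
# Writing an existing id again updates that row instead of adding a duplicate.
put_row("smi:local/248883", 1.30, "2007/08/07/2007-08-07T02-14-24_48883.xml")

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
updated = conn.execute(
    "SELECT magnitude FROM events WHERE event_id = ?", ("smi:local/248883",)
).fetchone()[0]
print(count, updated)
```

Keying the table on the event id gives the same overwrite-on-match behavior described for put_events: the third call changes one row and the table still holds two events.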