Event Bank¶
The EventBank class is used to interact with a local directory of event files. The get_events
method of EventBank is compatible with the get_events
method of the FDSN client in obspy. Additionally, there are several useful features for managing events on disk.
Quickstart¶
[1]:
import obspy
import obsplus
[2]:
%%capture
# make sure the dataset is downloaded and suppress output.
obsplus.load_dataset('crandall_test')
[3]:
# copy the Crandall dataset to a temporary directory
crandall = obsplus.copy_dataset('crandall_test')
# path to directory where events are stored
event_path = crandall.event_path
# init an EventBank instance
bank = obsplus.EventBank(event_path)
# ensure index is up-to-date
bank.update_index()
[3]:
EventBank(base_path=/tmp/tmpfpqpe12v/crandall_test/events)
Accessing the index¶
The index can be accessed directly to get a summary of the events contained in the archive. Depending on the task, it may be more natural to work with the index dataframe rather than the obspy catalog objects directly.
[4]:
index = bank.read_index()
index
[4]:
| | time | latitude | longitude | depth | magnitude | event_description | associated_phase_count | azimuthal_gap | event_id | horizontal_uncertainty | ... | standard_error | used_phase_count | station_count | vertical_uncertainty | updated | author | agency_id | creation_time | version | path |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2007-08-06 08:48:40.010 | 39.4635 | -111.2277 | 410.0 | 4.37 | LR | 0.0 | NaN | smi:local/248839 | NaN | ... | 1.7356 | 0.0 | 134.0 | NaN | 2024-02-28 22:19:00.794070528 | DC | NIOSH | 2018-10-10 20:33:13.618111 | | 2007/08/06/2007-08-06T08-48-40_48839.xml |
| 1 | 2007-08-07 02:14:24.080 | 39.4632 | -111.2230 | 4180.0 | 1.26 | LR | 0.0 | NaN | smi:local/248883 | NaN | ... | 0.8834 | 0.0 | 14.0 | NaN | 2024-02-28 22:19:00.794070528 | DC | NIOSH | 2018-10-10 21:10:26.864045 | | 2007/08/07/2007-08-07T02-14-24_48883.xml |
| 2 | 2007-08-07 03:44:18.470 | 39.4625 | -111.2152 | 4160.0 | 1.45 | LR | 0.0 | NaN | smi:local/248887 | NaN | ... | 0.5716 | 0.0 | 15.0 | NaN | 2024-02-28 22:19:00.798070784 | DC | NIOSH | 2018-10-10 21:10:27.576204 | | 2007/08/07/2007-08-07T03-44-18_48887.xml |
| 3 | 2007-08-07 07:13:05.760 | 39.4605 | -111.2242 | 3240.0 | 2.24 | | 0.0 | NaN | smi:local/248891 | NaN | ... | 0.9901 | 0.0 | 35.0 | NaN | 2024-02-28 22:19:00.802070528 | | | NaT | | 2007/08/07/2007-08-07T07-13-05_48891.xml |
| 4 | 2007-08-07 02:05:04.490 | 39.4648 | -111.2255 | 1790.0 | 2.08 | LR | 0.0 | NaN | smi:local/248882 | NaN | ... | 0.9935 | 0.0 | 35.0 | NaN | 2024-02-28 22:19:00.806070528 | DC | NIOSH | 2018-10-10 21:15:19.190404 | | 2007/08/07/2007-08-07T02-05-04_48882.xml |
| 5 | 2007-08-06 10:47:25.600 | 39.4615 | -111.2317 | 2050.0 | 1.57 | LR | 0.0 | NaN | smi:local/248843 | NaN | ... | 0.8237 | 0.0 | 29.0 | NaN | 2024-02-28 22:19:00.810070528 | DC | NIOSH | 2018-10-10 20:33:27.110914 | | 2007/08/06/2007-08-06T10-47-25_48843.xml |
| 6 | 2007-08-07 21:42:51.130 | 39.4627 | -111.2200 | 4620.0 | 1.65 | | 0.0 | NaN | smi:local/248925 | NaN | ... | 0.5704 | 0.0 | 19.0 | NaN | 2024-02-28 22:19:00.814070784 | DC | NIOSH | 2018-10-11 22:08:54.236916 | | 2007/08/07/2007-08-07T21-42-51_48925.xml |
| 7 | 2007-08-06 01:44:48.810 | 39.4617 | -111.2378 | 6570.0 | 1.78 | LR | 0.0 | NaN | smi:local/248828 | NaN | ... | 0.8936 | 0.0 | 23.0 | NaN | 2024-02-28 22:19:00.818070528 | DC | NIOSH | 2018-10-10 20:26:49.642650 | | 2007/08/06/2007-08-06T01-44-48_48828.xml |
8 rows × 28 columns
The index contains the following columns:
[5]:
print(index.columns)
Index(['time', 'latitude', 'longitude', 'depth', 'magnitude',
'event_description', 'associated_phase_count', 'azimuthal_gap',
'event_id', 'horizontal_uncertainty', 'local_magnitude',
'moment_magnitude', 'duration_magnitude', 'magnitude_type',
'p_phase_count', 's_phase_count', 'p_pick_count', 's_pick_count',
'standard_error', 'used_phase_count', 'station_count',
'vertical_uncertainty', 'updated', 'author', 'agency_id',
'creation_time', 'version', 'path'],
dtype='object')
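As an illustration of working with the index as a dataframe, the sketch below rebuilds three of the index columns by hand, using values copied from the table above, so that it runs without the dataset. The name `index_subset` is only for this example; with a real bank the dataframe would come from `bank.read_index()`.

```python
import pandas as pd

# Hand-copied subset of the index shown above; an illustrative stand-in
# for the dataframe returned by bank.read_index().
index_subset = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2007-08-06 08:48:40.010",
                "2007-08-07 02:14:24.080",
                "2007-08-07 03:44:18.470",
                "2007-08-07 07:13:05.760",
                "2007-08-07 02:05:04.490",
                "2007-08-06 10:47:25.600",
                "2007-08-07 21:42:51.130",
                "2007-08-06 01:44:48.810",
            ]
        ),
        "magnitude": [4.37, 1.26, 1.45, 2.24, 2.08, 1.57, 1.65, 1.78],
        "event_id": [
            "smi:local/248839",
            "smi:local/248883",
            "smi:local/248887",
            "smi:local/248891",
            "smi:local/248882",
            "smi:local/248843",
            "smi:local/248925",
            "smi:local/248828",
        ],
    }
)

# The three largest events, largest first.
largest = index_subset.nlargest(3, "magnitude")

# Number of events per calendar day.
events_per_day = index_subset.groupby(index_subset["time"].dt.date).size()

print(largest[["time", "magnitude"]])
print(events_per_day)
```

Summaries like these only touch the dataframe, so no event files need to be parsed.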
Get events¶
The EventBank can be used to get obspy event objects based on query parameters.
[6]:
catalog = bank.get_events(minmagnitude=2)
[7]:
print(catalog)
3 Event(s) in Catalog:
2007-08-06T08:48:40.010000Z | +39.464, -111.228 | 4.2 mb
2007-08-07T07:13:05.760000Z | +39.461, -111.224 | 2.55 ml
2007-08-07T02:05:04.490000Z | +39.465, -111.225 | 2.44 ml
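Because the query interface follows the FDSN client's `get_events`, code written against that method can, in principle, accept either object. The sketch below illustrates the idea with a hypothetical in-memory stand-in, `ListClient`, which is not part of obsplus or obspy. It supports only four of the FDSN-style keywords (`starttime`, `endtime`, `minmagnitude`, `maxmagnitude`), and its data are the times (truncated to whole seconds) and magnitudes from the index shown earlier.

```python
from datetime import datetime

# (time, magnitude) pairs copied from the index shown earlier.
EVENTS = [
    (datetime(2007, 8, 6, 8, 48, 40), 4.37),
    (datetime(2007, 8, 7, 2, 14, 24), 1.26),
    (datetime(2007, 8, 7, 3, 44, 18), 1.45),
    (datetime(2007, 8, 7, 7, 13, 5), 2.24),
    (datetime(2007, 8, 7, 2, 5, 4), 2.08),
    (datetime(2007, 8, 6, 10, 47, 25), 1.57),
    (datetime(2007, 8, 7, 21, 42, 51), 1.65),
    (datetime(2007, 8, 6, 1, 44, 48), 1.78),
]


class ListClient:
    """Hypothetical client exposing an FDSN-style get_events method."""

    def __init__(self, events):
        self.events = list(events)

    def get_events(self, starttime=None, endtime=None,
                   minmagnitude=None, maxmagnitude=None):
        """Return the events satisfying every parameter that is not None."""
        selected = []
        for time, magnitude in self.events:
            if starttime is not None and time < starttime:
                continue
            if endtime is not None and time > endtime:
                continue
            if minmagnitude is not None and magnitude < minmagnitude:
                continue
            if maxmagnitude is not None and magnitude > maxmagnitude:
                continue
            selected.append((time, magnitude))
        return selected


def count_events(client, **query):
    """Work with any object whose get_events accepts these keywords."""
    return len(client.get_events(**query))


client = ListClient(EVENTS)
n_large = count_events(client, minmagnitude=2)
n_large_aug7 = count_events(
    client, minmagnitude=2, starttime=datetime(2007, 8, 7)
)
print(n_large, n_large_aug7)
```

A function such as `count_events` depends only on the `get_events` signature, so the same call could be pointed at an EventBank or an FDSN client, assuming both accept the keywords used.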
Put events¶
Events can be saved to disk using the put_events method. If an event with the same resource_id already exists in the bank, it will be overwritten.
[8]:
print(f'The bank has {len(bank.read_index())} events before put_events call.')
The bank has 8 events before put_events call.
[9]:
bank.put_events(obspy.read_events())
print(f'The bank has {len(bank.read_index())} events after the put_events call.')
The bank has 11 events after the put_events call.
Organizing event directories¶
EventBank can also be used to (re)organize event directories. By default, events are saved in the following structure: {year/month/day/year-month-dayThour-minute-second_short_id.xml} (where “short_id” means the last 5 characters of the event id).
The structure of the event directories can be reorganized to {year/month/short_id.xml}. The following code demonstrates how to change the event directory structure.
[10]:
from pathlib import Path
import tempfile
temp_dir = Path(tempfile.mkdtemp())
kwargs = dict(
path_structure="{year}/{month}",
name_structure="{event_id_short}",
)
[11]:
print(bank.get_events())
11 Event(s) in Catalog:
2007-08-06T08:48:40.010000Z | +39.464, -111.228 | 4.2 mb
2007-08-07T02:14:24.080000Z | +39.463, -111.223 | 1.17 ml
...
2012-04-04T14:18:37.000000Z | +39.342, +41.044 | 4.3 ML | manual
2012-04-04T14:08:46.000000Z | +38.017, +37.736 | 3.0 ML | manual
To see all events call 'print(CatalogObject.__str__(print_all=True))'
[12]:
bank2 = obsplus.EventBank(temp_dir, **kwargs)
bank2.put_events(bank)
print(bank2.read_index()['path'])
0 2007/08/48839.xml
1 2007/08/48883.xml
2 2007/08/48887.xml
3 2007/08/48891.xml
4 2007/08/48882.xml
5 2007/08/48843.xml
6 2007/08/48925.xml
7 2007/08/48828.xml
8 2012/04/00041.xml
9 2012/04/00038.xml
10 2012/04/00039.xml
Name: path, dtype: object
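The path and name templates used above can be read as ordinary Python format strings. The sketch below is a simplified illustration of how such templates could map an event time and id to a file path; `build_event_path` is a hypothetical helper, not EventBank's implementation, and the default-style templates passed to it are inferred from the paths shown in the index rather than taken from the obsplus source.

```python
from datetime import datetime


def build_event_path(time, event_id, path_structure, name_structure):
    """Fill the directory and file-name templates for one event."""
    fields = {
        "year": f"{time.year:04d}",
        "month": f"{time.month:02d}",
        "day": f"{time.day:02d}",
        "time": time.strftime("%Y-%m-%dT%H-%M-%S"),
        # "short id": the last 5 characters of the event id
        "event_id_short": event_id[-5:],
    }
    directory = path_structure.format(**fields)
    file_name = name_structure.format(**fields)
    return f"{directory}/{file_name}.xml"


origin_time = datetime(2007, 8, 6, 8, 48, 40)
event_id = "smi:local/248839"

# Templates inferred from the default layout seen in the index.
default_path = build_event_path(
    origin_time, event_id, "{year}/{month}/{day}", "{time}_{event_id_short}"
)
# Templates from the kwargs used to create the second bank.
compact_path = build_event_path(
    origin_time, event_id, "{year}/{month}", "{event_id_short}"
)
print(default_path)
print(compact_path)
```

Unused fields are simply ignored by `str.format`, which is why the shorter templates work with the same field dictionary.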
Notes¶
Unlike the WaveBank, which uses HDF5 to index waveforms, EventBank uses an SQLite database, which is more suitable for frequent updates and CRUD usage patterns.
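To make the CRUD point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `events` table and its three columns are assumptions made for illustration and do not reflect EventBank's actual schema; the sketch only shows that single rows can be replaced or deleted in place.

```python
import sqlite3

# In-memory database with a hypothetical, heavily simplified index table.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE events ("
    "event_id TEXT PRIMARY KEY, magnitude REAL, path TEXT)"
)


def put_row(con, event_id, magnitude, path):
    """Insert a row, replacing any existing row with the same event_id."""
    con.execute(
        "INSERT OR REPLACE INTO events VALUES (?, ?, ?)",
        (event_id, magnitude, path),
    )


put_row(con, "smi:local/248839", 4.37, "2007/08/48839.xml")
put_row(con, "smi:local/248883", 1.26, "2007/08/48883.xml")
put_row(con, "smi:local/248887", 1.45, "2007/08/48887.xml")

# Same id again: the existing row is overwritten, not duplicated.
put_row(con, "smi:local/248839", 4.2, "2007/08/48839.xml")

# Single rows can also be deleted without rewriting the rest.
con.execute("DELETE FROM events WHERE event_id = ?", ("smi:local/248883",))
con.commit()

rows = con.execute(
    "SELECT event_id, magnitude FROM events ORDER BY event_id"
).fetchall()
print(rows)
```

Each statement touches only the affected row, which is the usage pattern the note above refers to.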