{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Event Bank\n", "\n", "The `EventBank` class is used to interact with a local directory of event files. Its `get_events` method is compatible with the `get_events` method of the FDSN client in obspy. `EventBank` also provides several useful features for managing events on disk.\n", "\n", "\n", "## Quickstart" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:04.190253Z", "iopub.status.busy": "2024-02-28T22:20:04.190079Z", "iopub.status.idle": "2024-02-28T22:20:06.300240Z", "shell.execute_reply": "2024-02-28T22:20:06.299623Z" } }, "outputs": [], "source": [ "import obspy\n", "import obsplus" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:06.303233Z", "iopub.status.busy": "2024-02-28T22:20:06.302681Z", "iopub.status.idle": "2024-02-28T22:20:06.404251Z", "shell.execute_reply": "2024-02-28T22:20:06.403623Z" } }, "outputs": [], "source": [ "%%capture\n", "# make sure the dataset is downloaded and suppress output.\n", "obsplus.load_dataset('crandall_test')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:06.407022Z", "iopub.status.busy": "2024-02-28T22:20:06.406660Z", "iopub.status.idle": "2024-02-28T22:20:06.464416Z", "shell.execute_reply": "2024-02-28T22:20:06.463818Z" } }, "outputs": [ { "data": { "text/plain": [ "EventBank(base_path=/tmp/tmpfpqpe12v/crandall_test/events)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# copy the Crandall dataset to a temporary directory\n", "crandall = obsplus.copy_dataset('crandall_test')\n", "\n", "# path to directory where events are stored\n", "event_path = crandall.event_path\n", "\n", "# init an EventBank instance\n", "bank = obsplus.EventBank(event_path)\n", "\n", "# ensure index is 
up-to-date\n", "bank.update_index() " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Accessing the index \n", "The index can be accessed directly to get a summary of the events contained in the archive. Depending on the task, it may be more natural to work with the index dataframe rather than the obspy catalog objects directly." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:06.492737Z", "iopub.status.busy": "2024-02-28T22:20:06.492539Z", "iopub.status.idle": "2024-02-28T22:20:06.521852Z", "shell.execute_reply": "2024-02-28T22:20:06.521255Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr style=\"text-align: right;\"><th></th><th>time</th><th>latitude</th><th>longitude</th><th>depth</th><th>magnitude</th><th>event_description</th><th>associated_phase_count</th><th>azimuthal_gap</th><th>event_id</th><th>horizontal_uncertainty</th><th>...</th><th>standard_error</th><th>used_phase_count</th><th>station_count</th><th>vertical_uncertainty</th><th>updated</th><th>author</th><th>agency_id</th><th>creation_time</th><th>version</th><th>path</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>2007-08-06 08:48:40.010</td><td>39.4635</td><td>-111.2277</td><td>410.0</td><td>4.37</td><td>LR</td><td>0.0</td><td>NaN</td><td>smi:local/248839</td><td>NaN</td><td>...</td><td>1.7356</td><td>0.0</td><td>134.0</td><td>NaN</td><td>2024-02-28 22:19:00.794070528</td><td>DC</td><td>NIOSH</td><td>2018-10-10 20:33:13.618111</td><td></td><td>2007/08/06/2007-08-06T08-48-40_48839.xml</td></tr>\n",
"<tr><th>1</th><td>2007-08-07 02:14:24.080</td><td>39.4632</td><td>-111.2230</td><td>4180.0</td><td>1.26</td><td>LR</td><td>0.0</td><td>NaN</td><td>smi:local/248883</td><td>NaN</td><td>...</td><td>0.8834</td><td>0.0</td><td>14.0</td><td>NaN</td><td>2024-02-28 22:19:00.794070528</td><td>DC</td><td>NIOSH</td><td>2018-10-10 21:10:26.864045</td><td></td><td>2007/08/07/2007-08-07T02-14-24_48883.xml</td></tr>\n",
"<tr><th>2</th><td>2007-08-07 03:44:18.470</td><td>39.4625</td><td>-111.2152</td><td>4160.0</td><td>1.45</td><td>LR</td><td>0.0</td><td>NaN</td><td>smi:local/248887</td><td>NaN</td><td>...</td><td>0.5716</td><td>0.0</td><td>15.0</td><td>NaN</td><td>2024-02-28 22:19:00.798070784</td><td>DC</td><td>NIOSH</td><td>2018-10-10 21:10:27.576204</td><td></td><td>2007/08/07/2007-08-07T03-44-18_48887.xml</td></tr>\n",
"<tr><th>3</th><td>2007-08-07 07:13:05.760</td><td>39.4605</td><td>-111.2242</td><td>3240.0</td><td>2.24</td><td></td><td>0.0</td><td>NaN</td><td>smi:local/248891</td><td>NaN</td><td>...</td><td>0.9901</td><td>0.0</td><td>35.0</td><td>NaN</td><td>2024-02-28 22:19:00.802070528</td><td></td><td></td><td>NaT</td><td></td><td>2007/08/07/2007-08-07T07-13-05_48891.xml</td></tr>\n",
"<tr><th>4</th><td>2007-08-07 02:05:04.490</td><td>39.4648</td><td>-111.2255</td><td>1790.0</td><td>2.08</td><td>LR</td><td>0.0</td><td>NaN</td><td>smi:local/248882</td><td>NaN</td><td>...</td><td>0.9935</td><td>0.0</td><td>35.0</td><td>NaN</td><td>2024-02-28 22:19:00.806070528</td><td>DC</td><td>NIOSH</td><td>2018-10-10 21:15:19.190404</td><td></td><td>2007/08/07/2007-08-07T02-05-04_48882.xml</td></tr>\n",
"<tr><th>5</th><td>2007-08-06 10:47:25.600</td><td>39.4615</td><td>-111.2317</td><td>2050.0</td><td>1.57</td><td>LR</td><td>0.0</td><td>NaN</td><td>smi:local/248843</td><td>NaN</td><td>...</td><td>0.8237</td><td>0.0</td><td>29.0</td><td>NaN</td><td>2024-02-28 22:19:00.810070528</td><td>DC</td><td>NIOSH</td><td>2018-10-10 20:33:27.110914</td><td></td><td>2007/08/06/2007-08-06T10-47-25_48843.xml</td></tr>\n",
"<tr><th>6</th><td>2007-08-07 21:42:51.130</td><td>39.4627</td><td>-111.2200</td><td>4620.0</td><td>1.65</td><td></td><td>0.0</td><td>NaN</td><td>smi:local/248925</td><td>NaN</td><td>...</td><td>0.5704</td><td>0.0</td><td>19.0</td><td>NaN</td><td>2024-02-28 22:19:00.814070784</td><td>DC</td><td>NIOSH</td><td>2018-10-11 22:08:54.236916</td><td></td><td>2007/08/07/2007-08-07T21-42-51_48925.xml</td></tr>\n",
"<tr><th>7</th><td>2007-08-06 01:44:48.810</td><td>39.4617</td><td>-111.2378</td><td>6570.0</td><td>1.78</td><td>LR</td><td>0.0</td><td>NaN</td><td>smi:local/248828</td><td>NaN</td><td>...</td><td>0.8936</td><td>0.0</td><td>23.0</td><td>NaN</td><td>2024-02-28 22:19:00.818070528</td><td>DC</td><td>NIOSH</td><td>2018-10-10 20:26:49.642650</td><td></td><td>2007/08/06/2007-08-06T01-44-48_48828.xml</td></tr>\n",
"</tbody>\n", "</table>\n", "<p>8 rows × 28 columns</p>\n", "</div>
" ], "text/plain": [ " time latitude longitude depth magnitude \\\n", "0 2007-08-06 08:48:40.010 39.4635 -111.2277 410.0 4.37 \n", "1 2007-08-07 02:14:24.080 39.4632 -111.2230 4180.0 1.26 \n", "2 2007-08-07 03:44:18.470 39.4625 -111.2152 4160.0 1.45 \n", "3 2007-08-07 07:13:05.760 39.4605 -111.2242 3240.0 2.24 \n", "4 2007-08-07 02:05:04.490 39.4648 -111.2255 1790.0 2.08 \n", "5 2007-08-06 10:47:25.600 39.4615 -111.2317 2050.0 1.57 \n", "6 2007-08-07 21:42:51.130 39.4627 -111.2200 4620.0 1.65 \n", "7 2007-08-06 01:44:48.810 39.4617 -111.2378 6570.0 1.78 \n", "\n", " event_description associated_phase_count azimuthal_gap event_id \\\n", "0 LR 0.0 NaN smi:local/248839 \n", "1 LR 0.0 NaN smi:local/248883 \n", "2 LR 0.0 NaN smi:local/248887 \n", "3 0.0 NaN smi:local/248891 \n", "4 LR 0.0 NaN smi:local/248882 \n", "5 LR 0.0 NaN smi:local/248843 \n", "6 0.0 NaN smi:local/248925 \n", "7 LR 0.0 NaN smi:local/248828 \n", "\n", " horizontal_uncertainty ... standard_error used_phase_count \\\n", "0 NaN ... 1.7356 0.0 \n", "1 NaN ... 0.8834 0.0 \n", "2 NaN ... 0.5716 0.0 \n", "3 NaN ... 0.9901 0.0 \n", "4 NaN ... 0.9935 0.0 \n", "5 NaN ... 0.8237 0.0 \n", "6 NaN ... 0.5704 0.0 \n", "7 NaN ... 
0.8936 0.0 \n", "\n", " station_count vertical_uncertainty updated author \\\n", "0 134.0 NaN 2024-02-28 22:19:00.794070528 DC \n", "1 14.0 NaN 2024-02-28 22:19:00.794070528 DC \n", "2 15.0 NaN 2024-02-28 22:19:00.798070784 DC \n", "3 35.0 NaN 2024-02-28 22:19:00.802070528 \n", "4 35.0 NaN 2024-02-28 22:19:00.806070528 DC \n", "5 29.0 NaN 2024-02-28 22:19:00.810070528 DC \n", "6 19.0 NaN 2024-02-28 22:19:00.814070784 DC \n", "7 23.0 NaN 2024-02-28 22:19:00.818070528 DC \n", "\n", " agency_id creation_time version \\\n", "0 NIOSH 2018-10-10 20:33:13.618111 \n", "1 NIOSH 2018-10-10 21:10:26.864045 \n", "2 NIOSH 2018-10-10 21:10:27.576204 \n", "3 NaT \n", "4 NIOSH 2018-10-10 21:15:19.190404 \n", "5 NIOSH 2018-10-10 20:33:27.110914 \n", "6 NIOSH 2018-10-11 22:08:54.236916 \n", "7 NIOSH 2018-10-10 20:26:49.642650 \n", "\n", " path \n", "0 2007/08/06/2007-08-06T08-48-40_48839.xml \n", "1 2007/08/07/2007-08-07T02-14-24_48883.xml \n", "2 2007/08/07/2007-08-07T03-44-18_48887.xml \n", "3 2007/08/07/2007-08-07T07-13-05_48891.xml \n", "4 2007/08/07/2007-08-07T02-05-04_48882.xml \n", "5 2007/08/06/2007-08-06T10-47-25_48843.xml \n", "6 2007/08/07/2007-08-07T21-42-51_48925.xml \n", "7 2007/08/06/2007-08-06T01-44-48_48828.xml \n", "\n", "[8 rows x 28 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index = bank.read_index()\n", "index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The index contains the following columns:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:06.524304Z", "iopub.status.busy": "2024-02-28T22:20:06.523936Z", "iopub.status.idle": "2024-02-28T22:20:06.527464Z", "shell.execute_reply": "2024-02-28T22:20:06.526964Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Index(['time', 'latitude', 'longitude', 'depth', 'magnitude',\n", " 'event_description', 'associated_phase_count', 'azimuthal_gap',\n", 
" 'event_id', 'horizontal_uncertainty', 'local_magnitude',\n", " 'moment_magnitude', 'duration_magnitude', 'magnitude_type',\n", " 'p_phase_count', 's_phase_count', 'p_pick_count', 's_pick_count',\n", " 'standard_error', 'used_phase_count', 'station_count',\n", " 'vertical_uncertainty', 'updated', 'author', 'agency_id',\n", " 'creation_time', 'version', 'path'],\n", " dtype='object')\n" ] } ], "source": [ "print(index.columns)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get events\n", "The `EventBank` can be used to get obspy event objects based on query parameters." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:06.529721Z", "iopub.status.busy": "2024-02-28T22:20:06.529369Z", "iopub.status.idle": "2024-02-28T22:20:06.956580Z", "shell.execute_reply": "2024-02-28T22:20:06.955955Z" } }, "outputs": [], "source": [ "catalog = bank.get_events(minmagnitude=2)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:06.959220Z", "iopub.status.busy": "2024-02-28T22:20:06.958841Z", "iopub.status.idle": "2024-02-28T22:20:06.962460Z", "shell.execute_reply": "2024-02-28T22:20:06.961874Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3 Event(s) in Catalog:\n", "2007-08-06T08:48:40.010000Z | +39.464, -111.228 | 4.2 mb\n", "2007-08-07T07:13:05.760000Z | +39.461, -111.224 | 2.55 ml\n", "2007-08-07T02:05:04.490000Z | +39.465, -111.225 | 2.44 ml\n" ] } ], "source": [ "print(catalog)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Put events\n", "Events can be saved to disk using the `put_events` method. If an event with the same resource_id already exists in the bank it will be overwritten." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:06.964660Z", "iopub.status.busy": "2024-02-28T22:20:06.964312Z", "iopub.status.idle": "2024-02-28T22:20:06.980061Z", "shell.execute_reply": "2024-02-28T22:20:06.979487Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The bank has 8 events before put_events call.\n" ] } ], "source": [ "print(f'The bank has {len(bank.read_index())} events before put_events call.')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:06.982207Z", "iopub.status.busy": "2024-02-28T22:20:06.981880Z", "iopub.status.idle": "2024-02-28T22:20:07.068506Z", "shell.execute_reply": "2024-02-28T22:20:07.067867Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The bank has 11 events after the put_events call.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/runner/work/obsplus/obsplus/src/obsplus/utils/pd.py:120: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '[1333549302300000000 1333549117000000000 1333548526000000000]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.\n", " df.loc[:, cols] = df.loc[:, cols].fillna(nat_value).astype(np.int64)\n", "/home/runner/work/obsplus/obsplus/src/obsplus/utils/pd.py:120: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. 
Value '[1709158807002143744 1709158807002143744 1709158807002143744]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.\n", " df.loc[:, cols] = df.loc[:, cols].fillna(nat_value).astype(np.int64)\n", "/home/runner/work/obsplus/obsplus/src/obsplus/utils/pd.py:120: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '[1333557650000000000 1333557650000000000 1333557650000000000]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.\n", " df.loc[:, cols] = df.loc[:, cols].fillna(nat_value).astype(np.int64)\n" ] } ], "source": [ "bank.put_events(obspy.read_events())\n", "print(f'The bank has {len(bank.read_index())} events after the put_events call.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Organizing event directories\n", "`EventBank` can also be used to (re)organize event directories. By default, events are saved with the following structure: `{year/month/day/year-month-dayThour-minute-second_short_id.xml}`, where \"short_id\" is the last 5 characters of the event id.\n", "\n", "The event directories can be reorganized into a different structure, such as `{year/month/short_id.xml}`. 
The following code demonstrates the process used in changing the event directory structure.\n", "\n" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:07.070906Z", "iopub.status.busy": "2024-02-28T22:20:07.070537Z", "iopub.status.idle": "2024-02-28T22:20:07.073787Z", "shell.execute_reply": "2024-02-28T22:20:07.073293Z" } }, "outputs": [], "source": [ "from pathlib import Path\n", "\n", "import tempfile\n", "\n", "temp_dir = Path(tempfile.mkdtemp())\n", "\n", "kwargs = dict(\n", " path_structure=\"{year}/{month}\",\n", " name_structure=\"{event_id_short}\",\n", ")" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:07.075984Z", "iopub.status.busy": "2024-02-28T22:20:07.075566Z", "iopub.status.idle": "2024-02-28T22:20:07.789385Z", "shell.execute_reply": "2024-02-28T22:20:07.788784Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "11 Event(s) in Catalog:\n", "2007-08-06T08:48:40.010000Z | +39.464, -111.228 | 4.2 mb\n", "2007-08-07T02:14:24.080000Z | +39.463, -111.223 | 1.17 ml\n", "...\n", "2012-04-04T14:18:37.000000Z | +39.342, +41.044 | 4.3 ML | manual\n", "2012-04-04T14:08:46.000000Z | +38.017, +37.736 | 3.0 ML | manual\n", "To see all events call 'print(CatalogObject.__str__(print_all=True))'\n" ] } ], "source": [ "print(bank.get_events())" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2024-02-28T22:20:07.791927Z", "iopub.status.busy": "2024-02-28T22:20:07.791473Z", "iopub.status.idle": "2024-02-28T22:20:09.256418Z", "shell.execute_reply": "2024-02-28T22:20:09.255767Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 2007/08/48839.xml\n", "1 2007/08/48883.xml\n", "2 2007/08/48887.xml\n", "3 2007/08/48891.xml\n", "4 2007/08/48882.xml\n", "5 2007/08/48843.xml\n", "6 2007/08/48925.xml\n", "7 2007/08/48828.xml\n", "8 
2012/04/00041.xml\n", "9 2012/04/00038.xml\n", "10 2012/04/00039.xml\n", "Name: path, dtype: object\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/runner/work/obsplus/obsplus/src/obsplus/utils/pd.py:120: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '[1186390120010000000 1186452864080000000 1186458258470000000\n", " 1186470785760000000 1186452304490000000 1186397245600000000\n", " 1186522971130000000 1186364688810000000 1333549302300000000\n", " 1333549117000000000 1333548526000000000]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.\n", " df.loc[:, cols] = df.loc[:, cols].fillna(nat_value).astype(np.int64)\n", "/home/runner/work/obsplus/obsplus/src/obsplus/utils/pd.py:120: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '[1709158808474145280 1709158808474145280 1709158808478145280\n", " 1709158808482145280 1709158808486145280 1709158808490145280\n", " 1709158808494145280 1709158808498145280 1709158808502145280\n", " 1709158808502145280 1709158808502145280]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.\n", " df.loc[:, cols] = df.loc[:, cols].fillna(nat_value).astype(np.int64)\n", "/home/runner/work/obsplus/obsplus/src/obsplus/utils/pd.py:120: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. 
Value '[ 1539203593618111000 1539205826864045000 1539205827576204000\n", " -9223372031854775808 1539206119190404000 1539203607110914000\n", " 1539295734236916000 1539203209642650000 1333557650000000000\n", " 1333557650000000000 1333557650000000000]' has dtype incompatible with datetime64[ns], please explicitly cast to a compatible dtype first.\n", " df.loc[:, cols] = df.loc[:, cols].fillna(nat_value).astype(np.int64)\n" ] } ], "source": [ "# create a second bank that uses the custom directory structure\n", "bank2 = obsplus.EventBank(temp_dir, **kwargs)\n", "# copy the events from the original bank into the new bank\n", "bank2.put_events(bank)\n", "# the indexed paths now follow the new structure\n", "print(bank2.read_index()['path'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notes\n", "Unlike the [WaveBank](wavebank.pynb), which uses HDF5 to index waveforms, `EventBank` uses a [SQLite](https://www.sqlite.org/index.html) database, which is better suited to frequent updates and [CRUD](https://en.wikipedia.org/wiki/Create,_read,_update_and_delete) usage patterns." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 4 }