Miscellaneous Utilities

The follow demonstrates some miscellaneous utilities included in ObsPlus.

Geodetics

It is often necessary to calculate geometric parameters (distance, azimuth, etc.) for pairs of entities in two different groups. For example, distance from each event in a catalog to each receiver in an inventory. ObsPlus provides a simple class for completing this task.

[1]:
# Load the catalog and inventory from the crandall dataset
import obsplus
from obsplus.utils.geodetics import SpatialCalculator

crandall = obsplus.load_dataset("crandall_test")
cat = crandall.event_client.get_events()
inv = crandall.station_client.get_stations()
[2]:
# init a SpatialCalculator instance (defaults to Earth's params)
spatial_calc = SpatialCalculator()
[3]:
# create distance dataframe
df = spatial_calc(entity_1=cat, entity_2=inv)
df.head()
/home/runner/micromamba/envs/test/lib/python3.12/site-packages/obspy/taup/taup_geo.py:105: UserWarning: Assuming spherical planet when calculating epicentral distance. Install the Python module 'geographiclib' to solve this.
  warnings.warn(msg)
[3]:
distance_m azimuth back_azimuth distance_degrees vertical_distance_m
id1 id2
smi:local/248891 UU.TMU..HHE 18259.939091 355.666512 175.656341 0.164032 5971.0
smi:local/248883 UU.TMU..HHE 18551.448547 356.054669 176.045258 0.166650 6911.0
smi:local/248887 UU.TMU..HHE 18439.619518 358.119265 178.114804 0.165646 6891.0
smi:local/248925 UU.TMU..HHE 18480.008700 356.841380 176.833873 0.166009 7351.0
smi:local/248882 UU.TMU..HHE 18744.598035 355.436056 175.425059 0.168386 4521.0

Since a multi-index is used it provides a fairly intuitive way to look up particular event-channel pairs using a tuple of (event_id, seed_id) in conjunction with the .loc DataFrame attribute like so:

[4]:
event_id = str(cat[0].resource_id)
seed_id = "UU.MPU..HHZ"

[5]:
# or just get a particular parameter

The distances can be converted to km and the distrtibutions can be described.

[6]:
# Convert add km columns and delete m columns
m_columns = [x for x in df.columns if x.endswith("_m")]
km_columns = [x.replace("_m", "_km") for x in m_columns]

df_km = df.assign(**{x: df[y] / 1000.0 for x, y in zip(km_columns, m_columns)}).drop(
    columns=m_columns
)

# Calculate stats for source reseiver distances
df_km.describe().round(decimals=2)
[6]:
azimuth back_azimuth distance_degrees distance_km vertical_distance_km
count 408.00 408.00 408.00 408.00 408.00
mean 201.06 211.69 0.82 91.24 5.22
std 104.37 98.06 0.33 36.55 1.85
min 9.16 48.81 0.16 18.26 1.70
25% 126.47 122.45 0.63 70.46 3.69
50% 163.87 189.79 0.82 90.85 5.50
75% 293.44 307.27 1.12 124.36 6.25
max 358.12 344.77 1.29 144.00 9.31

Time

Working with dates/times can be frustrating, especially since (as of 2020) ObsPy, numpy, and pandas all use slightly different methods for working with time. ObsPlus provides some utilities to make things a little easier.

[7]:

import numpy as np import obspy import obsplus from obsplus.utils import to_datetime64, to_timedelta64, to_utc

Note: All ObsPlus datafames use numpy/pandas datatypes.

[8]:
df = obsplus.events_to_df(obspy.read_events())
[9]:
df["time"]
[9]:
0   2012-04-04 14:21:42.300
1   2012-04-04 14:18:37.000
2   2012-04-04 14:08:46.000
Name: time, dtype: datetime64[ns]

Numpy and ObsPy time differences

One difference between numpy’s datatime64 and ObsPy’s UTCDateTime is how offsets are applied. For ObsPy, numbers are simply taken as seconds, but numpy requires explicitly using timedeta64 instances.

[10]:
time_str = "2020-01-03T11:00:00"
utc = obspy.UTCDateTime(time_str)
dt64 = np.datetime64(time_str, "ns")
[11]:
# add 1/2 second to UTCDateTime
utc2 = utc + 0.5
[12]:
# however doing the same thing with datetime64 raises a TypeError
try:
    dt64 + 0.5
except TypeError:
    pass
[13]:
# so you need to use a timedelta64
dt64_2 = dt64 + np.timedelta64(500, "ms")
[14]:
# This is, of course, also the case with datetime64 columns
df["time"] + np.timedelta64(500, "ms")
[14]:
0   2012-04-04 14:21:42.800
1   2012-04-04 14:18:37.500
2   2012-04-04 14:08:46.500
Name: time, dtype: datetime64[ns]

ObsPlus provides some common converters between ObsPy and Pandas/Numpy time objects.

Converting between ObsPy and Numpy time datatypes

Conversion between ObsPy UTCDateTime and numpy datetime64 objects when needed.

[15]:
# Convert a time column to an array of ObsPy UTCDateTime objects
utc_array = to_utc(df["time"])
[16]:
# Convert back to datetime64
dt64_array = to_datetime64(utc_array)

Note: datetime64 arrays are much more efficient in terms of memory usage and computational efficiency than arrays of UTCDateTime objects.

If you prefer not to manually define timedelta64 to perform offsets, to_timedelta64 simply converts a real number to an offset in seconds.

[17]:
to_timedelta64(3.255)
[17]:
numpy.timedelta64(3255000000,'ns')