Miscellaneous Utilities

The follow demonstrates some miscellaneous utilities included in ObsPlus.

Geodetics

It is often necessary to calculate geometric parameters (distance, azimuth, etc.) for pairs of entities in two different groups. For example, distance from each event in a catalog to each receiver in an inventory. ObsPlus provides a simple class for completing this task.

[1]:
# Load the catalog and inventory from the crandall dataset
import obsplus
from obsplus.utils.geodetics import SpatialCalculator


crandall = obsplus.load_dataset('crandall_test')
cat = crandall.event_client.get_events()
inv = crandall.station_client.get_stations()
[2]:
# init a SpatialCalculator instance (defaults to Earth's params)
spatial_calc = SpatialCalculator()
[3]:
# create distance dataframe
df = spatial_calc(entity_1=cat, entity_2=inv)
df.head()
/usr/share/miniconda3/envs/test/lib/python3.10/site-packages/obspy/taup/taup_geo.py:105: UserWarning: Assuming spherical planet when calculating epicentral distance. Install the Python module 'geographiclib' to solve this.
  warnings.warn(msg)
[3]:
distance_m azimuth back_azimuth distance_degrees vertical_distance_m
id1 id2
smi:local/248839 UU.CTU..HHE 143559.550849 161.740301 342.076813 1.289617 2141.0
smi:local/248883 UU.CTU..HHE 143716.194014 161.590945 341.930482 1.291025 5911.0
smi:local/248887 UU.CTU..HHE 143999.566314 161.346595 341.691152 1.293570 5891.0
smi:local/248891 UU.CTU..HHE 143969.268983 161.667028 342.005783 1.293298 4971.0
smi:local/248882 UU.CTU..HHE 143480.660281 161.650623 341.988556 1.288909 3521.0

Since a multi-index is used it provides a fairly intuitive way to look up particular event-channel pairs using a tuple of (event_id, seed_id) in conjunction with the .loc DataFrame attribute like so:

[4]:
event_id = str(cat[0].resource_id)
seed_id = 'UU.MPU..HHZ'

print(df.loc[(event_id, seed_id)])
distance_m             70463.858017
azimuth                  150.303413
back_azimuth             330.562735
distance_degrees           0.632988
vertical_distance_m     2319.000000
Name: (smi:local/248839, UU.MPU..HHZ), dtype: float64
[5]:
# or just get a particular parameter
print(df.loc[(event_id, seed_id), 'azimuth'])
150.30341278123922

The distances can be converted to km and the distrtibutions can be described.

[6]:
# Convert add km columns and delete m columns
m_columns = [x for x in df.columns if x.endswith('_m')]
km_columns = [x.replace('_m', '_km') for x in m_columns]

df_km = (
    df.assign(**{x: df[y] / 1000. for x,y in zip(km_columns, m_columns)}).
    drop(columns=m_columns)
)

# Calculate stats for source reseiver distances
df_km.describe().round(decimals=2)
[6]:
azimuth back_azimuth distance_degrees distance_km vertical_distance_km
count 408.00 408.00 408.00 408.00 408.00
mean 201.06 211.69 0.82 91.24 5.22
std 104.37 98.06 0.33 36.55 1.85
min 9.16 48.81 0.16 18.26 1.70
25% 126.47 122.45 0.63 70.46 3.69
50% 163.87 189.79 0.82 90.85 5.50
75% 293.44 307.27 1.12 124.36 6.25
max 358.12 344.77 1.29 144.00 9.31

Time

Working with dates/times can be frustrating, especially since (as of 2020) ObsPy, numpy, and pandas all use slightly different methods for working with time. ObsPlus provides some utilities to make things a little easier.

[7]:
from contextlib import suppress

import numpy as np
import obspy
import obsplus
from obsplus.utils import to_datetime64, to_timedelta64, to_utc

Note: All ObsPlus datafames use numpy/pandas datatypes.

[8]:
df = obsplus.events_to_df(obspy.read_events())
[9]:
df['time']
[9]:
0   2012-04-04 14:21:42.300
1   2012-04-04 14:18:37.000
2   2012-04-04 14:08:46.000
Name: time, dtype: datetime64[ns]

Numpy and ObsPy time differences

One difference between numpy’s datatime64 and ObsPy’s UTCDateTime is how offsets are applied. For ObsPy, numbers are simply taken as seconds, but numpy requires explicitly using timedeta64 instances.

[10]:
time_str = '2020-01-03T11:00:00'
utc = obspy.UTCDateTime(time_str)
dt64 = np.datetime64(time_str, 'ns')
[11]:
# add 1/2 second to UTCDateTime
utc2 = utc + 0.5
print(utc2)
2020-01-03T11:00:00.500000Z
[12]:
# however doing the same thing with datetime64 raises a TypeError
try:
    dt64 + 0.5
except TypeError as e:
    print(e)
ufunc 'add' cannot use operands with types dtype('<M8[ns]') and dtype('float64')
[13]:
# so you need to use a timedelta64
dt64_2 = dt64 + np.timedelta64(500, 'ms')
print(dt64_2)
2020-01-03T11:00:00.500000000
[14]:
# This is, of course, also the case with datetime64 columns
df['time'] + np.timedelta64(500, 'ms')
[14]:
0   2012-04-04 14:21:42.800
1   2012-04-04 14:18:37.500
2   2012-04-04 14:08:46.500
Name: time, dtype: datetime64[ns]

ObsPlus provides some common converters between ObsPy and Pandas/Numpy time objects.

Converting between ObsPy and Numpy time datatypes

Conversion between ObsPy UTCDateTime and numpy datetime64 objects when needed.

[15]:
# Convert a time column to an array of ObsPy UTCDateTime objects
utc_array = to_utc(df['time'])
print(utc_array)
[UTCDateTime(2012, 4, 4, 14, 21, 42, 300000)
 UTCDateTime(2012, 4, 4, 14, 18, 37) UTCDateTime(2012, 4, 4, 14, 8, 46)]
[16]:
# Convert back to datetime64
dt64_array = to_datetime64(utc_array)
print(dt64_array)
['2012-04-04T14:21:42.300000000' '2012-04-04T14:18:37.000000000'
 '2012-04-04T14:08:46.000000000']

Note: datetime64 arrays are much more efficient in terms of memory usage and computational efficiency than arrays of UTCDateTime objects.

If you prefer not to manually define timedelta64 to perform offsets, to_timedelta64 simply converts a real number to an offset in seconds.

[17]:
to_timedelta64(3.255)
[17]:
numpy.timedelta64(3255000000,'ns')