Working with Catalogs¶
ObsPy’s event representation is based on the FDSN QuakeML standard, which is very comprehensive and arguably the best standard available. However, the Catalog object (and friends) can be a bit difficult to work with for a few reasons:
1. Often the desired data is deeply nested and hard to aggregate
2. Identifying data relations depends on the complex behavior of ObsPy's `ResourceIdentifier`
3. Preferred objects (e.g. origin, magnitude) are often not set
ObsPlus tries to solve all of these problems. The first is addressed by the DataFrame Extractor and other tree traversal tools. The second and third are addressed by a collection of catalog validators.
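To make the first point concrete, here is a minimal sketch of the kind of tree traversal that turns a nested event structure into flat rows. The `Toy*` classes and `iter_instances` are hypothetical stand-ins written for this illustration; they are not ObsPy classes or ObsPlus functions, and the real tools handle many more cases.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, List


# Hypothetical stand-ins for a nested event tree (not ObsPy classes).
@dataclass
class ToyArrival:
    phase: str


@dataclass
class ToyOrigin:
    arrivals: List[ToyArrival] = field(default_factory=list)


@dataclass
class ToyEvent:
    origins: List[ToyOrigin] = field(default_factory=list)


def iter_instances(obj: Any, cls: type) -> Iterator[Any]:
    """Depth-first walk that yields every instance of ``cls`` under ``obj``."""
    if isinstance(obj, cls):
        yield obj
    if isinstance(obj, (list, tuple)):
        children = obj
    elif hasattr(obj, "__dict__"):
        children = vars(obj).values()
    else:
        children = ()
    for child in children:
        yield from iter_instances(child, cls)


events = [
    ToyEvent(origins=[ToyOrigin(arrivals=[ToyArrival("P"), ToyArrival("S")])]),
    ToyEvent(origins=[ToyOrigin(arrivals=[ToyArrival("P")]), ToyOrigin()]),
]
# Flatten the tree into one row per arrival, ready for a table or DataFrame.
rows = [{"phase": arrival.phase} for arrival in iter_instances(events, ToyArrival)]
```

The walk assumes the tree has no reference cycles; a more careful version would track visited objects.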
Catalog Validation¶
In addition to being difficult to navigate (point 1), catalogs are non-trivial to validate and to keep working as expected (points 2-3). ObsPlus’s validate_catalog helps by ensuring all resource_ids point to the correct objects, setting preferred objects, and performing other sanity checks. The default event validation function in ObsPlus is a bit opinionated and was built specifically for the NIOSH style of QuakeML, but you may still find it useful. Additionally, you can create your own validation namespace and define validators for your own data/schema as described by the validators documentation.
Catalog setup¶
Let’s create a catalog that has the following problems:
resource_ids on arrivals no longer point to the correct picks (only possible to break on ObsPy versions <= 1.1.0)
no preferred origin/magnitudes are set
ObsPlus will go through and set the resource_ids to point to the correct objects, and set all the preferred_{whatever} to the last element in the {whatever}s list (for whatever in [‘magnitude’, ‘origin’, ‘focal_mechanism’]).
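The "last element" rule for preferred objects can be sketched as follows. This is an illustration of the rule as described above, not the ObsPlus implementation: `set_preferred_to_last` is a hypothetical helper, the event is a plain `SimpleNamespace` stand-in, and the choice to fill only values that are not already set is an assumption.

```python
from types import SimpleNamespace


def set_preferred_to_last(event, names=("origin", "magnitude", "focal_mechanism")):
    """Point each unset preferred_<name>_id at the last item in <name>s."""
    for name in names:
        items = getattr(event, f"{name}s", [])
        attr = f"preferred_{name}_id"
        # Assumption: existing preferences are left alone; empty lists are skipped.
        if items and getattr(event, attr, None) is None:
            setattr(event, attr, items[-1].resource_id)
    return event


# Stand-in event with two origins, one magnitude, and no focal mechanisms.
toy_event = SimpleNamespace(
    origins=[
        SimpleNamespace(resource_id="origin/1"),
        SimpleNamespace(resource_id="origin/2"),
    ],
    magnitudes=[SimpleNamespace(resource_id="magnitude/1")],
    focal_mechanisms=[],
    preferred_origin_id=None,
    preferred_magnitude_id=None,
    preferred_focal_mechanism_id=None,
)
set_preferred_to_last(toy_event)
```

The attribute names mirror those on an ObsPy `Event` (`origins` and `preferred_origin_id`, and so on), which is why the helper can build them from the singular name.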
[4]:
import obsplus
import obspy
import obspy.core.event as ev


# create catalog 1
def create_cat1():
    """A catalog with an arrival that doesn't refer to any pick"""
    time = obspy.UTCDateTime("2017-09-22T08:35:00")
    wid = ev.WaveformStreamID(
        network_code="UU", station_code="TMU", location_code="", channel_code="HHZ"
    )
    pick = ev.Pick(time=time, phase_hint="P", waveform_id=wid)
    arrival = ev.Arrival(pick_id=pick.resource_id, waveform_id=wid)
    origin = ev.Origin(time=time, arrivals=[arrival], latitude=45.5, longitude=-111.1)
    description = ev.EventDescription(create_cat1.__doc__)
    event = ev.Event(origins=[origin], picks=[pick], event_descriptions=[description])
    cat = ev.Catalog(events=[event])
    # create a copy of the catalog. In older versions this would screw up
    # the resource ids, but the issue seems to be fixed now.
    cat.copy()
    return cat


cat = create_cat1()
event = cat[0]
[5]:
arrival = event.origins[-1].arrivals[-1]
pick = event.picks[-1]
Validate¶
These two problems can be fixed in place with the validate_catalog function.
[6]:
obsplus.validate_catalog(cat)
[6]:
1 Event(s) in Catalog:
2017-09-22T08:35:00.000000Z | +45.500, -111.100
[7]:
arrival = event.origins[0].arrivals[0]
# now we will get the correct pick through the arrival object, even on older versions of obspy
Fail fast¶
For issues that ObsPlus doesn’t know how to fix, an AssertionError will be raised. If you are generating or downloading catalogs, it may be useful to run them through the validation function right away so that you learn of any issue before attempting meaningful analysis.
For example, an arrival that doesn’t refer to any known pick could be a quality issue that you would like to know about.
[8]:
# create a problem with the catalog
old_pick_id = cat[0].origins[0].arrivals[0].pick_id
cat[0].origins[0].arrivals[0].pick_id = None
try:
    obsplus.validate_catalog(cat)
except AssertionError:
    pass
# undo the problem
cat[0].origins[0].arrivals[0].pick_id = old_pick_id
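When many catalogs arrive at once, the same fail-fast idea can be applied to a whole batch. The sketch below is one way to do that: `screen` and `toy_check` are hypothetical helpers written for this example, and the records are plain dicts standing in for catalogs so that the snippet runs on its own.

```python
from typing import Any, Callable, Iterable, List, Tuple


def screen(
    items: Iterable[Any], check: Callable[[Any], Any]
) -> Tuple[List[Any], List[Tuple[Any, str]]]:
    """Run ``check`` on each item; return passed items and (item, message) failures."""
    passed, failed = [], []
    for item in items:
        try:
            check(item)
        except AssertionError as error:
            failed.append((item, str(error)))
        else:
            passed.append(item)
    return passed, failed


def toy_check(record: dict) -> None:
    """Stand-in validator: every arrival must reference a known pick."""
    unknown = set(record["arrival_pick_ids"]) - set(record["pick_ids"])
    assert not unknown, f"arrivals reference unknown picks: {sorted(unknown)}"


records = [
    {"pick_ids": ["p1", "p2"], "arrival_pick_ids": ["p1"]},
    {"pick_ids": ["p1"], "arrival_pick_ids": ["p9"]},
]
passed, failed = screen(records, toy_check)
```

With real data, `check` could presumably be `obsplus.validate_catalog` and `items` a list of catalogs, so that problem catalogs are set aside before any analysis starts.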
Adding custom validators¶
See the validators section to learn how to create your own validators. The following example shows how to use a subset of ObsPlus validators.
[9]:
# import the validators that are desired
import obspy.core.event as ev
from obsplus.events.validate import (
    attach_all_resource_ids,
    check_arrivals_pick_id,
    check_duplicate_picks,
)
from obsplus.utils.validate import validate, validator
# create new validator namespace
namespace = "_new_test"
validator(namespace, ev.Event)(attach_all_resource_ids)
validator(namespace, ev.Event)(check_arrivals_pick_id)
validator(namespace, ev.Event)(check_duplicate_picks)
# run the new validator
validate(cat, namespace)
[9]:
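For intuition about the namespace pattern used above, here is a toy model of a validator registry. It is not how ObsPlus implements `validator` and `validate`; the names `toy_validator`, `toy_validate`, and `check_has_picks` are hypothetical. Unlike the ObsPlus call above, which is given a whole catalog for validators registered on `ev.Event`, this toy version only checks the object it is handed directly.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Toy registry: (namespace, class) -> list of validator functions.
_REGISTRY: Dict[Tuple[str, type], List[Callable]] = defaultdict(list)


def toy_validator(namespace: str, cls: type) -> Callable:
    """Return a decorator that registers a function for (namespace, cls)."""

    def decorator(func: Callable) -> Callable:
        _REGISTRY[(namespace, cls)].append(func)
        return func

    return decorator


def toy_validate(obj, namespace: str):
    """Run every validator in ``namespace`` whose class matches ``obj``."""
    for (space, cls), funcs in _REGISTRY.items():
        if space == namespace and isinstance(obj, cls):
            for func in funcs:
                func(obj)  # validators signal problems with AssertionError
    return obj


@toy_validator("_toy", dict)
def check_has_picks(record: dict) -> None:
    """Custom check: a record must carry at least one pick."""
    assert record.get("picks"), "record has no picks"


toy_validate({"picks": ["p1"]}, "_toy")  # passes silently
```

Keeping validators grouped by namespace means a project can define its own set of checks without touching the defaults.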