Seismology provides an attractive tool to investigate physical processes in deforming systems. The seismic signals from active glaciers, for example, could enable monitoring of mechanisms including basal sliding [6, 13, 24], fracturing , melt water drainage , and iceberg calving [14, 15]. The detection of seismic events from the recorded continuous seismic waveform data is a vital first step in any analysis. Event catalogues thus constructed are needed for local seismicity studies, comparisons between locations, or detection of change over extended time periods. However, the automated detection of seismic events is complicated in environmental and geotechnical seismology by the diverse populations of signal generation mechanisms. Those generated by active glaciers, for example, can be expected to span several orders of magnitude in duration and amplitude . Volcanoes , landslides  and mining activity  similarly produce a broad range of seismic signals. As a result, the majority of cryoseismology studies to date use manual identification of events [6, 17, 18]. Manual techniques are not readily scalable nor exactly reproducible: in particular, they are not a first choice for monitoring applications nor for the data-driven detection of change.
Established event detection algorithms have largely been developed for earthquake seismology [1, 2, 3, 7]. These algorithms, including STA/LTA (short-term average/long-term average)  and template matching , are applied in real-time to seismic data to detect earthquakes and produce event catalogues [2, 7, 8, 23]. The core classes in the obspy software package are therefore designed assuming event metadata is available online alongside the waveform data ; additional functions are included outside the core classes to select events directly from waveform data. These standard algorithms are generally only applicable to non-earthquake signals by employing an experimental approach to parameter selection [5, 16]. The seismic attributes library provides software tools to download seismic waveform data from online repositories and detect environmental and geotechnical seismic events using a choice of algorithms. Algorithm options include the classic, recursive and delayed STA/LTA algorithms , and the newly developed multi-STA/LTA algorithm . The multi-STA/LTA can simultaneously extract both short and long duration events of very different signal-to-noise levels and potentially enables real-time monitoring of cryoseismic events.
Event catalogues constructed using the seismic attributes library would be expected to comprise diverse signals generated by a range of mechanisms as noted above. The various signals typically need to be separated into related clusters based on the characteristics of their waveforms in order to study the events further [10, 19]. For example, Provost et al.  consider 71 attributes based on waveform data in their study of landslide seismicity. These are broadly split into four categories: (1) waveform attributes (e.g. duration, energy, kurtosis); (2) spectral and spectrogram attributes (e.g. discrete Fourier transform); (3) network attributes (e.g. station with maximum amplitude); and (4) polarity attributes (e.g. azimuth, inclination). The seismic attributes library includes functions to calculate a number of standard signal properties: duration, ratio between ascending and descending time, energy in the autocorrelation function, energy in the frequency filtered spectrum, and the direction of wave propagation. These are provided in three bundles of attribute functions describing the waveform, spectrum and polarity of the signal (Table 1); we do not include network attributes as these are unordered, discrete variables (and largely application dependent). User-defined functions can be added to derive customised characteristics within our software architecture. The correlation between attributes can be investigated using a plotting function to inform the removal of redundant variables, if appropriate, for the subsequent clustering application to a given set of events. Data-driven techniques more generally, such as machine learning algorithms, may be readily applied to the calculated attributes to inform the current and future state of the glacier (i.e. identify signals in the lead-up to a large event ).
|2||Ratio of the mean over the maximum of the envelope signal|
|3||Ratio of the median over the maximum of the envelope signal|
|4||Ratio between ascending and descending time|
|5||Kurtosis of the raw signal (peakness of the signal)|
|6||Kurtosis of the envelope|
|7||Skewness of the raw signal|
|8||Skewness of the envelope|
|10||Energy in the first third of the autocorrelation function|
|11||Energy in the remaining part of the autocorrelation function|
|12||Ratio of 11 and 10|
|13–17||Energy of the signal filtered in 5–10 Hz, 10–50 Hz, 5–70 Hz, 50–100 Hz, and 5–100 Hz|
|18–22||Kurtosis of the signal in 5–10 Hz, 10–50 Hz, 5–70 Hz, 50–100 Hz, and 5–100 Hz frequency range|
|24||Mean of the discrete Fourier transform (DFT)||Spectral|
|25||Maximum of the DFT|
|26||Frequency at the maximum|
|27||Central frequency of the 1st quartile|
|28||Central frequency of the 2nd quartile|
|29||Median of the normalized DFT|
|30||Variance of the normalized DFT|
|34–37||Energy in DFT for|
|38||‘Spectral centroid’ (as defined by Provost et al.)|
|40||Spectral centroid width|
The analysis of waveform data from environmental and geotechnical seismic deployments requires multiple distinct steps. Following the principle that one should write programs that do one thing well, and write programs that work together , the seismic waveforms library is correspondingly split into three primary functions to streamline the workflow. The name and purpose of these functions are as follows:
The library includes the numerous attribute functions that are called by get_attributes(). Additional functions are also included to analyse the output of the primary functions (e.g. plotting), together with several private functions that support the primary workflow (documented in the code itself). The implementation of the three primary functions is outlined below.
Seismic datasets can build to a high data volume (several terabytes) if high sampling rates are required over extended periods of time, or if using data from seismic arrays with multiple seismometers. This can result in memory problems when processing the waveforms and also disk space limitations. The first function in our workflow therefore acts as a gateway, to manage the volume of data downloaded from online repositories using obspy clients. Waveform data from each seismometer is written into separate files comprising only a single day of data for a single component (e.g. vertical, north, east). The user can specify the write location of the files to split different time periods across several external drives. This same function checks if the requested data have previously been downloaded (by default) and can optionally not check for, or not download missing data (e.g. in the case of known data gaps). The waveform data are written as .mseed files using standard obspy functions.
The get_waveforms function recombines downloaded or locally stored files into a single obspy Trace object for the requested time period, thus preparing the downloaded files for the next steps in the workflow. The user should examine small subsets of the time period of interest (e.g. a week at a time) based on factors including data volume and computer hardware performance. The seismic waveforms for the requested time period and components of a given seismometer are output as an obspy Stream object. The streams from different seismometers are combined using the ‘+’ operator into a single object.
As noted in the introduction, several algorithms exist to trigger events from single component waveform data. The get_events code first separates the signals from multiple seismometers and combines their (usually) three components into a single waveform. The user can optionally specify whether the component waveforms are combined as the sum-of-squares of their amplitude (i.e. Euclidean norm) to give the wave energy, or the absolute value of the wave amplitude. Further, to ensure that taking the absolute value of the amplitude does not affect the results (e.g. doubling frequency) we tested a computationally intensive option to fit the (time-varying) principle component of the direction of wave oscillation to obtain a signed amplitude; this gave identical event detections to the absolute amplitude and so is not included in our published version of the code due to the significantly longer computation time.
We use an adapted version of the obspy coincidence_trigger function to create a reference catalogue of events based on the STA/LTA characteristic function at each seismometer for their combined component waveforms. Our version of the coincidence_trigger function includes adjustments in the algorithm to better align with the outputs needed in our workflow and improvements to computational performance that are possible for our narrower use of this function (see Figure 1 for schematic of algorithm). This function can use the standard STA/LTA algorithms available within obspy , and optionally the recently developed multi-STA/LTA algorithm . In all cases, location-specific algorithm parameters are likely to be more successful for environmental seismology applications. The start and stop times of the detected event from a single seismometer can be extended to consider small gaps in between triggers (up to length thr_event_join), creating a single, longer-duration record. The reference event catalogue is created by finding times when n (i.e. thr_coincidence_sum) of the N seismometers in the array have temporally coincident records. The physical dimensions of the seismic array are calculated at this step to estimate an upper bound on the delay in arrival time of the signal at different seismometers. This delay is added to the duration of the single seismometer records to ensure the reference events are not artificially shortened. The reference event catalogue, which includes a reference start time and duration for each event, is output as a pandas DataFrame to provide a convenient format to write to file and interrogate.
Our adapted version of the coincidence_trigger function provides additional utility for other researchers by outputting a secondary catalogue of trace metadata, for each seismometer, for all identified reference events. The triggered records at each seismometer that occur during (at least) part of the time period covered by a given reference event are joined together (if not already a single record). This record may extend beyond the time period of the reference event, be contained entirely within it, or have no detection at all in the case of weaker events. This trace (metadata) catalogue, which includes the trigger start time and duration for all seismometers, is similarly output as a pandas DataFrame. The trace metadata can be used to extract waveform data for identified events in the catalogue in a format that will be familiar to seismologists, for example, as shown in Figure 2 for a low-frequency event detected using four seismometers on the Whillans Ice Stream in Antarctica from 13:35:37 on 16 December 2010.
The get_attributes function is written following further principles , that one should write flexible and open programs. This function extracts the waveform data for a given seismometer for the duration of a given event listed in the reference event, or trace (metadata) catalogue. The waveforms for each component are stored as separate Trace objects in a single stream. This Stream object is passed to one or multiple attribute functions to derive the characteristics of the given event as measured by a given seismometer. The values of the requested attributes for the array of seismometers are output as a pandas DataFrame. The correlation between the spectral attributes included in the library are shown in Figure 3 for events detected at Ilulissat, Greenland (DK.ILULI) on 1 January 2018 using the recursive STA/LTA algorithm.
Optionally, custom attribute functions may be added to calculate any chosen characteristic of a waveform. Custom functions must take a Stream object containing one or more components as an input, and output the attribute name and value. Public functions to combine the component waveforms into the wave energy or absolute value of the wave amplitude are included in the seismic attributes library to aid the user. The attribute functions (as many as are required) are passed to the get_attributes function as optional parameters, with return values stored as above.
We have tested the seismic attributes library to find any software bugs and to ensure outputs are reproducible by other researchers running the code or using different platforms . We first created test cases based on glacier seismic signals from a field campaign seismic array on the Whillans Ice Stream in West Antarctica . Plausible uses of the code were tested by selecting some or all of the seismometers in the array, some or all of the three components of the signal recorded at each seismometer (i.e. vertical, north and east), and considering different durations of requested time (with and without small data gaps). The optional function inputs were also tested in this manner. In application, e.g. , the event catalogue produced using the seismic attributes workflow compared favourably to events that were visually classified using the same set of seismometers . Finally, the attribute functions included in the library were tested on mock waveform data to ensure they produce the theoretically expected results.
Testing was primarily carried out using a Python 3.8 installation on a macOS 10/11 system. The key features of the seismic attributes library were tested on other platforms for compatibility, including both Python 2.7 and 3.7/8 for the most up to date version of obspy, numpy, pandas and seaborn available for that installation. The code was further tested on both macOS 10/11 and Microsoft Windows operating systems, especially to verify functionality of the different file systems. The reference event and trace (metadata) catalogues were compared between these platforms to assess the consistency of code functionality.
macOS 10/11, GNU/Linux and Microsoft Windows.
Python 2 or 3.
Memory and disk space will limit the volume of data that can be processed in a contiguous chunk; the code will work on any system that can support obspy. Internet access is required to download new waveform data from online repositories.
The minimal dependency for use is Python 2/3 with obspy, numpy, pandas and seaborn packages installed.
Name: An ObsPy library for event detection and seismic attribute calculation: preparing waveforms for automated analysis
Persistent identifier: https://doi.org/10.5281/zenodo.5496794
Licence: GPL version 3
Date published: 10/09/2021
Persistent identifier: https://github.com/rossjturner/seismic_attributes
Licence: GPL version 3
Date published: 10/09/2021
Python 2 or 3.
The main purpose behind the seismic attributes library is to provide a streamlined workflow for researchers to detect and categorise seismic events from environmental and geotechnical sources in a rigorous and reproducible manner. The code will find extensive use in areas of seismology where a diverse population of seismic signals are present, including glaciers, volcanoes, landslides and mine sites. The get_events function is expected to be especially useful, in particular in glacier seismology, to provide a consistent method for creating event catalogues for ongoing machine learning and conventional seismological analysis. We anticipate that our get_attributes function may also be useful in applications outside seismology to aid in the construction of a catalogue of waveform attributes for use in machine learning. The code would need only minor modifications to handle time series data, outside of seismology, stored in different formats. It could conceivably find applications other areas of science (e.g. astrophysics), economics and many other potential applications. Limited support may be provided by contacting the corresponding author.
This research was supported under Australian Research Council’s (ARC) Special Research Initiative for Antarctic Gateway Partnership (project ID SR140300001), and ARC Discovery Projects DP190100418 and DP2101000834.
The authors have no competing interests to declare.
Allen R. Automatic phase pickers: Their present use and future prospects. Bulletin of the Seismological Society of America. 1982; 72(6B): S225–S242. DOI: https://doi.org/10.1785/BSSA07206B0225
Allen RV. Automatic earthquake recognition and timing from single traces. Bulletin of the Seismological Society of America. 1978; 68(5): 1521–1532, 10. DOI: https://doi.org/10.1785/BSSA0680051521
Anstey NA. The Sectional Auto-Correlogram and the Sectional Retro-Correlogram. Part I: The Sectional Auto-Correlogram. Geophysical Prospecting. 1966; 14(4): 389–426. DOI: https://doi.org/10.1111/j.1365-2478.1966.tb02245.x
Aso N, Tsai VC, Schoof C, Flowers GE, Whiteford A, Rada C. Seismologically Observed Spatiotemporal Drainage Activity at Moulins. Journal of Geophysical Research: Solid Earth. 2017; 122(11): 9095–9108. DOI: https://doi.org/10.1002/2017JB014578
Aster RC, Winberry JP. Glacial seismology. Reports on Progress in Physics. 2017; 80(12): 126801. DOI: https://doi.org/10.1088/1361-6633/aa8473
Barcheck CG, Tulaczyk S, Schwartz SY, Walter JI, Winberry JP. Implications of basal micro-earthquakes and tremor for ice stream mechanics: Stick-slip basal sliding and till erosion. Earth and Planetary Science Letters. 2018; 486: 54–60. DOI: https://doi.org/10.1016/j.epsl.2017.12.046
Earle PS, Shearer PM. Characterization of global seismograms using an automatic-picking algorithm. Bulletin of the Seismological Society of America. 04 1994; 84(2): 366–376. DOI: https://doi.org/10.1785/BSSA0840020366
Houliston D, Waugh G, Laughlin J. Automatic real-time event detection for seismic networks. Computers & Geosciences. 1984; 10(4): 431–436. DOI: https://doi.org/10.1016/0098-3004(84)90043-8
Kavanaugh J, Schultz R, Andriashek L, Baan M, Ghofrani H, Atkinson G, Utting D. A New Year’s Day Icebreaker: Icequakes on Lakes in Alberta, Canada. Canadian Journal of Earth Sciences. 01 2019; 56: 183–200. DOI: https://doi.org/10.1139/cjes-2018-0196
Köhler A, Ohrnberger M, Scherbaum F. Unsupervised pattern recognition in continuous seismic wavefield records using Self-Organizing Maps. Geophysical Journal International. 2010; 182(3): 1619–1630. DOI: https://doi.org/10.1111/j.1365-246X.2010.04709.x
LeVeque RJ, Mitchell IM, Stodden V. Reproducible research for scientific computing: Tools and strategies for changing the culture. Computing in Science & Engineering. 2012; 14(4): 13–17. DOI: https://doi.org/10.1109/MCSE.2012.38
Lipovsky BP, Meyer CR, Zoet LK, McCarthy C, Hansen DD, Rempel AW, Gimbert F. Glacier sliding, seismicity and sediment entrainment. Annals of Glaciology. 2019; 60(79): 182–192. DOI: https://doi.org/10.1017/aog.2019.24
Nettles M, Ekström G. Glacial Earthquakes in Greenland and Antarctica. Annual Review of Earth and Planetary Sciences. 2010; 38(1): 467–491. DOI: https://doi.org/10.1146/annurev-earth-040809-152414
Olsen KG, Nettles M. Constraints on Terminus Dynamics at Greenland Glaciers From Small Glacial Earthquakes. Journal of Geophysical Research: Earth Surface. 2019; 124(7): 1899–1918. DOI: https://doi.org/10.1029/2019JF005054
Podolskiy EA, Walter F. Cryoseismology. Reviews of Geophysics. 2016; 54(4): 708–758. DOI: https://doi.org/10.1002/2016RG000526
Pomeroy J, Brisbourne A, Evans J, Graham D. The search for seismic signatures of movement at the glacier bed in a polythermal valley glacier. Annals of Glaciology. 2013; 54(64): 149–156. DOI: https://doi.org/10.3189/2013AoG64A203
Pratt MJ, Winberry JP, Wiens DA, Anandakrishnan S, Alley RB. Seismic and geodetic evidence for grounding-line control of Whillans Ice Stream stick-slip events. Journal of Geophysical Research: Earth Surface. 2014; 119(2): 333–348. DOI: https://doi.org/10.1002/2013JF002842
Provost F, Hibert C, Malet J-P. Automatic classification of endogenous landslide seismicity using the Random Forest supervised classifier. Geophysical Research Letters. 2017; 44(1): 113–120. DOI: https://doi.org/10.1002/2016GL070709
Rouet-Leduc B, Hulbert C, Lubbers N, Barros K, Humphreys CJ, Johnson PA. Machine learning predicts laboratory earthquakes. Geophysical Research Letters. 2017; 44(18): 9276–9282. DOI: https://doi.org/10.1002/2017GL074677
Vaezi Y, Van der Baan M. Comparison of the STA/LTA and power spectral density methods for microseismic event detection. Geophysical Supplements to the Monthly Notices of the Royal Astronomical Society. 2015; 203(3): 1896–1908. DOI: https://doi.org/10.1093/gji/ggv419
Winberry JP, Anandakrishnan S, Wiens DA, Alley RB. Nucleation and seismic tremor associated with the glacial earthquakes of Whillans Ice Stream, Antarctica. Geophysical Research Letters. 2013; 40(2): 312–315. DOI: https://doi.org/10.1002/grl.50130
Zhou Z, Cheng R, Cai X, Ma D, Jiang C. Discrimination of Rock Fracture and Blast Events Based on Signal Complexity and Machine Learning. Shock and Vibration. 02 2018; 2018: 1–10. DOI: https://doi.org/10.1155/2018/9753028