The 2009 L'Aquila earthquake, with a moment magnitude (Mw) of 6.1, was a highly damaging seismic event in the South-Central Apennines region. This study presents an extensive analysis of the aftershock sequence spanning 254 days, from April 6 to December 20, 2009, providing a valuable resource for aftershock characterization with Machine Learning (ML) approaches. The dataset is formatted for use with a variety of ML applications and frameworks, such as SeisBench. It uses the widely adopted Hierarchical Data Format (HDF5) to store and organize the large waveform data, and plain CSV files to manage the associated metadata. To simplify access, the dataset is split by day using day-of-year (DOY) naming, so that end-users can retrieve and use only the days they need. We packaged a comprehensive dataset comprising 63,704 earthquakes, resulting in over 1.25 million 3-component seismic traces. The dataset includes absolute (HypoEllipse) and relative (HypoDD) earthquake locations, derived from data collected by three seismic networks (MN, IV, XJ) encompassing 67 stations. We distribute 3-component waveforms (70 seconds long), sorted by event and station and trimmed around the associated first-arrival phase. A key aspect of this study is the collection of multiple P- and S-wave arrival picks within the same cut (up to 8 picks in total per trace), which significantly enhances its potential for the development of current ML algorithms, especially for seismic-phase picking. We collect and provide 201 metadata fields offering detailed information on stations, traces, sources, paths, and data quality. The data include seismic event count and velocity information, enabling a thorough understanding of the aftershock dynamics.
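As an illustration of the day-wise DOY organization, the following minimal Python sketch converts a calendar date into a zero-padded day-of-year label and builds a pair of file names from it. The file-name pattern (`waveforms_<year>.<doy>.hdf5`, `metadata_<year>.<doy>.csv`) and the helper names are hypothetical, chosen only for this example; the actual names should be taken from the distributed files.

```python
from datetime import date


def doy_label(day):
    """Return the zero-padded day-of-year (DOY) label of a date, e.g. '096'."""
    return f"{day.timetuple().tm_yday:03d}"


def daily_file_names(day):
    """Build the (waveform, metadata) file names for one day.

    The naming pattern is an assumption made for illustration only;
    check the real file names in the released dataset.
    """
    label = f"{day.year}.{doy_label(day)}"
    return f"waveforms_{label}.hdf5", f"metadata_{label}.csv"


if __name__ == "__main__":
    first_day = date(2009, 4, 6)    # start of the sequence
    last_day = date(2009, 12, 20)   # end of the sequence
    print(doy_label(first_day), doy_label(last_day))
    print(daily_file_names(first_day))
```

With a helper of this kind, a user can select a single day or a short time window without touching the rest of the archive.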
This dataset aims to improve ML algorithms in seismology and advance earthquake detection, characterization, hazard assessment, and understanding of seismic processes. In a nutshell:
- 254 days of aftershocks (2009-04-06 up to 2009-12-20).
- 63,704 earthquakes.
- 1,258,006 3-component traces, each 70 seconds long.
- Absolute (HypoEllipse) and relative (HypoDD) locations.
- 3 seismic networks (MN, IV, XJ) for a total of 67 stations.
- Multiple P and S arrival picks (up to 8 per cut), supporting the development and analysis of current Machine Learning algorithms.
- 201 metadata fields providing information on the station, trace, source, path, and data quality.
- Metadata information on both counts and velocity.
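
Because a single 70-second cut can hold several P and S picks, a phase-picking workflow typically needs every pick expressed as a sample index within the trace. The sketch below shows one way this could be done with the Python standard library. The column names (`trace_name`, `trace_start_time`, `phase`, `pick_time`), the one-row-per-pick layout, the 100 Hz sampling rate, and the sample rows are all assumptions made for illustration; they are not the dataset's actual schema or values.

```python
import csv
import io
from datetime import datetime

TRACE_LENGTH_S = 70.0      # cut length stated in the dataset description
SAMPLING_RATE_HZ = 100.0   # assumption; read the real rate from the metadata

# Made-up rows in a hypothetical one-row-per-pick layout.
SAMPLE_CSV = """trace_name,trace_start_time,phase,pick_time
ev1_STA1,2009-04-07T10:00:00.000000,P,2009-04-07T10:00:10.500000
ev1_STA1,2009-04-07T10:00:00.000000,S,2009-04-07T10:00:13.250000
ev1_STA1,2009-04-07T10:00:00.000000,P,2009-04-07T10:01:20.000000
"""


def pick_to_sample(trace_start, pick_time, rate_hz=SAMPLING_RATE_HZ):
    """Convert a pick time to a sample index, or None if outside the cut."""
    offset_s = (pick_time - trace_start).total_seconds()
    if not 0.0 <= offset_s < TRACE_LENGTH_S:
        return None
    return round(offset_s * rate_hz)


def picks_per_trace(csv_text):
    """Group (phase, sample index) pairs by trace, skipping out-of-cut picks."""
    picks = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        start = datetime.fromisoformat(row["trace_start_time"])
        pick = datetime.fromisoformat(row["pick_time"])
        sample = pick_to_sample(start, pick)
        if sample is not None:
            picks.setdefault(row["trace_name"], []).append((row["phase"], sample))
    return picks


if __name__ == "__main__":
    print(picks_per_trace(SAMPLE_CSV))
```

In this example the third row falls 80 seconds after the trace start, so it lies outside the 70-second cut and is dropped; the remaining two picks become per-sample labels for the same trace.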