As seismic networks have grown denser and seismograph observation capability has improved, the number of observed seismic events has increased dramatically and location accuracy has steadily improved. Obtaining fault geometry and fault parameters from massive seismic datasets has therefore become an essential method in seismogenic structure research. Current studies that derive faults and their parameters from seismic data select the data in one of two main ways. The first selects seismic data empirically, based on an understanding of the fault structures and the spatial distribution of the events, and then fits a fault plane to the selected data. This approach depends on prior information, i.e., knowledge of existing fault structures and of the linear distribution of earthquakes, and it handles weakly linear trends poorly. The second is based on the spatial clustering of seismic data and uses unsupervised clustering techniques from machine learning to select the data. It avoids the dependence on experience and is better suited to extracting fault segment data from massive seismic datasets. Fault parameters can then be inverted from the fault segment data to determine the fault structure and quantify its parameters. However, current clustering techniques for obtaining fault parameters have limitations: the optimal parameters are difficult to select, data of different densities are processed with the same parameters, and the methods generalize poorly. To identify faults and obtain fault parameters automatically from the spatial distribution of earthquakes while avoiding these limitations, this study presents a new method based on an improved DBSCAN algorithm.
The proposed method uses the k-average nearest neighbor (K-ANN) method and the mathematical expectation method to generate candidate sets of the eps and minPts threshold parameters, from which the optimal parameters are selected according to density hierarchy stability. To account for differences in the spatial density of seismic events, both between faults and along a single fault, clustering is performed layer by layer from high density to low density. First, these steps select the optimal clustering parameters automatically and identify the fault segments. Second, the parameters of each identified fault segment are calculated by combining a global search using simulated annealing (SA) with a local search using the Gauss-Newton (GN) method. Third, adjacent, similar fault segments are merged. Finally, the faults and their parameters are obtained.
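The candidate-generation step can be illustrated with a minimal sketch. It assumes one common reading of K-ANN and the mathematical expectation method: for each neighbor rank k, eps is taken as the mean distance from every event to its k-th nearest neighbor, and minPts as the mean number of events lying within that eps. The function names, the rounding of minPts, and the toy coordinates are hypothetical and are not taken from this study. The selection by density hierarchy stability and the layer-by-layer clustering are not shown.

```python
import math


def pairwise_distances(points):
    """Return the full n x n Euclidean distance matrix as nested lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def candidate_parameters(points, k_max=None):
    """Build one (eps, min_pts) candidate pair per neighbor rank k.

    Assumed definitions (not confirmed by the source):
      eps_k     = mean over all points of the distance to the k-th nearest neighbor
      min_pts_k = mean count of points within eps_k of each point
                  (the point itself included), rounded to an integer
    """
    n = len(points)
    dist = pairwise_distances(points)
    # After sorting, index 0 of each row is the zero distance to the point itself.
    sorted_rows = [sorted(row) for row in dist]
    k_max = n - 1 if k_max is None else min(k_max, n - 1)

    candidates = []
    for k in range(1, k_max + 1):
        eps = sum(row[k] for row in sorted_rows) / n
        neighbor_counts = [sum(1 for d in row if d <= eps) for row in dist]
        min_pts = max(1, round(sum(neighbor_counts) / n))
        candidates.append((eps, min_pts))
    return candidates


if __name__ == "__main__":
    # Toy epicenters: two tight groups standing in for two fault segments.
    events = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
    for eps, min_pts in candidate_parameters(events):
        print(f"eps={eps:.3f}  minPts={min_pts}")
```

Each candidate pair would then be passed to a DBSCAN run, and under this reading the pair retained is the one for which the clustering result stays stable across neighboring candidates. The brute-force distance matrix is only practical for small catalogs; a spatial index would be needed for massive seismic datasets.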
The reliability of the automatic fault identification method was verified using synthetic data and the double-difference location catalog of the Tangshan area, China. The following results were obtained: Ⅰ. The improved DBSCAN algorithm can automatically identify fault segments, as shown by its application to both the synthetic data and the double-difference location data of the Tangshan area. Ⅱ. From the double-difference location data of the Tangshan area, the improved DBSCAN algorithm identified eight fault segments. Their names, with strike and dip angle in parentheses, are: Douhe fault segment (229.1°, 51.6°), Weishan-Fengnan fault segment (230.4°, 88.4°), Luanxian-Laoting fault segment (132.2°, 89.3°), Lulong fault segment (31.7°, 88.6°), Xujialou-Wangxizhuang fault segment (191.3°, 88.4°), Luanxian fault north segment (31°, 88.2°), Leizhuang fault segment (229.5°, 73.8°), and Chenguantun fault segment (84.9°, 85.4°). The parameters of the first five faults are mostly consistent with previous research results. The last three faults are newly identified in this study from the seismic catalog, and the parameters of two of them have been confirmed by previous research results or by focal mechanism parameters on the faults.
In summary, the improved DBSCAN algorithm presented in this study can identify fault segments automatically, but it still has problems that urgently need to be addressed. In follow-up research, we will continue to improve the method and strengthen its automatic identification capability, so as to provide more accurate fault data for related research.