Research on a prediction model of the spatial distribution of Sthenoteuthis oualaniensis abundance in the high seas of the Arabian Sea based on PCA-GAM

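  • Summary: To predict the distribution of Sthenoteuthis oualaniensis abundance on a scientific basis and to support more rational development and use of the resource, this study used light purse seine production data for S. oualaniensis from the high seas of the Arabian Sea from 2017 to 2019, combined with concurrent data on salinity, temperature, mixed layer thickness, sea surface height anomaly, chlorophyll a concentration, sea surface current velocity, longitude and latitude, to build a PCA-GAM forecasting model for the S. oualaniensis fishing grounds in the Arabian Sea. Correlation among environmental factors produces multicollinearity, which tends to over-fit the model and reduce its forecasting ability. Using principal component analysis (PCA) as a dimension reduction technique, the environmental data were transformed into a small number of uncorrelated principal components (PCs) that retain the important information, and the first 8 PCs were used as explanatory variables of a generalized additive model (GAM). In cross-validation, the mean correlation coefficient between the predicted values and the observed catch per unit effort (CPUE) after the ln(CPUE+1) transformation was 0.5327, the mean slope of the regression model was 0.7087, and the mean intercept was 1.4711. The predicted distribution of S. oualaniensis abundance overlapped closely in space with the observed ln(CPUE+1)-transformed CPUE, indicating that the PCA-GAM model can predict the spatial distribution of S. oualaniensis abundance in the Arabian Sea reasonably well.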
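The three cross-validation statistics reported above (correlation coefficient, regression slope, regression intercept) can be illustrated with a minimal sketch. The function names are hypothetical and not from the study, and the sketch assumes the slope and intercept come from an ordinary least-squares regression of observed on predicted ln(CPUE+1); the direction of the regression is not stated in the source.

```python
import math


def log_cpue(cpue):
    """Apply the ln(CPUE + 1) transformation to a single CPUE value."""
    return math.log(cpue + 1.0)


def fold_metrics(predicted, observed):
    """Return (r, slope, intercept) for one cross-validation fold.

    r is the Pearson correlation; slope and intercept describe the
    least-squares line observed = intercept + slope * predicted
    (assumed direction, see the lead-in).
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_po = sum((p - mean_p) * (o - mean_o)
               for p, o in zip(predicted, observed))
    r = s_po / math.sqrt(s_pp * s_oo)
    slope = s_po / s_pp
    intercept = mean_o - slope * mean_p
    return r, slope, intercept


def mean_over_folds(folds):
    """Average each metric over a list of (predicted, observed) folds."""
    results = [fold_metrics(p, o) for p, o in folds]
    k = len(results)
    return tuple(sum(m[i] for m in results) / k for i in range(3))
```

A slope near 1 and an intercept near 0 would indicate predictions that are unbiased on the transformed scale, which is why these two values are usually read together with the correlation coefficient.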

     

    Abstract: To predict the distribution of Sthenoteuthis oualaniensis scientifically and to utilize its resources, this study established a PCA-GAM prediction model for S. oualaniensis based on light purse seine production data from the high seas of the Arabian Sea from 2017 to 2019, combined with data on salinity and temperature at the 0, 50, 100, 150 and 200 m water layers, mixed layer thickness, sea level anomaly, chlorophyll a concentration, sea surface velocity, longitude and latitude. Correlation among environmental factors causes multicollinearity, which tends to over-fit the model and reduce its predictive ability. Principal component analysis (PCA), a dimension reduction technique, was therefore used to transform the environmental data into a few uncorrelated principal components (PCs) that retain the important information in these environmental factors. The top 8 PCs explained on average 87.34% (±0.86%) of the variance and were taken as the explanatory variables of the GAM used to predict the distribution of S. oualaniensis. The PCA-GAM prediction model was built as a two-stage GAM: the first stage estimates the presence probability of S. oualaniensis, and the second stage estimates its log-transformed CPUE. The overall log-transformed CPUE is the product of the first-stage and second-stage results. Eight-fold cross-validation showed that the mean correlation coefficient between the predicted values and the observed log-transformed CPUE was 0.5327, the mean slope of the regression models was 0.7087, and the mean intercept was 1.4711. The predicted values and the observed log-transformed CPUE for January to April and September to December 2019 showed a high degree of spatial overlap, which indicates that the PCA-GAM model is able to predict the spatial distribution of S. oualaniensis in the Arabian Sea.
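The way the two stages are combined can be shown with a small sketch. It only illustrates the product rule described in the abstract; the fitted GAMs themselves are not reproduced, and the function names and the per-cell numbers in the usage note are hypothetical.

```python
def combine_two_stage(presence_prob, log_cpue_if_present):
    """Overall log-transformed CPUE for one grid cell.

    presence_prob: first-stage output, probability that the squid is present.
    log_cpue_if_present: second-stage output, predicted ln(CPUE + 1).
    """
    if not 0.0 <= presence_prob <= 1.0:
        raise ValueError("presence probability must lie in [0, 1]")
    return presence_prob * log_cpue_if_present


def predict_grid(stage_one, stage_two):
    """Combine per-cell outputs of the two stages into one prediction list."""
    if len(stage_one) != len(stage_two):
        raise ValueError("both stages must cover the same grid cells")
    return [combine_two_stage(p, c) for p, c in zip(stage_one, stage_two)]
```

For example, `predict_grid([0.5, 1.0], [2.0, 3.0])` gives `[1.0, 3.0]`: a cell where presence is uncertain has its predicted log-transformed CPUE scaled down, while a cell with certain presence keeps the second-stage value unchanged.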

     
