Testing a Generalizable Machine Learning Workflow for Aquatic Invasive Species on Rainbow Trout (Oncorhynchus mykiss) in Northwest Montana

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Biological invasions are accelerating worldwide, causing major ecological and economic impacts in aquatic ecosystems. The urgent decision-making needs of invasive species managers can be better met by the integration of biodiversity big data with large-domain models and data-driven products. Remotely sensed data products can be combined with existing invasive species occurrence data via machine learning models to provide the proactive spatial risk analysis necessary for implementing coordinated and agile management paradigms across large scales. We present a workflow that generates rapid spatial risk assessments on aquatic invasive species using occurrence data, spatially explicit environmental data, and an ensemble approach to species distribution modeling using five machine learning algorithms. For proof of concept and validation, we tested this workflow using extensive spatial and temporal hybridization and occurrence data from a well-studied, ongoing, and climate-driven species invasion in the upper Flathead River system in northwestern Montana, USA. Rainbow Trout (RBT; Oncorhynchus mykiss), an introduced species in the Flathead River basin, compete and readily hybridize with native Westslope Cutthroat Trout (WCT; O. clarkii lewisii), and the spread of RBT individuals and their alleles has been tracked for decades. We used remotely sensed and other geospatial data as key environmental predictors for projecting resultant habitat suitability to geographic space. The ensemble modeling technique yielded high-accuracy predictions (87% accuracy under 30-fold cross-validation). Both top predictors and model performance relative to these predictors matched current understanding of the drivers of RBT invasion and habitat suitability, indicating that temperature is a major factor influencing the spread of invasive RBT and hybridization with native WCT. The congruence between more time-consuming modeling approaches and our rapid machine-learning approach suggests that this workflow could be applied more broadly to provide data-driven management information for early detection of potential invaders.
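
As a rough illustration of what a k-fold cross-validated accuracy score for an ensemble means, the sketch below scores a five-member majority-vote ensemble with 30 folds. It is not the authors' workflow: the abstract does not name the five algorithms, so the member models here are hypothetical one-feature threshold classifiers, the data are synthetic, and every function name is an assumption made for this example.

```python
import random
from statistics import mean


def make_toy_data(n=300, n_features=5, seed=42):
    """Synthetic presence/absence records (NOT the study's data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Presence sites are shifted upward by 1.0 on every feature.
        center = 1.0 if label else 0.0
        X.append([rng.gauss(center, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


def fit_member(X, y, feature):
    """Hypothetical member model: threshold one feature at the
    midpoint between the presence and absence class means."""
    pres = mean(row[feature] for row, lab in zip(X, y) if lab == 1)
    absn = mean(row[feature] for row, lab in zip(X, y) if lab == 0)
    cut = (pres + absn) / 2
    sign = 1 if pres >= absn else -1
    return lambda row: 1 if sign * (row[feature] - cut) > 0 else 0


def ensemble_predict(members, row):
    """Majority vote across the member models."""
    votes = sum(member(row) for member in members)
    return 1 if votes * 2 > len(members) else 0


def cross_validated_accuracy(X, y, k=30, seed=0):
    """Mean held-out accuracy over k folds; members are refit per fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        # One member per feature stands in for the five algorithms.
        members = [fit_member(X_train, y_train, f) for f in range(len(X[0]))]
        correct = sum(ensemble_predict(members, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return mean(scores)


X, y = make_toy_data()
accuracy = cross_validated_accuracy(X, y, k=30)
print(f"30-fold cross-validated accuracy: {accuracy:.2f}")
```

Each record is held out exactly once, so the reported score reflects only predictions on data the ensemble did not see during fitting.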

Original language: English
Article number: 734990
Journal: Frontiers in Big Data
Volume: 4
State: Published - Oct 18 2021

Funding

We thank two reviewers and the handling editor, Bin Peng, for insightful comments that helped improve and clarify this article. We also thank Ryan Kovach and other staff from Montana Fish, Wildlife, and Parks and Montana Natural Heritage Program for stewarding and collating genetic data on Montana fish, which were used in this project. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. This research was funded primarily by a NASA ROSES Sustainable Living Systems grant (80NSSC19K0185) and a U.S. Geological Survey Northwest Climate Adaptation Science Center award G17AC000218 to CBvR.

Funder: Funder number
U.S. Geological Survey Northwest Climate Adaptation Science Center: G17AC000218
National Aeronautics and Space Administration: 80NSSC19K0185

    UN SDGs

    This output contributes to the following UN Sustainable Development Goals (SDGs)

    1. SDG 13 - Climate Action

    Keywords

    • big data analytics
    • early detection and rapid response
    • invasive species
    • machine learning
    • remote sensing
    • species distribution modeling
