ABSTRACT: A variety of stakeholders require information about marine systems. In the open ocean, pilots of marine vessels require knowledge about environmental conditions for safe passage and route planning. On the coastline, communities rely on information about nearshore dynamics to increase safety from coastal hazards such as nearshore pollutants, coastal erosion, or dangerous recreational conditions (e.g., rip currents). Models provide information for environmental health and safety in the form of forecasts or general knowledge of marine systems. Large volumes of data from a variety of marine sensors are now available thanks to progress in computer processing and data storage. These data should be leveraged to advance the boundaries of marine science knowledge. Herein, Machine Learning (ML) techniques are applied to improve different types of marine science models and increase the knowledge of marine systems. Two types of ML techniques are considered: traditional machine learning and deep learning. The techniques are applied in a transparent way, to verify that each ML routine bases its predictions on appropriate reasoning. The transferability of the ML routines is also assessed to determine the limits of their generalizability. The thesis is organized in a manuscript format, where the first and last chapters serve as overall Introduction and Conclusions, respectively. The central three chapters are individual manuscripts. The second chapter applies a traditional ML technique called a decision tree to numerical wave model output. The decision tree predicts corrections to significant wave height forecasts generated by a numerical wave model at a 24-hour time horizon. The wave model output was taken at buoy locations offshore of the United States Pacific Northwest coastline. The application of the decision tree increased wave model skill more for winter than for summer.
The decision tree also made accurate predictions in a geospatial transfer experiment, where it predicted error for a location that was not used in the training data. However, its predictions were less accurate when it was applied to a different time period. The transparent nature of the algorithm allowed for inspection of its architecture, which revealed consistent underestimation of significant wave height for data points associated with mid-range wave periods (6-12 s). The third chapter develops an automated technique to recognize morphological shapes within coastal imagery using a Convolutional Neural Network (CNN). These shapes are frequently occurring nearshore morphological patterns called beach states. The input to the CNN was coastal imagery from two study sites and the output was beach state labels. The two study sites were Narrabeen, Sydney, Australia and Duck, North Carolina, United States. Three ensembles of CNNs were trained: two single-site CNNs (each trained at an individual location) and one multi-site CNN (trained at both locations). Each CNN was applied to both locations to determine its skill at the location where it was trained (the original location) in a self-test and its skill at the location where it was not trained (the alternate location) in a transfer-test. For the self-tests, the CNN skill was comparable to inter-labeller agreement, with skill at Duck higher than skill at Narrabeen (F-scores of 0.8 for Duck and 0.59 for Narrabeen). The CNN skill was reduced in the transfer-tests. However, if at least 25% of the training data came from the alternate location, the skill increased to within 10% of the skill at the original location. A visualization technique (Guided Grad-CAM) revealed areas of importance within input images for CNN decision making, and confirmed that the CNN identified the appropriate morphological characteristics (e.g., terraces or rip currents) for each classification.
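For readers unfamiliar with the F-score used to report CNN skill in the third chapter, the following is a minimal sketch of how a per-class and class-averaged F-score can be computed from true and predicted labels. The abstract does not state how the reported F-scores were aggregated, so the macro (unweighted) average used here is an assumption, and the label names `state_a`, `state_b`, and `state_c` are hypothetical placeholders rather than the beach state names used in the thesis.

```python
from collections import Counter


def per_class_f1(true_labels, predicted_labels):
    """Return a dict mapping each label to its F1 score.

    F1 = 2*TP / (2*TP + FP + FN), computed one class at a time.
    """
    labels = sorted(set(true_labels) | set(predicted_labels))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class received a false positive
            fn[truth] += 1      # true class received a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(true_labels, predicted_labels):
    """Unweighted mean of the per-class F1 scores (assumed aggregation)."""
    scores = per_class_f1(true_labels, predicted_labels)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical labels for illustration only.
truth = ["state_a", "state_a", "state_b", "state_b", "state_c", "state_c"]
preds = ["state_a", "state_b", "state_b", "state_b", "state_c", "state_a"]
scores = per_class_f1(truth, preds)
overall = macro_f1(truth, preds)
```

In a self-test, `truth` and `preds` would come from the site the CNN was trained on; in a transfer-test, they would come from the alternate site.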
The fourth chapter builds on the third and applies a CNN to a long (>20 years) dataset to detect alongshore variability of beach state, quantified as a beach probability simplex, thereby advancing the beach state framework from discrete space to continuous space. The approach from the third chapter is modified to detect alongshore differences in beach state using a windowing technique. The CNN produced beach probability simplices from a 28-year dataset of images from Duck, NC, and results showed that most (67%) of the resulting beach probability simplices encompassed more than one state. The 28-year time series was dominated by an annual cycle, where simplices that encompassed onshore states occurred in summer, offshore states in winter, and intermediate states in fall or spring. The mean value of the beach probability simplex exhibited a strong relationship with significant wave height (28-year daily average R = 0.77) and mean wave direction (28-year daily average R = 0.84). The simplices that encompassed the highest number of states (three) were most likely to occur in fall, specifically in September.
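The idea of a beach probability simplex can be illustrated with a short sketch. It is not the thesis implementation: it assumes that the CNN emits one vector of class scores (logits) per alongshore window, that a softmax turns each into probabilities, that the windows are combined by a simple mean, and that a state counts as "encompassed" when its probability exceeds a hypothetical threshold of 0.1. All function names and the threshold are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def beach_probability_simplex(window_logits):
    """Average per-window class probabilities into one probability vector.

    window_logits: list of per-window score lists, one score per beach state.
    The mean over windows is an assumed aggregation, not the thesis method.
    """
    probs = [softmax(scores) for scores in window_logits]
    n_states = len(probs[0])
    return [sum(p[k] for p in probs) / len(probs) for k in range(n_states)]


def states_encompassed(simplex, threshold=0.1):
    """Indices of states whose probability meets the assumed threshold."""
    return [k for k, p in enumerate(simplex) if p >= threshold]


# Two hypothetical alongshore windows favouring different states.
window_logits = [[2.0, 0.0, -2.0], [0.0, 2.0, -2.0]]
simplex = beach_probability_simplex(window_logits)
covered = states_encompassed(simplex)
```

Because the two windows favour different states, the resulting vector places weight on more than one state, which is the kind of continuous description a single discrete label cannot express.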