<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Machine Learning to Classify Vortex Wakes of Energy Harvesting Oscillating Foils</title></titleStmt>
			<publicationStmt>
				<publisher>AIAA</publisher>
				<date>03/01/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10477018</idno>
					<idno type="doi">10.2514/1.J062091</idno>
					<title level='j'>AIAA Journal</title>
<idno>0001-1452</idno>
<biblScope unit="volume">61</biblScope>
<biblScope unit="issue">3</biblScope>					

					<author>Bernardo Luiz Ribeiro</author><author>Jennifer A. Franck</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<p>A machine learning model is developed to establish wake patterns behind oscillating foils for energy harvesting. The role of the wake structure is particularly important for array deployments of oscillating foils since the unsteady wake highly influences the performance of downstream foils. This work explores 46 oscillating foil kinematics, with the goal of parameterizing the wake based on the input kinematic variables and grouping vortex wakes through image analysis of vorticity fields. A combination of a convolutional neural network with long short-term memory units is developed to classify the wakes into three classes. To fully verify the physical wake differences among foil kinematics, a convolutional autoencoder combined with k-means++ clustering is used to reveal four wake patterns via an unsupervised method. Future work can use these patterns to predict the performance of foils placed in the wake and build optimal foil arrangements for tidal energy harvesting.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. Introduction</head><p>This paper utilizes a machine learning approach to classify vortex wake structures behind an energy harvesting oscillating foil. Understanding the vortex formation and resulting wake structure reveals information about the upstream disturbance, which could potentially be linked to the underlying flow conditions and/or oscillating foil kinematics.</p><p>Although commonly used in propulsive applications, oscillating foils can also extract energy from the flow in a manner similar to a rotational turbine <ref type="bibr">[1,</ref><ref type="bibr">2]</ref>. Furthermore, due to the coherent vortex wake of opposite vortex signs generated from the upstroke and downstroke foil motion, there is a potential for cooperative motion within tightly packed array configurations to improve performance <ref type="bibr">[3]</ref>. In order to create control laws to optimize performance in array configurations, it is critical to fully understand and model the wake structure as a function of flow conditions and foil kinematics.</p><p>Wake structures have traditionally been characterized for cylindrical bluff bodies, as in the canonical work of Williamson and Roshko <ref type="bibr">[4]</ref>, who described vortex wakes with an 'mS + nP' notation, where m is the number of single vortices (S) shed per cycle, and n is the number of clockwise/counter-clockwise vortex pairs (P). This notation has been extended to oscillating foils with some success when the foil is in pure pitching <ref type="bibr">[5,</ref><ref type="bibr">6]</ref> or plunging <ref type="bibr">[7]</ref> motion. However, previous investigations focused on kinematics for propulsive foils and did not consider the wakes generated by an oscillating foil in the energy harvesting regime. 
When used for energy harvesting, the foil kinematics are characterized by a lower non-dimensional frequency and higher pitch/heave amplitudes compared to oscillating foil propulsion. The high amplitudes result in a rich variety of wakes with multiple vortices shed during each foil stroke that are often chaotic and not easily identified by the 'mS + nP' notation <ref type="bibr">[8,</ref><ref type="bibr">9]</ref>.</p><p>To assist in the wake modeling of bluff bodies outside the canonical characterization, various machine learning techniques can be utilized. In particular, convolutional neural networks (CNN) have received much interest owing to their ability to process image data for pattern recognition and prediction <ref type="bibr">[10]</ref>. For instance, CNNs have been used to analyze cylinder and airfoil flows at various Reynolds numbers, yielding accurate predictions of the velocity field <ref type="bibr">[11,</ref><ref type="bibr">12]</ref> and force <ref type="bibr">[13]</ref> when compared with numerical data. Another example is a CNN-based vortex identification procedure <ref type="bibr">[14]</ref>, which does not require the user-defined thresholds needed by the Q-criterion <ref type="bibr">[15]</ref> or the λ_2 criterion <ref type="bibr">[16]</ref>.</p><p>When analyzing unsteady flows, the time evolution of structures can be captured with recurrent neural networks that use long short-term memory (LSTM) <ref type="bibr">[17]</ref>, which can predict flow quantities by holding information from an input sequence, and not simply from a single input <ref type="bibr">[18]</ref>. Using flow information from the past five timesteps, Nakamura et al. 
<ref type="bibr">[19]</ref> predicted turbulent structures in a channel flow using a convolutional autoencoder combined with LSTM.</p><p>LSTM was also implemented in a dynamic wind farm wake model that predicts the main features of unsteady wind turbine wakes almost as well as high-fidelity computational models <ref type="bibr">[20]</ref>. The integration of convolutional layers and LSTM units has also predicted unsteady flow fields behind bluff bodies such as a cylinder and a foil <ref type="bibr">[18]</ref>.</p><p>The goal of this investigation is to develop a neural network that integrates convolutional layers and LSTM units towards classification of the bluff body wake structures behind oscillating foils. Using such classification will enable predictive models for these chaotic wakes, and inform performance optimization of arrays of foils operating as energy harvesters. Classification models have been previously implemented for wakes of propulsive foils <ref type="bibr">[21,</ref><ref type="bibr">22]</ref> and cylinders <ref type="bibr">[23]</ref>. This work classifies and clusters the drag-based wake from an oscillating foil in energy harvesting mode, which distinguishes itself from propulsive oscillating foils due to the lower oscillating frequency, higher pitch, and higher heave amplitudes <ref type="bibr">[24]</ref>. Prior work has also largely classified wake structures from point measurements within the wake. In contrast, this research classifies vortex wake structures from vorticity flow fields. Using images as the input data allows for connections linking wake features such as the size and strength of vortices with the underlying foil kinematics. Due to the unsteady interactions between vortices that affect the wake trajectory behind each foil configuration, the LSTM network aims to provide information on how each wake image is linked throughout time, improving the classification outcomes. 
The results of the supervised classification groupings are compared directly against an unsupervised convolutional autoencoder (CAE) and clustering methodology <ref type="bibr">[25]</ref> to confirm and update the boundaries between classes. Finally, the physics of the wake structure in each of the classified groups are explained and correlated with energy harvesting efficiency and foil kinematics.</p><p>Section II gives an overview of the foil kinematics and simulation methods, followed by the initial groupings for the classification model in Section III. The classification network architecture and results are presented in Section IV. Section V describes the groupings and the updates obtained from an unsupervised clustering of the input images, and conclusions are presented in Section VI.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. Computational Methods</head><p>This section discusses the computational fluid dynamics methods, including the foil kinematics, flow solver and mesh.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Foil Parameters</head><p>The foil kinematics are defined by three parameters, namely pitch amplitude, θ_o, heave amplitude, h_o, and reduced frequency, fc/U_∞, where c is the foil's chord length and U_∞ is the freestream velocity. To generate a range of energy harvesting vortex wake structures, 46 unique kinematics are prescribed to a 10% thick elliptical foil through numerical simulations. The foil motion is sinusoidal in pitch and heave, utilizing a range of frequencies and amplitudes previously established as effective at energy harvesting <ref type="bibr">[8,</ref><ref type="bibr">26]</ref>. As opposed to an airfoil geometry, a thin elliptical foil is utilized because geometry has been shown to have minor effects on efficiency <ref type="bibr">[27]</ref> and the fore-aft symmetry is desirable for harvesting energy from tidal flows. The kinematics with respect to time t are described in lab-fixed coordinates as</p><p>and</p><p>where h(t) and θ(t) are the prescribed heave and pitch motions, respectively, with pitching about the center-chord.</p><p>The phase difference between pitch and heave is fixed at π/2, which is found to yield the optimal energy harvesting performance <ref type="bibr">[28]</ref>. At t = 0 the foil is at the bottom of its heave stroke. Heave and pitch change simultaneously during foil motion, which creates a time-varying relative angle of attack with respect to the freestream flow given by</p><p>with ḣ(t) representing the time derivative of the heave motion. A characteristic relative angle of attack is evaluated when the foil is at maximum pitch and maximum heave velocity, which occurs at one quarter of the cycle period T, or</p><p>The foil motion along with the foil parameters are illustrated in Figure <ref type="figure">1</ref>. 
The primary vortex, or the first vortex formed on the suction surface during each half stroke, is highlighted for the case fc/U_∞ = 0.10; h_o = 1.00; θ_o = 75°.</p><p>Previous vortices originating from prior oscillating cycles provide an overall view of the wake. To evaluate performance the energy harvesting efficiency is defined as</p><p>which is the ratio of the average power extracted, P, to the power available in the freestream velocity throughout the swept area Y_p. The power extracted is defined as</p><p>where F_y and M_z are the vertical force and spanwise moment on the foil, respectively. To remove small cycle-to-cycle variations the efficiency is phase-averaged over the last three cycles of simulation.</p><p>A wide range of kinematics within the energy harvesting regime is considered, all of which directly influence</p><p>Table <ref type="table">1</ref> Summary of all kinematics with their computed α_T/4 values. Footnote markers (*, +) refer to the unsupervised clustering model. Rows: f* = 0.12, h_o* = 0.75, θ_o = 50°, α_T/4 = 20.5°; f* = 0.10, h_o* = 1.25, α_T/4 = 41.9°.</p></div>
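<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustrative sketch (not the authors' code), the characteristic relative angle of attack and the efficiency described above could be evaluated as follows. The closed form α_T/4 = θ_o − arctan(2π f* h_o*) is an assumption: it reads the definition in the text as the pitch amplitude minus the flow angle induced by the maximum heave velocity, and it reproduces the α_T/4 values quoted in this paper for two sets of kinematics. The efficiency normalization by 0.5 ρ U_∞^3 Y_p (per unit span), the function names, the density argument and the plain sample mean are likewise hypothetical choices.</p>
<eg><![CDATA[
```python
import math


def alpha_t4_deg(f_star, h_star, theta_o_deg):
    """Characteristic relative angle of attack at t = T/4, in degrees.

    Assumed form: theta_o - atan(2*pi*f_star*h_star), where f_star = f*c/U_inf
    and h_star = h_o/c, so 2*pi*f_star*h_star is the peak heave velocity
    divided by the freestream velocity.
    """
    induced_deg = math.degrees(math.atan(2.0 * math.pi * f_star * h_star))
    return theta_o_deg - induced_deg


def efficiency(force_y, heave_vel, moment_z, pitch_rate, rho, u_inf, swept_height):
    """Mean extracted power divided by the power available in the swept area.

    Instantaneous power is taken as F_y * hdot + M_z * thetadot, following the
    description in the text; the inputs are equal-length time series sampled
    uniformly over whole cycles (an assumption of this sketch).
    """
    power = [fy * hd + mz * td
             for fy, hd, mz, td in zip(force_y, heave_vel, moment_z, pitch_rate)]
    mean_power = sum(power) / len(power)
    available = 0.5 * rho * u_inf ** 3 * swept_height
    return mean_power / available
```
]]></eg>
<p>For example, alpha_t4_deg(0.12, 0.75, 50.0) evaluates to approximately 20.5, and alpha_t4_deg(0.15, 1.00, 75.0) to approximately 31.7, matching the values given in the text for those kinematics.</p></div>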
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Flow Solver and Mesh</head><p>The computations solve the incompressible Navier-Stokes equations using a second-order accurate finite volume, pressure-implicit split-operator (PISO) method implemented in OpenFOAM <ref type="bibr">[29]</ref>. A Reynolds number of Re_c = 1000 is selected for all simulations to enable a broad sweep of 46 kinematics within a tractable computational time. Prior work comparing experiments with low and high Reynolds number simulations has demonstrated only minor differences in power generation and wake characteristics across the Reynolds number range of 1000 to 50,000 <ref type="bibr">[8,</ref><ref type="bibr">9]</ref>.</p><p>All simulations are performed with a 2D unstructured mesh, with foil motion generated through the boundary condition of a dynamic mesh solver that updates the position of all nodes in the domain at every timestep. The dynamic mesh algorithm utilized in this manuscript was previously validated against a stationary mesh by Ribeiro et al. <ref type="bibr">[9]</ref>. The total domain size is 106c in the horizontal direction and 100c in the vertical direction, with the foil located 50c upstream in the x direction and vertically centered when the foil is at the bottom of its heave stroke (t/T = 0). The mesh is generated using Gmsh <ref type="bibr">[30]</ref>, with a subset of the mesh displayed in Figure <ref type="figure">2</ref>.</p><p>Fig. <ref type="figure">2</ref> Computational domain zoomed in on the foil mesh. The mesh 3A is displayed with its characteristics outlined in Table <ref type="table">2</ref>.</p><p>The boundary conditions entail a no-slip condition at the foil surface with zero pressure gradient, inlet flow on the left boundary, and outlet flow on the top, bottom and right boundaries. 
Simulations are run for a total of six oscillation cycles, and become stationary after three cycles.</p><p>Mesh refinement is evaluated through eight meshes with varying resolution near the foil and in the wake, where N corresponds to the total number of nodes. All mesh characteristics are displayed in Table <ref type="table">2</ref>, where the characteristic wake resolution is measured at x/c = 3.0; y/c = 0. This resolution is held approximately constant until 8c downstream as displayed in Figure <ref type="figure">2</ref>. Along the foil surface, a characteristic x-direction resolution is measured at the mid-chord, and N_θ represents the total number of nodes on the body in the azimuthal direction. The CPU time is calculated for one oscillation cycle using a single processor on the Intel Cascade architecture. Foil efficiency and spanwise vorticity flow fields are evaluated across all meshes. Foil efficiency decreases with increasing wake resolution, with a difference in efficiency of 3% between mesh 2 and mesh 3, the latter having twice the wake resolution.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>As shown in the vorticity fields, the same number of coherent vortices is observed in meshes 3 and 4, with only small variations in vortex positioning in the wake as displayed in Figure <ref type="figure">3</ref>. There is little difference between meshes A and B, indicating both have sufficient near-foil resolution. Therefore, as a balance between computational cost and accuracy of both the foil forces and flow fields, mesh 3A is utilized in all simulations. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. Initial Class Selection</head><p>The kinematics outlined in Table <ref type="table">1</ref> cover a large parameter space and produce a range of energy harvesting modes, with efficiencies from close to zero up to 30%. These results demonstrate that the optimal performance is not strongly correlated with a single set of foil kinematics. As displayed in Figure <ref type="figure">4a</ref>, high energy harvesting efficiency is found within the range f* = 0.12 to 0.15; θ_o = 65° to 80°; h_o* = 0.50 to 1.00, with no clear correlation with a single kinematic parameter. Thus, it is convenient to reduce the parameter space into a single representative variable, α_T/4, the characteristic relative angle of attack as defined in Equation <ref type="formula">4</ref>, which allows foil efficiency to be expressed as a simpler function of foil kinematics as shown in Figure 4b <ref type="bibr">[26]</ref>. The efficiency increases monotonically until α_T/4 ≈ 28.0° and then varies for higher α_T/4 values due to a high degree of flow separation <ref type="bibr">[8,</ref><ref type="bibr">9]</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. Supervised Classification</head><p>In this Section, the wake is analyzed through an image-based supervised learning algorithm, where classes are defined based on the predetermined groupings motivated in Section III.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Data Pre-Processing</head><p>The inputs to the classification model are images of 2D spanwise vorticity extracted from a 7.5c by 7.5c window located in the wake and interpolated onto a Cartesian grid of 128 by 128 pixels as illustrated in Figure <ref type="figure">6</ref>. The window size is selected such that it contains all vortices shed from the foil in the y-direction in all kinematics from Table <ref type="table">1</ref>, and 7.5c corresponds to a typical inter-foil spacing in foil arrays under the energy harvesting regime. For each set of kinematics, three oscillation cycles within the steady state regime are used as input with data sampled at every tU_∞/c = 0.1, for a total of 11,846 samples. The fixed sampling rate is used in order to keep the difference between consecutive wake images independent of the foil's reduced frequency; thus the number of samples differs for each frequency. Contour levels of vorticity are chosen to display the wake structures in all kinematics, and six levels (−2, −1, −0.5, 0.5, 1, 2) are consistently drawn for each wake image as shown in Figure <ref type="figure">6</ref>.</p><p>Since the vortices shed from the foil may affect each other's trajectory <ref type="bibr">[9,</ref><ref type="bibr">31,</ref><ref type="bibr">32]</ref>, the time evolution is considered in the classification neural network through the use of LSTM units. A sequence of five images is given as the input data, which provided higher accuracy compared with a sequence of ten images. To avoid overfitting, a data augmentation technique is also implemented, which doubles the number of samples from 11,846 to 23,692. This technique takes input sequences not only at consecutive intervals of 0.1 tU_∞/c, but also with 0.1 tU_∞/c skipped between samples, i.e. 0.1, 0.3, 0.5, 0.7, 0.9 tU_∞/c, following a similar strategy by Chong and Tay <ref type="bibr">[33]</ref>. </p></div>
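<div xmlns="http://www.tei-c.org/ns/1.0"><p>The sequence construction and augmentation described above can be sketched as index windows over the sampled wake images of one set of kinematics. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and because the text does not say how sequence ends are treated so that the sample count exactly doubles, this sketch simply drops windows that would run past the last frame.</p>
<eg><![CDATA[
```python
def make_sequences(n_frames, length=5, step=1):
    """Return tuples of frame indices, `length` frames spaced `step` samples apart.

    step=1 gives consecutive samples (0.1 apart in non-dimensional time);
    step=2 skips one sample between frames (0.2 apart), as in the augmentation.
    Windows that would extend beyond the last frame are dropped.
    """
    span = (length - 1) * step
    return [tuple(range(start, start + span + 1, step))
            for start in range(n_frames - span)]


def augmented_sequences(n_frames, length=5):
    """Consecutive windows followed by windows with one sample skipped."""
    return (make_sequences(n_frames, length, step=1)
            + make_sequences(n_frames, length, step=2))
```
]]></eg>
<p>With ten frames, make_sequences(10, step=2) yields (0, 2, 4, 6, 8) and (1, 3, 5, 7, 9), which corresponds to the 0.1, 0.3, 0.5, 0.7, 0.9 pattern quoted in the text.</p></div>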
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Classification Model Architecture</head><p>Using the Python-based libraries TensorFlow <ref type="bibr">[34]</ref> and Keras <ref type="bibr">[35]</ref>, the classification model is built from a combination of convolutional layers, LSTM units and dense layers, as outlined in Figure <ref type="figure">7</ref>. The 2D convolutional layers (Conv2D) are applied on each sample to extract the most significant features of the wake. These key features are detected with filters, which create feature maps through a convolutional operation on the preceding layer <ref type="bibr">[25]</ref>. Each convolutional layer uses a linear activation function with multiple filters of fixed kernel size, 3 × 3, that reduces the matrix dimensions while simultaneously keeping the most pertinent features. The number of feature maps defines the depth of each convolutional layer and within each layer, a downsample operation is performed with a 2 × 2 stride, resulting in a reduction factor of 2 in each matrix dimension while the depth remains constant. Each input sequence passes through four convolutional layers, decreasing the dimension of each sample from 128 × 128 × 1 to an 8 × 8 feature map with eight channels.</p><p>Each sample is then flattened into a 1D vector with 512 elements where 90 LSTM units analyze the correlation between each wake image within the input sequence, with the goal of detecting patterns between each wake structure and its time evolution behind each foil configuration. The final section of the model contains a dropout layer with a rate equal to 0.1 that is placed between two dense layers in order to decrease overfitting. The dense layers classify each image according to the predetermined classes. The first layer contains 90 neurons and the second dense layer has three neurons corresponding to class A, B or C. 
Both dense layers use a sigmoid activation function to normalize the output from the previous layer into a 0 to 1 range. To update the neural network weights, the Adam optimization algorithm <ref type="bibr">[36]</ref> is implemented in the model. To prevent overfitting, the early stopping technique [37] with 100 training epochs is utilized.</p><p>The following hyperparameters in the classification model are tuned: number of filters in the convolutional layers, LSTM units, and number of neurons in the first dense layer. The sequences of (64, 32, 16, 8), (32, 16, 8, 4) and (128, 64, 32, 16) filters for the convolutional layers are tested, and (64, 32, 16, 8) performed best. The LSTM units and number of neurons in the first dense layer are tuned over a range of 20 to 100 units and neurons, and it is found that 90 units and 90 neurons provided the highest model accuracy.</p></div>
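<div xmlns="http://www.tei-c.org/ns/1.0"><p>The layer dimensions quoted above can be checked with a short shape calculation. The sketch below is only bookkeeping in plain Python, not the Keras model itself, and it assumes 'same' padding in each stride-2 convolution (the padding mode is not stated in the text; with no padding a 3 × 3 kernel would not reduce 128 to exactly 8 in four layers). The function names are hypothetical.</p>
<eg><![CDATA[
```python
def conv_output_size(size, stride=2):
    """Spatial size after a strided convolution with 'same' padding (assumed)."""
    return -(-size // stride)  # ceiling division


def encoder_shapes(input_size=128, n_filters=(64, 32, 16, 8)):
    """Shapes (height, width, channels) of the input and of each conv layer output."""
    shapes = [(input_size, input_size, 1)]
    size = input_size
    for filters in n_filters:
        size = conv_output_size(size)
        shapes.append((size, size, filters))
    return shapes


shapes = encoder_shapes()
height, width, channels = shapes[-1]
flattened_length = height * width * channels  # vector passed to the LSTM units
```
]]></eg>
<p>Under these assumptions the spatial size follows 128, 64, 32, 16, 8, the last layer has eight channels, and the flattened vector has 8 × 8 × 8 = 512 elements, consistent with the description in this section.</p></div>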
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Model Accuracy in Predicting Class Label of Test Data</head><p>A five-fold stratified cross-validation is performed on the data, which comprise matrices of the vorticity fields and their respective class labels. This procedure helps decrease overfitting and ensures that all prelabeled classes are represented in each test subset <ref type="bibr">[35,</ref><ref type="bibr">38,</ref><ref type="bibr">39]</ref>.</p><p>Using the three classes defined in Section III (see Figure <ref type="figure">4b</ref>), the model is trained and an average accuracy of 92% is obtained across the five folds. This indicates that approximately 92% of all samples processed by the algorithm are labelled the same as their prelabels.</p><p>A confusion matrix, which is a common tool for summarizing the performance of a classification algorithm, and the corresponding mislabeled test data distribution for the worst performance fold (fold 4) are shown in Figure <ref type="figure">8</ref>.</p></div>
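<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the stratified splitting concrete, the following is a minimal, dependency-free sketch of assigning labelled samples to five folds so that each class is spread evenly, together with the accuracy measure used above (fraction of predicted labels matching the prelabels). It is an illustration under stated assumptions rather than the procedure used in this work: the function names are hypothetical, no shuffling is applied, and whether the original split is made per sample or per set of kinematics is not specified in the text.</p>
<eg><![CDATA[
```python
from collections import defaultdict


def stratified_folds(labels, n_folds=5):
    """Assign sample indices to folds, dealing each class out in round-robin order."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label matches the prelabel."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)
```
]]></eg>
<p>Each fold then serves once as the test subset while the remaining four are used for training, and the reported figure is the mean of the five fold accuracies.</p></div>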
<div xmlns="http://www.tei-c.org/ns/1.0"><head>V. Unsupervised Clustering</head><p>The classification model presented in Section IV divides the wake kinematics into three classes predetermined by the researcher (based on prior vortex analysis). In this Section, an unsupervised algorithm is utilized to group similar wake kinematics, and the clusters are subsequently compared with the classes from the supervised model. The unsupervised clustering groups the vortex images, individually, without any prior knowledge or relationship between wake kinematics and the respective wake structures. Therefore, the clusters obtained through this method will assist in verifying the classification results and potentially modifying the class boundaries determined by the researcher.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Clustering Model Architecture and Convergence</head><p>The unsupervised model architecture follows closely from the CAE clustering algorithm by Calvet et al. <ref type="bibr">[25]</ref>, including the same number of convolutional layers, filters and skip connections. The architecture consists of an autoencoder with five sequences of convolutional and max-pooling layers for the encoder portion. For the decoder portion, five sequences of convolutional and upsampling layers are used. The only hyperparameter retuned is the batch size, for which an online learning method (batch size equal to 1) is implemented within the autoencoder. With this algorithm there is no prelabeling of images, but the user must determine the number of clusters, which is explained below.</p><p>The input data are the same vorticity images described in Section IV, except that there are no time sequences provided, and therefore each instantaneous wake image is treated independently. This results in 11,846 unique samples from 46 simulations. The training and validation data for the autoencoder utilize 27 of the 46 simulated kinematics. Within the 27 kinematics, approximately 30% of the samples are used for validation, corresponding to the first of the three oscillation cycles, as outlined in Table <ref type="table">1</ref>. Using samples of the same foil kinematics in the training/validation sets improved performance of the autoencoder while retuning. The remaining sets of foil kinematics (19) are used as test data after the autoencoder is tuned.</p><p>Due to the stochastic nature of the model, the CAE clustering algorithm is trained for multiple iterations to check for convergence under a developed criterion. 
After each independent iteration, a set of kinematics is designated to the cluster that contains the majority of its samples (since samples are treated independently, samples from the same kinematics can end up in different clusters). For each iteration, the cluster label to which a set of kinematics is assigned is recorded.</p><p>Convergence is reached if a set of kinematics remains in the same cluster (same label) as the previous iteration. Figure <ref type="figure">9</ref> shows the percentage of foil kinematics that converged when four clusters are considered. A total of 56 random clusterings is performed and after 35 iterations, the algorithm consistently maintains convergence of at least 96% of all foil kinematics being assigned to a unique cluster. A combination of the elbow and the silhouette score methods is utilized to determine the optimal number of clusters for the provided samples, following the approach from Calvet et al. <ref type="bibr">[25]</ref>. The elbow method <ref type="bibr">[40]</ref> computes the total within-cluster sum of square error, known as distortion, and the number closest to the 'elbow' of the curve indicates the approximate number of clusters that best separates the data. The silhouette score method <ref type="bibr">[41]</ref> determines how well each image lies within its cluster by estimating cohesion (intra-class) and separation (inter-class) as Euclidean distances. The score is a combination of both factors, and ranges from zero to one, where a higher score indicates better clustering. The distortion and silhouette scores are averaged over all iterations with results displayed in Figure <ref type="figure">10</ref>. The elbow, represented by the intersection point of the tangent dashed lines to both ends of the distortion curve in Fig. 
<ref type="figure">10</ref>, is located between four and five clusters (see green shaded region) but a slightly higher silhouette score is found when the data is divided into four clusters. Therefore, the optimal number of clusters is set to four. Due to the small differences in the averaged silhouette score, an analysis is also performed using five clusters but does not provide any additional</p><p>To summarize, when four classes are defined, the highest classification accuracy is obtained when the class boundaries are placed at α_T/4 = 11.7°, α_T/4 = 23.0° and α_T/4 = 29.3° (orange lines from Figure <ref type="figure">11b</ref>). The results of these tests reveal that the accuracy is comparable to that of the original three classes, with an average fold accuracy of 91%.</p><p>Furthermore, mismatched labels only occur close to the boundary divisions. For instance, 100% of the samples from class A in Folds 1 and 2 have a label match between predicted and prelabelled, with the same occurring in Fold 4 for classes B and C. For Fold 1, only the foil kinematics with α_T/4 = 28.0° is mislabelled between classes C and D, which can be explained by the proximity of this kinematics to its neighboring class. All other arrangements of class boundaries that are tested yield an average accuracy lower than 91%.</p><p>To illustrate the kinematics that are commonly mislabeled by the algorithm, the confusion matrix for the worst performance fold (fold 5) is displayed in Figure <ref type="figure">12a</ref>. Even for this scenario, at least 79% of samples are properly labelled. Those not accurately predicted are shown in Figure <ref type="figure">8b</ref>, with only a single foil kinematics with α_T/4 = 31.7° (f* = 0.15; h_o* = 1.00; θ_o = 75°) containing a label mismatch higher than 50%. An outlier is the kinematics with</p><p>) which contains a label mismatch lower than 50% and is not close to a neighbouring class. 
A possible explanation is that the roll-up of the shear layer into multiple weak vortices generates a wake trajectory that is confused with the wake pattern of a higher α_T/4. The wake structures found in this foil kinematics are displayed in Figure <ref type="figure">5</ref>. </p></div>
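<div xmlns="http://www.tei-c.org/ns/1.0"><p>The majority-vote assignment and the convergence check described in the clustering procedure above can be sketched as follows. This is a hedged illustration, not the authors' code: the function names and dictionary layout are hypothetical, and the sketch assumes that cluster labels have already been made comparable between iterations (independent k-means++ runs otherwise number their clusters arbitrarily).</p>
<eg><![CDATA[
```python
from collections import Counter


def majority_cluster(sample_clusters):
    """Cluster label that holds the most samples of one set of kinematics."""
    return Counter(sample_clusters).most_common(1)[0][0]


def assign_kinematics(samples_by_kinematics):
    """Map each set of kinematics to its majority cluster.

    `samples_by_kinematics` maps a kinematics name to the list of cluster
    labels given to its individual wake images in one clustering iteration.
    """
    return {name: majority_cluster(clusters)
            for name, clusters in samples_by_kinematics.items()}


def converged_fraction(previous, current):
    """Fraction of kinematics that keep the same cluster label as last iteration."""
    same = sum(previous[name] == current[name] for name in current)
    return same / len(current)
```
]]></eg>
<p>In this reading, the convergence curve in Figure 9 would correspond to converged_fraction evaluated between successive iterations, with values of at least 0.96 after 35 iterations.</p></div>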
<div xmlns="http://www.tei-c.org/ns/1.0"><head>h</head><p>, which varied from 0 &lt; α_T/4 &lt; 40 and 0.2 &lt; St &lt; 1.2. Due to the higher reduced frequency of propulsive foils, more wake structures are found closer to the foil and this contributes to a higher contrast of wake patterns across clusters compared with those in Figure <ref type="figure">13</ref>. This contrast also contributes to the difference in the number of wake patterns between foil regimes. While the work by Calvet et al. obtained six wake patterns behind propulsive foils, the analysis performed here identified four distinct wakes under the energy harvesting regime.</p><p>Class D demonstrates high variation in efficiency within its kinematics. This is observed between α_T/4 = 40.5° and α_T/4 = 41.9°, where efficiency drops by approximately 9% (see Figure <ref type="figure">11b</ref>). This could be described as a bifurcation in the efficiency curve at α_T/4 > 28.0°, as illustrated by upper and lower branches. However, neither the classification nor clustering models could discern differences in the wakes between the higher and lower branches of efficiency within class D. To further investigate these branches, the foil parameters corresponding to each kinematics are explored and it is noticed that all foil kinematics in the lower branch have a reduced frequency of fc/U_∞ = 0.10 and those in the upper branch, fc/U_∞ = 0.12 to 0.15.</p><p>The new updated classes also provide patterns in power extraction, as displayed in Figure <ref type="figure">14</ref>. Each curve corresponds to the phase-averaged total power extracted in a half-cycle from a representative foil kinematics in each class. All classes display a power peak close to the mid-stroke position (t/T = 0.25), which corresponds to the foil's maximum heave velocity and thus typically is where maximum power is reached. Class A shows a smooth power profile with a lower amplitude compared to the other classes, as expected due to the absence of coherent vortices generated by the foil. With the formation of LEVs as α_T/4 increases, class B still shows a smooth profile and class C indicates a higher power magnitude and higher unsteadiness over t/T = 0.3 to 0.5. This unsteady behavior is most likely caused by secondary vortices formed on the foil due to a higher α_T/4 in class C compared to class B. This unsteadiness is more apparent in the lower branch of class D where large and strong vortices are formed and shed from the foil. The power profile in the representative kinematics of the upper branch is similar to the lower branch in the region t/T = 0 to 0.3 but it displays a second power peak in the remaining portion of the half-cycle. This peak is caused by the higher reduced frequency of the kinematics in the upper branch, which contributes to a delay in the vortex shedding and thus more power is extracted. To visualize those vortex structures, the wakes from both upper and lower branches are illustrated in Figure <ref type="figure">15</ref>. As observed in Section III and by Ribeiro et al. <ref type="bibr">[8]</ref>, a lower reduced frequency is correlated with vortex structures spending less time on the foil surface, which decreases the pressure gradient around the foil. The wake structures of the upper and lower branches differ, with more vortices located within the wake window in the upper branch due to the foil's higher reduced frequency, but still no pattern can be visualized. Although neither the classification nor clustering models could discern the differences just described in these branches, a possible solution would be to provide the convolutional neural network with additional information about the kinematics of each wake image, such as the reduced frequency, similar to the method implemented by Morimoto et al. <ref type="bibr">[13]</ref>, but it is not explored in this investigation. Another potential solution would be to have more foil kinematics and hence more data in class D. </p></div>
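<div xmlns="http://www.tei-c.org/ns/1.0"><p>The four updated classes and the two efficiency branches within class D can be summarized as a simple lookup on α_T/4 and the reduced frequency. The sketch below uses the class boundaries quoted in this paper (11.7°, 23.0° and 29.3°) and the observation that lower-branch kinematics have a reduced frequency of 0.10 while upper-branch kinematics have 0.12 to 0.15. Two details are assumptions of this sketch and are not stated in the text: a value lying exactly on a boundary is assigned to the higher class, and the cutoff of 0.11 separating the branches is a hypothetical midpoint.</p>
<eg><![CDATA[
```python
from bisect import bisect_right

BOUNDARIES_DEG = (11.7, 23.0, 29.3)   # class boundaries in alpha_T/4, from the text
CLASS_NAMES = ("A", "B", "C", "D")
BRANCH_CUTOFF = 0.11                  # hypothetical split between f* = 0.10 and 0.12


def wake_class(alpha_t4_deg):
    """Class label for a characteristic relative angle of attack in degrees."""
    return CLASS_NAMES[bisect_right(BOUNDARIES_DEG, alpha_t4_deg)]


def class_d_branch(f_star):
    """Efficiency branch within class D, based on the reduced frequency alone."""
    return "lower" if f_star < BRANCH_CUTOFF else "upper"
```
]]></eg>
<p>For instance, the kinematics with α_T/4 = 28.0° falls in class C, just below the boundary with class D, which is consistent with it being the case mislabelled between those two classes.</p></div>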
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. Conclusion</head><p>A machine learning model is developed to classify wake structures behind an oscillating foil in the energy harvesting regime of flapping foil kinematics. The goal of the paper is to use the machine learning algorithm to sort and classify wake modes from the vorticity fields downstream of the oscillating foil and to correlate the kinematics with the associated wake patterns. The model gives insight into wake similarity among various foil kinematics, which is important for building predictive models of oscillating foil arrays for energy harvesting.</p><p>Data are obtained through simulations of oscillating foils at 46 unique kinematics, and time-dependent vorticity fields are extracted at equal times across three simulation cycles to form a total of 23,692 samples. Following previous work <ref type="bibr">[9]</ref>, three initial classes are defined from values of the relative angle of attack, α_{T/4}. The classification model consists of four convolutional layers and 90 LSTM units applied to multiple input sequences of five samples each. The model's output consists of three neurons corresponding to classes A, B, and C. After the model is trained and tuned, the average test accuracy among all folds is 92%, with the majority of foil kinematics showing a label mismatch percentage of less than 50% between actual and predicted labels, demonstrating the model's ability to discern wakes among classes.</p><p>Although the classification model is successful in finding wake patterns, the class divisions are predetermined by the researcher and assumed to correlate with only one parameter, α_{T/4}, thus biasing the relationship between wake structure and flapping kinematics. Therefore, an unsupervised approach is performed through a CAE clustering algorithm, which does not require any prelabelling or bias. 
The results indicate that there is still a strong correlation with α_{T/4} and that the clusters naturally align with this kinematic parameter. Furthermore, the analysis shows that four clusters are optimal instead of the three originally proposed.</p><p>In summary, the clustering model validated α_{T/4} as a predictive kinematic parameter for wake structure and revealed an additional grouping previously undetected by the researcher. A final configuration of four new classes is proposed, which results in an average fold accuracy of 91% using the classification algorithm. The four classes offer new physical insight into the wake patterns within each range of foil kinematics, based on vortex strength and oscillatory wake structure. Further analysis is performed on the class with the highest α_{T/4} values, and additional wake patterns are obtained that could not be captured by either the classification or the clustering algorithm. This research builds upon the knowledge of how wake patterns and kinematics are correlated, which is instrumental in developing predictive models of oscillating foil arrays in which vortex wakes directly impact the energy harvesting of downstream foils.</p></div></body>
		</text>
</TEI>
