<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Consistent Predictability of the Ocean State Ocean Model Using Information Theory and Flushing Timescales</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>07/01/2021</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10340596</idno>
					<idno type="doi">10.1029/2020JC016875</idno>
					<title level='j'>Journal of Geophysical Research: Oceans</title>
					<idno type="issn">2169-9275</idno>
					<biblScope unit="volume">126</biblScope>
					<biblScope unit="issue">7</biblScope>

					<author>Aakash Sane</author><author>Baylor Fox‐Kemper</author><author>David S. Ullman</author><author>Christopher Kincaid</author><author>Lewis Rothstein</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[This paper introduces the ROMS-OSOM, a Regional Ocean Modeling System (ROMS) implementation simulating Rhode Island waterways called the Ocean State Ocean Model (OSOM). The predictability of the OSOM is evaluated using information theory and initial-condition ensembles in summer and winter conditions. The flushing time scales (freshwater and salinity) of Narragansett and Mount Hope Bays are calculated and resemble the predictability timescales, indicating that predictability is largely governed by the estuarine circulation in this model.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Coastal marine forecast systems are in use or development in a number of regions worldwide (e.g., <ref type="bibr">Wilkin et al., 2018;</ref><ref type="bibr">Moore et al., 2011;</ref><ref type="bibr">Lellouche et al., 2018;</ref><ref type="bibr">Pinardi &amp; Coppini, 2010;</ref><ref type="bibr">Mel &amp; Lionello, 2014;</ref><ref type="bibr">Raboudi et al., 2019)</ref>. As each region is unique, the length of the forecast window and the relative levels of forced versus internal variability differ among these systems. The Ocean State Ocean Model (OSOM) is a new model in development, an extension and synthesis of past prototype models <ref type="bibr">(Bergondo, 2004;</ref><ref type="bibr">Bergondo &amp; Kincaid, 2007;</ref><ref type="bibr">Liu et al., 2016;</ref><ref type="bibr">Wertman, 2018;</ref><ref type="bibr">Ullman, 2019;</ref><ref type="bibr">McManus et al., 2020)</ref> being evaluated for potential use as a forecast system. In this evaluation, key questions are: How often should a forecast be made? How far into the future can forecasts be skillful? How long does the model take to spin up? How accurate must surface and boundary forcing be to arrive at useful forecasts, given that these data would also be predictions (e.g., from numerical weather prediction models)? Which regional societal challenges are better framed as changes to the region's climatology (i.e., projections) rather than as predictable futures that depend on the model's initial conditions (i.e., forecasts)? In this paper, a framework for addressing these questions is developed by adapting methods from information theory and ensemble-based measures of predictability, internal variability, and forced variability. The OSOM is taken as a test example of these methods and, as a coastal model in development with unique characteristics, the specific results of this study are useful for the future development of this particular model.</p><p>Forecasting hydrodynamic parameters is pertinent for an estuary because these parameters play a vital role in controlling physical as well as biogeochemical changes. An important aspect of forecasting is finding the predictability (forecasting) timescales that limit the degree to which initial conditions govern the future behavior of the numerical model for individual parameters. These timescales quantify the persistence of anomalies and are a feature of the numerical model. Predictability is a measure of a model's ability to forecast or predict the evolution of anomalies in the future from initial conditions given prescribed external forcing. By contrast, changing forcing due to climate change (e.g., <ref type="bibr">Xiu et al., 2018)</ref>, altered topography via erosion or dredging <ref type="bibr">(Hayward et al., 2018)</ref>, changes to wastewater treatment or power plant effluent <ref type="bibr">(Mustard et al., 1999)</ref>, etc., are external factors affecting boundary conditions rather than initial conditions; their impact can be assessed using projections of future climatology with altered boundary conditions over a variety of plausible initial conditions. Thus, predictability measures a model's potential to predict or forecast a future state which is distinct from climatology, and this in turn is distinct from projecting the changes to climatology forced by changes to boundary conditions. 
The state of the system in a forecast can only be considered in a probabilistic way, and hence predictability is a property involving two distributions <ref type="bibr">(DelSole, 2004)</ref>: predictability quantifies the departure of a forecast distribution from the climatology distribution <ref type="bibr">(Shukla, 1981;</ref><ref type="bibr">Leung &amp; North, 1990)</ref>. Quantifying this departure involves measurement of uncertainty in the forecast signal. The uncertainties in the initial conditions can be thought of as anomalies which are eventually forgotten by the model, or overwhelmed by chaotic variability or the influence of boundary conditions as time proceeds, until the forecast statistical distribution becomes indistinguishable from the climatology distribution. Beyond this time scale a forecast provides no additional information beyond climatology, and forecasts are then no more useful than projections of the future climatological range of possibilities. This article has three purposes: (1) to describe the OSOM; (2) to use ensemble simulations to find predictability timescales; (3) to find estuarine flushing timescales for fresh and saline water masses and compare these to (2). The model is forced by winds, tides, river runoff, evaporation, precipitation, heat fluxes, and open boundary conditions. So, unlike the numerical weather prediction models for which the information theory techniques applied here were developed, the OSOM is a forced model where much of the variability comes from external forcing that may determine the trend of the evolution of the state parameters, or alternatively internal variability (e.g., hydrodynamic instabilities and chaos) may dominate. A companion paper by the authors develops a non-parametric information theory approach to quantifying the amount of internal versus forced variability, similar to the ensemble approach of <ref type="bibr">Llovel et al. (2018)</ref>, and uses this metric to quantify the relative importance of different choices in boundary forcing. As the balance of sources of variability depends on forcing, resolution, classes of flow, etc., the measured forced versus intrinsic variability depends on the specifics of the model, rather than being a general description of the waterways under study. So, too, do the predictability metrics describe the specific model being studied rather than the system. However, a comparison to traditional estuarine flushing timescales serves here to illustrate that the model is governed by physical principles, so quantifying these based on the real (rather than simulated) world may nonetheless be useful in establishing physical guidelines underlying limits on predictability. Metrics from information theory provide a natural way of quantifying distances between two probability distributions <ref type="bibr">(Cover &amp; Thomas, 2012)</ref>. Information theory metrics have been used in myriad ways in other fields (e.g., electronic communications, image processing, and molecular biology). Using information theory metrics for weather prediction and climate projection is well established <ref type="bibr">(Leung &amp; North, 1990;</ref><ref type="bibr">Schneider &amp; Griffies, 1999;</ref><ref type="bibr">Roulston &amp; Smith, 2002;</ref><ref type="bibr">Kleeman, 2002;</ref><ref type="bibr">DelSole, 2004;</ref><ref type="bibr">Haven et al., 2005)</ref>, but they are not commonly used in coastal modeling. <ref type="bibr">DelSole (2004)</ref> relates the requirement to quantify uncertainty to the usage of metrics from information theory. The most commonly used metrics are entropy, relative entropy, and mutual information <ref type="bibr">(Shannon, 1948)</ref>, although other variants are also useful <ref type="bibr">(Kleeman, 2002;</ref><ref type="bibr">Leung &amp; North, 1990)</ref>. 
A key advantage of these metrics in coastal modeling is that they can be ascribed to a variety of physical or biogeochemical variables; here we examine salinity, temperature, and kinetic energy over regions and at observation locations, but in future work we will examine biogeochemical variables in the OSOM.</p><p>An important time scale for an estuary is the flushing time scale or residence time scale <ref type="bibr">(Knudsen, 1900)</ref>, which is defined as the average residence time of a parcel of fluid inside the estuary (e.g., <ref type="bibr">Monsen et al., 2002)</ref>, and thus also the average retention time of water masses in the estuary. As the numerical model represents the physical domain, there is an inherent relation between the forecasting timescales and the flushing time scale, because eventually tracer anomalies present in the initial conditions will be flushed from the estuary, and the flushing timescale is an estimate of how long this process will take (assuming the anomalies are conserved on each water parcel). Here these timescales are found for the OSOM, a model developed specifically for Narragansett Bay and connected waterways.</p><p>Narragansett Bay (NB) is a medium-sized estuary and a natural harbor. As per the classification of estuaries based on physical and hydrological attributes, NB is a class 8 estuary (a moderate area, volume, and freshwater flow estuary that is deep and salty: <ref type="bibr">Engle et al., 2007)</ref>. It is a prime example of a coastal plain estuary, also known as a drowned river valley, which is the most common type of estuary in temperate climates. The bay covers an area of &#8764;400 km^2 <ref type="bibr">(Pilson, 1985)</ref>. It is 16 km wide (east-west), 32 km long (north-south), and has 412 km of shoreline. 
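As a back-of-envelope illustration of the freshwater-fraction flushing estimate, one can compute the time for river inflow to replace the freshwater held in the estuary. The numbers below are illustrative round values for a medium-sized estuary, not the OSOM or Pilson (1985) figures:

```python
def freshwater_flushing_time_days(volume_m3, s_bay, s_ocean, q_river_m3s):
    """Freshwater-fraction flushing time of an estuary, in days.

    The freshwater volume held in the estuary is V * (S_ocean - S_bay) / S_ocean;
    at steady state it is replaced by the river inflow Q_R, so T_f = V_fw / Q_R.
    """
    freshwater_fraction = (s_ocean - s_bay) / s_ocean
    t_seconds = volume_m3 * freshwater_fraction / q_river_m3s
    return t_seconds / 86400.0

# Illustrative values: 4e9 m^3 volume, bay salinity 29 psu, offshore salinity
# 31 psu, 100 m^3/s total river discharge, giving roughly one month.
t_f = freshwater_flushing_time_days(4.0e9, 29.0, 31.0, 100.0)
```

A budget of this kind is what makes the comparison between predictability timescales and flushing timescales physically meaningful.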
The Bay extends from the Providence and Seekonk rivers in the north to Rhode Island Sound in the south.</p><p>To the east, it connects to Mount Hope Bay, fed by the Taunton River and connected by the Sakonnet River to Rhode Island Sound. The whole of Narragansett Bay, Mount Hope Bay, associated rivers, and Rhode Island Sound is simulated in OSOM (Figure <ref type="figure">1</ref>), but the emphasis in this paper is on variables within NB and Mount Hope Bay. The Bay provides a natural habitat for many living things and is of commercial and ecological importance to the local community. Commercial fishing and shell fishing are important economic activities, and the Bay has also been used for recreational sports and as a harbor for the America's Cup and the Volvo Ocean Race sailing competitions. Recently, pollution has hindered these activities; bacteria and viruses have caused beach closures, harmful blooms, and shell fishing bans, and hypoxia is frequent and sometimes induces large fish kills. OSOM will be used to simulate the physics of the Bay and predict the physical and biogeochemical conditions conducive to these events, as well as assess the impact of different management and mitigation practices. The predictability timescales studied here help reveal the utility of the model to forecast the physical conditions for harmful events. This article is structured as follows: Section 2 provides details of the computational model OSOM. Section 3 describes the theory of using mutual information to find predictability timescales. Section 4 contains the ensemble simulation setup for forecasting and climatology sets; the application of mutual information to the ensembles is also described there. Section 5 presents the results for various cases and also gives the flushing timescales obtained via OSOM.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Ocean State Ocean Model</head><p>The Ocean State Ocean Model (OSOM) is an application of the Regional Ocean Modeling System, ROMS <ref type="bibr">(Shchepetkin &amp; McWilliams, 2005)</ref>. The curvilinear terrain-following coordinate system employed in ROMS is well suited for coastal applications since the bathymetric variations in coastal systems and estuaries are large. The model has curvilinear varying horizontal resolution as well, from &#8764;50 m toward the north to around 200 m in the south of the modeled domain. The horizontal grid consists of 1000 &#215; 1100 grid cells, with 15 terrain-following sigma levels in the vertical. The Generic Length Scale (GLS) scheme is used to represent unresolved turbulence <ref type="bibr">(Umlauf &amp; Burchard, 2003)</ref>.</p><p>The offshore forcing at the open boundaries is provided by surface elevation and depth-averaged velocity using 9 tidal constituents (M2, S2, N2, K2, K1, O1, Q1, M4, M6) from the Eastcoast tidal constituent database <ref type="bibr">(Mukai et al., 2002)</ref> and, at subtidal timescales, with low-pass filtered output of the hindcast version of the Northeast Coastal Ocean Forecast System (NECOFS), a regional model covering the northeast U.S. coastal ocean <ref type="bibr">(Beardsley &amp; Chen, 2014)</ref>. [Figure 1 caption: 1. Connecticut River, 2. Thames River, 3. Pawcatuck River, 4. Maskerchugg River, 5. Hunt River, 6. Hardig Brook, 7. Pawtuxet River, 8. Woonasquatucket and Moshassuck Rivers, 9. Blackstone River, 10. Ten Mile River, 11. Palmer River, 12. Taunton River.] 
The surface elevation and depth-averaged velocity forcing are implemented using the <ref type="bibr">Chapman (1985)</ref> and <ref type="bibr">Flather (1976)</ref> methodologies, respectively. The depth-dependent velocity, temperature, and salinity at the open boundaries are forced using the <ref type="bibr">Marchesiello et al. (2001)</ref> combined radiation and nudging open boundary condition with low-pass filtered NECOFS output.</p><p>The nudging timescales vary, with stronger nudging on inflow (timescale of 1.6 h) than on outflow (timescale of 24 h).</p><p>Surface heat and momentum fluxes are estimated from meteorological variables obtained from models and local observations using the updated COARE bulk formulae <ref type="bibr">(Fairall et al., 2003)</ref>. All meteorological forcing except for winds is assumed to be spatially uniform over the model domain. Spatially variable winds for the region were obtained from the North American Mesoscale (NAM) analyses, a data-assimilating, high-resolution (12 km) meteorological simulation (<ref type="url">https://www.ncei.noaa.gov/data/north-american-mesoscale-model/access/historical/analysis</ref>). Air temperature and barometric pressure were estimated by averaging the measurements at the six stations of the Narragansett Bay PORTS system (<ref type="url">http://www.co-ops.nos.noaa.gov/ports.html</ref>). Precipitation and relative humidity are from observations at T. F. Green Airport, in Warwick, RI. Net shortwave and downward longwave radiative fluxes were taken from the nearest ocean gridpoint of NOAA's North American Regional Reanalysis model (<ref type="url">http://www.emc.ncep.noaa.gov/mmb/rreanl</ref>). Upward longwave radiation was computed based on the ocean surface temperature in the model simulations.</p><p>Freshwater discharge from local rivers and the major waste water treatment facilities (WWTF) discharging into NB were applied as point source inflows. 
The discharges of many of the rivers are measured at United States Geological Survey (USGS) gauging stations <ref type="bibr">(Hunt, Palmer, Moshassuck, Woonasquatucket, Blackstone, Ten Mile, Pawtuxet, Taunton, Pawcatuck, Connecticut, Quinebaug, Yantic, and Shetucket Rivers)</ref>.</p><p>The Moshassuck and Woonasquatucket Rivers, which discharge into the upper Providence River, were combined in the model. Likewise, the gauged discharges of the Quinebaug, Yantic, and Shetucket Rivers were combined to form the model Thames River. For the small rivers entering Greenwich Bay (Maskerchugg River and Hardig Brook), which are presently not gauged, historical flow measurements and simultaneous measurements from the nearby Hunt River were used to develop a linear regression model predicting the discharge of the former rivers from gauged measurements of the latter. The gauging stations varied in their proximity to the locations at which the rivers discharge into the model domain. In order to account for the river discharge from the portion of the watershed downstream of the gauging station, the measured discharges were scaled up using estimates of the drainage areas upstream and downstream of the gauge, under the assumption that discharge per unit drainage area downstream of the gauge equals its value upstream. Discharges from four WWTFs (Fields Point, Bucklin Point, East Providence, and East Greenwich) in the upper/mid Bay region were obtained from the plant operators.</p><p>The WWTF point sources were implemented at a single ROMS gridpoint, but the discharges for the rivers are spread over 2-5 gridpoints to reduce the tendency for model instability. River forcing in ROMS requires, in addition to the river discharge discussed above, specification of the vertical profile of the river inflow transport and the concentration of tracers in the inflowing water. The vertical profile of the river inflow was specified as linearly varying, with zero transport at the bottom. 
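The two discharge adjustments described above (drainage-area scaling of gauged flows, and a linear regression for the ungauged Greenwich Bay rivers) can be sketched as follows; all areas, flows, and regression coefficients here are hypothetical illustrations, not the USGS values:

```python
import numpy as np

def scale_to_mouth(q_gauge, area_gauge_km2, area_total_km2):
    """Scale a gauged discharge up to the full watershed, assuming discharge
    per unit drainage area below the gauge equals that above it."""
    return q_gauge * (area_total_km2 / area_gauge_km2)

def fit_ungauged_regression(q_gauged_ref, q_ungauged_hist):
    """Least-squares fit of q_ungauged = a * q_ref + b from historical
    simultaneous measurements (as done for the Greenwich Bay rivers
    against the nearby Hunt River)."""
    a, b = np.polyfit(q_gauged_ref, q_ungauged_hist, 1)
    return a, b

# Hypothetical example: a gauge capturing 80% of the watershed area,
# so the measured flow is scaled by a factor of 1.25.
q_mouth = scale_to_mouth(40.0, 800.0, 1000.0)

# Hypothetical historical record used to fit the regression.
q_hunt = np.array([1.0, 2.0, 4.0, 8.0])
q_brook = 0.2 * q_hunt + 0.05
a, b = fit_ungauged_regression(q_hunt, q_brook)
```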
Salinity of the inflowing water was set to 0. In the simulations discussed here, the river water temperature was also set to 0, which eventually leads to artificially cold rivers; however, comparison against experiments using more realistic temperatures reveals only modestly lower temperatures at the observation sites in the Bay over the integration times used (especially in winter). Setting river temperature to 0 only affected the temperatures in zones 1 and 5 in winter, where rivers have more influence (Figure <ref type="figure">4</ref> illustrates zone boundaries).</p><p>The cold bias found was about 4-6 K in zone 1 and 1-2 K in zone 5. The temperatures at the grid points closest to buoys were not affected, as all the observation locations shown in Figure <ref type="figure">2</ref> are sufficiently far from river sources. However, it is recommended for future operational simulations that time-varying river water temperature be estimated using a regression equation involving the air temperature as well as the water temperature on the previous day.</p></div>
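The recommended river-temperature regression might look like the following sketch; the functional form follows the text (today's air temperature plus the previous day's water temperature), but the coefficients are hypothetical placeholders that would have to be fit to observed river temperature records:

```python
def river_temperature(t_air_today_c, t_river_yesterday_c, a=0.3, b=0.65, c=0.5):
    """Estimate today's river water temperature (deg C) from today's air
    temperature and yesterday's water temperature. The coefficients a, b, c
    are hypothetical, not fitted values."""
    return a * t_air_today_c + b * t_river_yesterday_c + c

# Example: cold air over a slowly warming river in early spring.
t_river = river_temperature(6.0, 4.0)
```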
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Basic model validation</head><p>The model output has been compared with buoy data obtained from the Rhode Island Data Discovery Center (<ref type="url">http://ridatadiscoverycenter.org</ref>), where a variety of regional data are accessible. In particular, the model has been compared with moored observations collected at locations shown in Figure <ref type="figure">2</ref>. Figure <ref type="figure">3</ref> illustrates the best and worst matches for temperature and salinity of the model with the historical observations. Comparison of the model versus surface temperatures derived from Landsat also confirms that the patterns of heating and cooling are similar to the satellite data, although seasonality in OSOM is somewhat larger than in the satellite record (by roughly 1&#176;C in climatological comparisons).</p><p>[Figure 2. Station locations in Narragansett Bay: CP, BR, NP, MtV, MtHB, QP, PP, GB.] [ESSOAr preprint: <ref type="url">https://doi.org/10.1002/essoar.10504826.1</ref>, CC-BY-NC-4.0, first posted online 20 Nov 2020; this content has not been peer reviewed.] 
Figure 3. (a-d) Comparison of the Mount Hope Bay moored buoy observations of salinity and temperature at the surface (a, b) and maximum depth (c, d). This case is the closest match of the OSOM to the observations during the two months shown: July and August of 2006. (e-h) Comparison of the Greenwich Bay moored buoy observations of salinity and temperature at the surface (e, f) and maximum depth (g, h). This case is the poorest match of the OSOM to the observations during the two months shown. Red represents the observed values and other colors show different ensemble members. Figures S1 to S6 in the supporting information compare the rest of the marked observation locations.</p><p>1 K at the bottom, and 3 and 2 psu at the surface and bottom of MtHB. At GB, the errors at surface and bottom are up to 5 K and 6 psu and 2 K and 4 psu, respectively.</p><p>The emphasis of this paper is on measuring the basic predictability of the OSOM as modeled in this version. 
It is not necessary for this assessment that the OSOM be completely realistic, but these basic comparisons show that it has skill in reproducing realistic variability in temperature and salinity. Future work will address improvements in the model setup to reduce biases and errors, such as improving the assumed temperature of river inflows, parameterizations of mixing, evaluation of tides, different products for surface and offshore boundary conditions, etc.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Predictability using information theory</head><p>DelSole &amp; Tippett (2007) state that the two guiding principles for measuring predictability of a variable by contrasting the forecast and a climatology distribution should be 1) separate, non-identical measures for a given prediction, and 2) invariance of the measure to linear transformation <ref type="bibr">(Schneider &amp; Griffies, 1999;</ref><ref type="bibr">Majda et al., 2002)</ref>. Measures of predictability using information theory are naturally invariant to linear transformations and will be explained in general in the following paragraphs.</p><p>Consider a signal, such as a variable or regional average of a variable modeled by the OSOM, X, having a probability distribution p(x_i) when considered over a particular time or space interval. The probability p(x_i) is that of the i-th event (i-th bin) after dividing the data into N bins. A fundamental quantity in information theory is the Shannon entropy <ref type="bibr">(Shannon, 1948)</ref>, defined by</p><p>H(X) = &#931;_i p(x_i) log_2(1/p(x_i)). (1)</p><p>The entropy (with base 2 logarithm) is quantified in units of bits, because the Shannon entropy effectively measures the average amount of digital storage required to capture the information present in the variability of X.</p><p>To understand Equation 1, begin with the innermost term. <ref type="bibr">Hartley (1928)</ref> first proposed using the logarithmic function log_2(1/p(x_i)) to quantify information or uncertainty in an event having probability p(x_i). The formulation log_2(1/p(x_i)) implies that low probability events have higher uncertainty. 
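A minimal numerical sketch of Equation 1, assuming a simple histogram estimator (the bin count and synthetic data are illustrative):

```python
import numpy as np

def shannon_entropy_bits(signal, bins=16):
    """Shannon entropy H(X) of a signal in bits, from a histogram estimate.
    Empty bins contribute zero, following the convention 0 * log(1/0) = 0."""
    counts, _ = np.histogram(signal, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(np.sum(p * np.log2(1.0 / p)))

rng = np.random.default_rng(0)
# A uniform signal spreads evenly over the bins, approaching the maximum
# entropy log_2(16) = 4 bits; a peaked (Gaussian) signal needs fewer bits.
h_uniform = shannon_entropy_bits(rng.uniform(0.0, 1.0, 100000))
h_peaked = shannon_entropy_bits(rng.normal(0.0, 1.0, 100000))
```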
<ref type="bibr">Shannon (1948)</ref> completed this measure by additionally weighting the logarithm with probability, giving rise to the entropy definition Equation <ref type="formula">1</ref>, which resembles the thermodynamic entropy function in statistical mechanics resulting from a system that visits a set of equally probable states (e.g., <ref type="bibr">Sethna et al., 2006</ref>). Shannon's entropy is formulated so that high probability events reduce uncertainty with a strong weighting because they occur often <ref type="bibr">(Cover &amp; Thomas, 2012)</ref>. Shannon entropy quantifies uncertainty and the number of states needed to categorize a single probability distribution.</p><p>To compare two distributions p(x) and p(y), relative entropy and mutual information are useful comparative metrics. <ref type="bibr">Kleeman (2002)</ref> recommends the relative entropy (a.k.a. the Kullback-Leibler distance; Cover &amp; Thomas, 2012) for climate modelling, which is</p><p>R = &#931;_i p(x_i) log_2(p(x_i)/p(y_i)).</p><p>Here, let X be the forecast and Y be the climatology. Recall that predictability measures the information contained in a particular forecast that is not present in the climatology, i.e., the information which stems from the forecast initial conditions. It is easy to see that if the forecast probability p(x_i) equals the climatology probability p(y_i), R goes to zero, indicating no distance or difference in information between the forecast and climatology. As a forecast evolves, during the time interval before R reaches zero, p(x) and p(y) are distinguishable (under similar levels of unpredictable noise) and after R reaches zero they are not; thus this time interval is the predictability window.</p><p>Within the predictability window, interchanging p(x_i) and p(y_i) changes the value of R, not just by sign from the logarithm, but also by magnitude due to the prefactor p(x). 
Thus, the relative entropy R depends on both p(x) and p(y) asymmetrically and will change if they are interchanged (i.e., the metric depends on which variable is considered the climatology and which is considered the forecast). Our potential predictability will compare different ensemble members, where one is taken as the forecast member and, from the same ensemble, a different member is taken as a climatology reference <ref type="bibr">(Kumar et al., 2014)</ref>. As the different ensemble members should be interchangeable in this approach, the magnitude of our metric (in contrast to R) should not change by interchanging the forecast and climatology; hence a different metric is preferred: mutual information.</p><p>Mutual information, I(X; Y), is symmetric in X and Y, and hence is a natural metric of distance between these variables without direction. Let two random variables X and Y have joint probability p(x_i, y_j) and marginal probabilities p(x_i) and p(y_j). X and Y are divided into N bins each (they can also be divided into different numbers of bins, but we have used the same number of bins for simplicity). The mutual information I(X; Y) between them is <ref type="bibr">(Cover &amp; Thomas, 2012)</ref></p><p>I(X; Y) = &#931;_i &#931;_j p(x_i, y_j) log_2[p(x_i, y_j)/(p(x_i)p(y_j))]. (2)</p><p>Mutual information resembles relative entropy. In fact, it measures the relative entropy between the joint distribution p(x_i, y_j) and the product of the marginal distributions p(x_i)p(y_j). If X and Y are independent variables, then p(x_i, y_j) = p(x_i)p(y_j) and thus I(X; Y) = 0. However, if they are not independent, so that one contains information about the other, then there is mutual information shared and I(X; Y) &gt; 0. If they are totally dependent, i.e., knowing the value of X reveals the value of Y and vice versa, then p(x_i, y_j) = p(x_i) = p(y_j) for each value of i, j and the mutual information equals the Shannon entropy: I(X; Y) = H(X) = H(Y). Mutual information is thus the metric of the information shared by X and Y versus what they would share if they were independent variables. 
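Both metrics can be estimated directly from histograms; the sketch below is a minimal implementation (illustrative bin counts, no bias correction), exhibiting the limiting cases noted in the text: independent signals give I near zero, and identical signals give I equal to H(X):

```python
import numpy as np

def relative_entropy_bits(p, q):
    """Kullback-Leibler distance R between two binned distributions, in bits.
    Assumes q is nonzero wherever p is. Note the asymmetry: swapping p and q
    changes the value, as discussed in the text."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def mutual_information_bits(x, y, bins=10):
    """I(X; Y): the relative entropy between the joint histogram of x and y
    and the product of its marginals. Symmetric in x and y."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal p(x_i)
    py = pxy.sum(axis=0, keepdims=True)   # marginal p(y_j)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / (px * py)[mask])))

rng = np.random.default_rng(1)
x = rng.normal(size=100000)
y = rng.normal(size=100000)
mi_indep = mutual_information_bits(x, y)   # independent signals: near zero
mi_self = mutual_information_bits(x, x)    # identical signals: equals H(X)
```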
Mutual information between X and Y is symmetric and measures a distance between the two probability distributions. It quantifies the amount of information one variable contains about the other (again in bits). It can also measure the reduction in uncertainty of one distribution given knowledge of a second distribution, or the degree to which they are not independent <ref type="bibr">(Cover &amp; Thomas, 2012)</ref>: I(X; Y) measures the degree of statistical constraint of X on Y and vice versa <ref type="bibr">(Fano, 1961)</ref>. Mutual information is easily extended to more than one variable, leading to a multivariate predictability analysis <ref type="bibr">(DelSole &amp; Shukla, 2010)</ref>.</p><p>Unlike relative entropy R, mutual information I(X; Y) does not go to zero when p(x) approaches p(y); instead it approaches the Shannon entropy H(X) from Eq. 1. We use this property to delimit the predictability window, which ends when the probability distributions of the forecast and the climatology become effectively indistinguishable, taken to be the first time when I(X; Y) reaches within 90% of H(X). This threshold is somewhat arbitrary, as convergence is not typically monotonic or complete, so any threshold will tend to have "near misses" and later signs of potential predictability, as will be illustrated in a variety of figures in the text and supplementary material. 
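Under the assumption that, at each forecast time, the distributions are estimated across ensemble members, the 90% threshold rule could be implemented along the following lines. This is a sketch, not the paper's exact procedure; the member counts, bin counts, and synthetic data are illustrative:

```python
import numpy as np

def entropy_bits(samples, bins):
    counts, _ = np.histogram(samples, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(np.sum(p * np.log2(1.0 / p)))

def mutual_information_bits(x, y, bins):
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / (px * py)[mask])))

def predictability_window(forecast, climatology, bins=8, frac=0.9):
    """Index of the first time at which I(forecast; climatology), estimated
    across ensemble members, reaches within `frac` of the entropy of the
    forecast distribution: the end of the predictability window."""
    for t in range(forecast.shape[0]):
        h = entropy_bits(forecast[t], bins)
        if mutual_information_bits(forecast[t], climatology[t], bins) >= frac * h:
            return t
    return forecast.shape[0]   # never converged within the record

# Synthetic demo: the forecast ensemble starts independent of the climatology
# ensemble and blends linearly into it, so the window closes within the record.
rng = np.random.default_rng(2)
n_times, n_members = 20, 2000
clim = rng.normal(size=(n_times, n_members))
noise = rng.normal(size=(n_times, n_members))
w = np.linspace(0.0, 1.0, n_times)[:, None]
fcst = w * clim + (1.0 - w) * noise
t_end = predictability_window(fcst, clim)
```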
However, to compare to the flushing timescales in later sections, a threshold is a simple test, and a range of predictability timescales is then formulated by comparing to individual climatology ensemble members as well as the climatology ensemble mean, to appropriately gauge the level of certainty.</p><p>DelSole &amp; Shukla (2010) state that mutual information itself is a measure of forecast skill and provide skill scores founded on mutual information and relative entropy.</p><p>The metrics in Equations 1-2 are based on the probabilities of events, not the units or dimensions of the events, so they can be compared across various parameters and between forecasts and climatology regardless of the type of variable: physical, biological, chemical, or sociological variables of arbitrary units can be compared. For this reason, these information theory metrics are ideal for evaluating forecast skill in a model like OSOM where a variety of applications are intended.</p><p>The metrics are also invariant under linear transformation of the signal and hence are robust to trivial changes such as changes of the units of measurement (DelSole &amp; Tippett, 2007), unlike alternatives such as the root mean square technique for skill assessment (for example, <ref type="bibr">Jin et al., 2018)</ref>, which requires normalization.</p><p>To find the predictability time scales of the ROMS-OSOM, we will compare ensemble members which differ in initial conditions. Hence our focus is on finding the potential predictability (model-model comparison) instead of actual predictability or model forecast skill (model-observation comparison; for example, <ref type="bibr">Kumar et al., 2014)</ref>. The climatology comes from the model simulations and is a result of past or historical forcings (hindcasts) with unperturbed initial conditions. It will be compared to forecasts with an anomaly of perturbed initial conditions that will eventually decay or be flushed out. 
The time it takes for the forecast to approach the climatology is the predictability time scale. In other words, the convergence between a forecast member and a climatology member signals the end of the predictability time period. After this period, running the forecast further is of no utility: it will statistically resemble any climatological estimate, with no predictable consequences remaining from its initial anomaly. This decay occurs because, even though an anomaly is introduced, the forcings and boundary conditions are identical between the climatology and the forecast. In a realistic forecast, the model would be initialized with observations and run with historical external forcings, as future external forcings are unknown a priori. The initialization from observations would create anomalies similar to the perturbations we add to initial conditions in hindcasts to find potential predictability. Also, in a realistic forecast, the forecast signal would begin to diverge away from future observations and converge towards the model climatology signal, another sign marking the predictability time scale.</p></div>
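The entropy and mutual information estimators of Equations 1-2, together with the 90% convergence test used to delimit the predictability window, can be sketched with simple histogram estimates. This is an illustrative Python sketch, not the authors' code; the bin count is an arbitrary assumption:

```python
import numpy as np

def shannon_entropy(x, bins=16):
    """Shannon entropy (bits) of a 1-D sample, estimated from a histogram (Eq. 1)."""
    p, _ = np.histogram(x, bins=bins)
    p = p[p > 0] / p.sum()
    return -np.sum(p * np.log2(p))

def mutual_information(x, y, bins=16):
    """Mutual information I(X; Y) in bits from a joint 2-D histogram (Eq. 2)."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    m = pxy > 0
    return np.sum(pxy[m] * np.log2(pxy[m] / np.outer(px, py)[m]))

def within_predictability_window(x_forecast, x_clim, bins=16, frac=0.9):
    """True while I(forecast; climatology) remains below 90% of H(climatology)."""
    return mutual_information(x_forecast, x_clim, bins) < frac * shannon_entropy(x_clim, bins)
```

For identical samples, I(X; X) equals H(X) (with matching bin edges), so the window test correctly reports that the distributions are indistinguishable.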
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Ensemble setup</head><p>To begin, temperature and salinity were interpolated from hindcasts of the FVCOM model <ref type="bibr">(Beardsley &amp; Chen, 2014</ref>) and velocities were taken to be zero. From these conditions, the model was spun up for two months before the analysis begins. Two months were estimated to be sufficient, as the average flushing time in NB is about one month <ref type="bibr">(Pilson, 1985)</ref>, and post-analysis estimates of the predictability timescale confirm this conjecture. The initial conditions used for the ensemble simulations were derived from a single spun-up simulation for each season, using boundary conditions for the year 2006. Simulations were performed in each of two seasons: January-February (JF) and July-August (JA). The months JA were chosen because NB faces hypoxia during those months <ref type="bibr">(Codiga et al., 2009)</ref>, and JF was chosen as a contrasting alternative. For each season (JF, JA), a set of climatology ensemble members was simulated, consisting of 7 and 10 members, respectively. Each climatology ensemble has two sets of corresponding forecast ensembles: one initialized by perturbing only temperature, and the other initialized by perturbing only salinity.</p><p>Each climatology ensemble member is forced in the same way, but each has realistic initial conditions chosen from consecutive days of the spin-up run before the simulation start day <ref type="bibr">(Smith et al., 2007)</ref>. This method of building a climatology ensemble is perhaps unfamiliar to some readers, and differs from the typical average across multiple years of simulations (where the climatology spans varying forcing, rather than varying initial conditions).
The number of ensemble members is justified by determining whether external forcing (wind, tidal, river runoff, and evaporation/precipitation) or internal chaos (nonlinearities, eddies) sets the trend of the ensemble-mean evolution of the state parameters. The methodology of <ref type="bibr">Llovel et al. (2018)</ref> and <ref type="bibr">Leroux et al. (2018)</ref> is used as a guide. The ratio of "noise" to signal is computed as a function of time, where the noise is the standard deviation across ensemble members (the model spread) and the signal is the ensemble mean. Let σ_i be the standard deviation of φ^n_i across members n, and ⟨φ⟩_i the ensemble mean. The ratio σ_i / ⟨φ⟩_i remains less than 0.5 within the predictability window and below 0.1 after the predictability time scale is crossed. <ref type="bibr">Llovel et al. (2018)</ref> state that a noise-to-signal ratio of less than 0.5 is sufficient for external forcing to dominate internal chaos in setting the ensemble-mean variability, indicating also that the model trend is captured sufficiently with this number of ensemble members. An upcoming companion paper by the authors expands on the approach of <ref type="bibr">Llovel et al. (2018)</ref> using information theory techniques to quantify forced versus internal variability even for non-Gaussian and non-independent datasets.</p><p>Let a variable in the climatology ensemble be given by c^n_{t,i}, where t denotes time, i denotes the spatial grid point, and n the ensemble member. Similarly, a variable in the forecast ensemble is f^n_{t,i}.
The information entropy metrics have been calculated between forecast and climatology using two approaches: 1) between running time windows (probability distributions of variability in t) of spatially volume-weighted averaged data (i.e., averaged over i) in a zone or at an observation location, and 2) examining the covariability of spatial grid points (probability distributions based on i) within a zone at a fixed time. The advantage of the former is that it more naturally describes the evolution of slow variations over large regions of the Bay, while the latter can be used for the very rapid convergence of variables with shorter predictability timescales.</p><p>The first, running window approach is primarily used for evaluating the predictability of temperature and salinity. First, the data are averaged (volume weighted) over each zone: c̄^n_t = Σ_i c^n_{t,i} dV_i / Σ_j dV_j and f̄^n_t = Σ_i f^n_{t,i} dV_i / Σ_j dV_j. Histograms over running time windows of these averages estimate the probability distributions, from which the mutual information is evaluated according to Equation <ref type="formula">2</ref>. Shannon entropy H(c)^n_t is also calculated from these histograms for (c̄_t, c̄_{t+τ}) according to Equation <ref type="formula">1</ref>. The predictability time is taken to be when the mutual information, averaged over the ensemble, reaches 90% of the Shannon entropy of the temporal probability distribution.</p><p>Figure 5. Predictability results for Zone 6 volume-averaged temperature (c) and salinity (d) in January to February. Top: temperature (a) and salinity (b) timeseries are plotted for 7 climatology ensemble members (black) and 7 forecast ensemble members (red). Bottom: the information theory metrics for temperature (c) and salinity (d) show the convergence of mutual information (blue) with Shannon entropy (pink). The blue range indicates the forecast ensemble and the blue line is the forecast ensemble mean. The Shannon entropy of the climatological mean is shown at the top of the pink range and 90% of this value is shown as the bottom of the pink range. The mutual information converges to 90% of the Shannon entropy in 7-40 days (Table 1). Figures S14 to S19 in the supporting information show similar plots for other zones.</p><p>The histogram intervals and bin sizes were chosen for each case such that the predictability time period is not sensitive to variations around those values (overly small or large choices show significant dependence on the choices of binning and duration). The predictability timescale remains more sensitive to τ than to the number of bins. While entropy and mutual information are both sensitive to data binning and duration choices, the timescale for mutual information to converge to Shannon entropy is less sensitive for the selected bin sizes and durations.</p><p>The second, spatial variability method evaluates entropy using all spatial grid points within a zone. Let Z be the set of all grid points in a zone. I(f; c)^n_t is evaluated from Equation <ref type="formula">2</ref> between the spatial histograms estimating the probability distributions of c_{t,i∈Z} and f^n_{t,i∈Z}. H(c)^n_t is evaluated using Equation 1 for c_{t,i∈Z}. This approach eliminates the need for time windows by comparing the spatial variation between the forecast and the climatology ensemble mean. This methodology is useful when predictability is so short that a running window may be longer than the predictability timescale. For example, kinetic energy has low predictability, so this approach is used and shown for Zone 6 in Figure <ref type="figure">8</ref>.</p><p>Both the running window and spatial variability approaches use data without fixed references and are non-parametric.
The data are not assumed to follow a Gaussian or any other particular distribution; hence the approach is robust across all kinds of probability distributions, so long as the sampling is sufficient for the histograms to be an accurate representation of the probability distributions. Likewise, the method measures variability with the same units of measure in the forecasts and the climatology, so the units or standards of measurement are consistent regardless of whether physical, biological, environmental, or other metrics are chosen.</p></div>
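The running-window recipe of this section (volume-weighted zone averages, windowed histograms, and the first time the mutual information reaches 90% of the Shannon entropy) can be sketched as follows. This is an illustrative Python sketch with synthetic series; the window length, bin count, and the shape of the initial anomaly are assumptions rather than OSOM values:

```python
import numpy as np

def volume_weighted_mean(field, dV):
    """Volume-weighted zone average: sum_i field_i dV_i / sum_j dV_j."""
    return np.sum(field * dV, axis=-1) / np.sum(dV)

def predictability_timescale(fbar, cbar, window=10, bins=8, frac=0.9):
    """First time index at which the mutual information between running-window
    histograms of forecast and climatology averages reaches frac * H(clim)."""
    T = len(cbar)
    for t in range(T - window):
        c = cbar[t:t + window]
        f = fbar[t:t + window]
        # Shannon entropy of the climatology window (Eq. 1)
        pc, _ = np.histogram(c, bins=bins)
        pc = pc[pc > 0] / pc.sum()
        H = -np.sum(pc * np.log2(pc))
        # mutual information between forecast and climatology windows (Eq. 2)
        pxy, _, _ = np.histogram2d(f, c, bins=bins)
        pxy = pxy / pxy.sum()
        px, py = pxy.sum(1), pxy.sum(0)
        m = pxy > 0
        I = np.sum(pxy[m] * np.log2(pxy[m] / np.outer(px, py)[m]))
        if I >= frac * H:
            return t  # convergence: end of the predictability window
    return None
```

With a forecast series that is decorrelated from the climatology at first and converges to it later, the returned index brackets the time at which the initial anomaly has been flushed.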
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Predictability results</head><p>Figures <ref type="figure">5</ref> and <ref type="figure">6</ref> show typical temperature and salinity results, drawn for both seasons from Zone 6. Other zones are similarly illustrated in the supplementary material.</p><p>In each figure, the first row shows a timeseries comparison between the climatology ensemble (black) and the forecast ensemble (red). The second row shows information theory statistics, which permit a more precise time of convergence than comparison of the timeseries in the upper row alone. Magenta shows H(X) and the range of H(x)^n, the entropy of c_{t,i}; blue members represent I(X; Y)^n, and the single blue line within the blue shaded region is the average of I(X; Y) over all the I(X; Y)^n. Results for combined zones, which progressively increase in volume from north to south, are tabulated in Table <ref type="table">1</ref>. The combined zones enable us to compare the predictability time scale with flushing/turnover time scales evaluated over similar combined regions, measured by distance from the northern end of the estuary to the southern end (Figure <ref type="figure">9</ref>).</p><p>Table <ref type="table">1</ref> compares the predictability timescales by region and season. The summer timescales tend to be longer, reflecting the typically drier conditions during the summer of the year simulated. The timescales for salinity tend to increase as more and more of the Bay regions are included, indicating that anomalies persist somewhere within the Bay after initialization. For regions within the Bay, local circulations and patterns of mixing differ among the different regions, but few clear patterns emerge.
Overall, the timescales span 6.9 to 40.5 days, indicating that predictions of a week or longer may potentially have skill, and that 1-2 months of spinup is necessary for initial condition effects to be lost and for forcing to become dominant.</p><p>Figure <ref type="figure">7</ref> shows an example of temperature and salinity predictability for a single grid point, at the location nearest to the Mount Hope Bay buoy (MtHB in Figure <ref type="figure">2</ref>).</p><p>Perhaps counter to intuition, the central predictability timescale estimates at this location are close to the zone-averaged values (zone-averaged salinity: 18.5 [16.6-31.2] days), and the estimated ranges are consistently overlapping. There are many processes which would increase the amount of internal variability at a single location, such as meandering currents, waves, and other effects of flow-topography interaction. Thus, the predictability of an individual measurement location need not agree with the predictability of the region containing it, because this internal variability would be missing from the zone averages. However, in this case, and indeed for all of the monitoring buoy locations shown in Figure <ref type="figure">2</ref>, the buoys are deployed deliberately in locations thought to be representative of their section of the Bay rather than within a particular feature such as a regular plume or jet. Thus, the agreement in predictability timescales is perhaps not coincidental, but reflects judicious choices for observational advantage. Presenting results at this single location highlights the possibility of evaluating predictability metrics at one location, not just in regional averages, and the potential reasons why these two approaches may differ.</p><p>Likewise, predictability is not limited to temperature and salinity. The predictability of kinetic energy is shown in Figure <ref type="figure">8</ref> for Zone 6 and is less than 2 days.
The mutual information converges towards the Shannon entropy within a very short period, and the alternative method of calculating the probability distribution using spatial variability is needed. As will be shown in the next section, there is consistency between the timescales of freshwater and salinity flushing and the predictability timescales, which argues that the estuarine circulation tends to dominate these tracers. However, anomalies in the kinetic energy within a region are generated much more quickly, by the faster tidal and wind-driven processes discussed in Section 7.</p></div>
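The spatial-variability alternative used for fast variables such as kinetic energy (histograms over all grid points i in a zone Z at a fixed time, rather than over a running time window) can be sketched as follows; an illustrative Python sketch with an arbitrary bin count and synthetic grid values:

```python
import numpy as np

def spatial_metrics(f_zone, c_zone, bins=12):
    """Entropy of the climatology and mutual information between forecast and
    climatology, estimated from histograms over the grid points of a zone at one
    instant, rather than over a running time window."""
    pc, _ = np.histogram(c_zone, bins=bins)
    pc = pc[pc > 0] / pc.sum()
    H = -np.sum(pc * np.log2(pc))                     # Eq. 1 over i in Z
    pxy, _, _ = np.histogram2d(f_zone, c_zone, bins=bins)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(1), pxy.sum(0)
    m = pxy > 0
    I = np.sum(pxy[m] * np.log2(pxy[m] / np.outer(px, py)[m]))  # Eq. 2
    return H, I

def converged(f_zone, c_zone, bins=12, frac=0.9):
    """True once the spatial distributions are effectively indistinguishable."""
    H, I = spatial_metrics(f_zone, c_zone, bins)
    return I >= frac * H
```

Applied at each output time, the first time `converged` returns True marks the end of the (short) predictability window for variables whose timescale is shorter than any practical running window.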
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Turnover timescales</head><p>The turnover or flushing time scale is the time required for replenishment of a particular water mass in the estuary, based on its rate of resupply or removal. For a water mass having a volume V and volume flux rate Q, the flushing time scale is simply τ = V/Q (e.g., <ref type="bibr">Monsen et al., 2002;</ref><ref type="bibr">Rayson et al., 2016)</ref>. In the present study, the freshwater turnover/flushing time scale and the salinity turnover time scale are calculated from the model output and compared with the predictability time scales.</p><p>The approach here follows <ref type="bibr">Lemagie &amp; Lerczak (2015)</ref> in comparing estuarine timescales by standard definitions, except that here the estuarine timescales are also compared with the predictability timescale.</p><p>The freshwater volume is estimated using the relation V_f = ∫_V (s_ocean - s)/s_ocean dV, and the freshwater flushing timescale is τ_f = V_f / Q_r, where Q_r is the river supply and runoff.</p><p>The salinity turnover timescale follows the isohaline procedure of MacCready (2011). The fluxes of saline water masses are calculated for each salinity class. Let Q(s) be the tidally averaged flux corresponding to salinity s, given by Q(s) = ⟨⟨ ∫_{A_s} u dA ⟩⟩, where the double angled brackets denote temporal filtering over a tidal period with a Butterworth filter and A_s is the cross-sectional area having salinity greater than s. Q(s) is thus the flux for salinity in the range (s, s_max). Q(s) is evaluated laterally at a vertical cross section along the estuary, beginning at the north and proceeding south.
The flux moving in, Q_in, and moving out, Q_out, of the estuary is calculated using an integral over the salinity classes: Q_in = ∫ (-∂Q/∂s)_in ds and Q_out = ∫ (-∂Q/∂s)_out ds, where "in" and "out" are evaluated on the basis of the sign of the integrand -∂Q/∂s. MacCready (2011) defines these fluxes as the total exchange flow (TEF). The TEF relates to the corresponding salt fluxes F_in = ∫ s (-∂Q/∂s)_in ds and F_out = ∫ s (-∂Q/∂s)_out ds.</p><p>The MacCready (2011) approach results in the salinity turnover timescale τ_s = ∫_V s dV / F_in, the ratio of the salt content of the control volume to the salt flux entering it.</p><p>The salinity predictability time scale is shown by red crosses, for Zone 1 and then the combined regions (1 to 2, 1 to 5, 1 to 7), in the last three rows of Table <ref type="table">1</ref>.</p><p>Using the above definitions, τ_f and τ_s have been found by considering a control volume with one end fixed at the mouth of the Providence River at the northernmost end of the estuary.</p></div>
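The flushing relation τ = V/Q and the sign-split TEF bookkeeping can be sketched numerically. This is a schematic Python illustration: the salinity-class arrays are synthetic, and the sign-split integration follows the TEF definitions only schematically, not the authors' implementation:

```python
import numpy as np

def tef_in_out(s_classes, Q_of_s):
    """Split -dQ/ds into inflow and outflow by the sign of the integrand,
    then integrate over salinity classes (schematic TEF bookkeeping)."""
    dQds = np.gradient(Q_of_s, s_classes)   # dQ/ds on the salinity-class grid
    integrand = -dQds
    ds = np.gradient(s_classes)             # class widths
    Q_in = np.sum(np.where(integrand > 0, integrand, 0.0) * ds)
    Q_out = np.sum(np.where(integrand < 0, integrand, 0.0) * ds)
    return Q_in, Q_out

def flushing_time(V, Q):
    """Flushing timescale tau = V / Q for a water mass of volume V and flux Q,
    e.g. freshwater flushing: flushing_time(V_freshwater, Q_river)."""
    return V / Q
```

With daily model output, `flushing_time` applied to the freshwater volume and river supply gives τ_f, while `tef_in_out` supplies the exchange fluxes used for the salinity turnover timescale.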
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Discussion</head><p>The predictability timescales measure the persistence of statistical anomalies, deviating from climatology, that stem from the initial conditions. The decay of these anomalies, detected through the information theory metrics, may occur by a variety of processes: tidal or wind-driven mixing, being carried out of the Bay by advection, or becoming so well stirred by turbulent motions that they no longer persist as statistical anomalies. The consistency between the salinity and temperature predictability timescales and the salinity flushing timescales suggests that these anomalies are removed from the Bay primarily by the estuarine circulation, whose timescale is estimated with the variety of flushing timescales shown. Even pointwise measurements tend to agree with their zone-average prediction timescale (Figure <ref type="figure">7</ref>), which indicates that the anomalies in OSOM temperature and salinity tend to be fairly well mixed over broad areas, so that regions and buoys capture much the same information. It is not clear whether this is true in the real Narragansett Bay to the same degree, but the consistency in the degree of variability between the modeled buoy locations and the buoy observations (Figure <ref type="figure">3</ref>) suggests that it may be.</p><p>The predictability timescale of kinetic energy is one to two orders of magnitude shorter than that of temperature or salinity (Figure <ref type="figure">8</ref>). This suggests that kinetic energy in NB is not governed solely by the estuarine overturning. Indeed, NB and the OSOM are highly tidally driven, with the majority of the kinetic energy involved in the ebb and flow.
Apparently, the propagation of tidal energy into the Bay through waves, winds, currents, dissipation, and drag, and general perturbations to the surface elevation and kinetic energy, involves a rather different set of processes operating on very different timescales from the estuarine overturning that transports the salinity and temperature anomalies and sets their predictability.</p></div><div xmlns="http://www.tei-c.org/ns/1.0"><head n="8">Conclusions</head><p>This study has introduced the Ocean State Ocean Model (OSOM) and measures of its intrinsic timescales. The predictability timescales range from 6.9 to 40.5 days for temperature and salinity. The predictability timescales differ for different periods of the year and the region under observation, with generally longer periods for the larger basins and under drier conditions. These relationships are consistent with the expectations of estuarine circulation dominating the flushing of anomalies in salinity and temperature, and these predictability timescales are quantitatively similar to the range of estimates of flushing timescales.</p><p>Information theory proves useful for quantifying predictability. It can also be applied to other variables, such as the physical, biogeochemical, and environmental metrics being considered for forecasting with the OSOM. Not all variables have the same timescales, as some rely on processes that operate at different speeds.</p><p>While it is important to know the predictability timescales for understanding the constraints on spinning up a model and the potential length of a forecast, it should be kept in mind that the skill of a forecast is not simply related to the predictability. Here the model skill is adequate for the assessment of predictability (Section 2.1), but the model shows skill deficiencies in some locations, as highlighted by the comparison to observations at the Greenwich Bay buoy (Figure <ref type="figure">3</ref>).
Such biases and errors in a model may not affect the predictability timescale, but they clearly reduce the value of a forecast. Future work on tuning the model parameterizations and improving the forcing will increase model skill but is not expected to change the predictability. A higher-resolution version of the model is expected to have better skill and lower biases, but the stronger chaotic transport and resolved eddying features in such a model are likely to decrease the predictability timescale (by increasing internal variability). This is one key reason why the predictability metrics are not an aspect of Narragansett Bay itself, but only of this particular model: the OSOM.</p><p>In the case of temperature and salinity predictability in the OSOM, forced estuarine circulations tend to set the dominant timescales. Knowing this is useful in estimating forecast windows, spin-up times, and sensitivity to forcing variability. Other systems, and perhaps the kinetic energy in this system, are dominated by internal variability rather than forced variability. A companion paper expands on this topic for coastal modeling, where a variety of different boundary forcing mechanisms can contribute.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>ESSOAr | https://doi.org/10.1002/essoar.10504826.1 | CC_BY_NC_4.0 | First posted online: Fri, 20 Nov 2020 07:52:10 | This content has not been peer reviewed. Manuscript submitted to JGR: Oceans.</p></note>
		</body>
		</text>
</TEI>
