<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Generalized Fiducial Inference for Threshold Estimation in Dose–Response and Regression Settings</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>03/01/2022</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10340699</idno>
					<idno type="doi">10.1007/s13253-021-00472-0</idno>
					<title level='j'>Journal of Agricultural, Biological and Environmental Statistics</title>
<idno>1085-7117</idno>
<biblScope unit="volume">27</biblScope>
<biblScope unit="issue">1</biblScope>					

					<author>Seungyong Hwang</author><author>Randy C. Lai</author><author>Thomas C. Lee</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[In many biomedical experiments, such as toxicology and pharmacological doseresponse studies, one primary goal is to identify a threshold value such as the minimum effective dose. This paper applies Fisher's fiducial idea to develop an inference method for these threshold values. In addition to providing point estimates, this method also offers confidence intervals. Another appealing feature of the proposed method is that it allows the use of multiple parametric relationships to model the underlying pattern of the data and hence, reduces the risk of model mis-specification. All these parametric relationships satisfy the qualitative assumption that the response and dosage relationship is monotonic after the threshold value. In practice, this assumption may not be valid but is commonly used in dose-response studies. The empirical performance of the proposed method is illustrated with synthetic experiments and real data applications. When comparing to existing methods in the literature, the proposed method produces superior results in most synthetic experiments and real data sets.Supplementary materials accompanying this paper appear on-line.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>In pharmacological and toxicological dose-response experiments, a primary and important goal is to identify a threshold value for a specific purpose. For example, identifying the minimum effective dose (MED) of a medicine in the beginning stage of a dose-response experiment is important, as any dose level that is less than the identified level does not need to be considered in further studies. As another related example, the identification of the noobserved-adverse-effect-level (NOAEL) is equally important. It is because in general, as the dosage increases, the possibility of having adverse effects also increases. Therefore, once NOAEL is identified, one can minimize the adverse side effects from the medicine and also the cost of the experiments <ref type="bibr">(Bretz et al. 2008;</ref><ref type="bibr">Guo and Li 2014;</ref><ref type="bibr">Kim et al. 2016;</ref><ref type="bibr">Schwartz et al. 2001;</ref><ref type="bibr">West and Kodell 2005;</ref><ref type="bibr">Coffey and Gennings 2007)</ref>. In this context, the detection of MED or NOAEL can be considered as a threshold estimation problem (e.g., <ref type="bibr">Mallik et al. 2011)</ref>. In addition to dose-response studies, threshold estimation problems can also be found in environmental studies such as detecting changes in air pollutant levels <ref type="bibr">(Holst et al. 1996)</ref>.</p><p>Existing statistical methods for threshold estimation in dose-response experiments can be broadly grouped into two different categories <ref type="bibr">(Bretz et al. 2008</ref>): multiple comparison methods <ref type="bibr">(Williams 1972;</ref><ref type="bibr">Ruberg 1989;</ref><ref type="bibr">Hsu and Berger 1999;</ref><ref type="bibr">Tamhane and Logan 2002;</ref><ref type="bibr">Guo and Li 2014;</ref><ref type="bibr">Otava et al. 2017</ref>) and model-based methods <ref type="bibr">(Cox 1987;</ref><ref type="bibr">Muggeo 2003;</ref><ref type="bibr">Schwartz et al. 2001;</ref><ref type="bibr">West and Kodell 2005;</ref><ref type="bibr">Coffey and Gennings 2007;</ref><ref type="bibr">Lutz and Lutz 2009;</ref><ref type="bibr">Mallik et al. 2011)</ref>, while multiple comparison methods can be further divided into Bayesian (e.g., <ref type="bibr">Guo and Li 2014;</ref><ref type="bibr">Otava et al. 2017)</ref> and frequentist (e.g., <ref type="bibr">Williams 1972;</ref><ref type="bibr">Ruberg 1989;</ref><ref type="bibr">Hsu and Berger 1999;</ref><ref type="bibr">Tamhane and Logan 2002)</ref> approaches.</p><p>Multiple comparison methods require fewer assumptions, but they restrict the estimated MED to be one of the investigated levels, i.e., they will not produce any estimated MED value that is different from those dose levels that were experimented with-not even with interpolation between doses levels or similar techniques. On the other hand, model-based methods are capable of interpolating and extrapolating dose levels as they impose a stronger assumption that there exists a model that adequately captures the relationship between the dose levels and the response. Most drug development experiments assume that such a relationship is non-decreasing, which means the higher the dosage the stronger the therapeutical effects <ref type="bibr">(Bretz et al. 2008;</ref><ref type="bibr">Lutz and Lutz 2009;</ref><ref type="bibr">Mallik et al. 2011)</ref>. Of course, if the chosen model does not adequately capture the relationship, the estimated threshold value is unlikely to be close to the true value. This is a potential drawback of the model-based methods.</p><p>Both Bayesian (e.g., <ref type="bibr">Guo and Li 2014;</ref><ref type="bibr">Otava et al. 2017)</ref> and frequentist (e.g., <ref type="bibr">Williams 1972;</ref><ref type="bibr">Hsu and Berger 1999)</ref> multiple comparison methods typically define the hypotheses of interest according to the investigated dose levels. For example, if there are two dose levels, four hypotheses will be investigated: (i) none of the doses have any effect, (ii) only one dose has effect, (iii) both doses have the same effect, and (iv) the two doses have different effects. As the number of investigated dose levels increases, the number of hypotheses increases rapidly. Therefore, such methods (e.g., <ref type="bibr">Williams 1972;</ref><ref type="bibr">Hsu and Berger 1999;</ref><ref type="bibr">Guo and Li 2014;</ref><ref type="bibr">Otava et al. 2017</ref>) only considered the case when the number of dose levels is small (4 or 5). The Bayesian methods typically assign prior probabilities to the hypotheses and make final conclusions using the posterior probabilities.</p><p>For model-based methods, if one believes the relationship between the dosage and response is monotonic, one can model this problem with a regression function that is constant to the left of a certain covariate value t 0 , say, and is increasing to the right of t 0 <ref type="bibr">(Bretz et al. 2008;</ref><ref type="bibr">Lutz and Lutz 2009;</ref><ref type="bibr">Mallik et al. 2011)</ref>. Under this regression model, the main purpose is to make an inference about the value of t 0 , the threshold value. In dose-response experiments, it can be a MED, which starts showing a discernible effect compared to a control. A notable example is the break point estimation method developed by <ref type="bibr">Muggeo (2003)</ref>.</p><p>With a parametric segmented regression model commonly used in the threshold estimation problem, this method first detects if a threshold t 0 exists, and if yes, it then provides the maximum likelihood estimate (MLE) as well as a confidence interval for t 0 . Consequently, this method does not always produce an estimate for t 0 . More recently, in <ref type="bibr">Mallik et al. (2011)</ref> a method based on a novel p value framework was developed. The idea is to first conduct a sequence of tests that, for each covariate value, assesses whether the regression function is at its baseline, then calculate the corresponding p values, and lastly estimate t 0 by fitting a piecewise function to those calculated p values. This method, however, does not provide confidence intervals.</p><p>The main contribution of this paper is a new method for performing statistical inference for t 0 . This method is model-based and has the following properties:</p><p>&#8226; It is neither Bayesian nor frequentist; rather, it follows the generalized fiducial inference methodology of <ref type="bibr">Hannig et al. (2016)</ref>.</p><p>&#8226; Model averaging is employed to avoid the above-mentioned danger of selecting a wrong model.</p><p>&#8226; Unlike most existing methods, it provides point estimates as well as confidence intervals for t 0 .</p><p>To be more specific, four different parametric forms are used for model averaging: the first one is a step function with its jump point at t 0 , the second one begins with a constant function until it hits t 0 and then has a linear trend afterward, while the third and fourth are similar, except they have, respectively, a quadratic and a cubic trend after t 0 ; some of these parametric forms were used by previous authors (more below). The generalized fiducial inference methodology is used to compute a probability density function for these four parametric forms. With such a density function, the proposed method pools multiple threshold estimates from each of the four parametric forms to form a single point estimate for t 0 , as well as constructs a confidence interval for its value. The rest of this paper is organized as follows. Section 2 provides a brief introduction of generalized fiducial inference, while Sect. 3 states the problem formulation. In Sect. 4, the empirical performance of the proposed method is illustrated under various experimental settings, while in Sect. 5 the proposed method is applied to two real datasets. Lastly, concluding remarks are offered Sect. 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">A BRIEF INTRODUCTION OF GENERALIZED FIDUCIAL INFERENCE</head><p>Bayesian inference is an important statistical methodology that offers both point and uncertainty estimates for the unknown parameters via the calculations of posterior distributions. This provides more information compared to frequentist inference <ref type="bibr">(Xie and Singh 2013)</ref>. However, when no prior information is available, any naive application of the Bayesian methodology could cause concerns; see <ref type="bibr">Efron (2013)</ref> for an example in which different priors lead to very different results. To address this issue, <ref type="bibr">Fisher (1930)</ref> introduced fiducial inference, where the main idea is to transfer the randomness from the observed data into the uncertainties of the parameter estimates via the use of a switching principle. This switching principle is also used in maximum likelihood estimation and will be described further below. The uncertainties of the parameter estimates are summarized in the form of a distribution function, termed generalized fiducial density, which plays a similar role as the posterior distribution in the Bayesian setting.</p><p>Soon it was realized by other researchers that Fisher's fiducial idea only works for univariate problems, and hence, it did not receive much attention from the statistics community for a few decades. However, there has been a recent resurgence of interest in different variants of Fisher's idea, including Dempster-Shafer theory <ref type="bibr">(Dempster 2008)</ref>, inferential models <ref type="bibr">(Martin and Liu 2015)</ref>, confidence distribution <ref type="bibr">(Xie and Singh 2013)</ref>, and more generally fusion learning <ref type="bibr">Cheng et al. (2014)</ref>. All these variants use similar but different ideas to extend Fisher's idea to multivariate cases. The particular variant that this paper considers is the so-called generalized fiducial inference <ref type="bibr">(Hannig et al. 2016)</ref>.</p><p>Generalized fiducial inference (GFI) repairs many drawbacks of Fisher's original proposal and has been applied to various inference problems with great success, including wavelet regression, ultrahigh-dimensional models, nonparametric survival function estimation; e.g., see <ref type="bibr">Hannig and Lee (2009)</ref>, <ref type="bibr">Lai et al. (2015)</ref>, <ref type="bibr">Hannig et al. (2016)</ref>, <ref type="bibr">Cui and Hannig (2019)</ref> and references therein. GFI assumes that the data are generated from a data generating equation G and expresses the relationship between the data Y and the parameters &#952; as</p><p>where U is the random component whose distribution is assumed to be completely known. For many practical problems, it is the distribution of the (normalized) residuals. For the current problem, it is the distribution of u in (4) below, which is assumed to be N (0, 1). Similar to Fisher's maximum likelihood method, an important idea behind GFI is the so-called switching principle. That is, the roles of the random data Y and the deterministic parameters &#952; are switched after Y is observed: the observed data, denoted as y, are now treated as deterministic while &#952; is treated as random. By using (1) and this switching principle, we can derive a probability distribution of &#952; as follows.</p><p>Suppose for now, the data generating equation G is invertible for any u &#8712; U and any observed y. That is, the inverse G -1 always exists:</p><p>(2)</p><p>Recall that the distribution of U is completely known, thus one can generate a random sample {u 1 , u 2 , . . . , u n } of U and plug them into (2) and obtain</p><p>We shall call { &#952;1 , . . . , &#952; n } a fiducial sample of &#952; , which can be used to form point estimates and confidence intervals for &#952;, as similar to a posterior sample in the Bayesian context. One can see that through G -1 , the randomness in the data y is transformed into the randomness in the parameter &#952; .</p><p>Of course in practice, the inverse function G -1 in (2) does not always exist, and Hannig (2013) proposed a solution to deal with this issue.</p><p>From the above discussion, one can see that there is a density function behind the fiducial sample { &#952;1 , . . . , &#952; n }. We shall call this density r (&#952;|y) the generalized fiducial density. When &#952; &#8712; &#8834; R p and y &#8712; R n , and under some regularity conditions, it can be shown that r (&#952;|y) admits the following expression <ref type="bibr">(Hannig 2013;</ref><ref type="bibr">Hannig et al. 2016)</ref>:</p><p>where f (y, &#952; ) is the likelihood, J (y, &#952; ) is a Jacobian-like quantity that admits the expression:</p><p>with</p><p>,</p><p>where A &#8712; R n&#215; p . The derivation of (3) is not straightforward, but it was obtained by applying the implicit function theorem and Jacobian transformation. Note also its similarity with the Jeffreys prior in the Bayesian methodology.</p><p>Depending on the problem settings, it may not be feasible to obtain closed-form expressions for J (y, &#952; ) or f (y, &#952; )J (y, &#952; )d&#952; . Especially, with the large sample size, the number of combinations of tuples is too large to consider all possible combinations. This problem can sometimes be solved by sampling the combinations of tuples and taking the mean of them (e.g., <ref type="bibr">Hannig et al. 2014)</ref>. Some other times, one could use Markov Chain Monte Carlo (MCMC) methods to generate fiducial samples without the need to evaluate the normalizing constant f (y, &#952; )J (y, &#952; )d&#952; . In fact, as described below, for the present problem we need to use MCMC to generate a fiducial sample.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">GENERALIZED FIDUCIAL INFERENCE FOR THRESHOLD ESTIMATION</head><p>As mentioned before, many dose-response studies assume a monotonic relationship between the dose levels x and the pharmacological effects y of the medicine (e.g., <ref type="bibr">Bretz et al. 2008</ref><ref type="bibr">Bretz et al. , 2010;;</ref><ref type="bibr">Pinheiro et al. 2006;</ref><ref type="bibr">Mallik et al. 2011</ref>). Hence, one can consider the following regression model:</p><p>where t 0 is the threshold, b 0 &#8712; R is the baseline, u is the error term (usually standard normal), and &#963; is the noise standard deviation. Also, we use the following to denote the observed data from a study:</p><p>where y i j is the measured pharmacological effect of i-th dosage from subject j, x i is the i-th dosage (x 0 represents control), n is the number of dosages, m i is the number of subjects with i-th dosage, and u i j iid &#8764; N (0, 1) The goal is to conduct statistical inference on t 0 , which is the MED in this dose-response setting. As mentioned in the introduction, we assume a constant value (&#946; 0 ) for the first portion of &#956;(x) (i.e., &#956;(x) for x &#8804; t 0 ) while we use four different non-decreasing functions to model &#956;(x) for x &gt; t 0 :</p><p>where I E is the indicator function for the event E.</p><p>At least for the cases of p = 1 and 2, these functions have been used by previous authors (Muggeo 2003; <ref type="bibr">Mallik et al. 2011</ref>) to model &#956;(x). We incorporate p = 3 and 4 with the expectation that ( <ref type="formula">5</ref>) is rich enough to capture the behaviors of the responses in most practical situations.</p><p>To proceed with inference on t 0 , we need to calculate the generalized fiducial density r (&#952;|y) for the above setup, where &#952; = {t 0 , p, &#963;, &#946; 0 , &#946; 1 } and y = (y i j ; &#8704;i, j).</p><p>Due to the indicator function, the four regression functions above do not satisfy a smoothness assumption required by <ref type="bibr">Hannig (2013)</ref>, and therefore (3) cannot be applied. To overcome this issue, the indicator function is replaced by a sigmoid function:</p><p>where C is a large constant that we set it to be C = 10 3 . In the sigmoid function, &#955; controls the slope near t 0 . See Figure S.1 for some examples of (5) with the indicator function replaced by the sigmoid function.</p><p>The data generating equation G for the current problem is:</p><p>for p = 1, 2, 3, 4. Now with this data generating equation ( <ref type="formula">7</ref>), one could attempt to calculate the generalized fiducial density r (&#952;|y) using (3). However, the analytic evaluation of the normalizing constant f (y, &#952; )J (y, &#952; )d&#952; in (3) is infeasible. Consequently, we need to resort to MCMC methods for generating fiducial samples for &#952; , as to be described below.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">GENERATING FIDUCIAL SAMPLES</head><p>This subsection presents an MCMC algorithm for generating a fiducial sample for &#952; = {t 0 , p, &#963;, &#946; 0 , &#946; 1 }. We first provide a summary of the algorithm, then followed by the details. The steps of the algorithm are:</p><p>1. generate a t0 using (8); 2. given t0 obtained in the previous step, generate a p from (9); 3. given t0 and p obtained in the previous steps, generate a &#963; from (10); 4. given t0 , p and &#963; obtained in the previous steps, generate &#946;0 and &#946;1 using (11); <ref type="formula">12</ref>) is larger than a realization from Uniform(0, 1); otherwise, set &#952; new = &#952; old . Now, we present the detailed steps. Generating t 0 : To generate a value for t 0 , we use the truncated normal distribution with x 0 and x n as, respectively, the lower and upper bounds:</p><p>where t 0 old is the t 0 value from the previous iteration, and &#963; 2 t is used to perturb its value. In our practical implementation, we set &#963; t = (x nx 0 )/10 and chose the initial value of t 0 at random uniformly from [x 0 , x n ].</p><p>Generating p given t 0 : Suppose now we have a value for t 0 . We can calculate the conditional generalized fiducial density for each of the four models &#956; p (x) in (5), indexed by p = 1, 2, 3, 4. Let &#958; = {&#946; 0 , &#946; 1 , &#963; } &#8712; and denote the conditional generalized fiducial density as r ( p|t 0 , y):</p><p>Direct calculations give</p><p>where X p is the design matrix for &#956; p (x), RSS p is the corresponding residual sum of squares, and N is total number of observations (N = i m i ). Therefore,</p><p>Thus, given a value for t 0 , using (9) one can calculate the fiducial probabilities for the four models in (5), and from these probabilities, one can generate a model out of the four, or equivalently, generate a value for p. Generating &#963; 2 given (t 0 , p): Next, given t 0 and p, it can be shown that the conditional generalized fiducial distribution of &#963; 2 is inverse &#967; 2 :</p><p>where</p><p>Therefore, &#963; 2 can be sampled from (10).</p><p>Generating &#946; p = (&#946; 0 , &#946; 1 ) given (t 0 , p, &#963; ): It is straightforward to show that the generalized fiducial distribution of &#946; p condition on t 0 , p, &#963; 2 is</p><p>from which a value for &#946; p can be sampled from.</p><p>Accept/Reject: Executing the above steps will produce a value for &#952; = (t 0 , p, &#963;, &#946; 0 , &#946; 1 ) which we need to decide if it should be accepted or rejected. To do so, we need the generalized fiducial density r (&#952;|y):</p><p>where</p><p>.</p><p>To speed up the computations, we use the sampling technique proposed in <ref type="bibr">Hannig et al. (2014)</ref> to approximate the summation i | det(A p ) i |.</p><p>Denote those new parameter values generated using the above steps as &#952; new , and those parameter values in the previous MCMC iteration as &#952; old . Let T (&#952; new |&#952; old ) be the proposal distribution which is the product of (10), ( <ref type="formula">11</ref>) and ( <ref type="formula">8</ref>). If the ratio</p><p>is larger than a realization from Uniform(0, 1), we accept &#952; new . Otherwise, we use &#952; old again for the next iteration.</p><p>Repeating the above procedure, a fiducial sample for &#952; can be generated, and the methods implemented by <ref type="bibr">Plummer et al. (2006)</ref> can be applied to determine if the sample size is large enough. Once a sufficiently large fiducial sample is available, statistical inference can then be conducted. For example, one could use the average of all t0 's as an estimate of t 0 . Also, for a 95% confidence interval, one could take the quantiles 2.5% and 97.5 % of the t0 's.</p><p>Lastly, we stress that in our approach more than one model (see ( <ref type="formula">5</ref>)) are used to generate the above fiducial sample t0 's, and therefore reducing the risk of incorrect model selection.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">NUMERICAL EXPERIMENTS</head><p>This section studies the empirical performance of the proposed method through a sequence of numerical experiments. In particular, the proposed method is compared to two existing methods in the literature: the MLE break point estimation method of Muggeo (2003) and the p value method of <ref type="bibr">Mallik et al. (2011)</ref>. Further details of these two methods can be found in the introduction. In sequel, we shall refer the proposed method as GFI, the method of <ref type="bibr">Mallik et al. (2011)</ref> as p value, and the method of Muggeo (2003) as MLE. The code for GFI can be downloaded from <ref type="url">https://github.com/vic-dragon/GFI_threshold</ref> Altogether five different test functions were used for &#956;(x):</p><p>&#8226; Test Function 1: &#956;(x) = 0.5 &#215; I {x&#8805;0.5} ,</p><p>&#8226; Test Function 2: &#956;(x) = 2(x -0.5) 2 &#215; I {x&#8805;0.5} ,</p><p>&#8226; Test Function 4: &#956;(x) = 0.05 sin(50x)I {x&lt;0.5} + 2(x -0.5) 2 &#215; I {x&#8805;0.5} ,</p><p>&#8226; Test Function 5: &#956;(x) = 0.1 -0.1(x -0.5)I {x&lt;0.5} + 0.5(x -0.5) &#215; I {x&#8805;0.5} .</p><p>These functions are displayed in Fig. <ref type="figure">1</ref>. Notice that only the first two functions satisfy the parametric assumption (5). Also notice that all five functions share the same threshold value t 0 = 0.5, although their shapes are quite different. However, for Test Function 5, one should treat t 0 = 0.6 as the threshold value (instead of 0.5), as it is the point that starts showing a discernible effect compared to the baseline <ref type="bibr">(Agathokleous et al. 2019</ref>). Some of these test functions have been used by other authors (e.g., Muggeo 2003; <ref type="bibr">Mallik et al. 2011</ref>). The data were generated as follows. First, we set x i = i/(n + 1) for i = 1, . . . , n. Then, we calculated y i j = &#956;(x i ) + i j , where &#956;(x) is one of the five test functions, i j iid &#8764; N 0, &#963; 2 , for j = 1, . . . , m and i = 1, . . . , n. Two values were used for &#963; = {0.1, 0.3}, while five different combinations of values were used for (m, n) = {(6, 5), (10, 10), (10, 20), (20, 50), (50, 100)}. We also investigated the performances of the methods under heteroscedastic noise: we generated data with the gradually increasing noise variance (&#963; + 0.2(x -0.5)I {x&#8805;0.5} ).</p><p>For each combination of experimental parameters, 200 data sets were generated. For each of these data sets, the three methods mentioned above were applied to estimate the threshold value t 0 = 0.5 for Test Functions 1 to 4 and t 0 = 0.6 for Test Function 5. The averaged biases and mean-squared-errors (MSEs) were calculated. Also, for the GFI and MLE methods, we constructed 95% confidence intervals for the threshold value. The empirical coverage rates (ECRs) and their averaged widths of these confidence intervals were calculated. The results can be found in Tables <ref type="table">1,</ref><ref type="table">2</ref>, S.1, and S.2. Note that the p value method does not produce confidence intervals and hence, is omitted for this comparison.</p><p>With the largest sample size that we tested (n = 50, m = 100), the GFI method typically took less than 2 min to process one data set with a Dell PowerEdge R360 server, 2015 model. The acceptance rates of the MCMC algorithm depend on the experimental configurations. They ranged from 0.022 to 0.572 for homoscedastic noise, and from 0.024 to 0.559 for heteroscedastic noise. More detailed information can be found in Tables S.3 and S.4.</p><p>Recall that the MLE method does not always produce an estimate t0 for t 0 (see Sect. 1). Thus, all the results in the previous tables are based on those cases where t0 were available. The percentages of time that MLE produced estimates are given in Tables S.5 and S.6. These percentages ranged from 42% to 100%, with most cases larger than 90%.</p><p>From the experimental results, the following empirical conclusions can be drawn. First, in terms of point estimation, no method is uniformly the best. The relative performances of the three methods depend largely on the test function. Nevertheless, it seems GFI provided the best results for a majority of the experimental configurations (72% of the experimental configurations for homoscedastic noise and in 66% of the experimental configurations for heteroscedastic noise), and that p value suffered from bias issues. For confidence intervals, one cannot conclude if GFI or MLE is uniformly better, although GFI gave better results in 88% of the experimental configurations for homoscedastic noise and in 96% of the experimental configurations for heteroscedastic noise. Therefore, in practice, our recommendation is that, if nothing is known about the shape of the underlying function, GFI would be the first method to apply.</p><p>Recall that the proposed GFI method uses four different parametric functions to model the underlying true function (see ( <ref type="formula">5</ref>)), and that the fiducial samples for p were sampled from the conditional distribution (9). Figures S.2 to S.5 report the relative percentages of the sampled values of p. As Test Functions 1 and 2 are special cases of (5), one can see that when m and n are large, GFI has a high tendency of selecting the true model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">APPLICATIONS WITH REAL DATA</head><p>In this section, two real data sets are analyzed with the proposed GFI method. We note that for one data set the monotonicity is in the opposite direction to model (5), while for the  Reported are the averages (&#215;10 2 ) and standard errors (&#215;10 2 , in parentheses) of the biases and the MSEs for the threshold estimates obtained by the GFI, MLE, and p value methods (smallest values are bolded). Also reported are the empirical coverage rates (ECRs) and the average widths (in parentheses) of the 95% confidence intervals produced by the GFI and MLE methods (ECRs that are closest to the nominal target rate of 95% are bolded). Recall that n is the number of dosages and m is the number of subjects for each dosage The first data set is related to a dose response experiment originated from a biological study of the effect of the 1-methyl-3-butylimidazolium tetrafluoroborate treatment on the IPC-81 rat cell lines; see <ref type="bibr">Ranke et al. (2004)</ref>. The purpose of this study is to assess environmental hazards through measuring toxicity of ionic liquids on the mammalian cell lines. It is of particular interest to estimate the lethal dose level which makes cell lines stop responding. The cell responses were measured as percentages of cell viability compared to control, taken at different dose levels (measured in &#956;M) of the treatment. The sample size is N = n &#215; m = 9 &#215; 9 = 81. The GFI method provides a threshold value estimate of t0 = 5.22 &#956;M, with a 95% confidence interval as (4.92 &#956;M, 5.34 &#956;M). The MLE method provides a threshold value estimate of t0 = 5.10 &#956;M, with a 95% confidence interval as (4.58 &#956;M, 5.62 &#956;M), while the p value method provides a threshold value estimate of t0 = 5.52 &#956;M. The observed data, as well as these results, are displayed in the left panel of Fig. <ref type="figure">2</ref>. One can see that, after the GFI estimated threshold 5.22 &#956;M, the cell culture dose not show any significant response. Also, the GFI point estimate for t 0 is not in the middle of the confidence interval. Instead, it is closer to the right side, which is very reasonable when inspecting the left panel of Fig. <ref type="figure">2</ref>. Lastly, as 100% of the fiducial sampled values from p = 2, it suggests that the underlying relationship can be well modeled with a piecewise linear function.</p><p>The second data set concerns a light detection and ranging (LiDAR) experiment. LiDAR is a technique for measuring the concentrations of various atmospheric particulates. Very often, it is used to monitor the air pollution status in an area. This second data set contains atmospheric atomic mercury measurements taken at the geothermal power plan Bella Vista in Italy (e.g., <ref type="bibr">Holst et al. 1996;</ref><ref type="bibr">Mallik et al. 2011)</ref>. There are 221 observations. The predictor is the distance from the measuring equipment to the source that reflected the light, with values ranging from 390 m to 720 m. The response is the logarithm of the received light ratio from two different wavelengths; see the right panel of Fig. <ref type="figure">2</ref>. The decreasing trend starting around range=540 m indicates the existence of mercury. And it is of interest in knowing exactly when this decreasing trend started. The GFI method gives a threshold value estimate of t0 = 543.55 m, with a 95% confidence interval as (541.47 m, 548.92 m). The MLE method provides a threshold value estimate of t0 = 544.66 m, with a 95% confidence interval as (538.04 m, 551.29 m), while the p value method provides a threshold value estimate of t0 = 541 m. These results are also displayed in the right panel of Fig. <ref type="figure">2</ref>. As similar to above, the point estimate from GFI is not in the middle of the confidence interval. In addition, 24% and 76% of the fiducial sampled values for p are 1 and 2, respectively. This suggests that the underlying relationship may not be simply modeled by either a piecewise step function or linear function.</p><p>From Fig. <ref type="figure">2</ref>, for both real data examples, one can see that the GFI and MLE methods gave very similar point estimates for t 0 , while the GFI confidence intervals are narrower than those from MLE. The point estimates from the p value method are outside the GFI confidence intervals: it seems to suggest that the p value estimates are biased, which are in agreement with the simulation results. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">CONCLUDING REMARKS</head><p>In this paper, we developed a method for estimating threshold under the a general setting used in many biomedical studies such as toxicology and dose-response studies. The proposed method is based on the generalized fiducial inference framework. Two noteworthy features of the proposed method are (i), in addition to point estimates, it also provides confidence intervals for the parameters of interest, and (ii) it allows the possibility of having multiple parametric models to capture the underlying pattern of the data so that the risk of model mis-specification is reduced. Results from numerical experiments and applications to real data suggest that the proposed method possesses excellent empirical properties. Some extensions of the current work are worth considering. One possibility would be to adopt a richer set of non-decreasing functions to model &#956;(x), as opposed to (5). This could be done quite straightforwardly. Another possibility would be to allow the occurrence of multiple change points, where the number of the change points is unknown a priori. However, this extension could cause some complications, as for example, one potentially would need to handle the issue that different fiducial samples could give different number of change points. At the moment, if there is no change point, our numerical experience suggests that our method will often return x 1 as the estimated threshold, while its performance is less predictable if there are multiple change points.</p><p>One could also investigate the theoretical properties of the proposed method. For example, if one is willing to make the unrealistic assumption that t 0 is given, one can show that as the sample size increases, the probability of selecting the correct model from (5) goes to 1, which leads to the following proposition. <ref type="bibr">Gao et al. (2020)</ref> and that t 0 is given. Denote the true value of p in (5) as p 0 . Then, the fiducial probability (9) follows r ( p 0 |t 0 , y) &#8594; 1 as n &#8594; &#8734;. This proposition can be proved following the arguments of <ref type="bibr">Gao et al. (2020)</ref>. Essentially, it involves showing that RSS p /RSS p 0 &#8594; 0 for any p = p 0 , which leads to r ( p 0 |t 0 , y) = &#8594; 1 as n &#8594; &#8734;.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Proposition 1. Assume the technical conditions in</head><p>With Proposition 1, the results of <ref type="bibr">Hannig et al. (2016)</ref> also guarantee that inferences made by GFI enjoy asymptotically correct frequentist properties, i.e., confidence intervals constructed by r (&#952;|y) will achieve the nominated coverage levels when the sample size is large enough. However, the case of unknown t 0 is substantially more complicated and we plan to report our results in a future paper.</p></div></body>
		</text>
</TEI>
