Quantifying the Uncertainty of a Coupled Plume and Tephra Dispersal Model: PLUME‐MOM/HYSPLIT Simulations Applied to Andean Volcanoes

Numerical modeling of tephra dispersal and deposition is essential for evaluation of volcanic hazards. Many models consider reasonable physical approximations in order to reduce computational times, but this may introduce a certain degree of uncertainty in the simulation outputs. The important step of uncertainty quantification is dealt in this paper with respect to a coupled version of a plume model (PLUME‐MoM) and a tephra dispersal model (HYSPLIT). The performances of this model are evaluated through simulations of four past eruptions of different magnitudes and styles from three Andean volcanoes, and the uncertainty is quantified by evaluating the differences between modeled and observed data of plume height (at different time steps above the vent) as well as mass loading and grain size at given stratigraphic sections. Different meteorological data sets were also tested and had a sensible influence on the model outputs. Other results highlight that the model tends to underestimate plume heights while overestimating mass loading values, especially for higher‐magnitude eruptions. Moreover, the advective part of HYSPLIT seems to work more efficiently than the diffusive part. Finally, though the coupled PLUME‐MoM/HYSPLIT model generally is less efficient in reproducing deposit grain sizes, we propose that it may be used for hazard map production for higher‐magnitude eruptions (sub‐Plinian or Plinian) for what concern mass loading.

(concentrations in the atmosphere in kg/m 3 ) [e.g., Scollo et al., 2008;Costa et al., 2009; 78 Bonasia et al., 2010;Folch, 2012]. 79 The aim of the present study is therefore twofold. Firstly, we present a coupled version 80 of two different models: i) a renewed version of PLUME-MoM, a simplified 1-D plume 81 model developed by de'Michieli Vitturi et al. [2015], and ii) the HYSPLIT model [Stein et al.,  that is within the eruptive column and through atmospheric dispersion. Furthermore, the 92 uncertainty quantification represents an important aspect regarding hazard maps production. 93 In this article, after describing the eruptions chosen for the uncertainty quantification 94 (section 2.1), we present the PLUME-MoM and HYSPLIT models as well as the coupling of 95 these two models (section 2.2.1). Then we present the input parameters used for the 96 simulations (Section 2.2.2) and we describe the strategy adopted for the quantification of the 97 uncertainty of the coupled model (Section 2.3). Results presented in Section 3 serve as a basis 98 for the discussion in Section 4 about the uncertainties related to the input parameters and the 99 numerical models and about also the effectiveness of these models when used for producing 100 tephra fallout hazard maps. The four eruptions chosen for testing our simulations cover a wide range of eruptive 105 styles (sub-Plinian, violent strombolian, vulcanian, hydrovolcanic to long-lasting ash 106 emission), durations (from few hours up to more than 3 months) and magma compositions 107 (andesitic to rhyolitic/rhyodacitic). The criteria for selecting these eruptions were i) the 108 location of the volcanoes in the same geodynamic context, ii) the existence of both detailed 109 chronologies and meteorological data for the eruptions, and iii) the availability of reasonably 110 well constrained input parameters for the models.  Through frequent sampling missions, the ash emission rate was calculated and correlated with 120 the eruptive tremors, and it decreased during three emission phases following the conduit 121 opening [Bernard et al., 2016a]. 122 The fallout deposit was characterized by a very fine-grained ash with mostly blocky   approximately between 9 and 12 km above vent, then between 4 and 9 km during the 177 following week, and less than 6 km after June 14 th [Bonadonna et al., 2015a;Biondi et al., 178 2017].

179
During the eruption, the mass eruption rate (MER) fluctuated between 2.8x10 7 (during 180 the first days) and less than 5x10 5 kg/s after June 7 th [Bonadonna et al., 2015b] to June 20 th using the above-mentioned WRF-FALL3D code. The authors compared both the 190 column mass load (in ton/km 2 ) and ground deposit measurements between modeled and 191 observed values. With respect to deposit thickness measurements, they compared deposit 192 thicknesses at 37 locations, resulting in a best-fit line on a computed versus observed graphs.  varying a series of input parameters (i.e. air radial/wind entrainment, exit velocity, exit 214 temperature, water fraction and wind intensity). The above-mentioned authors showed that 215 plume height distribution was the widest when the parameters varied were the exit velocity, 216 exit temperature, water fraction and wind intensity. With respect to the sensitivity, de'Michieli 217 Vitturi et al. [2016] showed that initial water fraction had the strongest influence on plume height determination (i.e. the plume height decreased by a factor of ~1.54 when increasing 219 water content from 1 to 5 wt%).

220
HYSPLIT belongs to the family of Lagrangian Volcanic ash transport and dispersion 221 models, which have been used operationally since the mid 1990's by the International Civil 222 Aviation Organization (ICAO) to provide ash forecast guidance. The model solves the 223 Lagrangian equations of motion for the horizontal transport of pollutants (i.e. particles), while 224 vertical motion depends on the pollutant terminal fall velocity. The dispersion of a pollutant 225 may be described using three main types of configuration, "3D particle" "puff" or hybrid 226 "particle/puff". Particularly, in the "puff" configuration, pollutants are described by packets of 227 ash particles ("puffs") having a horizontal Gaussian distribution of mass described by a 228 standard deviation σ. The puffs expand with atmospheric turbulence until they exceed the size 229 of the meteorological grid cell (either horizontally or vertically) and then split into several 230 new puffs, each with their respective pollutant mass. In this work, the hybrid "particle/puff" 231 configuration has been used, in which the horizontal packets of particles have a "puff"  In the present study we coupled the PLUME-MoM and HYSPLIT models with an ad-243 hoc Python script, which computes for each grain size, from the output of the plume model, were identified first. We considered the mass flow rate (in kg/s), the initial water mass 254 fraction (in wt%) and the particle shape factor [Wilson and Huang, 1979;Riley et al., 2003]. 255 We chose these parameters because their uncertainty was higher and/or the models were more 256 sensitive to small variations of them. The procedure was aimed at minimizing the T 2 function where the sum is extended over N stratigraphic sections used in the inversion, w i are

Modelling features and input parameters
We tested four different types of meteorological data (GDAS, NCEP/NCAR, ERA-

265
Interim, ERA-Interim refined using WRF/ARW; see Text S1 from the Supporting 266 Information for details) with various spatial and temporal resolutions (see Table S1 in 267 Supporting Information), which correspond to the most widely used meteo data for studies 268 similar to ours.

270
After the end of each emission time (i.e. the actual duration of the eruption), a further amount 271 of 12 hours was added to the simulation in order to allow finer particles to settle down.

272
Simulations were performed in a forward way for all the four eruptions. However, a best-

303
[2016] we also used several unpublished data (see Table S1 from the Supporting

308
[2016a] to obtain hourly values (see Table S2 from the Supporting Information).

309
For the Tungurahua T13 eruption, the simulations also covered the whole eruption  335 We quantified the uncertainty of the coupled numerical model by comparing modeled 336 and observed values of key parameters of both the PM and the TTDM.

337
With respect to the PM, we compared the plume height (in meters above vent) 338 observed against the corresponding value at the same time (or at the closest measurement 339 available) given by the model. In this case it is important to remember that plume height in 340 PLUME-MoM is obtained as output value using a fixed MER.

341
For the TTDM, we compared ground deposit measurements and we adopted a specific    dispersal axes for all the simulations (see Figure S17 from the Supporting Information). The

409
The grain size data are scarce but we note that the computed Mdφ values are almost 410 always shifted toward coarser sizes (Fig. 3c) and that the σφ values show that the sorting of 411 the computed deposit is much smaller with respect to reality (Fig. 3d) used for the comparison (Fig. 2b).

418
The plume heights comparison (Fig. 4a) shows that all the simulations markly 419 underestimate the observations reported from both sources. The mean difference is about -2.1 420 km to -2.2 km ( Table 3). The difference of deposit main dispersal axes is small since the 421 simulations done using GDAS and ERA-Interim data are almost coincident with respect to 422 field data while the NCAR simulation is only 8° shifted toward the SW (Figs. 2b and 4b). 423 The observed values of mass loading ( Fig. 4b and  (Table 3). This is also shown by the absolute differences (MO and MU),  Supporting Information] were used for the comparison (Fig. 2c).  considered that most of the sections with underestimation are concentrated in proximity of the 459 main dispersal axis highlighted by field data (Fig. 5b). Notice that the NCEP/NCAR provides values but define trends mimicking that of the perfect fit line (Fig. 5d).

482
For this eruption, the simulations generally overestimate the plume heights observed, 483 which are lowered with the inverse procedure (see Table 5, Fig. 6a). The simulated deposit  axis given by field data, a pattern that is confirmed also by the simulations (see Fig. 6b). This 502 latter feature might be correlated with the progressive anticlockwise rotation of the ash cloud, observed ones (Fig. 6d). An important remark for the modeled grain sizes of the PCC11 527 eruption is that none of them show any bimodal distribution in contrast to the observed data.

528
This is particularly evident for the above-mentioned section n° 57, which does not have 529 bimodality and which has an Mdφ shifted toward more coarser-grained values. Mass loading values for the C15, T13 and T06 eruptions have been actually measured 577 for each section (with various methods), but for the PCC11 they have been determined by 578 multiplying the deposit thickness by a mean bulk deposit density value (see Section 3.4). This 579 latter aspect is critical since density of tephra fall deposits may vary considerably owing to 580 drastic density change between different particle sizes [e.g., Bonadonna and Phillips, 2003;581 Eychenne and Le Pennec, 2012; Pistolesi et al., 2015]. This is particularly important for the 582 PCC11 eruption that has the highest T 2 /MML values (see Table 5), which might also be 583 related to an uncertainty in the observed mass loading data. We also stress that the assumption The PLUME-MoM/HYSPLIT model tends generally to have more points 615 underestimating the mass loading data (see Tables 2 to 5). However, if the absolute mean The failure to take into account such mechanisms implies that the simulated finest-645 grained particles are transported much further that in reality. For instance, the C15 eruption 646 has a particularly fine-grained TGSD [due also to its hydrovolcanic nature, Bernard et al., 647 2016a] (see Table S2 from the Supporting Information) so that the mass is transported all over might be employed to correct the model by estimating its deviance from the observed data.

Conclusions
This paper presents the coupling of the PLUME-         Table 5. Values calculated for the uncertainty quantification for the PCC11 eruption.