Clinically, false negatives are much more harmful to patients than false positives. In the context of pulmonary distension, a false negative may result in higher ventilation pressures that lead to distension and VILI. In contrast, a false positive may mean that clinicians do not increase PEEP and miss out on the potential for improved recruitment at the higher PEEP level. Hence, the impact of a false negative is greater than the impact of a false positive, and a high sensitivity should be favoured when predicting the outcomes of potential treatment methods.

Elastance captures the static tidal pressure required for a given volume of inspired air. While PEEP is the model input that allows the model to fit to the pressure troughs in each breath, an accurate elastance allows an accurate model fit to PIP. If predicted elastance is too high, a predicted breath will have a PIP that is higher than the true PIP. Similarly, if predicted elastance is too low, the predicted PIP will be lower than the true PIP.
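The link between elastance error and PIP error can be illustrated with a minimal sketch. It assumes the standard single-compartment form, pressure = elastance × volume + resistance × flow + PEEP, evaluated at the instant of peak pressure. The function name, parameter names and all numerical values are hypothetical and are not taken from the study data.

```python
def predicted_pip(elastance, volume, peep, resistance=0.0, flow=0.0):
    """Peak pressure (cmH2O) from an assumed single-compartment model.

    elastance  : cmH2O/L   (hypothetical value)
    volume     : L, inspired volume at the instant of peak pressure
    peep       : cmH2O, sets the pressure trough of the breath
    resistance : cmH2O.s/L (set to zero here to isolate the elastic term)
    flow       : L/s at the instant of peak pressure
    """
    return elastance * volume + resistance * flow + peep


# Illustrative breath: 0.5 L inspired volume at a PEEP of 15 cmH2O.
true_pip = predicted_pip(elastance=25.0, volume=0.5, peep=15.0)

# Overestimated elastance gives a predicted PIP above the true PIP.
pip_high_e = predicted_pip(elastance=30.0, volume=0.5, peep=15.0)

# Underestimated elastance gives a predicted PIP below the true PIP.
pip_low_e = predicted_pip(elastance=20.0, volume=0.5, peep=15.0)

print(pip_low_e, true_pip, pip_high_e)
```

With PEEP and volume held fixed, the PIP error in this sketch is simply the elastance error multiplied by the inspired volume.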

VILI often occurs at the alveoli, due to mechanical strain caused by high alveolar pressure [23]. Due to low flow rates in the lower bronchioles close to the alveoli and the compliance of the bronchial path, the PIP measured at the airway is generally higher than the pressure experienced at the alveoli. Since airway resistance reduces at high pressures, higher than expected PIP levels are more likely due to increases in elastance at high pressure. Such an outcome is indicative of alveolar distension. Although this clinical protocol did not utilise the end-inspiratory pause needed to approximate alveolar pressure, it utilised a proxy metric (PIP) that is easily available in typical clinical practice.

In this analysis, there were no false negatives and only three false positive predictions from the NARX model over a prediction horizon of one PEEP step. In the three false positive cases, the mean difference between measured and predicted peak pressures was only 1.5 cmH_{2}O. For the NARX false positives that occurred at prediction horizons of two and three PEEP steps, the mean difference between measured and predicted PIP was 2.4 cmH_{2}O in both cases. Thus, the false positives did not actually represent poor prediction. However, the incidence of these false positives may imply that the basis function shapes slightly overestimate the onset of distension.
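The classification of predictions into true and false positives and negatives can be sketched as follows. A positive is taken to mean PIP > 40 cmH_{2}O, as used in this analysis. The function names and the measured and predicted values are hypothetical, chosen only to show one false positive with a small pressure error; they are not study data.

```python
THRESHOLD = 40.0  # cmH2O, the PIP limit used in this analysis


def confusion_counts(measured, predicted, threshold=THRESHOLD):
    """Count (tp, fp, tn, fn) where 'positive' means PIP > threshold."""
    tp = fp = tn = fn = 0
    for m, p in zip(measured, predicted):
        actual = m > threshold    # measured PIP exceeded the limit
        flagged = p > threshold   # model predicted it would exceed the limit
        if actual and flagged:
            tp += 1
        elif actual:
            fn += 1               # missed a dangerous pressure
        elif flagged:
            fp += 1               # warned unnecessarily
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(tn, fp):
    return tn / (tn + fp) if (tn + fp) else float("nan")


# Hypothetical measured and predicted PIP values (cmH2O).
measured = [35.0, 41.2, 38.9, 43.0, 39.5]
predicted = [34.1, 42.0, 40.6, 44.1, 38.7]

tp, fp, tn, fn = confusion_counts(measured, predicted)
print(tp, fp, tn, fn, sensitivity(tp, fn), specificity(tn, fp))
```

In this illustrative set, the single false positive arises from a prediction less than 2 cmH_{2}O above the measured value, which shows how a false positive can coincide with a small prediction error when the measured PIP lies just below the threshold.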

FOM(I) and FOM(II) could not always predict when measured PIP was likely to be greater than 40 cmH_{2}O (Table 1). The FOM assumes a constant elastance (Eq. 1). In reality, elastance changes during inflation according to recruitment and distension effects [24, 25]. In general, distension effects mean that elastance is likely to be higher at pressures close to 40 cmH_{2}O than at lower pressures, as most of the possible recruitment has already been achieved. Thus, in a situation where elastance increases with PEEP, the single elastance term identified in the FOM will lead to underestimated PIP. This led to the high rate of false negatives given by FOM(I) and FOM(II). In contrast, the extrapolation of the NARX elastance shape to higher pressures enables the prediction of a higher elastance at higher PEEP, and thus very few false negatives occurred.
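The underestimation mechanism can be shown with a toy calculation. It assumes a hypothetical elastance that rises linearly with PEEP, which is a deliberate simplification and not the NARX basis-function formulation. It also neglects the resistive pressure term and approximates the single identified elastance by the mean over the identification PEEP steps. All values are illustrative.

```python
def true_elastance(peep, e0=20.0, slope=0.8):
    """Hypothetical elastance (cmH2O/L) that increases with PEEP (cmH2O)."""
    return e0 + slope * peep


TIDAL_VOLUME = 0.5           # L, illustrative
ident_peeps = [10.0, 15.0]   # PEEP steps used for identification (cmH2O)
target_peep = 20.0           # PEEP step being predicted (cmH2O)

# A single constant elastance, approximated here by the mean elastance
# over the lower-pressure identification steps.
single_e = sum(true_elastance(p) for p in ident_peeps) / len(ident_peeps)

# Constant-elastance prediction at the higher PEEP (resistive term neglected).
fom_pip = single_e * TIDAL_VOLUME + target_peep

# PIP implied by the assumed PEEP-dependent elastance at the same PEEP.
actual_pip = true_elastance(target_peep) * TIDAL_VOLUME + target_peep

print(fom_pip, actual_pip, actual_pip - fom_pip)
```

Because the identified elastance reflects only the lower-pressure data, the constant-elastance prediction falls below the PIP implied by the rising elastance, which is the pattern that produces false negatives near the threshold.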

FOM(II) uses identification data from only the single PEEP step preceding the prediction PEEP. This method better represented the clinical use of the FOM. Although less data was used for parameter identification than in FOM(I), FOM(II) was more successful at predicting PIP. Compared to FOM(I), the sensitivity of FOM(II) increased by an average of 0.14 across the three prediction horizons. This occurred because the single elastance of FOM(II) did not need to partially represent the lower elastance of the lower pressure ranges (Fig. 2). However, the predicted peak pressures of FOM(II) tended to be lower than the NARX prediction and had higher residuals (Fig. 3).

The FOM(I) and FOM(II) specificities were higher than the NARX model specificities for each prediction horizon. This was an expected result, as a FOM false positive would only occur if elastance was decreasing when PIP was near 40 cmH_{2}O. When peak pressure is close to 40 cmH_{2}O, elastance is very unlikely to decrease as all recruitable lung volumes are likely to be recruited at lower pressures. Alternatively, FOM false positives would be likely to occur in a decreasing PEEP scenario. A single elastance identified from high pressure data is likely to be too high for an accurate prediction when PEEP is decreased. In this scenario, the NARX would be expected to perform better than the FOM, due to the ability to easily extrapolate a continuous elastance to lower pressures.

The ROC curve analysis (Fig. 5) shows the possible diagnostic equivalence of the models as their diagnostic thresholds are varied. The area under the ROC curve for the NARX model was the largest and yielded a peak sensitivity and specificity of 0.98 and 1.00 when a threshold of 40.3 cmH_{2}O was used. Since the optimum threshold of the NARX is very close to 40 cmH_{2}O, the NARX predicted pressures were very precise in the region of clinical interest. The FOM(I) yielded a maximum diagnostic equivalence at 37.8 cmH_{2}O and had a sensitivity of 0.96 and specificity of 0.93 at this point. The FOM(II) optimum occurred at 39.3 cmH_{2}O and had a sensitivity of 0.98 and a specificity of 0.96 at this point. While the NARX precision and accuracy exceeded the performance of the FOM, it should be noted that the performance of all models would be acceptable within the variance expected in clinical practice.
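A threshold sweep of this kind can be sketched as below. The clinical limit on measured PIP stays fixed at 40 cmH_{2}O while the threshold applied to the predicted PIP is varied. The optimum is selected here with Youden's J (sensitivity + specificity − 1), which is an assumption, since the selection criterion used in this analysis is not restated here. The function names and data are hypothetical.

```python
def roc_points(measured, predicted, thresholds, clinical_limit=40.0):
    """Return (threshold, sensitivity, specificity) for each threshold.

    Assumes the data contain at least one measured PIP above and one
    at or below the clinical limit.
    """
    points = []
    for t in thresholds:
        tp = fp = tn = fn = 0
        for m, p in zip(measured, predicted):
            actual = m > clinical_limit   # ground truth uses the fixed limit
            flagged = p > t               # prediction uses the swept threshold
            if actual and flagged:
                tp += 1
            elif actual:
                fn += 1
            elif flagged:
                fp += 1
            else:
                tn += 1
        points.append((t, tp / (tp + fn), tn / (tn + fp)))
    return points


def best_threshold(points):
    """Pick the point maximising Youden's J (an assumed criterion)."""
    return max(points, key=lambda pt: pt[1] + pt[2] - 1.0)


# Hypothetical measured and predicted PIP values (cmH2O), not study data.
measured = [35.0, 41.2, 38.9, 43.0, 39.5, 40.8]
predicted = [34.1, 42.0, 40.1, 44.1, 38.7, 40.6]
thresholds = [38.0, 39.0, 40.0, 40.5, 41.0, 42.0]

points = roc_points(measured, predicted, thresholds)
best = best_threshold(points)
print(best)
```

An optimum threshold close to the clinical limit indicates that the predicted pressures carry little systematic offset near that limit, whereas an optimum well below it suggests the model tends to underestimate PIP in that region.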

Clinically, once a dangerous PIP has been recorded, PEEP would not be increased further as the risk of distension and VILI would be high. Thus, outside of an initial recruitment manoeuvre, PEEP might not be increased to allow peak pressures beyond the clinician’s chosen threshold. In this analysis, we included all available data from the recruitment manoeuvres and thus contradicted the clinical process. However, this was necessary to establish that the models did not erroneously predict a low PIP at high PEEP. Furthermore, ceasing the evaluation at the first instance of PIP > 40 cmH_{2}O would generate an incorrectly low true positive rate, and thus the sensitivity calculation would be negatively affected.

In this analysis we used data from seven patients on pressure controlled ventilation and three patients on volume controlled ventilation. In pressure controlled data, the PIP is a setting defined by the clinician, and thus it may seem strange to analyse the ability of the model to predict PIP in pressure controlled mode. However, none of the modelling approaches used in this analysis incorporated a priori information on the applied ventilator settings. Instead, the modelling approaches provide a transfer function between pressure and flow. Thus, the ability to predict pressure from flow data remains scientifically valid, even in pressure controlled mode, when the model does not use the ventilator settings as an input. Furthermore, parameterisation must be considered when model residuals are used to assess model performance [26]. In such cases, extraneous parameters could enable improved fitting, but they could also confound the precise identification of physiologically meaningful parameters and thus limit the applicability of the model for prediction or extrapolation of behaviour. However, the level of model parameterisation is insignificant when the ability to precisely extrapolate beyond the identification domain is used to assess model suitability.

This model was designed specifically to capture non-linear elastance effects and uses pressure dependent elastance terms. However, we note that this formulation, like all models, has limitations in certain behaviours. In particular, the model may not perform well in some clinical cases, such as atelectasis or high auto-PEEP. However, the primary goal of this research was to determine the model’s ability to predict the likelihood of exceeding a particular PIP threshold. Hence, it was critical to be able to extrapolate elastance beyond the pressure in the identification sets. Pressure and volume dependent elastance formulations are mathematically equivalent due to the generally monotonic behaviour of pressure and volume. However, integration of flow signals yields drift in volume values that can confound PIP prediction. Pressure signals can be more effectively calibrated across PEEP steps and are thus a much more stable variable for extrapolating elastance.

While the cohort is representative of modern ICU patients, the sample size of nine patients and 16 RMs is relatively small. The method should be tested on a larger cohort under varying disease states and ventilation modes to confirm the results. Additionally, the threshold pressure of 40 cmH_{2}O was somewhat arbitrarily chosen. While this pressure would normally be considered high, it may not necessarily cause over-distension in all patients. However, the findings of this analysis imply that the NARX model’s variable elastance can enable accurate and precise PIP predictions at any pressure threshold chosen with the intention of limiting distension.

The NARX model predicted high peak pressures more accurately than the FOM and, importantly, produced very few false negatives. Zero false negatives occurred at the prediction horizon of one PEEP step. The NARX model may aid clinicians in deciding whether to raise the PEEP setting for individual patients, and help them avoid dangerous PEEP levels that may cause distension and VILI.