### Considerations on the spatial resolution

In the AFM approach, while the best attainable resolution can reach the molecular scale, the main issue is the trade-off between resolution and scope, the latter being the scan size S. In fact, unlike optical and electron microscopy, AFM images are digital maps in which the measured quantity (cantilever deflection or oscillation amplitude) is sampled serially, point by point, at discrete spatial positions. Since the scan speed is relatively low (typically between 4 and 40 μm/s) due to the feedback response time, only a limited number of data-points is acquired, to keep the overall image acquisition time within acceptable values (typically 2 to 20 min). Therefore, setting a given S value means setting the lower limit of achievable resolution to S/√N, where N is the number of acquired data-points (i.e. image pixels). Details smaller than the linear pixel size S/√N are low-pass filtered in the spatial frequency domain and averaged out into the value measured at the considered position. In our case, with S = 30 μm and √N = 512, the smallest roughness features considered had a linear size of ~60 nm.
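The pixel-size limit quoted above can be checked with a few lines of arithmetic. The following Python sketch is purely illustrative; the function name `pixel_size_nm` is hypothetical and is not part of the original analysis.

```python
def pixel_size_nm(scan_size_um: float, pixels_per_line: int) -> float:
    """Return the linear pixel size S / sqrt(N) in nanometres.

    scan_size_um    -- scan size S, in micrometres
    pixels_per_line -- sqrt(N), the number of pixels along one image side
    """
    return scan_size_um * 1000.0 / pixels_per_line


if __name__ == "__main__":
    # Values quoted in the text: S = 30 um, sqrt(N) = 512
    print(f"{pixel_size_nm(30.0, 512):.1f} nm")
```

For S = 30 μm and √N = 512 this gives about 58.6 nm, consistent with the ~60 nm stated in the text.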

In order to verify whether this limit may affect our measurements, we preliminarily analyzed control samples by repeating AFM imaging in the same region with √N increasing from 32 to 1024 in a geometric progression of ratio 2, for a total of 6 data-points. This process was repeated three times in different regions. The obtained sequences of R_{q} all showed similar behavior. A representative case is reported in Fig. 1A, which shows the initial images of one such sequence (those with the lowest √N = 32, 64, and 128), as these best illustrate the effect of insufficient and progressively improved sampling of the surface features. In Fig. 1B the respective R_{q} values of the whole sequence are plotted versus the actual number of sampling points N. As N increases rapidly on doubling √N, a logarithmic scale (base 10) has been used for the corresponding axis.

Two traces have been plotted in Fig. 1B, with R_{q} values measured both from the AFM raw data (empty squares) and from the same data after zero-order line-by-line flattening of the images (filled squares). The latter treatment is usually performed on AFM images to remove artificial inconsistencies among the different lines along the fast scan direction, which are due to drifts of the height offset in the instrument and appear along the slow scan axis (vertical direction in our images). This processing step can also remove a small amount of real R_{q}. Therefore, the 'true' R_{q} value should lie between the two traces in Fig. 1B. In any case, the difference observed between the two traces is below 3%, with lower values for the flattened images (filled squares), as expected.
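As a minimal sketch of the two quantities compared here, the Python code below computes R_{q} on a height map before and after zero-order line-by-line flattening. It makes several assumptions that are not stated in the text: each row of the array is one fast-scan line, zero-order flattening is implemented as subtraction of the per-line mean (some software uses the median instead), and the image is synthetic. The function names `rq` and `flatten_zero_order` are hypothetical.

```python
import numpy as np


def rq(height):
    """RMS roughness: standard deviation of heights about the image mean."""
    z = np.asarray(height, dtype=float)
    return float(np.sqrt(np.mean((z - z.mean()) ** 2)))


def flatten_zero_order(height):
    """Subtract from each scan line (row) its own mean height offset."""
    z = np.asarray(height, dtype=float)
    return z - z.mean(axis=1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data only: random roughness plus an offset drifting line by line
    surface = rng.normal(0.0, 50.0, size=(64, 64))       # heights in nm
    drift = np.cumsum(rng.normal(0.0, 2.0, size=64))     # per-line offset in nm
    image = surface + drift[:, None]
    print("raw Rq:      ", rq(image))
    print("flattened Rq:", rq(flatten_zero_order(image)))
```

With this definition the flattened R_{q} can never exceed the raw one, because subtracting the line means removes the line-to-line part of the height variance, whether it comes from drift or from the real surface. This is in line with the lower values observed for the flattened images.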

The error bars included in Fig. 1B are related to the difference between forward and backward scans of the same surface area. Despite the relatively large errors, lines joining consecutive data-points have been traced as guides to the eye. For both flattened and unflattened image data, these lines show a similar, roughly flat trend with Log(N), and comparable R_{q} values except for the highest N points, for which the error bars of the flattened and unflattened values do not overlap. The error bars appear larger for the two leftmost data-points of both plots in Fig. 1B (i.e. for the two lowest N values). Indeed, when the real surface is properly sampled, the fluctuations are expected to decrease both in spatial frequency (i.e. N spacing) and amplitude (i.e. R_{q} value and the respective standard deviation around its mean). In Fig. 1B the mean R_{q} values of flattened and unflattened images, their difference, and the respective standard deviations (error bar lengths) all reach a minimum at N = 256^{2} (fourth data-point from the left end of the plots). Therefore, our choice of N = 512^{2} for all the AFM images in the subsequent analysis guarantees that no R_{q} information from the analyzed surfaces is lost.
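One simple way to turn a forward/backward pair of scans into a data-point with an error bar is sketched below in Python. The text only says that the error bars are related to the forward/backward difference; taking the semidispersion (half the absolute difference) is an assumption made here for illustration, and the function name is hypothetical.

```python
def mean_and_semidispersion(rq_forward: float, rq_backward: float):
    """Return (mean, semidispersion) of two Rq values from the same area.

    Assumption: the error bar is half the absolute forward/backward difference.
    """
    mean = 0.5 * (rq_forward + rq_backward)
    half_range = 0.5 * abs(rq_forward - rq_backward)
    return mean, half_range


if __name__ == "__main__":
    # Made-up values in nm, for illustration only
    print(mean_and_semidispersion(100.0, 104.0))
```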

### Change in roughness upon air-polishing

Some AFM images of representative specimens are reported in Fig. 2A. The left panel shows the typical surface of a control specimen, whereas the middle and right panels show the surfaces of the composite material after AP at d = 2 mm and t = 5 s with bicarbonate and glycine powders, respectively. Qualitatively, the control is smoother than both treated specimens, and the specimen treated with glycine is smoother than the one treated with bicarbonate. In particular, sodium bicarbonate produced large depressions on the surface (typically 5-10 μm wide), whereas glycine was associated with smaller surface defects (typically 1-2 μm wide). These observations were consistent throughout most combinations of treatment distance and time.

Quantitative analysis of R_{q} values confirmed these findings. The R_{q} values obtained from the AFM images after AP for different t are shown in Fig. 2B. The left half (light color bars) and the right half (dark color bars) of Fig. 2B refer to d = 2 and 7 mm, respectively.

For sodium bicarbonate (light and dark red bars) a trend towards an increase in R_{q} over t can be observed; on the other hand, for glycine (light and dark green bars) the R_{q} value reaches a maximum for the intermediate time t = 10 s, after which it seems to either decrease (for d = 2 mm, left half of Fig. 2B) or remain constant (d = 7 mm, right half of Fig. 2B). Overall, for both d = 2 and 7 mm, R_{q} increased in all groups with respect to the controls; this effect was already evident after only t = 5 s treatment.

The difference in R_{q} between treated specimens and controls was significant at all times for both powders (p < 0.01 for bicarbonate at all times and for glycine at t = 10 s, p < 0.05 for glycine at t = 30 s), with the exception of glycine sprayed for t = 5 s. The application of glycine for t = 5 s was associated with the lowest R_{q} value among all the treated samples, reaching a significant difference in most comparisons (p < 0.05 vs bicarbonate at all times and vs glycine for t = 10 s).

Even though a trend towards increased surface damage with increasing d was observed, as in previous studies [12, 14], this difference was only significant for glycine sprayed for t = 30 s (p < 0.05). This may be partly due to the adjustment of the jet aperture cone at the different spraying distances d, which was made to keep the treated area constant.

Overall, we have confirmed on a composite material used for dental restoration the observation (previously made in the literature only for natural tooth surfaces [11]) that during the AP process glycine powder causes less surface erosion than bicarbonate. Two different patterns in the variation of R_{q} over the treatment time have also been identified for bicarbonate and glycine in our measurements. In principle, and according to a previous study [14], an increase in surface damage may be expected over time at constant distance. This effect has been observed for bicarbonate powder at both distances considered, but not for glycine. In fact, particularly at a spraying distance d = 2 mm, a maximum of damage after AP for t = 10 s has been observed with this powder. Such an effect may be attributed either to a loss in power of the AP device over time when using glycine powder [14], which was not observed during the experimental process, or to the smaller particle size of glycine. Indeed, glycine particles are about four times smaller than sodium bicarbonate particles [11]. On the basis of visual assessment by AFM, we may speculate that the larger bicarbonate particles remove larger portions of the composite surface, thus resulting in a linear increase of R_{q} at the adopted scan size S. On the other hand, glycine may produce smaller but more diffuse surface defects, leading to faster damage kinetics that may result in full coverage of the surface by defects and thus, after removal of a whole material layer, in a smoothing effect within the considered treatment time (t = 30 s).

Concerning the clinical relevance of our measurements, comparison with the existing literature suggests that the RMS roughness values reported in Fig. 2 probably span the range across which bacterial growth may or may not be activated, this threshold typically being found between ~200 nm and 800-2000 nm [24].

### Fractal character of the surfaces

Material surface features usually exhibit a fractal character right after growth [24, 25]; for example, thermally evaporated metal films normally evolve in clusters with a cauliflower-like structure, which is a typical form of fractal geometry [24, 26]. Alternatively, a fractal character can arise as a consequence of surface treatment by physical or chemical methods [20, 27–30]. In the present case, both the as-deposited material (after preparation of the composite slab) and the material that has undergone a surface treatment (namely AP) are candidates for the occurrence of a fractal character.

The goal of the fractal analysis presented here was to search for a possible correlation between AP results and an additional roughness parameter other than simple R_{q}. In the future, after proper in vivo testing of the treated surfaces, this novel measurement will possibly be checked against the clinical results of the obtained surfaces, such as the rate of bacterial growth.

In a Log-Log plot of R_{q}(S) for a fractal surface, the roughness exponent α can be identified as the slope (see subsection "Fractal analysis of the surface"), which for the function R_{q}(S) coincides with the Hurst exponent H [21, 24]. If the surface is fractal in character, this coefficient provides its fractal dimension D, since D = D_{E} - H, where D_{E} is the dimension of the Euclidean space in which the considered object is embedded [21, 25]. In our case D_{E} = 3, as the AFM height images are 3D surfaces z(x, y).

As an illustrative example of our measurements, a subset of a sequence of images with increasing S has been included in Fig. 3A. For each sample, two specimens out of six were chosen, and for each specimen two different regions were imaged. The data-points in the plot are the mean of all the measurements at a given scan size, and the error bars represent ±1σ (standard deviation) ranges around them. As in the preliminary analysis of the effect of image resolution on R_{q} (i.e. Fig. 1B), in all cases two curves have been plotted, one for the raw data and one for the flattened images.

In Fig. 3B the R_{q} values for the control (untreated) specimens are reported as a function of S in a Log-Log plot (log base 10). Clearly, R_{q} increases over the whole S range considered, without reaching a plateau. With some deviation for the lowest S data-point, the measurements are well fitted by a straight line, with a common slope over more than three orders of magnitude in S. Therefore the surface displays a scale-invariant relationship of its topographical features, as reflected in the R_{q}(S) function, and appears to be fractal.

As the slopes from the fits in Fig. 3B are in the range 0.31 ± 0.02, the fractal dimension is D_{control} = 2.69 ± 0.02.
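The step from the fitted slope to the fractal dimension can be sketched in Python as follows. The fit is an ordinary least-squares line through log10(R_{q}) versus log10(S), obtained with `numpy.polyfit`; the text does not specify which fitting routine was used, so this choice is an assumption. The function name `hurst_and_dimension` and the synthetic data are hypothetical and do not reproduce the measured values.

```python
import numpy as np


def hurst_and_dimension(scan_sizes, rq_values, embedding_dim=3):
    """Fit log10(Rq) versus log10(S) with a straight line.

    Returns (H, D): the slope H and the fractal dimension D = D_E - H.
    """
    x = np.log10(np.asarray(scan_sizes, dtype=float))
    y = np.log10(np.asarray(rq_values, dtype=float))
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope), float(embedding_dim - slope)


if __name__ == "__main__":
    # Synthetic, noise-free power law Rq = c * S**0.31 (not measured data)
    s = np.logspace(-1, 2, 8)       # hypothetical scan sizes, um
    rq_synth = 20.0 * s ** 0.31     # hypothetical prefactor, nm
    h, d = hurst_and_dimension(s, rq_synth)
    print(f"H = {h:.2f}, D = {d:.2f}")
```

On this synthetic power law the sketch recovers H = 0.31 and D = 2.69, the same arithmetic that leads from the control slopes to D_{control}.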

Similar sequences of AFM images have been acquired also for selected cases of AP treated samples, and the respective processing has been performed. In order to find possible differences associated with the AP conditions, samples with the most different resulting R_{q} have been selected, according to statistical analysis. The 'least AP damaged' (i.e. lowest R_{q}) sample was the set of specimens treated with glycine at d = 2 mm and t = 5 s, whereas for the 'most AP damaged' (i.e. highest R_{q}) sample the specimens treated with bicarbonate at d = 7 mm and t = 10 s were selected. Same as for the control sample, the measurement has been repeated on two specimens from each sample, and on two different regions for each specimen. For each of the above four sets similar results were obtained, and in Fig. 4A and 4B representative sets for the 'most AP damaged' and for the 'least AP damaged' sample are reported, respectively. In these images the error bars only refer to the semidispersion of the forward and backward images on the same area.

It can be seen that for the 'most AP damaged' sample set (Fig. 4A) the Log-Log R_{q}(S) data-points also follow a roughly linear trend, as for the control sample, which means that a fractal character is preserved throughout the respective AP treatment. In fact, the residual-square correlation coefficients Δ^{2} between data-points and fitting lines in Fig. 4A are still as close to 1 as for the control data fits (see Fig. 3B). The slopes turn out to be 0.61 and 0.63 for the raw and for the flattened data in Fig. 4A, respectively, and 0.6 ± 0.1 for all four sets together, such that the fractal dimension evaluated for this sample is D_{mostDamage} = 2.4 ± 0.1. This is lower than the D_{control} ≅ 2.7 obtained for the control sample, possibly meaning that, while still preserving the fractal character, the specimen surfaces treated in the considered AP conditions have undergone some loss of the complex structure arising from material deposition.

On the contrary, the data-points in the 'least AP damaged' sample set (Fig. 4B) cannot be properly fitted by a straight line. The slope of the fitting straight line would be in the range 0.45 ± 0.15 (for all four sets together), corresponding to D_{leastDamage} = 2.55 ± 0.15, intermediate between D_{mostDamage} and D_{control} and compatible with both of them. However, the Δ^{2} values of 0.91 and 0.93 show that the surfaces no longer exhibit a fractal character over the whole S range considered: this property has been removed by the optimized AP treatment.
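The use of a goodness-of-fit value to decide whether a Log-Log R_{q}(S) trend is linear can be illustrated with the Python sketch below. It assumes that Δ^{2} is the usual coefficient of determination of the straight-line fit, which the text does not state explicitly. Both data sets are synthetic, and the function name `loglog_fit_quality` is hypothetical.

```python
import numpy as np


def loglog_fit_quality(scan_sizes, rq_values):
    """Return (slope, r_squared) for a line fit of log10(Rq) vs log10(S)."""
    x = np.log10(np.asarray(scan_sizes, dtype=float))
    y = np.log10(np.asarray(rq_values, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    ss_res = np.sum((y - (slope * x + intercept)) ** 2)  # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)                 # total sum of squares
    return float(slope), float(1.0 - ss_res / ss_tot)


if __name__ == "__main__":
    s = np.logspace(-1, 2, 10)                  # hypothetical scan sizes, um
    power_law = 20.0 * s ** 0.6                 # synthetic scale-invariant case
    saturating = 100.0 * (1.0 - np.exp(-s))     # synthetic case that levels off
    print(loglog_fit_quality(s, power_law))
    print(loglog_fit_quality(s, saturating))
```

A pure power law gives a coefficient essentially equal to 1, whereas a curve that levels off at large S gives a clearly lower value even though a straight line can still be formally fitted to it. What threshold separates the two cases is a judgment call that this sketch does not settle.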

In a previous work on fractal analysis of worn human dental surfaces [19], an increase in D appeared upon the decay of the surface quality, which was accompanied by an increase in R_{q}. In our case, one could therefore expect conditions of minimum R_{q} to be associated with minimum D. In fact, the 'least AP damaged' sample cannot be compared with the 'most AP damaged' sample in this respect, as the former shows no fractal dimension at all. In turn, when comparing the control sample with the 'most AP damaged' sample, a decrease in D appears, opposite to the increase in R_{q}; however, one should keep in mind that D_{control} arises from material deposition, whereas D_{mostDamage} arises from its later treatment, so the two can hardly be correlated. Evidently, AP destroys the former kind of fractal character and, when not optimized, induces a new, generally uncorrelated fractal character.