1. The challenge of setting peak-detection parameters
Peak detection is the step that turns a raw spectrum into interpretable information — peak positions, intensities, and widths that can be linked to structure, composition, or concentration. A threshold-based detector works by identifying candidate local maxima and keeping only those that satisfy criteria defined by detection parameters. The strengths of this approach are that it is transparent, flexible, and does not require a large volume of training data.
The real difficulty, however, lies in setting the parameters themselves. A widely used threshold-based peak-detection function offers up to eight tunable parameters, covering criteria such as minimum peak height, minimum distance between peaks, peak prominence, and peak width. These parameters are continuous real values, and there is no universal guideline or default that tells you which value a given spectrum requires.
More importantly, the parameters interact with one another: the optimal value of one parameter depends on the values of the others. For example, when prominence is changed, the optimal width may shift as well, so the parameters cannot be tuned one at a time in isolation. Finding suitable values is therefore a continuous, multi-dimensional search that depends on the specific character of each spectrum, which makes manual tuning difficult and impossible to scale when many spectra must be processed.
Among these parameters, the ones with the greatest influence on detection results are:
- Prominence: the vertical distance between a peak’s apex and its lowest contour line. Because it measures how far a peak stands out from its surroundings (relative rather than absolute height), it is effective at separating real peaks from noise.
- Width: the horizontal extent of the peak, measured at a chosen height below the apex and reflecting how broad the peak is. When that height is set at half the prominence, the width corresponds to the full width at half maximum (FWHM) for a peak whose base sits on the baseline.
- Relative height: the fraction of the prominence, counted down from the apex, at which the width is measured, ranging from 0 (at the apex) to 1 (at the base, i.e. the lowest contour line). It does not describe the peak directly but controls where the width is read, and so is coupled to the width parameter.
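The three properties above can be made concrete with a small sketch. It assumes the detector is SciPy's `find_peaks` (the function named in the title of Reference [4]); the synthetic two-peak spectrum, the `gaussian` helper, and the threshold values are illustrative choices, not values from the study.

```python
import numpy as np
from scipy.signal import find_peaks, peak_widths


def gaussian(x, centre, sigma, amplitude):
    """Gaussian line shape used to build the toy spectrum (hypothetical helper)."""
    return amplitude * np.exp(-0.5 * ((x - centre) / sigma) ** 2)


# Synthetic, noise-free spectrum: a tall narrow peak and a short broad one.
x = np.linspace(0.0, 100.0, 2001)
step = x[1] - x[0]
y = gaussian(x, 30.0, 2.0, 1.0) + gaussian(x, 70.0, 4.0, 0.3)

# Illustrative thresholds. `width` is given in samples; rel_height=0.5 means
# the width is read at half the prominence below the apex.
peaks, props = find_peaks(y, prominence=0.05, width=5, rel_height=0.5)

print("positions   :", x[peaks])
print("prominences :", props["prominences"])
print("widths (x)  :", props["widths"] * step)

# The same peaks measured at other relative heights: the closer to the base
# (larger rel_height), the larger the measured width.
widths_by_rel_height = {
    rh: peak_widths(y, peaks, rel_height=rh)[0] * step for rh in (0.25, 0.5, 0.9)
}
for rh, w in widths_by_rel_height.items():
    print(f"rel_height={rh}: widths = {w}")
```

Because the reported width changes with `rel_height`, a width threshold that is sensible at one relative height can be too strict or too loose at another, which is one way the coupling between these two parameters shows up in practice.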
A major strength of the MI-6 model described in the following sections is its ability to propose parameters that detect both high- and low-intensity peaks within the same spectrum while minimizing false positives (noise mistaken for real peaks). Capturing faint but genuine signals while rejecting background noise is a balance that is notoriously difficult to strike with other methods. This capability is crucial for untargeted analysis, where it is impossible to know beforehand which peaks will be important: capturing the true peaks up front reduces the risk that relevant signals are lost before downstream screening and feature selection during post-processing.
How to set these values automatically for each spectrum — without manual tuning — first requires understanding what an “optimal” value actually looks like, which is the subject of the next section.
Fig.1 Schematic illustrations of the three main peak properties used as detection parameters: Prominence, Width, and Relative height. The figure is partly modified from Reference [4].
2. Discovering the “effective parameter space” and building the model
By studying the relationship between parameter values and detection performance, MI-6 found a key point: each spectrum does not have just one workable set of parameters, but many. In other words, near-optimal performance does not sit at a single point in parameter space but spans a region — what we call the effective parameter space. This region has a complex shape and shifts or changes size according to the spectrum’s characteristics, with simpler spectra tending to have broader regions.
Fig.2 (a,c) Different spectra have (b,d) distinct effective parameter spaces within the F1-score maps across combinations of parameters, such as prominence vs. width. The F1-score reflects peak detection performance, with 0 being the worst and 1 being perfect detection. The figure is partly modified from Reference [4].
This finding has two implications. On one hand, it reflects the complexity of the problem, because a single “best” value cannot fully represent the truth, and the relationship between a spectrum’s characteristics and its effective region is non-linear and complex. On the other hand, it is helpful for building a model, because the model does not need to predict an exactly correct value — it only needs to choose a value that falls inside the effective region in a way that matches the character of each spectrum.
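A minimal sketch of how such an F1-score map and its effective region could be computed is given below. It assumes SciPy's `find_peaks` as the detector and uses a synthetic spectrum with known peak centres in place of annotated data; the grid ranges, the matching tolerance, the `f1_score` helper, and the "within 95% of the best score" cut-off are all illustrative assumptions rather than the settings used in Reference [4].

```python
import numpy as np
from scipy.signal import find_peaks

rng = np.random.default_rng(0)

# Toy spectrum with known peak centres (stand-in for an annotated spectrum).
x = np.linspace(0.0, 100.0, 2001)
true_centres = np.array([15.0, 40.0, 60.0, 85.0])
amplitudes = [1.0, 0.6, 0.4, 0.8]
sigmas = [1.0, 1.5, 2.0, 1.2]
y = rng.normal(0.0, 0.02, x.size)
for c, a, s in zip(true_centres, amplitudes, sigmas):
    y += a * np.exp(-0.5 * ((x - c) / s) ** 2)


def f1_score(detected, truth, tol):
    """F1 after one-to-one matching of detected to true positions within tol."""
    detected = np.asarray(detected, dtype=float)
    used = np.zeros(detected.size, dtype=bool)
    tp = 0
    for t in truth:
        dist = np.abs(detected - t)
        dist[used] = np.inf  # each detection may match only one true peak
        if dist.size and dist.min() <= tol:
            used[dist.argmin()] = True
            tp += 1
    if tp == 0:
        return 0.0
    precision = tp / detected.size
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)


# Scan a prominence x width grid (width in samples) and score each combination.
prominences = np.linspace(0.01, 0.5, 15)
widths = np.linspace(1.0, 60.0, 15)
f1_map = np.zeros((prominences.size, widths.size))
for i, p in enumerate(prominences):
    for j, w in enumerate(widths):
        idx, _ = find_peaks(y, prominence=p, width=w, rel_height=0.5)
        f1_map[i, j] = f1_score(x[idx], true_centres, tol=2.0)

# Assumed definition of the effective region: within 95% of the best score.
effective = f1_map >= 0.95 * f1_map.max()
print(f"best F1 = {f1_map.max():.2f}")
print(f"{effective.sum()} of {effective.size} grid points are near-optimal")
```

Any grid point inside `effective` is an acceptable answer, which is why a predictor only has to land in the region rather than reproduce a single best value.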
This understanding led to the development of a random forest (RF) model that learns the relationship between a spectrum’s shape and distinctive characteristics (such as noise level and average peak width) and the suitable parameter values. When given a new spectrum, the model analyses its characteristics and sets the parameters automatically. The trends the model learns are also physically sensible: for example, it sets a higher prominence for noisier spectra in order to suppress noise-driven maxima, and sets a width consistent with the peak breadth of that spectrum, indicating that the model is learning a meaningful relationship between spectral characteristics and parameters rather than guessing at random.
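The idea of mapping spectral characteristics to detection parameters can be sketched as follows, assuming that RF denotes a random forest and using scikit-learn's `RandomForestRegressor` as a stand-in. Everything specific here is hypothetical: the two hand-made features, the toy spectra, and above all the labelling rule (prominence proportional to the noise level, width equal to half the FWHM), which merely substitutes for parameter labels that would in practice come from the effective parameter space of each training spectrum.

```python
import numpy as np
from scipy.signal import find_peaks, peak_widths
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
x = np.arange(1000)
centres = (200, 500, 800)


def make_spectrum(noise_sigma, peak_sigma):
    """Three unit-height Gaussian peaks plus white noise (toy data)."""
    y = sum(np.exp(-0.5 * ((x - c) / peak_sigma) ** 2) for c in centres)
    return y + rng.normal(0.0, noise_sigma, x.size)


def features(y):
    """Two hypothetical descriptors: noise estimate and mean rough peak width."""
    d = np.diff(y)
    # Robust noise estimate from the median absolute deviation of differences.
    noise = 1.4826 * np.median(np.abs(d - np.median(d))) / np.sqrt(2)
    rough, _ = find_peaks(y, prominence=0.5)  # coarse first pass
    width = peak_widths(y, rough, rel_height=0.5)[0].mean() if rough.size else 0.0
    return [noise, width]


# Training set with HYPOTHETICAL labels (not the labelling used in the study).
X, Y = [], []
for _ in range(200):
    noise_sigma = rng.uniform(0.005, 0.05)
    peak_sigma = rng.uniform(5.0, 20.0)
    X.append(features(make_spectrum(noise_sigma, peak_sigma)))
    Y.append([6.0 * noise_sigma, 0.5 * 2.3548 * peak_sigma])

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, Y)

# Predict parameters for two unseen spectra that differ only in noise level.
noisy = make_spectrum(0.04, 10.0)
clean = make_spectrum(0.01, 10.0)
(prom_noisy, width_noisy), (prom_clean, width_clean) = model.predict(
    [features(noisy), features(clean)]
)
print(f"noisy: prominence={prom_noisy:.3f}, width={width_noisy:.1f}")
print(f"clean: prominence={prom_clean:.3f}, width={width_clean:.1f}")

# Run the detector with the predicted parameters.
idx, _ = find_peaks(noisy, prominence=prom_noisy, width=width_noisy, rel_height=0.5)
print("detected peak positions:", x[idx])
```

With this toy labelling the regressor reproduces the trend described above, proposing a higher prominence for the noisier spectrum, but only because the labels were constructed that way.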
3. Performance and benchmarking
On generated spectra, the MI-6 model reaches an F1-score (the harmonic mean of precision and recall, which balances false positives against false negatives) of about 0.94, whereas running the detector with its default parameters gives nearly unusable results: with no criteria set, it accepts every local maximum as a peak, producing a large number of false positives. This comparison shows that choosing parameters appropriate to the spectrum is a decisive factor in detection quality.
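The gap between default and tuned parameters is easy to reproduce on a toy example. The sketch below assumes SciPy's `find_peaks` as the detector; the synthetic spectrum, the matching tolerance, the `precision_recall_f1` helper, and the hand-picked "tuned" values are illustrative assumptions and do not reproduce the MI-6 model or its reported scores. F1 is computed as 2PR / (P + R), where P is precision and R is recall.

```python
import numpy as np
from scipy.signal import find_peaks


def precision_recall_f1(detected, truth, tol):
    """Score detections against true positions with one-to-one matching."""
    detected = np.asarray(detected, dtype=float)
    used = np.zeros(detected.size, dtype=bool)
    tp = 0
    for t in truth:
        dist = np.abs(detected - t)
        dist[used] = np.inf  # a detection cannot be counted twice
        if dist.size and dist.min() <= tol:
            used[dist.argmin()] = True
            tp += 1
    precision = tp / detected.size if detected.size else 0.0
    recall = tp / len(truth)
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Toy noisy spectrum with four known peaks.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 100.0, 2001)
true_centres = np.array([15.0, 40.0, 60.0, 85.0])
y = rng.normal(0.0, 0.02, x.size)
for c, a, s in zip(true_centres, [1.0, 0.6, 0.4, 0.8], [1.0, 1.5, 2.0, 1.2]):
    y += a * np.exp(-0.5 * ((x - c) / s) ** 2)

# No criteria set: every local maximum, including noise, is returned.
default_idx, _ = find_peaks(y)
# Hand-picked illustrative criteria (width in samples).
tuned_idx, _ = find_peaks(y, prominence=0.2, width=5)

p_def, r_def, f1_def = precision_recall_f1(x[default_idx], true_centres, tol=2.0)
p_tun, r_tun, f1_tun = precision_recall_f1(x[tuned_idx], true_centres, tol=2.0)
print(f"default: {default_idx.size} peaks, P={p_def:.3f} R={r_def:.3f} F1={f1_def:.3f}")
print(f"tuned  : {tuned_idx.size} peaks, P={p_tun:.3f} R={r_tun:.3f} F1={f1_tun:.3f}")
```

With default settings recall can look fine while precision collapses, because the true peaks are buried among hundreds of noise maxima; F1 penalises exactly that imbalance.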
More significantly, the MI-6 model was trained on a small dataset — about 500 times smaller than the deep-learning model it was compared against (2,000 spectra versus roughly one million simulated spectra). Both models were then validated on more than 100 unseen experimental spectra not encountered during training, spanning XRD, Raman, and GC–MS, which have different characteristics (for example, XRD shows sharp peaks while Raman shows broader, often overlapping peaks). Peak positions were established as ground truth through manual analysis by expert researchers.
Fig.3 Peak detection results in experimental spectra of (a) XRD, (b) GC–MS, and (c) Raman, comparing peaks identified by human annotation (True peaks), our method, and the CNN model. Triangles below each spectrum mark detected peaks. The figure is partly modified from Reference [4].
Table 1. Peak detection performance (precision, recall, and F1-score) of the MI-6 model and the deep-learning baseline on unseen experimental XRD, GC–MS, and Raman spectra, evaluated against expert-annotated ground truth.
| Spectral type | Method | Precision | Recall | F1 |
|---|---|---|---|---|
| XRD | MI-6 model | 0.925 | 0.984 | 0.952 |
| XRD | Deep learning | 0.515 | 0.252 | 0.327 |
| GC–MS | MI-6 model | 0.965 | 0.929 | 0.946 |
| GC–MS | Deep learning | 0.378 | 0.176 | 0.236 |
| Raman | MI-6 model | 0.890 | 0.915 | 0.893 |
| Raman | Deep learning | 0.092 | 0.140 | 0.105 |
The MI-6 model generalises well across all three techniques despite their very different characteristics and despite being trained on far less data. Its detected peak positions also agree closely with those established by expert researchers across all three spectral types — showing that, for the specific task of locating peaks, the model reproduces expert annotations reliably on this test set.
However, the comparison should be viewed fairly; the deep-learning model's low scores primarily reflect poor cross-domain transfer. In contrast, the clear strength of the MI-6 approach lies in its use of an interpretable detector, which enables it to transfer effectively across different spectral types. Because the experimental test set is modest in size and performance varies across techniques, these absolute numbers are best treated as indicative. Looking ahead, we expect performance to improve even further, as our model can be easily updated and redesigned for specific tasks and highly specific spectral datasets.
4. Conclusion
This approach preserves the interpretability and flexibility of threshold-based detection while removing its main practical bottleneck — per-spectrum parameter tuning — making it well suited to high-throughput analysis where detected peaks feed downstream property-prediction models. The effective-parameter-space view also points out that, in threshold-based detection, the correct mental model is a viable region of parameter combinations rather than a single optimal value, and the same idea extends naturally to more complex data such as two-dimensional spectra.
For an overview of how peak-based and full-spectrum methods fit into materials-informatics workflows, see the companion article “Spectral Data Analysis in Materials Science”; for the deep-learning side of peak detection, see “Deep Learning for Spectral Peak Detection”.
References
[1] A. Kensert et al., Convolutional neural network for automated peak detection in reversed-phase liquid chromatography, J. Chromatogr. A 1672 (2022) 463005. https://doi.org/10.1016/j.chroma.2022.463005
[2] B. Lafuente et al., The power of databases: the RRUFF project, Highlights in Mineralogical Crystallography (2015) 1–30.
[3] J. Sundberg, Extended characterization of petroleum aromatics by off-line LC-GC-MS (dataset), 2021. https://doi.org/10.5281/zenodo.5121065
[4] T. Yoongsomporn, S. Kanharattanachai, P. Lueangratana, T. Yamashita, C.H. Chen, M. Irie, C. Jirayupat, Automated spectral peak detection with machine learning: Parameter optimization and effective parameter space analysis with SciPy's find_peaks, Chemometrics and Intelligent Laboratory Systems 265 (2026) 105651. https://doi.org/10.1016/j.chemolab.2026.105651


















