New Publication: Extracting True Virus Spectra from SERS Data Using Deep Learning—A Novel Strategy for SERS Spectral Processing
- Yiping Zhao
- 2 days ago
We’re thrilled to announce the publication of our latest research in ACS Sensors, titled: "Extracting True Virus SERS Spectra and Augmenting Data for Improved Virus Classification and Quantification." 👉 Read it here
This work tackles one of the most persistent challenges in label-free virus diagnostics: the interference from background signals in Surface-Enhanced Raman Spectroscopy (SERS) that obscures the analyte’s true spectral features. Our team introduces a robust deep learning framework capable of extracting clean virus-specific spectra from complex biological backgrounds and using them to generate augmented datasets for highly accurate virus classification and quantification.
🔍 Why It Matters
In many real-world settings—such as saliva or inactivated viral transport media—target virus signals are buried under background noise. These interfering spectra not only make peak identification difficult, but also raise the detection limit and hinder machine learning model training. Our method enables:
Accurate extraction of virus-specific SERS spectra, even at low concentrations.
Effective augmentation of training datasets for machine learning without the need for extensive new measurements.
Reliable classification and quantification of respiratory viruses, even in complex media like human saliva.
🧠 The Technical Innovation
We developed a dual neural network framework that models each measured SERS spectrum as a linear combination of a virus signal and background signal, with an added noise term. By training the network on spectra collected at multiple concentrations, we extract what we call the Extracted True Virus Spectrum (ETVS) and the corresponding concentration coefficients.
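One way to write this model in symbols is shown here. The notation is chosen for illustration and is an assumption; the paper may use different symbols or a slightly different parameterization.

$$
S_i(\nu) = a_i \, V(\nu) + b_i \, B(\nu) + \varepsilon_i(\nu)
$$

Here $S_i(\nu)$ is the $i$-th measured spectrum at Raman shift $\nu$, $V(\nu)$ is the virus spectrum shared across measurements (the ETVS), $B(\nu)$ is the background spectrum, $a_i$ and $b_i$ are the coefficients of the two components (with $a_i$ playing the role of the concentration coefficient), and $\varepsilon_i(\nu)$ is the noise term.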
These clean ETVS can then be used to reconstruct realistic synthetic spectra across a range of virus concentrations by introducing controlled spectral noise, enabling a powerful data augmentation strategy.
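As a rough illustration of this augmentation idea (not the implementation used in the paper), the sketch below scales a spectrum by a set of coefficients and adds random noise. The function name `augment_spectra`, its parameters, the Gaussian noise model, and the toy single-peak spectrum are all assumptions made for the example.

```python
import numpy as np

def augment_spectra(etvs, coefficients, noise_std=0.01, n_per_coeff=5, seed=0):
    """Generate synthetic spectra as coefficient * etvs + Gaussian noise.

    Hypothetical helper: the Gaussian noise model and all parameter
    names are assumptions for illustration, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    etvs = np.asarray(etvs, dtype=float)
    spectra, labels = [], []
    for coeff in coefficients:
        for _ in range(n_per_coeff):
            # Controlled spectral noise, drawn independently per point.
            noise = rng.normal(0.0, noise_std, size=etvs.shape)
            spectra.append(coeff * etvs + noise)
            labels.append(coeff)
    return np.stack(spectra), np.array(labels)

# Toy stand-in for an extracted spectrum: one Gaussian peak at 1000 cm^-1.
wavenumbers = np.linspace(400, 1800, 200)
toy_etvs = np.exp(-0.5 * ((wavenumbers - 1000) / 20) ** 2)

X, y = augment_spectra(toy_etvs, coefficients=[0.1, 0.5, 1.0], n_per_coeff=4)
print(X.shape, y.shape)  # (12, 200) (12,)
```

Each row of `X` is one synthetic spectrum and the matching entry of `y` is the coefficient used to generate it, so the pair can serve as training data for a downstream model.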
⚙️ Machine Learning Model Performance
Using the augmented spectra, we trained two XGBoost models—one for virus type classification and another for concentration regression. The results are remarkable:
92.3% overall classification accuracy for 12 respiratory viruses in water.
R² > 0.95 for concentration prediction across all virus types.
When tested on real saliva-based virus spectra:
91.9% classification accuracy
R² > 0.93 for quantification
These results indicate that the ETVS extracted from virus-in-water spectra generalize effectively to complex biological environments such as saliva.
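For readers less familiar with the two metrics quoted above, here is a minimal sketch of how classification accuracy and R² are computed from predictions. The function names and the toy labels and values are assumptions for illustration; they are not data from the study.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Toy values only (hypothetical, not results from the paper).
acc = accuracy(["H1N1", "H3N2", "Ad5", "Ad5"], ["H1N1", "H1N1", "Ad5", "Ad5"])
r2 = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(acc)           # 0.75
print(round(r2, 2))  # 0.98
```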
🧪 Viruses Studied
We analyzed 12 respiratory viruses, including:
SARS-CoV-2 B.1
H1N1, H3N2 (influenza)
RSV-A2, RSV-B1
HMPV-A, HMPV-B
CoV-OC43, CoV-229E, CoV-NL63
Ad5
Flu B
This is one of the most comprehensive demonstrations of label-free SERS-based viral classification and quantification to date.
💡 Why This Approach Is Different
While prior studies rely on SERS tags or biochemical probes, our method is completely label-free. It reduces experimental burden, improves data utility through augmentation, and is readily adaptable to point-of-care systems. Notably:
The limit of detection (LOD) ranges from ~17 to 62 PFU/mL, which is in a clinically relevant range.
No biochemical functionalization of the SERS substrate is required.
Spectral augmentation enables robust machine learning with minimal data collection.
🔭 Looking Forward
This approach opens exciting new possibilities:
Integration into portable Raman-based diagnostic systems
Application to bacteria, proteins, and small molecules in bodily fluids or food matrices
Development of real-time, AI-enhanced diagnostic tools for healthcare and environmental monitoring
🧑‍🔬 A Collaborative Effort
This interdisciplinary project brought together expertise from statistics, physics, engineering, and computer science. Congratulations to the entire team!
📌 Citation: Liu, Y.; Yang, Y.; Lu, H.; Cui, J.; Chen, X.; Ma, P.; Zhong, W.; Zhao, Y. Extracting True Virus SERS Spectra and Augmenting Data for Improved Virus Classification and Quantification. ACS Sensors, 2024. DOI: 10.1021/acssensors.4c03397
