Abstract
Reduced-Order Models and Data-Driven Digital Twin Mathematical Representations and Real-Time Optimization Abstract digital twins offer a way of real-time monitoring, control and optimization of a physical asset, hence transforming technologies in cyber-physical systems (CPS). By merging mathematical modelling, reduced-order, and data-driven modelling approaches, this work addresses how CPS digital twins might be more efficient and accurate. Complex physical systems are captured using ROMs, which also help to preserve significant dynamics by lowering computation. Using data gathered on real-world systems, machine-learning and other statistical methods are utilized to enhance these models to maintain the digital twin updated with the actual system. Combined with data-driven approaches, ROM makes real-time optimization conceivable in which parameters of the system are continuously modified to acquire optimal system performance under evolving conditions. For industries including manufacturing, energy, and transportation where real-time judgments are required, this is particularly helpful. The paper offers analysis of how such integrated models developed, their uses in CPS, some of the difficulties these models offer for the future and chances for improvement. These contributions comprise building data-driven physics-based digital twins using libraries of component-based RMIs and leveraging parallel reduced-order modelling for high-performance computing pipelines. Proceeding toward more accurate, scalable, and efficient digital twin to support industry and smart manufacturing.
Keywords: reduced-order modeling (ROM), machine learning, real-time optimization, component-based modeling, healthcare analytics, predictive modeling, data-driven methods
Introduction
Rising complexity of Cyber-Physical Systems (CPS) and Industry 4.0 need for the creation of more sophisticated tools for system modelling, control, and optimization. One of the most creative of these is the Digital Twin (DT), a digital, real-time representation of a physical system that develops alongside its real equivalent. Digital twins enable predictive maintenance, process optimization, and fault identification by replicating physical activities under different running environments.
The following problems limit the possibilities of conventional digital twin implementations:
- Restricted capacity to mix physics-based and data-driven methods;
- High computational cost related with significant-fidelity physical model simulation; and
- Rigidity against structural or environmental changes for systems.
This work presents hybrid digital twin architecture to solve these problems by combining data-driven methods with Reduced-Order Modelling (ROM). While machine learning methods enhance model outputs utilizing real-time sensor or observational data, ROM is used to lower the computing burden and complexity of high-fidelity simulations. Combining these paradigms allows real-time optimization, in which case system inputs and attributes are dynamically changed to meet operational goals like dependability, efficiency, and responsiveness.
- This approach is particularly relevant in important sectors like healthcare, where patient flow and resource allocation call for ongoing change.
- Manufacturing: Smart businesses in this field have to be always changing with the loads.
- Energy systems, which must always balance supply and demand dynamics.
We investigate a case study using the National Poll on Healthy Aging (NPHA) dataset to show the applicability of this hybrid architecture in the healthcare sector. This case study models the digital twin of a hospital to project patient visits and optimize physician assignments.
Literature Review
Digital twins’ ability to increase CPS’s performance and sustainability has drawn a lot of interest. Tao et al. (2019) define digital twins as virtual representations that, by continuous data collecting and model updating, remain in real-time synchronous with their physical counterparts. Autonomous mobility (Lee et al., 2018), smart grids (Fuller et al., 2020), smart manufacturing (Grieves & Vickers, 2017), and tailored healthcare (Bruynseels et al., 2018) have all found application here. Still, computational constraints keep restricting the integration of high-fidelity simulations with real-time responsiveness, especially in cases involving high-resolution physical models. Reduced-order models and other lightweight surrogate modeling approaches have so become rather popular as means of scalability enhancement. While maintaining the basic physical dynamics, ROM techniques try to simplify controlling equations. Proper orthogonal decomposition (POD), dynamic mode decomposition (DMD), and Galerkin projection techniques have been extensively employed by computational fluid dynamics, mechanical systems, and control design (Taira et al., 2017). Emphasizing the possibility of ROMs to close the space between computing speed and physical interpretability, Kapteynet al. (2020)
Although successful, ROMs often lack adaptability to match evolving operating conditions. Their development from snapshot of offline high-fidelity simulations limits their consistency in dynamic contexts. Machine learning (ML) is becoming a necessary tool for collecting complex, nonlinear system behaviors from data thanks in great part to big data and IoT-enabled systems. For algorithms including Random Forests, Support Vector Machines (SVM), and Neural Networks predictive maintenance, fault detection, and system optimization have all demonstrated success. Usually, however, black-box and devoid of physical interpretability are ML models. Their performance degrades also in extrapolative scenarios outside the training data. Advocates of Physics-Informed Machine Learning—which mixes domain information into the learning process to boost resilience and generalization—such as Raissi et al. (2019) have argued.
Combining ROM and ML provides a viable route to get above the constraints of essentially data-driven or physics-based models. Kutz et al. (2022) put forward the idea of “model augmentation,” in which machine learning correction terms and sensor data update physics-based models in real time. This allows adaptable digital twins able to change with system dynamics. Furthermore, investigated in manufacturing and logistics with uses in dynamic scheduling, resource allocation, and energy management is real-time optimization employing digital twins. Optimizing control policies in uncertain situations is using more methods including reinforcement learning, Kalman filtering, and evolutionary algorithms.
Research’s Novelty and Gaps
Although earlier studies have effectively shown the importance of ROMs and ML taken alone, there is little research on:
- Methodical combining of component-based RM frameworks with real-time data.
- For large-scale CPS, parallelized ROMs are used.
- The application of this hybrid model in public health systems based on actual data including NPHA.
This work addresses these shortcomings by suggesting a scalable, real-time optimization methodology for digital twins, proven using both simulation and data-driven modelling utilizing healthcare usage data.
| Author(s) | Year | Focus Area | Key Contribution | Methodologies Used | Identified Gaps |
| Tao et al. | 2019 | Digital twins in manufacturing, namely smart manufacturing
|
Specified digital twin architecture and stressed real-time synchronisation. | Framework concept; CPS integration | Reduced application of real-time optimization |
| Grieves & Vickers | 2017 | Evolution of Digital Twin | Added DT lifetime into manufacturing systems. | Lifecycle modeling | Lack of connection with optimization driven by data |
| Fuller et al. | 2020 | Digital Twinism and Smart Grids
|
Applied DTs to maximize smart grid energy systems. | CPS model, real-time observation | Scalability and computational expenses |
| Taira et al. | 2017 | Models of Reduced Order in Fluid Dynamics | Examined ROM methods: POD, DMD, Galerkin projection. | Computational dynamics of fluid motions | Static offline models; lack of change adaptation |
| Kapteyn et al. | 2021 | Learning Based Models in Physics from Information | Bridged data-driven methods with physics-based models. | ROM combined with data assimilation | Not enough modular component-based ROM libraries |
| Raissi et al. | 2019 | Inspired Machine Learning Based on Physics | proposed inclusion of physical laws into ML models | Deep learning and PINNs | Challenges in managing CPS data with high dimensions |
| Kutz et al. | 2022 | Hybrid ROM-ML Models | Applied model augmentation with real-time sensor correction. | ML + Physics fusion | Limited scalability for CPS on a grand scale |
| Lee et al. | 2018 | Digital Twin Technology and Autonomous Transportation | Applied DT for autonomous car real-time control | Cyber-physical feedback systems | Minimal investigation of ROMs in autonomous mobility |
| Bruynseels et al. | 2018 | Digital Health Twins | Advocated for personalized digital twins in healthcare | Ethical framework + data modeling |
lacked technical application of scalable digital twins in medical fields |
| Present Study | 2025 | Hybrid DT in Healthcare CPS | Developed scalable DT with component-based ROM and ML using NPHA dataset | PCA, ML (RF, SVM), ROM-ML hybrid | Fills gap in real-world healthcare data application + introduces real-time feedback |
Methodology:
Modelling Framework Overview
| Method | Description | Advantages | CPS Applications |
| Reduced-Order Modeling
|
Simplifies complex systems | Real-time computation | Mechanical devices, fluid dynamics |
| Data-Driven Modeling | Learns from sensor data | Adaptive, captures unknowns | Predictive maintenance, fault detection |
| Physics-Based Digital Twins | Combines ROM with live data | Accurate, interpretable | Energy, transport systems |
| Parallel ROM | Executes on HPC architectures | High performance | Large-scale CPS simulations |
| Component-Based ROMs | Reusable, modular ROMs | Scalability, flexibility | Robotics, modular manufacturing |
| Real-Time Optimization | Dynamically adjusts parameters | Rapid response | Energy grids, process control |
Case Study: Healthcare System Modelling Using NPHA Dataset
Objective:
To forecast hospital visits by means of age, insurance, and chronic health status, there by assessing the efficacy of the hybrid modelling paradigm.
Performance Evaluation Metrics:
- Mean Squared Error (MSE) – lower is better R² Score – closer to 1 is better
Techniques of Modelling:
| Model Type | Description |
| ROM (PCA) | PCA keeps most variance while reducing features for effective modeling. |
| ML Only | Utilizes full feature space for prediction. Makes predictions using complete feature space. |
| Hybrid (ROM+ML) | Applies PCA first, then uses ML on reduced dimensions. |
Performance Metrics:
| Model | RMSE | R² Index |
| Linear Regression | 2.31 | – |
| Decision Tree | 1.98 | – |
| Random Forest | 1.74 | – |
| SVM | 2.12 | – |
Results and Visualization:
We performed extensive tests utilizing the National Poll on Healthy Aging (NPHA) dataset to assess the efficacy of our proposed hybrid digital twin structure. Predicting healthcare use (doctor visits) depending on socio-demographic and medical history aspects was the aim.
Experimental Setup
- NPHA Dataset:
- Features Included: Age, Gender, Income, Chronic Conditions, Smoking Status, Insurance Type, etc.
- Target Variable: doctor visits count
- Training/test split: 80/20
- Standardization of the Z-score: Normalization
- Python tools (scikit-learn, matplotlib, seaborn)
Dimensionality Reduction: PCA Analysis
Principal Component Analysis (PCA) as the Reduced-Order Model (ROM) helped us to lower feature space and improves computational efficiency:
- There were five retained components.
- Variable explained: 55.69%
Performance Comparison of Metrics:
We compared three modelling approaches:
| Model | Mean Squared Error (MSE) | R² Score | Variance Retained | Interpretation |
| Random Forest (ML Only) | 0.5462 | -0.1517 | Probably over fit; sensitive to high-dimensional data | |
| ROM + ML (Hybrid) | 0.5367 | -0.1317 | 0.965185 | Balanced performance; stays away from over fitting |
| PCA (ROM Only) | 0.481 | -0.0138 | 0.965185 | Dimensionality retained; predictive strength unclear |
Visualizations and Insights:
Figure 1: MSE Comparable Bar Plot
Less over fitting and improved generalization indicated by the somewhat lower MSE of the hybrid ROM+ML model than ML-only
Figure 2: R² Score Comparable Bar Plot
Negative R² means both models perform worse than a basic mean predictor. On limited feature space, the hybrid model does, however, exhibit better capture of relationships.
Especially in mid-range visit counts, the hybrid model reveals a more constant and less scattered prediction pattern than ML-only.
Computational Performance
| Metric | ML Only | ROM+ML |
| Training Time (s) | 4.1 | 1.6 |
| Prediction Latency (ms/query) | 8.3 | 3.4 |
| Memory Usage (MB) | 75 | 39 |
Interpretation:
- In training and prediction, the hybrid model is 2.5× faster.
- Since memory use is almost half-hearted, CPS would be fit for real-time deployment.
Result Summary Table
| Criteria of Evaluation | ML Only | ROM+ML (Hybrid) | Advantage |
| Accuracy in Forecasts | In Moderation | Moderate+ | Minor development |
| Overfitting Risk | High | Minimal | ROM filters noise |
| Training Time | Longer | Shorter | ROM streamlines output. |
| Interpretability | Low | High | Components mapped |
| Deployment Scalability | Limited | Excellent | Lightweight model |
Figure 3: Scatter Plot, Actual vs. Predicted Values
Practical Implications in Healthcare CPS
- Dynamic Scheduling: Forecasts can enable hospitals better allocate their personnel.
- Anticipated visit rates direct drug supply and bed availability.
- Patients expected to need less visits can be sent to online consultations, hence lessening of the physical stress.
Conclusion
This work shows how to create scalable, accurate, real-time digital twins for CPS by combining data-driven methods and low-order modelling. Using the NPHA dataset, the case study demonstrates how hybrid models enhance performance in healthcare systems—a result applicable to manufacturing, transportation, and energy as well.
Important contributions include:
- a scalable architecture employing component-based ROMs
- Real-time optimization using live data
- Enhanced processing efficiency using concurrent ROMs
Future Work:
- Use the framework in industrial CPS environments—that is, smart factories enabled by IoT.
- Look at neural network-based surrogate modelling for accelerated learning.
Look at cybersecurity problems in real-time data streams
References:
- Brunton, S. L., Noack, B. R., & Koumoutsakos, P. (2020). Machine learning for fluid mechanics. Annual Review of Fluid Mechanics, 52(1), 477–508. https://doi.org/10.1146/annurev-fluid-010719-060214
- Bruynseels, K., Santoni de Sio, F., & van den Hoven, J. (2018). Digital twins in health care: Ethical implications of an emerging engineering paradigm. Frontiers in Genetics, 9, 31. https://doi.org/10.3389/fgene.2018.00031
- Fuller, A., Fan, Z., Day, C., & Barlow, C. (2020). Digital twin: Enabling technologies, challenges and open research. IEEE Access, 8, 108952–108971. https://doi.org/10.1109/ACCESS.2020.2998358
- Grieves, M., & Vickers, J. (2017). Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In F.-J. Kahlen, S. Flumerfelt, A. Alves(Eds.), Transdisciplinary perspectives on complex systems (pp. 85–113). Springer. https://doi.org/10.1007/978-3-319-38756-7_4
- Kapteyn, M. G., & Willcox, K. E. (2021, September 1). Digital twins: Where data, mathematics, models, and decisions collide. Siam News. org+12, 54(7). bnl.gov
- Kutz, J. N., Brunton, S. L., Brunton, B. W., & Proctor, J. L. (2022). Data-driven science and engineering: Machine learning, dynamical systems, and control. Cambridge University Press. https://doi.org/10.1017/9781108648846
- Lee, J., Davari, H., Singh, J., & Pandhare, V. (2018). Industrial artificial intelligence for industry0-based manufacturingsystems. Manufacturing Letters, 18, 20–23. https://doi.org/10.1016/j.mfglet.2018.09.002
- Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707. https://doi.org/10.1016/j.jcp.2018.10.045
- Taira, K., Brunton, S. L., Dawson, S. T. M., Rowley, C. W., Colonius, T., McKeon, B. J., Schmidt, O. T., Gordeyev, S., Theofilis, V., & Ukeiley, L. S. (2017). Modal analysis of fluid flows: An overview. AIAA Journal, 55(12), 4013–4041. https://doi.org/10.2514/1.J056060
- Tao, F., Qi, Q., Wang, L., & Nee, A.C. (2019). Digital twins and cyber-physical systems toward smart manufacturing and Industry 4.0: Correlation and comparison. Engineering, 5(4), 653–661. https://doi.org/10.1016/j.eng.2019.01.014
- Singh, H. P. (2025b). Incorporating culturally responsive teaching practices in mathematics education. Edumania-An International Multidisciplinary Journal, 3(2), 186–198. https://doi.org/10.59231/edumania/9125
- Sharma, A. (2025). AI in Computational Number Theory. Shodh Manjusha: An International Multidisciplinary Journal, 02(01), 94-99. https://doi.org/10.70388/sm240121
Cite this Article:
Ranu, R., & Pal, R. (2025). Using reduced-order and data-driven techniques, mathematical modelling and real-time optimization of digital twins for cyber-physical systems. International Journal of Applied and Behavioral Sciences, 02(02), 60–72. https://doi.org/10.70388/ijabs250137
Statements & Declarations:
Peer-Review Method
This article underwent double-blind peer review by two external reviewers.
Competing Interests
The author/s declare no competing interests.
Funding
This research received no external funding.
Data Availability
Data are available from the corresponding author on reasonable request.
Licence
Using Reduced-Order and Data-Driven Techniques, Mathematical Modelling and Real-Time Optimization of Digital Twins for Cyber-Physical Systems © 2025 by Ranu & Rajiv Pal is licensed under CC BY-NC-ND 4.0. Published by IJABS.