International Journal of Applied and Behavioral Sciences (IJABS)

Using Reduced-Order and Data-Driven Techniques, Mathematical Modelling and Real-Time Optimization of Digital Twins for Cyber-Physical Systems

Abstract

Reduced-Order Models and Data-Driven Digital Twin Mathematical Representations and Real-Time Optimization Abstract digital twins offer a way of real-time monitoring, control and optimization of a physical asset, hence transforming technologies in cyber-physical systems (CPS). By merging mathematical modelling, reduced-order, and data-driven modelling approaches, this work addresses how CPS digital twins might be more efficient and accurate. Complex physical systems are captured using ROMs, which also help to preserve significant dynamics by lowering computation. Using data gathered on real-world systems, machine-learning and other statistical methods are utilized to enhance these models to maintain the digital twin updated with the actual system. Combined with data-driven approaches, ROM makes real-time optimization conceivable in which parameters of the system are continuously modified to acquire optimal system performance under evolving conditions. For industries including manufacturing, energy, and transportation where real-time judgments are required, this is particularly helpful. The paper offers analysis of how such integrated models developed, their uses in CPS, some of the difficulties these models offer for the future and chances for improvement. These contributions comprise building data-driven physics-based digital twins using libraries of component-based RMIs and leveraging parallel reduced-order modelling for high-performance computing pipelines. Proceeding toward more accurate, scalable, and efficient digital twin to support industry and smart manufacturing.

Keywords: reduced-order modeling (ROM), machine learning, real-time optimization, component-based modeling, healthcare analytics, predictive modeling, data-driven methods

Introduction

Rising complexity of Cyber-Physical Systems (CPS) and Industry 4.0 need for the creation of more sophisticated tools for system modelling, control, and optimization. One of the most creative of these is the Digital Twin (DT), a digital, real-time representation of a physical system that develops alongside its real equivalent. Digital twins enable predictive maintenance, process optimization, and fault identification by replicating physical activities under different running environments.

The following problems limit the possibilities of conventional digital twin implementations:

  • Restricted capacity to mix physics-based and data-driven methods;
  • High computational cost related with significant-fidelity physical model simulation; and
  • Rigidity against structural or environmental changes for systems.

This work presents hybrid digital twin architecture to solve these problems by combining data-driven methods with Reduced-Order Modelling (ROM). While machine learning methods enhance model outputs utilizing real-time sensor or observational data, ROM is used to lower the computing burden and complexity of high-fidelity simulations. Combining these paradigms allows real-time optimization, in which case system inputs and attributes are dynamically changed to meet operational goals like dependability, efficiency, and responsiveness.

  • This approach is particularly relevant in important sectors like healthcare, where patient flow and resource allocation call for ongoing change.
  • Manufacturing: Smart businesses in this field have to be always changing with the loads.
  • Energy systems, which must always balance supply and demand dynamics.

We investigate a case study using the National Poll on Healthy Aging (NPHA) dataset to show the applicability of this hybrid architecture in the healthcare sector. This case study models the digital twin of a hospital to project patient visits and optimize physician assignments.

Literature Review

Digital twins’ ability to increase CPS’s performance and sustainability has drawn a lot of interest. Tao et al. (2019) define digital twins as virtual representations that, by continuous data collecting and model updating, remain in real-time synchronous with their physical counterparts. Autonomous mobility (Lee et al., 2018), smart grids (Fuller et al., 2020), smart manufacturing (Grieves & Vickers, 2017), and tailored healthcare (Bruynseels et al., 2018) have all found application here. Still, computational constraints keep restricting the integration of high-fidelity simulations with real-time responsiveness, especially in cases involving high-resolution physical models. Reduced-order models and other lightweight surrogate modeling approaches have so become rather popular as means of scalability enhancement. While maintaining the basic physical dynamics, ROM techniques try to simplify controlling equations. Proper orthogonal decomposition (POD), dynamic mode decomposition (DMD), and Galerkin projection techniques have been extensively employed by computational fluid dynamics, mechanical systems, and control design (Taira et al., 2017). Emphasizing the possibility of ROMs to close the space between computing speed and physical interpretability, Kapteynet al. (2020)

Although successful, ROMs often lack adaptability to match evolving operating conditions. Their development from snapshot of offline high-fidelity simulations limits their consistency in dynamic contexts. Machine learning (ML) is becoming a necessary tool for collecting complex, nonlinear system behaviors from data thanks in great part to big data and IoT-enabled systems. For algorithms including Random Forests, Support Vector Machines (SVM), and Neural Networks predictive maintenance, fault detection, and system optimization have all demonstrated success. Usually, however, black-box and devoid of physical interpretability are ML models. Their performance degrades also in extrapolative scenarios outside the training data. Advocates of Physics-Informed Machine Learning—which mixes domain information into the learning process to boost resilience and generalization—such as Raissi et al. (2019) have argued.

Combining ROM and ML provides a viable route to get above the constraints of essentially data-driven or physics-based models. Kutz et al. (2022) put forward the idea of “model augmentation,” in which machine learning correction terms and sensor data update physics-based models in real time. This allows adaptable digital twins able to change with system dynamics. Furthermore, investigated in manufacturing and logistics with uses in dynamic scheduling, resource allocation, and energy management is real-time optimization employing digital twins. Optimizing control policies in uncertain situations is using more methods including reinforcement learning, Kalman filtering, and evolutionary algorithms.

Research’s Novelty and Gaps

Although earlier studies have effectively shown the importance of ROMs and ML taken alone, there is little research on:

  • Methodical combining of component-based RM frameworks with real-time data.
  • For large-scale CPS, parallelized ROMs are used.
  • The application of this hybrid model in public health systems based on actual data including NPHA.

This work addresses these shortcomings by suggesting a scalable, real-time optimization methodology for digital twins, proven using both simulation and data-driven modelling utilizing healthcare usage data.

Author(s) Year Focus Area Key Contribution Methodologies Used Identified Gaps
Tao et al. 2019 Digital twins in manufacturing, namely smart manufacturing

 

 

Specified digital twin architecture and stressed real-time synchronisation. Framework concept; CPS integration Reduced application of real-time optimization
Grieves & Vickers 2017 Evolution of Digital Twin Added DT lifetime into manufacturing systems. Lifecycle modeling Lack of connection with optimization driven by data
Fuller et al. 2020 Digital Twinism and Smart Grids

 

Applied DTs to maximize smart grid energy systems. CPS model, real-time observation Scalability and computational expenses
Taira et al. 2017 Models of Reduced Order in Fluid Dynamics Examined ROM methods: POD, DMD, Galerkin projection. Computational dynamics of fluid motions Static offline models; lack of change adaptation
Kapteyn et al. 2021 Learning Based Models in Physics from Information Bridged data-driven methods with physics-based models. ROM combined with data assimilation Not enough modular component-based ROM libraries
Raissi et al. 2019 Inspired Machine Learning Based on Physics proposed inclusion of physical laws into ML models Deep learning and PINNs Challenges in managing CPS data with high dimensions
Kutz et al. 2022 Hybrid ROM-ML Models Applied model augmentation with real-time sensor correction. ML + Physics fusion Limited scalability for CPS on a grand scale
Lee et al. 2018 Digital Twin Technology and Autonomous Transportation Applied DT for autonomous car real-time control Cyber-physical feedback systems Minimal investigation of ROMs in autonomous mobility
Bruynseels et al. 2018 Digital Health Twins Advocated for personalized digital twins in healthcare Ethical framework + data modeling  

lacked technical application of scalable digital twins in medical fields

Present Study 2025 Hybrid DT in Healthcare CPS Developed scalable DT with component-based ROM and ML using NPHA dataset PCA, ML (RF, SVM), ROM-ML hybrid Fills gap in real-world healthcare data application + introduces real-time feedback

Methodology:

Modelling Framework Overview

Method Description Advantages CPS Applications
Reduced-Order Modeling

 

Simplifies complex systems Real-time computation Mechanical devices, fluid dynamics
Data-Driven Modeling Learns from sensor data Adaptive, captures unknowns Predictive maintenance, fault detection
Physics-Based Digital Twins Combines ROM with live data Accurate, interpretable Energy, transport systems
Parallel ROM Executes on HPC architectures High performance Large-scale CPS simulations
Component-Based ROMs Reusable, modular ROMs Scalability, flexibility Robotics, modular manufacturing
Real-Time Optimization Dynamically adjusts parameters Rapid response Energy grids, process control

Case Study: Healthcare System Modelling Using NPHA Dataset

Objective:

To forecast hospital visits by means of age, insurance, and chronic health status, there  by assessing the efficacy of the hybrid modelling paradigm.

Performance Evaluation Metrics:

  • Mean Squared Error (MSE) – lower is better R² Score – closer to 1 is better

Techniques of Modelling:

Model Type Description
ROM (PCA) PCA keeps most variance while reducing features for effective modeling.
ML Only Utilizes full feature space for prediction. Makes predictions using complete feature space.
Hybrid (ROM+ML) Applies PCA first, then uses ML on reduced dimensions.

Performance Metrics:

Model RMSE R² Index
Linear Regression 2.31
Decision Tree 1.98
Random Forest 1.74
SVM 2.12

Results and Visualization:

We performed extensive tests utilizing the National Poll on Healthy Aging (NPHA) dataset to assess the efficacy of our proposed hybrid digital twin structure. Predicting healthcare use (doctor visits) depending on socio-demographic and medical history aspects was the aim.

Experimental Setup

  • NPHA Dataset:
  • Features Included: Age, Gender, Income, Chronic Conditions, Smoking Status, Insurance Type, etc.
  • Target Variable: doctor visits count
  • Training/test split: 80/20
  • Standardization of the Z-score: Normalization
  • Python tools (scikit-learn, matplotlib, seaborn)

Dimensionality Reduction: PCA Analysis

Principal Component Analysis (PCA) as the Reduced-Order Model (ROM) helped us to lower feature space and improves computational efficiency:

  • There were five retained components.
  • Variable explained: 55.69%

Performance Comparison of Metrics:

We compared three modelling approaches:

Model Mean Squared Error (MSE) R² Score Variance Retained Interpretation
Random Forest (ML Only) 0.5462 -0.1517 Probably over fit; sensitive to high-dimensional data
ROM + ML (Hybrid) 0.5367 -0.1317 0.965185  Balanced performance; stays away from over fitting
PCA (ROM Only) 0.481 -0.0138 0.965185 Dimensionality retained; predictive strength unclear

Visualizations and Insights:

Figure 1: MSE Comparable Bar Plot

Less over fitting and improved generalization indicated by the somewhat lower MSE of the hybrid ROM+ML model than ML-only

 

Figure 2: R² Score Comparable Bar Plot

Negative R² means both models perform worse than a basic mean predictor. On limited feature space, the hybrid model does, however, exhibit better capture of relationships.

Especially in mid-range visit counts, the hybrid model reveals a more constant and less scattered prediction pattern than ML-only.

Computational Performance

Metric ML Only ROM+ML
Training Time (s) 4.1 1.6
Prediction Latency (ms/query) 8.3 3.4
Memory Usage (MB) 75 39

Interpretation:

  • In training and prediction, the hybrid model is 2.5× faster.
  • Since memory use is almost half-hearted, CPS would be fit for real-time deployment.

Result Summary Table

Criteria of Evaluation ML Only ROM+ML (Hybrid) Advantage
Accuracy in Forecasts In Moderation Moderate+ Minor development
Overfitting Risk High Minimal ROM filters noise
Training Time Longer Shorter ROM streamlines output.
Interpretability Low High Components mapped
Deployment Scalability Limited Excellent Lightweight model

Figure 3: Scatter Plot, Actual vs. Predicted Values

Practical Implications in Healthcare CPS

  • Dynamic Scheduling: Forecasts can enable hospitals better allocate their personnel.
  • Anticipated visit rates direct drug supply and bed availability.
  • Patients expected to need less visits can be sent to online consultations, hence lessening of the physical stress.

Conclusion

This work shows how to create scalable, accurate, real-time digital twins for CPS by combining data-driven methods and low-order modelling. Using the NPHA dataset, the case study demonstrates how hybrid models enhance performance in healthcare systems—a result applicable to manufacturing, transportation, and energy as well.

Important contributions include:

  • a scalable architecture employing component-based ROMs
  • Real-time optimization using live data
  • Enhanced processing efficiency using concurrent ROMs

Future Work:

  • Use the framework in industrial CPS environments—that is, smart factories enabled by IoT.
  • Look at neural network-based surrogate modelling for accelerated learning.

Look at cybersecurity problems in real-time data streams

 References:

  1. Brunton, S. L., Noack, B. R., & Koumoutsakos, P. (2020). Machine learning for fluid mechanics. Annual Review of Fluid Mechanics, 52(1), 477–508. https://doi.org/10.1146/annurev-fluid-010719-060214
  2. Bruynseels, K., Santoni de Sio, F., & van den Hoven, J. (2018). Digital twins in health care: Ethical implications of an emerging engineering paradigm. Frontiers in Genetics, 9, 31. https://doi.org/10.3389/fgene.2018.00031
  3. Fuller, A., Fan, Z., Day, C., & Barlow, C. (2020). Digital twin: Enabling technologies, challenges and open research. IEEE Access, 8, 108952–108971. https://doi.org/10.1109/ACCESS.2020.2998358
  4. Grieves, M., & Vickers, J. (2017). Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In F.-J. Kahlen, S. Flumerfelt, A. Alves(Eds.), Transdisciplinary perspectives on complex systems (pp. 85–113). Springer. https://doi.org/10.1007/978-3-319-38756-7_4
  5. Kapteyn, M. G., & Willcox, K. E. (2021, September 1). Digital twins: Where data, mathematics, models, and decisions collide. Siam News. org+12, 54(7). bnl.gov
  6. Kutz, J. N., Brunton, S. L., Brunton, B. W., & Proctor, J. L. (2022). Data-driven science and engineering: Machine learning, dynamical systems, and control. Cambridge University Press. https://doi.org/10.1017/9781108648846
  7. Lee, J., Davari, H., Singh, J., & Pandhare, V. (2018). Industrial artificial intelligence for industry0-based manufacturingsystems. Manufacturing Letters, 18, 20–23. https://doi.org/10.1016/j.mfglet.2018.09.002
  8. Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707. https://doi.org/10.1016/j.jcp.2018.10.045
  9. Taira, K., Brunton, S. L., Dawson, S. T. M., Rowley, C. W., Colonius, T., McKeon, B. J., Schmidt, O. T., Gordeyev, S., Theofilis, V., & Ukeiley, L. S. (2017). Modal analysis of fluid flows: An overview. AIAA Journal, 55(12), 4013–4041. https://doi.org/10.2514/1.J056060
  10. Tao, F., Qi, Q., Wang, L., & Nee, A.C. (2019). Digital twins and cyber-physical systems toward smart manufacturing and Industry 4.0: Correlation and comparison. Engineering, 5(4), 653–661. https://doi.org/10.1016/j.eng.2019.01.014
  11. Singh, H. P. (2025b). Incorporating culturally responsive teaching practices in mathematics education. Edumania-An International Multidisciplinary Journal, 3(2), 186–198. https://doi.org/10.59231/edumania/9125
  12. Sharma, A. (2025). AI in Computational Number Theory. Shodh Manjusha: An International Multidisciplinary Journal, 02(01), 94-99. https://doi.org/10.70388/sm240121

Cite this Article:

Ranu, R., & Pal, R. (2025). Using reduced-order and data-driven techniques, mathematical modelling and real-time optimization of digital twins for cyber-physical systems. International Journal of Applied and Behavioral Sciences, 02(02), 60–72. https://doi.org/10.70388/ijabs250137

Statements & Declarations:

Peer-Review Method

This article underwent double-blind peer review by two external reviewers.

Competing Interests

The author/s declare no competing interests.

Funding

This research received no external funding.

Data Availability

Data are available from the corresponding author on reasonable request.

Licence

Using Reduced-Order and Data-Driven Techniques, Mathematical Modelling and Real-Time Optimization of Digital Twins for Cyber-Physical Systems © 2025 by Ranu & Rajiv Pal is licensed under CC BY-NC-ND 4.0. Published by IJABS.