Generation of synthetic climate data using conditional generative adversarial networks for precision agriculture applications
DOI: https://doi.org/10.63728/riisds.v10i1.28
Keywords: CTGAN, synthetic data, temperature, pressure, simulation
Abstract
Synthetic data generation is fundamental in areas such as artificial intelligence and precision agriculture, where real data can be scarce. In this study, CTGAN (Conditional Tabular Generative Adversarial Network) was used to create synthetic temperature and pressure data from real records of the meteorological station at Mazatlán International Airport, Sinaloa, covering February 2022 to October 2024. The data were preprocessed through cleaning, normalization, and temporal segmentation, and the CTGAN model was then trained while adjusting parameters such as the number of epochs, learning rate, and batch size. The results showed that CTGAN replicated the general distributions of temperature and pressure, as evidenced by comparative histograms and boxplots. However, limitations were observed in preserving complex multivariate correlations, such as the negative relationship between temperature and pressure, and in generating extreme values, which indicates a tendency of the model to smooth out critical variations.
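The two limitations named in the abstract, loss of the temperature-pressure correlation and under-generation of extreme values, can be checked with simple statistics. The following is a minimal sketch, not the study's evaluation code: the data are toy values invented for illustration, the "synthetic" columns are stand-ins (independently shuffled or shrunk copies of the toy data) rather than CTGAN output, and the names `pearson`, `quantile`, and `tail_fraction` are hypothetical helpers.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def quantile(values, q):
    """Nearest-rank style quantile on the sorted values (no interpolation)."""
    s = sorted(values)
    idx = min(len(s) - 1, max(0, round(q * (len(s) - 1))))
    return s[idx]


def tail_fraction(real, synth, q=0.01):
    """Share of synthetic values at or beyond the real q and 1-q quantiles."""
    lo, hi = quantile(real, q), quantile(real, 1 - q)
    return sum(1 for v in synth if v <= lo or v >= hi) / len(synth)


rng = random.Random(0)

# Toy "real" records (assumption): temperature in degrees C and pressure in
# hPa, constructed so that pressure falls as temperature rises.
real_temp = [rng.gauss(26.0, 4.0) for _ in range(2000)]
real_pres = [1013.0 - 0.6 * (t - 26.0) + rng.gauss(0.0, 1.5) for t in real_temp]

# Stand-in 1: shuffle each column independently. The marginal distributions
# are unchanged, but the cross-variable relationship is destroyed.
shuffled_temp = real_temp[:]
shuffled_pres = real_pres[:]
rng.shuffle(shuffled_temp)
rng.shuffle(shuffled_pres)

# Stand-in 2: shrink values toward the mean, mimicking a generator that
# smooths out extremes.
mean_temp = sum(real_temp) / len(real_temp)
smoothed_temp = [mean_temp + 0.7 * (t - mean_temp) for t in real_temp]

print("correlation, real:      ", round(pearson(real_temp, real_pres), 3))
print("correlation, shuffled:  ", round(pearson(shuffled_temp, shuffled_pres), 3))
print("tail share, real:       ", tail_fraction(real_temp, real_temp))
print("tail share, smoothed:   ", tail_fraction(real_temp, smoothed_temp))
```

Matching histograms and boxplots only compare marginal distributions, so the shuffled stand-in would pass that check while failing the correlation check. This is why a correlation gap and a tail-coverage measure are useful complements when assessing synthetic tabular data.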
License
Copyright (c) 2025 RIISDS. Revista interdisciplinaria de ingeniería sustentable y desarrollo social

This work is licensed under a Creative Commons Attribution 4.0 International License.