EFFECTIVENESS OF EVOLUTIONARY ALGORITHMS IN NEURAL NETWORK OPTIMIZATION TASKS: UNCONDITIONAL MINIMIZATION, PRUNING, AND HYPERPARAMETER OPTIMIZATION
DOI: https://doi.org/10.26906/SUNZ.2026.1.045

Keywords: evolutionary algorithms, CMA-ES, neural networks, optimization, compression, pruning, efficiency, deep learning, zero-cost metrics, architecture search, AutoML, hyperparameter optimization, models, analysis, Bayesian optimization, neuroevolution, gradient-based methods, Adam

Abstract
Relevance. Modern neural networks contain a significant number of redundant parameters, which increases training time, energy consumption, and hardware requirements. Standard gradient methods tend to get trapped in local minima on multimodal tasks, while magnitude pruning ignores the structural importance of layers. Zero-cost proxies, particularly SynFlow, estimate architecture viability via signal propagation capacity without full training, yet the boundaries of their applicability remain insufficiently explored.

Object of study: neural network optimization methods.

Objective: a systematic experimental comparison of evolutionary, gradient-based, and Bayesian methods within a unified framework, with an emphasis on the role of SynFlow as a proxy for evaluating genome (architecture) fitness.

Results. At extreme sparsity (98%), the combination of CMA-ES + SynFlow (signal propagation capacity analysis) with an added mathematical energy compensation overcomes network degradation and achieves state-of-the-art performance (F1 = 0.81). This approach outperforms both classic magnitude pruning and gradient-based soft mask methods (SoftMask), while requiring orders of magnitude less computation than approaches based on approximate training. At moderate sparsity (≤ 90%), magnitude pruning retains its superiority. In hyperparameter optimization (HPO), CMA-ES achieves practical parity with Optuna TPE in solution quality, and in high-dimensional settings requires orders of magnitude less execution time. Using SynFlow as the objective function in HPO accelerates the search by over 250 times, but yields lower final accuracy than full training.

Conclusions. Evolutionary algorithms are effective for multimodal tasks and structural optimization problems; for differentiable weight-training tasks, Adam prevails (F1 ≈ 0.97 on Digits). Zero-cost proxies are context-specific.
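To make the SynFlow proxy concrete: as described by Tanaka et al. (reference 9), each parameter's saliency is |θ ⊙ ∂R/∂θ|, where R is the sum of network outputs computed with all weights replaced by their absolute values and an all-ones input, so the score measures signal propagation capacity with no training data. The sketch below (an illustration, not code from the paper; the function name and the restriction to a bias-free linear/ReLU MLP are my simplifying assumptions — for such networks the abs-weight forward pass is exactly a chain of matrix products) computes these per-parameter scores with NumPy:

```python
import numpy as np

def synflow_scores(weights, input_dim):
    """Per-parameter SynFlow saliency |theta * dR/dtheta| for a bias-free MLP.

    R = 1^T |W_L| ... |W_1| 1: forward pass with all-ones input and
    absolute-valued weights, no data and no training required.
    (With ReLU activations and nonnegative signals this linear chain is exact.)
    """
    A = [np.abs(W) for W in weights]
    # Forward pass with an all-ones input vector.
    acts = [np.ones(input_dim)]
    for W in A:
        acts.append(W @ acts[-1])
    # Backward pass: gradient of R = sum of outputs w.r.t. each |W_i|.
    grad = np.ones(A[-1].shape[0])
    scores = [None] * len(A)
    for i in range(len(A) - 1, -1, -1):
        # dR/dW_i is the outer product of the upstream gradient and
        # the activation entering layer i; multiply by |W_i| elementwise.
        scores[i] = A[i] * np.outer(grad, acts[i])
        grad = A[i].T @ grad
    return scores
```

In the setting the abstract describes, such scores can serve two roles: summed, as a cheap fitness value for a CMA-ES-evolved genome, or per-parameter, as a pruning criterion that, unlike plain magnitude pruning, accounts for a weight's position in the signal path.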
References
1. Hansen N., Ostermeier A. Completely derandomized self-adaptation in evolution strategies // Evolutionary Computation. 2001. Vol. 9, No. 2. P. 159–195. https://doi.org/10.1162/106365601750190398
2. Hansen N. The CMA Evolution Strategy: A Tutorial // arXiv:1604.00772. 2016. https://arxiv.org/abs/1604.00772
3. Loshchilov I., Hutter F. CMA-ES for Hyperparameter Optimization of Deep Neural Networks // arXiv:1604.07269. 2016. https://arxiv.org/abs/1604.07269
4. Frankle J., Carbin M. The lottery ticket hypothesis: finding sparse, trainable neural networks // Proc. ICLR. 2019. https://arxiv.org/abs/1803.03635
5. Ding Y., Chen D.-R. Optimization based layer-wise pruning threshold method for accelerating CNNs // Mathematics. 2023. Vol. 11, No. 15. Art. 3311. https://www.mdpi.com/2227-7390/11/15/3311
6. Poyatos J. et al. EvoPruneDeepTL: An evolutionary pruning model for transfer learning based deep neural networks // Neural Networks. 2023. Vol. 158. P. 59–82. https://arxiv.org/abs/2202.03844
7. Chung K. T. et al. Multi-objective evolutionary architectural pruning of deep convolutional neural networks with weights inheritance // Information Sciences. 2024. Vol. 685. https://dl.acm.org/doi/abs/10.1016/j.ins.2024.121265
8. Tang S. et al. DarwinLM: Evolutionary Structured Pruning of Large Language Models // arXiv:2502.07780. 2025. https://arxiv.org/abs/2502.07780
9. Tanaka H., Kunin D., Yamins D. L. K., Ganguli S. Pruning neural networks without any data by iteratively conserving synaptic flow // Proc. NeurIPS. 2020. https://arxiv.org/abs/2006.05467
10. Abdelfattah M. S., Mehrotra A., Dudziak Ł., Lane N. D. Zero-cost proxies for lightweight NAS // Proc. ICLR. 2021. https://arxiv.org/abs/2101.08134
11. Krishnakumar A. et al. NAS-Bench-Suite-Zero: Accelerating Research on Zero Cost Proxies // Proc. NeurIPS (Datasets and Benchmarks). 2022. https://arxiv.org/abs/2210.03230
12. Li Y. et al. Extensible and Efficient Proxy for Neural Architecture Search // Proc. ICCV. 2023. P. 6199–6210. https://openaccess.thecvf.com/content/ICCV2023/papers/Li_Extensible_and_Efficient_Proxy_for_Neural_Architecture_Search_ICCV_2023_paper.pdf
13. Lee J., Ham B. AZ-NAS: Assembling Zero-Cost Proxies for Network Architecture Search // Proc. CVPR. 2024. P. 5893–5903. https://openaccess.thecvf.com/content/CVPR2024/html/Lee_AZ-NAS_Assembling_ZeroCost_Proxies_for_Network_Architecture_Search_CVPR_2024_paper.html
14. Akiba T., Sano S., Yanase T., Ohta T., Koyama M. Optuna: a next-generation hyperparameter optimization framework // Proc. KDD. 2019. https://doi.org/10.1145/3292500.3330701
15. Liu K., Wang R., Gao J., Chen K. Differentiable model scaling using differentiable topk // Proc. ICML. 2024. https://arxiv.org/abs/2405.07194
16. Dong P., Li L., Tang Z., Liu X., Pan X., Wang Q., Chu X. Pruner-zero: Evolving symbolic pruning metric from scratch for large language models // Proc. ICML. 2024. https://arxiv.org/abs/2406.02924
17. Sieberling O., Kuznedelev D., Kurtic E., Alistarh D. EvoPress: Accurate Dynamic Model Compression via Evolutionary Search // arXiv:2410.14649. 2024. https://arxiv.org/abs/2410.14649
18. Yu P., Wang J., Sui X., Ling N., Wang W., Jiang W. Efficient Post-Training Pruning of Large Language Models with Statistical Correction // arXiv:2602.07375. 2026. https://arxiv.org/abs/2602.07375
License
Copyright (c) 2026 Bohdan Hirianskyi, Bogdan Bulakh

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.