EFFECTIVENESS OF EVOLUTIONARY ALGORITHMS IN NEURAL NETWORK OPTIMIZATION TASKS: UNCONSTRAINED MINIMIZATION, PRUNING, AND HYPERPARAMETER OPTIMIZATION

Authors

  • Bohdan Hirianskyi
  • Bogdan Bulakh

DOI:

https://doi.org/10.26906/SUNZ.2026.1.045

Keywords:

evolutionary algorithms, CMA-ES, neural networks, optimization, compression, pruning, efficiency, deep learning, zero-cost metrics, architecture search, AutoML, hyperparameter optimization, models, analysis, Bayesian optimization, neuroevolution, gradient-based methods, Adam

Abstract

Relevance. Modern neural networks contain a large number of redundant parameters, which inflates training time, energy consumption, and hardware requirements. Standard gradient-based methods tend to become trapped in local minima on multimodal problems, and magnitude pruning ignores the structural importance of individual layers. Zero-cost proxies, SynFlow in particular, estimate architecture viability through signal-propagation capacity without full training, yet the boundaries of their applicability remain insufficiently explored. Object of study: neural network optimization methods. Objective: a systematic experimental comparison of evolutionary, gradient-based, and Bayesian methods within a unified framework, with an emphasis on the role of SynFlow as a proxy for evaluating genome (architecture) fitness. Results. At extreme sparsity (98%), combining CMA-ES with SynFlow (signal-propagation capacity analysis) and a mathematical energy-compensation step overcomes network degradation and achieves state-of-the-art performance (F1 = 0.81). This approach outperforms both classic magnitude pruning and gradient-based soft-mask methods (SoftMask) while requiring orders of magnitude less computation than approaches based on approximate training. At moderate sparsity (≤ 90%), magnitude pruning retains its advantage. In hyperparameter optimization (HPO), CMA-ES achieves practical parity with Optuna TPE in solution quality while requiring orders of magnitude less execution time in high-dimensional settings. Using SynFlow as the HPO objective accelerates the search by more than 250 times but yields lower final accuracy than full training. Conclusions. Evolutionary algorithms are effective for multimodal and structural optimization problems; for differentiable weight-training tasks, Adam remains superior (F1 ≈ 0.97 on Digits). Zero-cost proxies are context-specific.
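
To make the headline pipeline concrete, the sketch below scores parameters with the SynFlow zero-cost proxy (Tanaka et al., reference 9) and uses CMA-ES (Hansen, references 1–2) to search layer-wise pruning budgets that preserve as much synaptic flow as possible at a fixed 98% sparsity. This is a minimal illustrative sketch, not the implementation evaluated in the paper: the toy MLP, the softmax budget parameterization, and the omission of the energy-compensation step are assumptions, and it relies on PyTorch and the pycma package.

# Illustrative sketch only (assumptions noted above); not the authors' implementation.
import copy

import cma                      # pycma: CMA-ES implementation
import torch
import torch.nn as nn


def synflow_scores(model: nn.Module, input_shape=(1, 64)) -> dict:
    """Per-parameter SynFlow saliency |w * dR/dw|, where R is the summed output of the
    linearized network (absolute weights) evaluated on an all-ones input."""
    model = copy.deepcopy(model).double()          # work on a copy; no training data needed
    for p in model.parameters():
        p.data.abs_()                              # linearize: make all weights non-negative
    x = torch.ones(input_shape, dtype=torch.double)
    model.zero_grad()
    model(x).sum().backward()                      # propagate the "signal flow"
    return {name: (p.grad * p.data).abs()
            for name, p in model.named_parameters() if p.grad is not None}


# Assumed toy network; the architectures used in the study may differ.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(),
                      nn.Linear(128, 128), nn.ReLU(),
                      nn.Linear(128, 10))

scores = synflow_scores(model)
weight_scores = [s.flatten() for name, s in scores.items() if name.endswith("weight")]
sizes = [s.numel() for s in weight_scores]


def fitness(logits, target_sparsity=0.98):
    """Genome = one logit per layer; softmax turns it into per-layer shares of the global
    keep-budget. Fitness = negative total SynFlow retained by the induced mask."""
    budget = int((1.0 - target_sparsity) * sum(sizes))     # weights kept network-wide
    shares = torch.softmax(torch.as_tensor(logits, dtype=torch.double), dim=0)
    kept_flow = 0.0
    for s, n, share in zip(weight_scores, sizes, shares):
        k = min(n, max(1, int(round(float(share) * budget))))
        kept_flow += torch.topk(s, k).values.sum().item()  # keep the top-k saliencies
    return -kept_flow                                      # CMA-ES minimizes


es = cma.CMAEvolutionStrategy(len(weight_scores) * [0.0], 0.5,
                              {"maxiter": 30, "verbose": -9})
es.optimize(fitness)
print("per-layer budget shares:",
      torch.softmax(torch.as_tensor(es.result.xbest), dim=0).tolist())

The resulting shares define a layer-wise pruning mask at the target sparsity; in the study's pipeline, the energy-compensation step would then be applied to the pruned network, which this sketch does not reproduce.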


References

1. Hansen N., Ostermeier A. Completely derandomized self-adaptation in evolution strategies // Evolutionary Computation. 2001. Vol. 9, No. 2. P. 159–195. https://doi.org/10.1162/106365601750190398

2. Hansen N. The CMA Evolution Strategy: A Tutorial // arXiv:1604.00772. 2016. https://arxiv.org/abs/1604.00772

3. Loshchilov I., Hutter F. CMA-ES for Hyperparameter Optimization of Deep Neural Networks // arXiv:1604.07269. 2016. https://arxiv.org/abs/1604.07269

4. Frankle J., Carbin M. The lottery ticket hypothesis: finding sparse, trainable neural networks // Proc. ICLR. 2019. https://arxiv.org/abs/1803.03635

5. Ding Y., Chen D.-R. Optimization based layer-wise pruning threshold method for accelerating CNNs // Mathematics. 2023. Vol. 11, No. 15. Art. 3311. https://www.mdpi.com/2227-7390/11/15/3311

6. Poyatos J. et al. EvoPruneDeepTL: An evolutionary pruning model for transfer learning based deep neural networks // Neural Networks. 2023. Vol. 158. P. 59–82. https://arxiv.org/abs/2202.03844

7. Chung K. T. et al. Multi-objective evolutionary architectural pruning of deep convolutional neural networks with weights inheritance // Information Sciences. 2024. Vol. 685. https://dl.acm.org/doi/abs/10.1016/j.ins.2024.121265

8. Tang S. et al. DarwinLM: Evolutionary Structured Pruning of Large Language Models // arXiv:2502.07780. 2025. https://arxiv.org/abs/2502.07780

9. Tanaka H., Kunin D., Yamins D. L. K., Ganguli S. Pruning neural networks without any data by iteratively conserving synaptic flow // Proc. NeurIPS. 2020. https://arxiv.org/abs/2006.05467

10. Abdelfattah M. S., Mehrotra A., Dudziak Ł., Lane N. D. Zero-cost proxies for lightweight NAS // Proc. ICLR. 2021. https://arxiv.org/abs/2101.08134

11. Krishnakumar A. et al. NAS-Bench-Suite-Zero: Accelerating Research on Zero Cost Proxies // Proc. NeurIPS (Datasets and Benchmarks). 2022. https://arxiv.org/abs/2210.03230

12. Li Y. et al. Extensible and Efficient Proxy for Neural Architecture Search // Proc. ICCV. 2023. P. 6199–6210. https://openaccess.thecvf.com/content/ICCV2023/papers/Li_Extensible_and_Efficient_Proxy_for_Neural_Architecture_Search_ICCV_2023_paper.pdf

13. Lee J., Ham B. AZ-NAS: Assembling Zero-Cost Proxies for Network Architecture Search // Proc. CVPR. 2024. P. 5893–5903. https://openaccess.thecvf.com/content/CVPR2024/html/Lee_AZ-NAS_Assembling_ZeroCost_Proxies_for_Network_Architecture_Search_CVPR_2024_paper.html

14. Akiba T., Sano S., Yanase T., Ohta T., Koyama M. Optuna: a next-generation hyperparameter optimization framework // Proc. KDD. 2019. https://doi.org/10.1145/3292500.3330701

15. Liu K., Wang R., Gao J., Chen K. Differentiable model scaling using differentiable topk // Proc. ICML. 2024. https://arxiv.org/abs/2405.07194

16. Dong P., Li L., Tang Z., Liu X., Pan X., Wang Q., Chu X. Pruner-zero: Evolving symbolic pruning metric from scratch for large language models // Proc. ICML. 2024. https://arxiv.org/abs/2406.02924

17. Sieberling O., Kuznedelev D., Kurtic E., Alistarh D. EvoPress: Accurate Dynamic Model Compression via Evolutionary Search // arXiv:2410.14649. 2024. https://arxiv.org/abs/2410.14649

18. Yu P., Wang J., Sui X., Ling N., Wang W., Jiang W. Efficient Post-Training Pruning of Large Language Models with Statistical Correction // arXiv:2602.07375. 2026. https://arxiv.org/abs/2602.07375

Published

2026-02-13