SHALLOW ANN MODELS TO CLASSIFY UKRAINIAN AI-GENERATED TEXT
DOI:
https://doi.org/10.26906/SUNZ.2025.4.108
Keywords:
artificial neural networks (ANN), shallow ANN models, AI-generated content (AIGC), LLM, AI detector
Abstract
This study addresses the task of detecting AI-generated fragments in Ukrainian-language texts. The objective is to develop a tool that identifies content produced with the assistance of artificial intelligence, particularly in PDF documents from the IT domain. We review existing solutions and approaches in this area, and we evaluate several commercial AI-content detectors on our custom datasets. The dataset was constructed by segmenting bachelor's theses from IT-related fields into fragments of approximately 1,000 characters each. Five artificial neural network models were tested on the custom dataset in combination with a traditional NLP pipeline, achieving an accuracy of 87–88%. Given the complexity of the problem and the ethical considerations of the educational context, the classification results should be further validated by human experts. The current implementation can serve as a foundation for future improvements.
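The segmentation step mentioned in the abstract can be illustrated with a short sketch. This is only a minimal example under stated assumptions: the abstract does not say how fragment boundaries were chosen, so the greedy packing of whole sentences, the naive sentence splitter, and the function name `split_into_fragments` are all hypothetical, and the text is assumed to have been extracted from the PDF already.

```python
import re


def split_into_fragments(text: str, target_len: int = 1000) -> list[str]:
    """Greedily pack whole sentences into fragments of at most ~target_len characters.

    Assumption: fragments end on sentence boundaries. A single sentence longer
    than target_len is kept whole, so it may exceed the limit.
    """
    # Collapse line breaks and repeated spaces left over from PDF extraction.
    text = " ".join(text.split())
    # Naive sentence split: break on whitespace that follows ., !, ? or an ellipsis.
    sentences = [s for s in re.split(r"(?<=[.!?…])\s+", text) if s]

    fragments: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > target_len:
            # Adding this sentence would overflow: close the current fragment.
            fragments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        fragments.append(current)
    return fragments
```

Each returned fragment could then be labeled as human-written or AI-generated and passed to a feature-extraction step; that part of the pipeline is not shown because the abstract gives no details about it.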
Copyright (c) 2025 Olena Peredrii

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.