AN INTERACTIVE SYSTEM FOR SEMANTIC EDITING OF RASTER GRAPHICS BASED ON THE INTEGRATION OF MULTIMODAL GENERATIVE APIS

Keywords: semantic editing, computer vision, generative artificial intelligence, inpainting, multimodal APIs, graphical user interface, client-server architecture

Abstract

This paper investigates automated semantic editing of raster images using artificial intelligence methods. The study is motivated by the high computational demands of modern generative models, which require powerful graphics processing units (GPUs) to perform inpainting, and by the limited flexibility of cloud-based services with respect to precise spatial control over edits. Together these factors create a significant barrier to intelligent image editing tools for users with limited computational resources. The aim of the work is to improve the accessibility and efficiency of semantic image editing by distributing the computational workload between the client-side and server-side components of the system. The proposed approach combines local tools for spatial mask generation with a cloud-based multimodal API (Gemini 3 Flash Image) that performs the generative transformations. As a result, a lightweight desktop application with a modular client-server architecture has been designed and implemented. Its key features are asynchronous multithreaded processing of network requests, which keeps the graphical user interface responsive, and reverse compositing algorithms that integrate generated fragments seamlessly into the original image. A real-time binary mask generation mechanism based on cursor coordinates has been implemented, enabling high-precision selection of regions of interest. The results are explained by the effective offloading of tensor computations to cloud infrastructure while local control over the editing process is retained. Experimental evaluation confirmed that complex image transformations can be performed on low-performance devices without loss of output quality.
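The mask generation and compositing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `brush_mask` and `composite`, the circular brush, and the list-of-lists grayscale image are all assumptions made for illustration, and "reverse compositing" is read here as pasting generated pixels back only inside the mask so that every other pixel of the original is preserved. A practical version would more likely operate on NumPy or Pillow images.

```python
def brush_mask(width, height, stroke, radius):
    """Binary mask: 1 where a pixel lies within `radius` of any cursor point.

    `stroke` is a list of (x, y) cursor coordinates (hypothetical input format).
    """
    mask = [[0] * width for _ in range(height)]
    r2 = radius * radius
    for cx, cy in stroke:
        # Only scan the bounding box of the brush, clipped to the image.
        for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= r2:
                    mask[y][x] = 1
    return mask


def composite(original, generated, mask):
    """Keep original pixels outside the mask; take generated pixels inside it."""
    return [
        [g if m else o for o, g, m in zip(orow, grow, mrow)]
        for orow, grow, mrow in zip(original, generated, mask)
    ]


# Tiny 5x5 grayscale example: one cursor point in the centre, brush radius 1.
original = [[10] * 5 for _ in range(5)]
generated = [[200] * 5 for _ in range(5)]
mask = brush_mask(5, 5, stroke=[(2, 2)], radius=1)
result = composite(original, generated, mask)
for row in result:
    print(row)
```

Only the five pixels under the brush take the generated value; the rest of the image is untouched, which is the property that keeps the edit local.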
The practical significance of the work is that digital artists, designers, and researchers can use the developed system for rapid prototyping and image editing without specialized hardware.
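Because the heavy computation runs in the cloud, the client's main job is to wait for network responses without freezing the interface. The sketch below shows one common way to do this in Python, under stated assumptions: `fake_generative_request` is a hypothetical stand-in for the cloud call (it makes no network request), and `start_request` and `poll_results` are illustrative names, not part of the described system. The blocking call runs on a worker thread and posts its result to a queue that the interface thread drains without blocking.

```python
import queue
import threading
import time


def fake_generative_request(prompt: str) -> str:
    """Stand-in for a slow cloud API call (hypothetical; no network access)."""
    time.sleep(0.05)
    return f"edited image for: {prompt}"


def start_request(prompt: str, results: "queue.Queue[str]") -> threading.Thread:
    """Run the blocking call on a worker thread and post the result to a queue."""
    def worker() -> None:
        results.put(fake_generative_request(prompt))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def poll_results(results: "queue.Queue[str]") -> list[str]:
    """Drain finished results without blocking, as a GUI timer callback would."""
    finished = []
    while True:
        try:
            finished.append(results.get_nowait())
        except queue.Empty:
            return finished


results: "queue.Queue[str]" = queue.Queue()
worker_thread = start_request("replace the sky with a sunset", results)
worker_thread.join()  # a GUI would instead call poll_results from a timer
print(poll_results(results))
```

In a desktop toolkit the `join()` call would be replaced by a periodic timer that calls `poll_results`, so the event loop keeps handling user input while the request is in flight.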

Published
2026-05-30
How to Cite
Chornobryvets, D. V., & Popereshnyak, S. V. (2026). AN INTERACTIVE SYSTEM FOR SEMANTIC EDITING OF RASTER GRAPHICS BASED ON THE INTEGRATION OF MULTIMODAL GENERATIVE APIS. Systems and Technologies, 72(2), 145-153. Retrieved from https://st.umsf.in.ua/index.php/journal/article/view/307
Section
COMPUTER SCIENCES