Abstract
Introduction
Among the elements that constitute traditional Suzhou gardens, windows and their patterns hold significant aesthetic and cultural value, with window lattices standing out for their intricate and diverse designs. However, the complexity of these patterns makes rapid identification difficult for non-specialists. Recently, visual AI models such as You Only Look Once (YOLO) [1] and Graph Neural Networks (GNNs) [2] have been increasingly applied to the detection of historic architecture for their efficiency in recognising and classifying complex patterns, including the detection of traditional Chinese roofs [3] and the classification of building types in Athens [4]. Other AI models have also advanced image recognition for heritage applications, such as religious buildings [5], monuments [6], and craft patterns [7-9], offering valuable insights for developing deployable tools that help students, tourists, and researchers gain a deeper understanding of architectural details. Among these, YOLO is notable for its real-time, accurate object detection, making it suitable for mobile and desktop applications [1]. Despite these advances, research on the automated detection of traditional Chinese window lattices remains limited, as existing studies focus on detecting the location of windows [10] rather than the lattice patterns themselves. To address this gap, this study presents Heritage-Scan (H-Scan), YOLOv8n-based software for the automatic detection of traditional window lattices, focusing on the long-window lattices of Chinese classical gardens from the Ming and Qing dynasties.
This study addresses two research questions:
1. How effective is a YOLOv8n-based object detection model in accurately identifying diverse traditional long-window lattice patterns in Chinese gardens?
2. How can automated pattern recognition enhance access to and understanding of traditional Chinese architectural elements in digital heritage education?
Methodology
The workflow comprised three stages: data collection, model training, and evaluation. A total of 1,143 images of traditional garden window lattice patterns were collected, of which 315 were classified into 14 lattice types based on Yuanye (园冶) [11], a 1631 AD monograph by Ji Cheng considered foundational in Chinese garden architecture. Images were annotated with corresponding window names for YOLOv8n training.
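The annotation step follows the standard YOLO label convention: one line per object, with a class index and a bounding box normalised to the image size. A minimal sketch of the conversion, using a hypothetical subset of the 14 lattice class names (not the study's actual label list):

```python
# Hypothetical subset of the 14 lattice types, for illustration only.
CLASS_NAMES = ["haitang", "liujiao", "bajiaojing"]

def to_yolo_label(class_name, box, img_w, img_h):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) into a
    YOLO label line: "class_id x_center y_center width height",
    with all four box values normalised to [0, 1]."""
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    class_id = CLASS_NAMES.index(class_name)
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# A window annotated at pixels (100, 200)-(300, 600) in a 640x640 image:
print(to_yolo_label("haitang", (100, 200, 300, 600), 640, 640))
# → 0 0.312500 0.625000 0.312500 0.625000
```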
YOLOv8n, a lightweight object detection model optimised for speed and real-time inference, was initially trained using 50 epochs, an image size of 640, and a batch size of 8. The baseline model achieved a mAP50 of approximately 0.59 with precision around 0.49, indicating a relatively high false-positive rate. Fine-tuning with a reduced learning rate (lr0 = 0.0005) improved performance, particularly for mid-frequency classes, raising mAP50 to ~0.66 and precision to ~0.77. Standard object detection metrics—precision, recall, mAP50, and mAP50–95—were calculated using the Ultralytics YOLOv8 library in Python. Class-wise results showed persistent challenges with rare categories, highlighting the need for further data augmentation or class balancing.
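The precision figures above can be read directly as false-positive rates: a precision near 0.49 means roughly half of all predicted boxes are wrong. A minimal sketch of the underlying arithmetic, with illustrative detection counts (not the study's actual confusion data):

```python
def precision(tp, fp):
    """Fraction of predicted boxes that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Fraction of ground-truth boxes that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Baseline: about half of all predictions are false positives.
print(precision(49, 51))  # → 0.49
# After fine-tuning with lr0 = 0.0005, false positives drop sharply.
print(precision(77, 23))  # → 0.77
```

mAP50 then averages, over all 14 classes, the area under each class's precision-recall curve at an IoU threshold of 0.5, which is why rare classes with few training images can drag the mean down even when frequent classes score well.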
The trained model was integrated into the H-Scan desktop application to support rapid identification of long-window patterns for tourists, architecture students, and researchers. The software enables real-time recognition from live camera feeds or batch processing of images and pre-recorded videos. Detected windows are automatically annotated, and the software can export counts and classifications directly to Excel, facilitating documentation and analysis.
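The count-and-export step can be sketched with the standard library; this is an assumed reconstruction of H-Scan's tallying behaviour (the application itself exports to Excel, here plain CSV, and the detection list below is hypothetical):

```python
import csv
from collections import Counter

# Hypothetical output of a batch run: one class name per detected window.
detections = ["haitang", "liujiao", "haitang", "bajiaojing", "haitang"]

# Tally detections per lattice type and write a two-column report.
counts = Counter(detections)
with open("lattice_counts.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["lattice_type", "count"])
    for name, n in sorted(counts.items()):
        writer.writerow([name, n])

print(dict(counts))  # → {'haitang': 3, 'liujiao': 1, 'bajiaojing': 1}
```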
Results and Conclusions
The fine-tuned YOLOv8n model demonstrated strong performance in detecting and classifying most traditional long-window patterns, achieving an average processing time of 0.30 ms per image or video frame. High accuracy was recorded for certain categories, such as the Haitang lattice (mAP50 = 0.803). However, some limitations persist: a proportion of ice-crack patterns were misclassified as Liujiao patterns, and Bajiaojing windows showed lower recognition accuracy. These issues are linked to patterns with low prominence or reduced recognisability, indicating the potential for further optimisation through targeted data enrichment.
The contributions of this study are both cultural and technological. Culturally, it compiles a dataset of over 1,000 photographs of traditional long-windows, 315 of which are annotated and applied in model training. This dataset supports AI-based research while serving as a foundational digital heritage resource for preservation, education, and public engagement—fulfilling a key role of digitised heritage. Technologically, the study delivers a fine-tuned YOLOv8n model capable of accurate, high-speed detection of long-window patterns and integrates it into a functional desktop application. H-Scan bridges advanced AI-based visual recognition with practical, accessible tools for end-users, offering both real-time and batch processing alongside automated statistical reporting.
References
[1] Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. In: Proc IEEE Conf Comput Vis Pattern Recognit (CVPR). 2016; 779–88. https://doi.org/10.1109/CVPR.2016.91.
[2] Jiang B, Chen S, Wang B, Luo B. MGLNN: semi-supervised learning via multiple graph cooperative learning neural networks. Neural Netw. 2022;153:204–14. https://doi.org/10.1016/j.neunet.2022.05.024.
[3] Hou M, Hao W, Dong Y, Ji Y. A detection method for the ridge beast based on improved YOLOv3 algorithm. Herit Sci. 2023;11(1):167. https://doi.org/10.1186/s40494-023-00995-4.
[4] Janković R. Machine learning models for cultural heritage image classification: comparison based on attribute selection. Information. 2019;11(1):12. https://doi.org/10.3390/info11010012.
[5] Fesl J, Jelínek J, Horníčková K, Nevařilová Z, Konopa M, Feslová M. AI-based system for cultural heritage objects identification from real photos. In: 12th Int Conf Adv Comput Inf Technol (ACIT). 2022. https://doi.org/10.1109/ACIT54803.2022.9912752.
[6] Saadat MA, Hossain MS, Karim R, Mustafa R. Classification of cultural heritage mosque of Bangladesh using CNN and Keras model. In: Intelligent Computing and Optimization: Proceedings of the 3rd International Conference on Intelligent Computing and Optimization 2020. 2021; 647–58.
[7] Girsang ND. Literature study of convolutional neural network algorithm for batik classification. Brill Res Artif Intell. 2021;1(1):1–7. https://doi.org/10.47709/brilliance.v1i1.1069.
[8] Horn C, Ivarsson O, Lindhe C, Potter R, Green A, Ling J. Artificial intelligence, 3D documentation, and rock art—approaching and reflecting on the automation of identification and classification of rock art images. J Archaeol Method Theory. 2022;29(1):188–213. https://doi.org/10.1007/s10816-021-09518-6.
[9] Liu E. Research on image recognition of intangible cultural heritage based on CNN and wireless network. EURASIP J Wirel Commun Netw. 2020;2020:1–12. https://doi.org/10.1186/s13638-020-01859-2.
[10] Du L, Wang Y. Bi-YOLO: A novel object detection network and dataset for components of China heritage buildings. J Build Eng. 2024; 97:110817. https://doi.org/10.1016/j.jobe.2024.110817.
[11] Ji C. Yuan Ye [The Craft of Gardens]. 1634. Beijing: Zhonghua Book Company; 1984. (in Chinese).
| Original language | English |
|---|---|
| Title of host publication | CAADRIA 2026 |
| Subtitle of host publication | Humanistic Computation & Intelligence |
| Publication status | Accepted/In press - 19 Dec 2025 |
| Title | Detecting Traditional Chinese Window Patterns in Suzhou Gardens Using YOLOv8: A Computer Vision Approach to Architectural Heritage Recognition |

Projects
Reinterpreting Traditional Windows Lattices Patterns in Contemporary Design through Parametric Design and Digital Fabrication
Zhao, J. (PI)
1/07/25 → 31/08/25
Project: Internal Research Project
Activities
SURF-2025-0165, Reinterpreting Traditional Windows Lattices Patterns in Contemporary Design through Parametric Design and Digital Fabrication
Zhao, J. (Supervisor)
1 Jul 2025 → 31 Aug 2025
Activity: Supervision › Completed SURF Project