Construction process monitoring traditionally relies on manual inspections and document cross-referencing, leading to inefficiencies in project management. Despite advances enabling computer vision-based monitoring and automated document analysis, integrating these technologies remains challenging, particularly in connecting field data with work documentation. This paper proposes an automated monitoring system that integrates computer vision-based field data with text-based work instructions. The system employs YOLOv5 object detection models to analyze construction site images and architectural drawings, and uses text parsing techniques to extract information from work instructions. Validation on thirty apartment units demonstrated the system's effectiveness in monitoring finishing works, particularly masonry and tiling. Results showed consistent performance in establishing automated connections between work instructions, drawings, and site conditions, reducing manual verification requirements while maintaining high accuracy. The successful implementation in finishing works demonstrates potential scalability to broader construction applications of varying complexity.
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) and the Ministry of Education (NRF-2022R1F1A1072450). This study was also supported by a grant (RS-2022-00143493) from the Digital-Based Building Construction and Safety Supervision Technology Research Program funded by the Ministry of Land, Infrastructure and Transport of the Korean Government.