Augmented reality, deep learning and vision-language query system for construction worker safety

Chen, Haosen; Hou, Lei; Wu, Shaoze; Zhang, Guomin; Zou, Yang; Moon, Sungkon; Bhuiyan, Muhammed

DC Field	Value	Language
dc.contributor.author	Chen, Haosen	-
dc.contributor.author	Hou, Lei	-
dc.contributor.author	Wu, Shaoze	-
dc.contributor.author	Zhang, Guomin	-
dc.contributor.author	Zou, Yang	-
dc.contributor.author	Moon, Sungkon	-
dc.contributor.author	Bhuiyan, Muhammed	-
dc.date.issued	2024-01-01	-
dc.identifier.issn	0926-5805	-
dc.identifier.uri	https://dspace.ajou.ac.kr/dev/handle/2018.oak/33763	-
dc.description.abstract	Low situational awareness contributes to safety incidents in construction. Existing Deep Learning (DL)-based applications lack the capability to provide context-specific and interactive feedback that is essential for workers to fully understand their surrounding environments. This paper proposes the Visual Construction Safety Query (VCSQ) system. The system encompasses real-time Image Captioning (IC), safety-centric Visual Question Answering (VQA), and keyword-based Image-Text Retrieval (ITR), integrated with head-mounted Augmented Reality (AR) devices. System validation includes benchmarks and real-world images. The ITR module achieved Recall@5 and Recall@10 of 0.801 and 0.835, respectively. The VQA module achieved an 89.7% accuracy rate, and the IC module had a SPICE score of 0.449. Feasibility tests and surveys confirmed the system's practical advantages in different construction scenarios. This study establishes an integration roadmap adaptable to future advancements in interactive DL and immersive AR.	-
dc.language.iso	eng	-
dc.publisher	Elsevier B.V.	-
dc.subject.mesh	Construction safety	-
dc.subject.mesh	Construction workers	-
dc.subject.mesh	Deep learning	-
dc.subject.mesh	Image captioning	-
dc.subject.mesh	Image texts	-
dc.subject.mesh	Language model	-
dc.subject.mesh	Query systems	-
dc.subject.mesh	Question Answering	-
dc.subject.mesh	Text retrieval	-
dc.subject.mesh	Vision-language model	-
dc.title	Augmented reality, deep learning and vision-language query system for construction worker safety	-
dc.type	Article	-
dc.citation.title	Automation in Construction	-
dc.citation.volume	157	-
dc.identifier.bibliographicCitation	Automation in Construction, Vol.157	-
dc.identifier.doi	10.1016/j.autcon.2023.105158	-
dc.identifier.scopusid	2-s2.0-85175417154	-
dc.identifier.url	https://www.journals.elsevier.com/automation-in-construction	-
dc.subject.keyword	Augmented reality	-
dc.subject.keyword	Construction safety	-
dc.subject.keyword	Deep learning	-
dc.subject.keyword	Vision-language models	-
dc.description.isoa	true	-
dc.subject.subarea	Control and Systems Engineering	-
dc.subject.subarea	Civil and Structural Engineering	-
dc.subject.subarea	Building and Construction	-

Related Researcher

Moon, Sung Kon (문성곤): Department of Civil Systems Engineering

File Download

There are no files associated with this item.
