Language-Led Visual Grounding and Future Possibilities

Zezhou Sui; Mian Zhou; Zhikun Feng; Angelos Stefanidis; Nan Jiang

doi:10.3390/electronics12143142

Zezhou Sui, Mian Zhou*, Zhikun Feng, Angelos Stefanidis, Nan Jiang

*Corresponding author for this work

School of AI and Advanced Computing

Research output: Contribution to journal › Article › peer-review

Abstract

In recent years, with the rapid development of computer vision technology and the popularity of intelligent hardware, as well as the increasing demand for human–machine interaction in intelligent products, visual localization technology can help machines and humans to recognize and locate objects, thereby promoting human–machine interaction and intelligent manufacturing. At the same time, human–machine interaction is constantly evolving and improving, becoming increasingly intelligent, humanized, and efficient. In this article, a new visual localization model is proposed, and a language validation module is designed to use language information as the main information to increase the model’s interactivity. In addition, we also list the future possibilities of visual localization and provide two examples to explore the application and optimization direction of visual localization and human–machine interaction technology in practical scenarios, providing reference and guidance for relevant researchers and promoting the development and application of visual localization and human–machine interaction technology.

Original language	English
Article number	3142
Journal	Electronics (Switzerland)
Volume	12
Issue number	14
DOIs	https://doi.org/10.3390/electronics12143142
Publication status	Published - Jul 2023

Keywords

human–computer interaction
intelligent systems
interaction design
user experience
visual grounding

Cite this

@article{733e779be6f04cc28c43f3dde8ec9cd5,
  title = "Language-Led Visual Grounding and Future Possibilities",
  abstract = "In recent years, with the rapid development of computer vision technology and the popularity of intelligent hardware, as well as the increasing demand for human–machine interaction in intelligent products, visual localization technology can help machines and humans to recognize and locate objects, thereby promoting human–machine interaction and intelligent manufacturing. At the same time, human–machine interaction is constantly evolving and improving, becoming increasingly intelligent, humanized, and efficient. In this article, a new visual localization model is proposed, and a language validation module is designed to use language information as the main information to increase the model{\textquoteright}s interactivity. In addition, we also list the future possibilities of visual localization and provide two examples to explore the application and optimization direction of visual localization and human–machine interaction technology in practical scenarios, providing reference and guidance for relevant researchers and promoting the development and application of visual localization and human–machine interaction technology.",
  keywords = "human–computer interaction, intelligent systems, interaction design, user experience, visual grounding",
  author = "Zezhou Sui and Mian Zhou and Zhikun Feng and Angelos Stefanidis and Nan Jiang",
  note = "Publisher Copyright: {\textcopyright} 2023 by the authors.",
  year = "2023",
  month = jul,
  doi = "10.3390/electronics12143142",
  language = "English",
  volume = "12",
  journal = "Electronics (Switzerland)",
  issn = "2079-9292",
  number = "14",
}

TY  - JOUR
T1  - Language-Led Visual Grounding and Future Possibilities
AU  - Sui, Zezhou
AU  - Zhou, Mian
AU  - Feng, Zhikun
AU  - Stefanidis, Angelos
AU  - Jiang, Nan
PY  - 2023/7
Y1  - 2023/7
N2  - In recent years, with the rapid development of computer vision technology and the popularity of intelligent hardware, as well as the increasing demand for human–machine interaction in intelligent products, visual localization technology can help machines and humans to recognize and locate objects, thereby promoting human–machine interaction and intelligent manufacturing. At the same time, human–machine interaction is constantly evolving and improving, becoming increasingly intelligent, humanized, and efficient. In this article, a new visual localization model is proposed, and a language validation module is designed to use language information as the main information to increase the model’s interactivity. In addition, we also list the future possibilities of visual localization and provide two examples to explore the application and optimization direction of visual localization and human–machine interaction technology in practical scenarios, providing reference and guidance for relevant researchers and promoting the development and application of visual localization and human–machine interaction technology.
AB  - In recent years, with the rapid development of computer vision technology and the popularity of intelligent hardware, as well as the increasing demand for human–machine interaction in intelligent products, visual localization technology can help machines and humans to recognize and locate objects, thereby promoting human–machine interaction and intelligent manufacturing. At the same time, human–machine interaction is constantly evolving and improving, becoming increasingly intelligent, humanized, and efficient. In this article, a new visual localization model is proposed, and a language validation module is designed to use language information as the main information to increase the model’s interactivity. In addition, we also list the future possibilities of visual localization and provide two examples to explore the application and optimization direction of visual localization and human–machine interaction technology in practical scenarios, providing reference and guidance for relevant researchers and promoting the development and application of visual localization and human–machine interaction technology.
KW  - human–computer interaction
KW  - intelligent systems
KW  - interaction design
KW  - user experience
KW  - visual grounding
UR  - http://www.scopus.com/inward/record.url?scp=85175118092&partnerID=8YFLogxK
U2  - 10.3390/electronics12143142
DO  - 10.3390/electronics12143142
M3  - Article
AN  - SCOPUS:85175118092
SN  - 2079-9292
VL  - 12
JO  - Electronics (Switzerland)
JF  - Electronics (Switzerland)
IS  - 14
M1  - 3142
ER  -
