Article

Hyperrectangle Embedding for Debiased 3D Scene Graph Prediction From RGB Sequences

Mingtao Feng, School of Artificial Intelligence, Xidian University, Xi’an, China
Chan Kit Yan, School of Artificial Intelligence, Xidian University, Xi’an, China
Zijie Wu, College of Electrical and Information Engineering, Hunan University, Changsha, China
Weisheng Dong, School of Artificial Intelligence, Xidian University, Xi’an, China
Yaonan Wang, College of Electrical and Information Engineering, Hunan University, Changsha, China
Ajmal Mian, Department of Computer Science and Software Engineering, University of Western Australia, Perth, WA, Australia

2025, en
ABI

Abstract

The 3D scene graph has emerged as a powerful high-level representation of the environment and is regarded as a prerequisite for long-term autonomous robotic operations. A practical research problem is to predict the 3D scene graph from sequentially captured data. However, existing methods neglect the polysemy of semantic roles: coarse feature vectors are insufficient to represent an entity across different relationship semantics, which severely limits their ability to predict relationships. We tackle this challenge by introducing a novel representation, the hyperrectangle embedding, which represents each entity with a distinctive geometry for more effective scene understanding, rather than learning vector-based features of blindly increasing dimension. By representing an entity with two affine-transformed embeddings, one for its role as subject and one for its role as object, each with its own learnable transformation, we capture the polysemy of semantic roles. The intersections of the affine-transformed hyperrectangle embeddings represent the bidirectional relationship between two entities. We identify bias and reliability as two challenges impeding the model learning process. To address the bias, which arises from long-tailed distributions in the data, we propose a history-guided debiasing strategy that uses a confusion history block composed of previous hyperrectangle embeddings. This strategy mitigates inherent biases by extracting pertinent information and facilitating knowledge transfer from dominant categories to rare ones. To enhance the reliability of predictions, we introduce predictive uncertainty into the 3D scene graph prediction task and develop a post-hoc reliability enhancement strategy that identifies potentially unreliable predictions and subsequently improves the model's predictive accuracy. Extensive experiments on the 3DSSG dataset show the effectiveness of the proposed method on this challenging task, outperforming existing state-of-the-art methods.
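
The geometric idea in the abstract (one hyperrectangle per entity, a separate affine transformation for the subject and object roles, and an intersection standing for a directed relationship) can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not give the parameterization, so the per-dimension scale-and-shift transform, the names Box, affine, intersect and overlap_volume, and all numeric values below are assumptions chosen only for illustration. In a trained model the transforms would be learned rather than fixed by hand.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Optional


@dataclass
class Box:
    """Axis-aligned hyperrectangle: per-dimension lower and upper bounds."""
    lo: List[float]
    hi: List[float]


def affine(box: Box, scale: List[float], shift: List[float]) -> Box:
    """Map every coordinate x to scale * x + shift (assumes scale > 0)."""
    return Box(
        [s * l + t for s, l, t in zip(scale, box.lo, shift)],
        [s * h + t for s, h, t in zip(scale, box.hi, shift)],
    )


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping hyperrectangle, or None if the boxes are disjoint."""
    lo = [max(x, y) for x, y in zip(a.lo, b.lo)]
    hi = [min(x, y) for x, y in zip(a.hi, b.hi)]
    if any(l >= h for l, h in zip(lo, hi)):
        return None
    return Box(lo, hi)


def overlap_volume(subject: Box, obj: Box) -> float:
    """Volume of the intersection of a subject-role box and an object-role box."""
    inter = intersect(subject, obj)
    if inter is None:
        return 0.0
    return prod(h - l for l, h in zip(inter.lo, inter.hi))


# Hypothetical role-specific transforms (hand-picked here, learnable in practice).
SUBJECT_SCALE, SUBJECT_SHIFT = [1.0, 1.0], [0.0, 0.0]
OBJECT_SCALE, OBJECT_SHIFT = [0.5, 0.5], [1.0, 1.0]

# Two hypothetical entities embedded as 2D boxes.
chair = Box([0.0, 0.0], [2.0, 2.0])
floor = Box([1.0, 1.0], [3.0, 3.0])

# Direction 1: chair as subject, floor as object.
chair_to_floor = overlap_volume(
    affine(chair, SUBJECT_SCALE, SUBJECT_SHIFT),
    affine(floor, OBJECT_SCALE, OBJECT_SHIFT),
)

# Direction 2: floor as subject, chair as object.
floor_to_chair = overlap_volume(
    affine(floor, SUBJECT_SCALE, SUBJECT_SHIFT),
    affine(chair, OBJECT_SCALE, OBJECT_SHIFT),
)

print(chair_to_floor, floor_to_chair)
```

Because each entity is transformed differently depending on its role, the two directions yield different intersections (here with different volumes), which is one way a single entity embedding could support asymmetric, bidirectional relationships.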



Citations and sources

3 citations, 0 references used