Learning Content and Positional Features for Object Detection via Self-Supervision
In recent years, self-supervised learning (SSL) has made significant progress as a method for extracting useful features from images without requiring human-annotated labels. These approaches have enabled models to achieve strong performance on […]
Who Decides What Should Be Detected? Challenges in OSOD Research and Beyond
The Challenge of Unknown Objects in Detection: Recent advances in object detection have enabled models to accurately detect and classify known objects in images. However, in real-world applications, detectors frequently encounter objects […]
Landslide Image Analysis and Disaster Risk Assessment Using Multimodal AI
In recent years, climate change has increased the frequency of natural disasters worldwide. Among them, landslides are particularly hazardous because they drastically alter terrain, raising the risk of secondary disasters. This makes rapid […]
Zero-shot Texture Anomaly Detection
In recent years, anomaly detection (AD) in images has become increasingly important in industrial inspection and quality control. Detecting anomalies in texture images poses a particular challenge: conventional methods assume the availability of numerous normal images, and their accuracy degrades when the orientations of the input and normal images do not match. Existing approaches […]
Driving Hazard Prediction by Multi-modal AI
In recent years, with the advancement of autonomous driving technologies and advanced driver-assistance systems (ADAS), predicting hazards in the vicinity of vehicles has become a critical issue for safe driving. Conventional methods have relied on video analysis and simulations […]
Bridge Inspection by Multi-modal AI
This paper focuses on enhancing visual question answering (VQA) for bridge inspection using multimodal AI techniques that process both images and natural language. Traditionally, bridge inspections rely on expert visual assessments, which are time-consuming, costly, and sometimes inconsistent. To address these challenges, […]
RefVSR++: Reference-based High-Precision Video Super-Resolution
This study proposes a novel method called RefVSR++ for reference-based video super-resolution (VSR), which leverages the characteristics of multi-camera systems found in modern smartphones to restore low-resolution videos into high-resolution ones. Traditional VSR enhances […]
A Graph Network Approach to Fast Bundle Adjustment for Optimized SLAM
In the fields of Structure-from-Motion (SfM) and visual SLAM (Simultaneous Localization and Mapping), Bundle Adjustment (BA) is a crucial process that optimizes camera poses and the positions of 3D landmarks. In practice, many visual SLAM systems perform BA locally on the most recent keyframes and their associated landmarks to maintain overall system accuracy and tracking […]
High-Speed, High-Precision Visual Localization
Visual localization is a critical task in many computer vision applications such as Structure-from-Motion (SfM) and SLAM, as it involves estimating the 6-DoF camera pose. Traditional approaches extract global features for image retrieval and […]
Unsupervised Domain Adaptation for Semantic Segmentation
This paper proposes a novel method called “Cross-Region Adaptation (CRA)” aimed at improving the accuracy of unsupervised domain adaptation (UDA) for semantic segmentation. Semantic segmentation, which assigns semantic labels to each pixel in an […]