Research

This study proposes RefVSR++, a novel method for reference-based video super-resolution (VSR) that leverages the multi-camera systems found in modern smartphones to restore low-resolution videos to high resolution. Traditional VSR enhances resolution by utilizing temporal information from a single low-resolution (LR) video stream. In contrast, multi-camera setups allow the same scene to be captured […]

In the fields of Structure-from-Motion (SfM) and visual SLAM (Simultaneous Localization and Mapping), Bundle Adjustment (BA) is a crucial process that optimizes camera poses and the positions of 3D landmarks. In practice, many visual SLAM systems perform BA locally on the most recent keyframes and their associated landmarks to maintain overall system accuracy and tracking […]

Embodied visual navigation, which is crucial in fields such as autonomous robotics and augmented reality, enables a robot to navigate and search for target objects in an unknown environment while localizing itself. However, existing deep reinforcement learning (RL) approaches often suffer from performance degradation in out-of-distribution environments due to statistical shifts between training and testing […]

Visual localization, the task of estimating the 6-DoF camera pose, is critical in many computer vision applications such as Structure-from-Motion (SfM) and SLAM. Traditional approaches extract global features for image retrieval and local features for precise pose estimation using separate networks. This separation results in high computational costs and significant memory consumption, posing […]

This paper proposes a novel method called “Cross-Region Adaptation (CRA)” aimed at improving the accuracy of unsupervised domain adaptation (UDA) for semantic segmentation. Semantic segmentation, which assigns semantic labels to each pixel in an image, is a critical task that requires a large amount of annotated data for high-precision learning. However, annotating real-world images is […]

This study proposes a novel method for high-precision detection of “logical anomalies” (e.g., misplaced or missing parts, which can only be identified from the overall context of an image) in applications such as industrial inspection. Conventional anomaly detection methods work well for “structural anomalies” (such as cracks or contamination) that are local in nature, but they […]

“Image captioning,” the task of describing the scenery and objects in an image using natural language, is one of the technologies in artificial intelligence that enables visual information to be expressed in words. In recent mainstream approaches, features—informative representations extracted from the image—are first obtained and then used to generate natural-sounding captions. The quality and […]

In recent years, deep learning-based image recognition has expanded into practical applications such as agriculture, fisheries, and livestock management. In these domains, low-cost and low-power systems are often more important than high-speed processing, making single-board computers (SBCs) an appealing platform. However, many lightweight neural networks developed thus far have been designed with smartphones in […]

This paper focuses on enhancing visual question answering (VQA) for bridge inspection using multimodal AI techniques that process both images and natural language. Traditionally, bridge inspections rely on expert visual assessments, which are time-consuming, costly, and sometimes inconsistent. To address these challenges, the authors propose a novel approach that leverages existing bridge inspection reports containing […]

In recent years, anomaly detection (AD) in images has become increasingly important in industrial inspection and quality control. Texture images pose a particular challenge: conventional methods assume the availability of numerous normal images, yet accuracy degrades when the orientations of the input and normal images do not match. Existing approaches […]
