Tag

Robotics

193 articles archived under #robotics · RSS

arXiv — Machine Learning research 2h ago

HydraCollab: Adaptive Collaborative-Perception for Distributed Autonomous Systems

arXiv:2607.00191v1 Announce Type: cross Abstract: Collaborative-perception enables multi-robot systems to enhance situational awareness by sharing perceptual information. Existing collaborative-perception systems face an inherent trade-off between communication bandwidth…

22
Hugging Face Daily Papers research 13h ago

Play2Perfect: What Matters in Dexterous Play Pretraining for Precise Assembly?

Abstract: A reinforcement learning framework called Play2Perfect enables sample-efficient robotic assembly tasks by first learning general manipulation skills through playful interaction with diverse objects, then adapting these skills for precise assembly through fine-tuning.…

34
Hugging Face Daily Papers research 17h ago

Goku: A Million-Scale Universal Dataset and Benchmark for Instruction-Based Video Editing

Abstract: A large-scale video editing dataset and model are introduced that support multi-task and structural manipulations through advanced data synthesis and network architectures. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Existing instruction-based video editing datasets…

38
Hugging Face Daily Papers research 21h ago

Scenes as Objects, Not Primitives: Instance-Structured 3D Tokenization from Unposed Views

Abstract: A feed-forward framework decomposes 3D scenes into instance-structured token groups from multi-view images, enabling direct object-level reconstruction, segmentation, and manipulation without 3D annotations. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) A 3D scene is…

38
arXiv — Machine Learning research 1d ago

Warp RL: Reshaping Base Policy Distributions for Dynamics Adaptation

arXiv:2606.31043v1 Announce Type: new Abstract: Residual reinforcement learning adapts a pretrained robot policy by learning an additive correction to its actions. While effective when adaptation amounts to shifting the base policy's action distribution, additive corrections…

26
arXiv — NLP / Computation & Language research 1d ago

ViTL: Temporal Logic-Guided Zero-Shot Natural Language Navigation via Vision-Language Models

arXiv:2606.30696v1 Announce Type: cross Abstract: Enabling robots to follow natural language commands to complete zero-shot long-horizon tasks remains challenging. It requires extracting implicit temporal and logical constraints from natural language commands and executing…

4
arXiv — NLP / Computation & Language research 1d ago

RCT: A Robot-Collected Touch-Vision-Language Dataset for Tactile Generalization

arXiv:2606.31694v1 Announce Type: cross Abstract: For robots manipulating open-world objects, tactile representations must generalize to unseen materials. We introduce RCT (Robotic Contact Tactile), a robot-collected touch-vision-language dataset with 29,279 tactile frames from…

18
Hugging Face Daily Papers research 1d ago

Drop-Then-Recovery: How Redundant Are Vision-Language-Action Models?

Abstract: Research reveals that language backbones in Vision-Language-Action models are highly redundant for robotic manipulation tasks, while vision and action pathways are more critical, suggesting a need for deliberate capacity allocation in future architectures. Generated by…

11
Hugging Face Daily Papers research 1d ago

Learning Transferable Dynamics Priors from Action to World Modeling

Abstract: Action-conditioned world modeling enables transferable dynamics priors for robot learning through pretraining on large-scale manipulation data, supporting both simulator-based policy evaluation and video-action prediction. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) We…

27
arXiv — Machine Learning research 2d ago

A Linear Matching Bandit Approach to Online Multi-Human Multi-Robot Teaming

arXiv:2606.29221v1 Announce Type: new Abstract: We address the problem of online multi-human multi-robot teaming through the lens of a linear matching bandit framework, where a learner assigns robots with unknown features from a fixed pool to distinct sets of human agents over…

15
Ars Technica — AI news-outlet 2d ago

South Korea to spend $1T on more memory chip production and humanoid robots

South Korea targets physical AI lead and commercial humanoid robots by 2028.

9
r/MachineLearning community 2d ago

I do historical swordfighting and noticed AI struggles to track it. I’m building an open dataset to help fix this. Does my schema make sense? [P]

Hi everyone, I’m a historical swordfighter (HEMA practitioner), and while I’m not a computer vision engineer or a roboticist, I’ve been reading a lot about the current bottlenecks in embodied AI, specifically around the Sim2Real gap and thin-object tracking. It occurred to me…

18
TechCrunch — AI news-outlet 2d ago

Robot hand company settles Tesla trade secret suit and announces $11M raise

Jay Li doesn’t recommend getting sued by Tesla if you’re trying to get a startup off the ground. But he does think his company, Proception, might be better off for having endured the experience. “I think it’s kind of like a resilience test, or pressure…

15
Import AI (Jack Clark) community 2d ago

Import AI 463: Self-improving robots; a 10k Chinese GPU cluster; and an elegiac essay for the human era

What eras bookend our interregnum?

36
arXiv — Machine Learning research 3d ago

Support-Constrained RL Enables Real-World Policy Improvement without Real-World Experience

arXiv:2606.27475v1 Announce Type: cross Abstract: Robots trained on real world data tend to be imprecise, slow, and brittle to perturbations. Improving these policies with reinforcement learning (RL) is an appealing alternative, but this process often requires expensive training…

28
arXiv — Machine Learning research 3d ago

Physics-Guided Robotic Radiation Source Localization along Arbitrary Measurement Paths in Unstructured Environments

arXiv:2606.27624v1 Announce Type: cross Abstract: Using robots to estimate the location of the radiation source is an effective way to improve efficiency and safety. Existing methods focus on planning the robot's path to achieve precise estimation, typically approaching the…

19
MIT News — AI research 5d ago

LLMs help robots understand vague instructions and focus on key details

To help robots do chores in places like homes and factories, a new approach from MIT uses one language model to clarify users’ instructions, then another to ignore irrelevant info.

19
arXiv — Machine Learning research 6d ago

Revisiting Action Factorization for Complex Action Spaces

arXiv:2606.26574v1 Announce Type: new Abstract: Many real-world control problems involve hybrid discrete-continuous action spaces. For example, steering and signaling in autonomous driving, and aiming and firing in robotics or video-games. Despite real-world hybrid factorization…

10
arXiv — NLP / Computation & Language research 6d ago

Charting the Growth of Social-Physical HRI (spHRI): A Systematic Review Pipeline Augmented by Small Language Models

arXiv:2606.26382v1 Announce Type: new Abstract: Social-physical human-robot interaction (spHRI) has grown rapidly across robotics, human-computer interaction, human-robot interaction, and haptics. Yet, fragmented terminology and inconsistent methodologies make systematic…

35
Hugging Face Daily Papers research 6d ago

In-Context World Modeling for Robotic Control

Abstract: ICWM enables robot policies to infer system variables from self-generated interactions, allowing adaptation to novel configurations without parameter updates by treating system identification as an in-context adaptation problem. Generated by…

8
arXiv — NLP / Computation & Language research 7d ago

RAVEN: Long-Horizon Reasoning & Navigation with a Visuo-Spatio-Temporal Memory

arXiv:2606.25206v1 Announce Type: cross Abstract: Long-term robot deployment requires a compact and scalable memory that preserves fine-grained visual semantics, grounds observations in space and time, and enables efficient storage and retrieval. In this paper, we propose RAVEN,…

21
Hugging Face Daily Papers research 7d ago

EBench: Elemental Diagnosis of Generalist Mobile Manipulation Policies

Abstract: EBench is a comprehensive simulation benchmark for evaluating generalist mobile manipulation policies across diverse tasks and dimensions, revealing distinct capability profiles and generalization patterns among state-of-the-art models. Generated by…

18
Hugging Face Daily Papers research 7d ago

InSight: Self-Guided Skill Acquisition via Steerable VLAs

Abstract: InSight enables autonomous skill acquisition for vision-language-action models through primitive-action level steerability and automated demonstration generation. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Vision-language-action (VLA) models can learn manipulation…

19
TechCrunch — AI news-outlet 7d ago

Agility Robotics plans to go public via SPAC in a $2.5B deal

Agility Robotics, the humanoid robotics startup that spun out of Oregon State University in 2015, expects to generate $620 million in proceeds.

13
NVIDIA Developer Blog official-blog 7d ago

Accelerating BEV Pooling on NVIDIA GPUs for Physical AI Applications

An increasingly common design pattern for autonomous vehicles (AVs), robotics, and spatial AI systems is bird's-eye-view (BEV) perception. BEV models project...

31
Hugging Face Daily Papers research 7d ago

EventVLA: Event-Driven Visual Evidence Memory for Long-Horizon Vision-Language-Action Policies

Abstract: EventVLA addresses long-horizon robotic manipulation challenges by introducing a sparse visual evidence memory framework with visual anchors and a dynamic Keyframe Evidence Memory module for improved task performance. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Memory…

23
Hugging Face Daily Papers research 7d ago

World Value Models for Robotic Manipulation

Abstract: World Value Model combines world models with value estimation to provide accurate task progression assessment and improve robotic policy learning from mixed-quality data. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Generalist value models play a pivotal role in scaling…

6
arXiv — Machine Learning research 8d ago

Verifiable Foundation Models for Robot Safety

arXiv:2606.23754v1 Announce Type: cross Abstract: Deploying foundation models for robot control raises a central challenge: the expressive power that enables rich, multimodal perception also makes these models opaque and difficult to analyze formally, rendering them intractable…

4
arXiv — Machine Learning research 8d ago

RE4: Transformation-aware Imitation of Object Interactions Using Manipulation Modes

arXiv:2606.24403v1 Announce Type: cross Abstract: Object interaction tasks have been a focus of advances in imitation learning. End-to-end methods, dominated by diffusion and flow-based variants have shown leaps in performance while sacrificing interpretability. Object-centric…

23
Hugging Face Daily Papers research 8d ago

ShotcreteDepth: A Bi-modal Dataset for Robust Robotic Depth Perception in Shotcrete Construction Environments

Abstract: A bi-modal construction domain dataset combining stereo RGB and LiDAR data under challenging environmental conditions is introduced for autonomous system perception research. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) We introduce ShotcreteDepth, a bi-modal dataset…

22
Hugging Face Daily Papers research 8d ago

Foresight: Failure Detection for Long-Horizon Robotic Manipulation with Action-Conditioned World Model Latents

Abstract: A failure detection framework for long-horizon robotic tasks uses action-conditioned world models and functional conformal prediction to monitor manipulation trajectories with only final task labels. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Long-horizon tasks are…

8
Hugging Face Daily Papers research 9d ago

PoLAR: Factorizing Extent and Mode in Latent Actions for Robot Policy Learning

Abstract: PoLAR introduces a geometrically structured latent action representation in hyperbolic space that separates transition extent from transition mode, improving robotic policy learning performance. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Latent action pretraining…

12
MIT News — AI research 9d ago

New chip could help tiny robots traverse complex environments

Researchers combined an efficient algorithm with dedicated hardware to rapidly generate 3D maps for navigation using minimal memory and power.

10
Ars Technica — AI news-outlet 9d ago

GM installs robots at flagship EV factory after laying off 1,300 workers

US autoworkers union warns of robot automation as dark factory future looms.

23
NVIDIA Developer Blog official-blog 9d ago

Inside NVIDIA Halos for Robotics: A Full-Stack Functional Safety System for Physical AI

Physical AI—robots working autonomously alongside people in factories, warehouses, hospitals, and homes—is arriving faster than most expected. Traditional...

12
Hugging Face Daily Papers research 9d ago

GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning

Abstract: GeneralVLA-2 addresses limitations in vision-language-action systems by introducing GeoFuse-MV3D for improved 3D reconstruction and an enhanced KnowledgeBank for better memory management in robotic manipulation tasks. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.)…

32
Hacker News — AI on Front Page community 12d ago

Hyundai buys Boston Dynamics

Article URL: https://startupfortune.com/hyundai-takes-full-control-of-boston-dynamics-as-softbank-exits-for-325-million/ Comments URL: https://news.ycombinator.com/item?id=48600312 Points: 227 # Comments: 118

12
Hugging Face Daily Papers research 12d ago

ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?

Abstract: ImageWAM demonstrates that pretrained image editing models can effectively replace video generation in world action models for robot control, achieving better performance with reduced computational costs. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) World Action Models…

25
Hugging Face Daily Papers research 12d ago

ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

Abstract: The ENPIRE framework enables autonomous robotics research through a closed-loop system that automates policy improvement via environment feedback, policy refinement, and evolutionary code optimization. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Achieving dexterous robotic…

27
Hugging Face Daily Papers research 12d ago

DragMesh-2: Physically Plausible Dexterous Hand-Object Interaction with Articulated Objects

Abstract: DragMesh-2 enables dexterous hand-object interaction through contact-driven manipulation, with PICA enhancing robustness under varying contact loads without tactile feedback. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Dexterous interaction with articulated objects is…

19
Hugging Face Daily Papers research 12d ago

HumanScale: Egocentric Human Video Can Outperform Real-Robot Data for Embodied Pretraining

Abstract: Egocentric human video can effectively replace teleoperated robot trajectories for embodied model pretraining, achieving better performance with reduced data collection costs. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Embodied foundation models are expected to…

22
Hugging Face Daily Papers research 13d ago

Playful Agentic Robot Learning

Abstract: Embodied robots learn reusable skills through self-directed play and exploration, then apply these skills to improve performance on downstream tasks without additional training. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Current agentic robot systems can write…

4
Hugging Face Daily Papers research 13d ago

Reinforcement Learning-Guided Retrieval with Soft Fusion for Robust Multimodal Imitation Learning under Missing Modalities

Abstract: RL4IL enables robust robotic manipulation under sensor dropout by using reinforcement learning to retrieve relevant demonstrations and cross-attention fusion to impute missing modalities without retraining. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Robotic systems…

23
r/LocalLLaMA community 13d ago

My suitcase robot gets high now off a real gas sensor wired straight into the LLM sampler. Smoke raises temperature/top_p/top_k live, so his speech genuinely gets loopier and never repeats.

Follow-up on Sparky, my offline suitcase robot I keep overdeveloping. He gets high now, and there's no scripted "stoned mode" anywhere in it. A real MQ-2 gas sensor sits in the case. Every 0.5s I read it against an adaptive clean-air baseline and turn a smoke hit into a 0 to 10…

30
Hugging Face Daily Papers research 13d ago

MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction

Abstract: A 3D point motion forecasting model predicts object trajectories from visual history and language goals, demonstrating superior performance on benchmarks and transferring effectively to robot manipulation and video generation tasks. Generated by…

4
Hugging Face Daily Papers research 13d ago

PAIWorld: A 3D-Consistent World Foundation Model for Robotic Manipulation

Abstract: PAIWorld enhances diffusion-transformer world models with geometric awareness and cross-view attention to improve multi-view 3D consistency for robotic manipulation tasks. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) World foundation models (WFMs) are powerful…

18
arXiv — Machine Learning research 14d ago

Stealthy World Model Manipulation via Data Poisoning

arXiv:2606.18697v1 Announce Type: new Abstract: Model-based learning agents use learned world models to predict future states, plan actions, and adapt to new environments. However, the process of updating world models from collected experience creates a training-time attack…

18
arXiv — Machine Learning research 14d ago

Strategic Feature Selection

arXiv:2606.18867v1 Announce Type: new Abstract: When algorithmic predictors inform resource allocation in high-stakes domains such as healthcare, these predictors must account for strategic manipulation of input features. The typical solution is to redesign the predictor itself…

35
Hugging Face Daily Papers research 14d ago

Kairos: A Native World Model Stack for Physical AI

Abstract: Kairos is a native world model framework that learns from diverse experiences, maintains persistent states through hybrid temporal attention, and supports efficient deployment for physical AI applications. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) World models are…

33
Hugging Face Daily Papers research 14d ago

Guava: An Effective and Universal Harness for Embodied Manipulation

Abstract: A harness framework for embodied tool use combines high-level reasoning with external modules, enabling compact models to perform complex manipulation tasks with minimal training data. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Language models trained on large-scale…

15
