Abstract
The use of reinforcement learning (RL) has led to major advances in robotics. However, data
scarcity, brittle convergence, and the gap between simulated and real-world environments mean that most common RL
approaches are prone to overfitting and fail to generalise to unseen environments. Hardware-agnostic policies would
mitigate this by allowing a single network to operate across a variety of test domains in which the dynamics vary due to changes in
robotic morphology or internal parameters. We build on the observation that learning to adapt a known, successful control policy is
easier and more flexible than jointly learning numerous control policies for different morphologies.
This paper introduces Hardware Agnostic Reinforcement Learning using Adversarial selection (HARL-A). In
this approach, training examples are sampled using a novel adversarial loss function designed to self-regulate
the sampled morphologies based on their learning potential. Simply applying our learning-potential-based loss function to the current state of
the art already yields an improvement of roughly 30% in performance, while experiments with the full implementation
of HARL-A report an average improvement of 70% over a standard RL baseline and 55% over the current state of the art.