Virtual Learning Environment for Emergent Behaviour AI and Transference into Real-World Application

Steggles, Benjamin (2018) Virtual Learning Environment for Emergent Behaviour AI and Transference into Real-World Application. [USQ Project]


Work by Schulman and colleagues (Schulman, Wolski, Dhariwal, Radford, Klimov 2017), among others, has shown that artificial intelligence (AI) trained in simulation can produce robotic locomotive behaviour. However, how accurately such an AI performs when implemented in an actual robot has not been fully examined. This project attempted to investigate that accuracy. The first component was to investigate and settle on the optimization algorithm and the AI to be used. The conclusion was to use a policy and value function neural network (NN) optimized with proximal policy optimization (PPO). The implementation was sourced from GitHub, where it is titled TRPO (trust region policy optimization), and it effectively develops and optimizes the NN (Coady, P, 2017). The benefit of this code was that it was built specifically to make use of a simulation software package named MuJoCo (Todorov, E, Erez, T, Tassa, Y 2012). It also used OpenAI Gym to develop the AI environment (Brockman, G, Cheung, V, Pettersson, L, Schneider, J, Schulman, J, Tang, J, Zaremba, W 2016). This gave access to pre-built MuJoCo models for a bipedal and a quadrupedal robot. A physical robot was then needed. A quadrupedal Arduino-based robot was chosen because its complexity would likely increase any divergence in the operation of the NN when moved from the simulation to the real world. A MuJoCo model was altered to match the design of the robot and its operation. Separate code was written and interfaced with the TRPO code to provide access to the robot. Observations of the robot are formed from a SHARP infrared height sensor and a 3-axis accelerometer on the robot, together with an OpenCV ball tracking program running on a Raspberry Pi with a camera positioned over the board. The ball tracking uses colour masking and contour detection to identify three balls of different colours, which represent left, right, and a reference point.
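As an illustration of how three tracked ball centroids could be reduced to position and angle information, the following is a minimal sketch and not the project's code. It assumes (the abstract does not say so) that the left and right balls are mounted on the robot, that the reference ball is fixed on the board, and that centroids arrive as pixel coordinates. The function name `robot_pose` and the angle convention are hypothetical.

```python
import math


def robot_pose(left, right, reference):
    """Estimate robot position and orientation from three ball centroids.

    left, right, reference: (x, y) pixel centroids of the tracked balls.
    Assumption: left/right are on the robot, reference is fixed on the board.
    Returns (x, y, angle_deg), where (x, y) is the midpoint of the left and
    right balls relative to the reference ball, and angle_deg is the angle of
    the left-to-right axis measured from the image x-axis.
    """
    # Midpoint between the two robot-mounted markers
    mid_x = (left[0] + right[0]) / 2.0
    mid_y = (left[1] + right[1]) / 2.0

    # Orientation of the left-to-right axis in image coordinates
    angle_deg = math.degrees(math.atan2(right[1] - left[1],
                                        right[0] - left[0]))

    # Express the position relative to the fixed reference marker
    return (mid_x - reference[0], mid_y - reference[1], angle_deg)
```

Measuring position relative to the reference ball, rather than the image origin, keeps the result independent of small shifts in camera placement, which is one plausible reason for tracking a third ball.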
A virtual machine (VM) running Ubuntu was set up to run the NN and the optimization; it acts as a server for transmission control protocol (TCP) communication. The robot is set up as a client that sends sensor data to the VM and receives servo angles. The Raspberry Pi is also set up as a client and only transfers position and angle information to the VM. With this setup it would be possible to transfer and run the NN on the robot. Unfortunately, this was not achieved: time was the major factor, and solvable issues were left unaddressed because there was not enough time to address them. The only run that could be completed was a 100,000-episode run of the simulation with optimization performed every 20 episodes. This run showed promising signs that the model was optimizing; however, locomotive behaviour was not achieved. Achieving locomotive behaviour may have required a higher episode count along with smaller batches before each optimization. In the test run of the robot, the main issue was that the wooden board on which the robot moved made it impossible to identify the red ball. The solution was to paint the board black, but time constraints meant this could not be attempted. The other issue was that the NN did not provide effective locomotive behaviour. It is suggested that any attempt to complete these tests would require updating the MuJoCo model of the robot, increasing the episode count, and decreasing the episode batch size before optimization. Hopefully this paper provides enough inspiration for this issue to be examined in closer detail.
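The client/server exchange described above, in which the robot sends sensor data to the VM and receives servo angles, might be sketched as follows. This is an illustrative sketch only: the newline-delimited JSON message format, the field names (`height_mm`, `accel`, `servo_angles`), the function names, and the placeholder policy are all assumptions, not details taken from the project.

```python
import json
import socket
import threading


def serve_once(server_sock, policy):
    """Server (VM) side: accept one client, read one observation, reply."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("rw", encoding="utf-8", newline="\n") as stream:
        observation = json.loads(stream.readline())
        # policy is a stand-in for the trained NN (hypothetical here)
        reply = {"servo_angles": policy(observation)}
        stream.write(json.dumps(reply) + "\n")
        stream.flush()


def request_angles(host, port, observation):
    """Client (robot) side: send sensor data, return the servo angles."""
    with socket.create_connection((host, port), timeout=5) as sock:
        with sock.makefile("rw", encoding="utf-8", newline="\n") as stream:
            stream.write(json.dumps(observation) + "\n")
            stream.flush()
            return json.loads(stream.readline())["servo_angles"]


def demo():
    """Run one request/response round trip on localhost."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    # Placeholder policy: always returns two neutral angles (hypothetical)
    def policy(observation):
        return [90.0, 90.0]

    worker = threading.Thread(target=serve_once, args=(server, policy))
    worker.start()
    angles = request_angles(
        "127.0.0.1", port,
        {"height_mm": 55.0, "accel": [0.0, 0.0, 9.8]},
    )
    worker.join()
    server.close()
    return angles
```

Framing each message as a single line of JSON is one simple way to delimit messages on a TCP stream, which has no built-in message boundaries; a real robot link would also need reconnection and timeout handling.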

Statistics for USQ ePrint 40655
Item Type: USQ Project
Item Status: Live Archive
Additional Information: Bachelor of Engineering (Honours)(Mechatronics)
Faculty/School / Institute/Centre: Historic - Faculty of Health, Engineering and Sciences - School of Mechanical and Electrical Engineering (1 Jul 2013 - 31 Dec 2021)
Supervisors: Low, Tobias
Date Deposited: 15 Sep 2021 05:25
Last Modified: 15 Sep 2021 05:25
