Autonomous Agents & Reinforcement Learning Simulation

Designed and evaluated learning-based autonomous agents competing to optimise task performance in a simulated environment.

Overview

This project was a Master’s-level academic AI assignment focused on the design, implementation, and evaluation of autonomous learning agents operating within a controlled simulation environment.

The system simulated three independent agents competing to clean a grid-based environment containing randomly distributed dust. Each agent followed a different behavioural strategy and was evaluated on its ability to learn, adapt, and optimise performance over time through reward and punishment mechanisms.


Core Objectives

The project was guided by several learning and experimentation goals:

  • model autonomous agents interacting with an environment
  • apply reward–punishment mechanisms to drive behavioural improvement
  • compare multiple learning strategies under identical conditions
  • visualise agent behaviour and environment state over time
  • measure performance objectively across repeated iterations
  • design the system for experimentation rather than one-off execution

Simulation Environment

The environment was implemented as a discrete grid world, containing:

  • dynamic dust distribution
  • agent position, movement, and action constraints
  • scoring and feedback mechanisms
  • iteration-based execution to observe learning effects

The simulation was visualised using Pygame, allowing real-time observation of agent decisions, movement patterns, and overall system behaviour.
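As an illustration of the kind of environment described above (a hypothetical sketch, not the project's actual code; class and method names are invented for this example), a minimal discrete grid world with randomly distributed dust and reward/penalty feedback might look like:

```python
import random

class GridWorld:
    """Minimal grid world with randomly distributed dust.

    Illustrative sketch only: dimensions, dust ratio, and reward
    values are assumed, not taken from the original project.
    """

    def __init__(self, width=10, height=10, dust_ratio=0.25, seed=None):
        self.width, self.height = width, height
        rng = random.Random(seed)
        cells = [(x, y) for x in range(width) for y in range(height)]
        self.dust = set(rng.sample(cells, int(len(cells) * dust_ratio)))

    def in_bounds(self, pos):
        x, y = pos
        return 0 <= x < self.width and 0 <= y < self.height

    def step(self, pos, action):
        """Apply a move or 'clean' action; return (new_pos, reward)."""
        if action == "clean":
            if pos in self.dust:
                self.dust.discard(pos)
                return pos, 1.0       # reward: effective cleaning
            return pos, -0.5          # penalty: cleaning a clean cell
        moves = {"up": (0, -1), "down": (0, 1),
                 "left": (-1, 0), "right": (1, 0)}
        dx, dy = moves[action]
        new_pos = (pos[0] + dx, pos[1] + dy)
        if not self.in_bounds(new_pos):
            return pos, -0.2          # penalty: bumping into a wall
        return new_pos, -0.05         # small cost per move

    def done(self):
        return not self.dust
```

Iteration-based execution then amounts to running agents against fresh instances of this world until `done()` returns true, recording the feedback each step.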


Agent Design & Learning Behaviour

Three distinct agents were implemented, each embodying a different behavioural or learning strategy.

Key characteristics included:

  • action selection influenced by prior outcomes
  • reward for effective cleaning actions
  • penalties for inefficient or counterproductive behaviour
  • iterative improvement across simulation runs

The focus was not on complex neural models but on clear, interpretable learning logic that kept agent behaviour observable and explainable.
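One interpretable scheme in this spirit (a hypothetical sketch, not one of the assignment's actual agents) is tabular ε-greedy action selection, where each action's estimate is simply the average reward it has earned in a given state:

```python
import random
from collections import defaultdict

class TabularAgent:
    """Selects actions by their average observed reward per state.

    Illustrative sketch: reward raises an action's estimate,
    punishment lowers it, so behaviour shifts toward effective
    cleaning over repeated runs -- and every estimate is inspectable.
    """

    def __init__(self, actions, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.epsilon = epsilon                 # exploration rate
        self.rng = random.Random(seed)
        self.totals = defaultdict(float)       # (state, action) -> summed reward
        self.counts = defaultdict(int)         # (state, action) -> visit count

    def value(self, state, action):
        n = self.counts[(state, action)]
        return self.totals[(state, action)] / n if n else 0.0

    def choose(self, state):
        # Explore occasionally; otherwise exploit the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value(state, a))

    def learn(self, state, action, reward):
        self.totals[(state, action)] += reward
        self.counts[(state, action)] += 1
```

Because the learned values are plain per-state averages, the agent's preferences can be printed and inspected at any point, which is exactly what makes this style of learning explainable.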


Architecture & Code Structure

The system was designed with modularity and clarity in mind:

  • clean separation between environment, agents, and evaluation logic
  • reusable components to support additional agent strategies
  • clear interfaces for experimentation and parameter tuning
  • unit tests validating core logic and behavioural rules

This allowed agents to be swapped, compared, and extended without destabilising the system.
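The separation described above could be expressed, for instance, as a small abstract interface that every strategy implements (an illustrative sketch; the names are assumptions, not the project's real API):

```python
import random
from abc import ABC, abstractmethod

class Agent(ABC):
    """Common interface so strategies can be swapped or compared
    without touching environment or evaluation code."""

    @abstractmethod
    def choose(self, state):
        """Return the next action for the given state."""

    @abstractmethod
    def learn(self, state, action, reward):
        """Update internal estimates from the observed reward."""

class RandomAgent(Agent):
    """Non-learning baseline: ignores feedback entirely."""

    def __init__(self, actions, seed=None):
        self.actions = list(actions)
        self.rng = random.Random(seed)

    def choose(self, state):
        return self.rng.choice(self.actions)

    def learn(self, state, action, reward):
        pass  # no learning: serves as a control strategy
```

A shared interface like this is also what makes unit testing straightforward: each strategy can be exercised against the same scripted environment states in isolation.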


Evaluation & Comparison

Agent performance was evaluated across multiple runs against a consistent set of metrics:

  • time to clean the environment
  • efficiency of movement
  • learning improvement across iterations
  • stability and predictability of behaviour

Results were analysed comparatively to understand the strengths and limitations of each approach.
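A comparative analysis over repeated iterations can be reduced to a few summary statistics per agent. The helper below is a hypothetical sketch of that aggregation (not the project's actual analysis code), taking the steps-to-clean count from each run in chronological order:

```python
from statistics import mean

def summarise_runs(steps_per_run):
    """Aggregate per-run step counts into comparison metrics.

    Illustrative sketch. 'improvement' compares the mean of the
    early half of runs against the late half: a positive value
    means the agent needed fewer steps later, i.e. it learned.
    """
    half = len(steps_per_run) // 2
    early, late = steps_per_run[:half], steps_per_run[half:]
    return {
        "mean_steps": mean(steps_per_run),                   # time to clean
        "best": min(steps_per_run),                          # peak efficiency
        "improvement": mean(early) - mean(late),             # learning effect
        "spread": max(steps_per_run) - min(steps_per_run),   # stability
    }
```

Running the same summary for each agent over identical environment seeds yields directly comparable numbers, which is what the side-by-side evaluation relies on.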


Timeframe & Context

  • Duration: ~7 months
  • Context: Master’s degree academic project
  • Focus: AI fundamentals, learning systems, experimentation
  • Constraints: Explainability, reproducibility, academic rigour

The project coincided with growing public interest in AI systems, reinforcing the importance of understanding learning mechanisms at a foundational level.


Skills Demonstrated

This project highlights skills in:

  • AI system design
  • reinforcement learning fundamentals
  • simulation modelling
  • autonomous agent behaviour
  • experimental comparison and evaluation
  • Python-based architecture and visualisation
  • writing clean, testable, and extensible code

Why This Project Matters

Rather than focusing on black-box models, this project emphasised:

  • how agents learn
  • why behaviour changes
  • what trade-offs different strategies introduce

It reflects an engineering mindset that values understanding and control alongside performance — a perspective that remains crucial even as AI tooling evolves.


Final Note

This project is fully academic and not subject to confidentiality constraints.

It represents an early, structured engagement with learning systems and autonomous decision-making, forming a strong conceptual foundation for later work involving more advanced AI tools and frameworks.