Replication of Multi-Agent Reinforcement Learning for “Hide & Seek” Problem / (Record no. 594849)

000 -LEADER
fixed length control field 01860nam a22001697a 4500
003 - CONTROL NUMBER IDENTIFIER
control field NUST
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 005.1,KAM
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name Kamal, Muhammad Haider
245 ## - TITLE STATEMENT
Title Replication of Multi-Agent Reinforcement Learning for “Hide & Seek” Problem /
Statement of responsibility, etc. Muhammad Haider Kamal,
264 ## - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
Place of production, publication, distribution, manufacture Rawalpindi
Name of producer, publisher, distributor, manufacturer MCS, NUST
Date of production, publication, distribution, manufacture, or copyright notice 2023
300 ## - PHYSICAL DESCRIPTION
Extent viii, 79 p
505 ## - FORMATTED CONTENTS NOTE
Formatted contents note Reinforcement learning generates policies based on reward functions and hyperparameters, and slight changes in these can significantly affect results. The lack of documentation and reproducibility in reinforcement learning research makes it difficult to replicate once-deduced strategies. While previous research has identified strategies using grounded maneuvers, there is limited work in more complex environments. The agents in this study are simulated similarly to OpenAI's hide-and-seek agents, with the addition of a flying mechanism that enhances their mobility and expands their range of possible actions and strategies. This added functionality reduces the steps needed for the agents to develop the chasing strategy from approximately 2 million to 1.6 million, and the hiders' shelter strategy from approximately 25 million to 2.3 million steps, while using a smaller batch size of 3,072 instead of 64,000. We also discuss the importance of reward function design and deployment in a curriculum-based environment to encourage agents to learn basic skills, along with the challenges in replicating these reinforcement learning strategies. We demonstrate that the results of the reinforcement learning agents can be replicated in a more complex environment and that similar strategies evolve, including "running and chasing" and "fort building".
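As a minimal illustrative sketch only (not the thesis's actual implementation: the function name, stage logic, and all hyperparameters other than the cited batch size of 3,072 are assumptions), a curriculum-staged hide-and-seek reward and a reduced training batch size might be expressed as follows:

    # Illustrative sketch, not taken from the thesis.
    def hide_and_seek_reward(agent_is_hider, any_hider_visible, curriculum_stage):
        """Team-based visibility reward, relaxed in the earliest curriculum stage."""
        # Stage 0: free exploration to learn basic skills, no competitive pressure yet.
        if curriculum_stage == 0:
            return 0.0
        # Later stages: hiders are rewarded while unseen, seekers while any hider is seen.
        if agent_is_hider:
            return -1.0 if any_hider_visible else 1.0
        return 1.0 if any_hider_visible else -1.0

    # Hypothetical training hyperparameters; only the reduced batch size of 3072
    # (versus 64000) is the value reported in the abstract.
    training_config = {
        "batch_size": 3072,
        "learning_rate": 3e-4,
        "gamma": 0.99,
    }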
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name entry element MSCSE / MSSE-27
690 ## - LOCAL SUBJECT ADDED ENTRY--TOPICAL TERM (OCLC, RLIN)
Topical term following geographic name as entry element MSCSE / MSSE
700 ## - ADDED ENTRY--PERSONAL NAME
Personal name Supervisor: Dr. Muaz Ahmed Khan Niazi
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme
Koha item type Thesis
Holdings
Permanent Location Military College of Signals (MCS)
Current Location Military College of Signals (MCS)
Shelving location Thesis
Date acquired 05/25/2023
Full call number 005.1,KAM
Barcode MCSTCS-544
Koha item type Thesis