QMIX paper ripped: Monotonic Value Function Factorization for Deep Multi-agent Reinforcement Learning in StarCraft II

‘I knew you would find your way here… eventually’ — Queen of Blades to Zeratul
Q_a represents the agent network, computed for each one of the agents acting cooperatively. In QMIX each agent is a DRQN: a linear layer followed by a GRU cell, whose hidden state is mapped to one Q-value per action.
import torch.nn as nn
import torch.nn.functional as F


class RNNAgent(nn.Module):
    def __init__(self, input_shape, args):
        super(RNNAgent, self).__init__()
        self.args = args
        self.fc1 = nn.Linear(input_shape, args.rnn_hidden_dim)
        self.rnn = nn.GRUCell(args.rnn_hidden_dim, args.rnn_hidden_dim)
        self.fc2 = nn.Linear(args.rnn_hidden_dim, args.n_actions)

    def init_hidden(self):
        # make hidden states on the same device as the model
        return self.fc1.weight.new(1, self.args.rnn_hidden_dim).zero_()

    def forward(self, inputs, hidden_state):
        x = F.relu(self.fc1(inputs))
        h_in = hidden_state.reshape(-1, self.args.rnn_hidden_dim)
        h = self.rnn(x, h_in)
        q = self.fc2(h)
        return q, h
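As a quick sanity check, here is a minimal self-contained version of the agent together with a forward step; the `args` values (hidden size 64, 5 actions, observation size 10) are made-up placeholders, not values from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from types import SimpleNamespace

# Stand-in for the pymarl config object (hypothetical values).
args = SimpleNamespace(rnn_hidden_dim=64, n_actions=5)

class RNNAgent(nn.Module):
    def __init__(self, input_shape, args):
        super().__init__()
        self.args = args
        self.fc1 = nn.Linear(input_shape, args.rnn_hidden_dim)
        self.rnn = nn.GRUCell(args.rnn_hidden_dim, args.rnn_hidden_dim)
        self.fc2 = nn.Linear(args.rnn_hidden_dim, args.n_actions)

    def init_hidden(self):
        # hidden state on the same device as the model weights
        return self.fc1.weight.new(1, self.args.rnn_hidden_dim).zero_()

    def forward(self, inputs, hidden_state):
        x = F.relu(self.fc1(inputs))
        h = self.rnn(x, hidden_state.reshape(-1, self.args.rnn_hidden_dim))
        return self.fc2(h), h

agent = RNNAgent(input_shape=10, args=args)
h = agent.init_hidden()
obs = torch.randn(1, 10)       # one observation for one agent
q_values, h = agent(obs, h)    # one Q-value per action, plus the new hidden state
print(q_values.shape)          # torch.Size([1, 5])
```

The GRU hidden state carried across timesteps is what lets each agent condition on its action-observation history rather than only the current (partial) observation.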
The hypernetworks compute the weights of the mixing network; these weights are forced to be non-negative so that Q_tot is monotonic in every Q_a.
import numpy as np
import torch.nn as nn


class QMixer(nn.Module):
    def __init__(self, args):
        super(QMixer, self).__init__()
        self.args = args
        self.n_agents = args.n_agents
        self.state_dim = int(np.prod(args.state_shape))
        self.embed_dim = args.mixing_embed_dim

        # Hypernetworks: map the global state to the mixing weights
        self.hyper_w_1 = nn.Linear(self.state_dim, self.embed_dim * self.n_agents)
        self.hyper_w_final = nn.Linear(self.state_dim, self.embed_dim)
        # State-dependent bias for the hidden layer
        self.hyper_b_1 = nn.Linear(self.state_dim, self.embed_dim)
        # V(s) instead of a bias for the last layer
        self.V = nn.Sequential(nn.Linear(self.state_dim, self.embed_dim),
                               nn.ReLU(),
                               nn.Linear(self.embed_dim, 1))
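The forward pass that ties these hypernetworks together is the heart of QMIX: the hypernetwork outputs pass through an absolute value, so every mixing weight is non-negative and Q_tot is monotonic in each Q_a. A minimal self-contained sketch, with made-up dimensions (3 agents, a 12-dimensional state, embedding size 8):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QMixer(nn.Module):
    def __init__(self, n_agents=3, state_dim=12, embed_dim=8):
        super().__init__()
        self.n_agents, self.embed_dim = n_agents, embed_dim
        self.hyper_w_1 = nn.Linear(state_dim, embed_dim * n_agents)
        self.hyper_w_final = nn.Linear(state_dim, embed_dim)
        self.hyper_b_1 = nn.Linear(state_dim, embed_dim)
        self.V = nn.Sequential(nn.Linear(state_dim, embed_dim), nn.ReLU(),
                               nn.Linear(embed_dim, 1))

    def forward(self, agent_qs, states):
        # agent_qs: (batch, n_agents), states: (batch, state_dim)
        bs = agent_qs.size(0)
        agent_qs = agent_qs.view(-1, 1, self.n_agents)
        # abs() forces non-negative weights -> monotonic mixing
        w1 = torch.abs(self.hyper_w_1(states)).view(-1, self.n_agents, self.embed_dim)
        b1 = self.hyper_b_1(states).view(-1, 1, self.embed_dim)
        hidden = F.elu(torch.bmm(agent_qs, w1) + b1)
        w_final = torch.abs(self.hyper_w_final(states)).view(-1, self.embed_dim, 1)
        v = self.V(states).view(-1, 1, 1)   # state-dependent bias V(s)
        return (torch.bmm(hidden, w_final) + v).view(bs, 1)

mixer = QMixer()
qs = torch.randn(4, 3)            # per-agent Q-values for a batch of 4
s = torch.randn(4, 12)            # global states for the same batch
q_tot = mixer(qs, s)
# Monotonicity: raising any agent's Q can never lower Q_tot.
q_tot_up = mixer(qs + 1.0, s)
print(bool((q_tot_up >= q_tot).all()))  # True
```

Note that the biases b1 and V(s) are not passed through abs(): monotonicity only constrains the weights multiplying the agent Q-values, so the state-dependent offsets stay unconstrained.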
Overall architecture of QMIX: the output of each DRQN agent is fed into a mixing network that uses a constrained hypernetwork to compute the global action-value function Q_tot.
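This monotonicity constraint is what makes decentralised execution work: each agent can greedily take the argmax of its own Q_a, and the resulting joint action is also the argmax of Q_tot. A small brute-force check with a toy positive-weight mixer standing in for the hypernetwork outputs at one fixed state (all numbers here are illustrative):

```python
import itertools
import torch

n_agents, n_actions = 2, 3
# Per-agent Q-values for every action, e.g. produced by the DRQN agents.
q_table = torch.randn(n_agents, n_actions)
# Toy monotonic mixer: strictly positive weights plus a bias.
w = torch.rand(n_agents) + 0.1
b = torch.randn(1)

def q_tot(joint_action):
    qs = torch.stack([q_table[i, a] for i, a in enumerate(joint_action)])
    return float((w * qs).sum() + b)

# Decentralised greedy choice: each agent maximises its own Q_a alone.
greedy = tuple(int(q_table[i].argmax()) for i in range(n_agents))
# Centralised choice: brute force over the joint action space.
best = max(itertools.product(range(n_actions), repeat=n_agents), key=q_tot)
print(greedy == best)  # True
```

The brute-force search over the joint action space grows exponentially with the number of agents; the whole point of the monotonic factorisation is that the per-agent argmax makes it unnecessary.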
‘We adapt’ — Queen of Blades
