Pull request #2: added initial reward collision model
Status: Open. oravner wants to merge 20 commits into master from reward-collision-model.
Commits (20, authored by or-vsec):

- 57b3ac7  added initial reward collision model
- fc9d687  fixes according to CR
- afedc49  fixed bug with test measures not updating on first epoch
- 3ff7ac6  added assert if batch contains goal statuses
- 17d36ed  added initial reward model
- 7bc40ca  added coordinates inputs to conv layer as in Uber's CoordConv paper
- ff14c40  changed to official coordnet code
- f8ad051  fixed bug with learning rate
- 0ce2d4e  added flag for coordnet; added accuracy measures for training
- f316029  modified to work with docker version
- 83de16a  Added script for creating simple workspaces for reward model
- 17658af  merge
- 8641727  Added support for Resnet for vision scenarios
- 115fce6  Now testing and saving model every specified number of training batches
- 3f3e906  Now resnet can work without coordnet
- 7b4c1fd  minor change in resnet_model
- 7e5a2fd  added vae support
- baa8367  changed config structure
- 5e7379d  changed reward config to best vision_harder performence
- 017b9c8  bug fixed in reward_model.py
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,61 @@ | ||
| import tensorflow as tf | ||
| from tensorflow.python.layers import base | ||
|
|
||
|
|
||
| class AddCoords(base.Layer): | ||
| """Add coords to a tensor""" | ||
| def __init__(self, x_dim=64, y_dim=64, with_r=False): | ||
| super(AddCoords, self).__init__() | ||
| self.x_dim = x_dim | ||
| self.y_dim = y_dim | ||
| self.with_r = with_r | ||
|
|
||
| def call(self, input_tensor, **kwargs): | ||
| """ | ||
| input_tensor: (batch, x_dim, y_dim, c) | ||
| """ | ||
| batch_size_tensor = tf.shape(input_tensor)[0] | ||
| xx_ones = tf.ones([batch_size_tensor, self.x_dim], dtype=tf.int32) | ||
| xx_ones = tf.expand_dims(xx_ones, -1) | ||
| xx_range = tf.tile(tf.expand_dims(tf.range(self.y_dim), 0), | ||
| [batch_size_tensor, 1]) | ||
| xx_range = tf.expand_dims(xx_range, 1) | ||
| xx_channel = tf.matmul(xx_ones, xx_range) | ||
| xx_channel = tf.expand_dims(xx_channel, -1) | ||
| yy_ones = tf.ones([batch_size_tensor, self.y_dim], dtype=tf.int32) | ||
| yy_ones = tf.expand_dims(yy_ones, 1) | ||
| yy_range = tf.tile(tf.expand_dims(tf.range(self.x_dim), 0), | ||
| [batch_size_tensor, 1]) | ||
| yy_range = tf.expand_dims(yy_range, -1) | ||
| yy_channel = tf.matmul(yy_range, yy_ones) | ||
| yy_channel = tf.expand_dims(yy_channel, -1) | ||
| xx_channel = tf.cast(xx_channel, 'float32') / (self.x_dim - 1) | ||
| yy_channel = tf.cast(yy_channel, 'float32') / (self.y_dim - 1) | ||
| xx_channel = xx_channel*2 - 1 | ||
| yy_channel = yy_channel*2 - 1 | ||
| ret = tf.concat([input_tensor, xx_channel, yy_channel], axis=-1) | ||
| if self.with_r: | ||
| rr = tf.sqrt(tf.square(xx_channel) + tf.square(yy_channel)) | ||
| ret = tf.concat([ret, rr], axis=-1) | ||
| return ret | ||
|
|
||
|
|
||
| class CoordConv(base.Layer): | ||
| """CoordConv layer as in the paper.""" | ||
| def __init__(self, x_dim, y_dim, with_r, *args, **kwargs): | ||
| super(CoordConv, self).__init__() | ||
| self.addcoords = AddCoords(x_dim=x_dim, | ||
| y_dim=y_dim, | ||
| with_r=with_r) | ||
| self.conv = tf.layers.Conv2D(*args, **kwargs) | ||
|
|
||
| def call(self, input_tensor, **kwargs): | ||
| ret = self.addcoords(input_tensor) | ||
| ret = self.conv(ret) | ||
| return ret | ||
|
|
||
|
|
||
| def coord_conv(x_dim, y_dim, with_r, inputs, *args, **kwargs): | ||
| layer = CoordConv(x_dim, y_dim, with_r, *args, **kwargs) | ||
| return layer.apply(inputs) | ||
|
|
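The coordinate channels that AddCoords builds can be checked with a small NumPy sketch (a hypothetical helper, not part of the PR; shown for the square default x_dim == y_dim == 64):

```python
import numpy as np

def add_coords_channels(batch, x_dim, y_dim):
    # xx varies along the width axis, yy along the height axis,
    # each normalized to [-1, 1] as in the AddCoords layer
    xx = np.tile(np.arange(y_dim, dtype=np.float32)[None, None, :], (batch, x_dim, 1))
    yy = np.tile(np.arange(x_dim, dtype=np.float32)[None, :, None], (batch, 1, y_dim))
    xx = xx / (x_dim - 1) * 2 - 1  # the layer normalizes xx by x_dim - 1
    yy = yy / (y_dim - 1) * 2 - 1
    return np.stack([xx, yy], axis=-1)  # (batch, x_dim, y_dim, 2)

coords = add_coords_channels(1, 64, 64)
```

Note that the layer divides the width-indexed xx channel by x_dim - 1; with the non-square 55x111 inputs used in the DqnModel change below, that normalization may be worth double-checking.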
Changes to the DqnModel (`@@ -1,20 +1,28 @@`; the +/- markers are inferred from the rendered diff):

```diff
 import tensorflow as tf
+from coordnet_model import coord_conv


 class DqnModel:
-    def __init__(self, prefix):
+    def __init__(self, prefix, config):
         self.prefix = '{}_dqn'.format(prefix)
+        self.config = config
+        self.use_coordnet = self.config['network']['use_coordnet']

     def predict(self, workspace_image, reuse_flag):
-        conv1 = tf.layers.conv2d(workspace_image, 32, 8, 4, padding='same', activation=tf.nn.relu, use_bias=True,
-                                 name='{}_conv1'.format(self.prefix), reuse=reuse_flag)
-        conv2 = tf.layers.conv2d(conv1, 64, 4, 2, padding='same', activation=tf.nn.relu, use_bias=True,
+        if self.use_coordnet:
+            workspace_image = coord_conv(55, 111, False, workspace_image, 32, 8, 4, padding='same',
+                                         activation=tf.nn.relu, use_bias=True, name='{}_conv1'.format(self.prefix),
+                                         _reuse=reuse_flag)
+
+        conv2 = tf.layers.conv2d(workspace_image, 64, 4, 2, padding='same', activation=tf.nn.relu, use_bias=True,
                                  name='{}_conv2'.format(self.prefix), reuse=reuse_flag)
-        # conv3 = tf.layers.conv2d(conv2, 64, 3, 1, padding='same', activation=tf.nn.relu, use_bias=True)
-        # flat = tf.layers.flatten(conv3)
-        flat = tf.layers.flatten(conv2, name='{}_flat'.format(self.prefix))
+        conv3 = tf.layers.conv2d(conv2, 64, 3, 1, padding='same', activation=tf.nn.relu, use_bias=True)
+
+        flat = tf.layers.flatten(conv3, name='{}_flat'.format(self.prefix))
         dense1 = tf.layers.dense(flat, 512, activation=tf.nn.relu, name='{}_dense1'.format(self.prefix),
                                  reuse=reuse_flag)
-        dense2 = tf.layers.dense(dense1, 512, activation=None, name='{}_dense2'.format(self.prefix), reuse=reuse_flag)
-        return dense2
+        dense2 = tf.layers.dense(dense1, 512, activation=tf.nn.relu, name='{}_dense2'.format(self.prefix),
+                                 reuse=reuse_flag)
+        dense3 = tf.layers.dense(dense2, 512, activation=tf.nn.relu, name='{}_dense3'.format(self.prefix),
+                                 reuse=reuse_flag)
+        return dense3
```
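As a sanity check on the conv stack, the 'same'-padded spatial sizes can be computed by hand. A sketch under the assumption of the 55x111 input implied by the `coord_conv(55, 111, ...)` call:

```python
import math

def same_padding_out(size, stride):
    # spatial output size of a 'same'-padded convolution: ceil(size / stride)
    return math.ceil(size / stride)

h, w = 55, 111
h, w = same_padding_out(h, 4), same_padding_out(w, 4)  # conv1: 8x8 kernel, stride 4
h, w = same_padding_out(h, 2), same_padding_out(w, 2)  # conv2: 4x4 kernel, stride 2
h, w = same_padding_out(h, 1), same_padding_out(w, 1)  # conv3: 3x3 kernel, stride 1
flat_units = h * w * 64  # 64 channels after conv3, flattened before dense1
```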
New file (+25 lines), the script for creating simple workspaces for the reward model:

```python
from workspace_generation_utils import *
from image_cache import ImageCache
import os

TOTAL_WORKSPACES = 10000
OUTPUT_DIR = "scenario_params/vision_harder"


if not os.path.isdir(OUTPUT_DIR):
    os.makedirs(OUTPUT_DIR)

generator = WorkspaceGenerator(obstacle_count_probabilities={2: 0.05, 3: 0.5, 4: 0.4, 5: 0.05})
for i in range(TOTAL_WORKSPACES):
    save_path = os.path.join(OUTPUT_DIR, '{}_workspace.pkl'.format(i))

    if os.path.exists(save_path):
        print("workspace %d already exists" % i)
        continue

    print("generating workspace %d" % i)
    workspace_params = generator.generate_workspace()
    workspace_params.save(save_path)

print("Creating Image Cache")
ImageCache(OUTPUT_DIR, True)
```
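The `obstacle_count_probabilities` argument reads as a distribution over obstacle counts. How `WorkspaceGenerator` consumes it is not shown in this diff; a hypothetical sampler consistent with that reading:

```python
import random

# the distribution passed to WorkspaceGenerator in the script above
obstacle_count_probabilities = {2: 0.05, 3: 0.5, 4: 0.4, 5: 0.05}

def sample_obstacle_count(probs, rng=random):
    # weighted draw of an obstacle count; assumes the weights sum to ~1
    counts, weights = zip(*sorted(probs.items()))
    return rng.choices(counts, weights=weights, k=1)[0]

samples = [sample_obstacle_count(obstacle_count_probabilities) for _ in range(1000)]
```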
New file (+194 lines), the VAE reward-model trainer:

```python
from reward_data_manager import get_image_cache
import time
import datetime
import numpy as np
import os
import yaml
import tensorflow as tf
from vae_network import VAENetwork


class VAEModel:

    def __init__(self, model_name, config, models_base_dir, tensorboard_dir):

        self.model_name = model_name
        self.config = config

        self.model_dir = os.path.join(models_base_dir, self.model_name)
        if not os.path.exists(self.model_dir):
            os.makedirs(self.model_dir)

        self.train_summaries = []
        self.test_summaries = []

        self.epochs = config['general']['epochs']
        self.save_every_epochs = config['general']['save_every_epochs']
        self.train_vae = config['reward']['train_vae']

        inputs_example = tf.placeholder(tf.float32, (None, 55, 111), name='example')
        self.network = VAENetwork(config, self.model_dir, inputs_example.shape)

        self.global_step = 0
        self.global_step_var = tf.Variable(0, trainable=False)

        self.loss = self.init_loss()
        self.optimizer = self.init_optimizer()

        with open(os.path.join(self.model_dir, 'config.yml'), 'w') as fd:
            yaml.dump(config, fd)

        self.train_board = self.TensorBoard(tensorboard_dir, 'train_' + model_name, self.train_summaries)
        self.test_board = self.TensorBoard(tensorboard_dir, 'test_' + model_name, self.test_summaries)

    def load(self, session):
        self.network.load_weights(session)

    def make_feed(self, data_batch):
        return self.network.make_feed(*data_batch)

    def predict(self, data_batch, session):
        feed = self.make_feed(data_batch)
        return session.run([self.prediction], feed)[0]

    def init_loss(self):
        status_loss_scale = self.config['reward']['cross_entropy_coefficient']
        img_loss, latent_loss, total_loss = self.network.get_loss()

        image_loss_summary = tf.summary.scalar('Image_Loss', img_loss)
        latent_loss_summary = tf.summary.scalar('Latent_Loss', latent_loss)

        regularization_loss = tf.losses.get_regularization_loss()
        regularization_loss_summary = tf.summary.scalar('Regularization_Loss', regularization_loss)

        # total_loss = total_loss + regularization_loss
        total_loss_summary = tf.summary.scalar('Total_Loss', total_loss)

        self.train_summaries += [image_loss_summary, latent_loss_summary,
                                 regularization_loss_summary, total_loss_summary]
        self.test_summaries += [image_loss_summary, latent_loss_summary,
                                regularization_loss_summary, total_loss_summary]

        return total_loss

    def init_optimizer(self):
        initial_learn_rate = self.config['reward']['initial_learn_rate']
        decrease_learn_rate_after = self.config['reward']['decrease_learn_rate_after']
        learn_rate_decrease_rate = self.config['reward']['learn_rate_decrease_rate']

        learning_rate = tf.train.exponential_decay(initial_learn_rate,
                                                   self.global_step_var,
                                                   decrease_learn_rate_after,
                                                   learn_rate_decrease_rate,
                                                   staircase=True)
        self.train_summaries.append(tf.summary.scalar('Learn_Rate', learning_rate))

        optimizer = tf.train.AdamOptimizer(learning_rate)

        gradients, variables = zip(*optimizer.compute_gradients(self.loss, tf.trainable_variables()))
        initial_gradients_norm = tf.global_norm(gradients)
        gradient_limit = self.config['reward']['gradient_limit']
        if gradient_limit > 0.0:
            gradients, _ = tf.clip_by_global_norm(gradients, gradient_limit, use_norm=initial_gradients_norm)
        clipped_gradients_norm = tf.global_norm(gradients)
        initial_gradients_norm_summary = tf.summary.scalar('Gradients_Norm_Initial', initial_gradients_norm)
        clipped_gradients_norm_summary = tf.summary.scalar('Gradients_Norm_Clipped', clipped_gradients_norm)
        self.train_summaries += [initial_gradients_norm_summary, clipped_gradients_norm_summary]
        self.test_summaries += [initial_gradients_norm_summary, clipped_gradients_norm_summary]

        return optimizer.apply_gradients(zip(gradients, variables), global_step=self.global_step_var)

    def _train_batch(self, train_batch, session):
        train_feed = {self.network.workspace_image_inputs: train_batch}
        train_summary, self.global_step, img_loss, _ = session.run(
            [self.train_board.summaries, self.global_step_var, self.network.encoded, self.optimizer],
            train_feed)
        # print(img_loss)
        self.train_board.writer.add_summary(train_summary, self.global_step)

    def _test_batch(self, test_batch, session):
        test_feed = {self.network.workspace_image_inputs: test_batch}
        test_summary = session.run(
            [self.test_board.summaries],
            test_feed)[0]
        self.test_board.writer.add_summary(test_summary, self.global_step)
        self.test_board.writer.flush()

    def train(self, train_data, test_data, session):
        session.run(tf.global_variables_initializer())
        session.run(tf.local_variables_initializer())

        test_every_batches = self.config['reward']['test_every_batches']

        total_train_batches = 0
        for epoch in range(self.epochs):

            train_batch_count = 1
            for train_batch in train_data:
                self._train_batch(train_batch, session)
                print("Finished epoch %d/%d batch %d/%d" % (epoch + 1, self.epochs,
                                                            train_batch_count, total_train_batches))
                train_batch_count += 1

                if train_batch_count % test_every_batches == 0:
                    test_batch = next(test_data.__iter__())  # random test batch
                    self._test_batch(test_batch, session)
                    # save the model
                    # self.network.save_weights(session, self.global_step)

            total_train_batches = train_batch_count - 1
            self.train_board.writer.flush()

            test_batch = next(test_data.__iter__())  # random test batch
            self._test_batch(test_batch, session)

            # save the model
            # if epoch == self.epochs - 1 or epoch % self.save_every_epochs == self.save_every_epochs - 1:
            #     self.network.save_weights(session, self.global_step)

            print('done epoch {} of {}, global step {}'.format(epoch, self.epochs, self.global_step))

    class TensorBoard:

        def __init__(self, tensorboard_path, board_name, summaries):
            self.writer = tf.summary.FileWriter(os.path.join(tensorboard_path, board_name))
            self.summaries = tf.summary.merge(summaries)


def count_weights():
    total_parameters = 0
    for variable in tf.trainable_variables():
        # shape is an array of tf.Dimension
        shape = variable.get_shape()
        variable_parameters = 1
        for dim in shape:
            variable_parameters *= dim.value
        total_parameters += variable_parameters
    print(total_parameters)


if __name__ == '__main__':
    # read the config
    config_path = os.path.join(os.getcwd(), 'data/config/reward_config.yml')
    with open(config_path, 'r') as yml_file:
        config = yaml.load(yml_file)
    print('------------ Config ------------')
    print(yaml.dump(config))

    model_name = "vae" + datetime.datetime.fromtimestamp(time.time()).strftime('%Y_%m_%d_%H_%M_%S')

    image_cache = get_image_cache(config)
    batch_size = 1
    images_data = [image.np_array for image in image_cache.items.values()]
    images_batch_data = [images_data[i:i + batch_size] for i in range(0, len(images_data), batch_size)]

    train_data_count = int(len(images_batch_data) * 0.8)
    train_data = images_batch_data[:train_data_count]
    test_data = images_batch_data[train_data_count:]

    models_base_dir = os.path.join('data', 'reward', 'model')
    vae_model = VAEModel(model_name, config, models_base_dir, tensorboard_dir=models_base_dir)

    gpu_usage = config['general']['gpu_usage']
    session_config = tf.ConfigProto(gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=gpu_usage))
    with tf.Session(config=session_config) as session:
        count_weights()
        vae_model.train(train_data, test_data, session)
```
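The learning-rate schedule in init_loss/init_optimizer (`tf.train.exponential_decay` with `staircase=True`) steps the rate down by `learn_rate_decrease_rate` every `decrease_learn_rate_after` global steps. A minimal sketch of that behavior with placeholder hyperparameters (the real values live in reward_config.yml, which is not part of this diff):

```python
def staircase_lr(initial_rate, global_step, decay_steps, decay_rate):
    # mirrors tf.train.exponential_decay with staircase=True:
    # the exponent only advances once per full decay interval
    return initial_rate * decay_rate ** (global_step // decay_steps)

# placeholder hyperparameters, not the PR's actual config values
lrs = [staircase_lr(1e-3, step, decay_steps=100, decay_rate=0.5)
       for step in (0, 99, 100, 250)]
```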
Review comment: "are all the below changes justified? especially the oversampling ratios..."