
added initial reward collision model #2

Open

oravner wants to merge 20 commits into master from reward-collision-model

Conversation

oravner (Collaborator) commented May 2, 2019

No description provided.

    self.workspace_image_inputs = tf.placeholder(tf.float32, (None, 55, 111), name='workspace_image_inputs')
    self.images_3d = tf.expand_dims(self.workspace_image_inputs, axis=-1)
-   self.goal_pose_inputs = tf.placeholder(tf.float32, (None, 2), name='goal_pose_inputs')
+   self.goal_pose_inputs = None  # tf.placeholder(tf.float32, (None, 2), name='goal_pose_inputs')
tomjur (Owner) commented May 3, 2019

please remove unused inputs - you will also need to change make feed, remove _generate_goal_features and anything else that is no longer relevant

oravner (Collaborator, Author) replied

Sorry, I forgot to remove this file. Please notice that it breaks the whole project, since we haven't yet written the deterministic reward model which wraps the collision model.

tomjur (Owner) replied

No problem, the next step should fix this.
I just don't want to carry over dead code that we'll forget to delete.

    else:
        workspace_id, start_joints, goal_joints, action, next_joints, reward, terminated, status = batch[i]
-       goal_pose = openrave_manager.get_target_pose(goal_joints)
+       goal_pose = None  # openrave_manager.get_target_pose(goal_joints)
tomjur (Owner) commented May 3, 2019

remove goal_pose entirely + remove the newly unused openrave_manager -> see if you can also remove the dependency on OpenRaveManager completely for the training process

oravner (Collaborator, Author) commented May 3, 2019

Sorry, I forgot to remove this file. Please notice that it breaks the whole project, since we haven't yet written the deterministic reward model which wraps the collision model.

@@ -0,0 +1,171 @@
from reward_collision_network import CollisionNetwork
tomjur (Owner) commented

General comments for the file:

  1. Why do we need a separate TensorBoard class, with the logic not inside CollisionModel? It would be less messy to have it inside CollisionModel - or do we need it elsewhere as well?
  2. This file is a trainer class, right? If so, please rename CollisionModel to CollisionTrainer.

oravner (Collaborator, Author) replied

  1. I had many dilemmas about how to use the TensorBoard reporting, since its API is not very convenient to design around. Good idea, it will be an internal class inside CollisionModel. If we end up using it in other models as well (such as the reward model itself) I will move it to some utility file.
  2. I thought to instead separate the two models: 1. a model which only holds the network logic and architecture, and 2. a model which wraps the network logic for training and prediction. What do you think is better, and why?

tomjur (Owner) replied

I think you have it right - two classes work best. What I meant in my comment was just to rename to "Trainer", since the role of the non-network class is to train the network class.
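A minimal sketch of the split being agreed on here, assuming TF1; the layer shapes and helper names are illustrative, not the repo's actual code:

    import tensorflow as tf

    class CollisionNetwork:
        # graph only: placeholders, layers, and the softmax logits
        def __init__(self):
            self.workspace_image_inputs = tf.placeholder(
                tf.float32, (None, 55, 111), name='workspace_image_inputs')
            images_3d = tf.expand_dims(self.workspace_image_inputs, axis=-1)
            conv = tf.layers.conv2d(images_3d, 8, 3, activation=tf.nn.relu)
            self.status_softmax_logits = tf.layers.dense(tf.layers.flatten(conv), 2)

    class CollisionTrainer:
        # wraps the network with loss, optimizer, and TensorBoard reporting
        class _TensorBoard:
            # internal reporting helper, as discussed above
            def __init__(self, log_dir, summaries):
                self.writer = tf.summary.FileWriter(log_dir)
                self.summaries = summaries

        def __init__(self, log_dir):
            self.network = CollisionNetwork()
            self.status_input = tf.placeholder(tf.int32, [None])
            loss = tf.losses.sparse_softmax_cross_entropy(
                labels=self.status_input, logits=self.network.status_softmax_logits)
            self.optimize_op = tf.train.AdamOptimizer().minimize(loss)
            self.board = self._TensorBoard(
                log_dir, tf.summary.merge([tf.summary.scalar('loss', loss)]))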

self.status_input = tf.placeholder(tf.int32, [None, ])
self.prediction = tf.argmax(tf.nn.softmax(self.net_output), axis=1)

self.global_step = 0
tomjur (Owner) commented

I think that if you don't use a global_step var, then tf keeps a global step var that gets updated on every optimize call.
Summaries are automatically updated accordingly.

oravner (Collaborator, Author) replied

I don't understand. You recommend removing this var and leaving only the tf Variable? This is the same logic as in the reward model code.

tomjur (Owner) replied

Maybe the reward model was also incorrect (:

I think that if you don't specify a global step (ever) then tf holds an internal counter that increases every time you call an optimize function.

Maybe we can remove it and see if everything just works as expected?

oravner (Collaborator, Author) replied

According to the tf docs, I need to create a tf.Variable for global_step and pass it to minimize / apply_gradients for it to be updated. Then I need to create my own Python variable for the global step to pass to TensorBoard and the tf.train.Saver.
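That is indeed the TF1 pattern, except that the step value can be read back from the same tf.Variable, so no separate Python counter is strictly needed. A sketch, with a made-up loss and paths:

    import numpy as np
    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 4])
    loss = tf.reduce_mean(tf.square(tf.layers.dense(x, 1)))

    # one tf.Variable shared by the optimizer, the summary writer, and the saver
    global_step = tf.train.get_or_create_global_step()
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss, global_step=global_step)
    summary_op = tf.summary.scalar('loss', loss)
    writer = tf.summary.FileWriter('/tmp/collision_logs')
    saver = tf.train.Saver()

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        feed = {x: np.zeros((2, 4))}
        _, summary = sess.run([train_op, summary_op], feed_dict=feed)
        step = sess.run(global_step)                    # incremented by the optimize call
        writer.add_summary(summary, global_step=step)   # no separate Python counter
        saver.save(sess, '/tmp/collision_model', global_step=global_step)  # accepts the tf.Variable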

[self.test_board.summaries] + self.test_measures,
test_feed)[0]
self.test_board.writer.add_summary(test_summary, self.global_step)
self.test_board.writer.flush()
tomjur (Owner) commented

For both methods above, see the comments about:

  - global step
  - data_batch -> explicit parameters
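For the second point, the suggestion is presumably to unpack the batch once and make the method signature self-documenting; a hypothetical before/after (method and field names assumed, not the repo's actual API):

    # before: an opaque packed tuple
    def test_batch(self, data_batch, session):
        joints_batch, image_batch, status_batch = data_batch
        ...

    # after: explicit parameters, so callers and readers see what the method needs
    def test_batch(self, joints_batch, image_batch, status_batch, session):
        ...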


def train(self, train_data, test_data, image_cache, session):
    session.run(tf.global_variables_initializer())
    session.run(tf.local_variables_initializer())
tomjur (Owner) commented

tf.initialize_all_variables()

oravner (Collaborator, Author) commented May 3, 2019

I got a warning that tf.initialize_all_variables() is deprecated and that tf.global_variables_initializer() should be used instead ):

tomjur (Owner) replied

So why the second line? Also a warning?
This comment shows I was working with a deprecated tf version (:

oravner (Collaborator, Author) replied

For some reason, some variables are classified as global and some as local.
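That is a real TF1 distinction: tf.Variable / tf.get_variable put variables in the GLOBAL_VARIABLES collection by default, while some ops, notably the counters behind tf.metrics.*, create LOCAL_VARIABLES, which is why both initializers are needed. A small illustration (names invented):

    import tensorflow as tf

    labels = tf.placeholder(tf.int32, [None])
    predictions = tf.placeholder(tf.int32, [None])

    weights = tf.get_variable('weights', shape=[2])                  # GLOBAL_VARIABLES
    accuracy, update_op = tf.metrics.accuracy(labels, predictions)   # counters are LOCAL_VARIABLES

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())  # initializes weights
        sess.run(tf.local_variables_initializer())   # initializes the metric counters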


for train_batch in train_data:
    train_batch, train_status_batch = get_batch_and_labels(train_batch, image_cache)
    # TODO: assert if train_status contains goal status
tomjur (Owner) commented

what does this comment mean?

oravner (Collaborator, Author) replied

I assumed that the data doesn't contain any goal-status samples, so it would be best practice to assert here that none exist.

tomjur (Owner) replied

OK, so use an assert, so it is not left as a comment.

oravner (Collaborator, Author) replied

V
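For illustration, the assert could look like this in the loop quoted above (GOAL_STATUS is a hypothetical constant for the goal label, not necessarily the repo's):

    GOAL_STATUS = 3  # hypothetical value of the 'reached goal' label

    for train_batch in train_data:
        train_batch, train_status_batch = get_batch_and_labels(train_batch, image_cache)
        # fail fast instead of leaving the check as a comment
        assert all(status != GOAL_STATUS for status in train_status_batch), \
            'training batches should not contain goal-status samples'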

tomjur (Owner) left a comment

Great job - definitely more readable than my previous main function (:
See the comments and questions inline!

@@ -1,14 +1,14 @@
general:
tomjur (Owner) commented

Are all the changes below justified? Especially the oversampling ratios...

self.net_output = self.network.status_softmax_logits
self.status_input = tf.placeholder(tf.int32, [None, ])
# TODO: fix collision_prob
self.collision_prob = tf.expand_dims(tf.nn.softmax(self.net_output)[:, 1], -1)
tomjur (Owner) commented

What's this TODO about?
