
Conversation

@shahinmv (Contributor) commented on Dec 2, 2021:

This is the initial implementation of RBM. Algorithm should be correct, tested each variable for rows and cols, also dot products just to be sure. Created new method for Neuronlayer class, for multiplying visible layer by weights of hidden layer. Initial implementation of it is commented out as I found out that the gemm function works perfectly for it.
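For readers new to RBMs, training typically proceeds by contrastive divergence (CD-1): propagate the data up to the hidden layer, sample binary hidden states, propagate them back down to reconstruct the visible layer, and update the weights from the difference between the data and reconstruction statistics. The sketch below only illustrates that general scheme; train() and propagateUp() are invented names, and the PR's actual training code may differ.

// Illustrative CD-1 step (hypothetical; not taken from this PR).
// propagateDown() and getBinomial() appear in the PR; train() and
// propagateUp() are invented here for the sake of the example.
void CRBM::train(const Mat& v0, float learningRate)
{
	Mat h0  = propagateUp(v0);         // hidden probabilities given the data
	Mat h0s = getBinomial(h0);         // sampled binary hidden states
	Mat v1  = propagateDown(h0s);      // reconstruction of the visible layer
	Mat h1  = propagateUp(v1);         // hidden probabilities of the reconstruction

	// Weight update dW = lr * (v0 * h0^T - v1 * h1^T); applying it to the
	// hidden layer's weights (and updating the biases) is omitted here.
	Mat dW = learningRate * (v0 * h0.t() - v1 * h1.t());
}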

@ereator (Member) left a comment:

Hi @shahinmv, thank you for this Pull Request.
Well done!

I have made 2 changes to the code:

  1. Replaced the rand() method with the random::U() method, so that all the CI checks pass.
  2. Removed mentions of the obsolete files for the neuron layer class with explicit biases.

Next, I have left some comments in the review. Once the demo code produces results comparable to the Perceptron, I will take a deeper look into the RBM class.

* @details This function updates the neurons' values as \f$ netValues_{(N\times 1)} = weights^{\top}_{(N\times C)}\times values_{(C\times 1)} + biases_{(N\times 1)}\f$
* @note This method updates only the nodes' net values
*/
DllExport void dotProdVis(const Mat& values, Mat weights); // new method; its purpose is to multiply the visible neurons by the weights of the hidden layer
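For illustration, here is a minimal sketch of what the body of this method could look like, assuming it mirrors the gemm()-based dotProd() shown further down and simply uses the caller-supplied weights (with the const-reference signature suggested in the review) instead of the layer's own m_weights; this is an assumption, not the PR's actual code:

// Hypothetical body (assumption, not copied from the PR): same pattern as
// dotProd(), but with the weights passed in by the caller.
void CNeuronLayer::dotProdVis(const Mat& values, const Mat& weights)
{
	// m_netValues = weights * values + m_biases
	gemm(weights, values, 1, m_biases, 1, m_netValues);
}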
@ereator (Member) commented:

  • Please move the declaration of this method up, so that it is above the documentation (the comment between /** and */) of the dotProd() method.
  • What is the difference between the old dotProd() and the new dotProdVis() methods?
  • The weights argument is better passed not by value but by constant reference, i.e. const Mat& weights.

@shahinmv (Contributor, Author) replied:

In the next changes, I have:

  • moved the declaration of this method and added a more detailed comment on why the new dotProd was needed;
  • the argument is now passed by constant reference.

m_weights = random::U(m_weights.size(), m_weights.type(), -0.5f, 0.5f);
m_biases = random::U(m_biases.size(), m_biases.type(), -0.5f, 0.5f);
}
void CNeuronLayer::generateRandomWeights(void)
@ereator (Member) commented:

Please do not change methods that are not affected by the Pull Request. Changing just the indentation makes it extremely difficult to track the other, important changes in the code.

// this->m_netValues = this->m_weights * values + m_biases;
gemm(m_weights.t(), values, 1, m_biases, 1, m_netValues);
}
void CNeuronLayer::dotProd(const Mat& values)
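For context, OpenCV's gemm(src1, src2, alpha, src3, beta, dst) computes dst = alpha*src1*src2 + beta*src3, so the call above is equivalent to the commented-out line except that the weights are transposed:

// cv::gemm(src1, src2, alpha, src3, beta, dst) computes
//   dst = alpha * src1 * src2 + beta * src3
// so the gemm() call in dotProd() above is equivalent to:
m_netValues = m_weights.t() * values + m_biases;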
@ereator (Member) commented:

So the dotProd(values) method is:
m_netValues = m_weights * values + m_biases;

And the dotProdVis(values, weights) method is:
m_netValues = weights * values + m_biases;

@shahinmv (Contributor, Author) replied:

Correct. The reason is that dotProd(values) multiplies the values by this->weights, whereas dotProdVis(values, weights) multiplies them by the weights that are passed in. When the algorithm updates the visible units, we need to multiply the hidden units by the hidden layer's weights; since we only generate random weights for the hidden layer, and visible units = hidden weights × hidden neurons, this could not be replicated with the existing dotProd(values).
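To spell out the reasoning in the usual RBM notation (my addition, not taken from the PR): both layers share a single weight matrix W, so the up- and down-passes differ only in whether W is transposed and in which biases are added:

	net_hidden  = Wᵀ·v + c   (what dotProd() computes with the layer's own weights)
	net_visible = W·h + b    (what dotProdVis() computes with the hidden layer's weights)

where c and b are the hidden- and visible-layer biases, and the activation function is applied to the net values afterwards.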


DllExport void debug(); //printing out rows and cols just to be sure they are correct

DllExport Mat getBinomial(Mat mean);
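As a side note, getBinomial(mean) presumably draws binary neuron states from the activation probabilities in mean, the usual Bernoulli-sampling step in RBM training. Under that assumption, a minimal sketch of such a helper could look like the following (this is not the PR's code, and sampleBinomial is an invented name):

// Hypothetical sketch: draw binary states from per-neuron probabilities
// in [0, 1] by comparing them against uniform noise.
Mat sampleBinomial(const Mat& mean)
{
	Mat noise(mean.size(), mean.type());
	randu(noise, Scalar(0), Scalar(1));                // uniform noise in [0, 1)
	Mat states = noise < mean;                         // 255 where the neuron fires, 0 otherwise
	states.convertTo(states, mean.type(), 1.0 / 255);  // map {0, 255} to {0, 1}
	return states;
}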
@ereator (Member) commented:

  • All the matrices are better passed not by value but by reference.
  • Please add some comments to the methods and member variables: what they do, why they are needed, etc.

}

Mat CRBM::propagateDown(Mat values){
m_vpNeuronLayers[0]->dotProdVis(values, m_vpNeuronLayers[1]->getWeights());
@ereator (Member) commented:

Can we use just the old dotProd() method here?
I.e. m_vpNeuronLayers[1]->dotProd(values); (index 1 instead of 0)?

@shahinmv (Contributor, Author) replied:

I have tried my best to explain why I used dotProdVis(values, weights) above. Please check it out.

@shahinmv (Contributor, Author) commented:
I have made the changes that you requested. Regarding the neuronlayer.cpp file: it still shows that I changed the whole file, but the only part that actually changed is the added dotProdVis(values, weights) method. All the matrices have been changed to const references. I added more comments on why I created the new method inside the neuron layer, and also commented above in the pull request. Please let me know once you have checked it out.

@ereator (Member) left a comment:

I still do not fully understand the reason for the new method. Let us go deeper into it.
We have now:

  1. void CNeuronLayer::dotProd(const Mat& values) which does
    m_netValues = m_weights * values + m_biases;
  2. void CNeuronLayer::dotProdVis(const Mat& values, const Mat& weights) which does
    m_netValues = weights * values + m_biases;

The only place in code where the new method is used is:

Mat CRBM::propagateDown(const Mat& values) 
{
	m_vpNeuronLayers[0]->dotProdVis(values, m_vpNeuronLayers[1]->getWeights()); 			
	return m_vpNeuronLayers[0]->getValues();
}

If we abbreviate m_vpNeuronLayers to m_vpNL, we can write the calculation that is actually happening as:
m_vpNL[0]->m_netValues = m_vpNL[1]->weights * values + m_vpNL[0]->m_biases;

All right, now I see that this is something different from what I had suggested.
One question here: don't we also need to use the corresponding biases, i.e. m_vpNL[1]->m_biases instead of m_vpNL[0]->m_biases?
