Initial RBM Implementation - Thesis #50
base: master
Conversation
Added a new method so we can propagate down, multiplying the visible neurons by the weights of the hidden layer.
ereator left a comment
Hi @shahinmv, thank you for this Pull Request.
Well done!
I have made 2 changes to the code:
- Replaced the `rand()` method with the `random::U()` method, so all the CI checks pass
- Removed mentions of the obsolete files for the neuron layer class with explicit biases
Next, I have left some comments in the review. Once the demo code produces results comparable to the Perceptron, I will take a deeper look into the RBM class.
modules/DNN/NeuronLayer.h
Outdated
```cpp
 * @details This function updates the neurons' values as \f$ netValues_{(1\times N)} = weights^{\top}_{(C\times N)}\times values_{(1\times C)} + biases_{(1\times N)} \f$
 * @note This method updates only the nodes' net values
 */
DllExport void dotProdVis(const Mat& values, Mat weights);	// new method: multiplies the visible neurons by the weights of the hidden layer
```
- Please move the declaration of this method up, above the documentation (the comment between `/**` and `*/`) for the `dotProd()` method.
- What is the difference between the old `dotProd()` and the new `dotProdVis()` methods?
- The argument `weights` is better passed not by value, but by constant reference, i.e. `const Mat& weights`
In the next changes, I have:
- moved the declaration of this method and added a more detailed comment on why the new method was needed.
- The argument is now passed by constant reference.
modules/DNN/NeuronLayer.cpp
Outdated
```cpp
	m_weights = random::U(m_weights.size(), m_weights.type(), -0.5f, 0.5f);
	m_biases  = random::U(m_biases.size(), m_biases.type(), -0.5f, 0.5f);
}

void CNeuronLayer::generateRandomWeights(void)
```
Please do not change methods that are not affected by the Pull Request. Changing just the indentation makes it extremely difficult to track the other, important changes in the code.
modules/DNN/NeuronLayer.cpp
Outdated
```cpp
	// this->m_netValues = this->m_weights * values + m_biases;
	gemm(m_weights.t(), values, 1, m_biases, 1, m_netValues);
}

void CNeuronLayer::dotProd(const Mat& values)
```
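For reference on the `gemm()` call above: OpenCV's `gemm(src1, src2, alpha, src3, beta, dst)` computes `dst = alpha*src1*src2 + beta*src3`, so this call evaluates `m_netValues = m_weights.t() * values + m_biases`. A minimal standalone check (the matrix sizes are invented for illustration):

```cpp
#include <opencv2/core.hpp>
using namespace cv;

int main() {
    Mat W = Mat::ones(3, 2, CV_32F);    // weights, C x N (example sizes)
    Mat v = Mat::ones(3, 1, CV_32F);    // visible values, C x 1
    Mat b = Mat::zeros(2, 1, CV_32F);   // biases, N x 1
    Mat net;
    gemm(W.t(), v, 1, b, 1, net);       // net = 1 * W^T * v + 1 * b, an N x 1 matrix
    return 0;
}
```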
So the `dotProd(values)` method is:
```cpp
m_netValues = m_weights * values + m_biases;
```
And the `dotProdVis(values, weights)` method is:
```cpp
m_netValues = weights * values + m_biases;
```
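A minimal sketch of the two methods side by side, assuming the `gemm()`-based style from the diff above (the `dotProdVis()` body is inferred from this summary, not quoted from the PR):

```cpp
// dotProd() uses the layer's own weights (transposed, as in the diff above).
void CNeuronLayer::dotProd(const Mat& values)
{
    gemm(m_weights.t(), values, 1, m_biases, 1, m_netValues);   // m_netValues = m_weights^T * values + m_biases
}

// dotProdVis() uses weights passed in from another (hidden) layer.
void CNeuronLayer::dotProdVis(const Mat& values, const Mat& weights)
{
    gemm(weights, values, 1, m_biases, 1, m_netValues);         // m_netValues = weights * values + m_biases
}
```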
Correct. The reason is that `dotProd(values)` multiplies the values by `this->m_weights`, while `dotProdVis(values, weights)` multiplies the values by the weights that are passed in. When the algorithm updates the visible units, we need to multiply the hidden units by the hidden layer's weights; since we only generate random weights for the hidden layer, and we know that visible units = hidden weights × hidden neurons, this was not possible to replicate with the already available `dotProd(values)`.
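In standard RBM notation (generic symbols, not identifiers from this codebase), the two passes reuse the same weight matrix in opposite directions, which is why the visible-layer update needs the hidden layer's weights:

```latex
h = \sigma\left(W^{\top} v + c\right)  % up-pass:   hidden units from visible units
v = \sigma\left(W h + b\right)         % down-pass: visible units from hidden units
```

Here \(W\) is the single weight matrix stored with the hidden layer, \(b\) and \(c\) are the visible and hidden biases, and \(\sigma\) is the activation function.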
modules/DNN/RBM.h
Outdated
```cpp
DllExport void debug();		// printing out rows and cols just to be sure they are correct

DllExport Mat getBinomial(Mat mean);
```
- All the matrices are better passed not by value, but by reference
- Please add some comments to the methods and member variables: what do they do, why are they needed, etc. (one possible shape is sketched below)
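To illustrate both points, here is one possible shape for the `getBinomial()` declaration, with the matrix passed by constant reference and a Doxygen comment similar to the one in NeuronLayer.h (the described behaviour is a guess at the method's intent, not taken from the PR):

```cpp
/**
 * @brief Samples binary neuron states from the given mean activation probabilities
 * @details Each element of the result is 1 with the probability given by the corresponding element of mean, and 0 otherwise (assumed behaviour)
 * @param mean Matrix of mean activation probabilities in the range [0; 1]
 */
DllExport Mat getBinomial(const Mat& mean);
```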
```cpp
}

Mat CRBM::propagateDown(Mat values) {
	m_vpNeuronLayers[0]->dotProdVis(values, m_vpNeuronLayers[1]->getWeights());
```
Can we use just the old `dotProd()` method here? I.e. `m_vpNeuronLayers[1]->dotProd(values);` (index 1 instead of 0)?
I have tried my best to explain why I used `dotProdVis(values, weights)` above. Please check it out.
I made the changes that you requested. About …
I still do not fully understand the reason for the new method. Let us go deeper into it.
We now have:
`void CNeuronLayer::dotProd(const Mat& values)`, which does
```cpp
m_netValues = m_weights * values + m_biases;
```
and `void CNeuronLayer::dotProdVis(const Mat& values, const Mat& weights)`, which does
```cpp
m_netValues = weights * values + m_biases;
```
The only place in the code where the new method is used is:
```cpp
Mat CRBM::propagateDown(const Mat& values)
{
	m_vpNeuronLayers[0]->dotProdVis(values, m_vpNeuronLayers[1]->getWeights());
	return m_vpNeuronLayers[0]->getValues();
}
```
If we abbreviate `m_vpNeuronLayers` to `m_vpNL`, we can write the calculation that is actually happening as:
```cpp
m_vpNL[0]->m_netValues = m_vpNL[1]->weights * values + m_vpNL[0]->m_biases;
```
All right, now I see that this is something different from what I suggested.
One question here: don't we also need to use the corresponding biases, i.e. `m_vpNL[1]->m_biases` instead of `m_vpNL[0]->m_biases`?
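For reference, in the standard RBM formulation each pass uses the biases of the layer being updated (generic notation, with \(b\) the visible biases and \(c\) the hidden biases):

```latex
p(h_j = 1 \mid v) = \sigma\Big(c_j + \sum\nolimits_i W_{ij}\, v_i\Big)  % up-pass:   hidden biases
p(v_i = 1 \mid h) = \sigma\Big(b_i + \sum\nolimits_j W_{ij}\, h_j\Big)  % down-pass: visible biases
```

Under that convention the layer being updated supplies its own biases, so the answer depends on which layer `m_vpNL[0]` represents: if it is the visible layer, its own biases are the conventional choice for the down-pass.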
This is the initial implementation of the RBM. The algorithm should be correct: I tested each variable for rows and cols, and the dot products as well, just to be sure. I created a new method in the NeuronLayer class for multiplying the visible layer by the weights of the hidden layer. Its initial implementation is commented out, as I found out that the gemm function works perfectly for it.
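Given `propagateDown()` and `getBinomial()` from this PR, one Gibbs step of the kind RBM training alternates might look like the sketch below; `propagateUp()` is a hypothetical counterpart, and the sampling semantics of `getBinomial()` are assumed, not confirmed by the PR:

```cpp
// Sketch of one Gibbs step v -> h -> v' (hypothetical helper names noted below).
Mat gibbsStep(CRBM& rbm, const Mat& v)
{
    Mat hMean   = rbm.propagateUp(v);           // hypothetical up-pass: hidden activation probabilities
    Mat hSample = rbm.getBinomial(hMean);       // sample binary hidden states (assumed semantics)
    Mat vPrime  = rbm.propagateDown(hSample);   // reconstruct the visible layer (method from this PR)
    return vPrime;
}
```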