77 sparseoptimization pattern discrepancy#125
Conversation
dimalvovs
commented
Nov 15, 2024
- add comments to chi-squared calc
- add 1 to denominator and numerator to avoid NaN in chi-squared calc
vignette
…her" This reverts commit 6a20d7b.
116 wrong chisq reported
…o-main add container build action
…o-main 119 build container on merge to main
drbergman
left a comment
There was a problem hiding this comment.
I can't comment on how the 1+'s will affect calculations and their accuracy for a chisq calculation. I'll only ask: isn't the purpose of the SparseIterator that it will skip all 0 values? So dsq = get<1>(it) * get<1>(it) must be > 0?
Also, though it's clearly outside the scope of this PR, it feels like there's the possibility for computational savings by storing the results of the dot products inside the for loop to be re-used in the while loop. Similarly, I think the dsq values are constant throughout the CoGAPS run (they're just the element-wise square of the genes x sample matrix, right?). Could a static variable be used to store these?
that is exactly right, I think the root cause is the implementation of the sparseIterator, so maybe this draft PR should be more of a proof that that the |
f25f63c to
b06f01b
Compare
|
additional to-dos from a call with @favorov:
|
There was a problem hiding this comment.
Pull request overview
This PR updates the sparse-model chi-squared calculation to avoid NaNs (by adjusting the division terms) and clarifies matrix-role documentation in the normal-model headers.
Changes:
- Updated
SparseNormalModel::chiSq()to avoid divide-by-zero/NaN during chi-squared accumulation. - Added/updated inline comments in
SparseNormalModel::chiSq()for readability. - Refined member comments in
SparseNormalModel.handDenseNormalModel.hto better describe matrix roles.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/gibbs_sampler/SparseNormalModel.h |
Updates member comments describing the data matrix and factor matrices. |
src/gibbs_sampler/SparseNormalModel.cpp |
Changes chi-squared computation logic and adds explanatory inline comments. |
src/gibbs_sampler/DenseNormalModel.h |
Updates member comments describing the data matrix and factor matrices. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| float val = get<1>(it); // val = Data | ||
| float dot = gaps::dot(mMatrix.getRow(j), mOtherMatrix->getRow(it.getIndex())); // dot = A * P again | ||
| float dsq = val * val; // dsq = Data squared | ||
| chisq += (1 + dot * (dot - 2 * val - dsq * dot)) / (1 + dsq); // chisq |
| SparseMatrix mDMatrix; // Data Matrix D, samp x genes or genes x samp | ||
| HybridMatrix mMatrix; // A (left mult) or P (right mult) based on mDMatrix | ||
| const HybridMatrix *mOtherMatrix; // pointer to vis-a-vis of mMartix |
| Matrix mDMatrix; // Data Matrix, samples x genes or genes x samples | ||
| Matrix mMatrix; // A (left mult) or P (right mult) based on mDMatrix | ||
| const Matrix *mOtherMatrix; // pointer to vis-a-vis of mMartix |