NaN-handling for ensemble CRPS #118
@sallen12 I reimplemented the NaN-handling as you suggested, using the ensemble weighting functionality and setting the weights (and values) of NaN members to zero. I only realized this was an option after merging main into this branch. My bad! The implementation is now greatly simplified. There are still a couple of edge cases that need to be handled (AKR estimators). Perhaps we can have a chat about this soon.
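For readers following along, here is a minimal standalone NumPy sketch of the zero-weight idea. It is not the code in this branch: the helper names `crps_nrg_weighted` and `crps_omit_nan` are hypothetical, and it assumes the weighted energy-form (NRG) estimator with weights normalized along the member axis.

```python
import numpy as np


def crps_nrg_weighted(obs, fct, ens_w):
    """Weighted energy-form CRPS over the last axis (hypothetical helper).

    obs has shape (N,), fct and ens_w have shape (N, M).
    """
    # Normalize the weights so they sum to one for each forecast case.
    w = ens_w / ens_w.sum(axis=-1, keepdims=True)
    # First term: weighted mean absolute error of the members.
    e1 = np.sum(w * np.abs(fct - obs[..., None]), axis=-1)
    # Second term: weighted mean absolute difference between member pairs.
    pair_w = w[..., :, None] * w[..., None, :]
    pair_d = np.abs(fct[..., :, None] - fct[..., None, :])
    e2 = np.sum(pair_w * pair_d, axis=(-1, -2))
    return e1 - 0.5 * e2


def crps_omit_nan(obs, fct):
    """Assumed 'omit' behaviour: NaN members get zero weight and zero value."""
    nan_mask = np.isnan(fct)
    ens_w = np.where(nan_mask, 0.0, 1.0)
    # The value must be replaced too, because 0 * NaN is still NaN.
    fct_filled = np.where(nan_mask, 0.0, fct)
    return crps_nrg_weighted(obs, fct_filled, ens_w)


obs = np.array([0.3, -1.0])
fct = np.array([[0.1, np.nan, 0.9, 0.5], [np.nan, -0.5, -2.0, np.nan]])
print(crps_omit_nan(obs, fct))
```

Under these assumptions the result matches scoring only the valid members with uniform weights. A case in which every member is NaN has a zero weight sum, so this sketch returns NaN for it.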
…', 'int' estimators
@nicholasloveday any feedback here would be welcome!
Note: I left the ad-hoc AKR estimators for the 'omit' NaN policy in place, but we are not using them in the high-level API. We will decide later what to do with them. For now, choosing the AKR estimators or INT with the 'omit' policy will raise a NotImplementedException.
Thanks @frazane! Great to see this up! Will the plan be to expand this approach to all of the ensemble scores?
Yes, we started with the CRPS to keep the problem size manageable. Once we are happy with the pattern, we will extend it to the other scores.
sigma = abs(np.random.randn(N)) * 0.5
fct = np.random.randn(N, ENSEMBLE_SIZE) * sigma[..., None] + mu[..., None]
uniform_ens_w = np.ones(fct.shape)
non_uniform_ens_w = np.random.rand(*fct.shape)
We should also test here whether we get the correct result when there are NaNs in ens_w.
Added test_crps_ensemble_nan_weights, covering propagate/raise/omit for NaN entries in ens_w, including equivalence with dropping the NaN-weighted members and a non-default m_axis check.
sallen12 left a comment
Looks great, thanks a lot for doing this! I have a few minor comments, but otherwise this is ready to merge.
This pull request introduces handling of NaN values in ensemble CRPS scoring functions via NaN policy selection. Three policies are supported:
- 'propagate': if any member of an ensemble is NaN, the function returns NaN
- 'omit': the function will omit NaN values from the computation
- 'raise': the function will raise an error if any NaNs are present in the input arrays

These are now values given to a new nan_policy parameter for crps_ensemble, twcrps_ensemble, owcrps_ensemble, and vrcrps_ensemble.
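The three policies can be pictured as a small preprocessing step applied before scoring. The sketch below is an illustration only, not the implementation in this PR: `resolve_nan_policy` is a hypothetical helper, the choice of `ValueError` is an assumption, and it treats a NaN in either the members or their weights as a missing member.

```python
import numpy as np


def resolve_nan_policy(fct, ens_w, nan_policy="propagate"):
    """Return (fct, ens_w) prepared for scoring (hypothetical helper)."""
    # A member counts as missing if its value or its weight is NaN.
    bad = np.isnan(fct) | np.isnan(ens_w)
    if nan_policy == "propagate":
        # Leave the inputs untouched; NaN flows through the arithmetic.
        return fct, ens_w
    if nan_policy == "raise":
        if bad.any():
            raise ValueError("NaN values found in ensemble members or weights")
        return fct, ens_w
    if nan_policy == "omit":
        # Zero both weight and value so missing members drop out of the sums.
        return np.where(bad, 0.0, fct), np.where(bad, 0.0, ens_w)
    raise ValueError(f"unknown nan_policy: {nan_policy!r}")


fct = np.array([[0.1, np.nan, 0.9], [0.4, 0.2, 0.7]])
ens_w = np.array([[1.0, 1.0, 1.0], [np.nan, 1.0, 1.0]])
fct_clean, w_clean = resolve_nan_policy(fct, ens_w, nan_policy="omit")
print(fct_clean)
print(w_clean)
```

With 'omit', the cleaned arrays contain no NaNs and the affected members carry zero weight, so a weighted estimator would ignore them once the remaining weights are renormalized.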