Regression Metrics
Functional API
SeqMetrics also provides a functional API for all the performance metrics.
- SeqMetrics.r2(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
R2 is a statistical measure of how well the regression line approximates the actual data. It quantifies the percentage of variation in the response that the ‘model’ explains. The ‘model’ here is anything from which we obtained the predicted array. It is also called the coefficient of determination or the square of the Pearson correlation coefficient. It is more heavily affected by outliers than the Pearson correlation r.
\[R^2 = \left( \frac{\sum_{i=1}^{N} \left( \frac{true_i - \bar{true}}{\sigma_{true}} \cdot \frac{predicted_i - \bar{predicted}}{\sigma_{predicted}} \right)}{N - 1} \right)^2\]
where the bar above predicted and true indicates the mean of the array.
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import r2
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> r2(t, p)
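The formula above can be checked with plain NumPy. This is a minimal sketch, not the library's implementation: `r2_sketch` is a hypothetical helper, and it assumes the sample standard deviation (`ddof=1`) to match the N - 1 in the formula.

```python
import numpy as np

def r2_sketch(true, predicted):
    """Squared Pearson correlation, written term by term from the formula."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    n = true.size
    # standardise both arrays with the sample standard deviation (assumption: ddof=1)
    z_true = (true - true.mean()) / true.std(ddof=1)
    z_pred = (predicted - predicted.mean()) / predicted.std(ddof=1)
    r = np.sum(z_true * z_pred) / (n - 1)
    return float(r ** 2)

rng = np.random.default_rng(0)
t = rng.random(10)
p = rng.random(10)
value = r2_sketch(t, p)
# reference value: square of NumPy's own correlation coefficient
reference = np.corrcoef(t, p)[0, 1] ** 2
```

Because the value is a squared correlation, it always lies between 0 and 1 and is unchanged when `true` and `predicted` are swapped.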
- SeqMetrics.nse(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Nash-Sutcliffe Efficiency.
The Nash-Sutcliffe efficiency (NSE) is a normalized statistic that determines the relative magnitude of the residual variance compared to the measured data variance. It determines how well the model simulates trends for the output response of concern. However, it cannot help identify model bias and cannot be used to identify differences in timing and magnitude of peak flows and shape of recession curves; in other words, it cannot be used for single-event simulations. It is sensitive to extreme values due to the squared differences [1]. To make it less sensitive to outliers, [2] proposed log and relative NSE.
\[\text{NSE} = 1 - \frac{\sum_{i=1}^{N} (predicted_i - true_i)^2}{\sum_{i=1}^{N} (true_i - \bar{true})^2}\]
where the bar above predicted and true indicates the mean of the array.
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import nse
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> nse(t, p)
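A minimal NumPy sketch of the NSE formula above, useful for seeing its reference points. `nse_sketch` is a hypothetical helper written for illustration, not the library function.

```python
import numpy as np

def nse_sketch(true, predicted):
    """1 - (residual sum of squares) / (sum of squares of true around its mean)."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = np.sum((predicted - true) ** 2)
    variance = np.sum((true - true.mean()) ** 2)
    return float(1.0 - residual / variance)

t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
perfect = nse_sketch(t, t)                         # perfect prediction
mean_only = nse_sketch(t, np.full(5, t.mean()))    # always predicting the mean of true
biased = nse_sketch(t, t + 3.0)                    # constant offset of 3
```

A perfect prediction gives 1, predicting the mean of `true` gives 0, and anything worse than the mean gives a negative value (here -3.5).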
- SeqMetrics.nse_alpha(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Alpha decomposition of the NSE, see Gupta et al. 2009, used in Kratzert et al., 2019.
\[\text{NSE}_{\text{alpha}} = \frac{\sigma_{\text{predicted}}}{\sigma_{\text{true}}}\]
- Returns:
Alpha decomposition of the NSE
- Return type:
float
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import nse_alpha
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> nse_alpha(t, p)
- SeqMetrics.nse_beta(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Beta decomposition of the NSE, see Gupta et al. 2009, used in Kratzert et al., 2019.
\[\text{NSE}_{\text{beta}} = \frac{\mu_{\text{predicted}} - \mu_{\text{true}}}{\sigma_{\text{true}}}\]
- Returns:
Beta decomposition of the NSE
- Return type:
float
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import nse_beta
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> nse_beta(t, p)
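The alpha and beta terms can be sketched directly from their formulas. The helpers below are hypothetical and assume the population standard deviation (NumPy's default `ddof=0`); the library may use a different convention.

```python
import numpy as np

def nse_alpha_sketch(true, predicted):
    """Ratio of the standard deviations (variability term)."""
    return float(np.std(predicted) / np.std(true))

def nse_beta_sketch(true, predicted):
    """Difference of the means, scaled by the standard deviation of true."""
    return float((np.mean(predicted) - np.mean(true)) / np.std(true))

t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
p = 2.0 * t + 1.0          # twice the spread, shifted mean
alpha = nse_alpha_sketch(t, p)
beta = nse_beta_sketch(t, p)
```

Here alpha is 2 because the prediction has twice the spread of the observations, and beta is positive because the prediction overestimates the mean. An ideal simulation has alpha = 1 and beta = 0.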
- SeqMetrics.nse_mod(true, predicted, treat_arrays: bool = True, j=1, **treat_arrays_kws) float[source]
Gives less weight to outliers if j=1 and more weight to outliers if j>1. Reference: Krause et al., 2005.
\[\text{NSE}_{\text{mod}} = 1 - \frac{\sum_{i=1}^{N} \left| \text{predicted}_i - \text{true}_i \right|^j}{\sum_{i=1}^{N} \left| \text{true}_i - \bar{\text{true}} \right|^j}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
j – exponent applied to the absolute differences in the formula (1 by default)
Examples
>>> import numpy as np
>>> from SeqMetrics import nse_mod
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> nse_mod(t, p)
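The effect of the exponent j can be illustrated with a small NumPy sketch of the formula above. `nse_mod_sketch` is a hypothetical helper, not the library function.

```python
import numpy as np

def nse_mod_sketch(true, predicted, j=1):
    """Modified NSE with absolute differences raised to the power j."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    num = np.sum(np.abs(predicted - true) ** j)
    den = np.sum(np.abs(true - true.mean()) ** j)
    return float(1.0 - num / den)

t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
p = np.array([1.0, 2.0, 3.0, 4.0, 9.0])   # one large error on the last value
score_j1 = nse_mod_sketch(t, p, j=1)
score_j2 = nse_mod_sketch(t, p, j=2)      # same as the ordinary NSE
```

With a single large error, the score for j=2 is much lower than for j=1, which shows how larger exponents emphasise outliers.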
- SeqMetrics.nse_rel(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Relative Nash-Sutcliffe Efficiency.
\[\text{NSE}_{\text{rel}} = 1 - \frac{\sum_{i=1}^{N} \left( \frac{|\text{predicted}_i - \text{true}_i|}{\text{true}_i} \right)^2}{\sum_{i=1}^{N} \left( \frac{|\text{true}_i - \overline{\text{true}}|}{\overline{\text{true}}} \right)^2}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import nse_rel
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> nse_rel(t, p)
- SeqMetrics.nse_bound(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Bounded Version of the Nash-Sutcliffe Efficiency (nse)
\[\text{NSE}_{\text{bound}} = \frac{\text{NSE}}{2 - \text{NSE}}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import nse_bound
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> nse_bound(t, p)
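The bounding transform itself is a one-liner. The sketch below uses a hypothetical helper `bound` to show how it maps the unbounded range of NSE, from minus infinity up to 1, onto the interval from -1 to 1.

```python
def bound(score):
    """Map a score in (-inf, 1] onto (-1, 1] via score / (2 - score)."""
    return score / (2.0 - score)

best = bound(1.0)           # a perfect score stays at 1
neutral = bound(0.0)        # zero stays at zero
very_poor = bound(-1.0e6)   # very negative scores approach -1
```

The same transform appears in the bounded Kling-Gupta Efficiency formula in this section.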
- SeqMetrics.r2_score(true, predicted, treat_arrays: bool = True, weights=None, **treat_arrays_kws)[source]
This is not a symmetric function. Unlike most other scores, the R^2 score may be negative (it need not actually be the square of a quantity R). This metric is not well-defined for single samples and will return a NaN value if n_samples is less than two.
\[\text{R2}_{\text{score}} = 1 - \frac{\sum_{i=1}^{n} w_i (\text{true}_i - \text{predicted}_i)^2}{\sum_{i=1}^{n} w_i (\text{true}_i - \bar{\text{true}})^2}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
weights – optional per-sample weights (the w_i in the formula)
Examples
>>> import numpy as np
>>> from SeqMetrics import r2_score
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> r2_score(t, p)
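The asymmetry and the possibility of negative values can be seen with a small NumPy sketch of the formula above. `r2_score_sketch` is a hypothetical helper; it assumes the mean of `true` is also weighted by w, as scikit-learn does, which the formula does not state explicitly.

```python
import numpy as np

def r2_score_sketch(true, predicted, weights=None):
    """Weighted 1 - SS_res / SS_tot, following the formula above."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    w = np.ones_like(true) if weights is None else np.asarray(weights, dtype=float)
    true_mean = np.average(true, weights=w)   # assumption: weighted mean
    num = np.sum(w * (true - predicted) ** 2)
    den = np.sum(w * (true - true_mean) ** 2)
    return float(1.0 - num / den)

t = np.array([1.0, 2.0, 3.0, 4.0])
p = np.array([2.0, 2.0, 2.0, 8.0])
forward = r2_score_sketch(t, p)    # t treated as the observations
backward = r2_score_sketch(p, t)   # arguments swapped
```

Swapping the arguments changes only the denominator, so the two calls give different values (about -2.6 and 0.33 here).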
- SeqMetrics.adjusted_r2(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Adjusted R squared.
\[\text{Adjusted } R^2 = 1 - \left( \frac{(1 - R^2) \cdot (n - 1)}{n - k - 1} \right)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import adjusted_r2
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> adjusted_r2(t, p)
- SeqMetrics.kge(true, predicted, treat_arrays: bool = True, return_all=False, **treat_arrays_kws)[source]
Kling-Gupta Efficiency following Gupta et al. 2009. This metric considers correlation, variability and mean difference/error.
\[\text{KGE} = 1 - \sqrt{(r - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2}\]
\[\alpha = \frac{\sigma_{\text{predicted}}}{\sigma_{\text{true}}}\]
\[\beta = \frac{\mu_{\text{predicted}}}{\mu_{\text{true}}}\]
In this equation, alpha accounts for the variability (standard deviation), beta accounts for the mean difference and r accounts for the correlation between the true and predicted values. The correlation term r is the Pearson correlation coefficient:
\[r = \frac{\sum_{i=1}^{N} ( \text{true}_i - \bar{\text{true}} ) ( \text{predicted}_i - \bar{\text{predicted}} )}{\sqrt{\sum_{i=1}^{N} ( \text{true}_i - \bar{\text{true}} )^2} \sqrt{\sum_{i=1}^{N} ( \text{predicted}_i - \bar{\text{predicted}} )^2}}\]
- output:
If return_all is True, it returns a numpy array of shape (4, ) containing kge, cc, alpha, beta. Otherwise, it returns kge.
kge: Kling-Gupta Efficiency
cc: correlation
alpha: ratio of the standard deviation
beta: ratio of the mean
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
return_all – if True, return kge, cc, alpha and beta as an array; otherwise return only kge
Examples
>>> import numpy as np
>>> from SeqMetrics import kge
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> kge(t, p)
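The three components and the way they combine can be sketched in NumPy as follows. `kge_sketch` is a hypothetical helper that mirrors the formulas and the documented output order (kge, cc, alpha, beta); it assumes the population standard deviation and is not the library's implementation.

```python
import numpy as np

def kge_sketch(true, predicted, return_all=False):
    """KGE from correlation (cc), variability ratio (alpha) and mean ratio (beta)."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    cc = np.corrcoef(true, predicted)[0, 1]
    alpha = np.std(predicted) / np.std(true)
    beta = np.mean(predicted) / np.mean(true)
    kge = 1.0 - np.sqrt((cc - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    if return_all:
        return np.array([kge, cc, alpha, beta])
    return float(kge)

t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
p = 2.0 * t                                  # perfectly correlated, but doubled
components = kge_sketch(t, p, return_all=True)
```

In this example the correlation is perfect (cc = 1), yet alpha and beta are both 2, so the KGE drops to 1 - sqrt(2). This shows that a high correlation alone does not give a high KGE.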
- SeqMetrics.kge_bound(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Bounded Version of the Original Kling-Gupta Efficiency after Mathevet et al. 2006.
\[\text{KGE}_{\text{bound}} = \frac{\text{KGE}}{2 - \text{KGE}}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import kge_bound
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> kge_bound(t, p)
- SeqMetrics.kge_mod(true, predicted, treat_arrays: bool = True, return_all=False, **treat_arrays_kws)[source]
Modified Kling-Gupta Efficiency after Kling et al. 2012.
\[\text{KGE}_{\text{mod}} = 1 - \sqrt{ \left( \frac{\sum_{i=1}^{n} (true_i - \bar{true})(predicted_i - \bar{predicted})}{\sqrt{\sum_{i=1}^{n} (true_i - \bar{true})^2} \sqrt{\sum_{i=1}^{n} (predicted_i - \bar{predicted})^2}} - 1 \right)^2 + \left( \frac{\frac{\sigma_{predicted}}{\bar{predicted}}}{\frac{\sigma_{true}}{\bar{true}}} - 1 \right)^2 + \left( \frac{\bar{predicted}}{\bar{true}} - 1 \right)^2}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
return_all –
Examples
>>> import numpy as np
>>> from SeqMetrics import kge_mod
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> kge_mod(t, p)
- SeqMetrics.kge_np(true, predicted, treat_arrays: bool = True, return_all=False, **treat_arrays_kws)[source]
Non-parametric Kling-Gupta Efficiency after Pool et al. 2018.
\[cc = \rho(\text{true}, \text{predicted})\]
\[\alpha = 1 - 0.5 \sum_{i=1}^{n} \left| \frac{\text{sorted(predicted}_i\text{)}}{\text{mean(predicted)} \cdot n} - \frac{\text{sorted(true}_i\text{)}}{\text{mean(true)} \cdot n} \right|\]
\[\beta = \frac{\text{mean(predicted)}}{\text{mean(true)}}\]
\[\text{KGE}_{\text{np}} = 1 - \sqrt{(cc - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
return_all –
- output:
kge: Kling-Gupta Efficiency
cc: correlation
alpha: ratio of the standard deviation
beta: ratio of the mean
Examples
>>> import numpy as np
>>> from SeqMetrics import kge_np
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> kge_np(t, p)
- SeqMetrics.corr_coeff(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Pearson correlation coefficient. It measures linear correlation between true and predicted arrays. It is sensitive to outliers.
\[r = \frac{\sum ^n _{i=1}(e_i - \bar{e})(s_i - \bar{s})}{\sqrt{\sum ^n _{i=1}(e_i - \bar{e})^2} \sqrt{\sum ^n _{i=1}(s_i - \bar{s})^2}}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import corr_coeff
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> corr_coeff(t, p)
- SeqMetrics.rmse(true, predicted, treat_arrays: bool = True, weights=None, **treat_arrays_kws) float[source]
Root Mean Square Error, optionally weighted.
\[\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} w_i (\text{true}_i - \text{predicted}_i)^2}{\sum_{i=1}^{n} w_i}}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
weights – optional per-sample weights (the w_i in the formula)
Examples
>>> import numpy as np
>>> from SeqMetrics import rmse
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> rmse(t, p)
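A minimal NumPy sketch of the weighted formula above. `rmse_sketch` is a hypothetical helper; it relies on `np.average`, which divides the weighted sum by the sum of the weights.

```python
import numpy as np

def rmse_sketch(true, predicted, weights=None):
    """Square root of the (optionally weighted) mean squared error."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    squared_error = (true - predicted) ** 2
    # with weights=None this is the plain mean
    return float(np.sqrt(np.average(squared_error, weights=weights)))

t = np.array([1.0, 2.0, 3.0, 4.0])
p = np.array([1.0, 2.0, 3.0, 8.0])                     # only the last value is wrong
plain = rmse_sketch(t, p)                              # sqrt(16 / 4) = 2.0
masked = rmse_sketch(t, p, weights=[1, 1, 1, 0])       # the wrong sample gets zero weight
```

Setting a weight to zero removes that sample from the score, which is one way to exclude known bad observations.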
- SeqMetrics.rmsle(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Root mean square log error.
This error is less sensitive to outliers. Compared to RMSE, RMSLE only considers the relative error between predicted and actual values, and the scale of the error is nullified by the log-transformation. Furthermore, RMSLE penalizes underestimation more than overestimation. This is especially useful in those studies where the underestimation of the target variable is not acceptable but overestimation can be tolerated.
\[RMSLE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log(1 + \text{predicted}_i) - \log(1 + \text{true}_i) \right)^2}\]
References
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.root_mean_squared_log_error.html
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import rmsle
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> rmsle(t, p)
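The claim that underestimation is penalized more than overestimation can be checked with a short NumPy sketch of the formula above. `rmsle_sketch` is a hypothetical helper, not the library function.

```python
import numpy as np

def rmsle_sketch(true, predicted):
    """Root mean square of the differences between log(1 + x) values."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    log_diff = np.log1p(predicted) - np.log1p(true)
    return float(np.sqrt(np.mean(log_diff ** 2)))

t = np.array([100.0, 100.0])
under = rmsle_sketch(t, t - 50.0)   # predictions 50 units too low
over = rmsle_sketch(t, t + 50.0)    # predictions 50 units too high
```

Both predictions are off by the same absolute amount, but the underestimate gives the larger error (about 0.68 versus 0.40).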
- SeqMetrics.mape(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean Absolute Percentage Error. The MAPE is often used when the quantity to predict is known to remain well above zero. It is useful when the size of the predicted variable is significant in evaluating the accuracy of a prediction. It has the advantages of scale-independency and interpretability. However, it has the significant disadvantage that it produces infinite or undefined values for zero or close-to-zero actual values.
\[MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{true_i - predicted_i}{true_i} \right| \times 100\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import mape
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> mape(t, p)
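The zero-value problem described above is easy to reproduce. The sketch below uses a hypothetical helper `mape_sketch` that follows the formula literally; the library function may treat zeros differently, for example through its array pre-processing.

```python
import numpy as np

def mape_sketch(true, predicted):
    """Mean of |(true - predicted) / true|, expressed in percent."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((true - predicted) / true)) * 100)

t = np.array([10.0, 20.0, 40.0])
p = np.array([11.0, 18.0, 44.0])
value = mape_sketch(t, p)            # every error is 10 % of its true value

# a single zero in the observations makes the result infinite
with np.errstate(divide="ignore"):
    blown_up = mape_sketch(np.array([0.0, 20.0, 40.0]), p)
```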
- SeqMetrics.nrmse(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Normalized Root Mean Squared Error
\[NRMSE = \frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} (\text{true}_i - \text{predicted}_i)^2}}{\max(\text{true}) - \min(\text{true})}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import nrmse
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> nrmse(t, p)
- SeqMetrics.pbias(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Percent Bias. It determines how well the model simulates the average magnitudes for the output response of interest. It can also determine over- and under-prediction. It cannot be used (1) for single-event simulations to identify differences in timing and magnitude of peak flows and the shape of recession curves, nor (2) to determine how well the model simulates residual variations and/or trends for the output response of interest. It can give a deceiving rating of model performance if the model overpredicts as much as it underpredicts, in which case PBIAS will be close to zero even though the model simulation is poor. [1]
\[PBIAS = 100 \times \frac{\sum_{i=1}^{N} (\text{true}_i - \text{predicted}_i)}{\sum_{i=1}^{N} \text{true}_i}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import pbias
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> pbias(t, p)
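The deceiving case mentioned above, where over- and under-prediction cancel out, can be reproduced with a short NumPy sketch of the formula. `pbias_sketch` is a hypothetical helper, not the library function.

```python
import numpy as np

def pbias_sketch(true, predicted):
    """100 * sum(true - predicted) / sum(true)."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(100.0 * np.sum(true - predicted) / np.sum(true))

t = np.array([10.0, 10.0, 10.0, 10.0])
p = np.array([5.0, 15.0, 5.0, 15.0])    # every value is wrong by 5
compensated = pbias_sketch(t, p)        # errors cancel
under = pbias_sketch(t, t - 2.0)        # consistent under-prediction
```

The first simulation is off by 50 % at every point yet scores a PBIAS of 0. With the sign convention of this formula, a consistent under-prediction gives a positive value (20 here).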
- SeqMetrics.bias(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Bias as given by Gupta et al., 1998 (Table 1).
\[Bias=\frac{1}{N}\sum_{i=1}^{N}(e_{i}-s_{i})\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import bias
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> bias(t, p)
- SeqMetrics.med_seq_error(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Median Squared Error. Same as mse, but it takes the median, which reduces the impact of outliers.
\[\text{MedSE} = \text{median} \left( (\text{predicted}_i - \text{true}_i)^2 \right)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import med_seq_error
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> med_seq_error(t, p)
- SeqMetrics.mae(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean Absolute Error. It is less sensitive to outliers as compared to mse/rmse.
\[\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import mae
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> mae(t, p)
- SeqMetrics.gmae(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Geometric Mean Absolute Error.
\[GMAE = \left( \prod_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right| \right)^{\frac{1}{n}}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import gmae
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> gmae(t, p)
- SeqMetrics.inrse(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Integral Normalized Root Squared Error
\[IN\text{-}RSE = \sqrt{\frac{\sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}{\sum_{i=1}^{n} (\text{true}_i - \overline{\text{true}})^2}}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import inrse
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> inrse(t, p)
- SeqMetrics.irmse(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Inertial RMSE. RMSE divided by standard deviation of the gradient of true.
\[\text{IRMSE} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \text{true}_i - \text{predicted}_i \right)^2}}{\sqrt{\frac{1}{n-2} \sum_{i=1}^{n-1} \left( (\text{true}_{i+1} - \text{true}_i) - \overline{(\text{true}_{i+1} - \text{true}_i)} \right)^2}}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import irmse
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> irmse(t, p)
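The denominator of the formula above is the sample standard deviation of the first differences of `true`. A minimal NumPy sketch, with `irmse_sketch` as a hypothetical helper rather than the library function:

```python
import numpy as np

def irmse_sketch(true, predicted):
    """RMSE divided by the standard deviation of the step-to-step changes in true."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((true - predicted) ** 2))
    # np.diff gives n - 1 differences; ddof=1 then divides by n - 2, as in the formula
    gradient_std = np.std(np.diff(true), ddof=1)
    return float(rmse / gradient_std)

t = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 7.0])
p = t + 0.5
score = irmse_sketch(t, p)
```

The score relates the model error to how much the series itself changes from one step to the next, so the same RMSE counts for less on a strongly fluctuating series.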
- SeqMetrics.mase(true, predicted, treat_arrays: bool = True, seasonality: int = 1, **treat_arrays_kws)[source]
Mean Absolute Scaled Error. Baseline (benchmark) is computed with naive forecasting (shifted by seasonality) modified after this. It is the ratio of MAE of used model and MAE of naive forecast.
\[\text{MASE} = \frac{\frac{1}{n} \sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|}{\frac{1}{n-s} \sum_{i=s+1}^{n} \left| \text{true}_i - \text{true}_{i-s} \right|}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
seasonality – shift s, in time steps, used for the naive forecast in the formula (1 by default)
Examples
>>> import numpy as np
>>> from SeqMetrics import mase
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> mase(t, p)
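The role of the naive forecast and of `seasonality` can be sketched in NumPy as follows. `mase_sketch` is a hypothetical helper that follows the formula above; it is not the library's implementation.

```python
import numpy as np

def mase_sketch(true, predicted, seasonality=1):
    """MAE of the model divided by the MAE of a naive forecast shifted by `seasonality`."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    model_mae = np.mean(np.abs(true - predicted))
    # naive forecast: the value observed `seasonality` steps earlier
    naive_mae = np.mean(np.abs(true[seasonality:] - true[:-seasonality]))
    return float(model_mae / naive_mae)

t = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 7.0])
p = t + 0.5
score = mase_sketch(t, p)                       # naive MAE is 2.0
score_s2 = mase_sketch(t, p, seasonality=2)     # naive MAE is 1.75
```

A value below 1 means the model beats the naive forecast. Changing `seasonality` changes only the benchmark, not the model error.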
- SeqMetrics.mare(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean Absolute Relative Error. When expressed as a percentage, it is also known as mape.
\[\text{MARE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\text{true}_i - \text{predicted}_i}{\text{true}_i} \right|\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import mare
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> mare(t, p)
- SeqMetrics.msle(true, predicted, treat_arrays=True, weights=None, **treat_arrays_kws) float[source]
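Mean Squared Logarithmic Error.
\[\text{MSLE} = \frac{\sum_{i=1}^{n} w_i \cdot \text{sq\_log\_error}_i}{\sum_{i=1}^{n} w_i}\]
where sq_log_error_i is the squared difference between log(1 + true_i) and log(1 + predicted_i), as in rmsle.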
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
weights – optional per-sample weights (the w_i in the formula)
Examples
>>> import numpy as np
>>> from SeqMetrics import msle
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> msle(t, p)
- SeqMetrics.covariance(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Covariance between the true and predicted arrays.
\[Covariance = \frac{1}{N} \sum_{i=1}^{N}((e_{i} - \bar{e}) * (s_{i} - \bar{s}))\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import covariance
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> covariance(t, p)
- SeqMetrics.brier_score(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Adopted from SkillMetrics. Calculates the Brier score (BS), a measure of the mean-square error of probability forecasts for a dichotomous (two-category) event, such as the occurrence/non-occurrence of precipitation. The score is calculated using the formula:
\[BS = \frac{1}{N} \sum_{n=1}^{N} (f_n - o_n)^2\]
where f is the forecast probabilities, o is the observed probabilities (0 or 1), and N is the total number of values in f & o. Note that f & o must have the same number of values, and those values must be in the range [0,1].
- Returns:
BS : Brier score
- Return type:
float
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import brier_score
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> brier_score(t, p)
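A minimal NumPy sketch of the Brier score formula with dichotomous observations. `brier_score_sketch` is a hypothetical helper, and the forecast values are made up for illustration.

```python
import numpy as np

def brier_score_sketch(observed, forecast):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    observed = np.asarray(observed, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean((forecast - observed) ** 2))

o = np.array([1, 0, 1, 0])              # event occurred / did not occur
f = np.array([0.9, 0.2, 0.6, 0.1])      # forecast probabilities
bs = brier_score_sketch(o, f)
always_half = brier_score_sketch(o, np.full(4, 0.5))
```

Lower is better: a perfect forecast scores 0, and always forecasting a probability of 0.5 scores 0.25 regardless of the outcomes.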
- SeqMetrics.bic(true, predicted, treat_arrays: bool = True, p=1, **treat_arrays_kws) float[source]
Bayesian Information Criterion
Minimising the BIC is intended to give the best model. The model chosen by the BIC is either the same as that chosen by the AIC, or one with fewer terms. This is because the BIC penalises the number of parameters more heavily than the AIC. Modified after RegscorePy.
\[BIC = n \cdot \ln\left(\frac{\text{SSE}}{n}\right) + p \cdot \ln(n)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
p – number of model parameters (1 by default)
Examples
>>> import numpy as np
>>> from SeqMetrics import bic
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> bic(t, p)
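The heavier parameter penalty of the BIC can be seen by computing it next to the AIC formula given in this section. The helper `information_criteria` below is hypothetical and only mirrors the two formulas.

```python
import numpy as np

def information_criteria(true, predicted, p=1):
    """Return (AIC, BIC) computed from the sum of squared errors."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    n = true.size
    sse = np.sum((true - predicted) ** 2)
    fit_term = n * np.log(sse / n)      # shared by both criteria
    aic = fit_term + 2 * p
    bic = fit_term + p * np.log(n)
    return float(aic), float(bic)

t = np.arange(1.0, 21.0)                # 20 samples
pred = t + 0.5
aic3, bic3 = information_criteria(t, pred, p=3)
```

The two criteria differ only in the penalty: 2 per parameter for the AIC and ln(n) per parameter for the BIC. The BIC penalty is therefore the larger one whenever n is at least 8, since ln(8) is greater than 2.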
- SeqMetrics.sse(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Sum of squared errors (model vs actual). It is a measure of how far off our model’s predictions are from the observed values. A value of 0 indicates that all predictions are spot on. A non-zero value indicates errors.
This is also called residual sum of squares (RSS) or sum of squared residuals as per tutorialspoint.
\[\text{SSE} = \sum_{i=1}^{n} (true_i - predicted_i)^2\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import sse
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> sse(t, p)
- SeqMetrics.amemiya_pred_criterion(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Amemiya’s Prediction Criterion
\[\text{APC} = \left( \frac{n + k}{n - k} \right) \left( \frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2 \right)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import amemiya_pred_criterion
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> amemiya_pred_criterion(t, p)
- SeqMetrics.amemiya_adj_r2(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Amemiya’s Adjusted R squared.
\[R^2_{\text{adj, Amemiya}} = 1 - \left( \frac{(1 - R^2) \cdot (n + k)}{n - k - 1} \right)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import amemiya_adj_r2
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> amemiya_adj_r2(t, p)
- SeqMetrics.aitchison(true, predicted, treat_arrays: bool = True, center='mean', **treat_arrays_kws) float[source]
Aitchison distance, as used in Zhang et al., 2020.
\[d_{\text{Aitchison}} = \sqrt{\sum_{i=1}^{n} \left( \log(\text{true}_i) - \text{center}(\log(\text{true})) - \left(\log(\text{predicted}_i) - \text{center}(\log(\text{predicted}))\right) \right)^2}\]
https://doi.org/10.1007/bf00891269
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
center – statistic used to center the log-transformed arrays in the formula ('mean' by default)
Examples
>>> import numpy as np
>>> from SeqMetrics import aitchison
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> aitchison(t, p)
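A minimal NumPy sketch of the distance formula above, assuming the default `center='mean'` and strictly positive inputs. `aitchison_sketch` is a hypothetical helper, not the library function.

```python
import numpy as np

def aitchison_sketch(true, predicted):
    """Euclidean distance between the mean-centered log-transformed arrays."""
    log_true = np.log(np.asarray(true, dtype=float))
    log_pred = np.log(np.asarray(predicted, dtype=float))
    centered_true = log_true - log_true.mean()
    centered_pred = log_pred - log_pred.mean()
    return float(np.sqrt(np.sum((centered_true - centered_pred) ** 2)))

t = np.array([1.0, 2.0, 4.0])
scaled = aitchison_sketch(t, 10.0 * t)                   # same proportions, different scale
flat = aitchison_sketch(t, np.array([1.0, 1.0, 1.0]))    # different proportions
```

Because both arrays are centered after the log transform, multiplying one of them by a constant does not change the distance. The metric compares relative proportions rather than absolute magnitudes.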
- SeqMetrics.aic(true, predicted, treat_arrays: bool = True, p=1, **treat_arrays_kws) float[source]
Akaike Information Criterion. Modified after this source.
\[AIC = n \cdot \ln\left(\frac{\sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}{n}\right) + 2p\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
p – number of model parameters (1 by default)
Examples
>>> import numpy as np
>>> from SeqMetrics import aic
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> aic(t, p)
- SeqMetrics.acc(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Anomaly correlation coefficient. See Langland et al., 2012; Miyakoda et al., 1972 and Murphy et al., 1989.
\[ACC = \frac{\sum_{i=1}^{N} \left( (\text{predicted}_i - \overline{\text{predicted}})(\text{true}_i - \overline{\text{true}}) \right)}{(N-1) \cdot \sigma_{\text{true}} \cdot \sigma_{\text{predicted}}}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import acc
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> acc(t, p)
- SeqMetrics.cronbach_alpha(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
It is a measure of internal consistency of data. See the UCLA and Stack Overflow pages for more info.
\[\alpha = \frac{N}{N - 1} \left(1 - \frac{\sum_{i=1}^{N} \sigma^2_{i}}{\sigma^2_{\text{total}}}\right)\]
https://doi.org/10.1016/B0-12-369398-5/00396-0
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import cronbach_alpha
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> cronbach_alpha(t, p)
- SeqMetrics.cosine_similarity(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
It is a judgment of orientation and not magnitude: two vectors with the same orientation have a cosine similarity of 1, two vectors oriented at 90° relative to each other have a similarity of 0, and two vectors diametrically opposed have a similarity of -1, independent of their magnitude.
\[\text{Cosine Similarity} = \frac{\sum_{i=1}^{n} \text{true}_i \cdot \text{predicted}_i}{\sqrt{\sum_{i=1}^{n} (\text{true}_i)^2} \cdot \sqrt{\sum_{i=1}^{n} (\text{predicted}_i)^2}}\]
References
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_similarity.html
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import cosine_similarity
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> cosine_similarity(t, p)
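The three reference cases described above (same orientation, orthogonal, opposed) can be reproduced with a short NumPy sketch. `cosine_similarity_sketch` is a hypothetical helper that follows the formula, not the library function.

```python
import numpy as np

def cosine_similarity_sketch(true, predicted):
    """Dot product divided by the product of the Euclidean norms."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.dot(true, predicted) / (np.linalg.norm(true) * np.linalg.norm(predicted)))

a = np.array([1.0, 2.0, 3.0])
same = cosine_similarity_sketch(a, 5.0 * a)                    # same orientation
orthogonal = cosine_similarity_sketch([1.0, 0.0], [0.0, 1.0])  # 90 degrees apart
opposed = cosine_similarity_sketch(a, -a)                      # opposite orientation
```

Scaling a vector by 5 leaves the similarity at 1, so a model that is off by a constant factor still scores perfectly on this metric.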
- SeqMetrics.decomposed_mse(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Decomposed MSE developed by Kobayashi and Salam (2000), Equation 24.
\[dMSE = \left(\frac{1}{N}\sum_{i=1}^{N}(e_{i}-s_{i})\right)^2 + SDSD + LCS\]
\[SDSD = (\sigma(e) - \sigma(s))^2\]
\[LCS = 2 \sigma(e) \sigma(s) \left(1 - \frac{\sum ^n _{i=1}(e_i - \bar{e})(s_i - \bar{s})} {\sqrt{\sum ^n _{i=1}(e_i - \bar{e})^2} \sqrt{\sum ^n _{i=1}(s_i - \bar{s})^2}}\right)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import decomposed_mse >>> t = np.random.random(10) >>> p = np.random.random(10) >>> decomposed_mse(t, p)
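To see how the three terms fit together, here is a hedged plain-Python sketch. It assumes that e stands for the true array and s for the predicted array, and that σ is the population standard deviation (dividing by n); neither is stated in the entry above. The name `decomposed_mse_sketch` is hypothetical.

```python
import math

def decomposed_mse_sketch(true, predicted):
    """Return (squared bias, SDSD, LCS); assumes e = true and s = predicted."""
    n = len(true)
    mean_e = sum(true) / n
    mean_s = sum(predicted) / n
    # Population standard deviations (divide by n): an assumption of this sketch
    std_e = math.sqrt(sum((e - mean_e) ** 2 for e in true) / n)
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in predicted) / n)
    cov = sum((e - mean_e) * (s - mean_s) for e, s in zip(true, predicted)) / n
    r = cov / (std_e * std_s)  # Pearson correlation, the fraction inside LCS
    bias_sq = (mean_e - mean_s) ** 2
    sdsd = (std_e - std_s) ** 2
    lcs = 2 * std_e * std_s * (1 - r)
    return bias_sq, sdsd, lcs

t = [1.0, 2.0, 4.0, 7.0]
p = [1.5, 1.5, 5.0, 6.0]
bias_sq, sdsd, lcs = decomposed_mse_sketch(t, p)
plain_mse = sum((a - b) ** 2 for a, b in zip(t, p)) / len(t)
```

Under these assumptions the three terms add up to the ordinary mean squared error, so the decomposition shows where the error comes from without changing its total.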
- SeqMetrics.euclid_distance(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Euclidean distance, taken from this book: https://doi.org/10.1016/B978-0-12-088735-4.50006-7
\[D = \sqrt{\sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import euclid_distance >>> t = np.random.random(10) >>> p = np.random.random(10) >>> euclid_distance(t, p)
- SeqMetrics.exp_var_score(true, predicted, treat_arrays: bool = True, weights=None, **treat_arrays_kws) float | None[source]
Explained variance score. Best value is 1; lower values are less accurate.
\[\text{EVS} = 1 - \frac{\sum_{i=1}^{n} w_i \left( (true_i - predicted_i) - \frac{\sum_{j=1}^{n} w_j (true_j - predicted_j)}{\sum_{j=1}^{n} w_j} \right)^2}{\sum_{i=1}^{n} w_i (true_i - \frac{\sum_{j=1}^{n} w_j true_j}{\sum_{j=1}^{n} w_j})^2}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
weights –
Examples
>>> import numpy as np >>> from SeqMetrics import exp_var_score >>> t = np.random.random(10) >>> p = np.random.random(10) >>> exp_var_score(t, p)
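The weighted formula above can be sketched in plain Python as follows. The helper name `exp_var_score_sketch` is hypothetical, and treating `weights=None` as equal weights is an assumption of this sketch rather than documented behaviour.

```python
def exp_var_score_sketch(true, predicted, weights=None):
    """Weighted explained variance score following the equation above."""
    n = len(true)
    # Assumption: no weights means every sample counts equally
    w = [1.0] * n if weights is None else list(weights)
    w_sum = sum(w)
    residuals = [t - p for t, p in zip(true, predicted)]
    mean_res = sum(wi * r for wi, r in zip(w, residuals)) / w_sum
    mean_true = sum(wi * t for wi, t in zip(w, true)) / w_sum
    num = sum(wi * (r - mean_res) ** 2 for wi, r in zip(w, residuals))
    den = sum(wi * (t - mean_true) ** 2 for wi, t in zip(w, true))
    return 1.0 - num / den

t = [1.0, 2.0, 3.0, 4.0]
perfect = exp_var_score_sketch(t, t)
# A constant offset leaves the variance of the residuals at zero
shifted = exp_var_score_sketch(t, [v + 10.0 for v in t])
```

Because the mean residual is subtracted in the numerator, a prediction that is off by a constant still scores 1. The score therefore measures explained variance, not bias.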
- SeqMetrics.expanded_uncertainty(true, predicted, treat_arrays: bool = True, cov_fact=1.96, **treat_arrays_kws) float[source]
By default, it calculates uncertainty with a 95% confidence interval; 1.96 is the coverage factor corresponding to the 95% confidence level. This indicator is used to show more information about the model deviation. The formula follows Behar et al., 2015 and Gueymard et al., 2014.
\[U = \text{cov_fact} \times \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( \left(\text{true}_i - \text{predicted}_i\right) - \overline{\left(\text{true} - \text{predicted}\right)} \right)^2 + \frac{1}{n} \sum_{i=1}^{n} \left(\text{true}_i - \text{predicted}_i\right)^2}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
cov_fact – coverage factor, by default 1.96 (95% confidence level)
Examples
>>> import numpy as np >>> from SeqMetrics import expanded_uncertainty >>> t = np.random.random(10) >>> p = np.random.random(10) >>> expanded_uncertainty(t, p)
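Read term by term, the equation combines the sample variance of the errors with the mean squared error and scales the square root by the coverage factor. A minimal plain-Python sketch, with the hypothetical name `expanded_uncertainty_sketch`, is given below as an illustration of the formula only.

```python
import math

def expanded_uncertainty_sketch(true, predicted, cov_fact=1.96):
    """cov_fact * sqrt(sample variance of the errors + mean squared error)."""
    n = len(true)
    errors = [t - p for t, p in zip(true, predicted)]
    mean_err = sum(errors) / n
    sd_sq = sum((e - mean_err) ** 2 for e in errors) / (n - 1)  # first term under the root
    rmse_sq = sum(e ** 2 for e in errors) / n                   # second term under the root
    return cov_fact * math.sqrt(sd_sq + rmse_sq)

# errors = [0, 0, -3]: sample variance = 3 and mean squared error = 3
u95 = expanded_uncertainty_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 6.0])
```

Passing a different `cov_fact` rescales the result linearly, since the factor sits outside the square root.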
- SeqMetrics.fdc_fhv(true, predicted, treat_arrays: bool = True, h: float = 0.02, **treat_arrays_kws) float[source]
Modified Kratzert2018 code. Peak flow bias of the flow duration curve (Yilmaz 2008), used in Kratzert et al., 2019.
\[FHV = \frac{\sum_{i=1}^{k} (predicted_i - true_i)}{\sum_{i=1}^{k} true_i} \times 100\]- Parameters:
h (float) – Must be between 0 and 1.
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
- Return type:
Bias of the peak flows
Examples
>>> import numpy as np >>> from SeqMetrics import fdc_fhv >>> t = np.random.random(10) >>> p = np.random.random(10) >>> fdc_fhv(t, p)
- SeqMetrics.fdc_flv(true, predicted, treat_arrays: bool = True, low_flow: float = 0.3, **treat_arrays_kws) float[source]
Bias of the bottom 30% low flows. Modified Kratzert code, used in Kratzert et al., 2019.
\[\text{FLV} = -1 \times \frac{\sum (\log(\text{predicted}) - \min(\log(\text{predicted}))) - \sum (\log(\text{true}) - \min(\log(\text{true})))}{\sum (\log(\text{true}) - \min(\log(\text{true}))) + 1 \times 10^{-6}}\]- Parameters:
low_flow (float, optional) – Upper limit of the flow duration curve. E.g. 0.3 means the bottom 30% of the flows are considered as low flows, by default 0.3
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
- Return type:
float
Examples
>>> import numpy as np >>> from SeqMetrics import fdc_flv >>> t = np.random.random(10) >>> p = np.random.random(10) >>> fdc_flv(t, p)
- SeqMetrics.gmean_diff(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Geometric mean difference. First geometric mean is calculated for true and predicted arrays and their difference is calculated.
\[\text{gmean_diff} = \left( \prod_{i=1}^{n} \text{true}_i \right)^{\frac{1}{n}} - \left( \prod_{i=1}^{n} \text{predicted}_i \right)^{\frac{1}{n}}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import gmean_diff >>> t = np.random.random(10) >>> p = np.random.random(10) >>> gmean_diff(t, p)
- SeqMetrics.gmrae(true, predicted, treat_arrays: bool = True, benchmark: ndarray | None = None, **treat_arrays_kws) float[source]
Geometric Mean Relative Absolute Error
\[GMRAE = \left( \prod_{i=1}^{n} \frac{|true_i - predicted_i|}{|true_i - benchmark_i|} \right)^{\frac{1}{n}}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
benchmark –
Examples
>>> import numpy as np >>> from SeqMetrics import gmrae >>> t = np.random.random(10) >>> p = np.random.random(10) >>> gmrae(t, p)
- SeqMetrics.calculate_hydro_metrics(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) dict[source]
- Calculates the following performance metrics related to hydrology.
fdc_flv
fdc_fhv
kge
kge_np
kge_mod
kge_bound
kgeprime_bound
kgenp_bound
nse
nse_alpha
nse_beta
nse_mod
nse_bound
r2
mape
nrmse
corr_coeff
rmse
mae
mse
mpe
mase
r2_score
- Returns:
Dictionary with all metrics
- Return type:
dict
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import calculate_hydro_metrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> calculate_hydro_metrics(t, p)
- SeqMetrics.JS(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Jensen-Shannon divergence.
\[JS(P \parallel Q) = \frac{1}{2} \sum_{i} \left( P(i) \log_2 \left( \frac{2P(i)}{P(i) + Q(i)} \right) + Q(i) \log_2 \left( \frac{2Q(i)}{P(i) + Q(i)} \right) \right)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import JS >>> t = np.random.random(10) >>> p = np.random.random(10) >>> JS(t, p)
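The equation above can be sketched in plain Python as follows. This sketch assumes that P and Q are already non-negative and each sum to 1; whether the library normalizes its inputs is not stated here. Terms with a zero probability are skipped, following the usual convention that 0 · log 0 = 0. The name `js_divergence_sketch` is hypothetical.

```python
import math

def js_divergence_sketch(p, q):
    """Jensen-Shannon divergence (base 2) of two discrete distributions."""
    total = 0.0
    for pi, qi in zip(p, q):
        m = pi + qi
        if pi > 0:  # skip 0 * log(0) terms
            total += pi * math.log2(2 * pi / m)
        if qi > 0:
            total += qi * math.log2(2 * qi / m)
    return 0.5 * total

identical = js_divergence_sketch([0.5, 0.5], [0.5, 0.5])
disjoint = js_divergence_sketch([1.0, 0.0], [0.0, 1.0])
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (distributions with no overlap).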
- SeqMetrics.kgenp_bound(true, predicted, treat_arrays: bool = True, **treat_arrays_kws)[source]
Bounded Version of the Non-Parametric Kling-Gupta Efficiency
\[KGE_{np_{bound}} = \frac{1 - \sqrt{\left(\rho(t, p) - 1\right)^2 + \left(1 - 0.5 \sum_{i=1}^{n} \left| \frac{\text{sorted}(p_i)}{\text{mean}(p) \cdot n} - \frac{\text{sorted}(t_i)}{\text{mean}(t) \cdot n} \right| - 1\right)^2 + \left(\frac{\text{mean}(p)}{\text{mean}(t)} - 1\right)^2}}{2 - \left(1 - \sqrt{\left(\rho(t, p) - 1\right)^2 + \left(1 - 0.5 \sum_{i=1}^{n} \left| \frac{\text{sorted}(p_i)}{\text{mean}(p) \cdot n} - \frac{\text{sorted}(t_i)}{\text{mean}(t) \cdot n} \right| - 1\right)^2 + \left(\frac{\text{mean}(p)}{\text{mean}(t)} - 1\right)^2}\right)}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import kgenp_bound >>> t = np.random.random(10) >>> p = np.random.random(10) >>> kgenp_bound(t, p)
- SeqMetrics.kl_sym(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float | None[source]
Symmetric Kullback-Leibler divergence
\[\text{KL}_{\text{sym}}(P || Q) = \frac{1}{2} \sum_{i=1}^{n} \left( P_i - Q_i \right) \left( \log_2 \frac{P_i}{Q_i} \right)\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import kl_sym >>> t = np.random.random(10) >>> p = np.random.random(10) >>> kl_sym(t, p)
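A minimal plain-Python sketch of the equation above shows why the measure is called symmetric. It assumes strictly positive entries in both inputs, since the logarithm of the ratio is undefined otherwise; how the library treats other inputs is not covered here. The name `kl_sym_sketch` is hypothetical.

```python
import math

def kl_sym_sketch(p, q):
    """0.5 * sum((P_i - Q_i) * log2(P_i / Q_i)); requires strictly positive entries."""
    return 0.5 * sum((pi - qi) * math.log2(pi / qi) for pi, qi in zip(p, q))

a = [0.5, 0.5]
b = [0.25, 0.75]
forward = kl_sym_sketch(a, b)
backward = kl_sym_sketch(b, a)  # swapping the arguments gives the same value
```

Swapping the arguments flips the sign of both factors in every term, so each product and hence the sum is unchanged.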
- SeqMetrics.lm_index(true, predicted, treat_arrays: bool = True, obs_bar_p=None, **treat_arrays_kws) float[source]
Legate-McCabe Efficiency Index. Less sensitive to outliers in the data. The larger, the better.
\[a_i = |predicted_i - true_i|\]
\[b_i = \begin{cases} |true_i - \text{obs\_bar\_p}| & \text{if obs\_bar\_p is provided} \\ |true_i - \bar{true}| & \text{otherwise} \end{cases}\]
\[\text{LM Index} = 1 - \frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} b_i}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
obs_bar_p (float, optional) – Seasonal or other selected average. If None, the mean of the observed array will be used.
Examples
>>> import numpy as np >>> from SeqMetrics import lm_index >>> t = np.random.random(10) >>> p = np.random.random(10) >>> lm_index(t, p)
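The role of `obs_bar_p` can be illustrated with a short plain-Python sketch of the formula. The helper name `lm_index_sketch` is hypothetical and the code only mirrors the equations above, not the library's internals.

```python
def lm_index_sketch(true, predicted, obs_bar_p=None):
    """1 - sum|predicted - true| / sum|true - baseline|."""
    # Baseline is the mean of true unless a custom average is supplied
    baseline = sum(true) / len(true) if obs_bar_p is None else obs_bar_p
    a = sum(abs(p - t) for t, p in zip(true, predicted))
    b = sum(abs(t - baseline) for t in true)
    return 1.0 - a / b

t = [1.0, 2.0, 3.0, 4.0]
p = [1.5, 2.0, 2.5, 4.0]
default_baseline = lm_index_sketch(t, p)                # baseline = mean(t) = 2.5
custom_baseline = lm_index_sketch(t, p, obs_bar_p=1.0)  # baseline = 1.0
```

Only the denominator changes with the baseline, so the same predictions can score differently depending on which reference average they are compared against.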
- SeqMetrics.maape(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean Arctangent Absolute Percentage Error. Note: the result is NOT multiplied by 100.
\[MAAPE = \frac{1}{n} \sum_{i=1}^{n} \arctan \left( \frac{| \text{true}_i - \text{predicted}_i |}{| \text{true}_i | + \epsilon} \right)\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import maape >>> t = np.random.random(10) >>> p = np.random.random(10) >>> maape(t, p)
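A minimal plain-Python sketch of the equation is shown below. The value of ε is not given in this entry, so the default `eps=1e-10` is purely an assumption, as is the helper name `maape_sketch`.

```python
import math

def maape_sketch(true, predicted, eps=1e-10):
    """Mean of arctan(|true - predicted| / (|true| + eps)), in radians."""
    # eps is a hypothetical small constant guarding against division by zero
    terms = [math.atan(abs(t - p) / (abs(t) + eps)) for t, p in zip(true, predicted)]
    return sum(terms) / len(terms)

# The first true value is zero, which would make a plain percentage error blow up
with_zero = maape_sketch([0.0, 2.0], [1.0, 2.0])
```

Because the arctangent is bounded by π/2, a zero in `true` contributes at most π/2 to the mean instead of an unbounded value.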
- SeqMetrics.mbrae(true, predicted, treat_arrays: bool = True, benchmark: ndarray | None = None, **treat_arrays_kws) float[source]
Mean Bounded Relative Absolute Error
\[MBRAE = \frac{1}{n} \sum_{i=1}^{n} \frac{| \text{true}_i - \text{predicted}_i |}{| \text{true}_i - \text{benchmark}_i |}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
benchmark –
Examples
>>> import numpy as np >>> from SeqMetrics import mbrae >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mbrae(t, p)
- SeqMetrics.max_error(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Maximum absolute error. In Sklearn, there is “absolute” in the equation but not in the name of the metric.
\[\text{Max Error} = \max_{i=1}^n \left| \text{true}_i - \text{predicted}_i \right|\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import max_error >>> t = np.random.random(10) >>> p = np.random.random(10) >>> max_error(t, p)
- SeqMetrics.mb_r(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mielke-Berry R value.
\[R = 1 - \frac{n^2 \cdot \frac{1}{n} \sum_{i=1}^{n} \left| \text{predicted}_i - \text{true}_i \right|}{\sum_{i=1}^{n} \sum_{j=1}^{n} \left| \text{predicted}_j - \text{true}_i \right|}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import mb_r >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mb_r(t, p)
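The double sum in the denominator compares every predicted value with every true value. A plain-Python sketch of the equation, using the hypothetical name `mb_r_sketch` and a straightforward O(n²) loop, could look like this.

```python
def mb_r_sketch(true, predicted):
    """1 - n^2 * MAE / (sum over all i, j of |predicted_j - true_i|)."""
    n = len(true)
    mae = sum(abs(p - t) for t, p in zip(true, predicted)) / n
    # Every predicted value is paired with every true value
    pairwise = sum(abs(pj - ti) for ti in true for pj in predicted)
    return 1.0 - (n * n * mae) / pairwise

perfect = mb_r_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
reversed_order = mb_r_sketch([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

The sketch divides by the pairwise sum, so it is undefined when all true and predicted values coincide in a single number.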
- SeqMetrics.mda(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean Directional Accuracy
\[\text{MDA} = \frac{1}{n-1} \sum_{i=1}^{n-1} \left( \text{sign}( \text{true}_{i+1} - \text{true}_i) == \text{sign}( \text{predicted}_{i+1} - \text{predicted}_i) \right)\]modified after.
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import mda >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mda(t, p)
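The equation counts how often consecutive changes in `true` and `predicted` have the same sign. A minimal plain-Python sketch, with the hypothetical name `mda_sketch`, follows.

```python
def mda_sketch(true, predicted):
    """Fraction of consecutive steps where both series move in the same direction."""
    def sign(x):
        return (x > 0) - (x < 0)  # -1, 0 or 1

    hits = [
        sign(true[i + 1] - true[i]) == sign(predicted[i + 1] - predicted[i])
        for i in range(len(true) - 1)
    ]
    return sum(hits) / len(hits)

# true moves up, down, up; predicted moves up, up, up
score = mda_sketch([1.0, 2.0, 1.5, 3.0], [0.9, 2.5, 2.7, 3.1])
```

Only the direction of each step matters, so a prediction with large errors in magnitude can still reach a score of 1.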
- SeqMetrics.mde(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Median Error.
\[MDE = \text{median}(\text{predicted}_i - \text{true}_i)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import mde >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mde(t, p)
- SeqMetrics.mdape(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Median Absolute Percentage Error. The value is multiplied by 100.
\[\text{MdAPE} = 100 \times \text{Median} \left( \left\{ \frac{|\text{true}_i - \text{predicted}_i|}{|\text{true}_i|} \right\}_{i=1}^n \right)\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import mdape >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mdape(t, p)
- SeqMetrics.mdrae(true, predicted, treat_arrays: bool = True, benchmark: ndarray | None = None, **treat_arrays_kws) float[source]
Median Relative Absolute Error. In Sklearn, there is “absolute” in the equation but not in the name of the metric.
\[MdRAE = \text{median} \left( \left| \frac{true_i - predicted_i}{true_i - benchmark_i} \right| \right)\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
benchmark –
Examples
>>> import numpy as np >>> from SeqMetrics import mdrae >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mdrae(t, p)
- SeqMetrics.me(true, predicted, treat_arrays: bool = True, **treat_arrays_kws)[source]
Mean Error.
\[ME = \frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import me >>> t = np.random.random(10) >>> p = np.random.random(10) >>> me(t, p)
- SeqMetrics.mean_bias_error(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean Bias Error. It represents the overall bias error or systematic error. It shows the average interpolation bias, i.e. average over- or underestimation [1][2]. This indicator expresses the tendency of a model to underestimate (negative value) or overestimate (positive value) global radiation; MBE values closest to zero are desirable. The drawback of this test is that it does not show the correct performance when the model presents overestimated and underestimated values at the same time, since overestimation and underestimation values cancel each other.
\[\text{MBE} = \frac{1}{N} \sum_{i=1}^{N} (true_i - predicted_i)\]
References
- Willmott, C. J., & Matsuura, K. (2006). On the use of dimensioned measures of error to evaluate the performance of spatial interpolators. International Journal of Geographical Information Science, 20(1), 89-102. https://doi.org/10.1080/1365881050028697
- Valipour, M. (2015). Retracted: Comparative Evaluation of Radiation-Based Methods for Estimation of Potential Evapotranspiration. Journal of Hydrologic Engineering, 20(5), 04014068. https://dx.doi.org/10.1061/(ASCE)HE.1943-5584.0001066
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import mean_bias_error >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mean_bias_error(t, p)
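The cancellation drawback mentioned above is easy to reproduce. The plain-Python sketch below follows the MBE equation (true minus predicted) and compares it with the mean absolute error; both helper names are hypothetical.

```python
def mean_bias_error_sketch(true, predicted):
    """Mean of (true - predicted), as in the MBE equation above."""
    return sum(t - p for t, p in zip(true, predicted)) / len(true)

def mean_abs_error_sketch(true, predicted):
    """Mean of |true - predicted|, used here only for comparison."""
    return sum(abs(t - p) for t, p in zip(true, predicted)) / len(true)

t = [10.0, 10.0, 10.0, 10.0]
p = [12.0, 8.0, 13.0, 7.0]
mbe = mean_bias_error_sketch(t, p)  # (-2 + 2 - 3 + 3) / 4
mae = mean_abs_error_sketch(t, p)   # (2 + 2 + 3 + 3) / 4
```

Here the MBE is exactly zero although every single prediction is wrong, which is why it is usually reported together with a magnitude-based metric.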
- SeqMetrics.mean_var(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean variance
\[\text{mean_var} = \text{Var} \left( \log(1 + \text{true}) - \log(1 + \text{predicted}) \right)\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import mean_var >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mean_var(t, p)
- SeqMetrics.mean_poisson_deviance(true, predicted, treat_arrays: bool = True, weights=None, **treat_arrays_kws) float[source]
Mean Poisson deviance
\[\text{MPD} = \frac{1}{n} \sum_{i=1}^{n} 2 \left( \text{true}_i \log \left( \frac{\text{true}_i}{\text{predicted}_i} \right) - (\text{true}_i - \text{predicted}_i) \right)\]References
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_poisson_deviance.html
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
weights –
Examples
>>> import numpy as np >>> from SeqMetrics import mean_poisson_deviance >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mean_poisson_deviance(t, p)
- SeqMetrics.mean_gamma_deviance(true, predicted, treat_arrays: bool = True, weights=None, **treat_arrays_kws) float[source]
Mean Gamma deviance.
\[\text{Mean Gamma Deviance (Weighted)} = \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i \frac{2}{\text{true}_i} \left( \text{predicted}_i - \text{true}_i - \text{true}_i \ln \left( \frac{\text{predicted}_i}{\text{true}_i} \right) \right)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
weights –
Examples
>>> import numpy as np >>> from SeqMetrics import mean_gamma_deviance >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mean_gamma_deviance(t, p)
- SeqMetrics.median_abs_error(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Median absolute error
\[\text{MedAE} = \text{median} \left( \left| \text{true}_i - \text{predicted}_i \right| \right)\]References
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.median_absolute_error.html
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import median_abs_error >>> t = np.random.random(10) >>> p = np.random.random(10) >>> median_abs_error(t, p)
- SeqMetrics.mle(true, predicted, treat_arrays=True, **treat_arrays_kws) float[source]
Mean Log Error.
\[\text{MLE} = \frac{1}{n} \sum_{i=1}^{n} \left( \log(1 + \text{predicted}_i) - \log(1 + \text{true}_i) \right)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import mle >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mle(t, p)
- SeqMetrics.mod_agreement_index(true, predicted, treat_arrays: bool = True, j=1, **treat_arrays_kws) float[source]
Modified index of agreement. j is an integer exponent; when j == 1, this is the same as agreement_index. A higher j means more impact of outliers.
\[MAI = 1 - \frac{\sum_{i=1}^{n} \left| \text{predicted}_i - \text{true}_i \right|^j}{\sum_{i=1}^{n} \left( \left| \text{predicted}_i - \overline{\text{true}} \right| + \left| \text{true}_i - \overline{\text{true}} \right| \right)^j}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
j – integer exponent applied to the error terms, by default 1
Examples
>>> import numpy as np >>> from SeqMetrics import mod_agreement_index >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mod_agreement_index(t, p)
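To show where the exponent `j` enters the equation, here is a minimal plain-Python sketch. The name `mod_agreement_index_sketch` is hypothetical and the code mirrors only the formula above.

```python
def mod_agreement_index_sketch(true, predicted, j=1):
    """1 - sum|p - t|^j / sum(|p - mean(t)| + |t - mean(t)|)^j."""
    mean_true = sum(true) / len(true)
    num = sum(abs(p - t) ** j for t, p in zip(true, predicted))
    den = sum(
        (abs(p - mean_true) + abs(t - mean_true)) ** j
        for t, p in zip(true, predicted)
    )
    return 1.0 - num / den

t = [1.0, 2.0, 3.0, 4.0]
p = [1.0, 2.0, 3.0, 8.0]  # one large error in the last element
index_j1 = mod_agreement_index_sketch(t, p, j=1)  # 1 - 4/12
index_j2 = mod_agreement_index_sketch(t, p, j=2)  # 1 - 16/60
```

Both the numerator and the denominator are raised to the power `j`, so the same data can give noticeably different index values for different exponents.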
- SeqMetrics.mpe(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean Percentage Error. The value is multiplied by 100 to reflect percentage.
\[MPE = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{true_i - predicted_i}{true_i} \right) \times 100\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import mpe >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mpe(t, p)
- SeqMetrics.mrae(true, predicted, treat_arrays: bool = True, benchmark: ndarray | None = None, **treat_arrays_kws)[source]
Mean Relative Absolute Error.
\[MRAE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\text{true}_i - \text{predicted}_i}{\text{benchmark}_i} \right|\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
benchmark –
Examples
>>> import numpy as np >>> from SeqMetrics import mrae >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mrae(t, p)
- SeqMetrics.norm_euclid_distance(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Normalized Euclidean distance.
\[D_{norm} = \sqrt{\sum_{i=1}^{n} \left( \frac{\text{true}_i}{\bar{\text{true}}} - \frac{\text{predicted}_i}{\bar{\text{predicted}}} \right)^2}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import norm_euclid_distance >>> t = np.random.random(10) >>> p = np.random.random(10) >>> norm_euclid_distance(t, p)
- SeqMetrics.nrmse_range(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Range Normalized Root Mean Squared Error. RMSE normalized by the range of the true values. This allows comparison between data sets with different scales. It is more sensitive to outliers.
Reference: Pontius et al., 2008
\[\text{NRMSE} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} (\text{predicted}_i - \text{true}_i)^2}}{\max(\text{true}) - \min(\text{true})}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import nrmse_range >>> t = np.random.random(10) >>> p = np.random.random(10) >>> nrmse_range(t, p)
- SeqMetrics.nrmse_ipercentile(true, predicted, treat_arrays: bool = True, q1=25, q2=75, **treat_arrays_kws) float[source]
RMSE normalized by the inter-percentile range of true. This is the least sensitive to outliers. q1: any integer between 1 and 99. q2: any integer between 2 and 100; it should be greater than q1. Reference: Pontius et al., 2008.
\[\text{NRMSE}_{\text{IP}} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}}{Q_{q2} - Q_{q1}}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
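q1 – lower percentile, any integer between 1 and 99, by default 25
q2 – upper percentile, any integer between 2 and 100 and greater than q1, by default 75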
Examples
>>> import numpy as np >>> from SeqMetrics import nrmse_ipercentile >>> t = np.random.random(10) >>> p = np.random.random(10) >>> nrmse_ipercentile(t, p)
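The normalization by the inter-percentile range can be sketched in plain Python as below. The percentile helper uses linear interpolation between sorted neighbours, which is an assumption of this sketch; the library may compute percentiles differently. Both function names are hypothetical.

```python
import math

def percentile_sketch(values, q):
    """q-th percentile using linear interpolation between sorted neighbours."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def nrmse_ipercentile_sketch(true, predicted, q1=25, q2=75):
    """RMSE divided by (Q_q2 - Q_q1) of the true values."""
    n = len(true)
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(true, predicted)) / n)
    return rmse / (percentile_sketch(true, q2) - percentile_sketch(true, q1))

t = [1.0, 2.0, 3.0, 4.0, 5.0]
p = [2.0, 3.0, 4.0, 5.0, 6.0]
value = nrmse_ipercentile_sketch(t, p)  # RMSE = 1, Q75 - Q25 = 4 - 2
```

Because the denominator ignores values outside the chosen percentiles, a single extreme value in `true` changes it far less than it would change the full range.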
- SeqMetrics.nrmse_mean(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean Normalized RMSE
RMSE normalized by the mean of true values. This allows comparison between datasets with different scales.
\[NRMSE_{mean} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}}{\bar{\text{true}}}\]
Reference: Pontius et al., 2008
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import nrmse_mean >>> t = np.random.random(10) >>> p = np.random.random(10) >>> nrmse_mean(t, p)
- SeqMetrics.norm_ae(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Normalized Absolute Error.
\[norm\_ae = \sqrt{\frac{\sum_{i=1}^{n} (error_i - MAE)^2}{n - 1}}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import norm_ae >>> t = np.random.random(10) >>> p = np.random.random(10) >>> norm_ae(t, p)
- SeqMetrics.norm_ape(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Normalized Absolute Percentage Error
\[\text{norm_APE} = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} \left( \left| \frac{\text{true}_i - \text{predicted}_i}{\text{true}_i} \right| - \frac{1}{n} \sum_{j=1}^{n} \left| \frac{\text{true}_j - \text{predicted}_j}{\text{true}_j} \right| \right)^2 }\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import norm_ape >>> t = np.random.random(10) >>> p = np.random.random(10) >>> norm_ape(t, p)
- SeqMetrics.log_prob(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Logarithmic probability distribution
\[\text{log_prob} = \frac{1}{N} \sum_{i=1}^{N} \left( -\frac{\left( \frac{\text{true}_i - \text{predicted}_i}{\text{scale}} \right)^2}{2} - \log(\sqrt{2\pi}) \right)\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import log_prob >>> t = np.random.random(10) >>> p = np.random.random(10) >>> log_prob(t, p)
- SeqMetrics.log_nse(true, predicted, treat_arrays: bool = True, epsilon: float = 0.0, log_base: str = 'e', **treat_arrays_kws) float[source]
Log-transformed Nash-Sutcliffe Efficiency.
It is especially useful for capturing prediction performance for the lowest flows due to the logarithmic transform.
\[NSE = 1 - \frac{\sum_{i=1}^{N} (\log(e_{i}) - \log(s_{i}))^2}{\sum_{i=1}^{N} (\log(e_{i}) - \log(\bar{e}))^2}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
epsilon – A small value to be added to true and predicted values to avoid log(0)
References
Pushpalatha, R.; Perrin, C.; le Moine, N. and Andréassian V. (2012). “A review of efficiency criteria suitable for evaluating low-flow simulations”. Journal of Hydrology. 420-421, 171-182. doi:10.1016/j.jhydrol.2011.11.055
https://doi.org/10.1029/2012WR012005
Examples
>>> import numpy as np >>> from SeqMetrics import log_nse >>> t = np.random.random(10) >>> p = np.random.random(10) >>> log_nse(t, p)
- SeqMetrics.rmdspe(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Root Median Squared Percentage Error. The value is multiplied by 100 to reflect percentage.
\[\text{RMDSPE} = \sqrt{\text{median}\left(\left(\frac{\text{true}_i - \text{predicted}_i}{\text{true}_i} \times 100\right)^2\right)}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import rmdspe >>> t = np.random.random(10) >>> p = np.random.random(10) >>> rmdspe(t, p)
- SeqMetrics.rse(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Relative Squared Error
\[\text{RSE} = \frac{\sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}{\sum_{i=1}^{n} (\text{true}_i - \bar{\text{true}})^2}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import rse >>> t = np.random.random(10) >>> p = np.random.random(10) >>> rse(t, p)
- SeqMetrics.rrse(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Root Relative Squared Error.
\[RRSE = \sqrt{\frac{\sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}{\sum_{i=1}^{n} (\text{true}_i - \bar{\text{true}})^2}}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import rrse >>> t = np.random.random(10) >>> p = np.random.random(10) >>> rrse(t, p)
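Comparing the RSE and RRSE equations shows that RRSE is the square root of RSE. A minimal plain-Python sketch with hypothetical helper names makes the relationship explicit.

```python
import math

def rse_sketch(true, predicted):
    """Squared error of the prediction relative to that of the mean of true."""
    mean_true = sum(true) / len(true)
    num = sum((t - p) ** 2 for t, p in zip(true, predicted))
    den = sum((t - mean_true) ** 2 for t in true)
    return num / den

def rrse_sketch(true, predicted):
    """Square root of the relative squared error."""
    return math.sqrt(rse_sketch(true, predicted))

t = [1.0, 2.0, 3.0, 4.0, 5.0]
mean_model = [3.0] * 5  # always predicts the mean of t
rse_value = rse_sketch(t, mean_model)
rrse_value = rrse_sketch(t, mean_model)
better = rrse_sketch(t, [1.0, 2.0, 3.0, 4.0, 6.0])
```

A model that always predicts the mean of `true` scores exactly 1 on both metrics, so values below 1 indicate an improvement over that baseline.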
- SeqMetrics.rae(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Relative Absolute Error (aka Approximation Error)
\[\text{RAE} = \frac{\sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|}{\sum_{i=1}^{n} \left| \text{true}_i - \overline{\text{true}} \right|}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import rae >>> t = np.random.random(10) >>> p = np.random.random(10) >>> rae(t, p)
- SeqMetrics.ref_agreement_index(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Refined Index of Agreement. From -1 to 1. Larger the better.
\[a = \sum_{i=1}^{n} \left| \text{predicted}_i - \text{true}_i \right|\]\[b = 2 \sum_{i=1}^{n} \left| \text{true}_i - \overline{\text{true}} \right|\]\[d_{\text{ref}} = \begin{cases} 1 - \frac{a}{b} & \text{if } a \leq b \\ \frac{b}{a} - 1 & \text{if } a > b \end{cases}\]
Reference: Willmott et al., 2012
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import ref_agreement_index >>> t = np.random.random(10) >>> p = np.random.random(10) >>> ref_agreement_index(t, p)
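The piecewise definition of the refined index of agreement is easier to follow as code. The sketch below is a plain-Python reading of the formula; `ref_agreement_index_sketch` is a hypothetical name, and the case where all true values are identical (b = 0) is not handled.

```python
def ref_agreement_index_sketch(true, predicted):
    """Refined index of agreement, following the two-branch formula above."""
    mean_true = sum(true) / len(true)
    # a: total absolute error of the predictions.
    a = sum(abs(p - t) for t, p in zip(true, predicted))
    # b: twice the total absolute deviation of the observations from their mean.
    b = 2.0 * sum(abs(t - mean_true) for t in true)
    if a <= b:
        return 1.0 - a / b
    return b / a - 1.0


true = [1.0, 2.0, 3.0]
print(ref_agreement_index_sketch(true, [2.0, 3.0, 4.0]))     # first branch, a <= b
print(ref_agreement_index_sketch(true, [11.0, 12.0, 13.0]))  # second branch, a > b
```

The second branch is what keeps the value bounded below by -1 when the errors are much larger than the spread of the observations.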
- SeqMetrics.rel_agreement_index(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Relative index of agreement. Ranges from 0 to 1; larger is better.
\[\text{rel_agreement_index} = 1 - \frac{\sum_{i=1}^{n} \left( \frac{\text{predicted}_i - \text{true}_i}{\text{true}_i} \right)^2}{\sum_{i=1}^{n} \left( \frac{|\text{predicted}_i - \bar{\text{true}}| + |\text{true}_i - \bar{\text{true}}|}{\bar{\text{true}}} \right)^2}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import rel_agreement_index >>> t = np.random.random(10) >>> p = np.random.random(10) >>> rel_agreement_index(t, p)
- SeqMetrics.relative_rmse(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Relative Root Mean Squared Error. It normalizes the RMSE by the mean of the true values.
\[RRMSE=\frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N}(e_{i}-s_{i})^2}}{\bar{e}}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import relative_rmse >>> t = np.random.random(10) >>> p = np.random.random(10) >>> relative_rmse(t, p)
- SeqMetrics.rmspe(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Root Mean Square Percentage Error.
\[RMSPE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(PE_i\right)^2} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(\frac{\text{true}_i - \text{predicted}_i}{\text{true}_i}\right)^2}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import rmspe >>> t = np.random.random(10) >>> p = np.random.random(10) >>> rmspe(t, p)
- SeqMetrics.rsr(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
It is RMSE normalized by the standard deviation of true values, following Moriasi et al., 2007.
It incorporates the benefits of error index statistics and includes a scaling/normalization factor, so that the resulting statistic and reported values can apply to various constituents. It ranges from 0 to infinity, with 0-0.5 indicating very good model performance and 0.5-0.8 indicating good model performance.
The standard deviation is calculated using np.std(true, ddof=1) to match the results of this.
\[\text{RSR} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}}{\sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (\text{true}_i - \bar{\text{true}})^2}}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import rsr >>> t = np.random.random(10) >>> p = np.random.random(10) >>> rsr(t, p)
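As a rough illustration of the RSR formula, the following standard-library sketch divides the RMSE by the sample standard deviation (n - 1 denominator) of the true values. The name `rsr_sketch` is hypothetical; this is a reading of the formula, not the library's implementation.

```python
import math
import statistics


def rsr_sketch(true, predicted):
    """RMSE divided by the sample standard deviation of the observations."""
    n = len(true)
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(true, predicted)) / n)
    # statistics.stdev uses the n - 1 denominator, matching the formula above.
    return rmse / statistics.stdev(true)


true = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [2.0, 3.0, 4.0, 5.0, 6.0]
print(rsr_sketch(true, predicted))
```

Because the numerator uses n and the denominator uses n - 1, a prediction that is always equal to the mean of the observations gives a value slightly below 1 rather than exactly 1.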
- SeqMetrics.rmsse(true, predicted, treat_arrays: bool = True, seasonality: int = 1, **treat_arrays_kws) float[source]
Root Mean Squared Scaled Error
\[\text{RMSSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{\left| \text{true}_i - \text{predicted}_i \right|}{\frac{1}{n-s} \sum_{j=s+1}^{n} \left| \text{true}_j - \text{true}_{j-s} \right|} \right)^2}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
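seasonality – the seasonal period s in the formula above, i.e. the lag of the naive forecast used for scaling. With the default of 1, each value is compared with the one immediately before it.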
Examples
>>> import numpy as np >>> from SeqMetrics import rmsse >>> t = np.random.random(10) >>> p = np.random.random(10) >>> rmsse(t, p)
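The role of the seasonality term s in the RMSSE formula can be sketched as follows. `rmsse_sketch` is a hypothetical helper written directly from the formula; it assumes the series is longer than the seasonality and is not constant at that lag.

```python
import math


def rmsse_sketch(true, predicted, seasonality=1):
    """Root mean squared scaled error, following the formula above."""
    n = len(true)
    s = seasonality
    # Mean absolute error of the seasonal naive forecast true[j - s] -> true[j].
    scale = sum(abs(true[j] - true[j - s]) for j in range(s, n)) / (n - s)
    scaled_sq = [(abs(t - p) / scale) ** 2 for t, p in zip(true, predicted)]
    return math.sqrt(sum(scaled_sq) / n)


true = [1.0, 3.0, 5.0, 7.0]
predicted = [2.0, 4.0, 6.0, 8.0]
print(rmsse_sketch(true, predicted))                 # scaled by lag-1 differences
print(rmsse_sketch(true, predicted, seasonality=2))  # scaled by lag-2 differences
```

A value below 1 suggests the predictions have smaller errors than the naive forecast at the chosen lag.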
- SeqMetrics.sa(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Spectral angle. From -pi/2 to pi/2. Closer to 0 is better. It measures angle between two vectors in hyperspace indicating how well the shape of two arrays match instead of their magnitude. Reference: Robila and Gershman, 2005.
\[SA = \arccos \left( \frac{\sum_{i=1}^{n} (\text{true}_i \cdot \text{predicted}_i)}{\sqrt{\sum_{i=1}^{n} (\text{true}_i)^2} \cdot \sqrt{\sum_{i=1}^{n} (\text{predicted}_i)^2}} \right)\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import sa >>> t = np.random.random(10) >>> p = np.random.random(10) >>> sa(t, p)
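The statement that the spectral angle compares shape rather than magnitude can be checked with a short sketch. `spectral_angle_sketch` is a hypothetical name; the clamp before `acos` is an addition of this sketch to guard against rounding error.

```python
import math


def spectral_angle_sketch(true, predicted):
    """Angle between the two arrays viewed as vectors, per the formula above."""
    dot = sum(t * p for t, p in zip(true, predicted))
    norm_t = math.sqrt(sum(t * t for t in true))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_t * norm_p)))
    return math.acos(cos_angle)


true = [1.0, 2.0, 3.0]
print(spectral_angle_sketch(true, [2.0, 4.0, 6.0]))  # same shape, doubled magnitude
print(spectral_angle_sketch([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

Doubling every predicted value leaves the angle at essentially zero, which is the scale invariance the description refers to.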
- SeqMetrics.smape(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Symmetric Mean Absolute Percentage Error. Adapted from this.
\[SMAPE = \frac{100}{n} \sum_{i=1}^{n} \frac{2 \left| \text{predicted}_i - \text{true}_i \right|}{\left| \text{true}_i \right| + \left| \text{predicted}_i \right|}\]
References:
Goodwin and Lawton, 1999: https://doi.org/10.1016/S0169-2070(99)00007-2
Flores et al., 1986: https://doi.org/10.1016/0305-0483(86)90013-7
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import smape >>> t = np.random.random(10) >>> p = np.random.random(10) >>> smape(t, p)
- SeqMetrics.smdape(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Symmetric Median Absolute Percentage Error. Note: the result is NOT multiplied by 100.
\[\text{smdape} = \text{median} \left( \frac{2 \cdot | \text{predicted} - \text{true} |}{| \text{true} | + | \text{predicted} | + \epsilon} \right)\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import smdape >>> t = np.random.random(10) >>> p = np.random.random(10) >>> smdape(t, p)
- SeqMetrics.sid(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Spectral Information Divergence. From -pi/2 to pi/2. Closer to 0 is better.
\[\text{SID} = \left( \frac{\text{t}}{\text{mean(t)}} - \frac{\text{p}}{\text{mean(p)}} \right) \cdot \left( \log_{10}(\text{t}) - \log_{10}(\text{mean(t)}) - \log_{10}(\text{p}) + \log_{10}(\text{mean(p)}) \right)\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import sid >>> t = np.random.random(10) >>> p = np.random.random(10) >>> sid(t, p)
- SeqMetrics.skill_score_murphy(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Adopted from here. Calculates the non-dimensional skill score (SS) between two variables using the definition of Murphy (1988):
\[SS = 1 - \frac{RMSE^2}{SDEV^2}\]where SDEV is the standard deviation of the true values:
\[SDEV^2 = \frac{\sum_{n=1}^{N} (r_n - \bar{r})^2}{N - 1}\]where p is the predicted values, r is the reference values, and N is the total number of values in p & r. Note that p & r must have the same number of values. A positive skill score can be interpreted as the percentage of improvement of the new model forecast in comparison to the reference. On the other hand, a negative skill score denotes that the forecast of interest is worse than the reference forecast. Consequently, a value of zero denotes that both forecasts perform equally [MLAir, 2020].
- Returns:
float
References
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import skill_score_murphy >>> t = np.random.random(10) >>> p = np.random.random(10) >>> skill_score_murphy(t, p)
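A small worked sketch of the Murphy skill score follows. It assumes RMSE^2 is the mean squared error with an N denominator, which the text above does not spell out, and uses the N - 1 denominator for SDEV^2 as stated. `skill_score_sketch` is a hypothetical name.

```python
def skill_score_sketch(true, predicted):
    """SS = 1 - RMSE^2 / SDEV^2, with SDEV^2 the sample variance of `true`."""
    n = len(true)
    mean_true = sum(true) / n
    # Assumption of this sketch: RMSE^2 uses the N denominator.
    rmse_sq = sum((p - t) ** 2 for t, p in zip(true, predicted)) / n
    # SDEV^2 uses the N - 1 denominator, as in the formula above.
    sdev_sq = sum((t - mean_true) ** 2 for t in true) / (n - 1)
    return 1.0 - rmse_sq / sdev_sq


true = [1.0, 2.0, 3.0, 4.0, 5.0]
print(skill_score_sketch(true, true))       # perfect forecast
print(skill_score_sketch(true, [3.0] * 5))  # always predicting the mean
```

Under these assumptions, always predicting the mean yields a small positive score rather than zero, purely because of the differing N and N - 1 denominators.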
- SeqMetrics.std_ratio(true, predicted, treat_arrays: bool = True, std_kwargs: dict | None = None, **treat_arrays_kws) float[source]
Ratio of the standard deviation of the predictions to that of the true values. Also known as the standard ratio, it varies from 0.0 to infinity, with 1.0 being the perfect value.
\[\text{std_ratio} = \frac{\sigma_{\text{predicted}}}{\sigma_{\text{true}}}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import std_ratio >>> t = np.random.random(10) >>> p = np.random.random(10) >>> std_ratio(t, p)
- SeqMetrics.umbrae(true, predicted, treat_arrays: bool = True, benchmark: ndarray | None = None, **treat_arrays_kws)[source]
Unscaled Mean Bounded Relative Absolute Error
\[UMBRAE = \frac{\frac{1}{n} \sum_{i=1}^{n} \frac{|t_i - p_i|}{|t_i - b_i|}}{1 - \frac{1}{n} \sum_{i=1}^{n} \frac{|t_i - p_i|}{|t_i - b_i|}}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
benchmark –
Examples
>>> import numpy as np >>> from SeqMetrics import umbrae >>> t = np.random.random(10) >>> p = np.random.random(10) >>> umbrae(t, p)
- SeqMetrics.ve(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Volumetric efficiency. Ranges from 0 to 1. Larger is better: by the formula below, a perfect prediction gives VE = 1.
\[VE = 1 - \frac{\sum_{i=1}^{n} \left| \text{predicted}_i - \text{true}_i \right|}{\sum_{i=1}^{n} \text{true}_i}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import ve >>> t = np.random.random(10) >>> p = np.random.random(10) >>> ve(t, p)
- SeqMetrics.volume_error(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Returns the Volume Error (Ve). It is an indicator of the agreement between the averages of the simulated and observed runoff (i.e. long-term water balance). Used in the Reynolds paper:
\[\text{volume\_error} = \frac{\sum_{i=1}^{n} \left( \text{predicted}_i - \text{true}_i \right)}{\sum_{i=1}^{n} \text{predicted}_i}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import volume_error >>> t = np.random.random(10) >>> p = np.random.random(10) >>> volume_error(t, p)
- SeqMetrics.wape(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Weighted Absolute Percentage Error. The lower the better.
It is a variation of mape but more suitable for intermittent and low-volume data.
\[\text{WAPE} = \frac{\sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|}{\sum_{i=1}^{n} \text{true}_i}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import wape >>> t = np.random.random(10) >>> p = np.random.random(10) >>> wape(t, p)
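Taken at face value, the WAPE formula above and the volumetric efficiency (VE) formula given earlier are complements of each other. The sketch below, with hypothetical helper names, illustrates that relationship; it assumes the true values sum to a positive number.

```python
def wape_sketch(true, predicted):
    """Sum of absolute errors divided by the sum of the observations."""
    return sum(abs(t - p) for t, p in zip(true, predicted)) / sum(true)


def ve_sketch(true, predicted):
    """Volumetric efficiency as written in its formula: 1 minus the same ratio."""
    return 1.0 - sum(abs(p - t) for t, p in zip(true, predicted)) / sum(true)


true = [2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 5.0, 6.0, 10.0]
print(wape_sketch(true, predicted))
print(ve_sketch(true, predicted))
```

Dividing by the total of the observations, instead of dividing each error by its own observation as MAPE does, is what makes WAPE usable when individual true values are zero or very small.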
- SeqMetrics.watt_m(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Watterson's M.
\[M = \frac{2}{\pi} \cdot \arcsin \left( 1 - \frac{\frac{1}{n} \sum_{i=1}^{n} ( \text{true}_i - \text{predicted}_i )^2}{\sigma_{\text{true}}^2 + \sigma_{\text{predicted}}^2 + (\mu_{\text{predicted}} - \mu_{\text{true}})^2} \right)\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import watt_m >>> t = np.random.random(10) >>> p = np.random.random(10) >>> watt_m(t, p)
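The formula for M can be read as the sketch below. It assumes population variances (denominator n), which the formula does not specify, and clamps the arcsine argument against rounding error; `watt_m_sketch` is a hypothetical name.

```python
import math


def watt_m_sketch(true, predicted):
    """M = (2 / pi) * asin(1 - MSE / (var_true + var_pred + mean_diff^2))."""
    n = len(true)
    mean_t = sum(true) / n
    mean_p = sum(predicted) / n
    mse = sum((t - p) ** 2 for t, p in zip(true, predicted)) / n
    # Assumption of this sketch: variances use the n denominator.
    var_t = sum((t - mean_t) ** 2 for t in true) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    arg = 1.0 - mse / (var_t + var_p + (mean_p - mean_t) ** 2)
    # Clamp to the domain of asin in case rounding pushes it past +/- 1.
    return 2.0 / math.pi * math.asin(max(-1.0, min(1.0, arg)))


true = [1.0, 2.0, 3.0]
print(watt_m_sketch(true, true))             # perfect agreement
print(watt_m_sketch(true, [3.0, 2.0, 1.0]))  # perfectly reversed series
```

With these assumptions the score runs from -1 for a perfectly anti-correlated series to 1 for perfect agreement.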
- SeqMetrics.wmape(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Weighted Mean Absolute Percent Error.
\[\text{WMAPE} = \frac{\sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|}{\sum_{i=1}^{n} \text{true}_i}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import wmape >>> t = np.random.random(10) >>> p = np.random.random(10) >>> wmape(t, p)
- SeqMetrics.spearmann_corr(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Spearman correlation coefficient.
\[r = \frac{\sum_{i=1}^{n} \left( R_{t,i} - \overline{R_t} \right) \left( R_{p,i} - \overline{R_p} \right)}{\sqrt{ \sum_{i=1}^{n} \left( R_{t,i} - \overline{R_t} \right)^2 \sum_{i=1}^{n} \left( R_{p,i} - \overline{R_p} \right)^2 }}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import spearmann_corr >>> t = np.random.random(10) >>> p = np.random.random(10) >>> spearmann_corr(t, p)
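In the formula above, R_t and R_p are the ranks of the true and predicted values, so the coefficient is a Pearson correlation computed on ranks. The sketch below makes that explicit; the tie handling (average ranks) is an assumption of this sketch, and both helper names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_sketch(true, predicted):
    """Pearson correlation of the two rank vectors."""
    rt, rp = average_ranks(true), average_ranks(predicted)
    mt, mp = sum(rt) / len(rt), sum(rp) / len(rp)
    cov = sum((a - mt) * (b - mp) for a, b in zip(rt, rp))
    var_t = sum((a - mt) ** 2 for a in rt)
    var_p = sum((b - mp) ** 2 for b in rp)
    return cov / math.sqrt(var_t * var_p)


print(spearman_sketch([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))  # monotonic, non-linear
```

Any strictly increasing transformation of the predictions leaves the result unchanged, which is the main practical difference from the Pearson coefficient.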
- SeqMetrics.agreement_index(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Agreement Index (d) developed by Willmott, 1981.
It detects additive and proportional differences in the observed and simulated means and variances (Moriasi et al., 2015). It is overly sensitive to extreme values due to the squared differences. It can also be used as a substitute for R2 to identify the degree to which model predictions are error-free. Its value varies between 0 and 1, with 1 being the best.
\[d = 1 - \frac{\sum_{i=1}^{N}(e_{i} - s_{i})^2}{\sum_{i=1}^{N}(\left | s_{i} - \bar{e} \right | + \left | e_{i} - \bar{e} \right |)^2}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import agreement_index >>> t = np.random.random(10) >>> p = np.random.random(10) >>> agreement_index(t, p)
- SeqMetrics.centered_rms_dev(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Modified after SkillMetrics. Calculates the centered root-mean-square (RMS) difference between true and predicted using the formula: (E’)^2 = sum_(n=1)^N [(p_n - mean(p)) - (r_n - mean(r))]^2/N where p is the predicted values, r is the true values, and N is the total number of values in p & r.
\[CRMSD = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( (p_i - \text{mean}(p)) - (r_i - \text{mean}(r)) \right)^2}\]
Output: CRMSDIFF : centered root-mean-square (RMS) difference E’, i.e. the square root of (E’)^2
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import centered_rms_dev >>> t = np.random.random(10) >>> p = np.random.random(10) >>> centered_rms_dev(t, p)
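Because both series are centered before they are compared, a constant bias does not contribute to the centered RMS difference. The following sketch of the CRMSD formula shows this; `centered_rms_dev_sketch` is a hypothetical name.

```python
import math


def centered_rms_dev_sketch(true, predicted):
    """RMS difference between the anomalies (values minus their own mean)."""
    n = len(true)
    mean_t = sum(true) / n
    mean_p = sum(predicted) / n
    sq = [((p - mean_p) - (t - mean_t)) ** 2 for t, p in zip(true, predicted)]
    return math.sqrt(sum(sq) / n)


true = [1.0, 2.0, 3.0]
print(centered_rms_dev_sketch(true, [11.0, 12.0, 13.0]))  # constant offset only
print(centered_rms_dev_sketch(true, [2.0, 4.0, 6.0]))     # different amplitude
```

The first call shows that a pure offset of 10 is invisible to this metric, so it is best read alongside a bias measure.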
- SeqMetrics.mapd(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean absolute percentage deviation
\[MAPD = \frac{\sum_{i=1}^{n} \left| predicted_i - true_i \right|}{\sum_{i=1}^{n} \left| true_i \right|}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import mapd >>> t = np.random.random(10) >>> p = np.random.random(10) >>> mapd(t, p)
- SeqMetrics.sga(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Spectral gradient angle. It varies from -pi/2 to pi/2. Closer to 0 is better.
\[\text{SGA} = \arccos \left( \frac{\sum_{i=1}^{n-1} \left( (true_{i+1} - true_i) \cdot (predicted_{i+1} - predicted_i) \right)}{\sqrt{\sum_{i=1}^{n-1} (true_{i+1} - true_i)^2} \times \sqrt{\sum_{i=1}^{n-1} (predicted_{i+1} - predicted_i)^2}} \right)\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import sga >>> t = np.random.random(10) >>> p = np.random.random(10) >>> sga(t, p)
- SeqMetrics.mse(true, predicted, treat_arrays: bool = True, weights=None, **treat_arrays_kws) float[source]
Mean Squared Error, optionally weighted.
\[MSE = \frac{\sum_{i=1}^{N} w_i (true_i - predicted_i)^2}{\sum_{i=1}^{N} w_i}\]
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
weights – optional per-sample weights w_i, as used in the formula above
Examples
>>> import numpy as np
>>> from SeqMetrics import mse
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> mse(t, p)
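The effect of the weights w_i in the MSE formula can be seen in a short sketch. `weighted_mse_sketch` is a hypothetical helper derived from the formula; treating missing weights as all ones is an assumption of the sketch.

```python
def weighted_mse_sketch(true, predicted, weights=None):
    """Weighted mean of squared errors; equal weights give the ordinary MSE."""
    if weights is None:
        # Assumption of this sketch: no weights means every sample counts once.
        weights = [1.0] * len(true)
    num = sum(w * (t - p) ** 2 for w, t, p in zip(weights, true, predicted))
    return num / sum(weights)


true = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 6.0]
print(weighted_mse_sketch(true, predicted))                   # ordinary MSE
print(weighted_mse_sketch(true, predicted, [1.0, 1.0, 0.0]))  # last sample ignored
print(weighted_mse_sketch(true, predicted, [1.0, 1.0, 2.0]))  # last sample doubled
```

Normalizing by the sum of the weights keeps the result on the same scale as the unweighted MSE regardless of how the weights are scaled.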
- SeqMetrics.variability_ratio(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
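Variability Ratio. As the formula below shows, it compares the coefficient of variation (standard deviation divided by mean) of the predicted values with that of the true values, and so measures the variability of the predicted values relative to the true values. A value of 1 means both have the same relative variability.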
\[VR = 1 - \left| \frac{\frac{\sigma_{\text{predicted}}}{\mu_{\text{predicted}}}}{\frac{\sigma_{\text{true}}}{\mu_{\text{true}}}} - 1 \right|\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated/predicted values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import variability_ratio >>> t = np.random.random(10) >>> p = np.random.random(10) >>> variability_ratio(t, p)
- SeqMetrics.concordance_corr_coef(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Concordance Correlation Coefficient (CCC) taken from this paper.
\[CCC = \frac{2 \rho \sigma_{true} \sigma_{predicted}}{\sigma_{true}^2 + \sigma_{predicted}^2 + (\bar{true} - \bar{predicted})^2}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import concordance_corr_coef >>> t = np.random.random(10) >>> p = np.random.random(10) >>> concordance_corr_coef(t, p)
- SeqMetrics.critical_success_index(true, predicted, treat_arrays: bool = True, threshold=0.5, **treat_arrays_kws) float[source]
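Critical Success Index (CSI), where TP, FN and FP are the numbers of true positives, false negatives and false positives.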
\[CSI = \frac{TP}{TP + FN + FP}\]
- Parameters:
true – True/observed/actual/target values. It should be a binary array (0s and 1s), or a continuous array where values are binarized using a threshold.
predicted – Predicted values, same format as ‘true’.
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
threshold – Threshold for binarizing continuous values (if applicable).
Examples
>>> import numpy as np >>> from SeqMetrics import critical_success_index >>> t = np.array([0.4, 0.1, 0.1, 0.3, 0.7, 0.1]) >>> p = np.array([0.8, 0.11, 0.5, 0.1, 0.1, 0.1]) >>> critical_success_index(t, p)
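The thresholding step and the CSI ratio can be sketched as below. Whether the library treats a value exactly equal to the threshold as positive is not stated above, so the strict `>` comparison here is an assumption, and `csi_sketch` is a hypothetical name.

```python
def csi_sketch(true, predicted, threshold=0.5):
    """TP / (TP + FN + FP) after binarizing both arrays at `threshold`."""
    # Assumption of this sketch: values strictly above the threshold are events.
    t_bin = [t > threshold for t in true]
    p_bin = [p > threshold for p in predicted]
    tp = sum(1 for t, p in zip(t_bin, p_bin) if t and p)
    fn = sum(1 for t, p in zip(t_bin, p_bin) if t and not p)
    fp = sum(1 for t, p in zip(t_bin, p_bin) if p and not t)
    return tp / (tp + fn + fp)


true = [1, 1, 0, 0, 1]
predicted = [0.9, 0.2, 0.7, 0.1, 0.6]
print(csi_sketch(true, predicted))
```

True negatives do not appear in the ratio, so the score is not inflated when events are rare. The sketch raises `ZeroDivisionError` when neither array contains an event.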
- SeqMetrics.kl_divergence(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Kullback-Leibler (KL) divergence.
\[D_{KL}(P \parallel Q) = \sum_{i} P(i) \log \left( \frac{P(i)}{Q(i)} \right)\]
- Parameters:
true – True/observed/actual/target probability distribution. It must be a numpy array, pandas series/DataFrame, or a list.
predicted – Predicted probability distribution, same format as ‘true’.
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import kl_divergence >>> t = np.array([0.1, 0.2, 0.3, 0.2, 0.2]) >>> p = np.array([0.2, 0.2, 0.2, 0.2, 0.2]) >>> divergence = kl_divergence(t, p)
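For the two distributions in the example above, the sum in the KL formula can be evaluated term by term. The sketch assumes P is the true distribution, Q the predicted one, and a natural logarithm; none of these is stated explicitly above. `kl_divergence_sketch` is a hypothetical name.

```python
import math


def kl_divergence_sketch(p, q):
    """Sum of P(i) * log(P(i) / Q(i)); terms with P(i) == 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


t = [0.1, 0.2, 0.3, 0.2, 0.2]
p = [0.2, 0.2, 0.2, 0.2, 0.2]
print(kl_divergence_sketch(t, p))
print(kl_divergence_sketch(p, t))  # swapping the arguments changes the value
```

Only the first and third bins contribute here, because the other bins have identical probabilities. The second call illustrates that the divergence is not symmetric.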
- SeqMetrics.log_cosh_error(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Log-cosh error.
\[\text{Log-Cosh Error} = \frac{1}{n} \sum_{i=1}^{n} \log \left( \cosh(\text{predicted}_i - \text{true}_i) \right)\]
- Parameters:
true – True/observed/actual/target values. It must be a numpy array, pandas series/DataFrame, or a list.
predicted – Predicted values, same format as ‘true’.
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import log_cosh_error >>> t = np.array([1, 2, 3, 4, 5]) >>> p = np.array([1.1, 1.9, 3.1, 4.2, 4.8]) >>> error = log_cosh_error(t, p)
- SeqMetrics.minkowski_distance(true, predicted, order=1, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Minkowski distance. The exponent p in the formula is set by the order parameter.
\[D_{Minkowski} = \left( \sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|^p \right)^{\frac{1}{p}}\]
- Parameters:
true – True/observed/actual/target values. It must be a numpy array, pandas series/DataFrame, or a list.
predicted – Predicted values, same format as ‘true’.
order – The order of the norm of the difference. order=2 is equivalent to the Euclidean distance, order=1 is the Manhattan distance.
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import minkowski_distance >>> t = np.array([1, 2, 3, 4, 5]) >>> p = np.array([1.1, 1.9, 3.1, 4.2, 4.8]) >>> order = 2 # Euclidean distance >>> distance = minkowski_distance(t, p, order)
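The relationship between the order parameter and the Manhattan and Euclidean distances can be verified with a minimal sketch of the formula. `minkowski_sketch` is a hypothetical name.

```python
def minkowski_sketch(true, predicted, order=1):
    """(sum |true_i - predicted_i| ** order) ** (1 / order)."""
    total = sum(abs(t - p) ** order for t, p in zip(true, predicted))
    return total ** (1.0 / order)


true = [0.0, 0.0]
predicted = [3.0, 4.0]
print(minkowski_sketch(true, predicted, order=1))  # Manhattan: 3 + 4
print(minkowski_sketch(true, predicted, order=2))  # Euclidean: the 3-4-5 triangle
```

Higher orders give progressively more weight to the largest single difference.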
- SeqMetrics.tweedie_deviance_score(true, predicted, power=0, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Tweedie deviance score. The formula depends on the power parameter.
power=0 (Normal):
\[D(\text{true}, \text{predicted}) = \frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2\]
power=1 (Poisson):
\[D(\text{true}, \text{predicted}) = 2 \sum_{i=1}^{n} \left( \text{true}_i \log\left(\frac{\text{true}_i + (\text{true}_i = 0)}{\text{predicted}_i}\right) - \text{true}_i + \text{predicted}_i \right)\]
power=2 (Gamma):
\[D(\text{true}, \text{predicted}) = 2 \sum_{i=1}^{n} \left( \frac{\text{true}_i}{\text{predicted}_i} - \log\left(\frac{\text{true}_i}{\text{predicted}_i}\right) - 1 \right)\]
power=3 (Inverse Gaussian):
\[D(\text{true}, \text{predicted}) = 2 \sum_{i=1}^{n} \left( \frac{(\text{true}_i - \text{predicted}_i)^2}{\text{true}_i^2 \text{predicted}_i} \right)\]
- Parameters:
true – True/observed/actual/target values. It must be a numpy array, pandas series/DataFrame, or a list.
predicted – Predicted values, same format as ‘true’.
power – The power determines the underlying target distribution. power=0 for Normal, power=1 for Poisson, power=2 for Gamma, and power=3 for Inverse Gaussian.
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import tweedie_deviance_score >>> t = np.array([1, 2, 3, 4, 5]) >>> p = np.array([1.1, 1.9, 3.1, 4.2, 4.8]) >>> power = 2 # Gamma distribution >>> score = tweedie_deviance_score(t, p, power)
- SeqMetrics.mre(true, predicted, benchmark: ndarray | None = None, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean Relative Error.
\[\text{MRE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\text{true}_i - \text{predicted}_i}{\text{true}_i} \right|\]
- Parameters:
true – True/observed/actual/target values. It must be a numpy array, pandas series/DataFrame, or a list.
predicted – Predicted values, same format as ‘true’.
benchmark –
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import mre >>> t = np.array([1, 2, 3, 4, 5]) >>> p = np.array([1.1, 1.9, 3.1, 4.2, 4.8]) >>> score = mre(t, p)
- SeqMetrics.mape_for_peaks(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Mean Absolute Percentage Error for peaks, which are found using scipy.signal.find_peaks.
\[\text{MAPE}_\text{peak} = \frac{1}{P}\sum_{p=1}^{P} \left |\frac{Q_{s,p} - Q_{o,p}}{Q_{o,p}} \right | \times 100,\]- Parameters:
true – True/observed/actual/target values. It must be a numpy array, pandas series/DataFrame, or a list.
predicted – Predicted values, same format as ‘true’.
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Reference: https://github.com/neuralhydrology/neuralhydrology/blob/master/neuralhydrology/evaluation/metrics.py#L707
Examples
>>> import numpy as np
>>> from SeqMetrics import mape_for_peaks
>>> t = np.array([1, 2, 3, 4, 5])
>>> p = np.array([1.1, 1.9, 3.1, 4.2, 4.8])
>>> score = mape_for_peaks(t, p)
- SeqMetrics.legates_coeff_eff(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Legates Coefficient of Efficiency. Its value varies between 0 and 1. It is not as sensitive to extreme values as agreement_index and the coefficient of determination because it uses the absolute value of the difference instead of the squared difference. See Equation 23 in Dodo et al., 2022.
\[LCE = 1 - \frac{\sum_{i=1}^{n} |true_i - predicted_i|}{\sum_{i=1}^{n} |true_i - \bar{true}|}\]- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np
>>> from SeqMetrics import legates_coeff_eff
>>> t = np.random.random(10)
>>> p = np.random.random(10)
>>> legates_coeff_eff(t, p)
- SeqMetrics.manhattan_distance(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Manhattan distance, also known as cityblock distance or taxicab norm.
See Blanco-Mallo et al., 2023 and Alexei Botchkarev 2019 on the use of distances in performance measures.
\[D_{\text{manhattan}} = \sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|\]- Parameters:
true – True/observed/actual/target values. It must be a numpy array, pandas series/DataFrame, or a list.
predicted – Predicted values, same format as ‘true’.
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import manhattan_distance >>> t = np.random.random(100) >>> p = np.random.random(100) >>> manhattan_distance(t, p)
- SeqMetrics.norm_nse(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Normalized Nash-Sutcliffe Efficiency. It ranges from 0 to 1. A value of 1 indicates perfect fit.
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
- SeqMetrics.coeff_of_persistence(true, predicted, lag: int = 1, treat_arrays: bool = True, **treat_arrays_kws) float[source]
Coefficient of Persistence. Varies between -inf to 1. The higher the better.
- Parameters:
true – True/observed/actual/target values. It must be a numpy array, pandas series/DataFrame, or a list.
predicted – Predicted values, same format as ‘true’.
lag – The lag for the baseline
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
References
- Kitanidis, P. K., & Bras, R. L. (1980). Real-time forecasting with a conceptual hydrologic model: 2. Applications and results. Water Resources Research, 16(6), 1034-1044.
- Nossent, J., & Bauwens, W. (2012, April). Application of a normalized Nash-Sutcliffe efficiency to improve the accuracy of the Sobol' sensitivity analysis of a hydrological model. In EGU General Assembly Conference Abstracts (p. 237).
Examples
>>> import numpy as np
>>> from SeqMetrics import coeff_of_persistence
>>> t = np.random.random(100)
>>> p = np.random.random(100)
>>> coeff_of_persistence(t, p)
- SeqMetrics.calculate_hydro_metrics(true, predicted, treat_arrays: bool = True, **treat_arrays_kws) dict[source]
- Calculates the following performance metrics related to hydrology.
fdc_flv
fdc_fhv
kge
kge_np
kge_mod
kge_bound
kgeprime_bound
kgenp_bound
nse
nse_alpha
nse_beta
nse_mod
nse_bound
r2
mape
nrmse
corr_coeff
rmse
mae
mse
mpe
mase
r2_score
- Returns:
Dictionary with all metrics
- Return type:
dict
- Parameters:
true – true/observed/actual/target values. It must be a numpy array, or pandas series/DataFrame or a list.
predicted – simulated values
treat_arrays – process the true and predicted arrays using maybe_treat_arrays function
Examples
>>> import numpy as np >>> from SeqMetrics import calculate_hydro_metrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> calculate_hydro_metrics(t, p)
Class-Based API
- class SeqMetrics.RegressionMetrics(*args, **kwargs)[source]
Bases: Metrics
Calculates more than 100 regression performance metrics related to sequence data.
Example
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> errors = RegressionMetrics(t,p) >>> all_errors = errors.calculate_all()
- __init__(*args, **kwargs)[source]
Initializes Metrics. args and kwargs go to parent class SeqMetrics.Metrics.
- JS() float[source]
Jensen-Shannon divergence
\[JS(P \parallel Q) = \frac{1}{2} \sum_{i} \left( P(i) \log_2 \left( \frac{2P(i)}{P(i) + Q(i)} \right) + Q(i) \log_2 \left( \frac{2Q(i)}{P(i) + Q(i)} \right) \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.JS()
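The Jensen-Shannon formula above can be evaluated directly, as in the sketch below. It assumes both inputs are already probability distributions and treats terms with zero probability as contributing zero; whether the library normalizes its inputs first is not stated here. `js_divergence_sketch` is a hypothetical name.

```python
import math


def js_divergence_sketch(p, q):
    """Base-2 Jensen-Shannon divergence between two discrete distributions."""
    total = 0.0
    for pi, qi in zip(p, q):
        m = pi + qi
        # By convention 0 * log(0) is taken as 0, so zero entries are skipped.
        if pi > 0:
            total += pi * math.log2(2.0 * pi / m)
        if qi > 0:
            total += qi * math.log2(2.0 * qi / m)
    return 0.5 * total


print(js_divergence_sketch([0.5, 0.5], [0.5, 0.5]))  # identical distributions
print(js_divergence_sketch([1.0, 0.0], [0.0, 1.0]))  # no overlap at all
```

Unlike the KL divergence, the result is symmetric in its arguments and, with the base-2 logarithm, stays between 0 and 1.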
- acc() float[source]
Anomaly correlation coefficient. See Langland et al., 2012; Miyakoda et al., 1972 and Murphy et al., 1989.
\[ACC = \frac{\sum_{i=1}^{N} \left( (\text{predicted}_i - \overline{\text{predicted}})(\text{true}_i - \overline{\text{true}}) \right)}{(N-1) \cdot \sigma_{\text{true}} \cdot \sigma_{\text{predicted}}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.acc()
- adjusted_r2() float[source]
Adjusted R squared.
\[\text{Adjusted } R^2 = 1 - \left( \frac{(1 - R^2) \cdot (n - 1)}{n - k - 1} \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.adjusted_r2()
- agreement_index() float[source]
Agreement Index (d) developed by Willmott, 1981.
It detects additive and proportional differences in the observed and simulated means and variances (Moriasi et al., 2015). It is overly sensitive to extreme values due to the squared differences. It can also be used as a substitute for R2 to identify the degree to which model predictions are error-free.
\[d = 1 - \frac{\sum_{i=1}^{N}(e_{i} - s_{i})^2}{\sum_{i=1}^{N}(\left | s_{i} - \bar{e} \right | + \left | e_{i} - \bar{e} \right |)^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.agreement_index()
- aic(p=1) float[source]
Akaike Information Criterion. Modified from this source.
\[AIC = n \cdot \ln\left(\frac{\sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}{n}\right) + 2p\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.aic( )
- aitchison(center='mean') float[source]
Aitchison distance. Used in Zhang et al., 2020.
\[d_{\text{Aitchison}} = \sqrt{\sum_{i=1}^{n} \left( \log(\text{true}_i) - \text{center}(\log(\text{true})) - \left(\log(\text{predicted}_i) - \text{center}(\log(\text{predicted}))\right) \right)^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.aitchison( )
- amemiya_adj_r2() float[source]
Amemiya’s Adjusted R-squared
\[R^2_{\text{adj, Amemiya}} = 1 - \left( \frac{(1 - R^2) \cdot (n + k)}{n - k - 1} \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.amemiya_adj_r2( )
- amemiya_pred_criterion() float[source]
Amemiya’s Prediction Criterion
\[\text{APC} = \left( \frac{n + k}{n - k} \right) \left( \frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2 \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.amemiya_pred_criterion()
- bias() float[source]
Bias as given by Gupta et al., 1998.
\[Bias=\frac{1}{N}\sum_{i=1}^{N}(e_{i}-s_{i})\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.bias()
- bic(p=1) float[source]
Bayesian Information Criterion
Minimising the BIC is intended to give the best model. The model chosen by the BIC is either the same as that chosen by the AIC, or one with fewer terms. This is because the BIC penalises the number of parameters more heavily than the AIC. Modified after RegscorePy.
\[BIC = n \cdot \ln\left(\frac{\text{SSE}}{n}\right) + p \cdot \ln(n)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.bic()
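To make the difference between the AIC and BIC penalties concrete, the following pure-Python sketch evaluates both formulas on the same data. The function names and the sample data are illustrative assumptions, not the library's code. Both criteria share the term n·ln(SSE/n), so they differ only by p·(ln(n) − 2).

```python
import math

def sse(true, predicted):
    """Sum of squared errors."""
    return sum((t - y) ** 2 for t, y in zip(true, predicted))

def aic(true, predicted, p=1):
    n = len(true)
    return n * math.log(sse(true, predicted) / n) + 2 * p

def bic(true, predicted, p=1):
    n = len(true)
    return n * math.log(sse(true, predicted) / n) + p * math.log(n)

# Hypothetical data: 10 observations with a small alternating error.
true = [float(i) for i in range(1, 11)]
predicted = [t + 0.1 * (-1) ** i for i, t in enumerate(true)]

penalty_gap = bic(true, predicted) - aic(true, predicted)
print(penalty_gap)  # equals p * (ln(n) - 2), about 0.30 for n = 10 and p = 1
```

Because ln(n) exceeds 2 only when n is at least 8, the BIC penalty per parameter is the heavier one for samples of that size or larger.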
- brier_score() float[source]
Adopted from SkillMetrics. Calculates the Brier score (BS), a measure of the mean-square error of probability forecasts for a dichotomous (two-category) event, such as the occurrence/non-occurrence of precipitation. The score is calculated using the formula:
\[BS = \frac{1}{N} \sum_{n=1}^{N} (f_n - o_n)^2\]where f is the forecast probabilities, o is the observed probabilities (0 or 1), and N is the total number of values in f and o. Note that f and o must have the same number of values, and those values must be in the range [0,1].
- Returns:
BS : Brier score
- Return type:
float
References
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.brier_score()
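A small pure-Python sketch of the Brier score as described above, including the two input checks mentioned in the text (equal lengths, values in [0, 1]). The function name and the error messages are assumptions made for illustration only.

```python
def brier_score(forecast, observed):
    """Mean squared difference between forecast probabilities and outcomes."""
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must have the same length")
    for value in list(forecast) + list(observed):
        if not 0 <= value <= 1:
            raise ValueError("all values must lie in the range [0, 1]")
    n = len(forecast)
    return sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n

print(brier_score([1, 0], [1, 0]))      # 0.0, perfect forecast
print(brier_score([0.5, 0.5], [1, 0]))  # 0.25, uninformative forecast
print(brier_score([0, 1], [1, 0]))      # 1.0, worst possible forecast
```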
- calculate_hydro_metrics()[source]
- Calculates the following performance metrics related to hydrology.
fdc_flv
fdc_fhv
kge
kge_np
kge_mod
kge_bound
kgeprime_bound
kgenp_bound
nse
nse_alpha
nse_beta
nse_mod
nse_bound
r2
mape
nrmse
corr_coeff
rmse
mae
mse
mpe
mase
r2_score
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.calculate_hydro_metrics()
- centered_rms_dev() float[source]
Modified after SkillMetrics. Calculates the centered root-mean-square (RMS) difference between true and predicted using the formula (E’)^2 = sum_(n=1)^N [(p_n - mean(p)) - (r_n - mean(r))]^2/N, where p is the predicted values, r is the true values, and N is the total number of values in p and r.
\[CRMSD = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( (p_i - \text{mean}(p)) - (r_i - \text{mean}(r)) \right)^2}\]Output: CRMSDIFF : centered root-mean-square (RMS) difference E’
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.centered_rms_dev()
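The effect of centering can be seen in a short pure-Python sketch of the CRMSD formula. The function below is an illustrative assumption, not the library's code: because each series has its own mean subtracted first, a constant offset between true and predicted does not contribute to the result.

```python
import math

def centered_rms_dev(true, predicted):
    """RMS of the difference between the two series' anomalies."""
    n = len(true)
    mean_t = sum(true) / n
    mean_p = sum(predicted) / n
    squares = [((p - mean_p) - (t - mean_t)) ** 2
               for t, p in zip(true, predicted)]
    return math.sqrt(sum(squares) / n)

true = [1.0, 2.0, 3.0, 4.0]
shifted = [t + 5.0 for t in true]  # constant bias only
scaled = [2.0 * t for t in true]   # amplitude error

print(centered_rms_dev(true, shifted))  # 0.0, the offset is removed by centering
print(centered_rms_dev(true, scaled))   # positive, the amplitude error remains
```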
- concordance_corr_coef() float[source]
Concordance Correlation Coefficient (CCC) taken from this paper.
\[CCC = \frac{2 \rho \sigma_{true} \sigma_{predicted}}{\sigma_{true}^2 + \sigma_{predicted}^2 + (\bar{true} - \bar{predicted})^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.concordance_corr_coef()
- corr_coeff() float[source]
Pearson correlation coefficient. It measures linear correlation between true and predicted arrays. It is sensitive to outliers.
\[r = \frac{\sum ^n _{i=1}(e_i - \bar{e})(s_i - \bar{s})}{\sqrt{\sum ^n _{i=1}(e_i - \bar{e})^2} \sqrt{\sum ^n _{i=1}(s_i - \bar{s})^2}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.corr_coeff()
- cosine_similarity() float[source]
It is a judgment of orientation and not magnitude: two vectors with the same orientation have a cosine similarity of 1, two vectors oriented at 90° relative to each other have a similarity of 0, and two vectors diametrically opposed have a similarity of -1, independent of their magnitude.
\[\text{Cosine Similarity} = \frac{\sum_{i=1}^{n} \text{true}_i \cdot \text{predicted}_i}{\sqrt{\sum_{i=1}^{n} (\text{true}_i)^2} \cdot \sqrt{\sum_{i=1}^{n} (\text{predicted}_i)^2}}\]References
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_similarity.html
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.cosine_similarity()
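The three reference cases described above (same orientation, orthogonal, opposite) can be reproduced with a minimal pure-Python sketch of the formula. The function is written here for illustration and is not taken from the library.

```python
import math

def cosine_similarity(true, predicted):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(t * p for t, p in zip(true, predicted))
    norm_t = math.sqrt(sum(t * t for t in true))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    return dot / (norm_t * norm_p)

same = cosine_similarity([1, 2], [10, 20])      # same orientation, other magnitude
orthogonal = cosine_similarity([1, 0], [0, 3])  # 90 degrees apart
opposite = cosine_similarity([1, 2], [-1, -2])  # diametrically opposed
print(same, orthogonal, opposite)               # close to 1, 0 and -1
```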
- covariance() float[source]
Covariance
\[\text{Covariance} = \frac{1}{N} \sum_{i=1}^{N} (e_{i} - \bar{e})(s_{i} - \bar{s})\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.covariance()
- critical_success_index(threshold=0.5) float[source]
Critical Success Index (CSI).
\[CSI = \frac{TP}{TP + FN + FP}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.array([0, 1, 1, 0, 0, 1]) >>> p = np.array([0, 1, 0, 1, 1, 1]) >>> metrics= RegressionMetrics(t, p) >>> metrics.critical_success_index()
- cronbach_alpha() float[source]
It is a measure of internal consistency of data. See ucla and stackoverflow pages for more info.
\[\alpha = \frac{N}{N - 1} \left(1 - \frac{\sum_{i=1}^{N} \sigma^2_{i}}{\sigma^2_{\text{total}}}\right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.cronbach_alpha()
- decomposed_mse() float[source]
Decomposed MSE developed by Kobayashi and Salam (2000)
\[dMSE = (\frac{1}{N}\sum_{i=1}^{N}(e_{i}-s_{i}))^2 + SDSD + LCS\]\[SDSD = (\sigma(e) - \sigma(s))^2\]\[LCS = 2 \sigma(e) \sigma(s) * (1 - \frac{\sum ^n _{i=1}(e_i - \bar{e})(s_i - \bar{s})} {\sqrt{\sum ^n _{i=1}(e_i - \bar{e})^2} \sqrt{\sum ^n _{i=1}(s_i - \bar{s})^2}})\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.decomposed_mse()
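The three terms of the decomposition add up to the ordinary mean squared error when population (divide-by-N) standard deviations are used. The sketch below checks this numerically. It assumes that e denotes the true values and s the predicted values, and the helper names are hypothetical.

```python
import math

def decomposed_mse(e, s):
    """Return (squared bias, SDSD, LCS) using population statistics."""
    n = len(e)
    mean_e, mean_s = sum(e) / n, sum(s) / n
    sd_e = math.sqrt(sum((x - mean_e) ** 2 for x in e) / n)
    sd_s = math.sqrt(sum((y - mean_s) ** 2 for y in s) / n)
    cov = sum((x - mean_e) * (y - mean_s) for x, y in zip(e, s)) / n
    r = cov / (sd_e * sd_s)            # Pearson correlation
    squared_bias = (mean_e - mean_s) ** 2
    sdsd = (sd_e - sd_s) ** 2          # difference in spread
    lcs = 2 * sd_e * sd_s * (1 - r)    # lack of correlation
    return squared_bias, sdsd, lcs

e = [1.0, 2.0, 3.0, 4.0, 5.0]
s = [1.1, 1.9, 3.1, 4.2, 4.8]
parts = decomposed_mse(e, s)
mse = sum((x - y) ** 2 for x, y in zip(e, s)) / len(e)
print(parts, sum(parts), mse)  # the sum of the parts matches the MSE
```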
- euclid_distance() float[source]
Euclidean distance taken from this book (https://doi.org/10.1016/B978-0-12-088735-4.50006-7).
\[D = \sqrt{\sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}\]References: Kennard et al., 2010
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.euclid_distance()
- exp_var_score(weights=None) float | None[source]
Explained variance score. Best value is 1; lower values are less accurate.
\[\text{EVS} = 1 - \frac{\sum_{i=1}^{n} w_i \left( (true_i - predicted_i) - \frac{\sum_{j=1}^{n} w_j (true_j - predicted_j)}{\sum_{j=1}^{n} w_j} \right)^2}{\sum_{i=1}^{n} w_i (true_i - \frac{\sum_{j=1}^{n} w_j true_j}{\sum_{j=1}^{n} w_j})^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.exp_var_score()
- expanded_uncertainty(cov_fact=1.96) float[source]
By default, it calculates uncertainty with a 95% confidence interval; 1.96 is the coverage factor corresponding to the 95% confidence level. This indicator is used to show more information about the model deviation. It uses the formula from Behar et al., 2015 and Gueymard et al., 2014.
\[U = \text{cov_fact} \times \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( \left(\text{true}_i - \text{predicted}_i\right) - \overline{\left(\text{true} - \text{predicted}\right)} \right)^2 + \frac{1}{n} \sum_{i=1}^{n} \left(\text{true}_i - \text{predicted}_i\right)^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.expanded_uncertainty()
- fdc_fhv(h: float = 0.02) float[source]
Peak flow bias of the flow duration curve (Yilmaz, 2008). Modified from the Kratzert (2018) code; used in Kratzert et al., 2019.
\[FHV = \frac{\sum_{i=1}^{k} (predicted_i - true_i)}{\sum_{i=1}^{k} true_i} \times 100\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.fdc_fhv()
- fdc_flv(low_flow: float = 0.3) float[source]
Bias of the bottom 30% low flows. Modified from the Kratzert code; used in Kratzert et al., 2019.
\[\text{FLV} = -1 \times \frac{\sum (\log(\text{predicted}) - \min(\log(\text{predicted}))) - \sum (\log(\text{true}) - \min(\log(\text{true})))}{\sum (\log(\text{true}) - \min(\log(\text{true}))) + 1 \times 10^{-6}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.fdc_flv()
- gmae() float[source]
Geometric Mean Absolute Error.
\[GMAE = \left( \prod_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right| \right)^{\frac{1}{n}}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.gmae()
- gmean_diff() float[source]
Geometric mean difference. The geometric mean is first calculated for each of the two samples, and then their difference is taken.
\[\text{gmean_diff} = \left( \prod_{i=1}^{n} \text{true}_i \right)^{\frac{1}{n}} - \left( \prod_{i=1}^{n} \text{predicted}_i \right)^{\frac{1}{n}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.gmean_diff()
- gmrae(benchmark: ndarray | None = None) float[source]
Geometric Mean Relative Absolute Error
\[GMRAE = \left( \prod_{i=1}^{n} \frac{|true_i - predicted_i|}{|true_i - benchmark_i|} \right)^{\frac{1}{n}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.gmrae()
- inrse() float[source]
Integral Normalized Root Squared Error
\[IN\text{-}RSE = \sqrt{\frac{\sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}{\sum_{i=1}^{n} (\text{true}_i - \overline{\text{true}})^2}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.inrse()
- irmse() float[source]
Inertial RMSE. RMSE divided by standard deviation of the gradient of true.
\[\text{IRMSE} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \text{true}_i - \text{predicted}_i \right)^2}}{\sqrt{\frac{1}{n-2} \sum_{i=1}^{n-1} \left( (\text{true}_{i+1} - \text{true}_i) - \overline{(\text{true}_{i+1} - \text{true}_i)} \right)^2}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.irmse()
- kendall_tau(return_p=False) float | tuple[source]
Kendall’s tau. Used in Probst et al., 2019.
\[\tau = \frac{(C - D)}{\sqrt{(C + D + T_{\text{true}})(C + D + T_{\text{predicted}})}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.kendall_tau()
- kge()[source]
Kling-Gupta Efficiency following Gupta et al. 2009. This error considers correlation, variability and mean difference/error.
\[\text{KGE} = 1 - \sqrt{(r - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2}\]\[\alpha = \frac{\sigma_{\text{predicted}}}{\sigma_{\text{true}}}\]\[\beta = \frac{\mu_{\text{predicted}}}{\mu_{\text{true}}}\]In this equation, alpha accounts for the variability (standard deviation), beta accounts for the mean difference and r accounts for the correlation between the true and predicted values, where r is the Pearson correlation coefficient:
\[r = \frac{\sum_{i=1}^{N} ( \text{true}_i - \bar{\text{true}} ) ( \text{predicted}_i - \bar{\text{predicted}} )}{\sqrt{\sum_{i=1}^{N} ( \text{true}_i - \bar{\text{true}} )^2} \sqrt{\sum_{i=1}^{N} ( \text{predicted}_i - \bar{\text{predicted}} )^2}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.kge()
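The role of the three KGE components (correlation, variability ratio, mean ratio) can be illustrated with a pure-Python sketch. The helper names are hypothetical and population standard deviations are assumed. This is a reading of the formula, not the library's implementation.

```python
import math

def kge_components(true, predicted):
    """Return (r, alpha, beta) as defined for the Kling-Gupta Efficiency."""
    n = len(true)
    mu_t, mu_p = sum(true) / n, sum(predicted) / n
    sd_t = math.sqrt(sum((t - mu_t) ** 2 for t in true) / n)
    sd_p = math.sqrt(sum((p - mu_p) ** 2 for p in predicted) / n)
    cov = sum((t - mu_t) * (p - mu_p) for t, p in zip(true, predicted)) / n
    r = cov / (sd_t * sd_p)  # correlation
    alpha = sd_p / sd_t      # variability ratio
    beta = mu_p / mu_t       # mean (bias) ratio
    return r, alpha, beta

def kge(true, predicted):
    r, alpha, beta = kge_components(true, predicted)
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

true = [1.0, 2.0, 3.0, 4.0]
doubled = [2.0 * t for t in true]  # perfectly correlated, but twice as large

print(kge_components(true, doubled))  # r close to 1, alpha = 2, beta = 2
print(kge(true, doubled))             # 1 - sqrt(2), about -0.414
```

The doubled series shows why KGE is stricter than correlation alone: r is essentially 1, yet the variability and mean terms pull the score below zero.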
- kge_bound() float[source]
Bounded Version of the Original Kling-Gupta Efficiency after Mathevet et al. 2006.
\[\text{KGE}_{\text{bound}} = \frac{\text{KGE}}{2 - \text{KGE}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.kge_bound()
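The bounding transform KGE / (2 − KGE) maps the unbounded range of KGE, (−∞, 1], onto (−1, 1]. A one-function sketch (the function name and its scalar argument are assumptions for illustration) shows the behaviour at a few values:

```python
def kge_bound(kge_value):
    """Bounded version of a KGE value: KGE / (2 - KGE)."""
    return kge_value / (2 - kge_value)

for value in (1.0, 0.0, -2.0, -100.0):
    print(value, kge_bound(value))
# 1.0 stays at 1.0, 0.0 stays at 0.0, and very negative scores approach -1.
```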
- kge_mod()[source]
Modified Kling-Gupta Efficiency after Kling et al. 2012.
\[\text{KGE}_{\text{mod}} = 1 - \sqrt{ \left( \frac{\sum_{i=1}^{n} (true_i - \bar{true})(predicted_i - \bar{predicted})}{\sqrt{\sum_{i=1}^{n} (true_i - \bar{true})^2} \sqrt{\sum_{i=1}^{n} (predicted_i - \bar{predicted})^2}} - 1 \right)^2 + \left( \frac{\frac{\sigma_{predicted}}{\bar{predicted}}}{\frac{\sigma_{true}}{\bar{true}}} - 1 \right)^2 + \left( \frac{\bar{predicted}}{\bar{true}} - 1 \right)^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.kge_mod()
- kge_np()[source]
Non-parametric Kling-Gupta Efficiency after Pool et al. 2018.
\[cc = \rho(\text{true}, \text{predicted})\]\[\alpha = 1 - 0.5 \sum_{i=1}^{n} \left| \frac{\text{sorted(predicted}_i\text{)}}{\text{mean(predicted)} \cdot n} - \frac{\text{sorted(true}_i\text{)}}{\text{mean(true)} \cdot n} \right|\]\[\beta = \frac{\text{mean(predicted)}}{\text{mean(true)}}\]\[\text{KGE}_{\text{np}} = 1 - \sqrt{(cc - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.kge_np()
- kgenp_bound()[source]
Bounded Version of the Non-Parametric Kling-Gupta Efficiency
\[KGE_{np_{bound}} = \frac{1 - \sqrt{\left(\rho(t, p) - 1\right)^2 + \left(1 - 0.5 \sum_{i=1}^{n} \left| \frac{\text{sorted}(p_i)}{\text{mean}(p) \cdot n} - \frac{\text{sorted}(t_i)}{\text{mean}(t) \cdot n} \right| - 1\right)^2 + \left(\frac{\text{mean}(p)}{\text{mean}(t)} - 1\right)^2}}{2 - \left(1 - \sqrt{\left(\rho(t, p) - 1\right)^2 + \left(1 - 0.5 \sum_{i=1}^{n} \left| \frac{\text{sorted}(p_i)}{\text{mean}(p) \cdot n} - \frac{\text{sorted}(t_i)}{\text{mean}(t) \cdot n} \right| - 1\right)^2 + \left(\frac{\text{mean}(p)}{\text{mean}(t)} - 1\right)^2}\right)}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.kgenp_bound()
- kgeprime_bound() float[source]
Bounded Version of the Modified Kling-Gupta Efficiency
\[KGE'_{\text{bounded}} = \frac{1 - \sqrt{(r - 1)^2 + (\gamma - 1)^2 + (\beta - 1)^2}}{2 - (1 - \sqrt{(r - 1)^2 + (\gamma - 1)^2 + (\beta - 1)^2})}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.kgeprime_bound()
- kl_divergence() float[source]
- \[D_{KL}(P \parallel Q) = \sum_{x\in\mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}\]
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.array([0.1, 0.2, 0.3, 0.2, 0.2]) >>> p = np.array([0.2, 0.2, 0.2, 0.2, 0.2]) >>> metrics= RegressionMetrics(t, p) >>> divergence = metrics.kl_divergence()
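The Kullback-Leibler divergence, the sum over x of P(x)·log(P(x)/Q(x)), is not symmetric in P and Q. The pure-Python sketch below uses the same two arrays as the example above to show this. The natural logarithm and the function name are assumptions, since the logarithm base is not stated here.

```python
import math

def kl_divergence(p, q):
    """Sum of P(x) * ln(P(x) / Q(x)); assumes q is strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

t = [0.1, 0.2, 0.3, 0.2, 0.2]
p = [0.2, 0.2, 0.2, 0.2, 0.2]

forward = kl_divergence(t, p)
reverse = kl_divergence(p, t)
print(forward, reverse)     # two different positive values
print(kl_divergence(p, p))  # 0.0 for identical distributions
```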
- kl_sym() float | None[source]
Symmetric Kullback-Leibler divergence
\[\text{KL}_{\text{sym}}(P || Q) = \frac{1}{2} \sum_{i=1}^{n} \left( P_i - Q_i \right) \left( \log_2 \frac{P_i}{Q_i} \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.kl_sym()
- legates_coeff_eff(power=0) float[source]
Legates Coefficient of Efficiency. Its value varies between 0 and 1. It is not as sensitive to extreme values as agreement_index and the coefficient of determination because it uses the absolute value of the difference instead of the squared difference. See Equation 23 in Dodo et al., 2022.
\[LCE = 1 - \frac{\sum_{i=1}^{n} |true_i - predicted_i|}{\sum_{i=1}^{n} |true_i - \bar{true}|}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.array([1, 2, 3, 4, 5]) >>> p = np.array([1.1, 1.9, 3.1, 4.2, 4.8]) >>> metrics= RegressionMetrics(t, p) >>> score = metrics.legates_coeff_eff()
- lm_index(obs_bar_p=None) float[source]
Legates-McCabe Efficiency Index. Less sensitive to outliers in the data. The larger, the better.
\[a_i = |predicted_i - true_i|\]\[b_i = \begin{cases} |true_i - \text{obs\_bar\_p}| & \text{if obs\_bar\_p is provided} \\ |true_i - \bar{true}| & \text{otherwise} \end{cases}\]\[\text{LM Index} = 1 - \frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} b_i}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.lm_index()
- log_cosh_error() float[source]
Log-cosh error.
\[\text{Log-Cosh Error} = \frac{1}{n} \sum_{i=1}^{n} \log \left( \cosh(\text{predicted}_i - \text{true}_i) \right)\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.array([1, 2, 3, 4, 5]) >>> p = np.array([1.1, 1.9, 3.1, 4.2, 4.8]) >>> metrics= RegressionMetrics(t, p) >>> error = metrics.log_cosh_error()
- log_nse(epsilon: float = 0.0, log_base: str = 'e') float[source]
Log-transformed Nash-Sutcliffe Efficiency. It is especially useful for capturing prediction performance for the lowest flows due to the logarithmic transform.
\[\text{NSE}_{\log} = 1-\frac{\sum_{i=1}^{N}(\log(e_{i})-\log(s_{i}))^2}{\sum_{i=1}^{N}(\log(e_{i})-\log(\bar{e}))^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.log_nse()
- log_prob() float[source]
Logarithmic probability distribution
\[\text{log_prob} = \frac{1}{N} \sum_{i=1}^{N} \left( -\frac{\left( \frac{\text{true}_i - \text{predicted}_i}{\text{scale}} \right)^2}{2} - \log(\sqrt{2\pi}) \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.log_prob()
- maape() float[source]
Mean Arctangent Absolute Percentage Error. Note: the result is NOT multiplied by 100.
\[MAAPE = \frac{1}{n} \sum_{i=1}^{n} \arctan \left( \frac{| \text{true}_i - \text{predicted}_i |}{| \text{true}_i | + \epsilon} \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.maape()
- mae() float[source]
Mean Absolute Error. It is less sensitive to outliers as compared to mse/rmse.
\[\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mae()
- manhattan_distance() float[source]
Manhattan distance, also known as cityblock distance or taxicab norm.
\[D_{\text{manhattan}} = \sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|\]See Blanco-Mallo et al., 2023, Cha et al., 2007 and Alexei Botchkarev 2019 on the use of distances in performance measures.
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.array([1, 2, 3, 4, 5]) >>> p = np.array([1.1, 1.9, 3.1, 4.2, 4.8]) >>> metrics= RegressionMetrics(t, p) >>> score = metrics.manhattan_distance()
- mapd() float[source]
Mean absolute percentage deviation
\[MAPD = \frac{\sum_{i=1}^{n} \left| predicted_i - true_i \right|}{\sum_{i=1}^{n} \left| true_i \right|}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mapd()
- mape() float[source]
Mean Absolute Percentage Error. The MAPE is often used when the quantity to predict is known to remain well above zero. It is useful when the magnitude of the predicted variable is significant in evaluating the accuracy of a prediction. It has the advantages of scale-independence and interpretability. However, it has the significant disadvantage that it produces infinite or undefined values for zero or close-to-zero actual values.
\[MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{true_i - predicted_i}{true_i} \right| \times 100\]References
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mape()
- mape_for_peaks() float[source]
Mean Absolute Percentage Error for peaks, which are found using scipy.signal.find_peaks.
\[\text{MAPE}_\text{peak} = \frac{1}{P}\sum_{p=1}^{P} \left |\frac{Q_{s,p} - Q_{o,p}}{Q_{o,p}} \right | \times 100,\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mape_for_peaks()
- mare() float[source]
Mean Absolute Relative Error. When expressed as a percentage, it is also known as mape.
\[\text{MARE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\text{true}_i - \text{predicted}_i}{\text{true}_i} \right|\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mare()
- mase(seasonality: int = 1)[source]
Mean Absolute Scaled Error. The baseline (benchmark) is computed with naive forecasting (shifted by seasonality); modified after this. It is the ratio of the MAE of the model to the MAE of the naive forecast.
\[\text{MASE} = \frac{\frac{1}{n} \sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|}{\frac{1}{n-s} \sum_{i=s+1}^{n} \left| \text{true}_i - \text{true}_{i-s} \right|}\]Hyndman, R. J. (2006). Another look at forecast-accuracy metrics for intermittent demand. Foresight: The International Journal of Applied Forecasting, 4(4), 43-46.
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mase()
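The ratio described above can be sketched in pure Python. The naive forecast predicts each value with the value observed `seasonality` steps earlier, so its error is averaged over the n − s points for which such a predecessor exists. The function below is an illustrative assumption, not the library's implementation.

```python
def mase(true, predicted, seasonality=1):
    """MAE of the model divided by the MAE of a seasonal naive forecast."""
    n = len(true)
    mae_model = sum(abs(t - p) for t, p in zip(true, predicted)) / n
    naive_errors = [abs(true[i] - true[i - seasonality])
                    for i in range(seasonality, n)]
    mae_naive = sum(naive_errors) / len(naive_errors)
    return mae_model / mae_naive

true = [1.0, 3.0, 2.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]  # every prediction is off by exactly 1

score = mase(true, predicted)
print(score)  # about 0.6: model MAE is 1, naive MAE is (2 + 1 + 2) / 3
```

A value below 1 means the model has a lower mean absolute error than the naive forecast on this series.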
- max_error() float[source]
Maximum absolute error. In scikit-learn, the equation uses the absolute error although the name of the metric does not include “absolute”.
\[\text{Max Error} = \max_{i=1}^n \left| \text{true}_i - \text{predicted}_i \right|\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.max_error()
- mb_r() float[source]
Mielke-Berry R value. Berry and Mielke, 1988.
\[R = 1 - \frac{n^2 \cdot \frac{1}{n} \sum_{i=1}^{n} \left| \text{predicted}_i - \text{true}_i \right|}{\sum_{i=1}^{n} \sum_{j=1}^{n} \left| \text{predicted}_j - \text{true}_i \right|}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mb_r()
- mbrae(benchmark: ndarray | None = None) float[source]
Mean Bounded Relative Absolute Error
\[MBRAE = \frac{1}{n} \sum_{i=1}^{n} \frac{| \text{true}_i - \text{predicted}_i |}{| \text{true}_i - \text{benchmark}_i |}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mbrae()
- mda() float[source]
Mean Directional Accuracy modified after
\[\text{MDA} = \frac{1}{n-1} \sum_{i=1}^{n-1} \left( \text{sign}( \text{true}_{i+1} - \text{true}_i) == \text{sign}( \text{predicted}_{i+1} - \text{predicted}_i) \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mda()
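The MDA formula counts how often the direction of change in the predicted series matches that of the true series. A pure-Python sketch, with a hypothetical helper that is not taken from the library, makes the counting explicit:

```python
def mda(true, predicted):
    """Fraction of consecutive steps whose direction of change agrees."""
    def sign(x):
        return (x > 0) - (x < 0)

    hits = [sign(true[i + 1] - true[i]) == sign(predicted[i + 1] - predicted[i])
            for i in range(len(true) - 1)]
    return sum(hits) / len(hits)

true = [1, 2, 3, 2, 1]       # changes: up, up, down, down
predicted = [1, 3, 2, 1, 0]  # changes: up, down, down, down

print(mda(true, predicted))  # 0.75, three of the four directions agree
```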
- mdape() float[source]
Median Absolute Percentage Error. The value is multiplied by 100.
\[\text{MdAPE} = 100 \times \text{Median} \left( \left\{ \frac{|\text{true}_i - \text{predicted}_i|}{|\text{true}_i|} \right\}_{i=1}^n \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mdape()
- mde() float[source]
Median Error.
\[MDE = \text{median}(\text{predicted}_i - \text{true}_i)\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mde()
- mdrae(benchmark: ndarray | None = None) float[source]
Median Relative Absolute Error. In Sklearn, there is “absolute” in equation but not in name of metric.
\[MdRAE = \text{median} \left( \left| \frac{true_i - predicted_i}{true_i - benchmark_i} \right| \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mdrae()
- me()[source]
Mean Error.
\[ME = \frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.me()
- mean_bias_error() float[source]
Mean Bias Error. It represents overall bias error or systematic error. It shows average interpolation bias, i.e. average over- or underestimation [1][2]. This indicator expresses a tendency of the model to underestimate (negative value) or overestimate (positive value) global radiation, while MBE values closest to zero are desirable. The drawback of this test is that it does not show the correct performance when the model presents overestimated and underestimated values at the same time, since overestimation and underestimation values cancel each other.
\[\text{MBE} = \frac{1}{N} \sum_{i=1}^{N} (true_i - predicted_i)\]References
- Willmott, C. J., & Matsuura, K. (2006). On the use of dimensioned measures of error to evaluate the performance of spatial interpolators. International Journal of Geographical Information Science, 20(1), 89-102. https://doi.org/10.1080/1365881050028697
- Valipour, M. (2015). Retracted: Comparative Evaluation of Radiation-Based Methods for Estimation of Potential Evapotranspiration. Journal of Hydrologic Engineering, 20(5), 04014068. https://dx.doi.org/10.1061/(ASCE)HE.1943-5584.0001066
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mean_bias_error()
- mean_gamma_deviance(weights=None) float[source]
-
\[\text{Mean Gamma Deviance (Weighted)} = \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i \frac{2}{\text{true}_i} \left( \text{predicted}_i - \text{true}_i - \text{true}_i \ln \left( \frac{\text{predicted}_i}{\text{true}_i} \right) \right)\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mean_gamma_deviance()
- mean_poisson_deviance(weights=None) float[source]
-
\[\text{MPD} = \frac{1}{n} \sum_{i=1}^{n} 2 \left( \text{true}_i \log \left( \frac{\text{true}_i}{\text{predicted}_i} \right) - (\text{true}_i - \text{predicted}_i) \right)\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mean_poisson_deviance()
- mean_var() float[source]
Mean variance
\[\text{mean_var} = \text{Var} \left( \log(1 + \text{true}) - \log(1 + \text{predicted}) \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mean_var()
- med_seq_error() float[source]
Median Squared Error. Same as mse, but it takes the median, which reduces the impact of outliers.
\[\text{MedSE} = \text{median} \left( (\text{predicted}_i - \text{true}_i)^2 \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics = RegressionMetrics(t, p) >>> metrics.med_seq_error()
- median_abs_error() float[source]
Median Absolute Error
\[\text{MedAE} = \text{median} \left( \left| \text{true}_i - \text{predicted}_i \right| \right)\]References
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.median_absolute_error.html
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.median_abs_error()
- minkowski_distance(order=1) float[source]
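Minkowski distance between the true and predicted arrays. The exponent p in the formula is set by the order argument.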
\[D_{Minkowski} = \left( \sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|^p \right)^{\frac{1}{p}}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.array([1, 2, 3, 4, 5]) >>> p = np.array([1.1, 1.9, 3.1, 4.2, 4.8]) >>> metrics= RegressionMetrics(t, p) >>> distance = metrics.minkowski_distance()
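The formula can be checked by hand on the arrays from the example. The following is a minimal plain-Python sketch, not the SeqMetrics source; it assumes that the `order` argument plays the role of the exponent p in the formula.

```python
def minkowski_distance(true, predicted, order=1):
    """Minkowski distance of the given order between two equal-length sequences."""
    if len(true) != len(predicted):
        raise ValueError("true and predicted must have the same length")
    total = sum(abs(t - p) ** order for t, p in zip(true, predicted))
    return total ** (1.0 / order)


t = [1, 2, 3, 4, 5]
p = [1.1, 1.9, 3.1, 4.2, 4.8]

d1 = minkowski_distance(t, p, order=1)  # order 1: sum of absolute errors
d2 = minkowski_distance(t, p, order=2)  # order 2: Euclidean distance
print(d1, d2)
```

With order 1 the result is the sum of absolute errors, and with order 2 it is the Euclidean distance between the two arrays.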
- mle() float[source]
-
\[\text{MLE} = \frac{1}{n} \sum_{i=1}^{n} \left( \log(1 + \text{predicted}_i) - \log(1 + \text{true}_i) \right)\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics = RegressionMetrics(t, p) >>> metrics.mle()
- mod_agreement_index(j=1) float[source]
Modified index of agreement. j: int; when j==1, this is the same as agreement_index. A higher j gives more weight to outliers.
\[MAI = 1 - \frac{\sum_{i=1}^{n} \left| \text{predicted}_i - \text{true}_i \right|^j}{\sum_{i=1}^{n} \left( \left| \text{predicted}_i - \overline{\text{true}} \right| + \left| \text{true}_i - \overline{\text{true}} \right| \right)^j}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics = RegressionMetrics(t, p) >>> metrics.mod_agreement_index()
- mpe() float[source]
Mean Percentage Error. The value is multiplied by 100 to reflect percentage.
\[MPE = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{true_i - predicted_i}{true_i} \right) \times 100\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mpe()
- mrae(benchmark: ndarray | None = None)[source]
-
\[MRAE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\text{true}_i - \text{predicted}_i}{\text{benchmark}_i} \right|\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mrae()
- mre(benchmark: ndarray | None = None)[source]
-
\[\text{MRE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\text{true}_i - \text{predicted}_i}{\text{true}_i} \right|\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mre()
- mse() float[source]
-
\[MSE = \frac{\sum_{i=1}^{N} w_i (true_i - predicted_i)^2}{\sum_{i=1}^{N} w_i}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.mse()
- msle(weights=None) float[source]
-
\[\text{MSLE} = \frac{\sum_{i=1}^{n} w_i \cdot \text{sq_log_error}_i}{\sum_{i=1}^{n} w_i}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.msle()
- norm_ae() float[source]
-
\[norm\_ae = \sqrt{\frac{\sum_{i=1}^{n} (error_i - MAE)^2}{n - 1}}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.norm_ae()
- norm_ape() float[source]
Normalized Absolute Percentage Error
\[\text{norm_APE} = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} \left( \left| \frac{\text{true}_i - \text{predicted}_i}{\text{true}_i} \right| - \frac{1}{n} \sum_{j=1}^{n} \left| \frac{\text{true}_j - \text{predicted}_j}{\text{true}_j} \right| \right)^2 }\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.norm_ape()
- norm_euclid_distance() float[source]
-
\[D_{norm} = \sqrt{\sum_{i=1}^{n} \left( \frac{\text{true}_i}{\bar{\text{true}}} - \frac{\text{predicted}_i}{\bar{\text{predicted}}} \right)^2}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.norm_euclid_distance()
- nrmse() float[source]
Normalized Root Mean Squared Error
\[NRMSE = \frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} (\text{true}_i - \text{predicted}_i)^2}}{\max(\text{true}) - \min(\text{true})}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.nrmse()
- nrmse_ipercentile(q1=25, q2=75) float[source]
RMSE normalized by the inter-percentile range of true. This is the least sensitive to outliers. q1: any integer between 1 and 99. q2: any integer between 2 and 100; it should be greater than q1. Reference: Pontius et al., 2008
\[\text{NRMSE}_{\text{IP}} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}}{Q_{q2} - Q_{q1}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.nrmse_ipercentile()
- nrmse_mean() float[source]
Mean Normalized RMSE. RMSE normalized by the mean of the true values. This allows comparison between datasets with different scales.
Reference: Pontius et al., 2008
\[NRMSE_{mean} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}}{\bar{\text{true}}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.nrmse_mean()
- nrmse_range() float[source]
Range Normalized Root Mean Squared Error. RMSE normalized by the range (maximum minus minimum) of the true values. This allows comparison between data sets with different scales. It is more sensitive to outliers, since the range is determined by the extreme values.
Reference: Pontius et al., 2008
\[\text{NRMSE} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} (\text{predicted}_i - \text{true}_i)^2}}{\max(\text{true}) - \min(\text{true})}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.nrmse_range()
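The range and mean normalizations differ only in the denominator. The following plain-Python sketch re-implements the two formulas by hand on made-up data; the helper name `rmse` and the variable names are illustrative and are not taken from the SeqMetrics source.

```python
import math


def rmse(true, predicted):
    """Root mean squared error of two equal-length sequences."""
    n = len(true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, predicted)) / n)


true = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 3.5, 4.5, 5.5]  # constant offset of 0.5

error = rmse(true, predicted)
nrmse_by_range = error / (max(true) - min(true))   # denominator used by nrmse_range
nrmse_by_mean = error / (sum(true) / len(true))    # denominator used by nrmse_mean
print(error, nrmse_by_range, nrmse_by_mean)
```

The same RMSE gives different normalized values, so results from the two variants are not directly comparable.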
- nse() float[source]
Nash-Sutcliffe Efficiency.
The Nash-Sutcliffe efficiency (NSE) is a normalized statistic that determines the relative magnitude of the residual variance compared to the measured data variance. It determines how well the model simulates trends for the output response of concern. However, it cannot help identify model bias, and it cannot be used to identify differences in timing and magnitude of peak flows and shape of recession curves; in other words, it cannot be used for single-event simulations. It is sensitive to extreme values due to the squared differences [1]. To make it less sensitive to outliers, [2] proposed log and relative nse.
\[\text{NSE} = 1 - \frac{\sum_{i=1}^{N} (predicted_i - true_i)^2}{\sum_{i=1}^{N} (true_i - \bar{true})^2}\]where the bar above true indicates the mean of the array.
References
- Moriasi, D. N., Gitau, M. W., Pai, N., & Daggupati, P. (2015). Hydrologic and water quality models: Performance measures and evaluation criteria. Transactions of the ASABE, 58(6), 1763-1785.
- Krause, P., Boyle, D., & Bäse, F. (2005). Comparison of different efficiency criteria for hydrological model assessment. Adv. Geosci., 5, 89-97. https://dx.doi.org/10.5194/adgeo-5-89-2005.
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.nse()
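Three reference cases help when reading NSE values. The following is a hand-written sketch of the formula above in plain Python, not the library implementation, and the data are made up for illustration.

```python
def nse(true, predicted):
    """Nash-Sutcliffe efficiency computed directly from the formula."""
    mean_true = sum(true) / len(true)
    residual = sum((p - t) ** 2 for t, p in zip(true, predicted))
    variance = sum((t - mean_true) ** 2 for t in true)
    return 1 - residual / variance


true = [1.0, 2.0, 3.0, 4.0, 5.0]

perfect = nse(true, true)                    # predictions equal observations
baseline = nse(true, [3.0] * len(true))      # always predict the observed mean
reversed_ = nse(true, list(reversed(true)))  # trend is the wrong way round
print(perfect, baseline, reversed_)
```

A perfect model scores 1, predicting the observed mean everywhere scores 0, and a model worse than the mean scores below 0 (here -3).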
- nse_alpha() float[source]
Alpha decomposition of the NSE, see Gupta et al., 2009 used in Kratzert et al., 2019.
\[\text{NSE}_{\text{alpha}} = \frac{\sigma_{\text{predicted}}}{\sigma_{\text{true}}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.nse_alpha()
- nse_beta() float[source]
Beta decomposition of NSE. See Gupta et al., 2009, used in Kratzert et al., 2019.
\[\text{NSE}_{\text{beta}} = \frac{\mu_{\text{predicted}} - \mu_{\text{true}}}{\sigma_{\text{true}}}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.nse_beta()
- nse_bound() float[source]
Bounded Version of the Nash-Sutcliffe Efficiency (nse)
\[\text{NSE}_{\text{bound}} = \frac{\text{NSE}}{2 - \text{NSE}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.nse_bound()
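The effect of the bounding transform is easiest to see on a few NSE values. The sketch below applies the formula above to numbers chosen for illustration; the function name `bound_nse` is hypothetical and is not part of SeqMetrics.

```python
def bound_nse(nse_value):
    """Map an NSE value from (-inf, 1] onto (-1, 1] using NSE / (2 - NSE)."""
    return nse_value / (2 - nse_value)


best = bound_nse(1.0)         # a perfect score stays at 1
zero = bound_nse(0.0)         # the mean-prediction baseline stays at 0
poor = bound_nse(-3.0)        # -3 / 5
very_poor = bound_nse(-1e9)   # approaches, but never reaches, -1
print(best, zero, poor, very_poor)
```

Arbitrarily negative NSE values are squeezed into the interval above -1, which keeps averages over many models or catchments from being dominated by a few very poor fits.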
- nse_mod(j=1) float[source]
Gives less weight to outliers if j=1 and more weight to outliers if j>1. Reference: Krause et al., 2005.
\[\text{NSE}_{\text{mod}} = 1 - \frac{\sum_{i=1}^{N} \left| \text{predicted}_i - \text{true}_i \right|^j}{\sum_{i=1}^{N} \left| \text{true}_i - \bar{\text{true}} \right|^j}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.nse_mod()
- nse_rel() float[source]
Relative Nash-Sutcliffe Efficiency.
\[\text{NSE}_{\text{rel}} = 1 - \frac{\sum_{i=1}^{N} \left( \frac{|\text{predicted}_i - \text{true}_i|}{\text{true}_i} \right)^2}{\sum_{i=1}^{N} \left( \frac{|\text{true}_i - \overline{\text{true}}|}{\overline{\text{true}}} \right)^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.nse_rel()
- pbias() float[source]
Percent Bias. It determines how well the model simulates the average magnitudes for the output response of interest. It can also determine over- and under-prediction. It cannot be used (1) for single-event simulations to identify differences in timing and magnitude of peak flows and the shape of recession curves, nor (2) to determine how well the model simulates residual variations and/or trends for the output response of interest. It can give a deceiving rating of model performance if the model overpredicts as much as it underpredicts, in which case PBIAS will be close to zero even though the model simulation is poor [1].
\[PBIAS = 100 \times \frac{\sum_{i=1}^{N} (\text{true}_i - \text{predicted}_i)}{\sum_{i=1}^{N} \text{true}_i}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.pbias()
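The cancellation effect described above can be reproduced with four made-up values. This is a plain-Python sketch of the PBIAS formula, not the library code; the mean absolute error is computed alongside it only to show that the individual errors are large.

```python
def pbias(true, predicted):
    """Percent bias: 100 * sum(true - predicted) / sum(true)."""
    return 100 * sum(t - p for t, p in zip(true, predicted)) / sum(true)


def mean_abs_error(true, predicted):
    return sum(abs(t - p) for t, p in zip(true, predicted)) / len(true)


true = [10.0, 20.0, 30.0, 40.0]
predicted = [20.0, 10.0, 40.0, 30.0]  # every value is off by 10, alternating in sign

bias = pbias(true, predicted)
mae = mean_abs_error(true, predicted)
print(bias, mae)
```

PBIAS is 0 here although every prediction misses by 10, so it is usually read together with an error-magnitude metric.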
- r2() float[source]
R2 is a statistical measure of how well the regression line approximates the actual data. It quantifies the percent of variation in the response that the ‘model’ explains. The ‘model’ here is anything from which we obtained the predicted array. It is also called the coefficient of determination or the square of the Pearson correlation coefficient. It is more heavily affected by outliers than the Pearson correlation r.
\[R^2 = \left( \frac{\sum_{i=1}^{N} \left( \frac{true_i - \bar{true}}{\sigma_{true}} \cdot \frac{predicted_i - \bar{predicted}}{\sigma_{predicted}} \right)}{N - 1} \right)^2\]where the bar above predicted and true indicates the mean of the array.
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> r_square= metrics.r2() >>> r_square
- r2_score(weights=None)[source]
This is not a symmetric function. Unlike most other scores, R^2 score may be negative (it need not actually be the square of a quantity R). This metric is not well-defined for single samples and will return a NaN value if n_samples is less than two.
\[\text{R2}_{\text{score}} = 1 - \frac{\sum_{i=1}^{n} w_i (\text{true}_i - \text{predicted}_i)^2}{\sum_{i=1}^{n} w_i (\text{true}_i - \bar{\text{true}})^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.r2_score()
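The difference between r2 (squared Pearson correlation) and r2_score can be made concrete with predictions that are perfectly correlated with the observations but point the wrong way. The sketch below re-implements both formulas by hand in plain Python; the function names are illustrative, and the r2_score version assumes all weights are equal to 1.

```python
import statistics


def r2_squared_pearson(true, predicted):
    """Square of the Pearson correlation, following the r2 formula."""
    n = len(true)
    mean_t, mean_p = statistics.mean(true), statistics.mean(predicted)
    std_t, std_p = statistics.stdev(true), statistics.stdev(predicted)  # sample std (N - 1)
    r = sum(((t - mean_t) / std_t) * ((p - mean_p) / std_p)
            for t, p in zip(true, predicted)) / (n - 1)
    return r ** 2


def r2_score_unweighted(true, predicted):
    """1 - SS_res / SS_tot, following the r2_score formula with unit weights."""
    mean_t = statistics.mean(true)
    ss_res = sum((t - p) ** 2 for t, p in zip(true, predicted))
    ss_tot = sum((t - mean_t) ** 2 for t in true)
    return 1 - ss_res / ss_tot


true = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [5.0, 4.0, 3.0, 2.0, 1.0]  # perfectly anti-correlated

squared_r = r2_squared_pearson(true, predicted)
score = r2_score_unweighted(true, predicted)
print(squared_r, score)
```

The squared correlation is 1 because the relationship is perfectly linear, while r2_score is -3 because the predictions are far from the observed values.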
- rae() float[source]
Relative Absolute Error (aka Approximation Error)
\[\text{RAE} = \frac{\sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|}{\sum_{i=1}^{n} \left| \text{true}_i - \overline{\text{true}} \right|}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.rae()
- ref_agreement_index() float[source]
Refined Index of Agreement. From -1 to 1. Larger the better.
\[a = \sum_{i=1}^{n} \left| \text{predicted}_i - \text{true}_i \right|\]\[b = 2 \sum_{i=1}^{n} \left| \text{true}_i - \overline{\text{true}} \right|\]\[d_{\text{ref}} = \begin{cases} 1 - \frac{a}{b} & \text{if } a \leq b \\ \frac{b}{a} - 1 & \text{if } a > b \end{cases}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.ref_agreement_index()
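Both branches of the piecewise definition can be exercised with a small hand calculation. The following is a plain-Python sketch of the formula with made-up data, not the SeqMetrics implementation.

```python
def ref_agreement_index(true, predicted):
    """Refined index of agreement from the piecewise formula."""
    mean_true = sum(true) / len(true)
    a = sum(abs(p - t) for t, p in zip(true, predicted))
    b = 2 * sum(abs(t - mean_true) for t in true)
    if a <= b:
        return 1 - a / b
    return b / a - 1


true = [1.0, 2.0, 3.0, 4.0, 5.0]  # b = 2 * (2 + 1 + 0 + 1 + 2) = 12

close = ref_agreement_index(true, [1.5, 2.5, 3.5, 4.5, 5.5])     # a = 2.5 <= b
far = ref_agreement_index(true, [11.0, 12.0, 13.0, 14.0, 15.0])  # a = 50 > b
print(close, far)
```

Small errors use the first branch and give a positive value; errors larger than b switch to the second branch and give a negative value that stays above -1.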
- rel_agreement_index() float[source]
Relative index of agreement. From 0 to 1. Larger the better.
\[\text{rel_agreement_index} = 1 - \frac{\sum_{i=1}^{n} \left( \frac{\text{predicted}_i - \text{true}_i}{\text{true}_i} \right)^2}{\sum_{i=1}^{n} \left( \frac{|\text{predicted}_i - \bar{\text{true}}| + |\text{true}_i - \bar{\text{true}}|}{\bar{\text{true}}} \right)^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.rel_agreement_index()
- relative_rmse() float[source]
Relative Root Mean Squared Error
\[RRMSE = \frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} (\text{true}_i - \text{predicted}_i)^2}}{\bar{\text{true}}}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.relative_rmse()
- rmdspe() float[source]
Root Median Squared Percentage Error. The value is multiplied by 100 to reflect percentage.
\[\text{RMDSPE} = \sqrt{\text{median}\left(\left(\frac{\text{true}_i - \text{predicted}_i}{\text{true}_i} \times 100\right)^2\right)}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.rmdspe()
- rmse(weights=None) float[source]
-
\[\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} w_i (\text{true}_i - \text{predicted}_i)^2}{\sum_{i=1}^{n} w_i}}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.rmse()
- rmsle() float[source]
-
This error is less sensitive to outliers. Compared to RMSE, RMSLE only considers the relative error between predicted and actual values, and the scale of the error is nullified by the log-transformation. Furthermore, RMSLE penalizes underestimation more than overestimation. This is especially useful in those studies where underestimation of the target variable is not acceptable but overestimation can be tolerated.
\[RMSLE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log(1 + \text{predicted}_i) - \log(1 + \text{true}_i) \right)^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.rmsle()
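The asymmetry between underestimation and overestimation can be verified numerically. This plain-Python sketch follows the RMSLE formula above, using the natural logarithm via `math.log1p`; the data are made up, and the code is not the library implementation.

```python
import math


def rmsle(true, predicted):
    """Root mean squared logarithmic error using log(1 + x)."""
    n = len(true)
    squared = sum((math.log1p(p) - math.log1p(t)) ** 2
                  for t, p in zip(true, predicted))
    return math.sqrt(squared / n)


true = [100.0, 100.0]

under = rmsle(true, [50.0, 50.0])    # predictions 50 below the observations
over = rmsle(true, [150.0, 150.0])   # predictions 50 above the observations
print(under, over)
```

Both predictions miss by 50 in absolute terms, yet the underestimate receives the larger error.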
- rmspe() float[source]
Root Mean Square Percentage Error.
\[RMSPE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(PE_i\right)^2} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(\frac{\text{true}_i - \text{predicted}_i}{\text{true}_i}\right)^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.rmspe()
- rmsse() float[source]
Root Mean Squared Scaled Error
\[\text{RMSSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{\left| \text{true}_i - \text{predicted}_i \right|}{\frac{1}{n-s} \sum_{j=s+1}^{n} \left| \text{true}_j - \text{true}_{j-s} \right|} \right)^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.rmsse()
- rrse() float[source]
-
\[RRSE = \sqrt{\frac{\sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}{\sum_{i=1}^{n} (\text{true}_i - \bar{\text{true}})^2}}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.rrse()
- rse() float[source]
Relative Squared Error
\[\text{RSE} = \frac{\sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}{\sum_{i=1}^{n} (\text{true}_i - \bar{\text{true}})^2}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.rse()
- rsr() float[source]
Ratio of the root mean square error to the standard deviation of measured data (RSR).
It incorporates the benefits of error index statistics and includes a scaling/normalization factor, so that the resulting statistic and reported values can apply to various constituents.
\[\text{RSR} = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2}}{\sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (\text{true}_i - \bar{\text{true}})^2}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.rsr()
- sa() float[source]
Spectral angle. From -pi/2 to pi/2. Closer to 0 is better. It measures angle between two vectors in hyperspace indicating how well the shape of two arrays match instead of their magnitude. Reference: Robila and Gershman, 2005.
\[SA = \arccos \left( \frac{\sum_{i=1}^{n} (\text{true}_i \cdot \text{predicted}_i)}{\sqrt{\sum_{i=1}^{n} (\text{true}_i)^2} \cdot \sqrt{\sum_{i=1}^{n} (\text{predicted}_i)^2}} \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.sa()
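The statement that the spectral angle compares shape rather than magnitude can be checked with a scaled copy of an array. The sketch below implements the arccos formula in plain Python with made-up data; clamping the cosine to [-1, 1] is an addition here to guard against floating-point round-off and is not claimed to be part of SeqMetrics.

```python
import math


def spectral_angle(true, predicted):
    """Angle between two vectors, following the arccos formula."""
    dot = sum(t * p for t, p in zip(true, predicted))
    norm_t = math.sqrt(sum(t * t for t in true))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    cosine = dot / (norm_t * norm_p)
    cosine = max(-1.0, min(1.0, cosine))  # guard against round-off (assumption)
    return math.acos(cosine)


same_shape = spectral_angle([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # predicted = 2 * true
orthogonal = spectral_angle([1.0, 0.0], [0.0, 1.0])
print(same_shape, orthogonal)
```

Doubling every value leaves the angle at approximately 0, whereas two orthogonal arrays give pi/2.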
- sc() float[source]
Spectral correlation. It varies from -pi/2 to pi/2. Closer to 0 is better.
\[sc = \arccos \left( \frac{ \sum_{i=1}^{n} (t_i - \bar{t}) \cdot (p_i - \bar{p}) }{ \sqrt{\sum_{i=1}^{n} (t_i - \bar{t})^2} \cdot \sqrt{\sum_{i=1}^{n} (p_i - \bar{p})^2} } \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.sc()
- sga() float[source]
Spectral gradient angle. It varies from -pi/2 to pi/2. Closer to 0 is better.
\[\text{SGA} = \arccos \left( \frac{\sum_{i=1}^{n-1} \left( (true_{i+1} - true_i) \cdot (predicted_{i+1} - predicted_i) \right)}{\sqrt{\sum_{i=1}^{n-1} (true_{i+1} - true_i)^2} \times \sqrt{\sum_{i=1}^{n-1} (predicted_{i+1} - predicted_i)^2}} \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.sga()
- sid() float[source]
Spectral Information Divergence. From -pi/2 to pi/2. Closer to 0 is better.
\[\text{SID} = \left( \frac{\text{t}}{\text{mean(t)}} - \frac{\text{p}}{\text{mean(p)}} \right) \cdot \left( \log_{10}(\text{t}) - \log_{10}(\text{mean(t)}) - \log_{10}(\text{p}) + \log_{10}(\text{mean(p)}) \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.sid()
- skill_score_murphy() float[source]
Calculate the non-dimensional skill score (SS) between two variables using the definition of Murphy (1988):
\[SS = 1 - \frac{RMSE^2}{SDEV^2}\]\[SDEV^2 = \frac{\sum_{n=1}^{N} \left( r_n - \bar{r} \right)^2}{N - 1}\]where SDEV is the standard deviation of the true values, p is the predicted values, r is the reference (true) values, and N is the total number of values in p and r. Note that p and r must have the same number of values. A positive skill score can be interpreted as the percentage of improvement of the new model forecast in comparison to the reference. On the other hand, a negative skill score denotes that the forecast of interest is worse than the reference forecast. Consequently, a value of zero denotes that both forecasts perform equally [MLAir, 2020].
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.skill_score_murphy()
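A small worked case shows how the skill score combines the mean squared error with the sample variance of the true values. This is a plain-Python sketch of the definition given for this method (squared RMSE over N, variance over N - 1), written for illustration and not copied from the library.

```python
import statistics


def skill_score_murphy(true, predicted):
    """SS = 1 - RMSE^2 / SDEV^2, with SDEV^2 the sample variance of true."""
    n = len(true)
    rmse_squared = sum((p - r) ** 2 for r, p in zip(true, predicted)) / n
    sdev_squared = statistics.variance(true)  # divides by N - 1
    return 1 - rmse_squared / sdev_squared


true = [1.0, 2.0, 3.0, 4.0, 5.0]           # sample variance = 2.5
predicted = [1.5, 2.5, 3.5, 4.5, 5.5]      # squared RMSE = 0.25

ss = skill_score_murphy(true, predicted)
print(ss)
```

Here SS = 1 - 0.25 / 2.5 = 0.9.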
- smape() float[source]
Symmetric Mean Absolute Percentage Error.
\[SMAPE = \frac{100}{n} \sum_{i=1}^{n} \frac{2 \left| \text{predicted}_i - \text{true}_i \right|}{\left| \text{true}_i \right| + \left| \text{predicted}_i \right|}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.smape()
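With the factor of 100 and the factor of 2 in the numerator, this form of SMAPE has an upper limit of 200. The following plain-Python sketch evaluates the formula on made-up data and is not the library implementation.

```python
def smape(true, predicted):
    """Symmetric mean absolute percentage error, following the formula above."""
    n = len(true)
    total = sum(2 * abs(p - t) / (abs(t) + abs(p))
                for t, p in zip(true, predicted))
    return 100 / n * total


typical = smape([100.0, 100.0], [110.0, 90.0])
worst = smape([1.0, 2.0], [0.0, 0.0])  # predicting zero for non-zero observations
print(typical, worst)
```

The second call reaches the upper limit of 200 because each term equals 2 when the prediction is zero and the observation is not.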
- smdape() float[source]
Symmetric Median Absolute Percentage Error. Note: the result is NOT multiplied by 100.
\[\text{smdape} = \text{median} \left( \frac{2 \cdot | \text{predicted} - \text{true} |}{| \text{true} | + | \text{predicted} | + \epsilon} \right)\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.smdape()
- spearmann_corr() float[source]
Spearman correlation coefficient.
This is a nonparametric metric and assesses how well the relationship between the true and predicted data can be described using a monotonic function.
\[r = \frac{\sum_{i=1}^{n} \left( R_{t,i} - \overline{R_t} \right) \left( R_{p,i} - \overline{R_p} \right)}{\sqrt{ \sum_{i=1}^{n} \left( R_{t,i} - \overline{R_t} \right)^2 \sum_{i=1}^{n} \left( R_{p,i} - \overline{R_p} \right)^2 }}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.spearmann_corr()
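Because the formula operates on ranks, any strictly monotonic relationship gives a coefficient of 1, even when it is non-linear. The sketch below ranks the values and then applies the formula in plain Python; it assumes there are no tied values, and the helper names are illustrative rather than taken from SeqMetrics.

```python
import math


def ranks(values):
    """Ranks starting at 1 (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_corr(true, predicted):
    """Pearson correlation of the ranks of true and predicted."""
    rt, rp = ranks(true), ranks(predicted)
    mean_t, mean_p = sum(rt) / len(rt), sum(rp) / len(rp)
    num = sum((a - mean_t) * (b - mean_p) for a, b in zip(rt, rp))
    den = math.sqrt(sum((a - mean_t) ** 2 for a in rt)
                    * sum((b - mean_p) ** 2 for b in rp))
    return num / den


true = [1.0, 2.0, 3.0, 4.0, 5.0]
squared = [t ** 2 for t in true]  # non-linear but monotonic

rho_monotonic = spearman_corr(true, squared)
rho_reversed = spearman_corr(true, list(reversed(true)))
print(rho_monotonic, rho_reversed)
```

The squared values are not linearly related to the originals, but their ranks are identical, so the coefficient is 1; reversing the order gives -1.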
- sse() float[source]
Sum of squared errors (model vs actual). It is a measure of how far off our model’s predictions are from the observed values. A value of 0 indicates that all predictions are spot on. A non-zero value indicates errors.
This is also called residual sum of squares (RSS) or sum of squared residuals, as per tutorialspoint.
\[\text{SSE} = \sum_{i=1}^{n} (true_i - predicted_i)^2\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.sse()
- std_ratio(**kwargs) float[source]
Ratio of the standard deviations of the predicted and true values. Also known as standard ratio, it varies from 0.0 to infinity, with 1.0 being the perfect value.
\[\text{std_ratio} = \frac{\sigma_{\text{predicted}}}{\sigma_{\text{true}}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.std_ratio()
- tweedie_deviance_score(power=0) float[source]
-
\[D(\text{true}, \text{predicted}) = \frac{1}{n} \sum_{i=1}^{n} (\text{true}_i - \text{predicted}_i)^2\]\[D(\text{true}, \text{predicted}) = 2 \sum_{i=1}^{n} \left( \text{true}_i \log\left(\frac{\text{true}_i + (\text{true}_i = 0)}{\text{predicted}_i}\right) - \text{true}_i + \text{predicted}_i \right)\]\[D(\text{true}, \text{predicted}) = 2 \sum_{i=1}^{n} \left( \frac{\text{true}_i}{\text{predicted}_i} - \log\left(\frac{\text{true}_i}{\text{predicted}_i}\right) - 1 \right)\]\[D(\text{true}, \text{predicted}) = 2 \sum_{i=1}^{n} \left( \frac{(\text{true}_i - \text{predicted}_i)^2}{\text{true}_i^2 \text{predicted}_i} \right)\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.array([1, 2, 3, 4, 5]) >>> p = np.array([1.1, 1.9, 3.1, 4.2, 4.8]) >>> metrics= RegressionMetrics(t, p) >>> score = metrics.tweedie_deviance_score()
- umbrae(benchmark: ndarray | None = None)[source]
Unscaled Mean Bounded Relative Absolute Error
\[UMBRAE = \frac{\frac{1}{n} \sum_{i=1}^{n} \frac{|t_i - p_i|}{|t_i - b_i|}}{1 - \frac{1}{n} \sum_{i=1}^{n} \frac{|t_i - p_i|}{|t_i - b_i|}}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.umbrae()
- variability_ratio() float[source]
Variability Ratio. It is the ratio of the variance of the predicted values to the variance of the true values. It is used to measure the variability of the predicted values relative to the true values.
\[VR = 1 - \left| \frac{\frac{\sigma_{\text{predicted}}}{\mu_{\text{predicted}}}}{\frac{\sigma_{\text{true}}}{\mu_{\text{true}}}} - 1 \right|\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.variability_ratio()
- ve() float[source]
Volumetric efficiency. From 0 to 1. Larger the better, since a perfect prediction makes the error sum in the formula zero and gives VE = 1.
\[VE = 1 - \frac{\sum_{i=1}^{n} \left| \text{predicted}_i - \text{true}_i \right|}{\sum_{i=1}^{n} \text{true}_i}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.ve()
- volume_error() float[source]
Returns the Volume Error (Ve). It is an indicator of the agreement between the averages of the simulated and observed runoff (i.e. long-term water balance). Used in the Reynolds paper:
\[\text{volume_error} = \frac{\sum_{i=1}^{n} (\text{predicted}_i - \text{true}_i)}{\sum_{i=1}^{n} \text{predicted}_i}\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.volume_error()
- wape() float[source]
Weighted Absolute Percentage Error (WAPE).
It is a variation of mape but more suitable for intermittent and low-volume data.
\[\text{WAPE} = \frac{\sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|}{\sum_{i=1}^{n} \text{true}_i}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.wape()
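As written in this section, the WAPE formula and the formula given for ve() share the same ratio, so for non-negative true values VE = 1 - WAPE. The sketch below checks that relationship by hand in plain Python on made-up data; it follows the formulas shown here and is not the SeqMetrics source.

```python
def wape(true, predicted):
    """Sum of absolute errors divided by the sum of the true values."""
    return sum(abs(t - p) for t, p in zip(true, predicted)) / sum(true)


def volumetric_efficiency(true, predicted):
    """VE = 1 - sum|predicted - true| / sum(true)."""
    return 1 - sum(abs(p - t) for t, p in zip(true, predicted)) / sum(true)


true = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]  # absolute errors: 2, 2, 3, 3

w = wape(true, predicted)
ve = volumetric_efficiency(true, predicted)
print(w, ve)
```

The absolute errors sum to 10 against a total of 100, so WAPE is 0.1 and VE is 0.9.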
- watt_m() float[source]
-
\[M = \frac{2}{\pi} \cdot \arcsin \left( 1 - \frac{\frac{1}{n} \sum_{i=1}^{n} ( \text{true}_i - \text{predicted}_i )^2}{\sigma_{\text{true}}^2 + \sigma_{\text{predicted}}^2 + (\mu_{\text{predicted}} - \mu_{\text{true}})^2} \right)\]
Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.watt_m()
- wmape() float[source]
Weighted Mean Absolute Percent Error
\[\text{WMAPE} = \frac{\sum_{i=1}^{n} \left| \text{true}_i - \text{predicted}_i \right|}{\sum_{i=1}^{n} \text{true}_i}\]Examples
>>> import numpy as np >>> from SeqMetrics import RegressionMetrics >>> t = np.random.random(10) >>> p = np.random.random(10) >>> metrics= RegressionMetrics(t, p) >>> metrics.wmape()