Rexmex

Score Cards

class CoverageScoreCard(metric_set: Mapping[str, Callable[[numpy.array, numpy.array], float]], all_users: Collection[str], all_items: Collection[str])[source]

A coverage scorecard can be used to aggregate coverage-related metrics, plot them, and generate performance reports.

generate_report(recs_to_evaluate: pandas.core.frame.DataFrame, grouping: Optional[List[str]] = None)pandas.core.frame.DataFrame[source]

A method to calculate (aggregated) coverage metrics based on a dataframe of recommendations. It assumes that the dataframe contains the user and item columns.

Parameters
  • recs_to_evaluate (pd.DataFrame) – A dataframe holding the recommendations; it contains the user and item columns.

  • grouping (list) – A list of performance grouping variable names (e.g., different recommender settings).

Returns

The performance report.

Return type

report (pd.DataFrame)

get_coverage_metrics(recommendations: List[Tuple])pandas.core.frame.DataFrame[source]

Gets all coverage (performance) values using the defined metric_set. It expects a list of tuples of user/item combinations, e.g., [(user_1, item_1), (user_2, item_1), ...]. The space of possible users and items to recommend is defined during initialisation of this class.

Parameters
  • recommendations (List[Tuple]) – Recommendations of items to users, made by the evaluated system. The user has to decide which score or confidence levels to use prior to calling this ScoreCard.

Returns

The coverage (performance) metrics calculated from the recommendations.

Return type

performance_metrics (pd.DataFrame)

metric_set: Mapping[str, Callable[[numpy.array, numpy.array], float]]
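
Example usage, a minimal sketch: the user/item universes and the recommendation frame are made up, and the import paths are assumptions rather than taken from this reference.

import pandas as pd
from rexmex.metricset import CoverageMetricSet   # assumed import path
from rexmex.scorecard import CoverageScoreCard   # assumed import path

# Hypothetical universes of users and items the recommender may serve.
all_users = ["u1", "u2", "u3"]
all_items = ["i1", "i2", "i3", "i4"]

score_card = CoverageScoreCard(CoverageMetricSet(), all_users, all_items)

# Recommendations to evaluate: one row per (user, item) pair.
recs = pd.DataFrame({"user": ["u1", "u1", "u2"], "item": ["i1", "i2", "i1"]})
report = score_card.generate_report(recs)
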
class ScoreCard(metric_set: Mapping[str, Callable[[numpy.array, numpy.array], float]])[source]

A scorecard can be used to aggregate metrics, plot them, and generate performance reports.

filter_scores(scores: pandas.core.frame.DataFrame, training_set: pandas.core.frame.DataFrame, testing_set: pandas.core.frame.DataFrame, validation_set: pandas.core.frame.DataFrame, columns: List[str])pandas.core.frame.DataFrame[source]

A method to filter out those entries which also appear in either the training, testing, or validation sets. The original is here: https://papers.nips.cc/paper/2013/file/1cecc7a77928ca8133fa24680a88d2f9-Paper.pdf

Parameters
  • scores (pd.DataFrame) – A dataframe with the scores.

  • training_set (pd.DataFrame) – A dataframe of training data points.

  • testing_set (pd.DataFrame) – A dataframe of testing data points.

  • validation_set (pd.DataFrame) – A dataframe of validation data points.

  • columns (list) – A list of column names used for cross-referencing.

Returns

The scores for data points which are not in the reference sets.

Return type

scores (pd.DataFrame)
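
The effect of this filter can be illustrated with plain pandas as an anti-join on the cross-referencing columns; this sketch only shows the behaviour, it is not the library's implementation:

import pandas as pd

scores = pd.DataFrame({"user": [1, 1, 2], "item": [10, 11, 10], "y_score": [0.9, 0.4, 0.7]})
reference = pd.concat([
    pd.DataFrame({"user": [1], "item": [10]}),   # training pairs
    pd.DataFrame({"user": [2], "item": [12]}),   # testing pairs
    pd.DataFrame({"user": [3], "item": [13]}),   # validation pairs
])

# Keep only score rows whose (user, item) pair does not occur in any reference set.
merged = scores.merge(reference.drop_duplicates(), on=["user", "item"], how="left", indicator=True)
filtered = merged[merged["_merge"] == "left_only"].drop(columns="_merge")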

generate_report(scores_to_evaluate: pandas.core.frame.DataFrame, grouping: Optional[List[str]] = None)pandas.core.frame.DataFrame[source]

A method to calculate (aggregated) performance metrics based on a dataframe of ground truth and predictions. It assumes that the dataframe contains the y_true and y_score columns.

Parameters
  • scores_to_evaluate (pd.DataFrame) – A dataframe with the scores and ground truth; it contains the y_true and y_score columns.

  • grouping (list) – A list of performance grouping variable names.

Returns

The performance report.

Return type

report (pd.DataFrame)
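
Example usage, a minimal sketch (the import paths and the toy data are assumptions, not taken from this reference):

import pandas as pd
from rexmex.metricset import ClassificationMetricSet   # assumed import path
from rexmex.scorecard import ScoreCard                  # assumed import path

score_card = ScoreCard(ClassificationMetricSet())

# Ground truth, predicted scores, and an optional grouping variable.
scores = pd.DataFrame({
    "model": ["a", "a", "b", "b"],
    "y_true": [1, 0, 1, 0],
    "y_score": [0.9, 0.2, 0.6, 0.7],
})

report = score_card.generate_report(scores, grouping=["model"])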

get_performance_metrics(y_true: numpy.array, y_score: numpy.array)pandas.core.frame.DataFrame[source]

A method to get the performance metrics for a pair of vectors.

Parameters
  • y_true (np.array) – A vector of ground truth values.

  • y_score (np.array) – A vector of model predictions.

Returns

The performance metrics calculated from the vectors.

Return type

performance_metrics (pd.DataFrame)

metric_set: Mapping[str, Callable[[numpy.array, numpy.array], float]]
print_metrics()[source]

Print the names of the metrics.

Metric Sets

class ClassificationMetricSet[source]

A set of classification metrics with the following metrics included:

Area Under the Receiver Operating Characteristic Curve
Area Under the Precision Recall Curve
Average Precision
F-1 Score
Matthew’s Correlation Coefficient
Fowlkes-Mallows Index
Precision
Recall
Specificity
Accuracy
Balanced Accuracy
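
Since a metric set is a dictionary of metric names and functions, it can be trimmed to the metrics of interest before scoring. A sketch, assuming the import path; the metric names used here are illustrative:

from rexmex.metricset import ClassificationMetricSet   # assumed import path

metric_set = ClassificationMetricSet()
metric_set.filter_metrics(["precision", "recall", "f1_score"])   # keep only these (names illustrative)
metric_set.print_metrics()
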
class CoverageMetricSet[source]

A set of coverage metrics with the following metrics included:

Item Coverage
User Coverage

class MetricSet[source]

A metric set is a special dictionary that contains metric name keys and evaluation metric function values.

add_metrics(metrics: List[Tuple])[source]

A method to add metric functions from a list of function names and functions.

Parameters

metrics (List[Tuple]) – A list of metric name and metric function tuples.

Returns

The metric set after the metrics were added.

Return type

self
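
For example, a custom metric can be registered alongside the built-in ones. A sketch; the metric below is made up and the import path is an assumption:

import numpy as np
from rexmex.metricset import MetricSet   # assumed import path

def hit_fraction(y_true: np.array, y_score: np.array) -> float:
    # Illustrative custom metric: fraction of positive instances scored above 0.5.
    return float(np.mean(y_score[y_true == 1] > 0.5))

metric_set = MetricSet()
metric_set.add_metrics([("hit_fraction", hit_fraction)])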

filter_metrics(filter: Collection[str])[source]

A method to keep only the metrics whose names are listed in the filter.

Parameters

filter – A list of metric names to keep.

Returns

The metric set after the other metrics were filtered out.

Return type

self

print_metrics()[source]

Print the names of the metrics.

class RankingMetricSet[source]

A set of ranking metrics with the following metrics included:

class RatingMetricSet[source]

A set of rating metrics with the following metrics included:

Mean Absolute Error
Mean Squared Error
Root Mean Squared Error
Mean Absolute Percentage Error
Symmetric Mean Absolute Percentage Error
Coefficient of Determination
Pearson Correlation Coefficient
normalize_metrics()[source]

A method to normalize a set of metrics.

Returns

The metric set after the metrics were normalized.

Return type

self
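
Example usage, a minimal sketch (the import path is an assumption); normalize_metrics() wraps each metric with the normalization pre-processing step described under the normalize() utility further below:

from rexmex.metricset import RatingMetricSet   # assumed import path

metric_set = RatingMetricSet()
metric_set.normalize_metrics()   # returns self, so calls can be chained
metric_set.print_metrics()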

Ranking Metrics

average_precision_at_k(relevant_items: numpy.array, recommendation: numpy.array, k=10)[source]

Calculate the average precision at k (AP@K) of items in a ranked list.

Parameters
  • relevant_items (array-like) – An N x 1 array of relevant items.

  • recommendation (array-like) – An N x 1 array of ordered items.

  • k (int) – the number of items considered in the predicted list.

Returns

The average precision @ k of a predicted list.

Return type

AP@K (float)

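As an illustration, AP@K can be computed by hand; this reference sketch is independent of the library, but it is consistent with the MAP@K example further below:

import numpy as np

def ap_at_k_sketch(relevant_items, recommendation, k=10):
    # Illustrative AP@K: mean of precision@i over the ranks i at which a relevant item appears.
    relevant = set(relevant_items)
    hits, precisions = 0, []
    for i, item in enumerate(recommendation[:k], start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / i)
    return float(np.mean(precisions)) if precisions else 0.0

ap_at_k_sketch(relevant_items=[1, 2], recommendation=[3, 2, 1])
# 0.58333... (hits at ranks 2 and 3: mean of 1/2 and 2/3)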

average_recall_at_k(relevant_items: List, recommendation: List, k: int = 10)[source]

Calculate the average recall at k (AR@K) of items in a ranked list.

Parameters
  • relevant_items (array-like) – An N x 1 array of relevant items.

  • recommendation (array-like) – An N x 1 array of items.

  • k (int) – the number of items considered in the predicted list.

Returns

The average recall @ k of a predicted list.

Return type

AR@K (float)

discounted_cumulative_gain(y_true: numpy.array, y_score: numpy.array)[source]

Computes the Discounted Cumulative Gain (DCG), a sum of the true scores ordered by the predicted scores, and then penalized by a logarithmic discount based on ordering.

Parameters
  • y_true (array-like) – An N x M array of ground truth values, where M > 1 for multilabel classification problems.

  • y_score (array-like) – An N x M array of predicted values, where M > 1 for multilabel classification problems.

Returns

Discounted Cumulative Gain

Return type

DCG (float)

gmean_rank(relevant_items: Sequence[rexmex.metrics.ranking.X], recommendation: Sequence[rexmex.metrics.ranking.X])float[source]

Calculate the geometric mean rank (GMR) of items in a ranked list.

Parameters
  • relevant_items – An N x 1 sequence of relevant items.

  • recommendation – An N x 1 sequence of ordered items.

Returns

The geometric mean rank of the relevant items in the recommendation.

hits_at_k(relevant_items: numpy.array, recommendation: numpy.array, k=10)[source]

Calculate the number of hits of relevant items in a ranked list (HITS@K).

Parameters
  • relevant_items (array-like) – A 1 x N array of relevant items.

  • recommendation (array-like) – A 1 x N array of predicted items.

  • k (int) – the number of items considered in the predicted list.

Returns

The number of relevant items in the first k items of a prediction.

Return type

HITS@K (float)

intra_list_similarity(recommendations: List[list], items_feature_matrix: numpy.array)[source]

Calculate the intra-list similarity of recommended items. The items are represented by feature vectors, which are compared with cosine similarity. The recommendation lists consist of item indices, which are used to fetch the item features.

Parameters
  • recommendations (List[list]) – An M x N array of recommendation lists, where M is the number of lists and N the number of recommended items

  • items_feature_matrix (matrix-like) – An N x D matrix, where N is the number of items and D the number of features representing one item

Returns

Average intra-list similarity across the recommendation lists

Return type

(float)


kendall_tau(relevant_items: numpy.array, recommendation: numpy.array)[source]

Calculate the Kendall’s tau, measuring the correspondence between two lists.

Parameters
  • relevant_items (array-like) – An 1 x N array of items.

  • recommendation (array-like) – An 1 x N array of items.

Returns

The tau statistic. p-value (float): two-sided p-value for the null hypothesis that there is no association between the two rankings.

Return type

Kendall tau (float)

mean_average_precision_at_k(relevant_items: List[list], recommendations: List[list], k: int = 10)[source]

Calculate the mean average precision at k (MAP@K) across recommendation lists. Each recommendation list should be paired with a list of relevant items. The first recommendation list is evaluated against the first list of relevant items, and so on.

Example usage:

import numpy as np
from rexmex.metrics.ranking import mean_average_precision_at_k

mean_average_precision_at_k(
    relevant_items=np.array(
        [
            [1,2],
            [2,3]
        ]
    ),
    recommendations=np.array([
        [3,2,1],
        [2,1,3]
    ])
)
>>> 0.708333...
Parameters
  • relevant_items (array-like) – An M x R array of relevant items, where R is the number of relevant items per list.

  • recommendations (array-like) – An M x N array of recommendation lists.

  • k (int) – the number of items considered in the predicted list.

Returns

The mean average precision @ k across recommendations.

Return type

MAP@K (float)

mean_average_recall_at_k(relevant_items: List[list], recommendations: List[list], k: int = 10)[source]

Calculate the mean average recall at k (MAR@K) for a list of recommendations. Each recommendation list should be paired with a list of relevant items. The first recommendation list is evaluated against the first list of relevant items, and so on.

Parameters
  • relevant_items (array-like) – An M x R list where M is the number of recommendation lists, and R is the number of relevant items.

  • recommendations (array-like) – An M x N list where M is the number of recommendation lists and N is the number of recommended items.

  • k (int) – the number of items considered in the recommendation.

Returns

The mean average recall @ k across the recommendations.

Return type

MAR@K (float)

mean_rank(relevant_items: Sequence[rexmex.metrics.ranking.X], recommendation: Sequence[rexmex.metrics.ranking.X])float[source]

Calculate the arithmetic mean rank (MR) of items in a ranked list.

Parameters
  • relevant_items – An N x 1 sequence of relevant items.

  • recommendation – An N x 1 sequence of ordered items.

Returns

The mean rank of the relevant items in the recommendation.

mean_reciprocal_rank(relevant_items: List, recommendation: List)[source]

Calculate the mean reciprocal rank (MRR) of items in a ranked list.

Parameters
  • relevant_items (array-like) – An N x 1 array of relevant items.

  • recommendation (array-like) – An N x 1 array of ordered items.

Returns

The mean reciprocal rank of the relevant items in the recommendation.

Return type

MRR (float)
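
As an illustration, the computation can be sketched independently of the library, averaging the reciprocal of each relevant item's rank in the recommendation (whether the library averages exactly this way is an assumption based on the description above):

import numpy as np

def mrr_sketch(relevant_items, recommendation):
    # Illustrative MRR: mean of 1/rank over the relevant items found in the recommendation.
    ranks = [recommendation.index(item) + 1 for item in relevant_items if item in recommendation]
    return float(np.mean([1.0 / rank for rank in ranks])) if ranks else 0.0

mrr_sketch(relevant_items=[1, 2], recommendation=[3, 2, 1])
# (1/3 + 1/2) / 2 = 0.41666...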

normalized_discounted_cumulative_gain(y_true: numpy.array, y_score: numpy.array)[source]

Computes the Normalized Discounted Cumulative Gain (NDCG), a sum of the true scores ordered by the predicted scores, and then penalized by a logarithmic discount based on ordering. The score is normalized to the range [0.0, 1.0].

Parameters
  • y_true (array-like) – An N x M array of ground truth values, where M > 1 for multilabel classification problems.

  • y_score (array-like) – An N x M array of predicted values, where M > 1 for multilabel classification problems.

Returns

Normalized Discounted Cumulative Gain

Return type

NDCG (float)
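
A hand-worked sketch of the DCG/NDCG computation for a single 1-D example, independent of the library's implementation:

import numpy as np

def dcg_sketch(y_true, y_score):
    # Illustrative DCG: true scores ordered by predicted scores, with a logarithmic position discount.
    order = np.argsort(y_score)[::-1]
    gains = np.asarray(y_true, dtype=float)[order]
    discounts = np.log2(np.arange(2, gains.size + 2))
    return float(np.sum(gains / discounts))

def ndcg_sketch(y_true, y_score):
    # Illustrative NDCG: DCG normalized by the DCG of the ideal ordering, giving a value in [0.0, 1.0].
    ideal = dcg_sketch(y_true, y_true)
    return dcg_sketch(y_true, y_score) / ideal if ideal > 0 else 0.0

ndcg_sketch(y_true=[3, 2, 0, 1], y_score=[0.9, 0.8, 0.1, 0.7])
# 1.0, because the predicted scores order the items exactly as the true scores would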

normalized_distance_based_performance_measure(relevant_items: List, recommendation: List)[source]

Calculates the Normalized Distance-based Performance Measure (NDPM) between two ordered lists. Two matching orderings return 0.0, while two completely mismatched orderings return 1.0.

Parameters
  • relevant_items (List) – List of items

  • recommendation (List) – The predicted list of items

Returns

Normalized Distance-based Performance Measure

Return type

NDPM (float)

Metric Definition: Yao, Y. Y. “Measuring retrieval effectiveness based on user preference of documents.” Journal of the American Society for Information science 46.2 (1995): 133-145.

Definition from: Shani, Guy, and Asela Gunawardana. “Evaluating recommendation systems.” Recommender systems handbook. Springer, Boston, MA, 2011. 257-297

novelty(recommendations: List[list], item_popularities: dict, num_users: int, k: int = 10)[source]

Calculates the capacity of the recommender system to generate novel and unexpected results.

Parameters
  • recommendations (List[list]) – An M x N array of items, where M is the number of recommendation lists and N the number of recommended items

  • item_popularities (dict) – A dict mapping each item in the recommendations to a popularity value. Popular items have higher values.

  • num_users (int) – The number of users

  • k (int) – The number of items considered in each recommendation.

Returns

novelty

Return type

(float)

Metric Definition: Zhou, T., Kuscsik, Z., Liu, J. G., Medo, M., Wakeling, J. R., & Zhang, Y. C. (2010). Solving the apparent diversity-accuracy dilemma of recommender systems. Proceedings of the National Academy of Sciences, 107(10), 4511-4515.


personalization(recommendations: List[list])[source]

Calculates personalization, a measure of similarity between recommendations. A high value indicates that the recommendations are dissimilar, or “personalized”.

Parameters

recommendations (List[list]) – An M x N array of recommended items, where M is the number of recommendation lists and N the number of items

Returns

personalization

Return type

(float)


rank(relevant_item: rexmex.metrics.ranking.X, recommendation: Sequence[rexmex.metrics.ranking.X])float[source]

Calculate the rank of an item in a ranked list of items.

Parameters
  • relevant_item – a target item in the predicted list of items.

  • recommendation – An N x 1 sequence of predicted items.

Returns

The rank of the item.

reciprocal_rank(relevant_item: rexmex.metrics.ranking.X, recommendation: Sequence[rexmex.metrics.ranking.X])float[source]

Calculate the reciprocal rank (RR) of an item in a ranked list of items.

Parameters
  • relevant_item – a target item in the predicted list of items.

  • recommendation – An N x 1 sequence of predicted items.

Returns

The reciprocal rank of the item.

Return type

RR (float)

spearmans_rho(relevant_items: numpy.array, recommendation: numpy.array)[source]

Calculate the Spearman’s rank correlation coefficient (Spearman’s rho) between two lists.

Parameters
  • relevant_items (array-like) – An 1 x N array of items.

  • recommendation (array-like) – An 1 x N array of items.

Returns

Spearman’s rho. p-value (float): two-sided p-value for the null hypothesis that the two rankings are uncorrelated.

Return type

(float)

Classification Metrics

accuracy_score(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the accuracy score for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of the accuracy score.

Return type

(float)

average_precision_score(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the average precision for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of average precision.

Return type

average_precision (float)

balanced_accuracy_score(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the balanced accuracy for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of the balanced accuracy score.

Return type

balanced_accuracy (float)

condition_negative(y_true: numpy.array)float[source]

Calculate the number of instances which are negative.

Parameters

y_true (array-like) – An N x 1 array of ground truth values.

Returns

The number of negative instances.

Return type

cn (float)

condition_positive(y_true: numpy.array)float[source]

Calculate the number of instances which are positive.

Parameters

y_true (array-like) – An N x 1 array of ground truth values.

Returns

The number of positive instances.

Return type

cp (float)

critical_success_index(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the critical success index (duplicate of threat_score()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The critical success index value.

Return type

ts (float)

diagnostic_odds_ratio(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the diagnostic odds ratio.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The diagnostic odds ratio value.

Return type

dor (float)

f1_score(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the F-1 score for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of the F-1 score.

Return type

f1 (float)

fall_out(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the fall out (duplicate of false_positive_rate()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The fall out value.

Return type

fpr (float)

false_discovery_rate(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the false discovery rate.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The false discovery rate value.

Return type

fdr (float)

false_negative(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the number of false negatives.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The number of false negatives.

Return type

fn (float)

false_negative_rate(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the false negative rate (duplicated in miss_rate()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The false negative rate value.

Return type

fnr (float)

false_omission_rate(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the false omission rate.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The false omission rate value.

Return type

fomr (float)

false_positive(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the number of false positives.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The number of false positives.

Return type

fp (float)

false_positive_rate(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the false positive rate (duplicated in fall_out()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The false positive rate value.

Return type

fpr (float)

fowlkes_mallows_index(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the Fowlkes-Mallows index.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The Fowlkes-Mallows index value.

Return type

fm (float)

hit_rate(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the hit rate (duplicate of true_positive_rate()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The hit rate.

Return type

tpr (float)

informedness(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the informedness.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The informedness value.

Return type

bm (float)

markedness(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the markedness.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The markedness value.

Return type

mk (float)

matthews_correlation_coefficient(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate Matthew’s correlation coefficient for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of Matthew’s correlation coefficient.

Return type

mat_cor (float)

miss_rate(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the miss rate (duplicate of false_negative_rate()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The miss rate value.

Return type

fnr (float)

negative_likelihood_ratio(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the negative likelihood ratio.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The negative likelihood ratio value.

Return type

lr_minus (float)

negative_predictive_value(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the negative predictive value.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The negative predictive value.

Return type

npv (float)

positive_likelihood_ratio(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the positive likelihood ratio.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The positive likelihood ratio value.

Return type

(float)

positive_predictive_value(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the positive predictive value (duplicated in precision_score()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The positive predictive value.

Return type

ppv (float)

pr_auc_score(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the precision recall area under the curve (PR AUC) for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of the precision-recall area under the curve.

Return type

pr_auc (float)

precision_score(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the precision for a ground-truth prediction vector pair.

Duplicate of positive_predictive_value(), but with an alternate implementation using sklearn.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of precision.

Return type

precision (float)

prevalence_threshold(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the prevalence threshold score.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The prevalence threshold value.

Return type

pthr (float)

recall_score(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the recall for a ground-truth prediction vector pair.

Duplicate of true_positive_rate(), but with alternate implementation from sklearn.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of recall.

Return type

recall (float)

Note

It’s surprising that the sklearn implementation of TPR needs to be binarized but the rexmex implementation does not.

roc_auc_score(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the AUC for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of the area under the curve.

Return type

auc (float)

selectivity(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the selectivity (duplicate of true_negative_rate()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The selectivity score.

Return type

tnr (float)

sensitivity(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the sensitivity (duplicate of true_positive_rate()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The sensitivity score.

Return type

tpr (float)

specificity(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the specificity (duplicate of true_negative_rate()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The specificity score.

Return type

tnr (float)

threat_score(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the threat score.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The threat score value.

Return type

ts (float)

true_negative(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the number of true negatives.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The number of true negatives.

Return type

tn (float)

true_negative_rate(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the true negative rate (duplicated in specificity() and selectivity()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The true negative rate.

Return type

tnr (float)

true_positive(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the number of true positives.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The number of true positives.

Return type

tp (float)

true_positive_rate(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the true positive rate (duplicated in hit_rate(), sensitivity(), and recall_score()).

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The true positive rate.

Return type

tpr (float)
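
Most of the metrics in this section reduce to the four confusion-matrix counts. A self-contained sketch, assuming binary labels and already-binarized predictions:

import numpy as np

y_true = np.array([1, 1, 0, 0, 1, 0])
y_score = np.array([1, 0, 0, 1, 1, 0])  # predictions already binarized

tp = np.sum((y_true == 1) & (y_score == 1))  # true positives = 2
fp = np.sum((y_true == 0) & (y_score == 1))  # false positives = 1
tn = np.sum((y_true == 0) & (y_score == 0))  # true negatives = 2
fn = np.sum((y_true == 1) & (y_score == 0))  # false negatives = 1

tpr = tp / (tp + fn)                 # recall / sensitivity / hit rate = 2/3
tnr = tn / (tn + fp)                 # specificity / selectivity = 2/3
ppv = tp / (tp + fp)                 # precision / positive predictive value = 2/3
accuracy = (tp + tn) / y_true.size   # 4/6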

Coverage Metrics

item_coverage(possible_users_items: Tuple[List[Union[int, str]], List[Union[int, str]]], recommendations: List[Tuple[Union[int, str], Union[int, str]]])float[source]

Calculates the coverage value for items in possible_users_items[1] given the collection of recommendations. Recommendations over users/items not in possible_users_items are discarded.

Parameters
  • possible_users_items (Tuple[List[Union[int, str]], List[Union[int, str]]]) – contains exactly TWO sub-lists, the first with users and the second with items

  • recommendations (List[Tuple[Union[int, str], Union[int, str]]]) – contains user-item recommendation tuples, e.g., [(user_1, item_1), ...]

Returns: item coverage (float): a metric showing the fraction of items which got recommended at least once.

user_coverage(possible_users_items: Tuple[List[Union[int, str]], List[Union[int, str]]], recommendations: List[Tuple[Union[int, str], Union[int, str]]])float[source]

Calculates the coverage value for users in possible_users_items[0] given the collection of recommendations. Recommendations over users/items not in possible_users_items are discarded.

Parameters
  • possible_users_items (Tuple[List[Union[int, str]], List[Union[int, str]]]) – contains exactly TWO sub-lists, the first with users and the second with items

  • recommendations (List[Tuple[Union[int, str], Union[int, str]]]) – contains user-item recommendation tuples, e.g., [(user_1, item_1), ...]

Returns: user coverage (float): a metric showing the fraction of users who got at least one recommendation out of all possible users.
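
A worked illustration of both coverage values, independent of the library's implementation:

# Possible users and items, and the recommendations actually made.
possible_users, possible_items = ["u1", "u2", "u3"], ["i1", "i2", "i3", "i4"]
recommendations = [("u1", "i1"), ("u1", "i2"), ("u2", "i1")]

# Discard recommendations over unknown users or items, then collect what was covered.
valid = [(u, i) for u, i in recommendations if u in possible_users and i in possible_items]
covered_users = {u for u, i in valid}
covered_items = {i for u, i in valid}

item_cov = len(covered_items) / len(possible_items)  # 2 / 4 = 0.5
user_cov = len(covered_users) / len(possible_users)  # 2 / 3 = 0.666...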

Rating Metrics

mean_absolute_error(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the mean absolute error (MAE) for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The mean absolute error value.

Return type

mae (float)

mean_absolute_percentage_error(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the mean absolute percentage error (MAPE) for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The mean absolute percentage error value.

Return type

mape (float)

mean_squared_error(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the mean squared error (MSE) for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The mean squared error value.

Return type

mse (float)

pearson_correlation_coefficient(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the Pearson correlation coefficient for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of the correlation coefficient.

Return type

rho (float)

r2_score(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the coefficient of determination (R^2) for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The coefficient of determination value.

Return type

r2 (float)

root_mean_squared_error(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the root mean squared error (RMSE) for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of the root mean squared error.

Return type

rmse (float)

symmetric_mean_absolute_percentage_error(y_true: numpy.array, y_score: numpy.array)float[source]

Calculate the symmetric mean absolute percentage error (SMAPE) for a ground-truth prediction vector pair.

Parameters
  • y_true (array-like) – An N x 1 array of ground truth values.

  • y_score (array-like) – An N x 1 array of predicted values.

Returns

The value of the symmetric mean absolute percentage error.

Return type

smape (float)
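
A worked example for the two percentage-error metrics; the SMAPE line uses one common convention, and the library's exact denominator scaling may differ:

import numpy as np

y_true = np.array([100.0, 200.0, 50.0])
y_score = np.array([110.0, 180.0, 40.0])

mape = np.mean(np.abs((y_true - y_score) / y_true)) * 100
# (0.10 + 0.10 + 0.20) / 3 * 100 = 13.33...

smape = np.mean(2 * np.abs(y_true - y_score) / (np.abs(y_true) + np.abs(y_score))) * 100
# values fall in [0, 200] under this convention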

Utility

class Annotator[source]

A class to wrap annotations and generate a registry of the annotated metrics.

annotate(*, lower: float, upper: float, higher_is_better: bool, link: str, description: str, name: Optional[str] = None, lower_inclusive: bool = True, upper_inclusive: bool = True, binarize: bool = False, duplicate_of: Optional[Callable[[numpy.array, numpy.array], float]] = None)[source]

Annotate a function.

duplicate(other, *, name: Optional[str] = None, binarize: Optional[bool] = None)[source]

Annotate a function as a duplicate.

Metric

A function that can be called on y_true and y_score and returns a floating point result

alias of Callable[[numpy.array, numpy.array], float]

binarize(metric)[source]

Binarize the predictions for a ground-truth prediction vector pair.

Parameters

metric (function) – The metric function which needs a binarization pre-processing step.

Returns

The function which wraps the metric and binarizes the probability scores.

Return type

metric_wrapper (function)
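
A sketch of what such a wrapper can look like; the 0.5 threshold is an assumption for illustration, not necessarily the library's choice:

import numpy as np

def binarize_sketch(metric):
    # Illustrative wrapper: threshold the probability scores at 0.5 before applying the metric.
    def metric_wrapper(y_true, y_score):
        return metric(y_true, (np.asarray(y_score) > 0.5).astype(int))
    return metric_wrapper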

normalize(metric)[source]

Normalize the predictions for a ground-truth prediction vector pair.

Parameters

metric (function) – The metric function which needs a normalization pre-processing step.

Returns

The function which wraps the metric and normalizes predictions.

Return type

metric_wrapper (function)

Synthetic Datasets

class DatasetReader[source]

Class to read synthetic test datasets.

read_dataset(dataset: str = 'erdos_renyi_example')[source]

Method to read the dataset.

Parameters

dataset (str) – Dataset of interest, one of: ("erdos_renyi_example"). Default is ‘erdos_renyi_example’.

Returns

The example dataset for testing the library.

Return type

data (pd.DataFrame)
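
Example usage (the import path is an assumption):

from rexmex.dataset import DatasetReader   # assumed import path

reader = DatasetReader()
data = reader.read_dataset("erdos_renyi_example")
print(data.head())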