label_evaluation package

Submodules

label_evaluation.accuracy_classifier module

label_evaluation.accuracy_classifier.cm(target: list, pred: DataFrame, gt: DataFrame, out_dir: Path = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/python-label-processing/checkouts/latest/docs')) → None

Compute a confusion matrix to evaluate classification performance and save it as a plot.

Args:
    target (list): Names matching the classes.
    pred (pd.DataFrame): Predicted classes.
    gt (pd.DataFrame): Ground truth classes.
    out_dir (Path): Directory where the confusion matrix plot will be saved.
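To make the confusion matrix concrete, here is a minimal, dependency-free sketch. It is not the package's implementation: `cm` takes DataFrames and writes a plot, whereas this illustration works on plain lists, and the function name `confusion_matrix` and the class names are assumptions chosen for the example.

```python
def confusion_matrix(target, gt, pred):
    """Count (true class, predicted class) pairs.

    Rows follow the ground truth and columns the predictions, both
    ordered as in `target`.
    """
    index = {name: i for i, name in enumerate(target)}
    matrix = [[0] * len(target) for _ in target]
    for true_label, pred_label in zip(gt, pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


# Hypothetical class names and labels, used only for illustration.
target = ["handwritten", "printed"]
gt = ["handwritten", "handwritten", "printed", "printed"]
pred = ["handwritten", "printed", "printed", "printed"]
matrix = confusion_matrix(target, gt, pred)
```

Each diagonal cell counts correct predictions for one class; off-diagonal cells show which classes are confused with each other.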

label_evaluation.accuracy_classifier.metrics(target: list, pred: DataFrame, gt: DataFrame, out_dir: Path = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/python-label-processing/checkouts/latest/docs')) → str

Build a text report of the main classification metrics, which measure the quality of the model's predictions, and save it to a text file.

Args:
    target (list): Names matching the classes.
    pred (pd.DataFrame): Predicted classes.
    gt (pd.DataFrame): Ground truth classes.
    out_dir (Path): Directory where the report file will be saved.

Returns:
    str: Classification report as text.
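The docstring does not say which metrics the report contains. As an illustration only, the sketch below assumes the usual per-class precision, recall and F1 score, and computes them from plain lists. The helper names `per_class_metrics` and `format_report` and the sample labels are hypothetical and not part of the package.

```python
def per_class_metrics(target, gt, pred):
    """Return {class: (precision, recall, f1)} from paired label lists."""
    report = {}
    for name in target:
        pairs = list(zip(gt, pred))
        tp = sum(1 for t, p in pairs if t == name and p == name)
        fp = sum(1 for t, p in pairs if t != name and p == name)
        fn = sum(1 for t, p in pairs if t == name and p != name)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[name] = (precision, recall, f1)
    return report


def format_report(report):
    """Render the metrics as a small fixed-width text table."""
    lines = [f"{'class':<12}{'precision':>10}{'recall':>10}{'f1':>10}"]
    for name, (precision, recall, f1) in report.items():
        lines.append(f"{name:<12}{precision:>10.2f}{recall:>10.2f}{f1:>10.2f}")
    return "\n".join(lines)


# Hypothetical class names and labels, used only for illustration.
target = ["handwritten", "printed"]
gt = ["handwritten", "handwritten", "printed", "printed"]
pred = ["handwritten", "printed", "printed", "printed"]
report = per_class_metrics(target, gt, pred)
text = format_report(report)
```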

label_evaluation.evaluate_text module

exception label_evaluation.evaluate_text.EmptyReferenceError(message=None)

Bases: Exception

Custom exception for handling cases where the reference string is empty.

label_evaluation.evaluate_text.calculate_cer(reference: list, hypothesis: list) → float

Calculate the Character Error Rate (CER) between reference and hypothesis.

Args:
    reference (list): List of reference (ground truth) strings.
    hypothesis (list): List of hypothesis (predicted) strings.

Returns:
    float: The computed CER value.

label_evaluation.evaluate_text.calculate_scores(gold_text: str, predicted_text: str) → tuple

Calculate Word Error Rate (WER) and Character Error Rate (CER) between ground truth and prediction.

Args:
    gold_text (str): Ground truth transcription.
    predicted_text (str): Predicted transcription.

Returns:
    tuple: (WER, CER), both rounded to two decimal places.
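Both rates are conventionally defined as an edit distance divided by the reference length: WER counts word edits, CER counts character edits. The sketch below shows that definition in plain Python. It is an assumption about how the scores are defined, not the package's code (which may delegate to a library). The names `edit_distance`, `error_rate` and `scores` are hypothetical, and `ValueError` stands in for `EmptyReferenceError`.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    previous = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        current = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def error_rate(ref, hyp):
    """Edits needed to turn `hyp` into `ref`, per reference item."""
    if len(ref) == 0:
        # Stand-in for the package's EmptyReferenceError.
        raise ValueError("reference is empty")
    return edit_distance(ref, hyp) / len(ref)


def scores(gold_text, predicted_text):
    """Return (WER, CER), both rounded to two decimal places."""
    wer = error_rate(gold_text.split(), predicted_text.split())
    cer = error_rate(gold_text, predicted_text)
    return round(wer, 2), round(cer, 2)


result = scores("the cat sat", "the cat sit")
```

Because the denominator is the reference length, an empty reference has no defined error rate, which is why the sketch raises instead of returning a number.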

label_evaluation.evaluate_text.create_plot(data: list, score_name: str, file_name: str) → None

Create and save a violin plot for the given error scores.

Args:
    data (list): List of numerical scores to visualize.
    score_name (str): Name of the score (e.g., “CER” or “WER”).
    file_name (str): Path to save the plot image.

label_evaluation.evaluate_text.evaluate_text_predictions(ground_truth_file: str, predictions_file: str, out_dir: str) → tuple

Evaluate OCR predictions against a ground truth dataset.

Args:
    ground_truth_file (str): Path to the ground truth CSV file.
    predictions_file (str): Path to the predictions JSON file.
    out_dir (str): Output directory for results.

Returns:
    tuple: (List of WER scores, List of CER scores)

label_evaluation.evaluate_text.get_gold_transcriptions(filename: str, sep: str = ',') → dict

Load ground truth transcriptions from a CSV file into a dictionary.

Args:
    filename (str): Path to the CSV file.
    sep (str, optional): Delimiter used in the CSV file. Defaults to ‘,’.

Returns:
    dict: Dictionary with keys as unique identifiers and values as transcription text.
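The CSV layout is not documented here. The sketch below assumes a header row followed by rows whose first column is the identifier and whose second column is the transcription. The function name `load_gold`, the column names and the sample rows are all hypothetical, and the function reads from an open file object rather than a filename so that the example is self-contained.

```python
import csv
import io


def load_gold(csv_file, sep=","):
    """Map the first column (identifier) to the second (transcription)."""
    reader = csv.reader(csv_file, delimiter=sep)
    next(reader, None)  # skip the header row (assumed to be present)
    return {row[0]: row[1] for row in reader if len(row) >= 2}


# An in-memory file stands in for a CSV file on disk.
sample = io.StringIO(
    "ID,text\n"
    "label_1.jpg,Museum Berlin\n"
    "label_2.jpg,leg. Smith\n"
)
gold = load_gold(sample)
```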

label_evaluation.evaluate_text.load_json_predictions(filename: str) → list

Load predictions from a JSON file.

Args:
    filename (str): Path to the JSON file.

Returns:
    list: List of predictions from the JSON file.

label_evaluation.iou_scores module

label_evaluation.iou_scores.box_plot_iou(df_concat: DataFrame, accuracy_txt_path: str | None = None) → Figure

Generate a box plot for IOU scores.

Args:
    df_concat (pd.DataFrame): DataFrame with IOU scores.
    accuracy_txt_path (str, optional): Path to save accuracy percentages.

Returns:
    go.Figure: Plotly figure object.

label_evaluation.iou_scores.calculate_iou(pred_coords: tuple[float, float, float, float], gt_coords: tuple[str, float, float, float, float]) → float

Calculate the Intersection over Union (IOU) score by comparing predicted and ground truth bounding box coordinates.

Args:
    pred_coords (tuple): Coordinates for the predicted bounding box (xmin, ymin, xmax, ymax).
    gt_coords (tuple): Coordinates for the ground truth bounding box (class, xmin, ymin, xmax, ymax).

Returns:
    float: IOU score.
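IOU is the area where two boxes overlap divided by the area they cover together. The sketch below applies that standard definition to the tuple layouts documented above. It illustrates the formula and is not the package's own code; the function name `iou` and the sample boxes are hypothetical.

```python
def iou(pred_coords, gt_coords):
    """Intersection over Union of two axis-aligned bounding boxes."""
    px_min, py_min, px_max, py_max = pred_coords
    _, gx_min, gy_min, gx_max, gy_max = gt_coords  # first entry is the class

    # Width and height of the overlap, clamped to zero for disjoint boxes.
    inter_w = max(0.0, min(px_max, gx_max) - max(px_min, gx_min))
    inter_h = max(0.0, min(py_max, gy_max) - max(py_min, gy_min))
    intersection = inter_w * inter_h

    pred_area = (px_max - px_min) * (py_max - py_min)
    gt_area = (gx_max - gx_min) * (gy_max - gy_min)
    union = pred_area + gt_area - intersection
    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square.
score = iou((0, 0, 2, 2), ("label", 1, 1, 3, 3))
```

The score ranges from 0.0 for boxes that do not overlap to 1.0 for identical boxes.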

label_evaluation.iou_scores.comparison(df_pred_filename: DataFrame, df_gt_filename: DataFrame) → DataFrame

Compare bounding box coordinates and calculate IOU scores.

Args:
    df_pred_filename (pd.DataFrame): DataFrame with predicted labels.
    df_gt_filename (pd.DataFrame): DataFrame with ground truth labels.

Returns:
    pd.DataFrame: DataFrame with added IOU scores.

label_evaluation.iou_scores.concat_frames(df_pred: DataFrame, df_gt: DataFrame) → DataFrame

Concatenate predicted and ground truth datasets with IOU scores.

Args:
    df_pred (pd.DataFrame): DataFrame with predicted bounding boxes.
    df_gt (pd.DataFrame): DataFrame with ground truth bounding boxes.

Returns:
    pd.DataFrame: Concatenated DataFrame with calculated IOU scores.

label_evaluation.redundancy module

label_evaluation.redundancy.clean_data(data: list[dict]) → list[dict]

Preprocess the dataset by converting text to lowercase, removing punctuation and whitespace, and excluding entries containing ‘http’.

Args:
    data (list of dict): List of dictionaries containing label transcriptions.

Returns:
    list of dict: Preprocessed list of dictionaries.
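The preprocessing steps can be sketched as follows. Several details are assumptions not confirmed by the docstring: that the transcription lives under a `"text"` key, that all whitespace is removed rather than only trimmed, and that only ASCII punctuation is stripped. The function name `clean` and the sample entries are hypothetical.

```python
import string


def clean(data, key="text"):
    """Lowercase, strip punctuation and whitespace, and drop URL entries."""
    drop = str.maketrans("", "", string.punctuation + string.whitespace)
    cleaned = []
    for entry in data:
        text = entry[key].lower()
        if "http" in text:
            continue  # entries that contain a URL are excluded
        cleaned.append({**entry, key: text.translate(drop)})
    return cleaned


# Hypothetical entries, used only for illustration.
raw = [
    {"ID": "a", "text": "Mus. Berlin, 1901"},
    {"ID": "b", "text": "see http://example.org"},
]
result = clean(raw)
```

Normalizing this way makes two transcriptions compare equal when they differ only in case, spacing or punctuation, which is what the duplicate checks in this module rely on.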

label_evaluation.redundancy.per_redundancy(data: list[dict]) → int

Calculate the percentage of transcription redundancy in a dataset.

Args:
    data (list of dict): Preprocessed list of dictionaries containing label transcriptions.

Returns:
    int: Percentage of redundant text.

label_evaluation.redundancy.redundancy(data: list[dict]) → list[dict]

Identify duplicate entries in a preprocessed dataset.

Args:
    data (list of dict): Preprocessed list of dictionaries containing label transcriptions.

Returns:
    list of dict: List of dictionaries containing duplicate entries.
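The exact definition of redundancy is not given in these docstrings. The sketch below assumes one plausible reading: an entry is a duplicate when its transcription occurs more than once, and the redundancy percentage is the share of entries that repeat an already seen transcription. The function names, the `"text"` key and the sample data are hypothetical.

```python
from collections import Counter


def duplicates(data, key="text"):
    """Return every entry whose transcription occurs more than once."""
    counts = Counter(entry[key] for entry in data)
    return [entry for entry in data if counts[entry[key]] > 1]


def percent_redundant(data, key="text"):
    """Share of entries that repeat an earlier transcription, in percent."""
    if not data:
        return 0
    unique = len({entry[key] for entry in data})
    return round(100 * (len(data) - unique) / len(data))


# Hypothetical preprocessed entries, used only for illustration.
data = [
    {"ID": "a", "text": "musberlin"},
    {"ID": "b", "text": "musberlin"},
    {"ID": "c", "text": "legsmith"},
    {"ID": "d", "text": "coll1901"},
]
dupes = duplicates(data)
percentage = percent_redundant(data)
```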