label_evaluation package

Submodules

label_evaluation.accuracy_classifier module

label_evaluation.accuracy_classifier.cm(target: list, pred: DataFrame, gt: DataFrame, out_dir: Path = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/python-label-processing/checkouts/latest/docs')) -> None

Compute confusion matrix to evaluate the performance of the classification.

Args:

target (list): Names matching the classes.
pred (pd.DataFrame): Predicted classes.
gt (pd.DataFrame): Ground truth classes.
out_dir (Path): Path to the target directory to save the confusion matrix plot.

label_evaluation.accuracy_classifier.metrics(target: list, pred: DataFrame, gt: DataFrame, out_dir: Path = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/python-label-processing/checkouts/latest/docs')) -> str

Build a text report of the main classification metrics, which measure the quality of the classification model's predictions, and save it to a text file.

Args:

target (list): Names matching the classes.
pred (pd.DataFrame): Predicted classes.
gt (pd.DataFrame): Ground truth classes.
out_dir (Path): Directory where the report file will be saved.

Returns:

str: Classification report as a text output.
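
To make the quantities behind cm and metrics concrete, the following is a minimal sketch of a confusion matrix and of per-class precision and recall. It is not the package's implementation: it works on plain lists rather than DataFrames, it writes no plot or report file, and the helper names (confusion_matrix, precision_recall) and the class names "handwritten" and "printed" are hypothetical.

```python
from collections import Counter


def confusion_matrix(target, gt, pred):
    """Count (ground truth, prediction) pairs.

    Rows follow the ground truth classes and columns the predicted
    classes, both in the order given by `target`.
    """
    counts = Counter(zip(gt, pred))
    return [[counts[(g, p)] for p in target] for g in target]


def precision_recall(target, matrix):
    """Derive per-class (precision, recall) from a confusion matrix."""
    report = {}
    for i, name in enumerate(target):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        report[name] = (precision, recall)
    return report


# Hypothetical class names and labels, for illustration only.
target = ["handwritten", "printed"]
gt = ["handwritten", "handwritten", "printed", "printed"]
pred = ["handwritten", "printed", "printed", "printed"]

matrix = confusion_matrix(target, gt, pred)
report = precision_recall(target, matrix)
```

Here one handwritten label is misclassified as printed, so "handwritten" keeps perfect precision but loses recall, while "printed" loses precision.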

label_evaluation.evaluate_text module

exception label_evaluation.evaluate_text.EmptyReferenceError(message=None)

Bases: Exception

Custom exception for handling cases where the reference string is empty.

label_evaluation.evaluate_text.calculate_cer(reference: list, hypothesis: list) -> float

Calculate the Character Error Rate (CER) between reference and hypothesis.

Args:

reference (list): List of reference (ground truth) strings.
hypothesis (list): List of hypothesis (predicted) strings.

Returns:

float: The computed CER value.

label_evaluation.evaluate_text.calculate_scores(gold_text: str, predicted_text: str) -> tuple

Calculate Word Error Rate (WER) and Character Error Rate (CER) between ground truth and prediction.

Args:

gold_text (str): Ground truth transcription.
predicted_text (str): Predicted transcription.

Returns:

tuple: (WER, CER) both rounded to two decimal places.
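
As an illustration of what WER and CER measure, here is a minimal sketch built on the standard Levenshtein edit distance, applied to words for WER and to characters for CER. It is an assumption about the usual definitions, not the package's code. The helper names (edit_distance, error_rate, wer_cer) are hypothetical, and the sketch raises a plain ValueError for an empty reference where the package defines EmptyReferenceError.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    previous = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        current = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_item != hyp_item)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def error_rate(ref, hyp):
    """Edit distance normalised by the length of the reference."""
    if len(ref) == 0:
        # The rate is undefined for an empty reference.
        raise ValueError("reference is empty")
    return edit_distance(ref, hyp) / len(ref)


def wer_cer(gold_text, predicted_text):
    """Return (WER, CER), both rounded to two decimal places."""
    wer = error_rate(gold_text.split(), predicted_text.split())
    cer = error_rate(gold_text, predicted_text)
    return round(wer, 2), round(cer, 2)


# One wrong character: one of three words and one of fifteen characters.
scores = wer_cer("Berlin Zoo 1901", "Berlin Zoo 1907")
```

A single character error costs far more in WER than in CER, which is why the two scores are usually reported together.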

label_evaluation.evaluate_text.create_plot(data: list, score_name: str, file_name: str) -> None

Create and save a violin plot for the given error scores.

Args:

data (list): List of numerical scores to visualize.
score_name (str): Name of the score (e.g., “CER” or “WER”).
file_name (str): Path to save the plot image.

label_evaluation.evaluate_text.evaluate_text_predictions(ground_truth_file: str, predictions_file: str, out_dir: str) -> tuple

Evaluate OCR predictions against a ground truth dataset.

Args:

ground_truth_file (str): Path to the ground truth CSV file.
predictions_file (str): Path to the predictions JSON file.
out_dir (str): Output directory for results.

Returns:

tuple: (List of WER scores, List of CER scores)

label_evaluation.evaluate_text.get_gold_transcriptions(filename: str, sep: str = ',') -> dict

Load ground truth transcriptions from a CSV file into a dictionary.

Args:

filename (str): Path to the CSV file.
sep (str, optional): Delimiter used in the CSV file. Defaults to ‘,’.

Returns:

dict: Dictionary with keys as unique identifiers and values as transcription text.

label_evaluation.evaluate_text.load_json_predictions(filename: str) -> list

Load predictions from a JSON file.

Args:

filename (str): Path to the JSON file.

Returns:

list: List of predictions from the JSON file.

label_evaluation.iou_scores module

label_evaluation.iou_scores.box_plot_iou(df_concat: DataFrame, accuracy_txt_path: str | None = None) -> Figure

Generate a box plot for IOU scores.

Args:

df_concat (pd.DataFrame): DataFrame with IOU scores.
accuracy_txt_path (str, optional): Path to save accuracy percentages.

Returns:

go.Figure: Plotly figure object.

label_evaluation.iou_scores.calculate_iou(pred_coords: tuple[float, float, float, float], gt_coords: tuple[str, float, float, float, float]) -> float

Calculate the Intersection over Union (IOU) score between a predicted and a ground truth bounding box.

Args:

pred_coords (tuple): Coordinates for the predicted bounding box (xmin, ymin, xmax, ymax).
gt_coords (tuple): Coordinates for the ground truth bounding box (class, xmin, ymin, xmax, ymax).

Returns:

float: IOU score.
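
The IOU of two axis-aligned boxes is the area of their overlap divided by the area of their union. The sketch below follows the tuple layouts documented above, with the class label as the first item of the ground truth tuple. The function name iou and the handling of degenerate boxes (returning 0.0 when the union is empty) are assumptions, not taken from the package.

```python
def iou(pred_coords, gt_coords):
    """Intersection over Union of two axis-aligned bounding boxes."""
    px_min, py_min, px_max, py_max = pred_coords
    # The first item of the ground truth tuple is the class label.
    _, gx_min, gy_min, gx_max, gy_max = gt_coords

    # Width and height of the overlap, clamped to zero for disjoint boxes.
    inter_w = max(0.0, min(px_max, gx_max) - max(px_min, gx_min))
    inter_h = max(0.0, min(py_max, gy_max) - max(py_min, gy_min))
    intersection = inter_w * inter_h

    pred_area = (px_max - px_min) * (py_max - py_min)
    gt_area = (gx_max - gx_min) * (gy_max - gy_min)
    union = pred_area + gt_area - intersection

    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IOU = 1 / (4 + 4 - 1).
score = iou((0.0, 0.0, 2.0, 2.0), ("label", 1.0, 1.0, 3.0, 3.0))
```

Identical boxes score 1.0 and disjoint boxes score 0.0, so the value can be read as a degree of agreement between prediction and ground truth.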

label_evaluation.iou_scores.comparison(df_pred_filename: DataFrame, df_gt_filename: DataFrame) -> DataFrame

Compare bounding box coordinates and calculate IOU scores.

Args:

df_pred_filename (pd.DataFrame): DataFrame with predicted labels.
df_gt_filename (pd.DataFrame): DataFrame with ground truth labels.

Returns:

pd.DataFrame: DataFrame with added IOU scores.

label_evaluation.iou_scores.concat_frames(df_pred: DataFrame, df_gt: DataFrame) -> DataFrame

Concatenate predicted and ground truth datasets with IOU scores.

Args:

df_pred (pd.DataFrame): DataFrame with predicted bounding boxes.
df_gt (pd.DataFrame): DataFrame with ground truth bounding boxes.

Returns:

pd.DataFrame: Concatenated DataFrame with calculated IOU scores.

label_evaluation.redundancy module

label_evaluation.redundancy.clean_data(data: list[dict]) -> list[dict]

Preprocess the dataset by converting text to lowercase, removing punctuation and whitespace, and excluding entries containing ‘http’.

Args:

data (list of dict): List of dictionaries with the labels’ transcriptions.

Returns:

list of dict: Preprocessed list of dictionaries.

label_evaluation.redundancy.per_redundancy(data: list[dict]) -> int

Calculate the percentage of transcription redundancy in a dataset.

Args:

data (list of dict): Preprocessed list of dictionaries with the labels’ transcriptions.

Returns:

int: Percentage of redundant text.

label_evaluation.redundancy.redundancy(data: list[dict]) -> list[dict]

Identify duplicate entries in a preprocessed dataset.

Args:

data (list of dict): Preprocessed list of dictionaries with the labels’ transcriptions.

Returns:

list of dict: List of dictionaries containing duplicate entries.
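
The three redundancy functions describe a small pipeline: normalise the text, find duplicates, and report their share. The following sketch shows one way such a pipeline could look. The dictionary key "text", the helper names, and the definition of the percentage (entries whose text occurs more than once, divided by all entries) are assumptions made for illustration; the documentation above does not specify them.

```python
import string
from collections import Counter

TEXT_KEY = "text"  # hypothetical key holding the transcription


def clean(data):
    """Lowercase, strip punctuation and whitespace, drop 'http' entries."""
    table = str.maketrans("", "", string.punctuation + string.whitespace)
    cleaned = []
    for entry in data:
        text = entry[TEXT_KEY].lower()
        if "http" in text:
            continue  # exclude entries that contain a URL
        cleaned.append({**entry, TEXT_KEY: text.translate(table)})
    return cleaned


def duplicates(data):
    """Return every entry whose text occurs more than once."""
    counts = Counter(entry[TEXT_KEY] for entry in data)
    return [entry for entry in data if counts[entry[TEXT_KEY]] > 1]


def percentage_redundant(data):
    """Share of duplicated entries, as a whole-number percentage."""
    if not data:
        return 0
    return round(100 * len(duplicates(data)) / len(data))


# Hypothetical label transcriptions, for illustration only.
labels = [
    {"ID": "a.jpg", "text": "Coll. Smith, 1901"},
    {"ID": "b.jpg", "text": "coll smith 1901"},
    {"ID": "c.jpg", "text": "http://example.org/x"},
    {"ID": "d.jpg", "text": "Holotype"},
]

cleaned = clean(labels)
repeated = duplicates(cleaned)
percentage = percentage_redundant(cleaned)
```

After cleaning, the first two transcriptions become identical, so two of the three remaining entries count as redundant.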