label_processing package

Submodules

label_processing.detect_empty_labels module

label_processing.detect_empty_labels.detect_dark_pixels(image: PIL.Image, crop_box: tuple, threshold: int = 100) → float

Detect the proportion of dark pixels in an image.

Args:: image (Image): Input image. crop_box (tuple): (left, upper, right, lower) coordinates for image cropping. threshold (int): Threshold for classifying dark pixels. Defaults to 100.
Returns:: float: Proportion of dark pixels.
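As a rough illustration of the computation described above, the following sketch works on a grayscale image represented as nested lists of 0-255 intensities instead of a PIL image. The name `dark_pixel_ratio` is hypothetical, and the rule that a pixel counts as dark when its value is strictly below `threshold` is an assumption, not confirmed behaviour of the package.

```python
def dark_pixel_ratio(gray, crop_box, threshold=100):
    """Return the share of pixels darker than `threshold` inside `crop_box`.

    `gray` is a list of rows of 0-255 intensities (a hypothetical stand-in
    for a grayscale PIL image); `crop_box` is (left, upper, right, lower).
    """
    left, upper, right, lower = crop_box
    # Keep only the pixels inside the crop box (right/lower are exclusive).
    cropped = [row[left:right] for row in gray[upper:lower]]
    total = sum(len(row) for row in cropped)
    if total == 0:
        return 0.0
    # Assumption: "dark" means strictly below the threshold.
    dark = sum(1 for row in cropped for value in row if value < threshold)
    return dark / total


gray = [
    [255, 255, 255, 255],
    [255, 0, 50, 255],
    [255, 200, 255, 255],
    [255, 255, 255, 255],
]
ratio = dark_pixel_ratio(gray, crop_box=(1, 1, 3, 3))
print(ratio)  # 0.5: two of the four cropped pixels are below 100
```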

label_processing.detect_empty_labels.find_empty_labels(input_folder: str, output_folder: str, threshold: float = 0.01, crop_margin: float = 0.1) → None

Find and move empty and non-empty labels to respective folders.

Args:: input_folder (str): Path to the directory containing input images. output_folder (str): Path to the directory where filtered images will be stored. threshold (float): Threshold for classifying empty labels. Defaults to 0.01. crop_margin (float): Margin for cropping images. Defaults to 0.1.
Returns:: None

label_processing.detect_empty_labels.is_empty(image: PIL.Image, crop_margin: float, threshold: float) → bool

Determines whether an image is empty based on a given threshold and crop margin.

Args:: image (Image): PIL Image object. crop_margin (float): Proportion of the image size to crop from the borders. threshold (float): Proportion of black pixels below which the image is considered empty.
Returns:: bool: Whether the image is empty.
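The relationship between the crop margin, the crop box, and the emptiness decision can be sketched as follows. Both helper names are hypothetical, and the symmetric rounding of the margin is an assumption about how the crop box might be derived.

```python
def crop_box_from_margin(width, height, crop_margin):
    """Convert a proportional margin into a (left, upper, right, lower) box."""
    # Assumption: the same proportion is trimmed from every border.
    dx = round(width * crop_margin)
    dy = round(height * crop_margin)
    return (dx, dy, width - dx, height - dy)


def looks_empty(dark_ratio, threshold):
    """Treat an image as empty when its dark-pixel share is below `threshold`."""
    return dark_ratio < threshold


box = crop_box_from_margin(width=200, height=100, crop_margin=0.1)
print(box)                       # (20, 10, 180, 90)
print(looks_empty(0.004, 0.01))  # True
print(looks_empty(0.25, 0.01))   # False
```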

label_processing.label_detection module

class label_processing.label_detection.PredictLabel(path_to_model: str, classes: list, jpg_path: str | Path | None = None, threshold: float = 0.8)

Bases: object

Class for predicting labels using a trained object detection model.

Attributes:: path_to_model (str): Path to the trained model file. classes (list): List of classes used in the model. jpg_path (str|Path|None): Path to a specific JPG file for prediction. threshold (float): Threshold value for scores. Defaults to 0.8. model (detecto.core.Model): Trained object detection model.

class_prediction(jpg_path: Path | None = None) → DataFrame

Predict labels for a given JPG file.

Args:: jpg_path (Path): Path to the JPG file.
Returns:: pd.DataFrame: Pandas DataFrame with prediction results.

property jpg_path: str | Path | None

Property for JPG path.

retrieve_model() → Model

Retrieve the trained object detection model.

Returns:: detecto.core.Model: Trained object detection model.

label_processing.label_detection.clean_predictions(jpg_dir: Path, dataframe: DataFrame, threshold: float, out_dir=None) → DataFrame

Filter predictions based on a threshold and save the results to a CSV file.

Args:: jpg_dir (Path): Path to the directory with JPG files. dataframe (pd.DataFrame): Pandas DataFrame with predictions. threshold (float): Threshold value for scores. out_dir (str): Output directory for saving the CSV file.
Returns:: pd.DataFrame: Pandas DataFrame with filtered results.
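The filter-then-save step can be sketched without pandas, using plain dictionaries and the standard `csv` module. The column names (`filename`, `class`, `score`), the helper names, and the choice of `>=` for the comparison are all assumptions made for illustration.

```python
import csv
import io


def filter_by_score(rows, threshold):
    """Keep only the prediction rows whose score reaches `threshold`."""
    # Assumption: each row is a dict with a numeric "score" entry.
    return [row for row in rows if row["score"] >= threshold]


def to_csv_text(rows):
    """Serialise the rows as CSV text (header taken from the first row)."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


predictions = [
    {"filename": "a.jpg", "class": "label", "score": 0.95},
    {"filename": "a.jpg", "class": "label", "score": 0.42},
    {"filename": "b.jpg", "class": "label", "score": 0.81},
]
kept = filter_by_score(predictions, threshold=0.8)
csv_text = to_csv_text(kept)
print(csv_text)
```

With pandas, the same filter would typically be a boolean mask on the score column followed by `DataFrame.to_csv`.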

label_processing.label_detection.create_crops(jpg_dir: Path, dataframe: DataFrame, out_dir: Path = <current working directory>) → None

Creates crops from the original pictures inside a directory, using the predictions obtained by applying the model.

Args:: jpg_dir (Path): Path to the directory with JPG files. dataframe (pd.DataFrame): Pandas DataFrame with the predictions. out_dir (Path): Path to the target directory in which to save the cropped JPGs.

label_processing.label_detection.crop_picture(img_raw: ndarray, path: str, filename: str, **coordinates) → None

Crop the picture using the given coordinates.

Args:: img_raw (numpy.ndarray): Input JPG converted to a numpy matrix by cv2. path (str): Path where the picture should be saved. filename (str): Name of the picture. coordinates: Coordinates for cropping.

label_processing.label_detection.prediction_parallel(jpg_dir: str | Path, predictor: PredictLabel, n_processes: int) → DataFrame

Perform predictions for all JPG files in a directory with parallel processing.

Args:: jpg_dir (Path|str): Path to JPG files for prediction. predictor (PredictLabel): Prediction instance. n_processes (int): Number of processes for parallel execution.
Returns:: pd.DataFrame: Pandas DataFrame containing the predictions.

label_processing.label_rotation module

label_processing.label_rotation.get_image_paths(input_image_dir: str) → List[str]

Get a list of image paths in the input directory.

Args:: input_image_dir (str): Directory containing input images.
Returns:: list: List of image paths.

label_processing.label_rotation.get_predicted_angles(model: Model, images: ndarray) → List[int]

Predict angles for a list of images using a trained model.

Args:: model (tf.keras.Model): Trained model. images (np.ndarray): List of images.
Returns:: list: List of predicted angles.

label_processing.label_rotation.load_image(image_path: str) → ndarray

Load an image from a file path.

Args:: image_path (str): Path to the image file.
Returns:: np.ndarray: Loaded image.

label_processing.label_rotation.load_images(image_paths: List[str]) → ndarray

Load images from a list of image paths.

Args:: image_paths (list): List of image paths.
Returns:: np.ndarray: Loaded images.

label_processing.label_rotation.predict_angles(input_image_dir: str, output_image_dir: str, model_path: str) → None

Load a trained model, predict angles for input images, and rotate images accordingly.

Args:: input_image_dir (str): Directory containing input images. output_image_dir (str): Directory to save rotated images. model_path (str): Path to the trained model file.
Returns:: None

label_processing.label_rotation.rotate_image(image: ndarray, angle: int) → ndarray

Rotate an image based on a given angle.

Args:: image (np.ndarray): Input image. angle (int): Angle of rotation in multiples of 90 degrees.
Returns:: np.ndarray: Rotated image.
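Rotation by multiples of 90 degrees needs no interpolation, because it only rearranges pixels. The sketch below shows the idea on a nested list. The name `rotate_quarter_turns` is hypothetical, and two points are assumptions: that the angle is given as a count of quarter turns, and that the rotation is counterclockwise.

```python
def rotate_quarter_turns(grid, k):
    """Rotate a 2-D grid counterclockwise by `k` quarter turns (k * 90 degrees)."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counterclockwise quarter turn.
        grid = [list(row) for row in zip(*grid)][::-1]
    # Return a fresh copy so the caller's grid is never aliased.
    return [list(row) for row in grid]


pixels = [
    [1, 2],
    [3, 4],
]
print(rotate_quarter_turns(pixels, 1))  # [[2, 4], [1, 3]]
print(rotate_quarter_turns(pixels, 2))  # [[4, 3], [2, 1]]
```

For NumPy arrays, `numpy.rot90(image, k)` performs the same counterclockwise quarter-turn rotation.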

label_processing.label_rotation.rotate_images(image_paths: List[str], predicted_angles: List[int], output_image_dir: str) → None

Rotate images based on their predicted angles and save them to the output directory.

Args:: image_paths (list): List of image paths. predicted_angles (list): List of predicted angles. output_image_dir (str): Directory to save rotated images.
Returns:: None

label_processing.label_rotation.rotate_single_image(image_path: str, angle: int, output_dir: str) → bool

Rotate a single image based on a given angle and save the rotated image.

Args:: image_path (str): Path to the input image file. angle (int): Angle of rotation in multiples of 90 degrees. output_dir (str): Directory to save the rotated image.
Returns:: bool: True if the image is rotated, False otherwise.

label_processing.label_rotation.save_image(image: ndarray, output_path: str) → bool

Save an image to a file path.

Args:: image (np.ndarray): Image to save. output_path (str): Path to save the image.
Returns:: bool: True if the image is saved, False otherwise.

label_processing.ocr_vision module

class label_processing.ocr_vision.VisionApi(path: str, image: bytes, credentials: str, encoding: str)

Bases: object

Class for interacting with the Google Cloud Vision API for OCR tasks on images.

process_string(result_raw: str) → str

Process the Google Vision OCR output, replacing newlines with spaces and encoding as specified.

Args:: result_raw (str): Raw output string directly from Google Vision.
Returns:: str: Processed string.
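A minimal sketch of this kind of post-processing is shown below. The function name `process_ocr_text` is hypothetical, and dropping non-ASCII characters (rather than transliterating them) when `encoding` is `"ascii"` is an assumption about the behaviour.

```python
def process_ocr_text(result_raw, encoding="utf8"):
    """Flatten newlines to spaces and optionally restrict the text to ASCII."""
    text = result_raw.replace("\n", " ")
    if encoding == "ascii":
        # Assumption: characters outside ASCII are dropped, not transliterated.
        text = text.encode("ascii", "ignore").decode("ascii")
    return text


raw = "Mus\u00e9e\nParis"
print(process_ocr_text(raw))           # Musée Paris
print(process_ocr_text(raw, "ascii"))  # Muse Paris
```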

static read_image(path: str, credentials: str, encoding: str = 'utf8') → VisionApi

Read an image file and return an instance of the VisionApi class.

Args:: path (str): Path to the image file. credentials (str): Path to the credentials JSON file. encoding (str, optional): Encoding for the result (‘ascii’ or ‘utf8’). Defaults to ‘utf8’.
Returns:: VisionApi: Instance of the VisionApi class.

vision_ocr() → dict[str, str]

Perform the actual API call, handle errors, and return the processed transcription.

Raises:: Exception: Raises an exception if the API does not respond.
Returns:: dict[str, str]: Dictionary with the filename and the transcript.

label_processing.tensorflow_classifier module

label_processing.tensorflow_classifier.class_prediction(model: Sequential, class_names: list, jpg_dir: str, out_dir=None) → DataFrame

Create a dataframe with predicted classes for each picture.

Args:: model (tf.keras.Sequential): Trained Keras Sequential image classifier model. class_names (list): Model’s predicted classes. jpg_dir (str): Path to the directory containing the original jpgs. out_dir (str): Path where the CSV file will be stored.
Returns:: DataFrame (pd.DataFrame): Pandas DataFrame with the predicted results.

label_processing.tensorflow_classifier.create_dirs(dataframe: DataFrame, path: str) → None

Create separate directories for every class.

Args:: dataframe (pd.DataFrame): DataFrame containing the classes as a column. path (str): Path of the chosen directory.

label_processing.tensorflow_classifier.filter_pictures(jpg_dir: Path, dataframe: DataFrame, out_dir: Path = <current working directory>) → None

Sort the classified and renamed pictures into a new folder for each class.

Args:: jpg_dir (Path): Path to directory with images. dataframe (pd.DataFrame): Pandas DataFrame with class predictions. out_dir (Path): Path to the target directory in which to save the images.

label_processing.tensorflow_classifier.get_model(path_to_model: str) → Sequential

Load a trained Keras Sequential image classifier model.

Args:: path_to_model (str): Path to the model file.
Returns:: model (tf.keras.Sequential): Trained Keras Sequential image classifier model.

label_processing.tensorflow_classifier.make_file_name(label_id: str, pic_class: str) → str

Create a fitting filename.

Args:: label_id (str): String containing the label id. pic_class (str): Class of the label.
Returns:: filename (str): The created filename.

label_processing.tensorflow_classifier.rename_picture(img_raw: ndarray, path: str, filename: str, pic_class: str) → None

Rename the pictures using the predicted class.

Args:: img_raw (numpy.ndarray): Input jpg converted to a numpy matrix by cv2. path (str): Path where the picture should be saved. filename (str): Name of the picture. pic_class (str): Class of the label.

label_processing.text_recognition module

class label_processing.text_recognition.ImageProcessor(image: ndarray, path: str, blocksize: int | None = None, c_value: int | None = None)

Bases: object

A class for image preprocessing and other image actions.

property blocksize: int

blur(ksize: tuple[int, int] = (5, 5)) → ImageProcessor

Apply Gaussian blur to the image.

Args:: ksize (Tuple[int, int], optional): The kernel size for blurring. Defaults to (5, 5).
Returns:: ImageProcessor: An ImageProcessor instance representing the blurred image.

property c_value: int

copy_this() → ImageProcessor

Creates a copy of the current ImageProcessor instance.

Returns:: ImageProcessor: A copy of the current ImageProcessor instance.

deskew(angle: float64 | None) → ImageProcessor

Rotate the image to deskew it.

Args:: angle (Optional[np.float64]): The skew angle to use for deskewing.
Returns:: ImageProcessor: An ImageProcessor instance representing the deskewed image.

dilate() → ImageProcessor

Dilate the image using a 5x5 kernel.

Returns:: ImageProcessor: An ImageProcessor instance representing the dilated image.

erode() → ImageProcessor

Erode the image using a 5x5 kernel.

Returns:: ImageProcessor: An ImageProcessor instance representing the eroded image.

get_grayscale() → ImageProcessor

Convert the image to grayscale.

Returns:: ImageProcessor: An ImageProcessor instance representing the grayscale image.

get_skew_angle() → float64 | None

Calculate and return the skew angle of the image.

Returns:: Optional[np.float64]: The skew angle in degrees or None if it couldn’t be determined.

property image: ndarray

property path: str

preprocessing(thresh_mode: Threshmode) → ImageProcessor

Perform a series of preprocessing steps on the image.

Args:: thresh_mode (Threshmode): The thresholding mode to use (OTSU, ADAPTIVE_MEAN, or ADAPTIVE_GAUSSIAN).
Returns:: ImageProcessor: An ImageProcessor instance representing the preprocessed image.

static read_image(path: str | Path) → ImageProcessor

Read an image from the specified path and return an instance of the ImageProcessor class.

Args:: path (str | Path): The path to a JPG file.
Returns:: ImageProcessor: An instance of the ImageProcessor class.

read_qr_code() → str | None

Tries to identify if a picture has a QR-code and then reads and returns it.

Returns:: Optional[str]: Decoded QR-code text as a str or None if there is no QR-code found.

remove_noise() → ImageProcessor

Remove noise from the image using median blur.

Returns:: ImageProcessor: An ImageProcessor instance representing the noise-reduced image.

save_image(dir_path: str | Path, appendix: str | None = None) → None

Save the image to a specified directory with an optional appendix.

Args:: dir_path (str | Path): The directory path where the image will be saved. appendix (str, optional): An optional string to append to the image filename. Defaults to None.

thresholding(thresh_mode: Enum) → ImageProcessor

Perform thresholding on the image.

Args:: thresh_mode (Threshmode): The thresholding mode to use (OTSU, ADAPTIVE_MEAN, or ADAPTIVE_GAUSSIAN).
Returns:: ImageProcessor: An ImageProcessor instance representing the thresholded image.

class label_processing.text_recognition.Tesseract(languages='eng+deu+fra+ita+spa+por', config='--psm 6 --oem 3', image: ImageProcessor | None = None)

Bases: object

property image: ImageProcessor

image_to_string() → dict[str, str]

Apply OCR and image parameters on JPG images.

Returns:: dict[str, str]: A dictionary containing the image ID (filename) and the OCR-processed text.

class label_processing.text_recognition.Threshmode(value)

Bases: Enum

Different possibilities for thresholding.

Args:: Enum (int):

ADAPTIVE_GAUSSIAN = 3

ADAPTIVE_MEAN = 2

OTSU = 1

classmethod eval(threshmode: int) → Enum
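The `eval` classmethod is undocumented here. Judging from its signature, it presumably maps an integer to the matching enum member. The sketch below shows one way such a lookup could work. The class name `ThreshmodeSketch` is hypothetical, and raising `ValueError` for unknown integers is simply the standard `Enum` behaviour, not confirmed behaviour of the package.

```python
from enum import Enum


class ThreshmodeSketch(Enum):
    """Stand-in for Threshmode with the same three members."""

    OTSU = 1
    ADAPTIVE_MEAN = 2
    ADAPTIVE_GAUSSIAN = 3

    @classmethod
    def eval(cls, threshmode):
        """Map an integer to the matching member (assumed behaviour)."""
        # Looking up an Enum by value raises ValueError for unknown integers.
        return cls(threshmode)


mode = ThreshmodeSketch.eval(2)
print(mode.name)  # ADAPTIVE_MEAN
```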

label_processing.text_recognition.find_tesseract() → None

Searches for the tesseract executable and raises an error if it is not found.
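One common way to perform such a check is `shutil.which`, which searches the directories on `PATH`. The sketch below is an illustration only: the helper name `find_executable`, the returned path, and the choice of `FileNotFoundError` are assumptions, since the documentation does not say which error the package raises.

```python
import shutil


def find_executable(name="tesseract"):
    """Return the full path of `name`, or raise if it is not on PATH."""
    location = shutil.which(name)
    if location is None:
        # Assumption: the exception type is illustrative only.
        raise FileNotFoundError(f"Could not find the '{name}' executable on PATH.")
    return location


try:
    find_executable("surely-not-an-installed-binary-xyz")
    missing_detected = False
except FileNotFoundError:
    missing_detected = True
print(missing_detected)
```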

label_processing.utils module

label_processing.utils.check_dir(directory: str) → None

Checks if the directory given as an argument contains jpg files.

Args:: directory (str): path to directory
Raises:: FileNotFoundError: raised if no jpg files are found in the directory
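A check of this kind can be sketched with `pathlib`. The helper name `require_jpgs` is hypothetical, and accepting both `.jpg` and `.jpeg` in any letter case is an assumption.

```python
import tempfile
from pathlib import Path


def require_jpgs(directory):
    """Raise FileNotFoundError when `directory` holds no .jpg/.jpeg files."""
    has_jpg = any(
        path.suffix.lower() in {".jpg", ".jpeg"}
        for path in Path(directory).iterdir()
        if path.is_file()
    )
    if not has_jpg:
        raise FileNotFoundError(f"No jpg files found in {directory}")


with tempfile.TemporaryDirectory() as tmp:
    try:
        require_jpgs(tmp)
        empty_dir_rejected = False
    except FileNotFoundError:
        empty_dir_rejected = True
    (Path(tmp) / "label_001.jpg").touch()
    require_jpgs(tmp)  # no exception now that a jpg is present
    jpg_dir_accepted = True
```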

label_processing.utils.check_nuri_format(transcript: str) → bool

Check whether the “text” field of an OCR transcription contains a NURI in the expected format.

Args:: transcript (str): text field from OCR output
Returns:: bool: True if NURI pattern found, False otherwise

label_processing.utils.generate_filename(original_path: str, appendix: str, extension: str | None = None) → str

Gets the path to a file or directory as an input and returns it with an appendix added to the end.

Args:: original_path (str): Original path to the file or directory. appendix (str): Text to append. extension (Optional[str]): Either no extension (for directories) or a file extension as a string.
Returns:: str: new file or directory name
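The exact naming scheme is not documented, so the following is only a sketch of one plausible scheme. The helper name `with_appendix`, the underscore separator, the removal of an existing extension, and the decision to return only the last path component are all assumptions.

```python
import os


def with_appendix(original_path, appendix, extension=None):
    """Build '<basename>_<appendix>' and add '.<extension>' when one is given."""
    # Normalise first so directories written as 'out/crops/' work too.
    base_name = os.path.basename(os.path.normpath(original_path))
    # Remove an existing file extension, if any.
    stem, _ = os.path.splitext(base_name)
    new_name = f"{stem}_{appendix}"
    if extension:
        new_name = f"{new_name}.{extension.lstrip('.')}"
    return new_name


print(with_appendix("data/labels/", "cropped"))                # labels_cropped
print(with_appendix("data/ocr_output.txt", "clean", "json"))   # ocr_output_clean.json
```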

label_processing.utils.load_dataframe(filepath_csv: str) → DataFrame

Loads the CSV file using Pandas.

Args:: filepath_csv (str): path to the CSV file
Returns:: pd.DataFrame: The CSV as a Pandas DataFrame

label_processing.utils.load_jpg(filepath: str) → ndarray

Loads a JPG file using the OpenCV module.

Args:: filepath (str): path to the JPG file
Returns:: np.ndarray: OpenCV image object

label_processing.utils.load_json(file: str) → dict

Load JSON data from a file and deserialize it.

Args:: file (str): The name of the file containing JSON data.
Returns:: dict: The JSON data as a dictionary

label_processing.utils.read_vocabulary(file: str) → dict

Read a CSV file containing vocabulary and convert it to a dictionary.

Args:: file (str): The name of the CSV file containing vocabulary data.
Returns:: dict: A dictionary where keys and values are taken from the CSV data.
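The CSV-to-dictionary conversion can be sketched with the standard `csv` module. The helper name `vocabulary_from_csv`, the two-column layout without a header row, and the sample entries are assumptions for illustration. The sketch reads from an open file-like object rather than a filename so that it runs without touching the disk.

```python
import csv
import io


def vocabulary_from_csv(handle):
    """Build a dict from the first two columns of each CSV row."""
    vocabulary = {}
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        key, value = row[0].strip(), row[1].strip()
        vocabulary[key] = value
    return vocabulary


# Hypothetical sample data: abbreviation in column 1, expansion in column 2.
sample = io.StringIO("Coll.,Collection\nleg.,legit\n\n")
vocab = vocabulary_from_csv(sample)
print(vocab)  # {'Coll.': 'Collection', 'leg.': 'legit'}
```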

label_processing.utils.replace_nuri(transcript: dict[str, str]) → dict[str, str]

Correct NURI format in OCR transcription JSON output.

Args:: transcript (dict[str, str]): JSON transcript with “ID” and “text” fields.
Returns:: dict[str, str]: JSON transcript with corrected NURI formats in “text” field.

label_processing.utils.save_json(data: list[dict], filename: str, path: str) → None

Saves a JSON file in a human-readable format.

Args:: data (list[dict]): Output of the OCR. filename (str): Name for the JSON file. path (str): Path where the JSON file should be saved.
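"Human-readable" JSON usually means indented output with non-ASCII characters left unescaped. The sketch below shows that with the standard `json` module. The helper name `write_pretty_json`, the indent width of 4, and the returned path are assumptions. The `ID` and `text` keys follow the transcript fields mentioned for `replace_nuri`.

```python
import json
import os
import tempfile


def write_pretty_json(data, filename, path):
    """Write `data` to `path/filename` with indentation and unescaped Unicode."""
    target = os.path.join(path, filename)
    with open(target, "w", encoding="utf8") as handle:
        # indent=4 and ensure_ascii=False are illustrative choices.
        json.dump(data, handle, indent=4, ensure_ascii=False)
    return target


records = [{"ID": "label_001.jpg", "text": "Mus\u00e9um Berlin"}]
with tempfile.TemporaryDirectory() as tmp:
    written = write_pretty_json(records, "ocr.json", tmp)
    with open(written, encoding="utf8") as handle:
        raw = handle.read()
    reloaded = json.loads(raw)
print(raw)
```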