label_processing package

Submodules

label_processing.detect_empty_labels module

label_processing.detect_empty_labels.detect_dark_pixels(image: PIL.Image, crop_box: tuple, threshold: int = 100) -> float

Detect the proportion of dark pixels in an image.

Args:

image (Image): Input image.
crop_box (tuple): (left, upper, right, lower) coordinates for image cropping.
threshold (int): Threshold for classifying dark pixels. Defaults to 100.

Returns:

float: Proportion of dark pixels.
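The calculation described above can be sketched without any imaging library. This is a minimal illustration, not the package's implementation: `dark_pixel_proportion` is a hypothetical name, the image is assumed to be a nested list of grayscale values (0 to 255), and "dark" is assumed to mean strictly below the threshold.

```python
def dark_pixel_proportion(pixels, crop_box, threshold=100):
    """Return the share of dark pixels inside crop_box.

    pixels: rows of grayscale values (0-255), a stand-in for a PIL image.
    crop_box: (left, upper, right, lower), as in the documented signature.
    """
    left, upper, right, lower = crop_box
    # Keep only the rows and columns inside the crop box.
    cropped = [row[left:right] for row in pixels[upper:lower]]
    total = sum(len(row) for row in cropped)
    if total == 0:
        return 0.0
    # Assumption: a pixel counts as dark when its value is below the threshold.
    dark = sum(1 for row in cropped for value in row if value < threshold)
    return dark / total


pixels = [
    [255, 255, 255, 255],
    [255, 20, 30, 255],
    [255, 255, 255, 255],
]
# Crop away the first and last column, then measure the dark share.
proportion = dark_pixel_proportion(pixels, (1, 0, 3, 3))
```

Here two of the six cropped pixels fall below the default threshold of 100.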

label_processing.detect_empty_labels.find_empty_labels(input_folder: str, output_folder: str, threshold: float = 0.01, crop_margin: float = 0.1) -> None

Find empty and non-empty labels and move them to their respective folders.

Args:

input_folder (str): Path to the directory containing input images.
output_folder (str): Path to the directory where filtered images will be stored.
threshold (float): Threshold for classifying empty labels. Defaults to 0.01.
crop_margin (float): Margin for cropping images. Defaults to 0.1.

Returns:

None

label_processing.detect_empty_labels.is_empty(image: PIL.Image, crop_margin: float, threshold: float) -> bool

Determines if an image is empty based on a given threshold and crop margin.

Args:

image (Image): PIL Image object.
crop_margin (float): Proportion of the image size to crop from the borders.
threshold (float): Proportion of black pixels below which the image is considered empty.

Returns:

bool: True if the image is empty, False otherwise.
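How a proportional crop margin might translate into a crop box, and how the emptiness decision follows from it, can be sketched as below. Both helper names are hypothetical, and the rounding and the strict "below" comparison are assumptions rather than the package's confirmed behavior.

```python
def margin_to_crop_box(width, height, crop_margin):
    """Turn a proportional margin into a (left, upper, right, lower) box."""
    # Assumption: the same proportion is removed from every border.
    left = int(width * crop_margin)
    upper = int(height * crop_margin)
    return (left, upper, width - left, height - upper)


def is_empty_sketch(dark_proportion, threshold):
    """An image counts as empty when too few dark pixels remain after cropping."""
    return dark_proportion < threshold


box = margin_to_crop_box(200, 100, 0.1)
empty = is_empty_sketch(0.005, 0.01)
not_empty = is_empty_sketch(0.2, 0.01)
```

With the defaults documented for find_empty_labels (threshold 0.01, crop margin 0.1), a label whose cropped area contains less than 1% dark pixels would be treated as empty under these assumptions.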

label_processing.label_detection module

class label_processing.label_detection.PredictLabel(path_to_model: str, classes: list, jpg_path: str | Path | None = None, threshold: float = 0.8)

Bases: object

Class for predicting labels using a trained object detection model.

Attributes:

path_to_model (str): Path to the trained model file.
classes (list): List of classes used in the model.
jpg_path (str|Path|None): Path to a specific JPG file for prediction.
threshold (float): Threshold value for scores. Defaults to 0.8.
model (detecto.core.Model): Trained object detection model.

class_prediction(jpg_path: Path | None = None) -> DataFrame

Predict labels for a given JPG file.

Args:

jpg_path (Path|None): Path to the JPG file. Defaults to None.

Returns:

pd.DataFrame: Pandas DataFrame with prediction results.

property jpg_path

str|Path|None: Property for JPG path.

retrieve_model() -> Model

Retrieve the trained object detection model.

Returns:

detecto.core.Model: Trained object detection model.

label_processing.label_detection.clean_predictions(jpg_dir: Path, dataframe: DataFrame, threshold: float, out_dir=None) -> DataFrame

Filter predictions based on a threshold and save the results to a CSV file.

Args:

jpg_dir (Path): Path to the directory with JPG files.
dataframe (pd.DataFrame): Pandas DataFrame with predictions.
threshold (float): Threshold value for scores.
out_dir (str, optional): Output directory for saving the CSV file. Defaults to None.

Returns:

pd.DataFrame: Pandas DataFrame with filtered results.

label_processing.label_detection.create_crops(jpg_dir: Path, dataframe: DataFrame, out_dir: Path = <current working directory>) -> None

Create crops from the original pictures in a directory, using the predictions obtained by applying the model.

Args:

jpg_dir (Path): Path to the directory with JPG files.
dataframe (pd.DataFrame): Pandas DataFrame with the model predictions.
out_dir (Path): Path to the target directory to save the cropped JPG files.

label_processing.label_detection.crop_picture(img_raw: ndarray, path: str, filename: str, **coordinates) -> None

Crop the picture using the given coordinates.

Args:

img_raw (numpy.ndarray): Input JPG converted to a numpy matrix by cv2.
path (str): Path where the picture should be saved.
filename (str): Name of the picture.
**coordinates: Keyword arguments giving the coordinates for cropping.

label_processing.label_detection.prediction_parallel(jpg_dir: str | Path, predictor: PredictLabel, n_processes: int) -> DataFrame

Perform predictions for all JPG files in a directory with parallel processing.

Args:

jpg_dir (Path|str): Path to JPG files for prediction.
predictor (PredictLabel): Prediction instance.
n_processes (int): Number of processes for parallel execution.

Returns:

pd.DataFrame: Pandas DataFrame containing the predictions.

label_processing.label_rotation module

label_processing.label_rotation.get_image_paths(input_image_dir: str) -> List[str]

Get a list of image paths in the input directory.

Args:

input_image_dir (str): Directory containing input images.

Returns:

list: List of image paths.

label_processing.label_rotation.get_predicted_angles(model: Model, images: ndarray) -> List[int]

Predict angles for a list of images using a trained model.

Args:

model (tf.keras.Model): Trained model.
images (np.ndarray): Array of images.

Returns:

list: List of predicted angles.

label_processing.label_rotation.load_image(image_path: str) -> ndarray

Load an image from a file path.

Args:

image_path (str): Path to the image file.

Returns:

np.ndarray: Loaded image.

label_processing.label_rotation.load_images(image_paths: List[str]) -> ndarray

Load images from a list of image paths.

Args:

image_paths (list): List of image paths.

Returns:

np.ndarray: Loaded images.

label_processing.label_rotation.predict_angles(input_image_dir: str, output_image_dir: str, model_path: str) -> None

Load a trained model, predict angles for input images, and rotate images accordingly.

Args:

input_image_dir (str): Directory containing input images.
output_image_dir (str): Directory to save rotated images.
model_path (str): Path to the trained model.

Returns:

None

label_processing.label_rotation.rotate_image(image: ndarray, angle: int) -> ndarray

Rotate an image based on a given angle.

Args:

image (np.ndarray): Input image.
angle (int): Angle of rotation in multiples of 90 degrees.

Returns:

np.ndarray: Rotated image.
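Rotation in multiples of 90 degrees needs no interpolation, only a rearrangement of pixels. The sketch below illustrates this on a nested list. It is not the package's implementation: `rotate_quarter_turns` is a hypothetical name, and it assumes the angle is given as a count of counterclockwise quarter turns, which the documentation above does not confirm.

```python
def rotate_quarter_turns(image, k):
    """Rotate a 2D list counterclockwise by k * 90 degrees."""
    rotated = [list(row) for row in image]
    for _ in range(k % 4):
        # One counterclockwise quarter turn: transpose, then reverse the row order.
        rotated = [list(column) for column in zip(*rotated)][::-1]
    return rotated


image = [
    [1, 2],
    [3, 4],
]
quarter = rotate_quarter_turns(image, 1)
half = rotate_quarter_turns(image, 2)
full = rotate_quarter_turns(image, 4)
```

Four quarter turns return the original image, which is a quick sanity check for any rotation helper of this kind.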

label_processing.label_rotation.rotate_images(image_paths: List[str], predicted_angles: List[int], output_image_dir: str) -> None

Rotate images based on their predicted angles and save them to the output directory.

Args:

image_paths (list): List of image paths.
predicted_angles (list): List of predicted angles.
output_image_dir (str): Directory to save rotated images.

Returns:

None

label_processing.label_rotation.rotate_single_image(image_path: str, angle: int, output_dir: str) -> bool

Rotate a single image based on a given angle and save the rotated image.

Args:

image_path (str): Path to the input image file.
angle (int): Angle of rotation in multiples of 90 degrees.
output_dir (str): Directory to save the rotated image.

Returns:

bool: True if the image is rotated, False otherwise.

label_processing.label_rotation.save_image(image: ndarray, output_path: str) -> bool

Save an image to a file path.

Args:

image (np.ndarray): Image to save.
output_path (str): Path to save the image.

Returns:

bool: True if the image is saved, False otherwise.

label_processing.ocr_vision module

class label_processing.ocr_vision.VisionApi(path: str, image: bytes, credentials: str, encoding: str)

Bases: object

Class for interacting with the Google Cloud Vision API for OCR tasks on images.

process_string(result_raw: str) -> str

Process the Google Vision OCR output, replacing newlines with spaces and encoding as specified.

Args:

result_raw (str): Raw output string directly from Google Vision.

Returns:

str: Processed string.
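The post-processing step described above can be sketched with the standard library alone. This is a hedged illustration: `process_string_sketch` is a hypothetical stand-alone function, and dropping characters that cannot be represented in ASCII is an assumption about what "encoding as specified" means, not confirmed behavior.

```python
def process_string_sketch(result_raw, encoding="utf8"):
    """Flatten OCR output to one line and optionally restrict it to ASCII."""
    # Replace newlines with spaces so the transcript is a single line.
    processed = result_raw.replace("\n", " ")
    if encoding == "ascii":
        # Assumption: characters outside ASCII are silently dropped.
        processed = processed.encode("ascii", "ignore").decode("ascii")
    return processed


raw = "Müller\nBerlin"
as_utf8 = process_string_sketch(raw, "utf8")
as_ascii = process_string_sketch(raw, "ascii")
```

A transliterating approach (for example mapping "ü" to "ue") would be an equally plausible design; the sketch only shows the simplest option.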

static read_image(path: str, credentials: str, encoding: str = 'utf8') -> VisionApi

Read an image file and return an instance of the VisionApi class.

Args:

path (str): Path to the image file.
credentials (str): Path to the credentials JSON file.
encoding (str, optional): Encoding for the result ('ascii' or 'utf8'). Defaults to 'utf8'.

Returns:

VisionApi: Instance of the VisionApi class.

vision_ocr() -> dict[str, str]

Perform the actual API call, handle errors, and return the processed transcription.

Raises:

Exception: Raises an exception if the API does not respond.

Returns:

dict[str, str]: Dictionary with the filename and the transcript.

label_processing.tensorflow_classifier module

label_processing.tensorflow_classifier.class_prediction(model: Sequential, class_names: list, jpg_dir: str, out_dir=None) -> DataFrame

Create a dataframe with predicted classes for each picture.

Args:

model (tf.keras.Sequential): Trained Keras Sequential image classifier model.
class_names (list): Model's predicted classes.
jpg_dir (str): Path to the directory containing the original jpgs.
out_dir (str, optional): Path where the CSV file will be stored. Defaults to None.

Returns:

pd.DataFrame: Pandas DataFrame with the predicted results.

label_processing.tensorflow_classifier.create_dirs(dataframe: DataFrame, path: str) -> None

Create separate directories for every class.

Args:

dataframe (pd.DataFrame): DataFrame containing the classes as a column.
path (str): Path of the chosen directory.

label_processing.tensorflow_classifier.filter_pictures(jpg_dir: Path, dataframe: DataFrame, out_dir: Path = <current working directory>) -> None

Create a new folder for each class of the renamed, classified pictures.

Args:

jpg_dir (Path): Path to directory with images.
dataframe (pd.DataFrame): Pandas DataFrame with class predictions.
out_dir (Path): Path to the target directory to save the cropped images.

label_processing.tensorflow_classifier.get_model(path_to_model: str) -> Sequential

Load a trained Keras Sequential image classifier model.

Args:

path_to_model (str): Path to the model file.

Returns:

tf.keras.Sequential: Trained Keras Sequential image classifier model.

label_processing.tensorflow_classifier.make_file_name(label_id: str, pic_class: str) -> None

Create a fitting filename.

Args:

label_id (str): String containing the label id.
pic_class (str): Class of the label.

Returns:

str: The created filename.

label_processing.tensorflow_classifier.rename_picture(img_raw: ndarray, path: str, filename: str, pic_class: str) -> None

Rename the pictures using the predicted class.

Args:

img_raw (numpy.ndarray): Input jpg converted to a numpy matrix by cv2.
path (str): Path where the picture should be saved.
filename (str): Name of the picture.
pic_class (str): Class of the label.

label_processing.text_recognition module

class label_processing.text_recognition.ImageProcessor(image: ndarray, path: str, blocksize: int | None = None, c_value: int | None = None)

Bases: object

A class for image preprocessing and other image actions.

property blocksize: int
blur(ksize: tuple[int, int] = (5, 5)) -> ImageProcessor

Apply Gaussian blur to the image.

Args:

ksize (Tuple[int, int], optional): The kernel size for blurring. Defaults to (5, 5).

Returns:

ImageProcessor: An instance of the ImageProcessor class representing the blurred image.

property c_value: int
copy_this() -> ImageProcessor

Create a copy of the current ImageProcessor instance.

Returns:

ImageProcessor: A copy of the current ImageProcessor instance.

deskew(angle: float64 | None) -> ImageProcessor

Rotate the image to deskew it.

Args:

angle (Optional[np.float64]): The skew angle to use for deskewing.

Returns:

ImageProcessor: An instance of the ImageProcessor class representing the deskewed image.

dilate() -> ImageProcessor

Dilate the image using a 5x5 kernel.

Returns:

ImageProcessor: An instance of the ImageProcessor class representing the dilated image.

erode() -> ImageProcessor

Erode the image using a 5x5 kernel.

Returns:

ImageProcessor: An instance of the ImageProcessor class representing the eroded image.

get_grayscale() -> ImageProcessor

Convert the image to grayscale.

Returns:

ImageProcessor: An instance of the ImageProcessor class representing the grayscale image.

get_skew_angle() -> float64 | None

Calculate and return the skew angle of the image.

Returns:

Optional[np.float64]: The skew angle in degrees or None if it couldn’t be determined.

property image: ndarray
property path: str
preprocessing(thresh_mode: Threshmode) -> ImageProcessor

Perform a series of preprocessing steps on the image.

Args:

thresh_mode (Threshmode): The thresholding mode to use (OTSU, ADAPTIVE_MEAN, or ADAPTIVE_GAUSSIAN).

Returns:

ImageProcessor: An instance of the ImageProcessor class representing the preprocessed image.

static read_image(path: str | Path) -> ImageProcessor

Read an image from the specified path and return an instance of the ImageProcessor class.

Args:

path (str | Path): The path to a JPG file.

Returns:

ImageProcessor: An instance of the ImageProcessor class.

read_qr_code() -> str | None

Try to detect a QR code in the picture and, if one is found, decode and return its content.

Returns:

Optional[str]: Decoded QR code text as a str, or None if no QR code is found.

remove_noise() -> ImageProcessor

Remove noise from the image using median blur.

Returns:

ImageProcessor: An instance of the ImageProcessor class representing the noise-reduced image.

save_image(dir_path: str | Path, appendix: str | None = None) -> None

Save the image to a specified directory with an optional appendix.

Args:

dir_path (str | Path): The directory path where the image will be saved.
appendix (str, optional): An optional string to append to the image filename. Defaults to None.

thresholding(thresh_mode: Enum) -> ImageProcessor

Perform thresholding on the image.

Args:

thresh_mode (Threshmode): The thresholding mode to use (OTSU, ADAPTIVE_MEAN, or ADAPTIVE_GAUSSIAN).

Returns:

ImageProcessor: An instance of the ImageProcessor class representing the thresholded image.

class label_processing.text_recognition.Tesseract(languages='eng+deu+fra+ita+spa+por', config='--psm 6 --oem 3', image: ImageProcessor | None = None)

Bases: object

property image: ImageProcessor
image_to_string() -> dict[str, str]

Apply OCR to the JPG image using the configured languages and Tesseract parameters.

Returns:

dict[str, str]: A dictionary containing the image ID (filename) and the OCR-processed text.

class label_processing.text_recognition.Threshmode(value)

Bases: Enum

Different possibilities for thresholding.

Args:

value (int): Integer value of the thresholding mode (1 for OTSU, 2 for ADAPTIVE_MEAN, 3 for ADAPTIVE_GAUSSIAN).

ADAPTIVE_GAUSSIAN = 3
ADAPTIVE_MEAN = 2
OTSU = 1
classmethod eval(threshmode: int) -> Enum
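
The eval classmethod is undocumented above. One plausible reading, shown here purely as an assumption, is that it maps an integer to the corresponding enum member. `ThreshmodeSketch` is a hypothetical stand-in that reuses the member values listed above.

```python
from enum import Enum


class ThreshmodeSketch(Enum):
    """Stand-in enum with the same member values as listed above."""

    OTSU = 1
    ADAPTIVE_MEAN = 2
    ADAPTIVE_GAUSSIAN = 3

    @classmethod
    def eval(cls, threshmode: int) -> "ThreshmodeSketch":
        # Assumption: an integer is looked up by value; unknown values raise ValueError.
        return cls(threshmode)


mode = ThreshmodeSketch.eval(2)
```

Looking a member up by value is standard Enum behavior, so an integer read from a command-line option could be converted this way before being passed to a thresholding routine.
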
label_processing.text_recognition.find_tesseract() -> None

Searches for the tesseract executable and raises an error if it is not found.

label_processing.utils module

label_processing.utils.check_dir(directory: str) -> None

Checks if the directory given as an argument contains jpg files.

Args:

directory (str): path to directory

Raises:

FileNotFoundError: raised if no jpg files are found in the directory
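A directory check of this kind can be sketched with pathlib. This is an illustration only: `check_dir_sketch` is a hypothetical name, and matching only lower-case `.jpg` files directly inside the directory (no subfolders) is an assumption.

```python
import os
import tempfile
from pathlib import Path


def check_dir_sketch(directory):
    """Raise FileNotFoundError if the directory holds no .jpg files."""
    # Assumption: only top-level files ending in lower-case ".jpg" count.
    if not any(Path(directory).glob("*.jpg")):
        raise FileNotFoundError(f"No jpg files found in {directory}")


with tempfile.TemporaryDirectory() as tmp:
    try:
        check_dir_sketch(tmp)
        empty_dir_rejected = False
    except FileNotFoundError:
        empty_dir_rejected = True
    # After adding one jpg file, the same check passes silently.
    open(os.path.join(tmp, "label_001.jpg"), "w").close()
    check_dir_sketch(tmp)
```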

label_processing.utils.check_nuri_format(transcript: str) -> bool

Check whether the "text" field of an OCR transcription contains a NURI pattern.

Args:

transcript (str): text field from OCR output

Returns:

bool: True if NURI pattern found, False otherwise

label_processing.utils.generate_filename(original_path: str, appendix: str, extension: str | None = None) -> str

Take the path to a file or directory and return its name with an appendix added to the end.

Args:

original_path (str): Original path to the file or directory.
appendix (str): Text to append.
extension (Optional[str]): File extension as a string, or None for directories.

Returns:

str: new file or directory name
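The exact naming scheme is not specified above. The sketch below shows one way such a helper could behave, with every detail an assumption: `generate_filename_sketch` is a hypothetical name, the appendix is joined with an underscore, and any existing extension is replaced rather than kept.

```python
import os


def generate_filename_sketch(original_path, appendix, extension=None):
    """Build a new name from the last path component plus an appendix."""
    # Strip trailing separators, take the last component, drop its extension.
    stem = os.path.splitext(os.path.basename(os.path.normpath(original_path)))[0]
    name = f"{stem}_{appendix}"
    # Assumption: no extension means the result names a directory.
    return f"{name}.{extension}" if extension else name


file_name = generate_filename_sketch("data/labels.csv", "ocr", "json")
dir_name = generate_filename_sketch("data/crops/", "rotated")
```

Under these assumptions a CSV of labels yields `labels_ocr.json`, and a directory of crops yields `crops_rotated`.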

label_processing.utils.load_dataframe(filepath_csv: str) -> DataFrame

Loads the CSV file using Pandas.

Args:

filepath_csv (str): path to the CSV file

Returns:

pd.DataFrame: The CSV as a Pandas DataFrame

label_processing.utils.load_jpg(filepath: str) -> ndarray

Load a jpg file using the OpenCV module.

Args:

filepath (str): path to the jpg file

Returns:

np.ndarray: OpenCV image object

label_processing.utils.load_json(file: str) -> dict

Load JSON data from a file and deserialize it.

Args:

file (str): The name of the file containing JSON data.

Returns:

dict: The JSON data as a dictionary

label_processing.utils.read_vocabulary(file: str) -> dict

Read a CSV file containing vocabulary and convert it to a dictionary.

Args:

file (str): The name of the CSV file containing vocabulary data.

Returns:

dict: A dictionary where keys and values are taken from the CSV data.
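Converting a vocabulary CSV into a dictionary can be sketched with the standard csv module. The layout is an assumption: two columns without a header row, the first used as key and the second as value. `read_vocabulary_sketch` and the sample entries are hypothetical.

```python
import csv
import os
import tempfile


def read_vocabulary_sketch(file):
    """Read a two-column CSV file into a dict (first column -> second column)."""
    with open(file, newline="", encoding="utf8") as handle:
        # Assumption: rows with fewer than two columns are skipped.
        return {row[0]: row[1] for row in csv.reader(handle) if len(row) >= 2}


with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "vocabulary.csv")
    with open(csv_path, "w", newline="", encoding="utf8") as handle:
        csv.writer(handle).writerows([["Coll.", "collector"], ["Mus.", "museum"]])
    vocabulary = read_vocabulary_sketch(csv_path)
```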

label_processing.utils.replace_nuri(transcript: dict[str, str]) -> dict[str, str]

Correct NURI format in OCR transcription JSON output.

Args:

transcript (dict[str, str]): JSON transcript with "ID" and "text" fields.

Returns:

dict[str, str]: JSON transcript with corrected NURI formats in the "text" field.

label_processing.utils.save_json(data: list[dict], filename: str, path: str) -> None

Save a JSON file in a human-readable format.

Args:

data (list[dict]): Output of the OCR.
filename (str): Name for the JSON file.
path (str): Path where the JSON file should be saved.
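
"Human-readable" JSON usually means indented output with non-ASCII characters left intact. The sketch below shows that with the standard json module. `save_json_sketch` is a hypothetical name, the indentation width and `ensure_ascii=False` are assumptions, the sample record is invented, and returning the file path is a convenience for the demonstration (the documented function returns None).

```python
import json
import os
import tempfile


def save_json_sketch(data, filename, path):
    """Write data as indented JSON and return the path of the written file."""
    filepath = os.path.join(path, filename)
    with open(filepath, "w", encoding="utf8") as handle:
        # indent gives one key per line; ensure_ascii=False keeps umlauts readable.
        json.dump(data, handle, indent=4, ensure_ascii=False)
    return filepath


records = [{"ID": "label_001.jpg", "text": "Museum für Naturkunde Berlin"}]
with tempfile.TemporaryDirectory() as tmp:
    saved = save_json_sketch(records, "ocr.json", tmp)
    with open(saved, encoding="utf8") as handle:
        reloaded = json.load(handle)
```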