Image Extraction Script Error? A Repair Guide for the Missing static Directory

python

2024-03-22 04:57:02

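Image Extraction Script Error: A Repair Guide

Introduction

This article looks at a common error raised when running the image extraction script extract_img.py and walks through the fix step by step. The script extracts images from PDF files, but it can fail when a required directory is missing. Following this guide should let you resolve the error and extract the images you need.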

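Cause of the error

The error occurs when extract_img.py runs and no directory named static can be found. The failure is raised while the fitz module is being imported, so the script stops before any image is extracted. Note that PyMuPDF, the library normally imported as fitz, is not documented to need such a directory; the error is commonly reported when an unrelated PyPI package that is also named fitz is installed, so it is worth checking which package is present.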
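
Before changing anything, it can help to confirm where the directory is expected. The following is a minimal sketch, assuming the lookup is relative to the current working directory (which is what the relative name static used later in this guide implies). static_dir_status is a hypothetical helper name and is not part of fitz.

```python
from pathlib import Path


def static_dir_status(base=None):
    """Return (path, exists) for the 'static' folder under base.

    base defaults to the current working directory (an assumption about
    where the lookup happens; adjust it if your setup differs).
    """
    base_path = Path(base) if base is not None else Path.cwd()
    target = base_path / "static"
    return target, target.is_dir()


if __name__ == "__main__":
    path, exists = static_dir_status()
    print(f"{path}: {'found' if exists else 'missing'}")
```

Running this from the same folder in which extract_img.py is launched shows whether the directory is visible from there.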

Solution

To resolve the error, make sure the static directory exists before you run the script. The directory path should be:

C:/Users/Factoryz Amandine/OneDrive/Bureau/Python/CCOR02752150_3.pdf/static

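You can create it manually by navigating to this path in Command Prompt or a file manager and creating the directory there.

Modified script

To automate the directory creation, you can use the following modified script: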
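
The manual step can also be done from Python. This is a sketch under the assumption that the directory should live under a base folder you pass in explicitly. ensure_static_dir is a hypothetical helper introduced here for illustration.

```python
from pathlib import Path


def ensure_static_dir(base):
    """Create base/static (and any missing parents) and return its path.

    Calling it again is harmless because exist_ok=True suppresses the
    error that mkdir would otherwise raise for an existing directory.
    """
    target = Path(base) / "static"
    target.mkdir(parents=True, exist_ok=True)
    return target
```

For example, ensure_static_dir('C:/Users/Factoryz Amandine/OneDrive/Bureau/Python/CCOR02752150_3.pdf') would create the path quoted above. Unlike a bare relative name, an explicit base does not depend on the folder the script is launched from.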

import os

# Create the static directory before fitz is imported.
# The path is relative to the current working directory.
os.makedirs('static', exist_ok=True)

import io
import shutil
from PIL import Image
import fitz
from unif_noun import unif_noun  # helper from another local Python file that normalizes file names

# Folder that receives the extracted images (path elided in the original)
DESTINATION = 'C:/Users/.../VS Projects/img'

def execute_func(rootdir):
    for subdir, dirs, files in os.walk(rootdir):
        for file in files:
            if not file.endswith(".pdf"):
                continue
            filepath = os.path.join(subdir, file)
            # open the PDF by its full path, not just its file name
            pdf_file = fitz.open(filepath)
            # collect the xref of every image on every page
            images = []
            for page in pdf_file:
                for img in page.get_images():
                    images.append(img[0])
            for i, xref in enumerate(images, start=1):
                # original filter: skip the first image and the last four
                if 1 < i < len(images) - 3:
                    # extract the image bytes and the image extension
                    base_image = pdf_file.extract_image(xref)
                    image_bytes = base_image["image"]
                    image_ext = base_image["ext"]
                    # load it into PIL
                    image = Image.open(io.BytesIO(image_bytes))
                    # save it next to the PDF; the index i keeps images from
                    # the same PDF from overwriting each other
                    filename = f"{unif_noun(file)}_{i}.{image_ext}"
                    saved = os.path.join(subdir, filename)
                    image.save(saved)
                    # move the saved file to the destination folder
                    shutil.move(saved, os.path.join(DESTINATION, filename))
            pdf_file.close()

execute_func(r'C:/Users/Factoryz Amandine/OneDrive/Bureau/Python/CCOR02752150_3.pdf')
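
The index filter in the script decides which images are written, and its effect is easy to misread. The sketch below isolates that condition in a hypothetical helper, select_xrefs, which takes a plain list so it can be tried without a PDF. It mirrors the condition 1 < i < len(images) - 3 with positions counted from 1.

```python
def select_xrefs(xrefs):
    """Return the entries the script would extract.

    Positions are counted from 1, as in enumerate(images, start=1).
    An entry is kept when 1 < position < len(xrefs) - 3, which drops
    the first entry and the last four.
    """
    total = len(xrefs)
    return [x for i, x in enumerate(xrefs, start=1) if 1 < i < total - 3]


if __name__ == "__main__":
    # Eight placeholder xrefs: positions 2, 3 and 4 pass the filter.
    print(select_xrefs([10, 20, 30, 40, 50, 60, 70, 80]))  # [20, 30, 40]
    # Five or fewer entries leave nothing to extract.
    print(select_xrefs([10, 20, 30, 40, 50]))  # []
```

If a PDF yields no output files, a short image list combined with this filter is one possible reason.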