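Enable the service by following the steps on the official site: https://cloud.tencent.com/product/generalocr/getting-started
Once the service is enabled, you obtain two parameters: SecretId and SecretKey.
Tencent's OCR service supports many kinds of image recognition, including ID cards, tables, and general text. This article mainly uses the table recognition service.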
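Calling the API:
1. Open the Table Recognition V2 API documentation: https://cloud.tencent.com/document/product/866/49525#1.-.E6.8E.A5.E5.8F.A3.E6.8F.8F.E8.BF.B0
2. Go to the API debugging page.
3. Let the page generate a Python code snippet automatically (for reference only).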
Install tencentcloud-sdk-python:
pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-sdk-python
Reference: https://cloud.tencent.com/document/sdk/Python
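After installing, it can help to confirm that the SDK is actually importable from the interpreter you plan to run the script with. The snippet below is a minimal sketch using only the standard library; is_installed is a hypothetical helper name, and tencentcloud is the top-level package name used by the imports later in this article.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the current interpreter."""
    # find_spec returns None when a top-level module is not installed
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "tencentcloud" is the package installed by tencentcloud-sdk-python
    for name in ("tencentcloud", "yaml"):
        status = "found" if is_installed(name) else "missing"
        print(name, status)
```

If either package is reported as missing, rerun the pip command above with the same Python environment.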
(1) pip install pywin32
(2) pip install wheel
(3) pip install -U setuptools
(4) pip install pyinstaller
Reference: https://blog.csdn.net/qiuqiuit/article/details/113080645
Code design:
1. YAML
A YAML file holds the configuration values: SecretId and SecretKey.
Contents of the YAML configuration file, config.yaml:
# -*- coding: utf-8 -*-
secret_id: AKIDvpcBVdZJ0b2Rfha******
secret_key: DD0ZtzsXzFI6h5FAsN******
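The code that reads the parameters from the YAML file (saved as yaml_class.py, the module name imported in step 3) is as follows: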
import yaml


def get_yaml_data(yaml_file):
    # 1 - Open the YAML file; the with block closes it automatically
    with open(yaml_file, 'r', encoding="utf-8") as file:
        # 2 - Read the file contents
        file_data = file.read()
    # 3 - Convert the string into a dict or list
    data = yaml.safe_load(file_data)
    # Return the parsed data
    return data
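Because yaml.safe_load returns None for an empty file and silently accepts a file with a misspelled key, it may be worth checking the loaded data before passing it to the OCR client. The following is a minimal sketch of such a check, not part of the original code; validate_config and REQUIRED_KEYS are hypothetical names introduced here for illustration.

```python
# Keys the rest of this article expects to find in config.yaml
REQUIRED_KEYS = ("secret_id", "secret_key")


def validate_config(config):
    """Return config unchanged, or raise ValueError describing the problem."""
    # An empty YAML file parses to None, and a top-level list parses to a list
    if not isinstance(config, dict):
        raise ValueError(
            "config must be a mapping, got " + type(config).__name__
        )
    # Treat absent keys and empty values the same way
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    return config
```

It could be used as validate_config(get_yaml_data("config.yaml")), so that a bad configuration fails early with a readable message instead of an authentication error from the API.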
2. OCR image recognition (saved as ocr.py)
# -*- coding: utf-8 -*-
import base64

from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.ocr.v20181119 import ocr_client, models


# OCR wrapper: recognize a table image and save the result as an Excel file
def img_to_excel(output_file_name, image_path, secret_id, secret_key):
    # 1 - Create a credential object from the Tencent Cloud SecretId and SecretKey
    cred = credential.Credential(secret_id, secret_key)

    # 2 - Create the client object
    http_profile = HttpProfile()
    http_profile.endpoint = "ocr.tencentcloudapi.com"
    client_profile = ClientProfile()
    client_profile.httpProfile = http_profile
    client_profile.signMethod = "TC3-HMAC-SHA256"
    client = ocr_client.OcrClient(cred, "ap-shanghai", client_profile)

    # 3 - Create a table recognition request object.
    # Note: this calls the TableOCR action; per the API docs, the V2 interface
    # linked above is the RecognizeTableOCR action
    # (models.RecognizeTableOCRRequest / client.RecognizeTableOCR).
    fast_request = models.TableOCRRequest()

    # 4 - Read the image and encode it as Base64
    with open(image_path, 'rb') as f:
        image = f.read()
    image_base64 = str(base64.b64encode(image), encoding='utf-8')
    fast_request.ImageBase64 = image_base64

    # 5 - Call the API through the client, passing in the request object
    resp = client.TableOCR(fast_request)

    # 6 - Get the returned data (Data is the Base64-encoded Excel file)
    data = resp.Data

    # Decode and write it out as an Excel file
    output_file_name = str(output_file_name)
    path_excel = output_file_name + ".xlsx"
    with open(path_excel, 'wb') as f:
        f.write(base64.b64decode(data))
    return path_excel
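The two Base64 steps in the function above (encoding the image for the request, decoding Data into an .xlsx file) can be tried on their own without calling the API. Below is a minimal sketch using only the standard library. encode_image and save_excel are hypothetical helper names, and the 7 MB size limit is an assumption that should be checked against the current API documentation before relying on it.

```python
import base64

# Assumed upper bound for the Base64-encoded image; confirm in the API docs
MAX_BASE64_BYTES = 7 * 1024 * 1024


def encode_image(image_path, max_bytes=MAX_BASE64_BYTES):
    """Read an image file and return its contents as a Base64 string."""
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read())
    # Fail locally instead of sending a request that would be rejected
    if len(encoded) > max_bytes:
        raise ValueError(
            "encoded image is %d bytes, limit is %d" % (len(encoded), max_bytes)
        )
    return encoded.decode("utf-8")


def save_excel(data_base64, output_path):
    """Decode Base64 data (such as resp.Data) and write it to output_path."""
    with open(output_path, "wb") as f:
        f.write(base64.b64decode(data_base64))
    return output_path
```

Splitting these steps out makes it possible to test the file handling with any small file before spending API quota on real requests.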
3. Running it
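# -*- coding: utf-8 -*-
import os

import ocr
import yaml_class

'''
Image naming convention: 'num' + N (index) + '.png'
Excel file naming convention: matches the image index N one-to-one
'''


def ocr_pic(N):
    # Call the YAML reader; it returns the config (secret_id, secret_key) as a dict
    config = yaml_class.get_yaml_data("config.yaml")
    # Make sure the output directory exists before writing to it
    os.makedirs('excel_list', exist_ok=True)
    # Loop over the images; N is the number of images
    for index in range(1, N + 1):
        # Run OCR on the image and convert the result to an Excel file
        ocr.img_to_excel(
            'excel_list/' + str(index),  # Excel path + file name (no extension)
            'pic_list/' + 'num' + str(index) + '.png',  # image path + file name
            secret_id=config['secret_id'],
            secret_key=config['secret_key'],
        )


if __name__ == '__main__':
    # The argument is the number of images; assume there are 2
    ocr_pic(2)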
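The naming convention used by the loop (image num1.png maps to Excel file 1, and so on) can be separated from the API call so that it is easy to check which files will be processed. Here is a minimal sketch under that convention; build_jobs and existing_jobs are hypothetical helper names, and the directory names default to the ones used above.

```python
from pathlib import Path


def build_jobs(count, pic_dir="pic_list", excel_dir="excel_list"):
    """Return (excel_path_without_extension, image_path) pairs for images 1..count."""
    jobs = []
    for index in range(1, count + 1):
        image_path = Path(pic_dir) / ("num%d.png" % index)
        output_stem = Path(excel_dir) / str(index)
        # as_posix keeps forward slashes, matching the paths used in this article
        jobs.append((output_stem.as_posix(), image_path.as_posix()))
    return jobs


def existing_jobs(jobs):
    """Keep only the jobs whose image file is actually present on disk."""
    return [(out, img) for out, img in jobs if Path(img).is_file()]
```

Each resulting pair could then be passed to ocr.img_to_excel(out, img, secret_id=..., secret_key=...), which skips missing images instead of failing partway through the batch.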
4. Results