Abstract
Market research shows that, because India has many diverse local languages, a large number of users in the Indian market still have a poor mobile phone experience owing to inadequate local-language support. Chinese mobile phone manufacturers urgently need to localize language translation for the Indian market, which represents a significant opportunity for them. Meanwhile, as artificial intelligence continues to advance in the field of image recognition, image text recognition technology can be used to address the language translation problem. In image text recognition, the measured recognition accuracy depends on the test data set, and the more accurate the text extraction, the more accurate the translation; it is therefore particularly important to build the test set from users' real usage scenarios. This paper presents a method for constructing data sets for mobile phone image text recognition, a model for automatically computing text recognition accuracy, and an automated implementation of that model.
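As an illustration of what automatically computing text recognition accuracy can look like, here is a minimal Python sketch. It is not the evaluation model from the paper, whose details are not given in this abstract. It assumes accuracy is defined as one minus the character-level edit distance divided by the ground-truth length, and the function names `edit_distance`, `char_accuracy`, and `evaluate` are hypothetical.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete from ref
                            curr[j - 1] + 1,      # insert into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str) -> float:
    """Assumed metric: 1 - edit_distance / len(ref), clamped at 0."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def evaluate(samples) -> float:
    """Aggregate accuracy over (ground_truth, ocr_output) pairs."""
    total_chars = 0
    total_errors = 0
    for ref, hyp in samples:
        total_chars += len(ref)
        total_errors += edit_distance(ref, hyp)
    if total_chars == 0:
        return 0.0
    return max(0.0, 1.0 - total_errors / total_chars)
```

In this sketch, `evaluate` would be fed pairs of labeled ground-truth text and the OCR output for each test image. Python strings are compared by code point here; for Indic scripts, where one visible character often spans several code points, comparing by grapheme cluster may be a better fit.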
Author
CAO Huijing (曹慧静)
Transsion Holdings Technology Company, Shanghai 202106, China
Source
Modern Information Technology (《现代信息科技》)
2022, Issue 12, pp. 11-13, 19 (4 pages in total)
Keywords
image recognition
test set
automation
OCR evaluation model