English is supported by default. To run OCR on other languages, you need an extra language pack for the OCR engine, and the engine's data path must point to the directory that holds that pack. Download and unpack one of the packages below, open GoldenDict's Preferences dialog, switch to the OCR Popup page, select an OCR engine, set its data path with the button next to the engine list, and then select the language(s) the engine should recognize.
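Screen-capture and OCR plugins are stored in the gdp folder under the program directory (the files whose names start with gdp.gsc and gdp.ocr). The default data used by the Tesseract engine is stored in the tessdata folder under the program directory, and the default data used by the Nicomsoft engine is stored in the nsocr folder under the program directory. The capture and OCR engines and their data are loaded on demand and are not required for GoldenDict++ to run: when OCR Popup is disabled, the program does not load any of the related modules, so they take up no extra memory or other hardware resources.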
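Because plugins are recognized by their file name prefix, a short script can show which capture and OCR plugins are present in an installation. This is a minimal sketch, not part of GoldenDict++: the function name `list_plugins` and the install path passed to it are assumptions made for illustration; only the gdp folder and the gdp.gsc / gdp.ocr prefixes come from the text above.

```python
from pathlib import Path


def list_plugins(install_dir):
    """Group the plugin files found in <install_dir>/gdp by kind.

    'gsc' holds screen-capture plugins (gdp.gsc.*),
    'ocr' holds OCR engine plugins (gdp.ocr.*).
    """
    gdp = Path(install_dir) / "gdp"
    found = {"gsc": [], "ocr": []}
    if not gdp.is_dir():
        return found  # no plugin folder: nothing installed
    for entry in sorted(gdp.iterdir()):
        if not entry.is_file():
            continue
        for kind in found:
            if entry.name.startswith(f"gdp.{kind}."):
                found[kind].append(entry.name)
    return found


# Example call (hypothetical path; adjust to your own installation):
# print(list_plugins(r"C:\Tools\GoldenDict"))
```

An empty result for both kinds simply means no plugin files were found, which is consistent with the plugins being optional components.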
Name | File Name | Platform | Ratings | Remarks |
---|---|---|---|---|
MacVision | gdp.ocr.macvision.* | macOS 10.15~13.5 | ***** | Apple’s Vision framework, preferred and recommended on macOS Big Sur or Monterey |
WinRT Ocr | gdp.ocr.winrtocr.* | Windows 8/10/11 | ***** | Windows.Media.Ocr, preferred and recommended on Windows |
Tesseract | gdp.ocr.tesseract.* | All | ***** | With the power of Tesseract, hundreds of languages are supported. Preferred and recommended |
WeChatOCR | gdp.ocr.wechatocr.* | Windows x64 | **** | Automatically installed with WeChat x64 |
Youdao OCR | gdp.ocr.youdao.* | All | ***** | ai.youdao.com/DOCSIRMA/html/ocr |
Baidu OCR | gdp.ocr.baidu.* | All | ***** | ai.baidu.com/ai-doc/OCR |
Tencent OCR | gdp.ocr.tencent.* | All | ***** | cloud.tencent.com/document/product/866 |
Google OCR | gdp.ocr.google.* | All | *** | Not tested; developers.google.cn/codelabs |
Nicomsoft | gdp.ocr.nicomsoft.* | Windows | *** | Nicomsoft OCR is no longer officially maintained or updated |
winmask | gdp.gsc.winmask.* | Windows/Linux | ***** | Screen grabber that supports dynamic capture on multi-screen setups. Preferred on Windows and Linux |
fromcliboard | gdp.gsc.fromcliboard.* | All | ***** | Screen grabber that uses external tools. Preferred on macOS and Linux |
qtcamera | gdp.gsc.qtcamera.* | All | *** | Camera image capture using QCamera. |
The languages supported by the MacVision engine ship with recent versions of Apple's macOS or iOS (built in; no extra installation is needed).
The languages supported by the WinRT Ocr engine are provided by Windows; additional languages can be installed from the system Settings.
By default, GD++ ships with tessdata language data for English, Chinese Simplified, and Chinese Traditional.
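To check which languages are currently available to the Tesseract engine, you can list the data files in the tessdata folder. The sketch below assumes the folder uses the standard Tesseract naming, one `<code>.traineddata` file per language (for example `eng`, `chi_sim`, `chi_tra`); the function name `installed_languages` and the example path are hypothetical.

```python
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the sorted language codes found in a tessdata folder.

    Assumes standard Tesseract naming: <code>.traineddata.
    """
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


# Example call (hypothetical path; adjust to your own installation):
# print(installed_languages(r"C:\Tools\GoldenDict\tessdata"))
```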
Follow these steps to install additional OCR languages:
- Download the data pack for the OCR language you need.
- Open the downloaded ".zip" file with 7-Zip or similar decompression software.
- Drag all files contained in the zip file into the tessdata folder under the GD++ installation directory.
- In GD++, re-select the recognition language(s) for the engine.
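The unpacking step above can also be scripted. The following is a minimal sketch using Python's standard `zipfile` module; it copies every file in the archive into the tessdata folder, ignoring any sub-folders inside the zip. The function name `install_language_pack` and both example paths are assumptions for illustration, and the languages still have to be re-selected in GD++ afterwards.

```python
import shutil
import zipfile
from pathlib import Path


def install_language_pack(zip_path, tessdata_dir):
    """Copy every file from a language-pack zip into the tessdata folder.

    Folder structure inside the archive is flattened, which also keeps
    entries from being written outside tessdata_dir.
    Returns the sorted list of file names that were written.
    """
    tessdata = Path(tessdata_dir)
    tessdata.mkdir(parents=True, exist_ok=True)
    installed = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            name = Path(info.filename).name  # drop any folder prefix
            if not name:
                continue
            with archive.open(info) as src, open(tessdata / name, "wb") as dst:
                shutil.copyfileobj(src, dst)
            installed.append(name)
    return sorted(installed)


# Example call (hypothetical paths; adjust to your own download and install):
# install_language_pack("fra.zip", r"C:\Tools\GoldenDict\tessdata")
```

Existing files with the same name are overwritten, which matches what dragging the files into the folder and confirming the replacement would do.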
The full tessdata data pack supports the following OCR languages:
Chinese Simplified, Chinese Simplified (vertical), Chinese Traditional, Chinese Traditional (vertical), Afrikaans, Irish, Norwegian, Amharic, Galician, Occitan (post 1500), Arabic, Ancient Greek (to 1453), Oriya, Assamese, Gujarati, Panjabi/Punjabi, Azerbaijani, Haitian/Haitian Creole, Polish, Azerbaijani-Cyrillic, Hebrew, Portuguese, Belarusian, Hindi, Pushto/Pashto, Bengali, Croatian, Quechua, Tibetan, Hungarian, Romanian/Moldavian/Moldovan, Bosnian, Armenian, Russian, Breton, Inuktitut, Sanskrit, Bulgarian, Indonesian, Sinhala/Sinhalese, Catalan/Valencian, Icelandic, Slovak, Cebuano, Italian, Slovak-Fraktur, Czech, Italian-Old, Slovenian, Javanese, Sindhi, Japanese (vertical), Spanish/Castilian, Japanese, Spanish/Castilian-Old, Kannada, Albanian, Cherokee, Georgian, Serbian, Corsican, Georgian-Old, Serbian-Latin, Welsh, Kazakh, Sundanese, Danish, Central Khmer, Swahili, Danish-Fraktur, Kirghiz/Kyrgyz, Swedish, German, Kurmanji (Kurdish-Latin script), Syriac, German-Fraktur, Korean, Tamil, Dhivehi/Divehi/Maldivian, Korean (vertical), Tatar, Dzongkha, Kurdish (Arabic script), Telugu, Modern Greek (1453-), Tajik, English, Lao, Tagalog, Middle English (1100-1500), Latin, Thai, Esperanto, Latvian, Tigrinya, Math and equations, Lithuanian, Tonga, Estonian, Luxembourgish, Turkish, Basque, Malayalam, Uighur/Uyghur, Faroese, Marathi, Ukrainian, Persian, Macedonian, Urdu, Filipino/Pilipino, Maltese, Uzbek, Finnish, Mongolian, Uzbek-Cyrillic, French, Maori, Vietnamese, Malay, Yiddish, Middle French (ca. 1400-1600), Burmese, Yoruba, Western Frisian, Nepali, Dutch/Flemish, Scottish Gaelic/Gaelic
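The Nicomsoft engine's preset configuration parameters are stored in the Config.dat file in the nsocr directory and can be edited with a text editor. For details on the parameters, see the official FAQ and help documentation.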
Chinese Simplified, Chinese Traditional, English, Estonian, Bulgarian, Hungarian, Slovak, Finnish, Catalan, Indonesian, Slovenian, French, Croatian, Italian, Spanish, German, Czech, Latvian, Swedish, Romanian, Danish, Lithuanian, Turkish, Russian, Dutch, Norwegian, Arabic, Japanese, Polish, Portuguese, Korean
To develop your own screen-capture and OCR engines, see the article on the GoldenDict++ plugin interface definition.