Ocrdesktop

This article or section needs expansion.

Reason: This wiki page is a work in progress by chrys. (Discuss in Talk:Ocrdesktop#)

OCRdesktop is an accessibility tool that grabs content from the screen as text using OCR.

It takes an image of the current window or workspace, prepares it for better results and uses tesseract to recognize the text on it. The result is presented in a caret-enabled text area, in a detailed list with coordinates and confidence, or in the clipboard. It can also emulate clicks on the recognized text. It consists of two main parts:

1. The main window: a caret-browsable text area with the recognized content. At the bottom of the window are several buttons to perform mouse clicks, change views and do other useful things.

2. The macro executor: a window where you can choose to Run or Delete the currently stored macros and preclicks. You can also skip running a macro by pressing the Cancel button (see #Macros and the preclick concept).

Installation

Install the ocrdesktop package from the AUR. Make sure that you have the corresponding tesseract-data-<languagecode> packages for your language installed.
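
To check whether tesseract can see the data for your language, you can query the list of installed languages. The following is a minimal sketch: the helper name has_tesseract_lang and the TESSERACT variable are made up for this example, and the only thing relied on is the --list-langs option of tesseract.

```shell
# Sketch only: has_tesseract_lang and TESSERACT are names invented for this
# example; TESSERACT just allows pointing the helper at another binary.
has_tesseract_lang() {
    # $1: tesseract language code, e.g. eng or deu
    # Some tesseract releases print the language list on stderr, hence 2>&1.
    langs=$("${TESSERACT:-tesseract}" --list-langs 2>&1) || return 2
    # Succeed only if one whole line equals the requested code.
    printf '%s\n' "$langs" | grep -qxF -- "$1"
}
```

For example, `has_tesseract_lang deu || echo "language data for deu is missing"` prints a hint when the German data is not installed.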

Setup

Assign the command ocrdesktop to a shortcut in your desktop environment. You can also add parameters to the command to extend what OCRdesktop does.

This works in basically any desktop environment.

In GNOME you can do this via the GNOME Control Center in the Keyboard window under the Shortcuts tab.

Using

Just press the assigned shortcut. With no parameters, OCRdesktop recognizes only the current window and presents the result in a caret-enabled text area.

Help

To see a short help text and the available parameters, enter the following in a terminal:

$ ocrdesktop -h

You can always mix different parameters.

Recognize current workspace

If you do not want to restrict recognition to the current window, use the -d option.

$ ocrdesktop -d

Emulate mouse events

The main window has buttons which emulate clicks on the word at the current cursor position.

You can emulate the following mouse operations:

  • Single left click (Alt+l): commonly used for selecting or activating entries
  • Double left click (Alt+d): commonly used for opening entries in the same window
  • Single right click (Alt+r): opens the context menu for the object under the mouse
  • Single middle click (Alt+m): usually opens an object in a new tab
  • Route the mouse over an object (Alt+t): used for mouse-over events like tooltips

To perform a mouse operation immediately, place the cursor on the word in the text area or on the list entry (in the list view) and press the corresponding shortcut or the button at the bottom of the main window.

Macros and the preclick concept

The concept of preclicks is not easy to grasp at first, but it solves a problem that is easy to understand.

In most desktop environments, global shortcuts do not work while a menu is open (for example the File menu in the menu bar at the top of most programs). This means you cannot start OCRdesktop with its shortcut to read an open menu.

Preclicks are basically macros that run before OCRdesktop takes its screenshot. This allows you to close all menus and let OCRdesktop click on the menu before it recognizes the window.

Preclick macros are easy to use. The main window has a check box Save as Preclick (Alt+v). Set this check box and choose the mouse click that should be performed before OCRdesktop starts the next time, just as you would for a normal mouse click emulation (see #Emulate mouse events). After you press the button that would normally perform the click, nothing visible happens; the click is stored instead. The next time you run OCRdesktop, the macro window asks you what to do:

  • Run: all stored clicks are executed, and OCRdesktop then takes its screenshot for OCR (with the menu opened).
  • Delete: the macro is erased and cannot be recovered.
  • Cancel: no mouse clicks are performed, but the main window opens. The macro is not deleted, and you will be asked again the next time you start OCRdesktop whether you want to run your stored clicks.

If you check Save as Preclick again after running the macro, the next click is stored as well (e.g. for opening a submenu). You can save as many mouse operations as you want.

Tip: By default, the preclick macro file is stored under /tmp/MyOCRMacro.osm. If you want to run a macro more often, you can copy it to another location before deleting it.
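
Copying the macro file can be wrapped in a small shell helper. This is only a sketch: the function name save_macro and the target directory MACRO_DIR are assumptions made for this example, and only the source path /tmp/MyOCRMacro.osm comes from the tip above.

```shell
# Sketch only: save_macro and MACRO_DIR are names chosen for this example.
# MACRO_SRC defaults to the location where OCRdesktop stores the preclick macro.
MACRO_SRC="${MACRO_SRC:-/tmp/MyOCRMacro.osm}"
MACRO_DIR="${MACRO_DIR:-$HOME/.local/share/ocrdesktop-macros}"

save_macro() {
    # $1: name for the saved macro, without the .osm extension
    if [ -z "$1" ]; then
        echo "usage: save_macro <name>" >&2
        return 2
    fi
    if [ ! -f "$MACRO_SRC" ]; then
        echo "no macro found at $MACRO_SRC" >&2
        return 1
    fi
    # Create the target directory if needed, then copy the macro under its new name.
    mkdir -p "$MACRO_DIR" && cp -- "$MACRO_SRC" "$MACRO_DIR/$1.osm"
}
```

After recording a preclick, `save_macro file-menu` would keep a copy as file-menu.osm in MACRO_DIR, so that deleting the macro in the macro window does not lose it.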

You can execute an existing macro file stored anywhere on the hard disk by using the -m </path/to/macro/macroname.osm> option.

$ ocrdesktop -m </path/to/macro/macroname.osm>
Tip: You can combine it with the -n option. OCRdesktop will just start the click sequence for you without any GUI.
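
If you run stored macros often, the -m and -n options can be wrapped in a helper that first checks that the macro file exists. This is a hedged sketch: run_macro and the OCRDESKTOP variable are invented for this example; only the -m and -n options come from OCRdesktop itself.

```shell
# Sketch only: run_macro is a helper name invented for this example, and
# OCRDESKTOP exists only so that a different binary can be substituted.
run_macro() {
    # $1: path to a stored .osm macro file
    if [ ! -f "$1" ]; then
        echo "macro file not found: $1" >&2
        return 1
    fi
    # -m runs the given macro file, -n suppresses the GUI.
    "${OCRDESKTOP:-ocrdesktop}" -m "$1" -n
}
```

A call such as `run_macro </path/to/macro/macroname.osm>` could then be assigned to its own shortcut.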

OCR to clipboard

OCRdesktop can send the recognized content to the clipboard.

This is easily done by specifying the -c option.

$ ocrdesktop -c

This opens the main window and sends the content to the clipboard. If you do not want to open the main window, add the "no GUI" option -n.

$ ocrdesktop -c -n

or

$ ocrdesktop -cn

Now you have the recognized text in the clipboard and no window appears.