How To Install ocrmypdf on Ubuntu 22.04
Introduction
In this tutorial we learn how to install ocrmypdf on Ubuntu 22.04.
What is ocrmypdf
OCRmyPDF adds an OCR text layer to a regular PDF that contains only images, producing a searchable PDF/A file.
It uses the Tesseract OCR engine and so supports all the languages that Tesseract does.
Some other main features:
- Places OCR text accurately below the image to ease copy / paste
- Keeps the exact resolution of the original embedded images
- When possible, inserts OCR information as a lossless operation without rendering vector information
- Keeps file size about the same
- If requested, deskews and/or cleans the image before performing OCR
- Validates input and output files
- Provides debug mode to enable easy verification of the OCR results
- Processes pages in parallel when more than one CPU core is available
- Battle-tested on thousands of PDFs, with a test suite and continuous integration
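Once the package is installed, the features listed above map onto command-line options. The following is a minimal sketch rather than a recommended configuration: the file names scan.pdf and scan_ocr.pdf and the helper run_ocr are hypothetical, and the options shown (--deskew, -l, --jobs) are only a small subset of what ocrmypdf accepts.

```shell
#!/bin/sh
# Hypothetical file names; replace them with your own.
input="scan.pdf"
output="scan_ocr.pdf"

run_ocr() {
    # --deskew straightens skewed pages, -l picks the Tesseract language,
    # --jobs caps the number of pages processed in parallel.
    ocrmypdf --deskew -l eng --jobs 2 "$1" "$2"
}

# Only run when the tool and the input file are both present.
if command -v ocrmypdf >/dev/null 2>&1 && [ -f "$input" ]; then
    run_ocr "$input" "$output"
else
    echo "skipped: ocrmypdf or $input not found"
fi
```

The guard keeps the script harmless on a machine where ocrmypdf is not installed yet.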
There are three ways to install ocrmypdf on Ubuntu 22.04: apt-get, apt, and aptitude. The following sections describe each method; you only need to follow one of them.
Install ocrmypdf Using apt-get
Update the apt database with apt-get using the following command.
sudo apt-get update
After updating the apt database, we can install ocrmypdf using apt-get by running the following command:
sudo apt-get -y install ocrmypdf
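After the install finishes, it can be worth confirming that the command is actually on the PATH. The sketch below assumes a POSIX shell; check_installed is a hypothetical helper name, not something shipped with the package.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
check_installed() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 is installed"
        return 0
    fi
    echo "$1 is not installed"
    return 1
}

# Print the ocrmypdf version only when the command exists.
if check_installed ocrmypdf; then
    ocrmypdf --version
fi
```

The same check works regardless of which of the three package tools was used for the install.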
Install ocrmypdf Using apt
Update the apt database with apt using the following command.
sudo apt update
After updating the apt database, we can install ocrmypdf using apt by running the following command:
sudo apt -y install ocrmypdf
Install ocrmypdf Using aptitude
If you want to follow this method, you might need to install aptitude first, since aptitude is usually not installed by default on Ubuntu. Update the apt database with aptitude using the following command.
sudo aptitude update
After updating the apt database, we can install ocrmypdf using aptitude by running the following command:
sudo aptitude -y install ocrmypdf
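If the aptitude commands above fail with "command not found", aptitude itself has to be installed first. One possible way to do that is sketched below; ensure_tool is a hypothetical helper, and it assumes the apt package and the command it provides share the same name, which is the case for aptitude.

```shell
#!/bin/sh
# Install an apt package only when its command is missing.
# Assumes the package name equals the command name.
ensure_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 already installed"
        return 0
    fi
    sudo apt-get update && sudo apt-get -y install "$1"
}

# Usage: ensure_tool aptitude
```

Calling ensure_tool aptitude before the aptitude commands makes this method work on a fresh system.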
How To Uninstall ocrmypdf on Ubuntu 22.04
To uninstall only the ocrmypdf package, we can use the following command:
sudo apt-get remove ocrmypdf
Uninstall ocrmypdf And Its Dependencies
To uninstall ocrmypdf together with any dependencies that are no longer needed by other packages on Ubuntu 22.04, we can use the command below:
sudo apt-get -y autoremove ocrmypdf
Remove ocrmypdf Configurations and Data
To remove ocrmypdf configuration and data from Ubuntu 22.04, we can use the following command:
sudo apt-get -y purge ocrmypdf
Remove ocrmypdf configuration, data, and all of its dependencies
To remove ocrmypdf configuration, data, and all of its unused dependencies, we can use the following command:
sudo apt-get -y autoremove --purge ocrmypdf
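To see which of the removal commands above actually took effect, the package status recorded by dpkg can be queried. This is a sketch under the assumption that dpkg-query is available (it is part of dpkg on Ubuntu); pkg_state is a hypothetical helper name. A package that was removed but not purged is typically reported with a config-files status.

```shell
#!/bin/sh
# Summarise the dpkg status of a package in plain words.
pkg_state() {
    # An unknown package makes dpkg-query fail; treat that as empty status.
    status=$(dpkg-query -W -f='${Status}' "$1" 2>/dev/null) || status=""
    case "$status" in
        "install ok installed") echo "installed" ;;
        *config-files*)         echo "removed, configuration files remain" ;;
        *)                      echo "not installed" ;;
    esac
}

echo "ocrmypdf: $(pkg_state ocrmypdf)"
```

After a purge, the helper should report "not installed" for ocrmypdf.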
Summary
In this tutorial we learned how to install the ocrmypdf package on Ubuntu 22.04 using different package management tools: apt, apt-get, and aptitude.