How To Install ocrmypdf on Ubuntu 22.04

In this tutorial we learn how to install ocrmypdf on Ubuntu 22.04. ocrmypdf adds an OCR text layer to PDF files.

What is ocrmypdf

OCRmyPDF generates a searchable PDF/A file from a regular PDF that contains only images, so that the text on its pages can be searched.

It uses the Tesseract OCR engine and so supports all the languages that Tesseract does.
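
Because language support comes from Tesseract, adding a language usually means installing the matching Tesseract data package. The sketch below assumes Ubuntu's tesseract-ocr-<code> package naming; the helper name tess_lang_pkg and the file names input.pdf and output.pdf are placeholders, not part of ocrmypdf.

```shell
# Hypothetical helper (not part of ocrmypdf or Tesseract): turn a Tesseract
# language code into the Ubuntu package name, assuming the
# tesseract-ocr-<code> naming scheme with "_" replaced by "-".
tess_lang_pkg() {
    printf 'tesseract-ocr-%s\n' "$(printf '%s' "$1" | tr '_' '-')"
}

# Show the languages Tesseract can currently use, if it is installed.
if command -v tesseract >/dev/null 2>&1; then
    tesseract --list-langs
fi

# Example, left commented out because it changes the system:
#   sudo apt-get -y install "$(tess_lang_pkg deu)"   # German language data
#   ocrmypdf -l deu input.pdf output.pdf             # OCR using German
```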

Some other main features:

  • Places OCR text accurately below the image to ease copy / paste
  • Keeps the exact resolution of the original embedded images
  • When possible, inserts OCR information as a lossless operation without rendering vector information
  • Keeps file size about the same
  • If requested, deskews and/or cleans the image before performing OCR
  • Validates input and output files
  • Provides debug mode to enable easy verification of the OCR results
  • Processes pages in parallel when more than one CPU core is available
  • Battle-tested on thousands of PDFs, with a test suite and continuous integration
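
Several of these features map onto command-line options. As a rough sketch, the wrapper below (the name ocr_scan and the file names are placeholders) asks for deskewing and one worker per CPU core through the --deskew and --jobs options. Cleaning with --clean is only mentioned in a comment because it additionally needs the unpaper program.

```shell
# Hypothetical wrapper around ocrmypdf; $1 is the scanned PDF, $2 the result.
ocr_scan() {
    # --deskew straightens crooked pages before OCR.
    # --jobs sets how many pages are processed in parallel; nproc reports
    # the number of available CPU cores.
    # Adding --clean would also clean pages before OCR (requires unpaper).
    ocrmypdf --deskew --jobs "$(nproc)" "$1" "$2"
}

# Example (assumes scan.pdf exists in the current directory):
#   ocr_scan scan.pdf scan_ocr.pdf
```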

There are three ways to install ocrmypdf on Ubuntu 22.04: apt-get, apt, and aptitude. The following sections describe each method; choose any one of them.

Install ocrmypdf Using apt-get

Update the apt package database with apt-get using the following command:

sudo apt-get update

After updating the apt database, we can install ocrmypdf using apt-get by running the following command:

sudo apt-get -y install ocrmypdf

Install ocrmypdf Using apt

Update the apt package database with apt using the following command:

sudo apt update

After updating the apt database, we can install ocrmypdf using apt by running the following command:

sudo apt -y install ocrmypdf

Install ocrmypdf Using aptitude

To follow this method, you may need to install aptitude first, since it is usually not installed by default on Ubuntu. Then update the apt package database with aptitude using the following command:

sudo aptitude update
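
If the command above fails because aptitude is not installed, it can be installed with apt-get first. A minimal sketch, using a hypothetical helper named ensure_cmd that installs a package only when the given command is missing:

```shell
# Hypothetical helper: install package $2 with apt-get only if command $1
# cannot be found on the PATH.
ensure_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        sudo apt-get -y install "$2"
    fi
}

# Example: make sure aptitude is available, then refresh the package lists.
#   ensure_cmd aptitude aptitude
#   sudo aptitude update
```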

After updating the apt database, we can install ocrmypdf using aptitude by running the following command:

sudo aptitude -y install ocrmypdf
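
Whichever method was used, a quick check confirms that the program is on the PATH. The sketch below uses a hypothetical helper named is_installed; scan.pdf and scan_ocr.pdf are placeholder file names.

```shell
# Hypothetical helper: succeed if the named command can be found on the PATH.
is_installed() {
    command -v "$1" >/dev/null 2>&1
}

if is_installed ocrmypdf; then
    ocrmypdf --version          # print the installed version
else
    echo "ocrmypdf was not found on the PATH"
fi

# A first run, left commented out because it needs a real input file:
#   ocrmypdf scan.pdf scan_ocr.pdf
```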

How To Uninstall ocrmypdf on Ubuntu 22.04

To uninstall only the ocrmypdf package we can use the following command:

sudo apt-get remove ocrmypdf

Uninstall ocrmypdf And Its Dependencies

To uninstall ocrmypdf along with any dependencies that are no longer needed on Ubuntu 22.04, we can use the command below:

sudo apt-get -y autoremove ocrmypdf

Remove ocrmypdf Configurations and Data

To remove ocrmypdf configuration and data from Ubuntu 22.04 we can use the following command:

sudo apt-get -y purge ocrmypdf
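
The difference between remove and purge can be observed with dpkg-query, which reports the state dpkg records for a package. A small sketch follows; pkg_state is a hypothetical helper, and the DPKG_QUERY variable exists only so the command can be swapped out for testing. Whether ocrmypdf actually leaves configuration files behind after a plain remove is not something this tutorial establishes, so the output may be the same in both cases.

```shell
# Hypothetical helper: print dpkg's recorded status for package $1.
pkg_state() {
    # Typical values are "install ok installed", or
    # "deinstall ok config-files" when only configuration files remain.
    # Falls back to "not-known" when dpkg-query reports an error, for
    # example because dpkg has no record of the package.
    "${DPKG_QUERY:-dpkg-query}" -W -f='${Status}\n' "$1" 2>/dev/null || echo "not-known"
}

pkg_state ocrmypdf
```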

Remove ocrmypdf Configurations, Data, and All of Its Dependencies

To remove ocrmypdf configurations, data, and all of its dependencies, we can use the following command:

sudo apt-get -y autoremove --purge ocrmypdf
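
Since autoremove can take other packages with it, it may be worth previewing the result first. The -s (simulate) option of apt-get prints what would happen without changing anything. A minimal sketch; preview_removal is a hypothetical helper, and the APT variable is only there so the command can be replaced when testing:

```shell
# Hypothetical helper: show what removing package $1 (with its unused
# dependencies and configuration) would do, without modifying the system.
preview_removal() {
    # -s asks apt-get to simulate the operation only.
    "${APT:-apt-get}" -s autoremove --purge "$1"
}

# Example:
#   preview_removal ocrmypdf
```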

Summary

In this tutorial we learned how to install the ocrmypdf package on Ubuntu 22.04 using three package management tools (apt-get, apt, and aptitude), and how to uninstall it again.