How To Install pdfsandwich on Ubuntu 22.04

In this tutorial we learn how to install pdfsandwich on Ubuntu 22.04. pdfsandwich is a tool that generates sandwich OCR PDF files.

What is pdfsandwich

pdfsandwich generates “sandwich” OCR PDF files: PDF files that contain only images (no text) are processed by optical character recognition (OCR), and the recognized text is added to each page invisibly “behind” the images. It is a command-line tool intended for running OCR on scanned books or journals.

It is able to recognize the page layout even for multicolumn text.
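
To make the workflow above concrete, here is a minimal sketch of running pdfsandwich on a scanned file once the package is installed. The file name scanned.pdf and the function name ocr_pdf are placeholders chosen for illustration. The -lang and -o options are assumed to select the OCR language and the output file name as described in the pdfsandwich manual page, so it is worth checking man pdfsandwich on your system.

```shell
# Hypothetical input file; replace with the path to a real scanned PDF.
input="scanned.pdf"

# Run OCR on one PDF and write the result next to it as <name>_ocr.pdf.
ocr_pdf() {
    if ! command -v pdfsandwich >/dev/null 2>&1; then
        echo "pdfsandwich is not installed"
    elif [ ! -f "$1" ]; then
        echo "input file not found: $1"
    else
        # Assumed options: -lang sets the OCR language, -o the output file.
        pdfsandwich -lang eng -o "${1%.pdf}_ocr.pdf" "$1"
    fi
}

ocr_pdf "$input"
```

The two guard branches only print a message, so the sketch is safe to run before the package is installed or when the input file does not exist.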

Essentially, pdfsandwich is a wrapper script that calls the following binaries: convert, unpaper, gs (only for PDF resizing), hocr2pdf (for tesseract < 3.03), and tesseract.
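
Because pdfsandwich depends on these external binaries, it can help to see which of them are present on the system. The sketch below is one way to do that; the function name missing_tools is an illustrative choice, and the check only looks at PATH, not at versions.

```shell
# Print the name of every given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Helper binaries named in the package description.
missing_tools convert unpaper gs hocr2pdf tesseract
```

An empty result means all five were found. A missing hocr2pdf is not necessarily a problem, since the description says it is only used with tesseract versions older than 3.03.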

There are three ways to install pdfsandwich on Ubuntu 22.04: apt-get, apt, and aptitude. The following sections describe each method; choose whichever you prefer.

Install pdfsandwich Using apt-get

Update the apt database with apt-get using the following command.

sudo apt-get update

After updating the apt database, we can install pdfsandwich using apt-get by running the following command:

sudo apt-get -y install pdfsandwich

Install pdfsandwich Using apt

Update the apt database with apt using the following command.

sudo apt update

After updating the apt database, we can install pdfsandwich using apt by running the following command:

sudo apt -y install pdfsandwich

Install pdfsandwich Using aptitude

aptitude is usually not installed by default on Ubuntu, so you might need to install it first if you want to follow this method. Update the apt database with aptitude using the following command.

sudo aptitude update
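
If the command above fails because aptitude is not found, a guard like the following can install it first. This is a sketch rather than a required step; the function name ensure_aptitude is hypothetical, and the install uses the same apt-get commands shown earlier in this tutorial.

```shell
# Install aptitude through apt-get only when it is not already on PATH.
ensure_aptitude() {
    if command -v aptitude >/dev/null 2>&1; then
        echo "aptitude is already installed"
    else
        sudo apt-get update && sudo apt-get -y install aptitude
    fi
}

# Example call (commented out because it may install a package):
# ensure_aptitude && sudo aptitude update
```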

After updating the apt database, we can install pdfsandwich using aptitude by running the following command:

sudo aptitude -y install pdfsandwich
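
Whichever of the three methods was used, a quick check that the pdfsandwich command is now available can confirm the installation. The sketch below only inspects PATH; the function name report_install is an illustrative choice.

```shell
# Report whether a command is available and where it was found.
report_install() {
    if location=$(command -v "$1" 2>/dev/null); then
        printf '%s found at %s\n' "$1" "$location"
    else
        printf '%s not found on PATH\n' "$1"
    fi
}

report_install pdfsandwich
```

For package-level detail, dpkg -s pdfsandwich prints the installed status and version as recorded by the package manager.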

How To Uninstall pdfsandwich on Ubuntu 22.04

To uninstall only the pdfsandwich package we can use the following command:

sudo apt-get remove pdfsandwich

Uninstall pdfsandwich And Its Dependencies

To uninstall pdfsandwich along with any dependencies that are no longer needed on the system, we can use the command below:

sudo apt-get -y autoremove pdfsandwich

Remove pdfsandwich Configurations and Data

To remove pdfsandwich configuration and data from Ubuntu 22.04 we can use the following command:

sudo apt-get -y purge pdfsandwich

Remove pdfsandwich Configurations, Data, and All of Its Dependencies

To remove pdfsandwich configurations, data, and all of its dependencies, we can use the following command:

sudo apt-get -y autoremove --purge pdfsandwich
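
Since the command above removes packages without asking for confirmation, it can be useful to preview its effect first. The sketch below relies on the -s (simulate) option of apt-get, which prints what would happen without changing the system and does not need root privileges. The function name preview_removal is hypothetical.

```shell
# Show what "autoremove --purge" would do for a package without doing it.
preview_removal() {
    if command -v apt-get >/dev/null 2>&1; then
        # -s simulates the operation; nothing is removed.
        apt-get -s autoremove --purge "$1"
    else
        echo "apt-get is not available on this system"
    fi
}

# Example call:
# preview_removal pdfsandwich
```

Reviewing the simulated list of packages before running the real command helps avoid removing dependencies that are still wanted.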

Summary

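In this tutorial we learned how to install the pdfsandwich package on Ubuntu 22.04 using different package management tools: apt, apt-get, and aptitude. We also covered how to uninstall it, with or without its configuration, data, and dependencies.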