PDF files are an easy and secure way to post and share documents. When you want to preserve the formatting and details of a table, converting the document to a PDF file is a good idea. Going the other way is harder: once a table is inside a PDF file, extracting it is quite difficult.
Of course, you want to save time and avoid recreating the table from scratch. So how do you extract tables from a PDF without relying on copy and paste? Here are some of the ways to do it.
Using Online Converters
The fastest way is to use an online file converter to change the PDF file into another document format. Several free online tools convert PDF to Excel so that you can extract tabular data. Popular examples are Cometdocs and Smallpdf. However, most of these tools offer only basic extraction capabilities.
Disclaimer: We strongly recommend that you avoid online converters when dealing with PDF files that contain confidential or sensitive information.
Here’s how to extract tables from PDF files using Smallpdf:
Step 1: Visit the Smallpdf website.
Step 2: Select the conversion you want to perform (PDF to Excel).
Step 3: Drag the PDF file you want to convert into the converter.
Step 4: After the file has finished uploading, click Convert to Excel.
Step 5: Click Download to save the converted Excel file to your device.
Step 6: Open the downloaded Excel file to confirm that your table was converted correctly.
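The check in Step 6 can also be done programmatically. The Python sketch below is only an illustration and is not part of Smallpdf: it assumes you have saved the converted sheet as CSV, and the function name `find_ragged_rows` and the sample data are hypothetical. It flags rows whose column count differs from the header row, which often indicates that a cell was split or merged during conversion.

```python
import csv
import io


def find_ragged_rows(csv_text):
    """Return (row_number, column_count) for rows that differ from the header width."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    expected = len(rows[0])  # the header row defines the expected width
    return [
        (number, len(row))
        for number, row in enumerate(rows[1:], start=2)
        if len(row) != expected
    ]


# Hypothetical sample: the third line lost a column during conversion.
sample = "Item,Qty,Price\nPen,10,1.50\nNotebook,4\n"
print(find_ragged_rows(sample))
```

An empty result means every row has the same number of columns as the header. It does not prove that the cell values themselves are correct, so a visual spot check is still worthwhile.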
Extracting Tables Using Microsoft Power BI
Microsoft Power BI is another useful tool for extracting tabular data from PDF files. In particular, the Power Query feature in Power BI makes it easy to import a PDF file and extract the tabular data within the document.
However, this only applies to those with an Office 365 subscription. Alternatively, you will need to purchase a Power BI package separately.
The Power Query feature is also available in the Power BI free trial. Here’s how to use it to extract table data:
Step 1: Download, install and launch Microsoft Power BI.
Step 2: Select Get Data on the Home tab of Power BI Desktop.
Step 3: Click File and then select PDF.
Step 4: Click Connect to browse to the location of the PDF file on your computer.
Step 5: Select the PDF file to import into Power BI.
Step 6: Once the file is imported into Power BI, you should see a Navigator window listing table numbers and page numbers. Select the table you want to load.
Step 7: Select Load to create the table in Power BI.
Using Microsoft Excel
Like Power BI, Microsoft Excel includes Power Query, which can be used to load PDF files and extract tabular data. However, this feature is only available in Excel 2016 or newer. Here’s how it works:
Step 1: Start Microsoft Excel.
Step 2: Select Data on the Ribbon.
Step 3: Select Get Data to open the drop-down menu.
Step 4: Select From File and then From PDF.
Step 5: Select the PDF file to import into Excel.
Step 6: Once the file is imported into Excel, you should see a Navigator window with table numbers, page numbers, and a preview of the data within the PDF. Select the table you want to load.
Step 7: Select Load to create the table in Excel.
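A table that spans several pages of the PDF can appear in the preview from Step 6 as several entries, one per page. The Python sketch below shows one way to stitch such pieces back together after loading them. It is an illustration under assumptions, not a description of how Power Query works: each piece is assumed to be a list of rows whose first row is the header, and `stitch_tables` is a hypothetical helper.

```python
def stitch_tables(pieces):
    """Concatenate per-page tables, keeping the header row only once."""
    pieces = [piece for piece in pieces if piece]  # ignore empty pieces
    if not pieces:
        return []
    header = pieces[0][0]
    combined = [header]
    for piece in pieces:
        for row in piece:
            if row == header:
                continue  # skip any row identical to the header
            combined.append(row)
    return combined


# Hypothetical data: the same table split across two pages.
page_1 = [["Item", "Qty"], ["Pen", "10"]]
page_2 = [["Item", "Qty"], ["Notebook", "4"]]
print(stitch_tables([page_1, page_2]))
```

The helper drops every row that is identical to the header, so it is only suitable when a genuine data row can never equal the header row.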
If you are using Microsoft Office 2016 or a newer suite and your Data tab does not include the Power Query feature, here’s how to set it up:
Step 1: Visit the Microsoft website.
Step 2: Select the language of the add-in.
Step 3: Select Download.
Step 4: Select the version of the add-in you want to download.
Step 5: Select Next to start the download.
Step 6: After the file has finished downloading, run it to install the add-in.
Using Adobe Acrobat Pro DC
Adobe Acrobat Pro DC is another tool that can extract tabular data from PDF files or convert PDF files to other file types, such as Excel, for data analysis. The tool has a 7-day trial and a professional version, and it can be used online or downloaded to a device.
Adobe Acrobat Web Edition
Here’s how to use the web version to extract table data:
Step 1: Visit the Adobe Acrobat Pro DC web link.
Step 2: Select the Convert menu at the top of the page.
Step 3: Scroll down the page and find PDF to Excel.
Step 4: Drag and drop the PDF file you want to convert into the drop area that appears.
Step 5: After the file has finished uploading, select Export to XLSX.
Step 6: Select the Download icon at the top of the page to save the converted file to your device.
Adobe Acrobat on PC
Here’s how to use the PC version to extract table data:
Step 1: Launch Adobe Acrobat Pro DC.
Step 2: Select Open File in the application.
Step 3: Use the file browser to select the PDF file to open in the application.
Step 4: When the file opens, use the cursor to highlight the table data.
Step 5: Right-click the highlighted area and choose Export Selection As.
Step 6: Enter a file name in the save dialog and select the file type.
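If you choose a text format such as CSV in Step 6 (assuming it is among the file types offered), numbers arrive as text and may still carry thousands separators. The Python sketch below shows one way to clean them up for analysis. It is illustrative only: `to_number` and `load_numeric_rows` are hypothetical helpers, and the code assumes a comma as the thousands separator and a period as the decimal mark.

```python
import csv
import io


def to_number(cell):
    """Convert text such as '1,234.50' to float; return the text unchanged if not numeric."""
    cleaned = cell.strip().replace(",", "")  # assumes ',' is a thousands separator
    try:
        return float(cleaned)
    except ValueError:
        return cell


def load_numeric_rows(csv_text):
    """Parse CSV text and convert every numeric-looking cell to float."""
    reader = csv.reader(io.StringIO(csv_text))
    return [[to_number(cell) for cell in row] for row in reader]


# Hypothetical exported selection with a quoted, comma-grouped number.
exported = 'Item,Total\nPen,"1,234.50"\n'
print(load_numeric_rows(exported))
```

Files that use a comma as the decimal mark would need the replacement logic adjusted before this approach gives correct values.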
Comparing Two PDF Files
Extracting tables from PDF files is not an impossible task. Power Query in Excel and Power BI can also import file types other than PDF. Adobe Acrobat Pro DC, for its part, also lets you compare two PDF files side by side.