This video tutorial shows how to split a PDF into multiple PDFs with UiPath. The second part of the guide shows how to split PDFs with dynamic ranges (the page numbers are not known in advance). The use case also involves a lot of UiPath work with files and folders.
You could also watch:
🔵 Extract tables out of PDFs in UiPath - [ Link ]
🔵 Invoice PDF Extraction with Regex in UiPath - [ Link ]
0:00 Intro to the Use Case
We want to split a PDF file into multiple PDF files. Imagine a PDF with multiple invoices in it. The challenge is that we don't know the page span of each invoice in advance; the spans are dynamic (an invoice can be 1, 2 or more pages long). 📁 Download the files from the video: [ Link ]
1:27 Install the PDF Package in UiPath
We install the UiPath.PDF.Activities package (published by UiPath) so that the PDF activities become available.
1:49 Split a PDF into multiple one page PDFs
In the first part we split our PDF into multiple one-page PDFs. This solution only produces one-page PDFs, so we will have a problem if a document inside the PDF spans more than one page.
1:59 Get all filenames in a folder
We use a For Each and the .NET method Directory.GetFiles to get all files in a folder as strings, so we can work with them. Buy this book to learn all about VB.NET (the coding language in UiPath): [ Link ] (AFFILIATE). As a best practice, remember to create a variable for the folder instead of hardcoding the path in the activity. We also define the searchPattern, so that we only pick up certain file types.
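For readers who think in code, here is a rough Python sketch of the same idea (the video itself uses VB.NET inside UiPath). The function name list_files, the "*.pdf" pattern and the file names in the demo are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path


def list_files(folder: str, search_pattern: str = "*.pdf") -> list[str]:
    """Return the full paths of files in `folder` that match `search_pattern`."""
    return sorted(str(p) for p in Path(folder).glob(search_pattern) if p.is_file())


# Demo on a throwaway folder, so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as input_folder:  # hypothetical input folder
    for name in ("invoices.pdf", "notes.txt", "more_invoices.pdf"):
        (Path(input_folder) / name).touch()

    # The folder lives in a variable; only the pattern decides which files we see.
    pdf_names = [Path(p).name for p in list_files(input_folder)]
    print(pdf_names)
```

Keeping the folder in a variable and the pattern in a parameter mirrors the best practice from the video: nothing is hardcoded inside the loop.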
4:30 Get PDF Page Count
We use the Get PDF Page Count activity to get the total page count of our PDF. The count is stored as an integer.
5:44 Extract the PDF pages one by one
Using two page counters (one for the current page and one for the total page count) and a While loop, we can iterate through the entire PDF. Use an Extract PDF Page Range activity and remember to add one to your current page counter on every iteration. We use another .NET method, Path.GetFileNameWithoutExtension, to get the file name without its extension. A good idea is to add a unique ID to the extracted file name.
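The loop logic can be sketched in Python as follows. This is only an illustration of the counters and the file naming, not the UiPath workflow itself: no PDF is actually read, and the function name plan_single_page_split, the paths and the naming scheme are all assumptions.

```python
import uuid
from pathlib import Path


def plan_single_page_split(pdf_path: str, total_pages: int, output_folder: str) -> list[tuple[str, str]]:
    """Return one (page_range, output_path) pair per page of the PDF."""
    base_name = Path(pdf_path).stem  # file name without extension
    plan = []
    current_page = 1
    while current_page <= total_pages:
        unique_id = uuid.uuid4().hex[:8]  # keeps output names from colliding
        output_path = str(Path(output_folder) / f"{base_name}_{current_page}_{unique_id}.pdf")
        plan.append((str(current_page), output_path))
        current_page += 1  # without this increment the loop never ends
    return plan


# Hypothetical input: a three-page PDF called invoices.pdf.
plan = plan_single_page_split("Input/invoices.pdf", total_pages=3, output_folder="Output")
for page_range, output_path in plan:
    print(page_range, "->", output_path)
```

Each pair corresponds to one run of the extraction step: the page range to extract and the file to write it to.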
10:57 Create folder if it doesn't exist
We create the Output folder if it doesn't exist, using the activities Path Exists, If and Create Folder. Here you will learn to work with folders and booleans. Add this step so the workflow always checks whether the Output folder exists and creates it if it doesn't.
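The same check-then-create pattern, sketched in Python under the assumption that a boolean result is useful to the caller. The function name ensure_folder and the temporary workspace are made up for the demo.

```python
import tempfile
from pathlib import Path


def ensure_folder(folder: str) -> bool:
    """Create `folder` if it is missing. Return True only if it had to be created."""
    path = Path(folder)
    if path.exists():  # plays the role of Path Exists + If
        return False
    path.mkdir(parents=True)  # plays the role of Create Folder
    return True


with tempfile.TemporaryDirectory() as workspace:
    output_folder = str(Path(workspace) / "Output")
    created_first = ensure_folder(output_folder)   # folder is missing, so it is created
    created_second = ensure_folder(output_folder)  # folder exists now, nothing to do
    folder_exists = Path(output_folder).is_dir()
    print(created_first, created_second, folder_exists)
```

Running the check on every execution makes the workflow safe to start on a machine where the Output folder has never been created.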
12:58 Split the PDF in multiple one or two page PDFs
Now we expand our solution to also cover PDFs that span either one or two pages. We read each PDF page into a string, so we can apply Regex to it and check whether a "Page 2" exists. If it does, we know it's a two-page PDF. The solution is then to extract two pages (the previous one and the current one) and overwrite the file extracted for the previous page.
18:16 Split PDF with dynamic ranges
Now the PDFs to be extracted can be of any length, so we need a general solution. The intuition is that we check whether we have 2 pages; if yes, we check for 3 pages; if yes, we check for 4 pages, and so forth. That calls for another While loop and a Regex Matches activity.
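A minimal Python sketch of that intuition, assuming each page's text has already been read into a string and that continuation pages carry a marker such as "Page 2", "Page 3" and so on. The function name find_document_ranges, the marker pattern and the sample texts are assumptions for illustration; the workflow in the video may be structured differently.

```python
import re

PAGE_MARKER = re.compile(r"Page\s+(\d+)")  # assumed marker format, e.g. "Page 2"


def find_document_ranges(page_texts: list[str]) -> list[tuple[int, int]]:
    """Group consecutive pages into (first_page, last_page) ranges, 1-based."""
    ranges = []
    total_pages = len(page_texts)
    current_page = 1
    while current_page <= total_pages:
        first_page = current_page
        expected = 2  # the page number that would continue this document
        # Inner loop: extend the range while the next page continues the document.
        while current_page < total_pages:
            match = PAGE_MARKER.search(page_texts[current_page])  # text of the next page
            if match is None or int(match.group(1)) != expected:
                break
            current_page += 1
            expected += 1
        ranges.append((first_page, current_page))
        current_page += 1
    return ranges


# Hypothetical page texts: invoices of 1, 2 and 3 pages in one PDF.
pages = [
    "Invoice A Page 1",
    "Invoice B Page 1", "Invoice B Page 2",
    "Invoice C Page 1", "Invoice C Page 2", "Invoice C Page 3",
]
ranges = find_document_ranges(pages)
range_strings = [str(a) if a == b else f"{a}-{b}" for a, b in ranges]
print(range_strings)
```

Each resulting string describes one output PDF: a single page number for a one-page invoice, or a first-last span for a longer one.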
Connect with me:
🔔 Subscribe - [ Link ]
💼 LinkedIn - [ Link ]
👥 Facebook - [ Link ]
💌 Email Newsletter - [ Link ]
#uipath #rpa #automation
How To Split PDFs With Dynamic Ranges In UiPath
Tags
anders jensen, uipath, uipath guide, uipath how to, uipath tutorial, uipath tutorials, uipath split pdf, how to split pdf in uipath, split pdf in uipath, robotic process automation, rpa, uipath rpa, split pdf document into separate files, split pdf files into multiple files, uipath pdf extraction, extract pages from pdf, how to split a pdf file in uipath, pdf into multiple pages in uipath, pdf dynamic range split uipath, how to split pdfs with dynamic ranges in uipath