Swift version: 5.6
PDFKit comes with a built-in class called PDFDocument, which allows us to load and parse PDF documents. It’s used when you want to put your PDF into a PDFView, but it’s also useful when you just want to read text from the PDF: you can loop over each page in the PDF, read its attributedString property, then append it to an attributed string containing all the text from the PDF.
First, add import PDFKit in the Swift file you’re using, then add the following example code to read out the text contents of a file:
// PDFDocument(url:) returns nil if no PDF can be loaded from the URL
if let pdf = PDFDocument(url: yourDocumentURL) {
    let pageCount = pdf.pageCount
    let documentContent = NSMutableAttributedString()

    for i in 0 ..< pageCount {
        // skip any page that can't be loaded or has no text
        guard let page = pdf.page(at: i) else { continue }
        guard let pageContent = page.attributedString else { continue }
        documentContent.append(pageContent)
    }
}
The result is an attributed string, so it will retain as much of the formatting from the PDF as it can.
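If you need this in more than one place, the same loop can be wrapped in a pair of small functions. This is only a sketch: the names attributedText(of:) and attributedText(fromPDFAt:) are made up for this example and are not part of PDFKit.

```swift
import Foundation
import PDFKit

/// Joins the attributed text of every page in an already-loaded document.
/// (Hypothetical helper, not a PDFKit API.)
func attributedText(of document: PDFDocument) -> NSAttributedString {
    let content = NSMutableAttributedString()

    for index in 0 ..< document.pageCount {
        // Skip pages that can't be loaded or have no text representation.
        guard let page = document.page(at: index),
              let pageContent = page.attributedString else { continue }
        content.append(pageContent)
    }

    return content
}

/// Loads a PDF from a URL and returns its text,
/// or nil if no PDF could be loaded from that URL.
/// (Hypothetical helper, not a PDFKit API.)
func attributedText(fromPDFAt url: URL) -> NSAttributedString? {
    guard let document = PDFDocument(url: url) else { return nil }
    return attributedText(of: document)
}
```

If you don’t care about formatting, reading the string property of the returned attributed string gives you the same text as a plain String.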
Available from iOS 11.0 – learn more in my book Advanced iOS: Volume Two