How to convert SharePoint pages into PDF files

In this post we step through how you can use Power Automate to convert modern SharePoint pages into PDF files and save them to a document library.

Intro

Recently I got asked to come up with a way to turn SharePoint pages into PDF files for use in an offline scenario. The converted SharePoint pages didn’t need to be formatted as it was only the body content of a SharePoint page that was needed. Also part of the brief was that when the SharePoint page is updated, the corresponding PDF file also updates.

There are several posts online that cover this very topic, which I’ll reference at the end, but they didn’t do exactly what I wanted, so here’s my take on how to convert SharePoint pages into PDF files!

What you’ll need

  • A modern SharePoint site pages library (these come with every SharePoint site!)
  • A OneDrive location to temporarily store the SharePoint page outputs
  • Power Automate to build the automation
  • A document library to store the output PDF files

A note on the site pages library

In my example I didn’t want all the site pages to be converted into PDF files, so I added a choice column to ‘tag’ the pages that should be converted. I set the default value of the choice column to ‘Site Page’, so that a page is only converted once I deliberately change its tag (to ‘Runbook’ in this example). This is reflected in the flow below with the condition step.

Add a choice column to ‘tag’ the pages you wish to convert to PDF.

Building the flow

The trigger for our flow is when a file is created or modified (properties only). This means the flow re-runs whenever a SharePoint page is updated, so the PDF file gets updated too.

  • Select the site you are using to create the SharePoint pages in site address (if you don’t see it listed, select enter custom value and paste the URL in)
  • Select the Site Pages library under library name

Next, I’ve added a condition to only convert pages that have been tagged ‘Runbook’ to PDF.

Condition: if Document type value is equal to ‘Runbook’.
  • Note: make sure you select the Value dynamic content for your choice column, rather than the choice column itself as that will break your flow.
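
To make the note about Value a little more concrete, here is a minimal Python sketch of what the condition is checking. It assumes the trigger returns the choice column as an object with a Value property, and that the column's internal name is DocumentType; both are assumptions for illustration, as is the function name should_convert, so check the raw outputs of your own trigger.

```python
def should_convert(item: dict, column: str = "DocumentType", tag: str = "Runbook") -> bool:
    """Return True only when the choice column's Value equals the tag."""
    # Assumption: the choice column arrives as an object such as
    # {"Id": 0, "Value": "Runbook"}, not as a bare string.
    choice = item.get(column)
    if not isinstance(choice, dict):
        return False
    # Compare the Value inside the object, never the object itself.
    return choice.get("Value") == tag


# Hypothetical trigger outputs for a tagged and an untagged page.
tagged = {"ID": 7, "DocumentType": {"Id": 0, "Value": "Runbook"}}
untagged = {"ID": 8, "DocumentType": {"Id": 1, "Value": "Site Page"}}

print(should_convert(tagged), should_convert(untagged))
```

Comparing the whole object to the text ‘Runbook’ can never match, which is why the condition needs the Value dynamic content.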

If yes, next is a send an HTTP request to SharePoint step. Here I’m using a REST API call to get the body content of the SharePoint page.

Use a send an HTTP request to SharePoint step to get the body content of your page.
  • Set the site address to the site in question
  • Set method to GET
  • Enter the following in Uri: _api/web/lists/GetByTitle('Site%20Pages')/items('ID')/CanvasContent1
  • Replace ‘ID’ with the dynamic content ID from the when a file is created or modified step
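
If it helps to see how the Uri is put together, the Python sketch below builds the same relative path from a list title and an item ID. The function name canvas_content_uri is hypothetical, and the sketch writes the ID without quotes, items(12), which is the usual form for SharePoint REST item lookups; treat it as an illustration rather than something the flow itself needs.

```python
from urllib.parse import quote


def canvas_content_uri(item_id: int, list_title: str = "Site Pages") -> str:
    """Build the relative Uri for the send an HTTP request to SharePoint step."""
    # URL-encode the list title so the space in 'Site Pages' becomes %20.
    encoded_title = quote(list_title)
    # Same path as in the step above, with the item ID filled in.
    return f"_api/web/lists/GetByTitle('{encoded_title}')/items({item_id})/CanvasContent1"


print(canvas_content_uri(12))
```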

Note: The output of this step generates some additional stuff you probably won’t want in your PDF like this:

 "d": { "CanvasContent1": "}}

I used the parse JSON step to remove the unwanted markup and get just the body content of the page.

  • I added the body dynamic content from the send an HTTP request to SharePoint step in the content field in the parse JSON step
  • I copied the output body of the send an HTTP request to SharePoint step from a successful run in the flow history and pasted it into the parse JSON step as the sample
Output body from send an HTTP request to SharePoint to paste into the parse JSON step from a successful flow run.
  • I then pressed generate from sample, which output the following:
{
    "type": "object",
    "properties": {
        "d": {
            "type": "object",
            "properties": {
                "CanvasContent1": {
                    "type": "string"
                }
            }
        }
    }
}

Parse JSON step with generated schema.
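
For anyone who prefers to see the equivalent in code, this short Python sketch does what the parse JSON step does with that schema: it reads the response body and pulls out the CanvasContent1 string nested under d. The sample body is made up for illustration (a real response holds the page's full canvas HTML), and the function name extract_canvas_content is hypothetical.

```python
import json

# Hypothetical sample body; a real response contains the page's canvas HTML.
sample_body = '{"d": {"CanvasContent1": "<div><p>Hello from a site page</p></div>"}}'


def extract_canvas_content(body: str) -> str:
    """Return the CanvasContent1 string nested under 'd' in the response body."""
    parsed = json.loads(body)
    # The schema above describes exactly this path: d -> CanvasContent1.
    return parsed["d"]["CanvasContent1"]


print(extract_canvas_content(sample_body))
```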

From this I then used a create file action to create a temporary HTML file in OneDrive (more on this later), with the following config:

  • Folder path: / (root of the OneDrive account)
  • File name: Name from when a file is created or modified step
  • File content: CanvasContent1 from the parse JSON step
Create file action to create temporary HTML page in OneDrive.
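
A reader asks in the comments whether the page title can appear in the PDF, and the suggested approach is to build some HTML around the content before converting it. The Python sketch below shows that idea outside of Power Automate: it wraps the CanvasContent1 fragment in a small HTML document with the title as a heading. It is not part of the flow described in this post, and the function name wrap_page_html is hypothetical.

```python
from html import escape


def wrap_page_html(title: str, canvas_content: str) -> str:
    """Wrap the page body in a minimal HTML document headed by the page title."""
    # Escape the title because it is plain text; the canvas content is already HTML.
    safe_title = escape(title)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f'<head><meta charset="utf-8"><title>{safe_title}</title></head>\n'
        "<body>\n"
        f"<h1>{safe_title}</h1>\n"
        f"{canvas_content}\n"
        "</body>\n"
        "</html>\n"
    )


print(wrap_page_html("Ops & Runbook", "<p>Body</p>"))
```

In a flow, the same result could be reached with a compose or variable step that concatenates the title and the CanvasContent1 output, then passing that to the create file action instead.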

Next, a convert file step to convert the HTML page into a PDF file:

  • File: ID from the create file step
  • Target type: PDF

Now we can use a create file action to create a PDF in our output document library in SharePoint:

  • Set the site address to the site you want to store the PDF files in
  • Set the folder path to the document library, or navigate to the relevant folder within that library
  • Set file name to file name from the convert file step
  • Set file content to file content from the convert file step
The create file action creates the PDF file in the destination document library.

I then used an update file properties action to pass metadata from the site pages library to the destination document library (this step is optional). Finally, I added a delete file action to remove the temporary HTML file we created earlier in OneDrive:

Delete file action to remove the temporary HTML file.

Here’s the flow in its entirety:

Issues & troubleshooting

Formatting issues with the send an HTTP request to SharePoint

As mentioned above, when just using the send an HTTP request to SharePoint action, the output contains markup that isn’t going to make sense within the PDF. The parse JSON action cleans this up and leaves just the body content of the page.

Create file action creates corrupt PDF files

When testing this flow out I originally didn’t have the convert file action in place. In the file name I added ‘.PDF’, but every time the output PDF was corrupt and errored like this when trying to open:

The flow also failed on this step, and the error said that “Conversion of this file to PDF is not supported. (InputFormatNotSupported / pdf)”. I decided to scrap this approach, create an HTML page instead and add in the convert file action, which worked around this issue.

Overwriting existing PDF files causes flow to fail

During testing of this flow I also noticed that when the flow was triggered by updating a site page, the create file action would fail with a status 400 error saying “A file with the name [file name] already exists”.

I’ve written a separate post on how to overwrite files using the create file action, but basically the answer was to turn off chunking within the action’s settings.

References


2 thoughts on “How to convert SharePoint pages into PDF files”

  1. Anthony July 11, 2022 / 10:23 pm

    Hi, thanks for the comment. You could add another step to the flow to create a variable in which you format some HTML to include the page title and then the output JSON as the main content, then convert that into a HTML document then PDF

  2. Jess D. July 11, 2022 / 9:51 pm

    Is there a way to get the site page title to appear on the PDF?
