How to convert SharePoint pages into PDF files

In this post we step through how you can use Power Automate to convert modern SharePoint pages into PDF files and save them to a document library.

Intro

Recently I got asked to come up with a way to turn SharePoint pages into PDF files for use in an offline scenario. The converted SharePoint pages didn’t need to be formatted as it was only the body content of a SharePoint page that was needed. Also part of the brief was that when the SharePoint page is updated, the corresponding PDF file also updates.

There are several posts online that cover this very topic, which I’ll reference at the end, but they didn’t do quite what I wanted, so here’s my take on how to convert SharePoint pages into PDF files!

What you’ll need

  • A modern SharePoint site pages library (these come with every SharePoint site!)
  • A OneDrive location to temporarily store the SharePoint page outputs
  • Power Automate to build the automation
  • A document library to store the output PDF files

A note on the site pages library

In my example I didn’t want all the site pages to be converted into PDF files, so I added a choice column to ‘tag’ the pages that should be converted. I set the default value of the choice column to ‘Site Page’, so that a page is only converted once I deliberately tag it (as ‘Runbook’ in my case). This is reflected in the flow below with the condition step.

Add a choice column to ‘tag’ the pages you wish to convert to PDF.

Building the flow

The trigger for our flow is when a file is created or modified (properties only). Because it also fires on modifications, the flow re-runs whenever a SharePoint page is updated, so the PDF file gets updated too.

  • Select the site you are using to create the SharePoint pages under site address (if you don’t see it listed, just select enter custom value and paste the URL in)
  • Select the Site Pages library under library name

Next, I’ve added a condition to only convert pages that have been tagged ‘Runbook’ to PDF.

Condition: if Document type value is equal to ‘Runbook’.
  • Note: make sure you select the Value dynamic content for your choice column, rather than the choice column itself as that will break your flow.
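
To illustrate why the Value matters, here is a minimal Python sketch of the same check. It assumes the trigger returns the choice column as an object with a Value property; the column name DocumentType and the sample item are hypothetical, so adjust them to your own library.

```python
# Hypothetical trigger output for one site page. "DocumentType" is an assumed
# internal column name, and the shape of the choice value is an assumption.
page = {
    "ID": 12,
    "DocumentType": {"Id": 1, "Value": "Runbook"},
}


def should_convert(item: dict, column: str = "DocumentType", tag: str = "Runbook") -> bool:
    """Return True only when the choice column's Value matches the tag."""
    choice = item.get(column)
    return isinstance(choice, dict) and choice.get("Value") == tag


# Comparing the whole choice object to the text never matches...
print(page["DocumentType"] == "Runbook")  # False
# ...while comparing its Value does.
print(should_convert(page))  # True
```

Pages left on the default ‘Site Page’ value, or with no value at all, fall through to the "no" branch.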

If yes, the next step is send an HTTP request to SharePoint. Here I’m using a REST API call to get the body content of the SharePoint page.

Use a send an HTTP request to SharePoint step to get the body content of your page.
  • Set the site address to the site in question
  • Set method to GET
  • Enter the following in Uri: _api/web/lists/GetByTitle('Site%20Pages')/items('ID')/CanvasContent1
  • Replace ‘ID’ with the dynamic content ID from the when a file is created or modified step
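
If it helps to see how the Uri is put together, the following Python sketch builds the same relative path for a given item ID. The function name is hypothetical, and it passes the ID as a bare integer, which is an assumption about how the ‘ID’ placeholder ends up once the dynamic content is dropped in.

```python
from urllib.parse import quote


def canvas_content_uri(item_id: int, list_title: str = "Site Pages") -> str:
    """Build the relative REST path for a page's CanvasContent1 field."""
    encoded_title = quote(list_title)  # "Site Pages" becomes "Site%20Pages"
    return (
        f"_api/web/lists/GetByTitle('{encoded_title}')"
        f"/items({int(item_id)})/CanvasContent1"
    )


print(canvas_content_uri(12))
# _api/web/lists/GetByTitle('Site%20Pages')/items(12)/CanvasContent1
```

The path is relative because the site address is supplied separately in the action.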

Note: The output of this step generates some additional stuff you probably won’t want in your PDF like this:

 "d": { "CanvasContent1": "}}

I used the parse JSON step to strip out the unwanted markup and get just the body content.

  • I added the body dynamic content from the send an HTTP request to SharePoint step in the content field in the parse JSON step
  • I copied the output body of the send an HTTP request to SharePoint step from a successful run in the flow history and pasted it into the parse JSON step
Output body from send an HTTP request to SharePoint to paste into the parse JSON step from a successful flow run.
  • I then pressed generate from sample, which output the following:
{
    "type": "object",
    "properties": {
        "d": {
            "type": "object",
            "properties": {
                "CanvasContent1": {
                    "type": "string"
                }
            }
        }
    }
}

Parse JSON step with generated schema.
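
For comparison, this is roughly what the parse JSON step does with that schema, sketched in Python. The sample body is made up for illustration; a real response carries the page's own HTML in CanvasContent1.

```python
import json

# Made-up response body in the same shape as the schema above.
sample_body = '{"d": {"CanvasContent1": "<div><p>Hello from a site page</p></div>"}}'


def extract_canvas_content(body: str) -> str:
    """Unwrap the "d" envelope and return only the page body HTML."""
    payload = json.loads(body)
    # Fall back to an empty string if the field comes back as null.
    return payload["d"]["CanvasContent1"] or ""


print(extract_canvas_content(sample_body))
# <div><p>Hello from a site page</p></div>
```

Only the inner string is passed on to the next step, which is why the wrapper no longer shows up in the PDF.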

From this I then used a create file action to create a temporary HTML file in OneDrive (more on this later), with the following config:

  • Folder path: / (root of the OneDrive account)
  • File name: Name from when a file is created or modified step
  • File content: CanvasContent1 from the parse JSON step
Create file action to create temporary HTML page in OneDrive.
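
One detail worth checking is the extension on the temporary file name. The sketch below is an assumption rather than part of the flow as described: it supposes the convert file step recognises the input format from the file extension, so it makes sure the name ends in .html. The function name is hypothetical.

```python
def temp_html_name(page_name: str) -> str:
    """Return a temporary file name for the page that ends in .html."""
    # Drop a trailing ".aspx" if the page name includes it (assumption:
    # the name may arrive with or without its extension).
    if page_name.lower().endswith(".aspx"):
        page_name = page_name[: -len(".aspx")]
    return page_name + ".html"


print(temp_html_name("Runbook-01.aspx"))  # Runbook-01.html
print(temp_html_name("Runbook-01"))       # Runbook-01.html
```

In the flow itself the equivalent would be typing the extension after the Name dynamic content in the file name field.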

Next, a convert file step to convert the HTML page into a PDF file:

  • File: ID from the create file step
  • Target type: PDF

Now we can use a create file action to create a PDF in our output document library in SharePoint:

  • Set the site address to the site you want to store the PDF files in
  • Set the folder path to the document library, or navigate to the relevant folder within that library
  • Set file name to file name from the convert file step
  • Set file content to file content from the convert file step
The create file action creates the PDF file in the destination document library.

I then used an update file properties action to pass metadata from the site pages library to the destination document library – this step is optional. Finally, a delete file action to delete the temporary HTML file from the OneDrive we created earlier:

Delete file action to remove the temporary HTML file.

Here’s the flow in its entirety:

Issues & troubleshooting

Formatting issues with the send an HTTP request to SharePoint

As mentioned above, when using just the send an HTTP request to SharePoint action, the output contains markup that isn’t going to make sense within the PDF. The parse JSON action cleans this up and leaves only the body content of the page.

Create file action creates corrupt PDF files

When testing this flow out I originally didn’t have the convert file action in place. In the file name I added ‘.PDF’, but every time the output PDF was corrupt and errored like this when trying to open:

The flow also failed on this step and the error said that “Conversion of this file to PDF is not supported. (InputFormatNotSupported / pdf)”. I decided to scrap this approach, create an HTML page instead and add in the convert file action, which worked around this issue.

Overwriting existing PDF files causes flow to fail

During testing of this flow I also noticed that when the flow was triggered by updating a site page, the create file action would fail with a status 400 error saying “A file with the name [file name] already exists”.

I’ve written a separate post on how to overwrite files using the create file action, but basically the answer was to turn off chunking within the action’s settings.

References


Copy of this page option missing in SharePoint

This post describes an observation of how the copy of this page option will be missing for certain pages in SharePoint Online and how to get around it.

Modern pages are great in SharePoint Online…they look good, are easy to author and can be shared really easily once published. However, there are some quirks to the user experience when creating and copying pages, in particular the copy of this page option.

The issue

In the old days of SharePoint 2010, you could only copy or move pages through site content and structure, unless you used PowerShell. Nowadays it’s as simple as a couple of clicks from the ribbon:

  • Press + New
  • Select copy of this page
Copy of this page option available in SharePoint Online.

Or at least that’s what I thought! The problem occurs with the default homepages within modern SharePoint sites. When you go to try to make a copy of the homepage you will find the option is not available.

Copy of this page option missing from menu.

Workaround

The way I’ve managed to get around this issue is to make a copy of the homepage in the Site Pages library, then rename it to something more meaningful. To do this:

  • Open the Site Pages library
  • Select the Home.aspx page
  • Press Copy to
  • Leave the copy to location as the Site Pages library > press copy here
  • Press the three dots next to the copied page > rename
  • Give your page a new name > press rename
Copy the homepage, then rename it to something more meaningful.

That’s it, you can now work on a copy of the page in the same way you would using the copy of this page option.