How to convert SharePoint pages into PDF files

In this post we step through how you can use Power Automate to convert modern SharePoint pages into PDF files and save them to a document library.

Intro

Recently I got asked to come up with a way to turn SharePoint pages into PDF files for use in an offline scenario. The converted SharePoint pages didn’t need to be formatted as it was only the body content of a SharePoint page that was needed. Also part of the brief was that when the SharePoint page is updated, the corresponding PDF file also updates.

There are several posts online that cover this very topic, which I’ll reference at the end, but they didn’t do exactly what I wanted, so here’s my take on how to convert SharePoint pages into PDF files!

What you’ll need

  • A modern SharePoint site pages library (these come with every SharePoint site!)
  • A OneDrive location to temporarily store the SharePoint page outputs
  • Power Automate to build the automation
  • A document library to store the output PDF files

A note on the site pages library

In my example I didn’t want all the site pages to be converted into PDF files, so I added a choice column to ‘tag’ the pages that should be converted. I set the default value of the choice column to ‘Site Page’, so a page is only converted when it has been explicitly tagged with the value I’m interested in (‘Runbook’ in my case). This is reflected in the condition step of the flow below.

Add a choice column to ‘tag’ the pages you wish to convert to PDF.

Building the flow

The trigger action for our flow is when a file is created or modified (properties only). This allows us to re-run the flow when SharePoint pages are updated to also update the PDF files.

  • Select the site you are using to create the SharePoint pages in site address (If you don’t see it listed just press enter custom value and paste the URL in)
  • Select the Site Pages library under library name

Next, I’ve added a condition to only convert pages that have been tagged ‘Runbook’ to PDF.

Condition: if Document type value is equal to ‘Runbook’.
  • Note: make sure you select the Value dynamic content for your choice column, rather than the choice column itself as that will break your flow.

If yes, next is a send an HTTP request to SharePoint step. Here I’m using a REST API call to get the body content of the SharePoint page.

Use a send an HTTP request to SharePoint step to get the body content of your page.
  • Set the site address to the site in question
  • Set method to GET
  • Enter the following in Uri: _api/web/lists/GetByTitle('Site%20Pages')/items('ID')/CanvasContent1
  • Replace ‘ID’ with the dynamic content ID from the when a file is created or modified step
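As a rough illustration of how that Uri is put together, here is a minimal Python sketch. The function name is hypothetical, and it assumes the numeric item ID replaces the whole ‘ID’ placeholder (quotes included); it only builds the string and does not call SharePoint.

```python
from urllib.parse import quote


def canvas_content_uri(item_id: int, list_title: str = "Site Pages") -> str:
    """Build the relative REST Uri for a page's CanvasContent1 field."""
    # Percent-encode the list title so the space becomes %20.
    encoded_title = quote(list_title)
    # Assumption: the item ID is passed as a bare number inside items(...).
    return f"_api/web/lists/GetByTitle('{encoded_title}')/items({item_id})/CanvasContent1"


print(canvas_content_uri(12))
```

In the flow itself the ID comes from the trigger's dynamic content, so nothing like this needs to be written by hand; the sketch just shows which parts of the Uri are fixed and which part changes per page.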

Note: The output of this step generates some additional stuff you probably won’t want in your PDF like this:

 "d": { "CanvasContent1": "}}

I used the parse JSON step to remove the unwanted markup and get just the body content.

  • I added the body dynamic content from the send an HTTP request to SharePoint step in the content field in the parse JSON step
  • I copied the output body of the send an HTTP request to SharePoint step from a successful run in the flow history and pasted it into the parse JSON step
Output body from send an HTTP request to SharePoint to paste into the parse JSON step from a successful flow run.
  • I then pressed generate from sample, which output the following:
{
    "type": "object",
    "properties": {
        "d": {
            "type": "object",
            "properties": {
                "CanvasContent1": {
                    "type": "string"
                }
            }
        }
    }
}

Parse JSON step with generated schema.
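For anyone who finds it easier to read as code, the parse JSON step is doing roughly what this Python sketch does: it takes the response body and pulls out the value nested under d > CanvasContent1. The function name and the sample response are made up for illustration.

```python
import json


def extract_canvas_content(response_body: str) -> str:
    """Return the page body from a response shaped like {"d": {"CanvasContent1": "..."}}."""
    payload = json.loads(response_body)
    # Matches the generated schema: an object "d" holding a string "CanvasContent1".
    return payload["d"]["CanvasContent1"]


# Hypothetical sample response; a real page body will be much longer.
sample = '{"d": {"CanvasContent1": "<div><p>Hello from a page</p></div>"}}'
print(extract_canvas_content(sample))
```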

From this I then used a create file action to create a temporary HTML file in OneDrive (more on this later), with the following config:

  • Folder path: / (root of the OneDrive account)
  • File name: Name from when a file is created or modified step
  • File content: CanvasContent1 from the parse JSON step
Create file action to create temporary HTML page in OneDrive.

Next, a convert file step to convert the HTML page into a PDF file:

  • File: ID from the create file step
  • Target type: PDF

Now we can use a create file action to create a PDF in our output document library in SharePoint:

  • Set the site address to the site you want to store the PDF files in
  • Set the folder path to the document library, or navigate to the relevant folder within that library
  • Set file name to file name from the convert file step
  • Set file content to file content from the convert file step
The create file action creates the PDF file in the destination document library.

I then used an update file properties action to pass metadata from the site pages library to the destination document library – this step is optional. Finally, a delete file action to delete the temporary HTML file from the OneDrive we created earlier:

Delete file action to remove the temporary HTML file.

Here’s the flow in its entirety:

Issues & troubleshooting

Formatting issues with the send an HTTP request to SharePoint

As mentioned above, when just using the send an HTTP request to SharePoint action, the output contains markup that isn’t going to make sense within the PDF. The parse JSON action cleans this up and just leaves the body content of the page.

Create file action creates corrupt PDF files

When testing this flow out I originally didn’t have the convert file action in place. In the file name I added ‘.PDF’, but every time the output PDF was corrupt and errored like this when trying to open:

The flow also failed on this step and the error said that “Conversion of this file to PDF is not supported. (InputFormatNotSupported / pdf)”. I decided to scrap this approach, create an HTML page instead and add in the convert file action, which worked around this issue.

Overwriting existing PDF files causes flow to fail

During testing of this flow I also noticed that when triggering the flow by updating a site page, the create file action would error with a status 400 error saying “A file with the name [file name] already exists”.

I’ve written a separate post on how to overwrite files using the create file action, but basically the answer was to turn off chunking within the action’s settings.

References


How to overwrite files using the create file action in Power Automate

In this post we explore a common error experienced when trying to update existing files in SharePoint using the create file step in Power Automate – and how to resolve it.

The problem

I recently ran into an issue seemingly many others have encountered when trying to update an existing file in SharePoint using the create file step in Power Automate. When running my flow, I received the following error:

Bad Request error in Power Automate when trying to use the create file step to update an existing file in SharePoint.
  • Bad Request status: 400
  • Message: A file with the name [NAME OF FILE] already exists. It was last modified by [NAME OF USER] on [DATE]

The solution

The solution to this has already been shared a fair few times via the Power Automate blogs, but here it is:

  • Press the ellipsis … > settings within the create file step
Open the ellipsis and press settings within the create file step.
  • Scroll down to Content Transfer > set allow chunking to off
  • Save and re-run your flow

That’s it, now when I use the create file step to update existing files, it runs successfully!

Note: Chunking of content is used for splitting up large content for downloads/ uploads. You may need to consider this before turning off chunking in your flow. More information on chunking can be found here.


Description column in SharePoint libraries – how to use

In this post we take a look at the description column that is available within SharePoint Online document libraries and how to make use of it.


There is a description column available in modern SharePoint document libraries. This description column comes out of the box when you create a new document library, or when you add/remove columns from the existing Shared Documents library that comes with every Team/ SharePoint site.

The problem

The description column exhibits some very strange behaviour. First of all, when you try to add it to a library view it does not appear within the properties pane, nor is it a column you can add via edit “columns”.

When adding the Description column to document library views you are not able to edit the column.

If you add the column into a document library view and switch to edit in grid view mode, the column becomes read-only, meaning you can’t add anything into any of the cells within the library.

If you try to edit the column via edit in grid view the cells within the library are read-only.

The solution

This is more of a workaround than a solution, as this appears to be a bug rather than desired behaviour. If you make the column mandatory and then optional again, it seems to do the trick! To do this:

  • Left-click on the Description column heading > column settings > edit
  • Press more options > set require that this column contains information to yes
  • Press save
Set the Description column to required, then optional to enable editing.
  • With this setting changed, when you edit your document library in grid view now you are able to enter data into the Description field
Once you have made the Description column required, you are able to add data via grid view.
  • Go back to edit the Description column settings and switch off require that this column contains information to make it optional

Add Description column to properties pane

Although the above will allow you to add data to the Description column, it will not add the Description column into the properties pane. You can get the Description column to appear by adding it to the default content type in the document library. Here’s how you can do it:

  • Press the cog > library settings
  • Under general settings > advanced settings
  • Set allow management of content types to yes
  • Press OK
Set allow management of content types to yes in the document library.
  • Scroll down to content types > select the document content type
  • Press add from existing site or list columns
Press add from existing site or list columns to add the Description column to the Document content type.
  • Ensure the Description column is selected > press add
  • Press OK
Add the Description column into the Document content type.

Go back to your document library, select a document and open the properties pane. You will now see the Description column is displayed in the properties pane, as well as being an editable field when you press edit all.

Bonus! History lesson and give your feedback

This description column appears to be a rather recent addition to all tenants, making its appearance in summer 2021 with the advent of Microsoft Lists. I went back to SharePoint 2010 just to double-check and sure enough the Description column was not present!

Out of the box SharePoint 2010 document library with no description column.

I’ve added a new item to the Microsoft feedback portal (UserVoice replacement) in the hope this gets addressed by the SharePoint product team so if you could upvote it that would be great!


Edit in grid view button missing – how to resolve

In this post we take a look at the common causes for the edit in grid view button to not be visible in SharePoint Online & Microsoft Lists.

The problem

Edit in grid view was made generally available to all Microsoft customers in February 2021 and is available for lists and document libraries in SharePoint Online or Microsoft Lists. I recently had an issue reported to me that the edit in grid view button was missing from the ribbon in a SharePoint Online document library.

After taking a look myself, sure enough this was the case and the option wasn’t present. The first thing to note about this particular document library was that the default view had grouping enabled on a particular column. In trying to replicate the issue, at first when I created a new document library the edit in grid view button was present:

I noticed that when I applied the same grouping to the view the edit in grid view button disappeared!

Applying a grouping to a library view causes the edit in grid view button to disappear.

The solution

The solution for this is more of a workaround as this appears to be a Microsoft bug. I decided the best way to get around this was to create a specific view that defaults into grid view mode when selected. To do this:

  • With your library open > press the cog > library settings
  • Scroll down to views > press create view
  • Select datasheet view
  • Give the view a name > select the columns you wish to display > press OK

Now you have a view that defaults to grid view without users having to select it!

Other ways around this issue would be to:

  • Remove the grouping for the view in question to allow edit in grid view
  • Create a new view with the grouping removed and show users how to find it to edit in grid view

Unfortunately this seems to be a bug. Although it has been raised with Microsoft, the SharePoint UserVoice has since been shut down, so it’s unclear whether it is being worked on. You can raise feature requests through the Microsoft feedback portal.

Bonus – free history lesson!

Out of curiosity I wanted to see if this was an issue in SharePoint 2010 as I was sure I would have come across it by now. As expected, it wasn’t and datasheet view works fine when views have groupings within them.


How to break permissions inheritance on large libraries/ lists in SharePoint

This post describes a long-standing issue with managing permissions for large libraries or lists in SharePoint Online and gives a workaround for how you can break permission inheritance.

Intro

If you have ever tried to migrate a large volume of data into SharePoint libraries or lists you will have likely encountered an issue with trying (and failing) to break permissions inheritance on lists/ libraries following the migration.

I was recently dealing with this limitation myself as part of a migration project. We had migrated terabytes of data from on-premises file servers into SharePoint Online libraries, and as we began to break permissions inheritance on the library we saw this error:

The issue

Faced with the problem of needing to secure lots of data, which based on the error above didn’t seem possible, I decided to refresh my memory on just what the limits of lists/ libraries are in SharePoint Online:

A list can have up to 30 million items and a library can have up to 30 million files and folders. When a list, library, or folder contains more than 100,000 items, you can’t break permissions inheritance on the list, library, or folder. 

Microsoft docs – SharePoint limits

With the above being true, the error mentions the list view threshold – which is the way SharePoint throttles and limits resources that govern the amount of data and throughput that can be managed. The list view threshold is set to approximately 5000 items by Microsoft and cannot be changed.

Now the issue with not being able to change the list view threshold, although unique to SharePoint Online, isn’t a new thing. In fact it seems to have been a problem for lots of people for a long time. So if we are unable to change the list view threshold, what options do we have?

Well, as it turns out, the list view threshold error appears to be a bit of a red herring. I had taken a look at Microsoft guidance for managing large lists and libraries and tried creating indices within the library, but this had no effect as the total number of items in my library exceeded 100,000.

An example of a library that had over 100,000 files/ folders in SharePoint Online.

Workaround

The answer for me came from an inspired post on the SharePoint Stack Exchange! I give full credit for the simple, yet brilliant way to get this to work to Kasper Bo Larsen, who suggested that you should just delete enough stuff to get under 100,000 > break permissions inheritance > then restore the deleted items.

This works a treat!

Now it’s not perfect, nor would I call this a fix but if you have already migrated your data then this certainly will work for you.
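To make the arithmetic behind the workaround concrete, here is a small Python sketch. It assumes the 100,000-item limit quoted above and uses a made-up library size; the function name is hypothetical, and it only does the sum rather than touching SharePoint.

```python
# Item limit for breaking inheritance, as quoted from the SharePoint limits docs above.
INHERITANCE_LIMIT = 100_000


def items_to_remove(total_items: int, limit: int = INHERITANCE_LIMIT) -> int:
    """Return how many items must be temporarily deleted to get down to the limit."""
    # Nothing needs deleting if the library is already at or below the limit.
    return max(0, total_items - limit)


# Hypothetical library holding 160,000 files and folders.
print(items_to_remove(160_000))
```

For a library of that size you would need to move at least 60,000 items to the recycle bin, break inheritance, then restore them.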

Things to note

#1 Remember to break inheritance before any migration tasks begin!

I know this is a super obvious one and hindsight is always 20/20, but I’ve learnt from this mistake and built it into any future migration runbooks to break permissions inheritance before starting any migration tasks. It will save you a lot of hassle I promise…

#2 Trying to break permissions inheritance via PowerShell will yield same results

During my investigations I wondered whether breaking permissions inheritance via PowerShell might bypass some of the restrictions imposed on us mere GUI administrators. This was not the case, and the PowerShell route fails just the same.

#3 There are no service limits or boundaries for the SharePoint recycle bin

I tried to find out if there were any limits to how much data you can restore from the SharePoint recycle bin in one go, but I couldn’t find anything. With that said, I was able to restore over 60,000 files/ folders in one sitting back to their original locations after breaking permissions inheritance, so I don’t believe that will be a blocker.


How to show the folder path of a file in library views

Introduction

This post looks at ways in which you can show the folder path of a file as a standalone column within a SharePoint document library view.

UPDATE: I’ve updated this post after some comments asking how to do this for modern SharePoint libraries. See the Modern SharePoint section below.

Classic SharePoint

The scenario

A common request I get is:

How do I see what folders/ sub-folders my files are in at a glance?

– all users everywhere

Out of the box, there aren’t any columns available that you could potentially leverage to display this information in a standard SharePoint 2010 library.

The solution

So, just by adding one value in SharePoint Designer, here’s how you do it:

  • Navigate to the library you wish to change, create a new view under Library Tools > Library > Create View
  • Choose the relevant format of your view, give your view a name and press OK
  • Open SharePoint Designer > Open the site > open the library you were just working in
  • In the Views pane > click to open the view you just created
In SharePoint Designer, clicking on the view name will open the view in edit mode
  • In the code editor window, scroll down until you see something like the following:
<ViewFields>
	<FieldRef Name="DocIcon"/>
	<FieldRef Name="LinkFilename"/>
	<FieldRef Name="Modified"/>
	<FieldRef Name="Editor"/>
</ViewFields>
  • Add the field reference <FieldRef Name="FileDirRef"/> in between the opening <ViewFields> and closing </ViewFields> tags
  • Add the field reference in the display order you would like it to appear in the view
Add the field reference to the View Fields list
  • Press the Save icon to save your changes
  • Press the Preview button to see your view in action in the browser
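If it helps to see the edit described in the steps above as code, here is a minimal Python sketch that performs the same change on a copy of the view XML. It assumes the field reference in question is FileDirRef (the same field used in the GroupBy snippet later in this post); the function name and the chosen position are just for illustration, and in practice the change is made by hand in SharePoint Designer.

```python
import xml.etree.ElementTree as ET


def add_path_field(view_fields_xml: str, position: int) -> str:
    """Insert a FileDirRef field reference at the given position inside <ViewFields>."""
    root = ET.fromstring(view_fields_xml)
    existing = [ref.get("Name") for ref in root.findall("FieldRef")]
    # Only add the field once, so running this twice does not duplicate the column.
    if "FileDirRef" not in existing:
        root.insert(position, ET.Element("FieldRef", {"Name": "FileDirRef"}))
    return ET.tostring(root, encoding="unicode")


view_fields = """<ViewFields>
    <FieldRef Name="DocIcon"/>
    <FieldRef Name="LinkFilename"/>
    <FieldRef Name="Modified"/>
    <FieldRef Name="Editor"/>
</ViewFields>"""

updated = add_path_field(view_fields, 2)
names = [ref.get("Name") for ref in ET.fromstring(updated).findall("FieldRef")]
print(names)
```

The position controls where the column appears in the view, so index 2 here places the path column straight after the file name.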

Now you will notice there is a new column being displayed, “Path”, that shows the full location of the file or folder in the library. You’ll also notice that this path will display data when at the library root, or in any folders or sub-folders in the library.

Library root displaying a file’s path
File in sub-folder displaying relative location

Bonus

Taking this one step further, what if we wanted to show files of a certain type, then create a view that groups these files by their folder location? Guess what, that’s exactly what I did!

  • Navigate to your library > create a new view as before, this time base your new view off the one you just created
  • If you wish to only show files of a particular type, use the filter by settings (the example below is filtered to only show Word documents)
  • Make sure “show all items without folders” is selected
  • Press OK
Filtering to only show word documents, also showing items without folders
  • Back in SharePoint Designer > Open up the view you just created
  • Scroll down until you see the opening <Query> tag and add the following beneath it:
<GroupBy Collapse="FALSE" GroupLimit="30">
	<FieldRef Name="FileDirRef"/>
</GroupBy>

Save and preview your view; it should now be grouped by the Path field.
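To illustrate what the filtered, grouped view ends up showing, here is a rough Python sketch of the same idea: keep only files of one type and bucket them by the folder they live in. The file paths are invented and the function name is hypothetical; SharePoint does all of this for you once the GroupBy element is in place.

```python
from collections import defaultdict
from posixpath import dirname


def group_by_folder(file_urls, extension=".docx"):
    """Group files of one type by their containing folder path."""
    groups = defaultdict(list)
    for url in file_urls:
        # Mirror the view filter: only keep files of the chosen type.
        if url.lower().endswith(extension):
            groups[dirname(url)].append(url)
    return dict(groups)


# Hypothetical library contents.
files = [
    "/Shared Documents/report.docx",
    "/Shared Documents/Finance/budget.xlsx",
    "/Shared Documents/Finance/summary.docx",
    "/Shared Documents/Finance/2021/notes.docx",
]

for folder, members in group_by_folder(files).items():
    print(folder, members)
```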

I know this has proven really useful for my company, so hopefully this helps out someone else too 🙂


Modern SharePoint

The scenario

With Microsoft retiring SharePoint 2010 designer workflows, plus the movement away from SharePoint Designer in general, a few readers have asked for a solution that works with modern SharePoint.

When researching this I considered whether to suggest using SharePoint Designer 2013, as the above solution would still work in SharePoint Online using SPD 2013. But, as Microsoft says itself, although SPD 2013 remains supported, it’s deprecated, so I decided to go in a different direction.

The below example walks you through how you can create a flow in Power Automate to update a file after it’s been created to have the folder path shown in the document library view:

The solution

For this solution you will need to have access to create Flows in Power Automate, as well as an existing Document Library created in SharePoint Online:

  • Navigate to the document library you wish to show the folder path for
  • Add a single line of text column to the document library > give it a name (I called mine FolderPath)
  • Under the ellipsis, press Automate > Power Automate > Create a flow
  • In Power Automate, either use an existing, relevant template or start from blank
  • The trigger action should be When a file is created (properties only)
  • Set the Site Address and Library Name where you want to add the folder path
  • Insert a new step > select Update file properties.
  • Set the following values for the update file properties step:
    • Site Address: same as previous step
    • Library Name: same as previous step
    • Id: ID
    • FolderPath: Folder path

NOTE: The FolderPath within the Update file properties step is the custom column we created earlier. The Folder path (highlighted in red) is dynamic content available within the step in the flow. The folder path dynamic content is the path to the folder the item is in, relative to the site address.

Ensure you select the system Folder path dynamic content to pull the right data into the custom FolderPath column.

Here’s the flow in its entirety:

At this point test and save your flow to make sure it is working as expected 🙂

Bonus #1 – turn your folder path column into a hyperlink column

So if like me you want to take this one step further, wouldn’t it be good if we could easily make our newly showing folder paths, actual hyperlinks to the folders? Well the good news is you can!

  • Navigate back to your document library > click on the FolderPath column > Format this column
  • Under Apply formatting to, make sure FolderPath is selected
  • Paste the following JSON into the custom formatting box:
{
  "$schema": "https://developer.microsoft.com/json-schemas/sp/column-formatting.schema.json",
  "elmType": "a",
  "style": {
    "color": "blue",
    "font-weight": "bold"
  },
  "attributes": {
    "target": "_blank",
    "href": "='YOUR SHAREPOINT SITE URL' + @currentField"
  },
  "txtContent": "@currentField"
}

NOTE: for more information on turning field values into hyperlinks, check out this awesome sample from sp-dev-list-formatting.

  • Press Save
  • Your FolderPath column values should now be legitimate hyperlinks that click through to the relevant folders
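The href in the JSON above is simply the site URL joined to the stored folder path. As a sketch of that join in Python, the example below uses a made-up contoso site URL and a hypothetical function name; it also tidies up slashes and encodes spaces, which the column formatting expression itself does not do.

```python
from urllib.parse import quote


def folder_link(site_url: str, folder_path: str) -> str:
    """Join a site URL and a site-relative folder path into one link."""
    # Use exactly one slash between the two parts, and percent-encode spaces
    # in the folder path (forward slashes are left as they are).
    return site_url.rstrip("/") + "/" + quote(folder_path.strip("/"))


# Hypothetical site URL and folder path.
print(folder_link("https://contoso.sharepoint.com/sites/demo/", "Shared Documents/Reports/"))
```

If your links come out with a doubled or missing slash, check whether the site URL you pasted into the JSON ends with a slash and whether the folder path starts with one.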

Bonus #2 – update existing files in the document library

This was another suggestion from a reader with regards to how to update files that existed in the document library before the flow was created.

Running a flow manually for individual files

When I began to consider how to do this I started by looking at ways to manually start the flow.

It appears the only real way to do this is to create a new column that adds a button next to each file, that allows you to run the flow. I’m not really enamoured by this approach as it doesn’t seem ideal to have an extra column to run a flow showing on every file in your library. If this is something you would like to pursue then I would recommend this great article by WonderLaura who has the process of creating a button to trigger a flow covered!

Update our flow to update all files if folder path is empty

My solution to this problem was to update the flow we created earlier to get the properties for all files in the library, then add a condition that checks if the FolderPath column is empty, then if yes runs our flow as before.

  • First, I added a Get files (properties only) action which gets all the files from the source library
  • Then I added a new Condition action, which simply checks if the FolderPath column we created is equal to null. You will notice that a new Apply to each action is created automatically
  • I then moved the previous Update file properties action into the “If yes” branch of the condition
  • I left “If no” blank, as it is ok to leave a condition blank if you don’t want it to do anything
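The logic of the updated flow can be sketched in a few lines of Python: loop over every file, and pick out only those where the FolderPath column is still empty. The dictionary keys and sample data are assumptions for illustration, not the exact property names Power Automate returns.

```python
def ids_missing_folder_path(files):
    """Return the IDs of files whose FolderPath column is empty."""
    # Treat both a missing value (None) and an empty string as "empty".
    return [f["ID"] for f in files if not f.get("FolderPath")]


# Hypothetical output of a Get files (properties only) step.
library_files = [
    {"ID": 1, "FolderPath": "Shared Documents/Finance/"},
    {"ID": 2, "FolderPath": None},
    {"ID": 3, "FolderPath": ""},
    {"ID": 4},
]

print(ids_missing_folder_path(library_files))
```

Files that already have a folder path are skipped, which is the equivalent of the empty “If no” branch.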

Here’s the updated flow in full, with changes highlighted in red:


Problems creating list or library views based on created date

The situation

Data retention and deletion… I’m sure this is something that anyone involved in Office 365, SharePoint or information management in general is fed up of saying since the recent GDPR legislation!

Recently we have been rationalising and cleaning up our data in preparation for moving to Office 365. We are starting with SharePoint as the first target repository or silo of content.

The general consensus is to delete files and folders over 7 years old unless there is a pre-existing data retention policy to adhere to. So the next task is to identify those files that fall within our threshold, and ultimately delete.

Luckily, we have Tree Size Pro and ShareGate so I was able to relatively easily identify the files in question (there were a lot!).

The setup

As our SharePoint environment is a) rather full; and b) rather old, I made the decision to delete files incrementally rather than en masse to mitigate risk, targeting the lists/libraries containing the most out of date content. I started by creating a view in the first library, library A, with the following parameters:

  • Standard library view
  • Filtered by Created Date if less than or equal to 01/01/2011
  • Folders or Flat: Show items inside folders
    Show this view: In all folders

(all other settings are left default)

The results this returned looked good: I could see folders and files in this view that matched the criteria. Brilliant! As mentioned above, I decided to delete in batches out of working hours, again to mitigate risk. I deleted first from library A, then from the first-stage and finally from the second-stage recycle bin, all in this fashion.

The problem

I had permanently deleted around 50% of the total volume of content to be deleted from library A when we started to receive reports of current files being ‘missing’ from library A…not a good day.

After these reports were investigated they were indeed true. It turns out that when folders are included within a library view, folders that match the filter will be shown in the view, regardless of whether the files inside match.

We tested the view excluding folders and all the files returned matched the filter criteria. The same results were demonstrated by a ShareGate report of the same nature: the report of all files over 7 years old brought back folders over 7 years old, but they also contained files that were newer.

Conclusion

At present, we are not entirely sure as to why these filters are not able to drill down past a top-level folder. It appears to be difficult to specify via view settings to only show files within folders, including the folder itself that matches the criteria.

We have decided to omit folders from our reports and views going forward and to focus solely on files, as this is the most reliable way we can delete files.
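The difference between the two approaches is easier to see with a tiny Python sketch. The items and dates below are invented, and the names are hypothetical; the point is only that a date filter which includes folders will match an old folder even when a file inside it is recent, whereas a files-only filter will not.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    path: str
    created: date
    is_folder: bool


def old_items(items, cutoff):
    """Date filter that includes folders, like the original view."""
    return [i.path for i in items if i.created <= cutoff]


def old_files_only(items, cutoff):
    """Date filter that ignores folders, like the revised views and reports."""
    return [i.path for i in items if not i.is_folder and i.created <= cutoff]


# Hypothetical library contents: an old folder holding a much newer file.
library = [
    Item("/Library A/Projects", date(2009, 3, 1), True),
    Item("/Library A/Projects/plan.docx", date(2018, 6, 15), False),
    Item("/Library A/old-minutes.docx", date(2008, 11, 20), False),
]
cutoff = date(2011, 1, 1)

print(old_items(library, cutoff))
print(old_files_only(library, cutoff))
```

Deleting everything the first filter returns would remove the Projects folder, and with it the 2018 file, which is exactly what caught us out.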

Bonus: for those of you with ShareGate, here’s an example of the report we created to bring back all files over 7 years old, excluding folders. I ran this report across the entire intranet application over a weekend and it worked a treat 🙂


Hide a SharePoint list or library from view all site contents

Have you ever been asked to hide a list or library from a SharePoint site? If so, you probably went straight for selecting ‘no’ to displaying the list or library on the Quick Launch, or removing it from the navigation. However, your eagle-eyed users notice the handy view all site contents option and see that it is still listed there. They want it gone!

Luckily, all you need is SharePoint Designer and it is as simple as a click of a button…

(These steps were created using SharePoint Server 2010)

  • Open the site where the list or library resides in SharePoint Designer
  • Under Lists and Libraries – Select the list or library you wish to hide
  • On the main list settings page – find the Settings section
  • Check the Hide from browser option

That’s it! When you open the view all site content page now, that list or library will no longer be shown. Also, if you want to reinstate it at a later date, just un-check the box and it will reappear.

This also works for SharePoint 2013, 2016 and SharePoint Online, under the site contents page.

Quirks of routing document sets

Recently I encountered an issue with a process that was created that routed document sets from one library into another. There were a number of document sets that followed this process that encountered errors when trying to move.

The process was automated via a standard SharePoint workflow and, typically, there were no useful errors in the workflow history list, just ‘error occurred’! As it turned out, there wasn’t any issue with the workflow or the process of moving document sets to a drop-off library.

There are, however, limitations to consider when moving document sets to drop-off libraries, such as:

Document Sets over 50MB in size cannot be moved to the drop-off library

Whether this is done using a workflow or the send to feature, a document set can’t be any bigger than 50MB.


Document Sets containing folders won’t move

By default, you aren’t able to create folders in document sets using native SharePoint controls. However, you can bypass this by opening the library with explorer view. A document set containing folders will still error when you try to move it.

XML file types will cause issues

If your SharePoint environment allows this file type, it will still cause the document set to fail to move to the drop-off library.
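The three content limits above (size, folders and XML files) could be combined into a simple pre-check before routing. The Python sketch below is only an illustration: the function name and inputs are hypothetical, and it assumes 50MB means 50 × 1024 × 1024 bytes.

```python
# Assumption: 50MB is treated as 50 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 50 * 1024 * 1024


def routing_problems(total_size_bytes, file_names, contains_folders):
    """Return the reasons a document set is likely to fail when routed."""
    problems = []
    if total_size_bytes > MAX_SIZE_BYTES:
        problems.append("larger than 50MB")
    if contains_folders:
        problems.append("contains folders")
    if any(name.lower().endswith(".xml") for name in file_names):
        problems.append("contains XML files")
    return problems


# Hypothetical document set: 60MB, no folders, one XML file.
print(routing_problems(60 * 1024 * 1024, ["brief.docx", "export.xml"], False))
```

An empty list means none of the known quirks apply, although the document set will still wait in the drop-off library until the timer job runs.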

Routing of Document Sets requires timer job

When document sets are sent to a drop-off library, they will wait there until content organizer rules route them to their destination. This process is driven by the content organizer processing timer job, which runs daily by default.

If you have found any other quirks when routing document sets, please let me know…otherwise it’s a piece of cake 🙂