Generating PDFs from an ASP.NET Core app using a Node library

A quick blog post on converting HTML to PDF from an ASP.NET Core application using html-pdf, a Node library by Marc Bachmann. I've also set up a Docker-based sample GitHub repository if you just want to see the final thing.

Create a new project

Let's quickly create a new ASP.NET Core project using the command-line tools:

# create a new project
dotnet new webapi --name PdfSample
# run the project
cd PdfSample
dotnet run
# browse to localhost:5000
# you should see a 404 error

Write the node script

Install html-pdf:

npm install html-pdf --save

And add the node script to be invoked by the ASP.NET application in a Node folder:

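// File: Node/createPdf.js
const pdf = require('html-pdf');

// Invoked by NodeServices: `result` is its callback, `html` and `options` come from the C# caller.
module.exports = function (result, html, options) {
    pdf.create(html, options).toStream(function (err, stream) {
        if (err) {
            // Report the failure to the caller instead of piping an undefined stream
            return result(err);
        }
        stream.pipe(result.stream);
    });
};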

The script calls create() from the html-pdf package and pipes its output to result.stream, the Duplex stream that NodeServices exposes on its callback. The html and options arguments are passed from the ASP.NET application when it invokes the script.
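If you want to exercise the script's logic without spinning up ASP.NET, a small harness can imitate the callback that NodeServices passes in. This is only a sketch under assumptions: fakePdf, makeCreatePdf, makeFakeResult and runCreatePdf are hypothetical names, fakePdf stands in for html-pdf so the example runs without PhantomJS, and the fake callback assumes the Node (error, value) convention plus a writable stream property, which is what the script relies on.

```javascript
const { PassThrough, Readable } = require('stream');

// Hypothetical stand-in for html-pdf: same create(...).toStream(cb) shape,
// but it only echoes the HTML bytes instead of rendering a PDF.
const fakePdf = {
  create(html, options) {
    return {
      toStream(callback) {
        if (typeof html !== 'string' || html.length === 0) {
          callback(new Error('html must be a non-empty string'));
          return;
        }
        callback(null, Readable.from([Buffer.from(html)]));
      },
    };
  },
};

// Same structure as Node/createPdf.js, with the library injected so it can be swapped.
function makeCreatePdf(pdf) {
  return function (result, html, options) {
    pdf.create(html, options).toStream(function (err, stream) {
      if (err) {
        result(err); // assumption: calling the callback with an error reports failure
        return;
      }
      stream.pipe(result.stream);
    });
  };
}

// Hypothetical imitation of the NodeServices callback: a function with a `stream` property.
function makeFakeResult(onDone) {
  const sink = new PassThrough();
  const chunks = [];
  sink.on('data', (chunk) => chunks.push(chunk));
  sink.on('end', () => onDone(null, Buffer.concat(chunks)));
  const result = (err) => onDone(err);
  result.stream = sink;
  return result;
}

// Promise wrapper: resolves with the piped bytes, rejects if the script reports an error.
function runCreatePdf(createPdf, html, options) {
  return new Promise((resolve, reject) => {
    const result = makeFakeResult((err, bytes) => (err ? reject(err) : resolve(bytes)));
    createPdf(result, html, options);
  });
}
```

Calling runCreatePdf(makeCreatePdf(fakePdf), '<h1>Hey!</h1>', {}) resolves with whatever was piped into the fake stream. Passing require('html-pdf') instead of fakePdf should exercise the real renderer, provided PhantomJS is installed.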

Create an action that invokes the node script

Let's create a controller-action for the / route that invokes our node script and generates a sample PDF:

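// File: Controllers/HomeController.cs
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.NodeServices;
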
public class HomeController : Controller
{
    [HttpGet("/")] // action to invoke for the "/" route
    public async Task<IActionResult> Index(
        [FromServices]INodeServices nodeServices)
    {
        var html = "<h1>Hey!</h1>"; // html to be converted
        var options = new { }; // html-pdf options

        var stream = await nodeServices.InvokeAsync<Stream>(
            "./Node/createPdf.js", // script to invoke
            html,
            options
        );
        return File(
            fileStream: stream, 
            contentType: "application/pdf"
        );
    }
}
  • We create an action for the / route using [HttpGet("/")].
  • It obtains an INodeServices instance from the DI container using the [FromServices] attribute.
  • We invoke the script by passing its path relative to the project root, followed by the arguments for the script.
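The options object in the action is empty, so html-pdf falls back to its defaults. As a rough sketch of what a populated object might look like on the Node side: withDefaultPdfOptions and DEFAULT_PDF_OPTIONS are hypothetical names, and the option names (format, orientation, border, timeout) are taken from the html-pdf README as I remember it, so check them against the version you install.

```javascript
// Assumed defaults for illustration only; adjust to your own layout needs.
const DEFAULT_PDF_OPTIONS = {
  format: 'A4',            // page size, e.g. 'A4' or 'Letter'
  orientation: 'portrait', // or 'landscape'
  border: '1cm',           // page margin
  timeout: 30000,          // milliseconds before rendering is abandoned
};

// Merge caller-supplied options over the defaults without mutating either object.
function withDefaultPdfOptions(options) {
  return Object.assign({}, DEFAULT_PDF_OPTIONS, options || {});
}
```

In Node/createPdf.js this could be applied as pdf.create(html, withDefaultPdfOptions(options)), so the C# side only has to send the values it wants to override.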

Register NodeServices with the DI

Before we can run it, we'll need to register NodeServices with the DI container.
We do that using an extension method in the Startup class's ConfigureServices() method:

services.AddNodeServices();

Run the Application

Run the app using dotnet run and the PDF should be served at localhost:5000.

Setup for publishing

The createPdf.js needs to be part of your publish output. You can achieve this by editing the .csproj file and adding a section as follows within the <Project></Project> tags:

<ItemGroup>
  <Content Include="Node\createPdf.js">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </Content>
</ItemGroup>

The app can now be published using:

dotnet publish -c Release

The output will be in the ./bin/Release/<framework>/publish directory by default, where <framework> is the project's target framework.
Note that the node_modules folder is not published. You may either use MSBuild to copy the folder on build/publish by editing the .csproj file like above, or run npm install html-pdf as part of your deploy script.

I prefer the deploy script because I'd like to avoid publishing the front end packages from node_modules.

Setting up Docker

I spent more than 8 hours trying to get the setup to work on Docker, which is why I decided to write this post in the first place.

I had two issues while writing the Dockerfile, both relating to PhantomJS. The first error occurred when installing html-pdf using npm at build time. html-pdf downloads a prebuilt binary of PhantomJS, which is compressed using bzip2. Here's the error message:

tar (child): bzip2: Cannot exec: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now

The second was a runtime error for which I couldn't get a proper error message: the application would just crash abruptly.

The trick was to install bzip2 for the html-pdf installation to succeed and libfontconfig for PhantomJS to work as expected. You may do that on Debian-based systems using:

apt install bzip2
apt install libfontconfig
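Since the missing bzip2 only surfaces halfway through npm install, a tiny preflight check can fail earlier with a clearer message. The following is a sketch, not part of html-pdf or the sample repository: findMissingCommands is a hypothetical helper that simply scans the PATH directories for the given executable names. It does not cover libfontconfig, which is a shared library and not a command.

```javascript
const fs = require('fs');
const path = require('path');

// Return the commands that cannot be found in any directory listed in pathEnv.
// `exists` is injectable so the function can be tested without touching the disk.
function findMissingCommands(commands, pathEnv = process.env.PATH || '', exists = fs.existsSync) {
  const dirs = pathEnv.split(path.delimiter).filter(Boolean);
  return commands.filter((cmd) => !dirs.some((dir) => exists(path.join(dir, cmd))));
}
```

A build step could call findMissingCommands(['bzip2', 'tar']) and stop with a non-zero exit code if the returned array is not empty.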

Here's the full Dockerfile. Add it to the root of your project and run it using:

docker build -t aspnetpdf .
docker run -d -p 8080:80 aspnetpdf

Conclusion

That's it. We've seen how to convert HTML to PDF in an ASP.NET Core application using Marc Bachmann's html-pdf with NodeServices. Pretty cool if you ask me!

If you've come this far, you should totally check out the GitHub sample and run it. No excuse if you already have Docker on your machine 😁


If you're considering following this approach in a real project, here are a few pointers to save you time:

  • PhantomJS currently has issues with custom fonts on Windows. The font will need to be installed on the Windows instance for it to work.
  • PhantomJS is based on WebKit, which uses GDI+ under the hood on Windows. Because of this, we couldn't use it in a traditional Azure Web App. More information here. We ended up switching to Azure Web App for Containers.