The Word to PDF problem

Converting a .docx to PDF should be simple. The reason it isn't tells you a lot about how software gets complicated.

Rebecca had been working on the proposal for three weeks. It was 22 pages: a cover sheet, an executive summary, a technical section, two appendices, and a lot of careful formatting. The deadline was 9am. At 8:40 she sat down to convert it to PDF, which she had assumed would take thirty seconds, and discovered it was going to take considerably longer.

Her laptop didn’t have the right version of Word to use “Save as PDF” reliably — the output kept shifting the margins on page 11. The online converter she tried first wanted her to create an account. The second one had a 5MB file limit and her document was 8MB. The third one worked, or seemed to work, until she opened the output and found that the custom font she’d used for the section headers had been replaced with something called Helvetica Neue, which wasn’t wrong exactly, but wasn’t right either. She emailed the PDF anyway. The proposal won. She still doesn’t know what happened to her file.

The Word to PDF problem is strange because it seems like it should be trivial. Word documents and PDFs are both document formats. They both represent text and images on pages. Converting between them sounds as mechanical as converting a JPEG to a PNG: one operation, one predictable result.

It isn’t, because Word and PDF are philosophically different things.

Word is a flow format. Its fundamental unit is a stream of content (text, images, objects) that gets laid out according to rules: paragraph styles, page margins, column widths. The same document will look slightly different in different versions of Word because the layout engine, the software responsible for deciding where each line of text breaks and where each image anchors, is not perfectly consistent across versions. Nor is layout standardised: the DOCX file format is a published standard, but it doesn’t pin down exactly where every line must break. Microsoft’s implementation of DOCX rendering is not identical to Google Docs’ implementation, which is not identical to LibreOffice’s, which is not identical to any online converter’s.

PDF, by contrast, is a fixed format. Its fundamental unit is a positioned object: text at coordinates x,y in font F at size N, image at coordinates a,b with width W and height H. There is no flow, no reflow, no interpretation. What the creator specified is what gets rendered, on any device, provided the fonts are embedded in the file. This is why PDFs look the same everywhere: everything about the layout is baked into the file.
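The difference between the two models can be sketched in a few lines of TypeScript. This is a toy illustration, not how Word or any real converter works: the type names, the `layout` function, and the `avgCharWidthEm` metric are all invented for the example, and real engines measure actual glyphs rather than assuming an average width.

```typescript
// Flow model (Word-like): content plus rules, no coordinates.
interface FlowParagraph {
  text: string;
  fontSize: number; // points
}

// Fixed model (PDF-like): every piece of text carries an explicit position.
interface PositionedLine {
  text: string;
  x: number; // points from the left edge of the page
  y: number; // points from the top edge of the page
  fontSize: number;
}

// Toy layout engine: greedy line breaking with an assumed average
// character width. `avgCharWidthEm` is a hypothetical per-engine metric.
function layout(
  para: FlowParagraph,
  pageWidth: number,
  margin: number,
  avgCharWidthEm: number,
): PositionedLine[] {
  const maxWidth = pageWidth - 2 * margin;
  const charWidth = para.fontSize * avgCharWidthEm;
  const lineHeight = para.fontSize * 1.2;
  const lines: PositionedLine[] = [];
  const words = para.text.split(/\s+/).filter((w) => w.length > 0);

  // Freeze the current line at the next vertical position.
  const pushLine = (text: string): void => {
    lines.push({
      text,
      x: margin,
      y: margin + lines.length * lineHeight,
      fontSize: para.fontSize,
    });
  };

  let current = "";
  for (const word of words) {
    const candidate = current === "" ? word : `${current} ${word}`;
    if (current !== "" && candidate.length * charWidth > maxWidth) {
      pushLine(current); // the line is full: break before this word
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current !== "") {
    pushLine(current);
  }
  return lines;
}

// Two "engines" that disagree only slightly about character width.
const para: FlowParagraph = {
  text: "Converting a document to PDF should be simple",
  fontSize: 10,
};
console.log(layout(para, 200, 20, 0.5).map((l) => l.text));
console.log(layout(para, 200, 20, 0.6).map((l) => l.text));
```

With these inputs the two calls break the first line in different places, even though the paragraph, page, and margins are identical. Once the lines are positioned, that choice is frozen into the output.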

Converting from one to the other requires a layout engine to make thousands of micro-decisions: where exactly does this paragraph break, how much space does this heading need, how should this table distribute its column widths. Different engines make different decisions. Fonts add a second problem: an engine can only set text in a font it actually has. This is why Rebecca’s font disappeared: the online converter’s server didn’t have her custom font installed, so it substituted the closest one it knew about.
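Font substitution of this kind usually comes down to walking a fallback list. The sketch below is an assumption about the general shape of that logic, not the code of any particular converter; `resolveFont`, the font name "Studio Grotesk", and the `"serif"` last resort are hypothetical.

```typescript
// Return the first font in the chain that the rendering machine has.
// If nothing matches, fall back to a generic family (hypothetical choice).
function resolveFont(
  requested: string,
  fallbacks: string[],
  installed: Set<string>,
): string {
  for (const name of [requested, ...fallbacks]) {
    if (installed.has(name)) {
      return name;
    }
  }
  return "serif";
}

// The same document, resolved on two different machines.
const fallbackChain = ["Helvetica Neue", "Arial"];
const serverFonts = new Set(["Helvetica Neue", "Arial", "Times New Roman"]);
const laptopFonts = new Set(["Studio Grotesk", "Arial"]);

console.log(resolveFont("Studio Grotesk", fallbackChain, serverFonts));
console.log(resolveFont("Studio Grotesk", fallbackChain, laptopFonts));
```

The server has never seen the custom font, so it settles for the first fallback it recognises. The laptop where the document was written has the font installed and uses it as requested.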

The online converters that handle this well are the ones that use a real, full-featured layout engine, often a browser engine or an office suite running headlessly on a server. This works. It also means your document travels to a server, gets rendered by someone else’s software, and comes back as a PDF. For a business proposal, a CV, or an NDA, that is a meaningful disclosure.

The insight that changes the equation is that your browser is already that layout engine. Chrome’s rendering engine, which also powers Edge and Electron apps, is exactly the kind of full-featured renderer that can produce accurate output. It doesn’t read DOCX natively, but a library like docx-preview can load the file and render it into a DOM, using the fonts installed on your own machine, and the browser’s built-in print function can then save that page as a PDF, exactly as if you’d printed it. The PDF inherits the browser’s rendering fidelity. No server. No account. No trip across the internet.
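As a rough sketch, the pipeline is two steps: render, then print. The function below takes both steps as parameters so it can run outside a browser; `docxToPrintDialog` and `RenderDocx` are names made up for this example, and this is not a description of how fwip itself is built. In a real page the renderer would be docx-preview’s `renderAsync(data, container)` and the print step would be `window.print()`, which opens the print dialog where the user picks “Save as PDF”.

```typescript
// Shape modelled on docx-preview's renderAsync(data, container).
type RenderDocx<C> = (data: ArrayBuffer, container: C) => Promise<unknown>;

// Render the DOCX bytes into a container, then hand off to printing.
// Nothing here performs a network request: both steps are supplied
// by the caller and run locally.
async function docxToPrintDialog<C>(
  data: ArrayBuffer,
  container: C,
  render: RenderDocx<C>,
  print: () => void,
): Promise<void> {
  await render(data, container); // build the laid-out document as DOM
  print(); // let the browser's print pipeline produce the PDF
}

// Assumed browser wiring (the "preview" element id and `file`, a File
// from an <input type="file">, are hypothetical):
//
//   import { renderAsync } from "docx-preview";
//   const data = await file.arrayBuffer();
//   const container = document.getElementById("preview")!;
//   await docxToPrintDialog(data, container, renderAsync, () => window.print());
```

Waiting for the render promise before printing matters: printing earlier would capture an empty or half-built page.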

Rebecca’s document would have converted correctly, with her fonts intact, on her own machine, in her own browser, without anyone else touching it.

fwip converts Word documents to PDF in your browser. Your document never leaves your device. Try it →
