Healthbase Blog

PDF attachments in HL7 messages

Last year in Australia a number of parties, led by the Australian Department of Health, agreed that the results of diagnostic tests should be sent to the national electronic health record system, where they can be accessed by both care providers and the individuals for whom the tests were performed. The current proposal to support this specifies that the laboratory send a single PDF report for each test to the national electronic health record system.

That proposal has led to increased interest in PDF renditions of pathology reports, and has kindled interest in how to add attachments to both HL7 v2 messages and CDA documents (and probably via FHIR now too). Of course, PDF lab report files are not the only sort of attachment that either of these two formats can carry. Images, HTML, RTF, Word documents, and a number of other formats can be used. Electronic referrals and discharge summaries are also likely candidates for carrying attachments of various kinds.

For pathology test result reporting in particular, Australia has for many years had a mechanism for carrying a human-readable rendition of the report in HL7 v2 messages. Although I have never seen any figures cited, HL7 ORU messages using some variant of this convention probably constitute close to 100% of all electronic pathology result reporting in Australia. The convention is enshrined in Australian Standard AS4700.2 and the associated HB262 Handbook for pathology messaging, both freely available. This convention states that the last OBX segment in the message (denoted the “display segment”) should carry the full human-readable report, with support for some formatting and highlighting. The highlighting is often used for displaying abnormal results.

What I suspect is less well known is that many renditions of this human-readable report carry information that is not available in the atomised data carried in the other OBX segments. This is particularly likely to be the case for reports formatted in PIT or with HL7’s FT (Formatted Text) datatype. This additional information often includes the results of the same test carried out on the patient at the same lab previously, so that changes or trends can readily be seen. It also often includes the names of all the tests that were ordered for that patient through the current order, and which of those tests are still awaiting results. So this display segment very often mirrors, as far as is practicable, the paper copy of the results that might be posted or faxed to the recipient health care providers (very occasionally the patient may even get to receive a paper copy!).

One potential advantage of a PDF rendition of the full report carried as an attachment to the message or CDA document is that it can actually fully mirror the paper version. There are medico-legal attractions to this as well as safety and quality advantages. There are probably some potential safety and quality downsides, too.

So finally, I get to the crux of this article: how to carry a PDF attachment in an HL7 v2 message. Well, it should be in an OBX segment. For diagnostic test results according to AS4700.2 and HB262, a single PDF is assumed and it should be the final OBX segment (or the penultimate one if there is a digital signature). The datatype (OBX-2) should be “ED” for encapsulated data. The identifier/name of the observation in OBX-3 should be “PDF^Display format in PDF^AUSPDI”.

The part most people seem to struggle with is the actual contents of OBX-5, the Observation Value. This must conform to the specification for the ED datatype in HL7 v2. This is where v2 has a problem. Well, multiple problems, actually. The ED datatype has multiple components, but in HL7 v2.2 through v2.4, PDF files aren’t strictly supported. Firstly, since PDF files can contain binary (non-ASCII) data, the PDF file has to be encoded. HL7 v2.2 through v2.4 do support base64 encoding, but beyond this, the ED datatype lacks the necessary values in its other components. This is because these versions of HL7 rely on identifier values defined in HL7 tables, as distinct from user-defined tables. The set of values for ED component 2, Type of data, is HL7 table 0191. The closest candidates available in this table are TX or TEXT. The set of values for ED component 3, Data subtype, is HL7 table 0291. No value of “PDF” is available.

Some Australian HL7 messaging standards follow an undocumented convention that recommends replacing the HL7-defined tables for the message’s HL7 version with the latest published values. If you apply this rule, then you could take the values needed to support encoded PDF files from a later version of HL7, such as 2.5.1, 2.6, 2.7 or later, apply them to, say, an HL7 v2.3.1 message, and still be compliant with Australian Standards (or so I have heard). However, from around HL7 v2.6 onwards, the table number for ED component 2 was changed from 0191 to 0834 (MIME types). The table number for ED component 3 remained 0291 (Data subtype), but all the values from previous versions mysteriously disappeared from the table.

The bottom line of all this is that Standards Australia recommends using an overall segment looking something like:

OBX|1|ED|PDF^Display format in PDF^AUSPDI||^application^pdf^Base64^JVBERi0xLjQNCiXT9MzhDQoxIDAgb2JqDQo8PCAvVGl0bGUg… ….Ng0KJSVFT0YNCg==||||||F|||20150714082250+1000
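To make the field positions in that segment easier to see, here is a minimal Python sketch that splits an OBX segment on the default HL7 v2 delimiters and labels the parts discussed in this article. The function name `split_obx` and the dictionary keys are illustrative only and do not come from any HL7 library. It assumes the default `|` and `^` separators and no escape sequences, whereas a real parser would read the delimiters from the MSH segment. The sample payload is a short stand-in (the base64 of the text `%PDF-1.4`), not a real PDF.

```python
# Illustrative sketch only: assumes default HL7 v2 delimiters and no escape sequences.

def split_obx(segment: str) -> dict:
    """Label the parts of an OBX segment that matter for an ED (encapsulated data) value."""
    fields = segment.split("|")           # fields[0] is the segment name, "OBX"
    if fields[0] != "OBX":
        raise ValueError("not an OBX segment")
    # Pad so that short segments can still be indexed up to OBX-14.
    fields += [""] * (15 - len(fields))
    ed = fields[5].split("^")             # OBX-5 holds the ED components
    ed += [""] * (5 - len(ed))
    return {
        "value_type": fields[2],            # OBX-2, expected to be "ED"
        "observation_id": fields[3],        # OBX-3
        "ed_source": ed[0],                 # OBX-5.1 source application (empty here)
        "ed_type": ed[1],                   # OBX-5.2 type of data
        "ed_subtype": ed[2],                # OBX-5.3 data subtype
        "ed_encoding": ed[3],               # OBX-5.4 encoding
        "ed_data": ed[4],                   # OBX-5.5 the encoded file itself
        "result_status": fields[11],        # OBX-11
        "observation_datetime": fields[14], # OBX-14
    }


# Shortened stand-in for the segment above; the payload is a placeholder.
sample = (
    "OBX|1|ED|PDF^Display format in PDF^AUSPDI||"
    "^application^pdf^Base64^JVBERi0xLjQ=||||||F|||20150714082250+1000"
)
parsed = split_obx(sample)
print(parsed["ed_type"], parsed["ed_subtype"], parsed["ed_encoding"])
```

Note that the leading `^` in OBX-5 means the first ED component (source application) is empty, which is why the encoded data ends up in the fifth component.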

Other variants of OBX-5 that could legitimately be supported might start like: ^TX^PDF^Base64^… or ^TEXT^PDF^Base64^… or ^text^PDF^Base64^… or ^AP^PDF^Base64^… or ^application^PDF^Base64^… or …

I have seen people use “BASE64” or “base64”. HL7 standards v2.2 through v2.4 say nothing about supporting variations in case for HL7 table values. If these are to be supported, then this becomes the thin end of a very long wedge. Another common error is creating the base64 version of the PDF incorrectly. The encoded data should contain neither space characters nor end-of-line characters of any sort.
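The checks described above can be sketched in a few lines of Python. This is a rough illustration rather than a conformance test: the function name `check_ed_pdf` is hypothetical, the exact-match test for “Base64” reflects the strict reading discussed above, and looking for `%PDF-` at the start of the decoded data is only a simple heuristic for "this looks like a PDF".

```python
import base64
import binascii


def check_ed_pdf(obx5: str) -> list:
    """Return a list of problems found in an OBX-5 ED value; an empty list means none found."""
    parts = obx5.split("^")
    if len(parts) != 5:
        return [f"expected 5 ED components, found {len(parts)}"]
    _source, _type, _subtype, encoding, data = parts

    problems = []
    # Strict reading: only the exact table value is accepted, not BASE64 or base64.
    if encoding != "Base64":
        problems.append(f"encoding is {encoding!r}, not 'Base64'")
    # Spaces and line breaks commonly survive from tools that wrap their output.
    if any(ch.isspace() for ch in data):
        problems.append("data contains whitespace or line breaks")
    try:
        # validate=True rejects any character outside the base64 alphabet.
        raw = base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        problems.append("data is not valid base64")
    else:
        # Heuristic only: a PDF file normally begins with this marker.
        if not raw.startswith(b"%PDF-"):
            problems.append("decoded data does not start with %PDF-")
    return problems


print(check_ed_pdf("^application^pdf^Base64^JVBERi0xLjQ="))
print(check_ed_pdf("^application^pdf^BASE64^JVBERi0x\nLjQ="))
```

A payload containing a line break is reported twice by this sketch, once for the whitespace and once because strict decoding fails, which is deliberate: the first message tells you why the second occurred.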

Test a message with an embedded PDF

You can find a sample message file with an embedded, encoded PDF file at the bottom of the Standards Australia IT14 samples section of the Healthbase pathology message viewer. If you select the clinical or technical view, the PDF can be rendered for viewing in-browser in most default setups. You can, of course, load any test file for viewing using the upload function of the viewer. This PDF-rendered view will not work in all browsers on all systems, but should generally work at least in Mozilla Firefox on any platform and in Safari on Mac and iOS.

Creating a test message with PDF attachment

Most systems with POSIX command line tools, such as Mac OS X, Linux, Cygwin, etc., will have sufficient tools to readily produce a suitable base64-encoded PDF fragment. For the embedded PDF test message mentioned above, I simply took the PDF file and ran the following:

uuencode -m pdf_sample_1.pdf pdf_sample | tail -n +2 | sed -e '$ d' | tr -d '\n' > pdf_sample.b64

The resulting fragment simply needs to be placed in the data field of OBX-5, i.e. OBX-5.5.
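If you would rather do the encoding and the assembly in one step, the following Python sketch produces the same kind of segment. The function name `build_pdf_obx` and its parameters are hypothetical. The OBX-3 identifier, the `application`/`pdf`/`Base64` values and the result status `F` are simply copied from the example segment earlier in this article, so adjust them if your messages differ. The sketch relies on the fact that the standard base64 alphabet (A-Z, a-z, 0-9, `+`, `/`, `=`) contains none of the default HL7 delimiters, so the payload needs no HL7 escaping.

```python
import base64


def build_pdf_obx(pdf_bytes: bytes, set_id: int, observed: str) -> str:
    """Assemble an OBX segment carrying a base64-encoded PDF in OBX-5.5.

    Assumes default HL7 v2 delimiters; field values mirror the example in this article.
    """
    # b64encode returns one unbroken line, so no spaces or line ends need stripping.
    payload = base64.b64encode(pdf_bytes).decode("ascii")
    # ED components: source application (empty), type, subtype, encoding, data.
    obx5 = "^".join(["", "application", "pdf", "Base64", payload])
    fields = [
        "OBX",
        str(set_id),                          # OBX-1 set ID
        "ED",                                 # OBX-2 value type
        "PDF^Display format in PDF^AUSPDI",   # OBX-3 observation identifier
        "",                                   # OBX-4
        obx5,                                 # OBX-5 observation value
        "", "", "", "", "",                   # OBX-6 to OBX-10
        "F",                                  # OBX-11 result status (assumed final)
        "", "",                               # OBX-12, OBX-13
        observed,                             # OBX-14 date/time of the observation
    ]
    return "|".join(fields)


# Stand-in bytes; in practice read them from the file,
# e.g. open("pdf_sample_1.pdf", "rb").read()
segment = build_pdf_obx(b"%PDF-1.4", 1, "20150714082250+1000")
print(segment)
```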
