By Sharon Machlis
Executive Editor, Data & Analytics, InfoWorld
There are several ways to create a Word document from programming languages, including R Markdown and the officer package with R and the python-docx library in Python. But one of the newest and most intriguing is Quarto, a free, open source technical publishing system from RStudio (now Posit) that’s native to Python and Julia as well as R.
One of the big advantages of Quarto is that, unlike a Word-specific package, the same Quarto file with minor tweaks can be used to generate dozens of output formats in addition to Word, including PowerPoint, HTML, and PDF. (Find out more: “What is Quarto? RStudio rolls out next-generation R Markdown”.) In addition, you can automate the creation of Word reports and include results of your analysis and visualization code.
Here’s how to use Quarto to create Word documents.
Because Quarto isn’t a language-specific library, you install it like any other stand-alone software. You can find binary downloads for Windows, macOS, and Linux on Quarto’s “Get Started” page.
If you’re an R user and you have an up-to-date version of RStudio, Quarto should be included by default. You don’t need to install Quarto separately.
If you want to use Quarto in Visual Studio Code, install the Quarto extension in addition to the Quarto application software. To render Quarto documents that include Python code, my system also instructed me to install Jupyter Notebook by running python3 -m pip install jupyter.
You can create and render Quarto files with any plain text editor and your terminal, just as you can with R or Python scripts, since they are plain text and not binary files. However, you’d miss out on all of the built-in tools of an IDE, such as code completion suggestions and a render button.
Once you’ve got Quarto installed, you can create a new Quarto file in your IDE the usual way: either File > New File > Quarto Document (not Quarto Presentation) in RStudio, or File > New File in VS Code, choosing “Quarto” as the language.
In RStudio, you’ll have a choice of a few Quarto document output formats. Select Word, and you can then auto-generate either a sample Word document or a blank doc. Until you’re familiar with Quarto syntax, it can be helpful to see what the sample looks like.
Sample Quarto document generated by RStudio when selecting Word output.
The default YAML header in RStudio includes a title, output format (in this case docx for Word), and editor (visual WYSIWYG or source).
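If you’re starting with a blank document in VS Code, you can add the basic YAML header at the top: a line of three hyphens, a title line such as title: "My Document" (any title will do), the line format: docx, and a closing line of three hyphens.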
As far as I know, there is no WYSIWYG Quarto editor in VS Code, so there is no reason to specify an editor.
Then start creating your content.
Quarto uses Pandoc’s version of Markdown syntax for writing text. That includes single underscores around text you want in italics, double asterisks for text you want to bold, blank lines between paragraphs, two or more spaces at the end of a line to create a line break, and hash symbols at the start of a line to signify header font size. A single hash indicates the largest font size, h1; two is the second largest, h2; and so on.
Some CSS-based document styling designed for Quarto HTML output formats won’t work when exporting to Word. However, you can create a separate reference style Word document with font styles, sizes, and such for your document.
The command quarto pandoc -o my_doc_style.docx --print-default-data-file reference.docx should be run in your terminal (not the R or Python console) to create a default Word styling document, in this example called my_doc_style.docx (you can call it anything).
This creates a regular Word .docx file, not a Microsoft Word .dotx template. You can open your reference .docx and customize its styles as with any Word document by opening the Styles panel from the Word ribbon.
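To use the template in a Quarto doc, add it to the document’s YAML header as a reference-doc option nested under format: and docx:, with the file name as its value, as in reference-doc: my_doc_style.docx.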
There are other customizations available for Quarto Word documents, such as adding a table of contents or section numbering, which you can see in the Quarto documentation for Word.
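As a rough illustration of how these header options fit together, the Python sketch below writes a new Quarto file whose YAML header combines the reference document with a table of contents and section numbering. The title, the write_qmd helper, and the output file name are hypothetical; reference-doc, toc, and number-sections are the option names as I understand them from the Quarto documentation for Word.

```python
from pathlib import Path

# Hypothetical title; my_doc_style.docx is the reference document
# created with the quarto pandoc command described above.
QMD_HEADER = """\
---
title: "My Word report"
format:
  docx:
    reference-doc: my_doc_style.docx
    toc: true
    number-sections: true
---
"""


def write_qmd(path, body=""):
    """Write a Quarto file made of the YAML header plus optional body text."""
    target = Path(path)
    target.write_text(QMD_HEADER + "\n" + body, encoding="utf-8")
    return target
```

Calling write_qmd("my_quarto_document.qmd", "## Summary\n") would produce a file ready to render. Note that the indentation under format: and docx: matters, since YAML uses it to express nesting.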
One of the best things about generating a Word doc from R or Python is the ability to run code and add results to your document—including graphics.
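You do this by adding code chunks to your Quarto file. A chunk is set off by three backticks: the opening line has three backticks followed by the language name in curly braces, {r} for R or {python} for Python, and the closing line has three backticks alone.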
You can set options for a code chunk such as whether to display the code (echo), run the code (eval), show code warning messages, and so on. Chunk options start off with #| (often referred to as a “hash pipe”) for R, Python, or Julia.
For example, the chunk option #| echo: false, placed at the top of an R chunk, would show the results of the code but not display the code itself in the Word doc.
Other options include #| fig-cap: My caption for a figure caption, #| warning: false to not display any warning messages when code runs, and #| cache: true to cache results of a compute-intensive chunk where data won’t change.
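To make the hash-pipe syntax concrete, here is a sketch of what the body of a Python chunk might look like. In the Quarto file itself, the chunk would open with three backticks followed by {python}; the numbers and variable names are made up for illustration.

```python
#| echo: false
#| warning: false

# Everything below the option lines is ordinary Python. With echo set to
# false, only the printed result would appear in the rendered Word file.
from statistics import mean

monthly_totals = [120, 135, 150]  # made-up numbers for illustration
average_total = mean(monthly_totals)
print(f"Average monthly total: {average_total}")
```

Because the option lines begin with a hash, Python treats them as comments, so the same code also runs unchanged outside of Quarto.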
You can execute code within the figure caption option by starting the option’s value with !expr, followed by the expression to evaluate.
You can render a Quarto document in RStudio or VS Code by using the Render button or the keyboard shortcut Ctrl/Cmd + Shift + K, or do so with the terminal command quarto render my_quarto_document.qmd --to docx for a document named my_quarto_document.
R users can also use the quarto R package’s command quarto::quarto_render("my_quarto_document.qmd", output_format = "docx").
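Python users who want to trigger a render from a script could wrap the terminal command in a small helper. This is only a sketch: build_render_command and render are hypothetical names, and the code assumes the quarto executable is on your PATH.

```python
import shutil
import subprocess


def build_render_command(qmd_file, output_format="docx"):
    """Return the argument list for: quarto render <file> --to <format>."""
    return ["quarto", "render", qmd_file, "--to", output_format]


def render(qmd_file, output_format="docx"):
    """Run quarto render; fail early if Quarto is missing or the render errors."""
    if shutil.which("quarto") is None:
        raise FileNotFoundError("quarto was not found on the PATH")
    subprocess.run(build_render_command(qmd_file, output_format), check=True)
```

With that in place, render("my_quarto_document.qmd") does the same job as typing the command in a terminal, and check=True makes a failed render raise an exception instead of passing silently.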
Note: In early versions, the initial Word document preview that popped up from RStudio didn’t always display my graph. That seems to be fixed. However, if it happens to you, try duplicating the initial .docx file as a new, editable Word document, since that fixed the issue for me.
Being able to create Word files with results of your code is useful not only for single-time documents. It also lets you streamline regular data reporting and updates with code that pulls new data from an external source, runs new calculations, and generates up-to-date graphs with a single render call.
Quarto also lets you add parameters to a report, which are like variables defined externally during rendering. That lets you use a report as a template and create the same report for different parameter values, such as cities or regions. For example, if you needed to run one report for each of 10 cities, city could be defined as a parameter in your document’s YAML header with a params: line followed by an indented city: New York line.
That sets a parameter named city with a default value of New York. You can then access the value of the city parameter in your R code with params$city.
To create multiple reports in R using the same Quarto document but different values for the parameter, I typically create a function to render my document and then use the purrr package’s walk() function to run my function on a list of items. For example, if my parameterized Quarto document is named params_test.qmd with one parameter named city, this could be my render function in R:
And this is how I’d use my function to generate three separate documents for New York, Chicago, and Los Angeles:
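The same pattern, a render function applied once per city, can also be driven from a Python script by calling the Quarto command line, whichever language the document itself uses. The sketch below assumes Quarto’s -P flag for passing a parameter as name:value and its --output flag for naming the result; the city_command and render_all helpers and the file-naming scheme are hypothetical.

```python
import subprocess

CITIES = ["New York", "Chicago", "Los Angeles"]


def city_command(city, qmd_file="params_test.qmd"):
    """Build a quarto render call that sets the city parameter and a per-city file name."""
    output_name = city.replace(" ", "_") + ".docx"  # hypothetical naming scheme
    return [
        "quarto", "render", qmd_file,
        "--to", "docx",
        "-P", f"city:{city}",
        "--output", output_name,
    ]


def render_all(cities=CITIES):
    """Render one Word report per city; requires Quarto to be installed."""
    for city in cities:
        subprocess.run(city_command(city), check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems with values that contain spaces, such as New York.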
Python syntax is a bit different and is based on the papermill library. For example, defining a parameter is done in a Python code chunk that would look like
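As a minimal sketch based on the pattern in the Quarto Parameters documentation, the body of such a chunk tags itself as the parameters cell and assigns a default value. In the Quarto file, the chunk would open with three backticks followed by {python}.

```python
#| tags: [parameters]

# Default value; Quarto substitutes a new one when the document is
# rendered with, for example, -P city:Chicago on the command line.
city = "New York"
```

Later chunks can then refer to city as an ordinary Python variable, with no equivalent of the params$ prefix used in R.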
You can read more about parameterizing Python documents in the Quarto Parameters documentation.
Sharon Machlis is Executive Editor, Data & Analytics at IDG, where she works on data analysis and in-house editor tools in addition to writing and editing. Her book Practical R for Mass Communication and Journalism was published in December 2018.
Copyright © 2022 IDG Communications, Inc.