Predictive Soil Mapping with R book

Predictive Soil Mapping (PSM) is based on applying statistical and/or machine learning techniques to fit models for the purpose of producing spatial and/or spatiotemporal predictions of soil variables, i.e. maps of soil properties and classes at different resolutions. It is a multidisciplinary field combining statistics, data science, soil science, physical geography, remote sensing, geoinformation science and a number of other sciences. Predictive Soil Mapping with R (an Open Access book distributed under the Creative Commons Attribution-ShareAlike 4.0 International License) focuses on using state-of-the-art statistical and machine learning techniques to produce more accurate and more usable maps of soil variables. Follow the book's progress via its GitHub repository. An HTML version of the book is also available online.

Hard copies of this book can be ordered from Lulu. By purchasing a hard copy of this book from Lulu you donate $12 to the OpenGeoHub foundation (this fund is used exclusively for scholarships and to support OpenGeoHub summer schools). You are also allowed to print your own copies, in which case you can opt to donate to OpenGeoHub by using the button at the bottom of the page.

Publishing with Lulu

This book is available as an Open Access publication, which is an important distinction from other similar rmarkdown-based books:

  • The authors keep the copyright; in this case, all content of the book is available under the Creative Commons Attribution-ShareAlike License.
  • The book costs about 2–3 times less than similar books published through large commercial publishers.
  • No special permission from the publisher is required to produce updates and new editions.

How do you publish your own book via Lulu? It is a relatively straightforward process, and the book submission wizard at Lulu will interactively help you pick the best settings. Detailed instructions on how to self-publish your own PDF via Lulu are also available. A few important technical notes:

  • Unfortunately, Lulu does not support the B5 format, which is typically used for many of the Springer and CRC Press R series books. Instead you can use the crown quarto (18.90 x 24.58 cm) paper format, which is even better for wider computer code outputs.
  • Although it is possible to publish your book as full color prints, this is not recommended as it can increase the printing costs significantly. Consider instead using grayscale-friendly color palettes for maps and plots, such as Viridis, and then suggesting that readers use the HTML version of the book for the full color experience.
  • To make your custom book cover, consider using Inkscape software / SVG format. It might take a few iterations until you get the correct spine width, spine start position and total page size.
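The crown quarto dimensions above are given in centimetres, while some PDF tools report page sizes in PostScript points (1 pt = 1/72 inch). The following is a minimal sketch of that unit conversion; the function name `cm_to_pt` is a hypothetical helper introduced here, not part of any tool mentioned in the text.

```shell
#!/bin/sh
# cm_to_pt: convert a length in centimetres to PostScript points.
# 1 inch = 2.54 cm and 1 inch = 72 pt, so pt = cm / 2.54 * 72.
# (cm_to_pt is a hypothetical helper name used only for illustration.)
cm_to_pt() {
    # LC_ALL=C keeps "." as the decimal separator regardless of locale.
    LC_ALL=C awk -v cm="$1" 'BEGIN { printf "%.2f\n", cm / 2.54 * 72 }'
}

# Crown quarto width and height in points.
cm_to_pt 18.90
cm_to_pt 24.58
```

The two values can then be compared with the "Page size" line printed by `pdfinfo` from poppler-utils (assuming it is installed), to confirm that the generated PDF really uses the intended paper format.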

To submit a PDF (version 1.3 currently recommended) with embedded fonts, flattened transparencies and compatible color formats, consider running the following two Ghostscript commands on your initial document:

gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress \
    -sDEVICE=pdfwrite -sOutputFile=book1.pdf book0.pdf

This will embed all fonts and keep images at print-quality resolution (prepress). After that, to convert everything to grayscale we can use:

gs -sOutputFile=book2.pdf -dPDFSETTINGS=/prepress \
    -sDEVICE=pdfwrite -sColorConversionStrategy=Gray \
    -dProcessColorModel=/DeviceGray -dCompatibilityLevel=1.3 -dNOPAUSE -dBATCH book1.pdf

In this case book0.pdf is the initial PDF generated by rmarkdown / pandoc, book1.pdf is the PDF with embedded fonts and prepress-quality images, and book2.pdf is the final grayscale press-ready PDF.
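Before uploading, it may be worth checking that the fonts really ended up embedded. The sketch below assumes the `pdffonts` and `pdfinfo` utilities from poppler-utils are installed; `count_unembedded` is a hypothetical helper name. It reads `pdffonts` output and counts the rows whose "emb" column says "no".

```shell
#!/bin/sh
# count_unembedded: read `pdffonts` output on stdin and print how many
# fonts are reported as NOT embedded.
# The last five fields of each font row are: emb, sub, uni, object number,
# object generation. Counting from the end avoids problems with font types
# that contain spaces (e.g. "Type 1C").
# (count_unembedded is a hypothetical helper name used for illustration.)
count_unembedded() {
    # Skip the two header lines and any row too short to be a font row.
    awk 'NR > 2 && NF >= 5 && $(NF-4) == "no" { n++ } END { print n + 0 }'
}

# Usage, with the file names from the text above:
#   pdffonts book2.pdf | count_unembedded    # 0 means all fonts are embedded
#   pdfinfo book2.pdf | grep '^PDF version'  # shows the PDF version written
```

A result of 0 suggests that every font listed by `pdffonts` is embedded; any other value means the document is worth inspecting before submission.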

Some of the PDF plots generated with R (e.g. plots of several thousand points) can cause memory problems during printing, in which case Lulu will not approve the final press-ready PDF. In such cases the most robust way to submit a PDF for press is to convert all pages to PNGs and then build a PDF from the images (hence no font embedding or fixing of vector graphics is required). On a Linux OS this can be done in two steps. First, convert the PDF to PNGs using a generic file name:

pdftoppm -png -r 600 -gray book2.pdf page

This will generate a long list of PNG files with generic file names, e.g. “page-001.png, page-002.png, …”. Next, build a PDF using the original sequence of PNGs:

convert "page*.png" -quality 100 book3.pdf

This process is, however, memory- and storage-intensive (for the book above it used 60GiB of RAM and 12GiB of temporary disk space). Before you can use ImageMagick to convert a few hundred PNGs to a PDF, you also need to edit the /etc/ImageMagick-6/policy.xml file and raise the resource limits to allow for heavy processing tasks, e.g.:

  <!-- <policy domain="resource" name="memory" value="16GiB"/> -->
  <!-- <policy domain="resource" name="disk" value="16GiB"/> -->
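The policy edit can also be scripted. The following is a minimal sketch, assuming the resource entries in policy.xml are active (not commented out) and written in the usual `<policy domain="resource" name="..." value="..."/>` form; `set_im_limit` is a hypothetical helper name. It prints the modified file to standard output, so the original is never changed in place.

```shell
#!/bin/sh
# set_im_limit: print a copy of an ImageMagick policy file with one
# resource limit changed.
#   $1 = path to policy.xml
#   $2 = resource name (e.g. memory, disk)
#   $3 = new value (e.g. 16GiB)
# (set_im_limit is a hypothetical helper name used for illustration.)
set_im_limit() {
    # Replace only the value attribute of the matching resource entry.
    sed -E "s|(<policy domain=\"resource\" name=\"$2\" value=\")[^\"]*(\"/>)|\\1$3\\2|" "$1"
}

# Usage: write the result to a new file, review it, then install it.
#   set_im_limit /etc/ImageMagick-6/policy.xml memory 16GiB > policy.new.xml
#   set_im_limit policy.new.xml disk 16GiB > policy.xml
```

After installing the edited file (which typically needs root permissions), running `identify -list resource` shows the limits that ImageMagick actually applies.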

After you finish publishing the book via Lulu, it is also a good idea to open a Zenodo repository and upload all new versions of the book (each with a unique DOI) so that anyone can track changes and fixes in the book.
