2018

1.1 R – what?

R – What is R?

  • R is a language, a software and a philosophy
  • R is open, free, transparent and flexible
  • R is one of the most active languages/programmes

R – What is R, a bit more formal?

  • Free software for statistics and visualisation
  • Open-source GNU project, based on S (John Chambers)
  • primarily accessible via command line
  • High-level, interpreted, functional, object-oriented

Everything that is called in R is a function.

Everything that is created in R is an object.
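
Both statements can be tried directly in the console. The lines below are a minimal sketch using only base R; the object name s is an arbitrary choice for illustration.

```r
## operators are functions, too: `+` can be called like any other function
s <- `+`(2, 3)

## the result is an object with a class ...
class(s)

## ... and the operator itself is an object of type function
is.function(`+`)
```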

R – What does "flexible" include?

  • R can be used for
    • Calculations and data visualisation
    • Image and sound manipulation
    • Internet and email scraping
    • Generation of reports
    • Writing of books
    • Generation of presentations

R is no spreadsheet tool because…

  • we do not explicitly see the data,
  • we explicitly write down what is done,
  • we have to think about data structure,
  • we combine functions almost all the time,
  • we usually generate graphical (not tabular) output,
  • we can easily work without touching the mouse.

  • R is object oriented and vector based.

R – Questions and Activities

  • Time: 5 minutes
    • Which five keywords describe R best?
    • What are pros and cons of "free and open software"?
    • Start RStudio, use R as a calculator
      • Calculate the volume of a sphere with 5 cm radius
      • Calculate the square root of the square of the cosine of a number

1.2 RStudio

RStudio – what?

  • RStudio is a graphical user and developer interface for R
  • A second skin to work more efficiently with R

RStudio – a first glimpse

RStudio – Possibilities

  • Available for almost all computer platforms
  • Integration of all relevant tools for R in one programme
  • Support of multiple projects
  • Support of versioning tools
  • Support of interactive graphics
  • Generation of many different types of documents

RStudio – Structure

  • Four windows
    • Script window
    • Console window
    • Workspace window
    • Multi purpose window
  • Size and content can be adjusted all the time
  • RStudio is html based (also the RStudio Server software)

RStudio – Support for scripting

  • "The magic keys": Tab, Arrow up/down, Shift, Control and Enter
  • Syntax highlighting
  • Auto completion
  • Auto indentation
  • Execution of lines, snippets and entire scripts

RStudio – Creating projects

  • Projects are self-contained R sessions, including all objects, work spaces, scripts and histories
  • Support consistent and efficient work, and quick switches between different projects
  • Initiation via: Menu Projects > New Project

RStudio – Further reading

RStudio – Questions and activities

  • Time: 5 minutes
    • (Start RStudio)
    • Create a new Project ("R_Intro")
    • Create a new R script
      • Generate a sequence of values (x <- 1:10)
      • Assign to y the root of the square of x
      • Inspect environment and history

Data types and data structures

Data types – what?

  • Data types define how values are treated and which operations are possible
  • R uses the "common" data types as well as more specialised ones
  • Data types can be converted into each other to some extent
  • Data types follow a given hierarchy, from the most specific to the most general type:

logical → integer → double → complex → character

  • Conversion in the other direction results in loss of information
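
The hierarchy becomes visible when values of different types are combined. This is a small sketch with arbitrary example values; the object names are chosen for illustration only.

```r
## combining values coerces them to the most general type involved
t1 <- typeof(c(TRUE, 2L))    # logical and integer
t2 <- typeof(c(2L, 2.5))     # integer and double
t3 <- typeof(c(2.5, "a"))    # double and character
c(t1, t2, t3)

## converting in the other direction loses information
i <- as.integer(2.7)         # decimal places are cut off
l <- as.logical(c(0, 1, 5))  # every non-zero number becomes TRUE
```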

Data types – Numeric

  • In R, integer and double values are both numeric; numbers are stored as double by default
  • Definition with numeric()
  • Conversion with as.numeric() (decimal places are cut off by as.integer())
  • Check with is.numeric()
  • General data type query with typeof()

Data types – Complex

  • Complex numbers, containing real and imaginary part
  • Extraction of parts with Re() and Im()
  • Definition with complex()
  • Conversion with as.complex()
  • Query with is.complex()

Data types – Logical

  • Logical values, values that can only take two states: TRUE (1) or FALSE (0)
  • Usually the result of logical operations (Boolean Algebra)
  • Further calculations are possible (0 and 1)
  • Definition with logical()
  • Conversion with as.logical()
  • Query with is.logical()
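
A short sketch of how logical values arise and how they behave in calculations. The vector x and the threshold of 4 are arbitrary example values.

```r
x <- c(2, 5, 8, 11)

## a logical operation yields a logical vector
big <- x > 4

## TRUE counts as 1 and FALSE as 0 in further calculations
n_big <- sum(big)       # number of values above the threshold
share_big <- mean(big)  # share of values above the threshold
```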

Data types – Character

  • Literally "characters" are single letters, numbers, symbols, digits etc.
  • In R character values are strings, alphanumeric values of arbitrary length
  • character is the most open and least restrictive data type in R
  • Assignment with " "
  • Definition with character()
  • Conversion with as.character()
  • Query with is.character()

Data types – Time and Date

  • Time and date types… are a wide and open field
  • In R, time values are defined as POSIX (Portable Operating System Interface)
  • POSIXct (ct calendar time) and POSIXlt (lt local time)
  • POSIXct stores the number of seconds since a reference date: 1970-01-01 00:00:00 UTC
  • Definition and conversion can be an extensive chapter, thus only: as.POSIXct() and strptime()
  • Extraction of date parts with format() or strftime()
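
The following sketch shows the functions named above with an arbitrary example date. The time zone is set explicitly to "UTC" so that the result does not depend on local settings; the format strings are assumptions that match the example text only.

```r
## define a time value from text in the standard notation
t1 <- as.POSIXct("2018-01-01 12:00:00", tz = "UTC")

## internally this is the number of seconds since 1970-01-01 00:00:00 UTC
secs <- as.numeric(t1)

## strptime() parses text in other notations and returns POSIXlt
t2 <- strptime("01.01.2018 12:00", format = "%d.%m.%Y %H:%M", tz = "UTC")

## extract parts of the date
yr <- format(t1, "%Y")
hr <- format(t2, "%H")
```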

Data types – NA, NaN and NULL

  • Missing values have the value NA (by default of type logical), test with is.na()
  • Operations with non-defined results (e.g., 0 / 0) yield the value NaN (Not a Number), test with is.nan()
  • The NULL object is used to describe that an object is not present. NULL has length zero and contains no values that could be converted, test with is.null()
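
The three special cases can be told apart with their test functions. A minimal sketch with arbitrary example values:

```r
## NA marks a missing value inside an otherwise normal vector
x <- c(1, NA, 3)
na_pos <- is.na(x)
x_sum <- sum(x, na.rm = TRUE)  # many functions can skip missing values

## NaN results from an undefined operation
y <- 0 / 0
is.nan(y)

## NULL stands for an object that is not present
z <- NULL
is.null(z)
length(z)
```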

Data structures – what?

  • Define how data are organised and indexed
  • Structures are (to some extent) convertible
  • Query of structure with str()

Data structures – Vector

  • Vectors are the simplest data structure
  • A vector contains m values, organised as one column (col)
  • Vectors are one-dimensional structures (m:1)
  • Vectors can only hold values of one consistent data type
  • Creation with c() (combine) or seq() (sequence)
  • Query number of values, i.e., the length with length()
  • Examples: c(1, 2, 3, 4:10), seq(from = 1, to = 10, by = 2), c(c(1, 2, 3), seq(4, 6), 7:10)

Data structures – Matrix

  • A matrix has m rows (rows) and n columns (cols)
  • Matrices are two-dimensional objects (m:n)
  • Matrices can only hold values of one consistent data type
  • Creation with matrix(data, nrow, ncol)
  • Or creation with rbind() (row-bind) and cbind() (col-bind)
  • Conversion with as.matrix()
  • Flattening to a vector with as.numeric() (or as.character() etc.)

Data structures – Matrix

  • Query the number of all values with length()
  • Query number of rows, columns or dimensions: nrow(), ncol(), dim()
X <- matrix(data = 1:10, nrow = 2)
dim(X)
## [1] 2 5
nrow(X)
## [1] 2
  • Transposing with t(), extracting the diagonal with diag()
  • Adding further dimensions leads to an array (array(data, dim))
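
A brief sketch of the last two points. The matrix S and the array A are made-up examples; only base R functions are used.

```r
## transposing swaps rows and columns
X <- matrix(data = 1:10, nrow = 2)
Xt <- t(X)
dim(Xt)

## the diagonal of a square matrix (values are filled column by column)
S <- matrix(data = 1:9, nrow = 3)
d <- diag(S)

## a third dimension turns the structure into an array
A <- array(data = 1:24, dim = c(2, 3, 4))
dim(A)
```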

Data structures – Data frame

  • Data frames are the most common data structure in R
  • Consist of vectors of equal length but arbitrary data type
  • Data frames are "kind of" two-dimensional objects (m:n)
  • Columns (variables) should have names (define with names())
  • Rows (observations) may also have names (rownames())
  • Creation with data.frame()
  • Conversion with as.data.frame()
  • Flattening to a vector with unlist()

Data structures – Data frame

  • Query the length (i.e., number of variables) with length()
  • Query number of observations, variables, dimensions: nrow(), ncol(), dim()
x <- 1:3
y <- c("a", "b", "c")
d <- data.frame(x = x, y = y)

str(d)
## 'data.frame':    3 obs. of  2 variables:
##  $ x: int  1 2 3
##  $ y: Factor w/ 3 levels "a","b","c": 1 2 3
## (output of R < 4.0.0; since R 4.0.0, y stays chr unless stringsAsFactors = TRUE)
names(d)
## [1] "x" "y"

Data structures – List

  • Most flexible and least rigid data structure in R
  • Contains any data type of any length per element
  • The basic structure is a vector, but its elements can be of arbitrary length
  • Elements should have names (define with names())
  • Creation with list() or vector(mode = "list", length = n)
  • Conversion with as.list()
  • Flattening to a vector with unlist()

Data structures – List

  • Query the length (i.e., number of elements) with length()
x <- 1:3
y <- c("a", "b", "c")
l <- list(x = x, y = y)

str(l)
## List of 2
##  $ x: int [1:3] 1 2 3
##  $ y: chr [1:3] "a" "b" "c"
l
## $x
## [1] 1 2 3
## 
## $y
## [1] "a" "b" "c"

Data structures – S4 objects

  • Foundation of object-oriented programming in R
  • Somewhat (but not really) comparable with list, more restrictive
  • Consist of components of arbitrary data type and structure
data(volcano)
V <- raster::raster(x = volcano)

str(V)
## Formal class 'RasterLayer' [package "raster"] with 12 slots
##   ..@ file    :Formal class '.RasterFile' [package "raster"] with 13 slots
##   .. .. ..@ name        : chr ""
##   .. .. ..@ datanotation: chr "FLT4S"
##   .. .. ..@ byteorder   : chr "little"
##   .. .. ..@ nodatavalue : num -Inf
##   .. .. ..@ NAchanged   : logi FALSE
##   .. .. ..@ nbands      : int 1
##   .. .. ..@ bandorder   : chr "BIL"
##   .. .. ..@ offset      : int 0
##   .. .. ..@ toptobottom : logi TRUE
##   .. .. ..@ blockrows   : int 0
##   .. .. ..@ blockcols   : int 0
##   .. .. ..@ driver      : chr ""
##   .. .. ..@ open        : logi FALSE
##   ..@ data    :Formal class '.SingleLayerData' [package "raster"] with 13 slots
##   .. .. ..@ values    : num [1:5307] 100 100 101 101 101 101 101 100 100 100 ...
##   .. .. ..@ offset    : num 0
##   .. .. ..@ gain      : num 1
##   .. .. ..@ inmemory  : logi TRUE
##   .. .. ..@ fromdisk  : logi FALSE
##   .. .. ..@ isfactor  : logi FALSE
##   .. .. ..@ attributes: list()
##   .. .. ..@ haveminmax: logi TRUE
##   .. .. ..@ min       : num 94
##   .. .. ..@ max       : num 195
##   .. .. ..@ band      : int 1
##   .. .. ..@ unit      : chr ""
##   .. .. ..@ names     : chr ""
##   ..@ legend  :Formal class '.RasterLegend' [package "raster"] with 5 slots
##   .. .. ..@ type      : chr(0) 
##   .. .. ..@ values    : logi(0) 
##   .. .. ..@ color     : logi(0) 
##   .. .. ..@ names     : logi(0) 
##   .. .. ..@ colortable: logi(0) 
##   ..@ title   : chr(0) 
##   ..@ extent  :Formal class 'Extent' [package "raster"] with 4 slots
##   .. .. ..@ xmin: num 0
##   .. .. ..@ xmax: num 1
##   .. .. ..@ ymin: num 0
##   .. .. ..@ ymax: num 1
##   ..@ rotated : logi FALSE
##   ..@ rotation:Formal class '.Rotation' [package "raster"] with 2 slots
##   .. .. ..@ geotrans: num(0) 
##   .. .. ..@ transfun:function ()  
##   ..@ ncols   : int 61
##   ..@ nrows   : int 87
##   ..@ crs     :Formal class 'CRS' [package "sp"] with 1 slot
##   .. .. ..@ projargs: chr NA
##   ..@ history : list()
##   ..@ z       : list()
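
The comparison with a list can be illustrated without the 'raster' package. The sketch below defines a hypothetical class "Cuboid" (not part of any package) with three numeric slots and shows that components are accessed with @ and that wrong data types are rejected.

```r
library(methods)

## define a class with three numeric slots (hypothetical example class)
setClass("Cuboid", representation(a = "numeric", b = "numeric", c = "numeric"))

## create an object and access its components with @ instead of $
box <- new("Cuboid", a = 10, b = 100, c = 5)
box@a
slotNames(box)

## unlike a list, the object rejects components of the wrong type
bad <- tryCatch(new("Cuboid", a = "ten", b = 100, c = 5),
                error = function(e) "rejected")
```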

Data types and data structures – Questions I

  • Time: 5 - 10 minutes
    • Create a vector x from the numbers 0 to 5
    • Create a vector y from six arbitrary letters
    • Check data types of both objects
    • Combine the two vectors to a new vector z. Which data type do you expect for z?
    • Convert x to a logical vector x2
    • Calculate the sums of x, y and z

Data types and data structures – Questions II

  • Time: 5 - 10 minutes
    • Convert x into a matrix X with two rows
    • Query the length, number of rows and columns of X
    • Create a data frame D from x and y
    • Convert the data frame to the list L
    • Change the names of the elements of L to "A" and "B"
    • Explore the data structures of X, D and L. What are the differences and commonalities?

R packages – What?

R packages – What?

  • This course does cover
    • a brief idea of the concept of R packages
    • a discussion/justification of package contents
    • hands-on work to build a package
    • work beyond heaving a package onto the shelf and vanishing
  • You are looking for an introduction to R … sorry, bad luck!

R packages

Further resources

R packages

Prerequisites and computational minions

  • RStudio: The skin for R, a proper developer (and user) environment
    • Support for creating scripts, packages, reports, books, slides, …
    • Plots, data environment, history, file browser, help viewer
    • Auto completion, syntax highlighting, context help
    • Version control and project management
  • The R package 'devtools' by H. Wickham, J. Hester & W. Chang
  • The R package 'roxygen2' by H. Wickham, P. Danenberg & M. Eugster

R packages

Concepts

  • R lives from sharing code, and packages are the vehicles for this idea
  • Packages can be the pillars for open and reproducible science
  • Packages comprise the full set of self-contained components

Libraries are not packages! Libraries contain packages. You pull a package from a library.
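
The difference can be inspected from the console. This sketch uses the package 'stats', which ships with every R installation; the object names are arbitrary.

```r
## libraries are directories on disk that hold installed packages
libs <- .libPaths()

## find.package() reports where a package is installed ...
stats_path <- find.package("stats")

## ... and the directory above it is the library it lives in
stats_lib <- dirname(stats_path)

## library() pulls (attaches) a package from one of the libraries
library(stats)
```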

R packages

Some illusions

  • R packages have to go to the Comprehensive R Archive Network (CRAN)
  • Writing R packages is a lot of boring effort
  • You can start easy and add further details later
  • People will use your package in the way you designed it

R packages

Why write packages?

  • You want to share your code with others
  • The code is organised in a coherent way
  • You want to handle/distribute only a single file
  • Bundling code makes keeping track much easier than collecting scripts
  • The code is automatically tested (by your examples and other routines)

Working with packages simply saves time and brain cells

R packages – Contents

R package contents

An overview


What do you think? What belongs to an R package?

R package contents

An overview of obvious material

  • A name (yes, a package name)
  • A description (the meta information)
  • A function (yes, there are packages with only one function)
  • Documentation (you want to understand what the function does)
  • Working examples (you don't want to rely on the documentation alone)

  • A set of further stuff that will be covered later

R package contents

An overview

R package contents

The package name

  • This is (or should be) the hardest job when creating a package
  • Requirements
    • Only letters, numbers and periods are allowed
    • Start with a letter, do not end with a period
  • Advice
    • Pick a unique name you can google, best describing your package
    • Check if the name already exists beforehand
  • Example: 'praise' by G. Csardi & S. Sorhus
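
The formal requirements can be checked with a regular expression. The helper is_valid_pkg_name() is a hypothetical name made up for this sketch, and the pattern is an assumption derived from the rules above; it additionally requires at least two characters. It does not check whether a name is already taken.

```r
## check candidate names against the naming rules (hypothetical helper)
is_valid_pkg_name <- function(name) {
  grepl("^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$", name)
}

## names starting with a number, containing an underscore or
## ending with a period do not pass
candidates <- c("praise", "my.tools", "2fast", "my_tools", "tools.")
valid <- is_valid_pkg_name(candidates)
candidates[valid]
```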

R package contents

The DESCRIPTION file

  • The mandatory file that defines all the metadata of the package: name & short title, version & date, author & contact, license & dependencies:

R package contents

The DESCRIPTION file - Title and description

  • Title must be capitalised, only one line, not ending with a period
  • Description can be several sentences, but only one paragraph. Lines can contain at most 80 characters, and continuation lines must be indented by 4 spaces.
Title: Environmental Seismology Toolbox
Description: A collection of functions to handle seismic data for the
    purpose of investigating the seismic signals emitted by Earth surface
    processes. The package supports importing standard formats, data
    preparation and analysis techniques and data visualisation.
  • Both elements are important. They will be indexed, and search engines such as Google have become good at spotting R packages.

R package contents

The DESCRIPTION file – Dependencies

  • Dependency options in short (read more):
    • Depends: all packages your package essentially needs to run
    • Imports: will be covered by the namespace section
    • Suggests: optionally needed packages
    • LinkingTo: needed to reference C++/C libraries
  • CRAN became strict with the number of entries in 'Depends'. Use importFrom() in the file NAMESPACE instead.

R package contents

The DESCRIPTION file – Author information and roles

  • Author information can be defined more comprehensively (read more):
Authors@R: person("First", "Last", email = "first.last@example.com",
                  role = c("aut", "cre"))
  • Essential for correct citation of packages! Nota bene: Type citation("PACKAGENAME") to see how a package should be cited
  • Don't use fake email addresses. Otherwise, CRAN and users cannot communicate with you.

Since the end of 2017, CRAN supports ORCID

Authors@R: person(...,
                  comment = c(ORCID = "0000-0002-9079-593X"))
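
The same person() call can be tried in the console to see how R interprets the entry. The sketch uses the placeholder name and address from the example above; the ORCID comment is left out here to keep the example minimal.

```r
## build the author entry as R sees it (placeholder name and address)
author <- utils::person(given = "First", family = "Last",
                        email = "first.last@example.com",
                        role = c("aut", "cre"))

## inspect the formatted entry and the assigned roles
format(author)
author$role
```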

R package contents

The DESCRIPTION file – License issues

  • The key element to inform who can use the package for which purpose!
  • Either a link to a license file (License: file LICENSE) or a keyword of standard licenses (read more):
    • GPL-2 or GPL-3: copyleft licenses, other users must license their code GPL-compatibly. Common for CRAN submissions.
    • CC0: give away all rights, anything can be done with the code
    • BSD or MIT: permissive licenses, require additional file LICENSE.

R package contents

The DESCRIPTION file – Version patterns

  • Version numbers must be numeric and separated by a period.
  • They are more than just counters, they define dependency satisfactions

  • Format: MAJOR.MINOR.PATCH (start a released package with 0.1.0)
    • MAJOR releases should be rare
    • MINOR releases should keep the package up to date
    • PATCH releases may be frequent (but think of CRAN team time budget)
  • Make use of a NEWS file to announce version history and changes.
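
Version numbers are compared component by component, not as decimal numbers or as text. A small sketch with two made-up version numbers:

```r
## convert text to version objects
v_old <- package_version("0.9.1")
v_new <- package_version("0.10.0")

## MINOR release 10 is newer than MINOR release 9
is_newer <- v_new > v_old

## comparing the plain text would sort the versions the wrong way round
text_newer <- "0.10.0" > "0.9.1"

## compareVersion() returns 1, 0 or -1
cmp <- utils::compareVersion("0.10.0", "0.9.1")
```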

R package contents

Further contents

R package contents

R code

  • The actual function definitions
  • Will be covered by the next section

R package contents

Code documentation

  • The (second) most important part of a reasonable package
  • Omitting it means nobody will be able to use your package

  • Documentation in R is reference documentation, similar to dictionaries
  • Additional documentation is covered by vignettes (not covered here)

  • In R, documentation in *.Rd-files is formalised and follows a pseudo LaTeX scheme (short version, long version)

R package contents

Code documentation - code example

R package contents

Code documentation - how it looks like

R package contents

Code documentation

  • Why should you not attempt to build documentation files manually?
    • Tedious, clumsy, not intuitive
    • Easy to forget updating them after changing the function
  • Alternative: write documentation in function definitions
    • roxygen2
    • inlinedocs (no longer updated)
  • For the application, see next section

R package contents

Examples and example data

  • Working (and worked) examples are mandatory documentation items
  • Serve two purposes
    • Explain usage of the function
    • Create a test, executed every time the package is checked
  • Typically useful to include example data sets
    • Must be provided as *.rda files (generated with save())
    • Must be stored in directory data
    • Must be documented individually

R package contents

Further contents

  • Namespace (will be dealt with automatically, read more)
    • Defines function name assignments to packages
  • Compiled code (not covered here)
    • C++ code present in src will be compiled during installation
  • Shiny apps and other installed software (not covered here)
    • Further files/software in inst will be copied to the main directory of the installed package

Time for a short brainstorming break!

Pffffft, a lot of dense and boring input, right?

Brainstorming break

The task: Think about and collect the essential items for your own package. Note the results in a plain text document for later use. Time: about 4 minutes.

We will need this material soon to build a package, … an empty one, … which will finally only have one function.

Already need a short reminder? :)

Brainstorming break

  • package name
  • package title
  • package description
  • version
  • date
  • author(s)
  • maintainer (plus email address)
  • license
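
The collected items map directly to fields of the DESCRIPTION file. The sketch below writes them to a temporary directory with write.dcf() and reads them back; all values are placeholders, and in a real package the file would be edited in the package directory instead.

```r
## collect the metadata (all values are placeholders)
fields <- data.frame(
  Package = "toolbox",
  Title = "A Growing Set of Handy Functions",
  Version = "0.1.0",
  Date = "2018-01-01",
  Author = "First Last",
  Maintainer = "First Last <first.last@example.com>",
  Description = "This package contains one function to help solving problems.",
  License = "GPL-3",
  stringsAsFactors = FALSE
)

## write the fields in DESCRIPTION (dcf) format to a temporary file
desc_file <- file.path(tempdir(), "DESCRIPTION")
write.dcf(fields, file = desc_file)

## read the file back to check the result
desc <- read.dcf(desc_file)
desc[1, "Version"]
```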

Creating a package in RStudio

Creating a package in RStudio

Start with a new project: Menu >> File > New Project…

Alternatively use devtools::create("path/to/package/pkgname")

Alternatively (DON'T) use package.skeleton(). It will create an overloaded package template that needs more modification.

Creating a package in RStudio

The template as overview

  • A new RStudio-project is started (file PACKAGE_NAME.Rproj)
  • Close and delete the template function and its documentation straight away.
  • Configure the Build tools: Menu >> Tools > Project Options > Build Tools
    • make sure to check "Generate documentation with Roxygen".
  • The NAMESPACE contains only: exportPattern("^[[:alpha:]]+"), i.e., every function is exported!
  • The DESCRIPTION file contains auto-generated content. Modify it so that it fits your needs.
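
What the export pattern in the template NAMESPACE matches can be tested with grepl(). The function names below are made up for illustration; names starting with a period are the ones that would not be exported.

```r
## hypothetical function names in a package
fun_names <- c("mtp", "f", ".helper", "Plot2")

## apply the pattern from the template NAMESPACE
exported <- grepl("^[[:alpha:]]+", fun_names)

## everything that starts with a letter is exported
fun_names[exported]
```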

From scripts to functions

From scripts to functions

An unfortunate start …

a <- 10:50
print(a)
plot(a)
b <- 210:250
plot(a, b)
A <- a * b
c <- 5
V <- A * c
print(A)
print(V)
plot(a,V)

From scripts to functions

… ok …

a <- 10:50
b <- 210:250
c <- 5

A <- a * b
V <- A * c

plot(a)
plot(a, b)
plot(a,V)

print(a)
print(A)
print(V)

From scripts to functions

… better …

## define object geometry
a <- 10:50
b <- 210:250
c <- 5

## calculate area and volume
A <- a * b
V <- A * c

## plot object dimensions
plot(a)
plot(a, b)
plot(a, V)

## print values
print(a)
print(A)
print(V)

From scripts to functions

Wrapping code to functions

f <- function(a, b, c) {
  
  ## calculate area and volume
  A <- a * b
  V <- A * c

  ## plot object dimensions
  plot(a)
  plot(a, b)
  plot(a, V)

  ## return values
  return(list(A = A,
              V = V))
}

From scripts to functions

Wrapping code to functions

f <- function(a, b, c, plot = TRUE) {
  
  ## calculate area and volume
  A <- a * b
  V <- A * c

  ## optionally plot object dimensions
  if(plot == TRUE) {
    
    plot(a)
    plot(a, b)
    plot(a, V)
  }

  ## return values
  return(list(A = A,
              V = V))
}

From scripts to functions

Wrapping code to functions

f(a = 10, b = 100, c = 5, plot = FALSE)
## $A
## [1] 1000
## 
## $V
## [1] 5000

From scripts to functions

In words

  • Structure your script
    • Variable/object/argument definitions
    • Data/variable checks and automatic assignments
    • Data manipulation/evaluation
    • Optional further outputs
    • Return object creation
  • Wrap it into a function definition
    • FUNCTION_NAME <- function(ARGUMENT_1, ARGUMENT_2) {FUNCTION BODY}
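
Applied to the cuboid example, the structure could look like the sketch below. The function name cuboid() and the wording of the error message are assumptions made for illustration; the argument check is one possible way to implement the "data/variable checks" step.

```r
cuboid <- function(a, b, c, plot = FALSE) {

  ## data/variable checks
  if (!is.numeric(a) || !is.numeric(b) || !is.numeric(c)) {
    stop("Arguments a, b and c must be numeric!")
  }

  ## data manipulation/evaluation
  A <- a * b
  V <- A * c

  ## optional further output
  if (isTRUE(plot)) {
    plot(a, V)
  }

  ## return object creation
  return(list(A = A, V = V))
}

## valid input returns a list, invalid input stops with an error
result <- cuboid(a = 10, b = 100, c = 5)
failed <- tryCatch(cuboid(a = "ten", b = 100, c = 5),
                   error = function(e) conditionMessage(e))
```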

From scripts to functions

In words

  • Ooops, what did we forget?

  • Function documentation (in a separate file)

\name{f}
\alias{f}
\title{Calculate and plot cuboid areas and volumes.}
\usage{f(a, b, c, plot = TRUE)}
\arguments{
\item{a}{\code{Numeric} vector, length of the cuboid.}
\item{b}{\code{Numeric} vector, width of the cuboid.}
\item{c}{\code{Numeric} vector, height of the cuboid.}}
\value{A list with cuboid area and volume.}
\description{The function takes numeric vectors of the cardinal 
  dimensions of a cuboid object and calculates area and volume. The results can optionally be plotted.}
\examples{f(a = 10, b = 100, c = 5, plot = FALSE)}
\author{Michael Dietze}

From scripts to functions

Documentation using 'roxygen2'

  • So why not write the documentation into the function definition file?

Another brief function example

f <- function(x, p = 2) {
  ## calculate the power of x
  y <- x^p

  ## return value
  return(y)
}

From scripts to functions

Documentation using roxygen2

Can be rewritten like this:

#' @title Calculate the power of a vector               # TITLE
#' 
#' @description The function calculates something       # DESCRIPTION
#' 
#' @details The function simply combines the arguments. # DETAILS 
#'
#' @param x input vector                                # ARGUMENTS
#' @param p power exponent                              # ARGUMENTS
#' @return vector of the power p of x.                  # VALUE
#' @author Michael Dietze                               # AUTHOR(S)
#' @examples
#' f(x = 10, p = 3)                                     # EXAMPLES
#' @export                                              # NAMESPACE ENTRY
f <- function(x, p = 2) {                               # USAGE
  return(x^p)
}

From scripts to functions

Documentation using roxygen2

And becomes something like:

From scripts to functions

Documentation using roxygen2

  • 'roxygen2' is a package that parses function source files for tags (e.g., #' @param) and converts them to the structure of a *.Rd-file.

  • Usually the first line becomes the title (thus, keep it to one line)
  • Second set of lines becomes the description
  • Third and further sets of lines become the details (optional)

  • Further down follow tagged items

From scripts to functions

Documentation using roxygen2

  • @param - Function arguments, note argument and then description
  • @return - Function value
  • @examples - Examples section
  • @export - Namespace export, usually the function name
  • @seealso - Related functions to link to
  • @keywords - Well, keywords
  • @section - Arbitrary sections to further structure the documentation

From scripts to functions

Documentation using roxygen2

  • Further LaTeX-like tags to structure the text can be
    • Text formatting (\emph{}, \strong{}, \code{})
    • Links (\code{\link{}}, \href{}{})
    • Lists (\enumerate{}, \itemize{})
    • Equations (\eqn{}, \deqn{})
    • Tables (\tabular{}{\tab \cr})
  • Details see here, Example see source of Luminescence::analyse_baSAR.R

From scripts to functions

What else to document?

Are we finished? Not yet.

  • Data sets need to be documented, as well
  • The package should be documented, likewise

From scripts to functions

Documentation of data sets using roxygen2

  • Further tags
    • @format - Overview of data structure (copy-paste output of str())
    • @source - Source of the data set, e.g., the internet link
  • Documentation in a file called like the dataset and saved in R/DATA_SET.R
  • Alternatively, all documentation in package documentation (see next slide)
#' Ten numbers from 1 to 10
#'
#' A dataset containing ten ordered natural numbers
#'
#' @format A vector with 10 values:
#' int [1:10] 1 2 3 4 5 6 7 8 9 10
"x"

From scripts to functions

Documentation of packages using roxygen2

  • Similar to data sets, but the package is defined as a NULL-object
  • Definition of imports (entire packages or external functions)
  • Save as file called PACKAGE_NAME-package.R in R/
#' A package of diverse functions
#'
#' The package is used to store all my functions, save from my brain.
#'
#' @docType package
#' @name PACKAGE_NAME
#' @import stats
#' @importFrom utils read.table write.table
NULL

From scripts to functions

HAVE WE FINALLY LOST EVERYONE? :)

In a nutshell: documenting package and datasets means:

  • create an empty R file pkg_name-package.R in pkg_name/R
  • write the 'roxygen2' tags for package documentation (followed by NULL)
  • write the 'roxygen2' tags for data set documentation (followed by data set name)

From scripts to functions

HAVE WE FINALLY LOST EVERYONE? :)

#' eseis: Environmental Seismology Toolbox
#' 
#' This package eseis provides functions to read/write seismic data files...
#'
#' @name eseis
#' @aliases eseis
#' @docType package
#' @author Michael Dietze
#' @importFrom graphics image plot axis axis.POSIXct box mtext
NULL

#' Seismic trace of a rockfall event.
#' 
#' The dataset comprises the seismic signal of a rockfall.
#' 
#' @name rockfall
#' @docType data
#' @format The format is: num [1:98400] 65158 65176 65206 65194 65155 ...
#' @examples
#' ## load example data set and plot it.
#' data(rockfall)
#' plot(rockfall, type = "l")
"rockfall"

Time to move the fingers!

Pffffft, no more details please!

Hands-on session

Task A: Create a simple example dataset: a sequence of numbers from 1 to 10 called x. Save it as x.rda in your package. Maybe you need to create a directory called data, first?

Task B: Write a function that can multiply a numeric vector (x) by a constant (c, default is 1) and return the result. Document the function using roxygen2 tags, including an example.

Task C: Document the package and the example data set using roxygen2.

Time: 5-8 min

Hands-on session

Task A

x <- seq(from = 1, to = 10)
save(x, file = "data/x.rda")
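
If the directory data does not exist yet, save() fails. The sketch below creates it first and checks the result by loading the file again. It works in a temporary directory so that it can be run anywhere; the name "toolbox" is a placeholder for your package directory.

```r
## placeholder package directory inside a temporary directory
pkg_dir <- file.path(tempdir(), "toolbox")
data_dir <- file.path(pkg_dir, "data")

## create the data directory if it does not exist yet
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

## create and save the example data set
x <- seq(from = 1, to = 10)
save(x, file = file.path(data_dir, "x.rda"))

## load it into a separate environment to check the content
env <- new.env()
loaded <- load(file.path(data_dir, "x.rda"), envir = env)
```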

Hands-on session

Task B

#' Multiply a vector by a constant
#' 
#' The function uses simple R functionalities to multiply a numeric 
#' vector x by a constant c and returns the resulting vector.
#' 
#' @param x Numeric vector to be multiplied
#' @param c Numeric value multiplicator, default is \code{1}
#' @return Numeric vector, product of \code{x} and \code{c}
#' @author Michael Dietze
#' @examples
#' data(x)
#' mtp(x = x, c = 2)
#' @export mtp
mtp <- function(x, c = 1) {
  return(x * c)
}

Hands-on session

Task C

#' toolbox: A growing set of handy functions
#' 
#' This package contains one function to help solving problems in R.
#'
#' @name toolbox
#' @docType package
#' @author Michael Dietze
NULL

#' Numeric vector
#' 
#' The dataset contains a numeric vector from 1 to 10.
#' 
#' @name x
#' @docType data
#' @format The format is: int [1:10] 1 2 3 4 5 6 7 8 9 10
#' @examples
#' ## load example data set and plot it.
#' data(x)
#' plot(x)
"x"

Building a package

Building a package in RStudio

Ready to go!

  • Click on the "Build" tab in the upper right section of RStudio.
    • "Build & Reload" will compile the package and restart R
    • "Check" will run the R-internal check routines
    • "More" well, will show you the above and some more options
  • Click on "Check". See what happens. Are there any warnings? Notes? Errors?

Building a package in RStudio

Build a source version hardcopy

  • To share the package, you need to build a source version of it
    • Menu Build > Build Source Package
    • You find a PACKAGE_VERSION.tar.gz file in your R directory
  • Packages for Windows users need to be compiled in a different way
  • These packages are ready, they can be freely distributed and installed locally.

The deliberately omitted points

  • Test your functions
  • What if Check stops with an error? See where the error happened:
    • If in the examples, then check the respective function
    • If in the overhead tests before, well, … google for it
  • What if a package user reports an error?
    • Fix it, update package version and notify all relevant persons
    • Thus, use a versioning system (next section)
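
For the first point, a few explicit checks already go a long way. The sketch below tests the function mtp() from the hands-on session with base R only; the chosen test values are arbitrary, and packages such as 'testthat' offer a more complete framework for this.

```r
## function from the hands-on session
mtp <- function(x, c = 1) {
  return(x * c)
}

## stopifnot() stays silent if all checks pass and stops otherwise
stopifnot(
  identical(mtp(x = c(1, 2, 3), c = 2), c(2, 4, 6)),  # multiplication works
  identical(mtp(x = c(1, 2, 3)), c(1, 2, 3)),         # default leaves x unchanged
  length(mtp(x = 1:10, c = 3)) == 10                  # length is preserved
)
tests_passed <- TRUE
```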

Building a package in RStudio

Done!

Now, that is it. All that is left is for you to build your package. So, go ahead!

Accomplished