stripplot of groups and points in time (lattice package)

R is an open-source software environment for statistical computing. It is free software under the GPL, and anyone can start using it quite easily. S-PLUS, the commercial implementation of the S language from which R descends, is not free. Fortunately, R runs on several operating systems, such as Linux, Mac or the usual Win32 environment. It seems that R is becoming the de facto standard for statistical work. Almost every analysis is available, and what is not available you can code yourself. Its graphics capabilities are enormous, and it can also process text (i.e. ‘strings’), so qualitative data analysis is possible as well. R is mostly command-line oriented, but there are some GUIs for beginners and some editors that control R (Tinn-R, ESS). CRAN has an overview of the available R packages, of which there are more than 500! There are also many tutorials for learning R in general or specific aspects of this powerful computing language, and by now a lot of books as well (see references below).

I have used R in various social science research projects. Of course, there is always some specific analysis for which third-party software offers a feature that R does not (yet) have. But I don’t think I will ever use another tool for statistical computing. R is just too good.

The following short tutorials, hints and excerpts are on my blog:

In the following, all books and tutorials I have not read personally are marked.


boxplot (from a simulation)

Tutorials and free books:


histogram and density plot (MCAR test)


  • Venables, William N. and Smith, D.M. (1991-2001). An introduction to R. Bristol, UK: Network Theory Limited. © R Development Core Team. R-Code, Errata.
  • Dalgaard, Peter (2002). Introductory statistics with R. New York: Springer. R-Code, Errata.
  • Pinheiro, Jose C. and Bates, Douglas M. (2002). Mixed-Effects Models in S and S-PLUS. Springer.
  • Fox, John (2002). An R and S-PLUS companion to applied regression. Thousand Oaks: Sage. R-Code, Errata.
  • Handl, Andreas (2002). Multivariate Analysemethoden. Theorie und Praxis multivariater Verfahren unter besonderer Berücksichtigung von S-PLUS. Berlin: Springer. Available online as an e-book from QUANTLET, R-Code, Errata.
  • Maindonald, John and Braun, John (2003). Data Analysis and Graphics Using R – An Example-Based Approach. Cambridge University Press. R-Code, Errata.
  • Venables, William N. and Ripley, Brian D. (2003). Modern Applied Statistics with S. Springer.
  • Dolić, Dubravko (2003). Statistik mit R. Oldenbourg. (not read)
  • Huet, Sylvie; Bouvier, Anne and Gruet, Marie-Anne (2003). Statistical Tools for Nonlinear Regression. New York: Springer. (not read)
  • Kelly, Laurie and Faraway, Julian James (2004). Linear Models with R. Chapman & Hall/CRC. R-Code, Errata.
  • Schlittgen, Rainer (2004). Statistische Auswertungen mit R. Oldenbourg. (not read)
  • Verzani, John (2004). Using R for Introductory Statistics. Chapman & Hall/CRC.
  • Vittinghoff, Eric; Glidden, David V. and Shiboski, Stephen C. (2004). Regression Methods in Biostatistics. Springer.
  • Everitt, Brian S. (2005). An R and S-Plus Companion to Multivariate Analysis. Springer.
  • Ligges, Uwe (2005). Programmieren mit R. Berlin: Springer. R-Code, Errata.

Tinn-R and R working together (win32)

GUIs for the helpless (I do not recommend using a GUI: when doing statistics you should know what you are doing, and I don’t think any GUI supports this! I hope that not using a GUI triggers ‘thinking instead of clicking’). Interesting websites you should visit:

  • GUIs on CRAN
  • ESS – Emacs Speaks Statistics (Linux; Win32 customization by John Fox)
  • Tinn-R – an editor to control R (Win32)
  • SciViews-R – a GUI add-on for R (Win32)
  • Rpad – online R (via browser or via a local network)
  • Quantian – a scientific live Linux CD/DVD based on Debian (with R and much more)
  • JGR – Java GUI for R

A collection of small R scripts I used for my own work (Gürtler, 2005):

  • Power Set Count
    • requires the ‘ContAna.R’ package by Hartmut Oldenbürger and the ‘subsets()’ package (based on ‘combinations()’ from CRAN)
    • Generating the power set count – PotenzM.R
    • Generating the set union of all propositions – Pcodes.R
    • Counting the power set count and generating a ‘power set count × persons’ matrix – PotenzA.R
  • Test on hierarchical clustering (Oldenbürger, 1981)
  • Optimal cut through a proximity matrix (Oldenbürger, 1981)
    • Empirical analysis of the optimal cut → prototype vector → prototype
    • Generating a (0,1) adjacency matrix based on the prototype vector
    • Visualization of cophenetic correlations
    • optSchnitt.R
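
The building blocks named in the clustering items above (proximity matrix, hierarchical clustering, cophenetic correlation, a cut through the tree, a (0,1) adjacency matrix) are all available in base R. The following is only a minimal sketch on made-up toy data. It is not optSchnitt.R and does not implement Oldenbürger's optimal-cut criterion; the toy matrix, the average-linkage method and the choice of k = 2 groups are assumptions for illustration.

```r
# Toy data: 6 objects in two well-separated groups (invented for this sketch).
x <- rbind(c(0, 0), c(0, 1), c(1, 0),
           c(10, 10), c(10, 11), c(11, 10))
rownames(x) <- paste0("obj", 1:6)

d  <- dist(x)                        # proximity (distance) matrix
hc <- hclust(d, method = "average")  # hierarchical clustering

# Cophenetic correlation: how well the tree reproduces the original distances.
coph_cor <- cor(as.vector(d), as.vector(cophenetic(hc)))

# Cut the tree into k groups (k = 2 is picked by hand here, not optimized).
groups <- cutree(hc, k = 2)

# (0,1) adjacency matrix: 1 if two different objects fall into the same group.
adj <- outer(groups, groups, "==") * 1L
diag(adj) <- 0L
```

Calling plot(hc) draws the dendrogram, which is the usual starting point for inspecting where a cut might go.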


    Jeffreys-Carnap / Bayes-Laplace posterior distribution

  • Some R scripts for easier work with AQUAD 6
    • Generating huge tables for AQUAD 6 – MaAqdtab.R
    • Reading AQUAD 6 sequential codes and saving them as .csv tables – ReadAqdtab.R
    • Counting codes from AQUAD 6 code files when there are very many files (columns = files, rows = codes) – ReadoutCodes.R
    • Automatic generation of ‘hypotheses of connection’ for AQUAD 6 (the output can be read automatically by AQUAD 6) – hyp2aqd.R
  • ‘On the difference in means’ (implementation of G.L. Bretthorst’s ingenious Bayesian solution to the Behrens-Fisher problem, Bretthorst, 1993; see also Studer, 1998)
    • …to come… as a Mathematica script as well as an R script
    • Status: it works for small sample sizes (up to n=100) but not for large ones (the intermediate numbers become too large, so the whole script has to be rewritten to work with log()s)
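
The kind of overflow described in the status note, and the usual remedy of working on the log scale, can be illustrated with a small sketch. This is not the Bretthorst script: the factorial ratio is just a convenient example of numbers that are too large, and log_sum_exp() is a hypothetical helper name introduced here.

```r
# Naive evaluation: both factorials overflow to Inf in double precision,
# so the ratio is NaN instead of 200 * 199 = 39800.
naive_ratio <- factorial(200) / factorial(198)

# Same ratio on the log scale: lgamma(n + 1) equals log(n!).
log_ratio    <- lgamma(201) - lgamma(199)
stable_ratio <- exp(log_ratio)

# Sums of very large terms can be handled the same way: factor out the
# largest term before exponentiating (the "log-sum-exp" trick).
log_sum_exp <- function(log_x) {
  m <- max(log_x)
  m + log(sum(exp(log_x - m)))
}

log_terms      <- c(1000, 1000)             # logs of two huge numbers
naive_log_sum  <- log(sum(exp(log_terms)))  # exp(1000) overflows, result is Inf
stable_log_sum <- log_sum_exp(log_terms)    # 1000 + log(2)
```

The point of the design is that products and ratios become sums and differences of logs, and only the final, moderately sized result is ever exponentiated.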