Introduction to R

Offered by: LuciLinX with Petr Nazarov
When: 17 October 2011
Where: University of Luxembourg, Bâtiment des Sciences, Room 0.11
Duration: Session 1: 9h00 - 13h00; Session 2: 14h00 - 17h00
Requirements: PCs are available, but you can also bring your own
Restrictions: Maximum 20 participants, first come, first served
Registration: By email to Petr Nazarov
Contact: petr DOT nazarov AT crp-sante DOT lu


This course provides a kick-start for those interested in using the R language to analyse and manipulate their experimental data. We start from basic expressions and work up to multivariate ANOVA. The workshop is split into two sessions: the first covers the basics of R programming, and the second covers statistical data analysis with R. Many examples, mostly with a biological orientation, will be given during the workshop. Participants are encouraged, but not obliged, to bring their own laptops.


Session I. Introduction. Data manipulation and visualization (9.00 – 13.00)
  • 1. General introduction
    • a. main features (scripting, Win/Linux, export results) and the diversity of tasks/applications
    • b. information, package and installation system
  • 2. Basic operations
    • a. mathematical operations and variables
    • b. types of data (logical, integer, double, character, factor)
    • c. vectors and their operations
    • d. matrices, data frames and their operations
    • e. operations on strings: paste, sprintf, sub, grep
    • f. selecting subgroups in the data (indexing)
    • g. control flow: if, for and while
  • 3. Data import and export
    • a. import data tables from local disks and the internet
    • b. export data and load the variables in binary format
  • 4. Data visualization
    • a. plot, barplot
    • b. multiple plots
    • c. hist, heatmap
  • 5. Custom functions
    • a. custom functions
    • b. loading and sharing functions
Session II. Statistical data analysis (14.00 – 17.00)
  • 6. Descriptive statistics
    • a. selecting subgroups in the data
    • b. mean, median, var, sd, mad
    • c. summary()
    • d. plotting histograms
  • 7. Principal components and clustering methods
    • a. PCA and its application
    • b. k-means clustering, heatmap
  • 8. Statistical tests and random numbers
    • a. random number generation
    • b. tests for the means: t.test, wilcox.test
    • c. tests for the variances and the distribution model: var.test, pearson.test, shapiro.test, ks.test
  • 9. Model fitting
    • a. regression
    • b. ANOVA
    • c. multivariate ANOVA