Introduction to R
Offered by: | LuciLinX with Petr Nazarov |
Level: | Beginners |
When: | 17 October 2011 |
Where: | University of Luxembourg, Bâtiment des Sciences, Room 0.11 |
Duration: | Session 1: 9h00 - 13h00; Session 2: 14h00 - 17h00 |
Requirements: | PCs available, but you can bring your own too |
Restrictions: | Maximum 20 participants, first come, first served |
Registration: | By email to Petr Nazarov |
Contact: | petr DOT nazarov AT crp-sante DOT lu |
Description:
This course provides a kick-start for those interested in using the R language to analyze and manipulate their experimental data. We start from basic expressions and work through the topics up to multivariate ANOVA. The workshop is split into two sessions: the first focuses on the basics of R programming, the second on statistical data analysis with R. Many examples, mostly biological, will be given during the workshop. Participants are encouraged but not obliged to bring their own laptops.
Plan
Session I. Introduction. Data manipulation and visualization (9.00 – 13.00)
- 1. General introduction
  - a. main features (scripting, Win/Linux, export of results) and the diversity of tasks/applications
  - b. information package
  - c. download and installation
  - d. help system
- 2. Basic operations
  - a. mathematical operations and variables
  - b. types of data (bool, int, double, character, factors)
  - c. vectors and operations
  - d. matrices, data frames and operations
  - e. operations on strings: paste, sprintf, sub, grep
  - f. selection of subgroups in the data (indexes)
  - g. control flow: if, for and while
- 3. Data import and export
  - a. import data tables from local disks and the internet
  - b. export data
  - c. save and load variables in binary format
- 4. Data visualization
  - a. plot, barplot
  - b. multiple plots
  - c. hist, heatmap
- 5. Custom functions
  - a. custom functions
  - b. loading and sharing functions
Session II. Statistical data analysis (14.00 – 17.00)
- 6. Descriptive statistics
  - a. selection of subgroups in the data
  - b. mean, median, var, sd, mad
  - c. summary()
  - d. plotting a histogram
- 7. Principal components and clustering methods
  - a. PCA and its application
  - b. k-means clustering, heatmap
- 8. Statistical tests and random numbers
  - a. random number generation
  - b. tests for the means: t.test, wilcox.test
  - c. tests for variances and distribution models: var.test, pearson.test, shapiro.test, ks.test
- 9. Model fitting
  - a. regression
  - b. ANOVA
  - c. multivariate ANOVA