B. Tech Biosciences and Bioengineering
BT 307 Biological Data Analysis 2-0-2-6
Syllabus: Data, descriptive
statistics, and visualization: Introduction to different types of data in
biology; Descriptive statistics like mean, median, mode, quartiles, standard
deviation, standard error; Different types of plots like scatter plot, bar
graph, line graph, pie chart, box plot, frequency histogram; Understanding
error bars. Probability and probability distributions: basic concepts of
probability, conditional probability, Bayes' theorem; binomial, multinomial,
Poisson, exponential, and Gaussian distributions; Sampling distribution and
central limit theorem. Hypothesis testing: Student's t-test, Z-test,
Chi-squared test, ANOVA. Correlation, regression and estimation: Pearson
correlation; Regression: linear, non-linear, single and multivariate; concept
of likelihood and method of maximum likelihood. Tools for data from
high-throughput experiments: principal component analysis; Clustering of data:
K-means algorithm, hierarchical clustering; Visualization tools: heat map,
volcano plot. Laboratory component: R- and MS Excel-based exercises on graphical
visualization of data, different tests of hypothesis, estimation of
correlation, regression, PCA, clustering.
Texts:
1. S. Ross, A First Course in Probability, 9th Edition, Pearson Education India, 2014.
2. R. C. Elston and W. D. Johnson, Basic Biostatistics for Geneticists and Epidemiologists: A Practical Approach, 1st Edition, Wiley, 2008.
3. G. Hartvigsen, A Primer in Biological Data Analysis and Visualization Using R, 1st Edition, Columbia University Press, 2014.
References:
1. M. C. Whitlock and D. Schluter, The Analysis of Biological Data, 2nd Edition, W. H. Freeman & Company, 2014.
2. G. P. Quinn and M. J. Keough, Experimental Design and Data Analysis for Biologists, 1st Edition, Cambridge University Press, 2002.
3. M. D. Ugarte, A. F. Militino, and A. T. Arnholt, Probability and Statistics with R, 2nd Edition, CRC Press, 2016.