Multi-variate Analyses and Machine Learning
============================================

1. Briefly explain the purpose of unfolding procedures. Describe the commonly used approaches.
2. Rectangular cuts and the Fisher discriminant. Explain the methods and how their performance can be optimised.
3. Decision trees: How is the classification of the final leaves defined? How does one measure the quality of the predictions (error and accuracy)? How does one measure the performance?
4. The idea of ensemble classifiers and boosting: Explain the concept of weighted weak classifiers and weighted data. Write down the formula for the final model prediction.
5. How do we assess the performance of ML algorithms? Explain the terms "training error", "validation error", "generalization error" and "test error". What does "cross-validation" mean? Draw an illustrative plot of how these errors typically behave as a function of regression model complexity. What does "over-fitting" mean? How can we mitigate it by adding an extra term to the cost function?
6. We measure the performance of a classifier using the "classification error", "classification accuracy" and the "confusion matrix". Explain what each of these means. What is the problem of "class majority"?
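
As a small illustration for question 6, the sketch below computes the confusion matrix, classification accuracy and classification error for a trivial classifier that always predicts the majority class. It is a minimal sketch, not a reference answer: the helper names `confusion_matrix` and `accuracy`, the label convention (1 = signal, 0 = background) and the toy sample of 95 background and 5 signal events are all assumptions made up for this example.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels (assumed: 1 = signal, 0 = background)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # signal predicted as signal
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # background predicted as signal
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # signal predicted as background
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # background predicted as background
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of correctly classified events."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / len(y_true)


# Made-up, strongly imbalanced toy sample: 95 background, 5 signal events.
y_true = [0] * 95 + [1] * 5

# Trivial "classifier" that always predicts the majority class (background).
y_majority = [0] * len(y_true)

tp, fp, fn, tn = confusion_matrix(y_true, y_majority)
acc = accuracy(y_true, y_majority)
err = (fp + fn) / len(y_true)   # classification error = 1 - accuracy
recall = tp / (tp + fn)         # fraction of true signal that was found

print(f"confusion matrix (tp, fp, fn, tn) = {(tp, fp, fn, tn)}")
print(f"accuracy = {acc:.2f}, error = {err:.2f}, signal recall = {recall:.2f}")
```

On this toy sample the accuracy is 0.95 although not a single signal event is identified (recall 0.00). This is the class-majority problem: on imbalanced data a high accuracy can be reached by ignoring the minority class, so the confusion matrix is more informative than accuracy alone.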