Statistics and Data Science Program Highlights

Authors
  • Alex Salce

Statistics and Data Science MS GIDP Highlights

Qualifying Exam

The Statistics and Data Science Qualifying Exam comprises a Theory section and a Methodology section. Each section has six problems, and students select five to answer. The exams were held August 3rd and 4th, 2023. I received a PhD Pass on the Methodology portion of the exam and an MS Pass on the Theory portion. Below are some reference materials that I created for the exams.

Theory

A reference notebook for various distributions and their related distributions, general applications, and pertinent properties for theoretical analysis.

QE Distributions

Please note that this is for personal use only, and uses uncredited images and information compiled from miscellaneous sources. The source LaTeX file is available here and images for the file here, in case you would like to add to or modify it (it has not been independently reviewed in its entirety, so there could be errors).

Additional miscellaneous reference material useful in statistical theory problems and applications.

QE Misc

Please note that this is for personal use only, and uses uncredited images and information compiled from miscellaneous sources. The source LaTeX file is available here, in case you would like to add to or modify it (it has not been independently reviewed in its entirety, so there could be errors).

Methodology

Regression concepts covered in the Methodology portion.

  • SLR & MLR formulation

  • ANOVA

  • AIC and BIC

  • Multicollinearity

  • Polynomial models

  • Interaction models

  • Qualitative predictors

  • Leverage, Influence measures, diagnostics

  • Weighted Least Squares

  • Loess, Ridge Regression

  • Bootstrapping

  • Regression pipeline

    • Plot
    • Transform Data
    • Check correlations
    • Create Linear Model
    • Assess interactions, higher order terms, variable selection
    • Build reduced model
    • Assess reduced model
    • Check assumptions of normality, constant variance, outliers, etc.
    • Check for high influence/leverage points
    • Check VIF
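
A few of the pipeline steps above (checking correlations, fitting a linear model, flagging high-leverage points, and checking VIF) can be sketched in plain Python. This is only a minimal illustration under simplifying assumptions: the data are made up, the function names are hypothetical, the fit is a simple linear regression, and the VIF shortcut holds only when there are exactly two predictors. A real analysis would normally use a statistics package.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def fit_slr(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - b1 * mx, b1

def leverages(x):
    """Hat-matrix diagonal for simple linear regression."""
    n = len(x)
    mx = sum(x) / n
    sxx = sum((a - mx) ** 2 for a in x)
    return [1 / n + (a - mx) ** 2 / sxx for a in x]

def vif_two_predictors(x1, x2):
    """With exactly two predictors, both share VIF = 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Made-up illustrative data (not from any exam or project).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 12.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 24.0]

b0, b1 = fit_slr(x1, y)
h = leverages(x1)
p = 2  # number of fitted coefficients (intercept and slope)
# Common rule of thumb: flag observations with leverage above 2p/n.
flagged = [i for i, hi in enumerate(h) if hi > 2 * p / len(x1)]
vif = vif_two_predictors(x1, x2)
print(f"slope={b1:.3f}, high-leverage indices={flagged}, VIF={vif:.2f}")
```

The leverages always sum to the number of fitted coefficients, which is a quick sanity check on the calculation.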

Experimental designs covered in the Methodology portion.

  • Completely Randomized Design (CRD)
  • Randomized Complete Block Design (RCBD)
  • Latin Square Block Design
  • Graeco-Latin Square Block Design
  • Balanced Incomplete Block Design
  • Factorial Design
  • Crossed Design
  • Blocked Factorial Design
  • 2^k Factorial Design
  • Fractional 2^k Factorial Design
  • Random Effect Models
  • Mixed Effect Models
  • Nested Design
  • Split-Plot

Fall 2022 / Spring 2023

Statistical DOE

For my Statistical Design of Experiments course final project, I performed a 2^k factorial design experiment using three free throw shooting technique factors for analysis. Surprisingly, I found that bending my legs and using a “spring” motion had a statistically significant effect on my shot. Here is a blog-style summary of the project, the results, and some other insights.

Statistical DOE
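
For readers unfamiliar with how effects are estimated in a two-level factorial design, here is a minimal sketch in Python. The factor labels A, B, and C and the response values are invented for illustration and are not the project's factors or data; `factorial_effects` is a hypothetical helper. Each effect is the contrast between the runs at the high and low levels, divided by half the number of runs.

```python
from itertools import combinations, product
from math import prod

def factorial_effects(factors, responses):
    """Estimate main and interaction effects for an unreplicated 2^k design.

    `responses` maps each run, a tuple of -1/+1 factor levels,
    to its observed response.
    """
    k = len(factors)
    runs = list(product((-1, 1), repeat=k))
    half = len(runs) / 2
    effects = {}
    for order in range(1, k + 1):
        for idx in combinations(range(k), order):
            # The contrast column is the product of the chosen factor columns.
            contrast = sum(
                prod(run[i] for i in idx) * responses[run] for run in runs
            )
            name = ":".join(factors[i] for i in idx)
            effects[name] = contrast / half
    return effects

# Invented responses from a known model, so the estimates can be checked:
# y = 50 + 3*A - 1*B + 0.5*A*C
factors = ["A", "B", "C"]
responses = {
    (a, b, c): 50 + 3 * a - 1 * b + 0.5 * a * c
    for a, b, c in product((-1, 1), repeat=3)
}
effects = factorial_effects(factors, responses)
for name, value in effects.items():
    print(f"{name}: {value:+.2f}")
```

Because each effect measures the change from the low level to the high level, it equals twice the corresponding model coefficient. Judging statistical significance would additionally require replicated runs or another estimate of error, which this sketch leaves out.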

Fall 2023

Machine Learning and NBA Statistics

My Statistical Machine Learning final project used a dataset of all active NBA players' individual game statlines (including advanced stats) that I assembled by querying the NBA API via Python. The data was used to train classification models that can predict a player's position (PG, SG, SF, PF, C) from an input statline.

Statistical Machine Learning

The project covers different approaches to the model, which model gives the best predictions, and which is most efficient. This simple novelty analysis may not have immediate practical applications, but it has been invaluable in giving me a framework for future analyses of NBA data that can be refreshed with the latest statistics.
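
To make the classification task concrete, here is a toy sketch in Python. It uses a nearest-centroid classifier, chosen only for brevity and not necessarily one of the models used in the project. The statlines, the three features, and the function names are all made up for illustration.

```python
from math import dist

def fit_centroids(statlines, positions):
    """Average the feature vectors belonging to each position label."""
    sums, counts = {}, {}
    for row, pos in zip(statlines, positions):
        acc = sums.setdefault(pos, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[pos] = counts.get(pos, 0) + 1
    return {pos: [s / counts[pos] for s in acc] for pos, acc in sums.items()}

def predict_position(centroids, statline):
    """Return the position whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda pos: dist(centroids[pos], statline))

# Made-up statlines: (assists, rebounds, blocks) in a single game.
train_X = [(9.0, 3.0, 0.2), (7.0, 4.0, 0.3), (2.0, 11.0, 2.1), (1.5, 12.0, 1.8)]
train_y = ["PG", "PG", "C", "C"]

centroids = fit_centroids(train_X, train_y)
print(predict_position(centroids, (8.0, 3.5, 0.1)))
```

With real data, the features would usually be standardized first so that no single statistic dominates the distance.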

Decision Analysis

An original exercise in the detailed decision analysis of a major decision affecting all Engineering directorates of a major corporation.

Decision Analysis

The project details the extensive analysis a company faces when selecting a new corporate standard electrical design tool. It uses decision analysis techniques such as value modeling, decision trees, risk assessment, and other Engineering decision techniques to provide the objectively best recommendation to the decision maker based upon available data.
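
As a small illustration of value modeling, one of the techniques mentioned above, here is a sketch of an additive value model in Python. The criteria, weights, scores, and tool names are entirely hypothetical and are not taken from the project. Each alternative's overall value is the weighted average of its scores on each criterion.

```python
def weighted_value(scores, weights):
    """Additive value model: weighted average of per-criterion scores (0 to 1)."""
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight

def rank_alternatives(alternatives, weights):
    """Sort alternative names from highest to lowest overall value."""
    return sorted(
        alternatives,
        key=lambda name: weighted_value(alternatives[name], weights),
        reverse=True,
    )

# Hypothetical criteria weights and scores, for illustration only.
weights = {"cost": 0.5, "usability": 0.3, "support": 0.2}
alternatives = {
    "Tool A": {"cost": 0.6, "usability": 0.9, "support": 0.5},
    "Tool B": {"cost": 0.8, "usability": 0.3, "support": 0.7},
}

ranking = rank_alternatives(alternatives, weights)
for name in ranking:
    print(name, round(weighted_value(alternatives[name], weights), 3))
```

A fuller analysis would also vary the weights to see how sensitive the ranking is to them.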

Spring 2024

Machine Learning Optimization

Machine Learning Optimization

Fall 2024

Coming late 2024