Predicting the Passing Probability for a Course with Logistic Regression of a Single Predictor

Classification: In many cases we encounter data that contain categorical variables instead of quantitative ones. An example is a dataset with a column called “Eye Colour” whose entries are values such as “Blue” or “Brown” rather than numbers. In some cases we can convert categorical data to numerical data, provided the assignment is appropriate. For example, if there are only two possible options, we can map one choice to the number 1 and the alternative to the number 0....
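The two-option mapping described in this excerpt can be sketched in a few lines of Python. The column values and the `encoding` dictionary below are made up for illustration and are not taken from the post itself.

```python
# Hypothetical two-valued column; the values are invented for illustration.
outcomes = ["Pass", "Fail", "Pass", "Pass", "Fail"]

# Map one choice to 1 and the alternative to 0.
encoding = {"Pass": 1, "Fail": 0}
encoded = [encoding[value] for value in outcomes]

print(encoded)
```

Which option receives 1 is a free choice; it only determines which outcome the model's predicted probability refers to.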

Armandt Erasmus

Predictive Modelling of Breast Cancer Diagnosis using K-Nearest Neighbors Algorithm

In this classification project we use the K-Nearest Neighbors (KNN) algorithm on the Breast Cancer Wisconsin (Diagnostic) dataset. The dataset contains 30 features extracted from a digitized image of a fine needle aspirate (FNA) of a breast mass. Our task is to develop a predictive model that accurately classifies tumors as either benign or malignant given a set of features....
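As a rough illustration of the idea behind KNN, here is a minimal pure-Python sketch that classifies a point by majority vote among its nearest neighbours. It assumes Euclidean distance and uses invented two-feature points rather than the 30 features of the Wisconsin dataset; the function name `knn_predict` is hypothetical and this is not the post's own implementation.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up two-feature points, NOT taken from the Wisconsin dataset.
train_points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
                (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_labels = ["benign", "benign", "benign",
                "malignant", "malignant", "malignant"]

prediction = knn_predict(train_points, train_labels, (2.9, 3.1), k=3)
print(prediction)
```

With two classes, an odd `k` avoids tied votes. On real data the features would usually be scaled first so that no single feature dominates the distance.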

Armandt Erasmus

TV Sales Prediction with Linear Regression Analysis

Linear Regression Analysis: Predictive analytics is a sub-field of statistics that aims to predict future events with reasonable accuracy using data from past events. A well-known and widely used model in predictive analytics is the regression model. Regression models use input data together with a well-defined function to produce an estimated value. Depending on the scenario, one might use simple linear regression, a regression technique for a single input variable....
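To make the single-input case concrete, the sketch below fits a straight line by ordinary least squares, one common way to estimate a simple linear regression. The excerpt does not say which fitting method the post uses, so treat that choice as an assumption; the function name and the data are invented for illustration.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on the line y = 2x + 1.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_linear_regression(x, y)
# Estimated value for a new input of 5.0.
estimate = slope * 5.0 + intercept
print(slope, intercept, estimate)
```

The fitted line is the "well-defined function" here: once the slope and intercept are known, an estimate for any new input is a single multiplication and addition.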

Armandt Erasmus