Undersampling is a resampling technique for dealing with imbalanced data in machine learning: it removes examples from the overrepresented class or classes. Imbalanced data means that some classes or labels are overrepresented or underrepresented in the ...
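As a rough illustration of the idea, here is a minimal sketch of random undersampling using only the Python standard library. The function name `random_undersample`, the `seed` parameter, and the toy 95/5 data are all assumptions made for this example, not something taken from the text above. It shows just one simple variant: randomly dropping rows from larger classes until every class is as large as the smallest one.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly keep n_min rows per class, where n_min is the smallest class size.

    Hypothetical helper for illustration; `rows` and `labels` are parallel lists.
    """
    rng = random.Random(seed)

    # Group row indices by their class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    # Every class is cut down to the size of the rarest class.
    n_min = min(len(indices) for indices in by_class.values())

    keep = []
    for indices in by_class.values():
        # Sample without replacement; the smallest class is kept whole.
        keep.extend(rng.sample(indices, n_min))

    keep.sort()  # preserve the original row order
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Synthetic toy data (an assumption for this sketch): 95 rows of class 0, 5 of class 1.
labels = [0] * 95 + [1] * 5
rows = [[float(i)] for i in range(len(labels))]

rows_bal, labels_bal = random_undersample(rows, labels)
print(Counter(labels_bal))
```

Because the majority rows are discarded at random, some information is lost; in practice it is usually safer to undersample only the training split and leave the evaluation data at its original class ratio.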
I'd like to build a categorical predictive model using binary classification (0, 1). The dataset is composed of about 300K data points, but the classes are imbalanced (289K elements with 0 ...