First, we observe that the feature Credit_Product has missing values. We impute them by replacing every NaN with the category 'Unknown'.
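A minimal sketch of this imputation step, assuming the data is held in a pandas DataFrame. The toy rows below are made up for illustration; only the column names Credit_Product and Is_Lead come from the description above.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data (values are illustrative only).
df = pd.DataFrame({
    "Credit_Product": ["Yes", np.nan, "No", np.nan],
    "Is_Lead": [1, 1, 0, 0],
})

# Treat missingness as its own category instead of dropping the rows.
df["Credit_Product"] = df["Credit_Product"].fillna("Unknown")

print(df["Credit_Product"].tolist())
```

Keeping 'Unknown' as an explicit category lets a tree-based model learn from the fact that the value was missing.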
After training on the dataset, we find that Credit_Product has the highest feature importance. We therefore split the "Unknown" value into "U1" and "U0" according to the target variable Is_Lead.
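One way this split could look in pandas is sketched below. It assumes "U1" marks Unknown rows with Is_Lead equal to 1 and "U0" marks those with Is_Lead equal to 0, and it can only be applied to training rows, since the target is not available for test data. The toy rows are made up.

```python
import numpy as np
import pandas as pd

# Toy training frame after imputation (values are illustrative only).
train = pd.DataFrame({
    "Credit_Product": ["Yes", "Unknown", "No", "Unknown"],
    "Is_Lead": [1, 1, 0, 0],
})

# Select only the rows whose Credit_Product was imputed as "Unknown".
unknown = train["Credit_Product"] == "Unknown"

# Assumed mapping: Is_Lead == 1 -> "U1", otherwise -> "U0".
train.loc[unknown, "Credit_Product"] = np.where(
    train.loc[unknown, "Is_Lead"] == 1, "U1", "U0"
)

print(train["Credit_Product"].tolist())
```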
Next, we want to predict the correct Credit_Product value from the rest of the dataset, so we train a RandomForestClassifier to classify Credit_Product. After training, we add its predicted class probabilities for Credit_Product as new features to both the train data and the test data.
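The following is a rough sketch of that idea using scikit-learn. The feature columns Age and Vintage, the toy values, the hyperparameters, and the `CP_proba_` column prefix are all assumptions made for illustration, not the original settings.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Toy data; Age and Vintage are hypothetical stand-ins for the other features.
train = pd.DataFrame({
    "Age": [25, 40, 33, 58, 47, 29],
    "Vintage": [12, 60, 30, 90, 75, 15],
    "Credit_Product": ["No", "Yes", "U0", "U1", "Yes", "No"],
})
test = pd.DataFrame({"Age": [31, 52], "Vintage": [20, 80]})

features = ["Age", "Vintage"]

# Fit a classifier whose target is Credit_Product itself.
rf = RandomForestClassifier(n_estimators=50, random_state=0)
rf.fit(train[features], train["Credit_Product"])

# Append one probability column per Credit_Product class to both frames.
for frame in (train, test):
    proba = rf.predict_proba(frame[features])
    for i, label in enumerate(rf.classes_):
        frame[f"CP_proba_{label}"] = proba[:, i]

print(test.columns.tolist())
```

Note that the probabilities added to the train data here are in-sample predictions; producing them out-of-fold (for example with `sklearn.model_selection.cross_val_predict`) is a common alternative when overfitting is a concern.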
Now we train a CatBoostClassifier on the data to predict the target variable Is_Lead.
We then evaluate its ROC-AUC score.
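As an illustration of the metric, the sketch below computes ROC-AUC with scikit-learn's `roc_auc_score` on four made-up labels and scores. In practice the scores would be the model's predicted probabilities for Is_Lead on rows held out from training.

```python
from sklearn.metrics import roc_auc_score

# Made-up true labels and predicted probabilities for the positive class.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# ROC-AUC is the fraction of (positive, negative) pairs ranked correctly:
# here 3 of the 4 pairs have the positive scored above the negative.
auc = roc_auc_score(y_true, y_score)
print(auc)  # 0.75
```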
Now we predict the target variable Is_Lead for the test data and save it to Predictions.csv.
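Writing the predictions out could look roughly like the sketch below. The ID column and the example values are hypothetical; only the file name Predictions.csv and the Is_Lead column come from the description above.

```python
import pandas as pd

# Hypothetical row identifiers and predicted values for the test rows.
test_ids = ["id_1", "id_2", "id_3"]
lead_proba = [0.12, 0.87, 0.45]

submission = pd.DataFrame({"ID": test_ids, "Is_Lead": lead_proba})

# index=False keeps the pandas row index out of the file.
submission.to_csv("Predictions.csv", index=False)
```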