How to begin a Supervised Machine Learning Problem in Kaggle!


Identifying the Problem Type:

It is important in Machine learning to understand the problem type first. If it is continuous output – [1,23,4,5,6, 5.5, 6.7,..], use Linear Regression. If it is a categorical output – [0,1,0,0,1…] or [‘High’, ‘low’, ‘Medium’, …] etc., go for Logistic Regression. Since your target labels are either 0 or 1, this is a problem to be worked with Logistic Regression or other Classification algorithms (SVM, Decision Tree, Random Forest).

Data Cleaning/Exploration:

You must convert your data to numeric format or standardized format for regression.https://realpython.com/python-data-cleaning-numpy-pandas/

Starter Code:

In case you are looking for a starter code for your problem, you can find that from Kaggle kernels. Here are a few links:

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.