Python for Machine Learning & Data Science: Complete Day 1 Practice Guide for Beginners

Welcome to your first day of Python practice for Machine Learning (ML) and Data Science! This guide is designed for absolute beginners. By the end of this post, you will have reviewed Python basics, handled data with NumPy and Pandas, drawn a basic plot with Matplotlib, and trained your first simple ML model with scikit-learn.


Step 1: Python Basics Refresher

Python is one of the most widely used languages for Data Science and Machine Learning, and every library in this guide builds on it. Make sure you are comfortable with:

  • Variables & Data Types: int, float, str, bool
  • Data Structures: Lists, Tuples, Dictionaries
  • Control Flow: if, elif, else and loops like for, while
  • Functions: Define reusable blocks of code

Task 1: Create a function that calculates the square and cube of a number:

def square_and_cube(num):
    """
    This function takes a number as input
    and returns its square and cube.
    """
    square = num ** 2
    cube = num ** 3
    return square, cube

# Test the function
print(square_and_cube(5))

Tip: Running simple functions like this builds confidence and understanding of Python syntax.
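A function that returns two values actually returns a single tuple, which can be unpacked into separate variables. The short sketch below repeats the function from Task 1 in compact form so that it runs on its own.

```python
def square_and_cube(num):
    """Return the square and cube of num as a tuple."""
    return num ** 2, num ** 3

# The return value is one tuple
result = square_and_cube(5)
print(result)          # (25, 125)

# Tuple unpacking assigns each element to its own variable
square, cube = square_and_cube(5)
print(square, cube)    # 25 125
```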


Step 2: Import Essential Libraries

In Data Science, you will frequently use these Python libraries:

  • numpy – For numerical computations and arrays
  • pandas – For handling structured data
  • matplotlib & seaborn – For data visualization
  • scikit-learn – For Machine Learning algorithms

Task 2: Convert a simple Python list into a NumPy array:

import numpy as np

numbers = [1, 2, 3, 4, 5]
array = np.array(numbers)
print("NumPy array:", array)

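Tip: NumPy arrays store elements of a single type in a compact block of memory, so numerical operations on them are usually much faster than the same operations on Python lists, especially for large datasets.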
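One practical difference is that arithmetic on an array applies to every element at once, while a list needs an explicit loop or comprehension. A minimal sketch, reusing the list from Task 2 (the doubled_list and doubled_array names are only for illustration):

```python
import numpy as np

numbers = [1, 2, 3, 4, 5]
array = np.array(numbers)

# With a list, each element is handled explicitly
doubled_list = [n * 2 for n in numbers]

# With an array, the operation applies to every element at once
doubled_array = array * 2

print(doubled_list)    # [2, 4, 6, 8, 10]
print(doubled_array)   # [ 2  4  6  8 10]
print(array.mean())    # 3.0
```

Note that numbers * 2 on the plain list would repeat the list instead of doubling its values, which is a common beginner surprise.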

Step 3: Working with Data using Pandas

Pandas handles structured (tabular) data. Its main object, the DataFrame, stores data in labeled rows and columns that you can create, view, and manipulate easily.

Task 3: Create a small dataset:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 60000, 70000, 80000]
}

df = pd.DataFrame(data)
print(df)
print("\nData Shape:", df.shape)
print("\nSummary:\n", df.describe())

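Tip: df.describe() gives a quick statistical summary (count, mean, standard deviation, min, quartiles, max) of your numeric columns. The Name column is skipped because it is not numeric.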
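Viewing and manipulating data usually starts with selecting columns and filtering rows. The sketch below rebuilds the Task 3 dataset so it runs on its own; the over_30 name and the age threshold are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 60000, 70000, 80000]
})

# View the first two rows
print(df.head(2))

# Select a single column (returns a pandas Series)
ages = df['Age']

# Filter rows with a boolean condition
over_30 = df[df['Age'] > 30]
print(over_30['Name'].tolist())   # ['Charlie', 'David']

# Compute a statistic on one column
print(df['Salary'].mean())        # 65000.0
```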


Step 4: Data Visualization

Visualizing data helps identify trends and patterns.

Task 4: Plot Age vs Salary using Matplotlib:

import matplotlib.pyplot as plt

plt.plot(df['Age'], df['Salary'], marker='o', color='green')
plt.title('Age vs Salary')
plt.xlabel('Age')
plt.ylabel('Salary')
plt.grid(True)
plt.show()

Tip: Visualization is crucial in Data Science to understand datasets and detect outliers.

Step 5: First Machine Learning Task – Linear Regression

Linear Regression is a simple ML model that predicts a continuous value by fitting a straight line to the data. We will predict Salary based on Age, using the DataFrame df from Task 3.

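from sklearn.linear_model import LinearRegression
import pandas as pd

X = df[['Age']]   # Independent variable (2-D: one column)
y = df['Salary']  # Dependent variable

model = LinearRegression()
model.fit(X, y)

# Predict with a DataFrame that has the same column name as the training data.
# Passing a bare NumPy array here triggers a feature-name warning in scikit-learn 1.0+.
new_data = pd.DataFrame({'Age': [28]})
predicted_salary = model.predict(new_data)
print("Predicted salary for age 28:", predicted_salary[0])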

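Tip: The four sample points lie exactly on the line Salary = 2000 × Age, so the model predicts about 56000 for age 28. Real datasets are noisier, but the fit-then-predict workflow stays the same.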
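To see what the model actually learned, you can inspect the fitted line through the coef_ and intercept_ attributes of LinearRegression. The sketch below rebuilds the small dataset so it runs on its own; the slope and intercept variable names are only for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 60000, 70000, 80000]
})

X = df[['Age']]
y = df['Salary']

model = LinearRegression()
model.fit(X, y)

# coef_ holds one slope per feature; intercept_ is the value at Age = 0
slope = model.coef_[0]
intercept = model.intercept_

# For this toy data the slope is about 2000 and the intercept about 0
print("Slope (salary change per year of age):", round(slope, 2))
print("Intercept:", round(intercept, 2))
```

Because of floating-point arithmetic, the printed values may differ from 2000 and 0 by a tiny amount, which is why the sketch rounds them.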

Optional Challenge Tasks

  • Add a new column Bonus = Salary * 0.1 to your DataFrame.
  • Plot Age vs Bonus using Seaborn scatter plot.
  • Predict salary for multiple ages: [22, 35, 45].
  • Experiment with different Python functions and loops to automate calculations.
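If you get stuck, here is one possible sketch of the Bonus and multiple-ages tasks; try them yourself first. It rebuilds the dataset and model so it runs on its own, and the new_ages and predictions names are just illustrative choices.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'Salary': [50000, 60000, 70000, 80000]
})

# Add a new column computed from an existing one
df['Bonus'] = df['Salary'] * 0.1
print(df)

# Train the same model as in Step 5
model = LinearRegression()
model.fit(df[['Age']], df['Salary'])

# Predict for several ages at once: one row per age
new_ages = pd.DataFrame({'Age': [22, 35, 45]})
predictions = model.predict(new_ages)

for age, salary in zip(new_ages['Age'], predictions):
    print(f"Age {age}: predicted salary {salary:.0f}")
```

Ages 22 and 45 fall outside the 25 to 40 range the model was trained on, so treat those predictions with extra caution.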

💡 By completing these tasks, you will have a solid foundation in Python, data manipulation, visualization, and ML basics.

👉 Next: Python for Machine Learning & Data Science Day 2

Happy Learning and Keep Practicing! 🚀
