In Pandas, DataFrames are like tables where columns hold specific attributes. Often, we need to add new columns, update existing ones, or delete unwanted columns. Let’s explore how to do these operations effectively.
🔹 Adding New Columns
import pandas as pd
data = {
"Name": ["Alice", "Bob", "Charlie", "David"],
"Age": [25, 30, 35, 28],
"Salary": [50000, 60000, 75000, 58000]
}
df = pd.DataFrame(data)
# Add a new column with fixed value
df["Country"] = "USA"
# Add a column based on calculation
df["Bonus"] = df["Salary"] * 0.1
print(df)
✅ You can add a column either with a fixed value or with a calculation based on existing columns.
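As a complement to the bracket assignment above, here is a minimal sketch of two other standard ways to add columns: `assign()`, which returns a new DataFrame, and `insert()`, which places a column at a chosen position. The `EmployeeId` column and the `with_bonus` name are made up for illustration and are not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 28],
    "Salary": [50000, 60000, 75000, 58000],
})

# assign() returns a new DataFrame and leaves df unchanged
with_bonus = df.assign(
    Country="USA",                      # fixed value for every row
    Bonus=lambda d: d["Salary"] * 0.1,  # computed from an existing column
)

# insert() modifies with_bonus in place; position 0 makes it the first column
# "EmployeeId" is a hypothetical column used only for this sketch
with_bonus.insert(0, "EmployeeId", list(range(1, len(with_bonus) + 1)))

print(with_bonus)
```

`assign()` is handy when you want to keep the original DataFrame intact, while `insert()` is useful when column order matters.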
🔹 Updating Columns
# Update all values in a column
df["Country"] = "UK"
# Update specific rows based on condition
df.loc[df["Age"] > 30, "Bonus"] = df["Salary"] * 0.2
print(df)
✅ Use .loc to conditionally update rows in a column.
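The conditional update above works because Pandas aligns the right-hand side with the selected rows by index. A minimal sketch that makes the selection explicit on both sides, using a named boolean mask (the variable name `older` is just illustrative), might look like this:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 28],
    "Salary": [50000, 60000, 75000, 58000],
})
df["Bonus"] = df["Salary"] * 0.1

# Boolean mask: True only where Age is strictly greater than 30
older = df["Age"] > 30

# Select the same rows on both sides, so the intent is explicit
df.loc[older, "Bonus"] = df.loc[older, "Salary"] * 0.2

print(df[["Name", "Age", "Bonus"]])
```

Only Charlie (age 35) matches the condition here; Bob is exactly 30, so his bonus stays at 10%.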
🔹 Deleting Columns
# Delete a column permanently
df.drop("Bonus", axis=1, inplace=True)
# Delete multiple columns
df.drop(["Age", "Country"], axis=1, inplace=True)
print(df)
✅ Use axis=1 to delete columns. Setting inplace=True modifies the DataFrame directly.
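If you would rather not modify the DataFrame in place, a common alternative is to reassign the result of `drop()`. The sketch below also uses the `columns=` keyword, which is equivalent to passing `axis=1`, and `errors="ignore"`, which skips labels that are not present. The `trimmed` name is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 28],
    "Salary": [50000, 60000, 75000, 58000],
    "Country": ["UK"] * 4,
    "Bonus": [5000.0, 6000.0, 15000.0, 5800.0],
})

# drop() returns a new DataFrame; columns= is equivalent to axis=1
trimmed = df.drop(columns=["Bonus", "Country"])

# errors="ignore" avoids a KeyError when a label is already gone
trimmed = trimmed.drop(columns=["Bonus"], errors="ignore")

print(list(df.columns))       # original still has all five columns
print(list(trimmed.columns))  # only Name, Age, Salary remain
```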
⚠️ Common Mistakes
- Forgetting axis=1 when dropping columns (the default is axis=0, which drops rows).
- Not using inplace=True or reassigning to df after drop (changes won’t be saved otherwise).
- Misspelling column names: Pandas is case-sensitive.
- Updating without .loc for conditional changes, which may give unexpected results.
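To see the first three mistakes in action, here is a small sketch on a two-row DataFrame. The data and the flag variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Age": [25, 35],
    "Bonus": [5000.0, 7000.0],
})

# Mistake 1: without axis=1, drop() looks for a ROW labelled "Bonus"
row_drop_failed = False
try:
    df.drop("Bonus")
except KeyError:
    row_drop_failed = True
print("drop without axis=1 raised KeyError:", row_drop_failed)

# Mistake 2: the result of drop() is discarded, so df is unchanged
df.drop("Bonus", axis=1)
still_has_bonus = "Bonus" in df.columns
print("Bonus still present:", still_has_bonus)

# Mistake 3: column names are case-sensitive
lowercase_found = "bonus" in df.columns
print("'bonus' found:", lowercase_found)
```

Checking `"Bonus" in df.columns` before and after an operation is a quick way to confirm whether a change actually took effect.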