KevsRobots Learning Platform
By Kevin McAleer, 2 Minutes
This lesson covers importing and exporting data with Pandas. We’ll read data from CSV, Excel, and YAML files into Pandas DataFrames, and then write DataFrames back out to files. These skills are the first and last steps of most data analysis workflows, so they are worth getting right.
To read data from a CSV file into a DataFrame, use pd.read_csv():
import pandas as pd
# Reading from a CSV file
df = pd.read_csv('path/to/your/file.csv')
print(df)
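To try pd.read_csv() without a file on disk, the sketch below feeds it an in-memory string instead of a path. The CSV content and the column names (name, speed, wheels) are made up for illustration; only read_csv and its usecols parameter come from Pandas itself.

```python
import io
import pandas as pd

# Hypothetical CSV content, held in memory so the example runs without a file
csv_text = """name,speed,wheels
rover,1.5,4
crawler,0.8,6
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (rows, columns)
print(df.dtypes)  # column types inferred by Pandas

# usecols limits which columns are loaded
names_only = pd.read_csv(io.StringIO(csv_text), usecols=["name", "speed"])
print(names_only)
```

Checking shape and dtypes straight after loading is a cheap way to confirm that Pandas parsed the file the way you expected.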
Reading from an Excel file is just as straightforward, although Pandas needs an optional engine package such as openpyxl installed to handle .xlsx files:
# Reading from an Excel file
df = pd.read_excel('path/to/your/file.xlsx')
print(df)
To remove duplicate rows from a DataFrame, use df.drop_duplicates():
# Removing duplicate rows
df = df.drop_duplicates()
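By default drop_duplicates() only removes rows that match an earlier row in every column. The sketch below, using a made-up readings table, contrasts that default with the subset and keep parameters, which compare selected columns only and choose which duplicate survives.

```python
import pandas as pd

# Hypothetical sensor readings; the first two rows are identical
readings = pd.DataFrame({
    "sensor": ["a", "a", "b", "b"],
    "value": [1, 1, 2, 3],
})

# Default: drop rows where every column matches an earlier row
exact = readings.drop_duplicates()

# subset compares only the 'sensor' column;
# keep="last" retains the final row seen for each sensor
latest = readings.drop_duplicates(subset="sensor", keep="last")

print(exact)
print(latest)
```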
To read YAML data, you’ll need an additional library, PyYAML:
import yaml
import pandas as pd
# Reading from a YAML file
with open('path/to/your/file.yaml', 'r') as file:
    yaml_data = yaml.safe_load(file)
df = pd.DataFrame(yaml_data)
print(df)
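pd.DataFrame() works most directly when the YAML document is a list of mappings, one mapping per row. The following is a minimal sketch with made-up YAML text (the name, wheels, and motor keys are assumptions); it also shows pd.json_normalize() as one way to flatten nested mappings into columns.

```python
import yaml
import pandas as pd

# Hypothetical YAML: a list of mappings, one per row
yaml_text = """
- name: rover
  wheels: 4
  motor:
    volts: 6
- name: crawler
  wheels: 6
  motor:
    volts: 12
"""

records = yaml.safe_load(yaml_text)  # a list of dicts

# Nested mappings stay as dict objects in a single 'motor' column
df = pd.DataFrame(records)

# json_normalize flattens nested keys into dotted column names
flat = pd.json_normalize(records)

print(df)
print(flat)
```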
You can export a DataFrame to a CSV file using df.to_csv():
# Writing to a CSV file
df.to_csv('path/to/your/newfile.csv')
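One detail worth knowing: to_csv() writes the DataFrame's index as an extra, unnamed first column unless you pass index=False. The sketch below uses a made-up two-row DataFrame and omits the path so that to_csv() returns the CSV text as a string, which makes the difference easy to see.

```python
import io
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"name": ["rover", "crawler"], "wheels": [4, 6]})

# With no path, to_csv returns the CSV text instead of writing a file
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index)     # header starts with an empty index column
print(without_index)  # header contains only the data columns

# Reading the index-free text back gives the original columns
roundtrip = pd.read_csv(io.StringIO(without_index))
print(roundtrip)
```

If a file written with the default settings is read back with read_csv(), the saved index typically reappears as a column named "Unnamed: 0", so index=False is often the safer choice for files you plan to reload.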
Similarly, to export to an Excel file:
# Writing to an Excel file
df.to_excel('path/to/your/newfile.xlsx')
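Pandas has no built-in YAML writer, so exporting to YAML goes through PyYAML as well. The sketch below is one possible approach, not the only one: it converts a made-up DataFrame to a list of plain dicts via to_json(), so that only built-in Python types reach yaml.safe_dump().

```python
import json
import yaml
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"name": ["rover", "crawler"], "wheels": [4, 6]})

# Going through JSON yields plain Python ints and strings,
# which safe_dump can serialise without custom representers
records = json.loads(df.to_json(orient="records"))

# sort_keys=False keeps the column order from the DataFrame
yaml_text = yaml.safe_dump(records, sort_keys=False)
print(yaml_text)
```

To write to disk instead, pass an open file as the second argument, for example yaml.safe_dump(records, file, sort_keys=False) inside a with open(...) block.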
In this lesson, we’ve covered the fundamentals of importing and exporting data using Pandas. You’ve learned how to work with different file formats, which is a key part of the data analysis workflow.