Data analysis from a CSV file in Python
Learn different ways to read and write CSV files in Python.

Topics Covered
- What is a CSV file?
- How to read CSV files using the csv module?
- How to skip the header row in CSV with Python?
- How to read CSV files as a dictionary in Python?
- How to write to CSV files using the csv module?
- How to use the delimiter parameter in csv.writer?
- How to write a dictionary to a CSV file in Python?
What is a CSV file?
CSV stands for comma-separated values.
You might have come across this file format while downloading data from an Excel spreadsheet or a database. CSV files are convenient for storing tabular data.
As the name suggests, values in a CSV file are separated by a comma (by default).
Below is an example of a CSV file containing information about a family.
my_family.csv
name,age,height(cm),weight(kg)
Lenin,30,188,90
Phil,42,178,76
Claire,40,165,54
Alex,18,140,46
Usually the first line in a CSV file is the header, which names each column (CSV itself stores no data types, so every value is plain text). Every row after the header is a data record.
In the example above, each value (whether part of the header or a data record) is separated by a comma. This separator character is called a delimiter. A CSV file may use a delimiter other than a comma.
Examples of other delimiters:
- tab (\t)
- colon (:)
- semicolon (;)
- pipe (|)
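To illustrate a non-comma delimiter, here is a minimal sketch that reads pipe-separated text with csv.reader. The pipe_text string is a hypothetical in-memory stand-in for a pipe-delimited version of my_family.csv, and io.StringIO is used only so the example runs without a file on disk.

```python
import csv
import io

# Hypothetical pipe-delimited data; not a file from the article
pipe_text = "name|age|height(cm)|weight(kg)\nLenin|30|188|90\nPhil|42|178|76\n"

# Tell the reader which character separates the values
reader = csv.reader(io.StringIO(pipe_text), delimiter="|")
rows = list(reader)

print(rows[0])  # ['name', 'age', 'height(cm)', 'weight(kg)']
print(rows[1])  # ['Lenin', '30', '188', '90']
```

The same delimiter argument accepts any single character, such as "\t" for tab-separated files.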
In this article, you will learn to work with CSV files using the csv module and the pandas library.
How to read CSV files using the csv module?
Reading from a CSV file is done with the csv.reader object. You can open the CSV file as a text file with Python’s built-in open() function.
Example 1
import csv
# newline='' lets the csv module handle line endings itself
with open('my_family.csv', newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # The first row is the header
            print(f'Header row - {", ".join(row)}')
            line_count += 1
        else:
            print(f'{row[0]} is {row[1]} years old, {row[2]} cm tall and {row[3]} kg heavy')
            line_count += 1
    print(f'Total: {line_count} lines')
Output
Header row - name, age, height(cm), weight(kg)
Lenin is 30 years old, 188 cm tall and 90 kg heavy
Phil is 42 years old, 178 cm tall and 76 kg heavy
Claire is 40 years old, 165 cm tall and 54 kg heavy
Alex is 18 years old, 140 cm tall and 46 kg heavy
Total: 5 lines
Since the first row is the header row (line_count will be 0), it is treated differently. You can also skip the header row while reading the CSV.
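The manual counter in Example 1 could also be replaced with Python's built-in enumerate(). The following is a minimal sketch of that idea, not the article's own method. It assumes an in-memory copy of the first rows of my_family.csv (family_csv, read through io.StringIO) and collects the lines in a list named messages, both chosen here for illustration.

```python
import csv
import io

# In-memory stand-in for the first rows of my_family.csv
family_csv = "name,age,height(cm),weight(kg)\nLenin,30,188,90\nPhil,42,178,76\n"

messages = []
# enumerate() supplies the row index, so no counter variable is needed
for index, row in enumerate(csv.reader(io.StringIO(family_csv))):
    if index == 0:
        messages.append(f'Header row - {", ".join(row)}')
    else:
        messages.append(f"{row[0]} is {row[1]} years old")

print("\n".join(messages))
```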
How to skip the header row in CSV with Python?
Since the csv.reader object is an iterator, you can call next(csv_reader, None) to return the header row and skip over it. The None default prevents a StopIteration error if the file is empty.
Example
import csv
# newline='' lets the csv module handle line endings itself
with open('my_family.csv', newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    next(csv_reader, None)  # ignore the header
    for row in csv_reader:
        print(f'{row[0]} is {row[1]} years old, {row[2]} cm tall and {row[3]} kg heavy')
        line_count += 1
    print(f'Total: {line_count} lines')
Output
Lenin is 30 years old, 188 cm tall and 90 kg heavy
Phil is 42 years old, 178 cm tall and 76 kg heavy
Claire is 40 years old, 165 cm tall and 54 kg heavy
Alex is 18 years old, 140 cm tall and 46 kg heavy
Total: 4 lines
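next() also returns the row it consumes, so the header can be kept rather than discarded. Note as well that csv.reader yields every field as a string, so numeric columns need converting before any arithmetic. A minimal sketch, assuming an in-memory copy of my_family.csv (family_csv) in place of the file; the average-age calculation is added here purely as an illustration:

```python
import csv
import io

# In-memory copy of my_family.csv so the example runs without a file
family_csv = """name,age,height(cm),weight(kg)
Lenin,30,188,90
Phil,42,178,76
Claire,40,165,54
Alex,18,140,46
"""

reader = csv.reader(io.StringIO(family_csv))
header = next(reader, None)             # keep the header instead of dropping it
ages = [int(row[1]) for row in reader]  # fields are strings; convert to int

average_age = sum(ages) / len(ages)
print(header)       # ['name', 'age', 'height(cm)', 'weight(kg)']
print(average_age)  # 32.5
```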
How to read CSV files as a dictionary?
You can read the CSV file as a dictionary by using the csv.DictReader object.
An advantage of using the DictReader object is that it turns each row into a dictionary, which makes accessing the fields a little easier.
Example
import csv
# newline='' lets the csv module handle line endings itself
with open('my_family.csv', newline='') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    for row in csv_reader:
        # Fields are looked up by column name instead of position
        print(f'{row["name"]} is {row["age"]} years old, {row["height(cm)"]} cm tall and {row["weight(kg)"]} kg heavy')
    print(f'Total: {csv_reader.line_num} lines')
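The csv_reader.line_num attribute holds the number of lines read from the file so far. Once the loop has finished, that is the total number of lines in the file, including the header row (5 for the my_family.csv shown earlier).
For the csv.DictReader object, Python uses the column names from the header row as keys.
The rows produced by csv.DictReader do not include the header row.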
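The behaviour of the header and of line_num can be checked with a short sketch. It assumes an in-memory copy of the first rows of my_family.csv (family_csv) and uses the DictReader fieldnames attribute, which holds the column names taken from the header row.

```python
import csv
import io

# In-memory stand-in for the first rows of my_family.csv
family_csv = "name,age,height(cm),weight(kg)\nLenin,30,188,90\nPhil,42,178,76\n"

dict_reader = csv.DictReader(io.StringIO(family_csv))

# Column names come from the header row
print(dict_reader.fieldnames)  # ['name', 'age', 'height(cm)', 'weight(kg)']

records = list(dict_reader)    # data records only, header excluded
print(len(records))            # 2
print(records[0]["name"])      # Lenin
print(dict_reader.line_num)    # 3 (header line plus two data lines)
```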
How to write to CSV files using the csv module?
You can write to a CSV file using the csv.writer object. Be careful to open the file in write mode ('w'), which overwrites any existing file with the same name.
Example
import csv
header = ['Name', 'Age', 'Height(cm)', 'Weight(kg)']
data = [['Phil', 42, 178, 76],
        ['Alex', 18, 140, 46],
        ['Claire', 40, 165, 54]]
filename = "my_family.csv"
# newline='' prevents extra blank lines between rows on Windows
with open(filename, 'w', newline='') as output:
    csvwriter = csv.writer(output)
    # Write a single list
    csvwriter.writerow(header)
    # Writing a list of lists
    csvwriter.writerows(data)
Output
Name,Age,Height(cm),Weight(kg)
Phil,42,178,76
Alex,18,140,46
Claire,40,165,54
The writerow method writes a list of values as a single row, whereas writerows writes multiple rows from an iterable that contains one or more lists.
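To see exactly what the two methods emit, the sketch below writes to an in-memory io.StringIO buffer instead of a file (the buffer and the shortened two-column data are assumptions made for illustration). It also shows that csv.writer ends each row with "\r\n" by default, which is why the csv documentation recommends opening real files with newline=''.

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for an output file
writer = csv.writer(buffer)

writer.writerow(["Name", "Age"])                # one row
writer.writerows([["Phil", 42], ["Alex", 18]])  # several rows at once

written = buffer.getvalue()
print(repr(written))  # 'Name,Age\r\nPhil,42\r\nAlex,18\r\n'
```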
How to use the delimiter parameter in csv.writer?
Notice that no delimiter was specified when creating the csv.writer object. In such cases, a comma (,) is used as the default delimiter. You can use a different delimiter by passing the delimiter parameter.
Example
import csv
header = ['Name', 'Age', 'Height(cm)', 'Weight(kg)']
data = [['Phil', 42, 178, 76],
        ['Alex', 18, 140, 46],
        ['Claire', 40, 165, 54]]
filename = "my_family.csv"
# newline='' prevents extra blank lines between rows on Windows
with open(filename, 'w', newline='') as output:
    csvwriter = csv.writer(output, delimiter='|')
    # Write a single list
    csvwriter.writerow(header)
    # Writing a list of lists
    csvwriter.writerows(data)
Output
Name|Age|Height(cm)|Weight(kg)
Phil|42|178|76
Alex|18|140|46
Claire|40|165|54
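A natural question is what happens when a value itself contains the delimiter. With its default settings, csv.writer wraps such a value in double quotes so that it can be read back unchanged. A minimal sketch, using an in-memory buffer and a made-up field value ("likes a|b testing") that is not part of the article's data:

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for an output file
writer = csv.writer(buffer, delimiter="|")

# The second value contains the delimiter character itself
writer.writerow(["Phil", "likes a|b testing"])
written = buffer.getvalue()
print(repr(written))  # 'Phil|"likes a|b testing"\r\n'

# Reading it back with the same delimiter restores the original values
round_trip = next(csv.reader(io.StringIO(written), delimiter="|"))
print(round_trip)  # ['Phil', 'likes a|b testing']
```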
How to write a dictionary to a CSV file in Python?
You can write dictionaries to a CSV file using the csv.DictWriter class. The fieldnames parameter is required; it supplies the header and fixes the column order.
Example
import csv
header = ['Name', 'Age', 'Height(cm)', 'Weight(kg)']
data = [
    {"Name": "Phil", "Age": 42, "Height(cm)": 178, "Weight(kg)": 76},
    {"Name": "Claire", "Age": 40, "Height(cm)": 165, "Weight(kg)": 54},
    {"Name": "Alex", "Age": 18, "Height(cm)": 140, "Weight(kg)": 46}
]
filename = "my_family.csv"
# newline='' prevents extra blank lines between rows on Windows
with open(filename, 'w', newline='') as output:
    csvwriter = csv.DictWriter(output, fieldnames=header)
    csvwriter.writeheader()  # writes the fieldnames as the first row
    for row in data:
        csvwriter.writerow(row)
Output
Name,Age,Height(cm),Weight(kg)
Phil,42,178,76
Claire,40,165,54
Alex,18,140,46
You can also use writerows to write all the dictionaries to the CSV file at once.
Example
import csv
header = ['Name', 'Age', 'Height(cm)', 'Weight(kg)']
data = [
    {"Name": "Phil", "Age": 42, "Height(cm)": 178, "Weight(kg)": 76},
    {"Name": "Claire", "Age": 40, "Height(cm)": 165, "Weight(kg)": 54},
    {"Name": "Alex", "Age": 18, "Height(cm)": 140, "Weight(kg)": 46}
]
filename = "my_family.csv"
# newline='' prevents extra blank lines between rows on Windows
with open(filename, 'w', newline='') as output:
    csvwriter = csv.DictWriter(output, fieldnames=header)
    csvwriter.writeheader()  # writes the fieldnames as the first row
    csvwriter.writerows(data)
Output
Name,Age,Height(cm),Weight(kg)
Phil,42,178,76
Claire,40,165,54
Alex,18,140,46
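Real data is often less tidy than the dictionaries above: a key may be missing, or a dictionary may carry a key that is not in fieldnames. By default, DictWriter writes an empty string for a missing key and raises ValueError for an unexpected one. The sketch below shows the restval and extrasaction parameters that change this behaviour. The "NA" placeholder, the "Pet" key and the shortened header are assumptions made for illustration.

```python
import csv
import io

header = ["Name", "Age", "Height(cm)"]
buffer = io.StringIO()  # in-memory stand-in for an output file

writer = csv.DictWriter(
    buffer,
    fieldnames=header,
    restval="NA",           # written when a dictionary lacks a field
    extrasaction="ignore",  # silently drop keys not listed in fieldnames
)
writer.writeheader()
writer.writerow({"Name": "Phil", "Age": 42})  # "Height(cm)" is missing
writer.writerow({"Name": "Alex", "Age": 18, "Height(cm)": 140, "Pet": "dog"})

lines = buffer.getvalue().splitlines()
print(lines)  # ['Name,Age,Height(cm)', 'Phil,42,NA', 'Alex,18,140']
```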