Openpyxl tutorial: Handling Excel sheets in Python
Automate your Excel workflow by using Openpyxl module in Python 3.

Microsoft Excel is probably one of the highly used data storage applications. A huge proportion of small to medium size businesses fulfill their analytics requirement using Excel.
However, analyzing huge amount of data in Excel can become highly tedious and time-consuming. You could build customized data processing and analytics application using Visual Basic(VBA), the language that powers the Excel sheets. However, learning VBA could be difficult and perhaps, not worth it.
However, if you have a little knowledge of Python, you could build highly professional Business Intelligence using Excel data, without the need of a database. Using Python with Excel could be a game changer for your business.
Sections Covered
- Basic Information about Excel
- What is Openpyxl and how to install it?
- Reading data from Excel in Python
- Reading multiple cells from Excel in Python
- Find the max row and column number of an Excel sheet in Python
- How to iterate over Excel rows and columns in Python?
- Create a new Excel file with Python
- Writing data to Excel in Python
- Appending data to Excel in Python
- Manipulating Excel Sheets in Python
- Practical usage example of data analysis of Excel sheets in Python
Basic Information about Excel
Before beginning this Openpyxl tutorial, you need to keep the following details in mind.
- Excel files are called Workbooks.
- Each Workbook can contain multiple sheets.
- Every sheet consists of rows starting from 1 and columns starting from A.
- Rows and columns together make up a cell.
- Any type of data can be stored.
What is Openpyxl and how to install it?
The Openpyxl module in Python is used to handle Excel files without involving third-party Microsoft application software. It is arguably, the best python excel library that allows you to perform various Excel operations and automate excel reports using Python. You can perform all kinds of tasks using Openpyxl like:-
- Reading data
- Writing data
- Editing Excel files
- Drawing graphs and charts
- Working with multiple sheets
- Sheet Styling etc.
You can install Openpyxl module by typing pip install openpyxl in your command line.
pip install openpyxl
Reading data from Excel in Python
To import an excel file in Python, use the load_workbook method from Openpyxl library.
Let’s import an Excel file named wb1.xlsx
in Python using Openpyxl module. It has the following data as shown in the image below.

Step 1 - Import the load_workbook
method from Openpyxl.
from openpyxl import load_workbook
Step 2 - Provide the file location for the Excel file you want to open in Python.
wb = load_workbook('wb1.xlsx')
If your Excel file is present in the same directory as the python file, you don't need to provide to entire file location.
Step 3 - Choose the first active sheet present in the workbook using wb.active
attribute.
sheet = wb.active
The above points are a standard way of accessing Excel sheets using Python. You will see them being used multiple times through out this article.
Let’s read all the data present in Row 1 (header row).
Method 1 - Reading data through Excel cell name in Python
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
print(sheet["A1"].value)
print(sheet["B1"].value)
print(sheet["C1"].value)
print(sheet["D1"].value)
Output
ProductId
ProductName
Cost per Unit
Quantity
Method 2 - Reading data from Excel using cell() method in Python
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
print(sheet.cell(row=1, column=1).value)
print(sheet.cell(row=1, column=2).value)
print(sheet.cell(row=1, column=3).value)
print(sheet.cell(row=1, column=4).value)
Output
ProductId
ProductName
Cost per Unit
Quantity
Reading Multiple Cells from Excel in Python
You can also read multiple cells from an Excel workbook. Let’s understand this through various examples. Refer to the image of the wb1.xlsx
file above for clarity.
Method 1 - Reading a range of cells in Excel using cell names
To read the data from a specific range of cells in your Excel sheet, you need to slice your sheet object through both the cells.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Access all cells from A1 to D11
print(sheet["A1:D11"])
Output
((<Cell 'Sheet1'.A1>, <Cell 'Sheet1'.B1>, <Cell 'Sheet1'.C1>, <Cell 'Sheet1'.D1>),
(<Cell 'Sheet1'.A2>, <Cell 'Sheet1'.B2>, <Cell 'Sheet1'.C2>, <Cell 'Sheet1'.D2>),
(<Cell 'Sheet1'.A3>, <Cell 'Sheet1'.B3>, <Cell 'Sheet1'.C3>, <Cell 'Sheet1'.D3>),
(<Cell 'Sheet1'.A4>, <Cell 'Sheet1'.B4>, <Cell 'Sheet1'.C4>, <Cell 'Sheet1'.D4>),
(<Cell 'Sheet1'.A5>, <Cell 'Sheet1'.B5>, <Cell 'Sheet1'.C5>, <Cell 'Sheet1'.D5>),
(<Cell 'Sheet1'.A6>, <Cell 'Sheet1'.B6>, <Cell 'Sheet1'.C6>, <Cell 'Sheet1'.D6>),
(<Cell 'Sheet1'.A7>, <Cell 'Sheet1'.B7>, <Cell 'Sheet1'.C7>, <Cell 'Sheet1'.D7>),
(<Cell 'Sheet1'.A8>, <Cell 'Sheet1'.B8>, <Cell 'Sheet1'.C8>, <Cell 'Sheet1'.D8>),
(<Cell 'Sheet1'.A9>, <Cell 'Sheet1'.B9>, <Cell 'Sheet1'.C9>, <Cell 'Sheet1'.D9>),
(<Cell 'Sheet1'.A10>, <Cell 'Sheet1'.B10>, <Cell 'Sheet1'.C10>, <Cell 'Sheet1'.D10>),
(<Cell 'Sheet1'.A11>, <Cell 'Sheet1'.B11>, <Cell 'Sheet1'.C11>, <Cell 'Sheet1'.D11>))
You can see that by slicing the sheet
data from A1:D11
, it returned us tuples of row data inside a tuple. In order to read the values of every cell returned, you can iterate over each row and use .value
.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Access all cells from A1 to D11
for row in sheet["A1:D11"]:
print ([x.value for x in row])
Output
['ProductId', 'ProductName', 'Cost per Unit', 'Quantity']
[1, 'Pencil', '$0.5', 200]
[2, 'Pen', '$1', 500]
[3, 'Eraser', '$0.25', 100]
[4, 'Sharpner', '$0.75', 100]
[5, 'Files', '$3', 50]
[6, 'A4 Size Paper', '$9', 10]
[7, 'Pencil Box', '$12', 20]
[8, 'Pen Stand', '$5.5', 10]
[9, 'Notebook', '$2', 50]
[10, 'Marker', '$1', 75]
Method 2 - Reading a single row in Excel using cell name
To read a single row in your Excel sheet, just access the single row number from your sheet
object.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Access all cells in row 1
for data in sheet["1"]:
print(data.value)
Output
ProductId
ProductName
Cost per Unit
Quantity
Method 3 - Reading all rows in Excel using rows attribute
To read all the rows, use sheet.rows
to iterate over rows with Openpyxl. You receive a tuple element per row by using the sheet.rows
attribute.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Access all cells in row 1
for row in sheet.rows:
print([data.value for data in row])
Output
['ProductId', 'ProductName', 'Cost per Unit', 'Quantity']
[1, 'Pencil', '$0.5', 200]
[2, 'Pen', '$1', 500]
[3, 'Eraser', '$0.25', 100]
[4, 'Sharpner', '$0.75', 100]
[5, 'Files', '$3', 50]
[6, 'A4 Size Paper', '$9', 10]
[7, 'Pencil Box', '$12', 20]
[8, 'Pen Stand', '$5.5', 10]
[9, 'Notebook', '$2', 50]
[10, 'Marker', '$1', 75]
Method 4 - Reading a single column in Excel using cell name
Similar to reading a single row, you can read the data in a single column of your Excel sheet by its alphabet.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Access all cells in column A
for data in sheet["A"]:
print(data.value)
Output
ProductId
1
2
3
4
5
6
7
8
9
10
Method 5 - Reading all the columns in Excel using columns attribute
To read all the data as a tuple of the columns in your Excel sheet, use sheet.columns
attribute to iterate over all columns with Openpyxl.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Access all columns
for col in sheet.columns:
print([data.value for data in col])
Output
['ProductId', 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
['ProductName', 'Pencil', 'Pen', 'Eraser', 'Sharpner', 'Files', 'A4 Size Paper', 'Pencil Box', 'Pen Stand', 'Notebook', 'Marker']
['Cost per Unit', '$0.5', '$1', '$0.25', '$0.75', '$3', '$9', '$12', '$5.5', '$2', '$1']
['Quantity', 200, 500, 100, 100, 50, 10, 20, 10, 50, 75]
Method 6 - Reading all the data in Excel
To read all the data present in your Excel sheet, you don’t need to index the sheet
object. You can just iterate over it.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Access all cells in Excel
for row in sheet:
print([data.value for data in row])
Output
['ProductId', 'ProductName', 'Cost per Unit', 'Quantity']
[1, 'Pencil', '$0.5', 200]
[2, 'Pen', '$1', 500]
[3, 'Eraser', '$0.25', 100]
[4, 'Sharpner', '$0.75', 100]
[5, 'Files', '$3', 50]
[6, 'A4 Size Paper', '$9', 10]
[7, 'Pencil Box', '$12', 20]
[8, 'Pen Stand', '$5.5', 10]
[9, 'Notebook', '$2', 50]
[10, 'Marker', '$1', 75]
Find the max row and column number of an Excel Sheet in Python
To find the max row and column number from your Excel sheet in Python, use sheet.max_row
and sheet.max_column
attributes in Openpyxl.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
print(f"Max row in the active sheet is {sheet.max_row}")
print(f"Max column in the active sheet is {sheet.max_column}")
Output
Max row in the active sheet is 11
Max column in the active sheet is 4
Note - If you update a cell with a value, the sheet.max_row
and sheet.max_column
values also change, even though you haven’t saved your changes.
Code
from openpyxl import load_workbook
wb = load_workbook('pylenin.xlsx')
sheet = wb.active
sheet["A1"].value = "Lenin"
print(sheet.max_row)
sheet["A2"].value = "Mishra"
print(sheet.max_row)
# wb.save('pylenin.xlsx')
Output
1
2
How to iterate over Excel rows and columns in Python?
Openpyxl offers two commonly used methods called iter_rows
and iter_cols
to iterate over Excel rows and columns in Python.
iter_rows()
- Returns one tuple element per row selected.iter_cols()
- Returns one tuple element per column selected.
Both the above mentioned methods can receive the following arguments for setting boundaries for iteration:
- min_row
- max_row
- min_col
- max_col
Example 1 - iter_rows()
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Access all cells from between
# "Row 1 and Row 2" and "Column 1 and Column 3"
for row in sheet.iter_rows(min_row=1,
max_row=2,
min_col=1,
max_col=3):
print([data.value for data in row])
Output
['ProductId', 'ProductName', 'Cost per Unit']
[1, 'Pencil', '$0.5']
As you can see, only the first 3 columns of the first 2 rows are returned. The tuples are row based.
You can also choose to not pass in some or any arguments in iter_rows
method.
Code - Not passing min_col and max_col
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# min_col and max_col arguments are not provided
for row in sheet.iter_rows(min_row=1,
max_row=2):
print([data.value for data in row])
Output
['ProductId', 'ProductName', 'Cost per Unit', 'Quantity']
[1, 'Pencil', '$0.5', 200]
All the columns from the first 2 rows are being printed.
Example 2 - iter_cols()
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Access all cells from A1 to D11
for row in sheet.iter_cols(min_row=1,
max_row=2,
min_col=1,
max_col=3):
print([data.value for data in row])
Output
['ProductId', 1]
['ProductName', 'Pencil']
['Cost per Unit', '$0.5']
The tuples returned are column based on using iter_cols()
method.
You can also choose to not pass in some or any arguments in iter_cols()
method.
Code - Not passing any argument
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# min_col and max_col arguments are not provided
for row in sheet.iter_cols():
print([data.value for data in row])
Output
['ProductId', 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
['ProductName', 'Pencil', 'Pen', 'Eraser', 'Sharpner', 'Files', 'A4 Size Paper', 'Pencil Box', 'Pen Stand', 'Notebook', 'Marker']
['Cost per Unit', '$0.5', '$1', '$0.25', '$0.75', '$3', '$9', '$12', '$5.5', '$2', '$1']
['Quantity', 200, 500, 100, 100, 50, 10, 20, 10, 50, 75]
Create a new Excel file with Python
To create a new Excel file in Python, you need to import the Workbook
class from Openpyxl library.
Code
from openpyxl import Workbook
wb = Workbook()
sheet = wb.active
sheet['A1'] = "Pylenin"
sheet['B1'] = "loves"
sheet['C1'] = "Python"
wb.save("pylenin.xlsx")
This should create a new Excel workbook called pylenin.xlsx
with the provided data.

Writing data to Excel in Python
There are multiple ways to write data to an Excel file in Python.
Method 1 - Writing data to Excel using cell names
Code
from openpyxl import Workbook
wb = Workbook()
sheet = wb.active
sheet['A1'] = "Pylenin"
sheet['B1'] = "loves"
sheet['C1'] = "Python"
wb.save("pylenin.xlsx")
Output

Method 2 - Writing data to Excel using the cell() method
Code
from openpyxl import Workbook
wb = Workbook()
sheet = wb.active
sheet.cell(row=1, column=1).value = "Pylenin"
sheet.cell(row=1, column=2).value = "loves"
sheet.cell(row=1, column=3).value = "Python"
wb.save("pylenin.xlsx")
Output

Method 3 - Writing data to Excel by iterating over rows
Code - Example 1
from openpyxl import Workbook
wb = Workbook()
sheet = wb.active
for row in sheet["A1:D3"]:
row[0].value = "Pylenin"
row[1].value = "loves"
row[2].value = "Python"
wb.save("pylenin.xlsx")
Output

You can also use methods like iter_rows()
and iter_cols()
to write data to Excel.
Code - Example 2 - using iter_rows()
method
from openpyxl import Workbook
wb = Workbook()
sheet = wb.active
for row in sheet.iter_rows(min_row=1,
max_row=3,
min_col=1,
max_col=3):
row[0].value = "Pylenin"
row[1].value = "loves"
row[2].value = "Python"
wb.save("pylenin.xlsx")
Output

Code - Example 3 - using iter_cols()
method
from openpyxl import Workbook
wb = Workbook()
sheet = wb.active
for col in sheet.iter_cols(min_row=1,
max_row=3,
min_col=1,
max_col=3):
col[0].value = "Pylenin"
col[1].value = "loves"
col[2].value = "Python"
wb.save("pylenin.xlsx")
Output

Appending data to Excel in Python
Openpyxl provides an append()
method, which is used to append values to an existing Excel sheet in Python.
Code
from openpyxl import load_workbook
wb = load_workbook('pylenin.xlsx')
sheet = wb.active
data = (
("Pylenin", "likes", "icecream"),
("Pylenin", "likes", "Cricket")
)
for row in data:
sheet.append(row)
wb.save('pylenin.xlsx')
Output

Manipulating Excel Sheets in Python
Each Excel workbook can contain multiple sheets. To get a list of all the sheet names in an Excel workbook, you can use the wb.sheetnames
.
Code
from openpyxl import load_workbook
wb = load_workbook('pylenin.xlsx')
print(wb.sheetnames)
Output
['Sheet']
As you can see, pylenin.xlsx
has only one sheet.
To create a new sheet in Python, use the create_sheet()
method from the Openpyxl library.
Code
from openpyxl import load_workbook
wb = load_workbook('pylenin.xlsx')
wb.create_sheet('Pylenin')
wb.save('pylenin.xlsx')
Output

You can also create sheets at different positions in the Excel Workbook.
Code
from openpyxl import load_workbook
wb = load_workbook('pylenin.xlsx')
# insert sheet at 2nd to last position
wb.create_sheet('Lenin Mishra', -1)
wb.save('pylenin.xlsx')
Output

If your Excel workbook contains multiple sheets and you want to work with a particular sheet, you can refer the title of that sheet in your workbook object.
Code
from openpyxl import load_workbook
wb = load_workbook('pylenin.xlsx')
ws = wb["Pylenin"]
ws["A1"].value = "Pylenin"
ws["A2"].value = "loves"
ws["A3"].value = "Python"
wb.save('pylenin.xlsx')
Output

Practical usage example of data analysis of Excel sheets in Python
Let’s perform some data analysis with wb1.xlsx
file as shown in the first image.
Objective
- Add a new column showing
Total Price per Product
. - Calculate the
Total Cost
of all the items bought.
The resulting Excel sheet should look like the below image.

Step 1 - Find the max row and max column of the Excel sheet
As mentioned before, you can use the sheet.max_row
and sheet.max_column
attributes to find the max row and max column for any Excel sheet with Openpyxl.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
print(f"Max row in the active sheet is {sheet.max_row}")
print(f"Max column in the active sheet is {sheet.max_column}")
Output
Max row in the active sheet is 11
Max column in the active sheet is 4

Step 2 - Add an extra column in Excel with Python

To add an extra column in the active Excel sheet, with calculations, you need to first create a new column header in the first empty cell and then iterate over all rows to multiply Quantity
with Cost per Unit
.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Add new column header
sheet.cell(row=1, column=sheet.max_column+1).value = "Total Price per Product"
wb.save("wb1.xlsx")
Output

Now that an extra column header has been created, the sheet.max_column
value will change to 5.
Now you can calculate the Total Price per Product
using iter_rows()
method.
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
# Calculate Total Price per Product
for id, row in enumerate(sheet.iter_rows(min_row=2,
max_row = sheet.max_row)):
row_number = id + 2 # index for enumerate will start at 0
product_name = row[1].value
cost_per_unit = row[2].value
quantity = row[3].value
print(f"Total cost for {product_name} is {cost_per_unit*quantity}")
# Update cell value in the last column
current_cell = sheet.cell(row=row_number, column=sheet.max_column)
current_cell.value = cost_per_unit*quantity
# Format cell from number to $ currency
current_cell.number_format = '$#,#0.0'
print("\nSuccesfully updated Excel")
wb.save('wb1.xlsx')
Output
Total cost for Pencil is 100.0
Total cost for Pen is 500
Total cost for Eraser is 25.0
Total cost for Sharpner is 75.0
Total cost for Files is 150
Total cost for A4 Size Paper is 90
Total cost for Pencil Box is 240
Total cost for Pen Stand is 55.0
Total cost for Notebook is 100
Total cost for Marker is 75
Succesfully updated Excel

Step 3 - Calculate sum of a column in Excel with Python
The last step is to calculate the Total Cost
of the last column in the Excel file.
Access the last column and add up all the cost.
You can read the last column by accessing the sheet.columns
attribute. Since it returns a generator, you first convert it to a python list
and access the last column.
last_column_data = list(sheet.columns)[-1]
# Ignore header cell
total_cost = sum([x.value for x in last_column_data[1:]])
Create a new row 2 places down from the max_row and fill in Total Cost
.
max_row = sheet.max_row
total_cost_descr_cell = sheet.cell(row = max_row +2, column = sheet.max_column -1)
total_cost_descr_cell.value = "Total Cost"
total_cost_cell = sheet.cell(row = max_row +2, column = sheet.max_column)
total_cost_cell.value = total_cost
Import Font
class from openpyxl.styles
to make the last row Bold.
# Import the Font class from Openpyxl
from openpyxl.styles import Font
bold_font = Font(bold=True)
total_cost_descr_cell.font = bold_font
total_cost_cell.font = bold_font
total_cost_cell.number_format = "$#,#0.0"
Final Code
Code
from openpyxl import load_workbook
wb = load_workbook('wb1.xlsx')
sheet = wb.active
last_column_data = list(sheet.columns)[-1]
# Ignore header cell
total_cost = sum([x.value for x in last_column_data[1:]])
max_row = sheet.max_row
total_cost_descr_cell = sheet.cell(row = max_row + 2, column = sheet.max_column -1)
total_cost_descr_cell.value = "Total Cost"
total_cost_cell = sheet.cell(row = max_row + 2, column = sheet.max_column)
total_cost_cell.value = total_cost
# Import the Font class from Openpyxl
from openpyxl.styles import Font
bold_font = Font(bold=True)
total_cost_descr_cell.font = bold_font
total_cost_cell.font = bold_font
total_cost_cell.number_format = "$#,#0.0"
print("\nSuccesfully updated Excel")
wb.save('wb1.xlsx')
When you run the above code, you should see all the relevant updates to your Excel sheet.
