From xlrd to openpyxl
The Problem:
I recently stumbled upon a problem updating a python project which would automatically create invoices based on an excel file. The python project was created using the xlrd library which does not work with xlsx files. In order to update the project to the openpyxl library I needed to rewrite more than 500 lines of code, until I found this quick workaround.
A little more context: the invoices are created using a third party's API. This API is quite old and only reads xml files, thus we translate the excel file to an xml file and send it to the API.
The Solution:
After looking some of the lines of code, I noticed a pattern: all reads to a specific cell in a sheet were written:
str(sheet.cell_value(row, column))
as per the xrld standard notation of `row_number, column_number', where the indexes start in 0.
Openpyxl uses a different notation for accessing the row and column of a sheet:
sheet['column_letter row_number']
# example
sheet['A5']
or
sheet(row=row_number, column=column_number)
# example
sheet(row=1, column=5) #corresponds to A5
This is when I decided to create a function to change from xrld notation to openpyxl notation:
def read_excel(sheet, w_row, w_column):
return sheet.cell(row=w_row+1, column=w_column+1).value
the + 1
in the row and column corresponds to the indexes starting at 1 instead of 0.
By creating this function all I had to do is search and replace:
- search:
sheet.cell_value(
- replace with:
read_excel(sheet,
I hope this helps someone struggling with changing from xlrd to openpyxl.