How to read/write Excel files in Python

Python-Read-Write-Excel-Files

In this tutorial, we’re gonna look at way to use openpyxl module to read, write Excel spreadsheet files in Python program.

Excel spreadsheet files

An Excel file is called a workbook which is saved on PC as .xlsx extension.
Each workbook contains one or more sheets.
Each sheet has columns (letters: A, B, C…) and rows (numbers: 1, 2, 3…).
A box addressed by [column, row] pair is called a cell.

read-write-excel-files-in-python-openpyxl-module-excel-file

Read/Write Excel files in Python

Install openpyxl module

Open cmd, then run:
pip install openpyxl

Once the installation is successful, we can see openpyxl folder at Python\Python[version]\Lib\site-packages.
(In this tutorial, we use openpyxl 2.5.12)

Now we can import the module by running import openpyxl.

Read Excel file

Open workbook

We use openpyxl.load_workbook(filename) function with input filename to get a Workbook object that represents the Excel file.


>>> import openpyxl
>>> gkzWb = openpyxl.load_workbook('ozenero.xlsx')
>>> type(gkzWb)

Get sheets

We call Workbook's get_sheet_names() method to get list of all the sheet names in the workbook.


>>> gkzWb.get_sheet_names()
['Python Tutorials', 'Java Tutorials']

Then we can obtain a Worksheet object with the sheet name using Workbook's get_sheet_by_name() method, or we can get the Workbook’s active sheet using get_active_sheet() method.


>>> pySheet = gkzWb.get_sheet_by_name('Python Tutorials')
>>> type(pySheet)

>>> pySheet.title
'Python Tutorials'

>>> gkzWb.get_active_sheet()

We can get size of Worksheet object using its max_row and max_column attributes.


>>> pySheet.max_row
11
>>> pySheet.max_column
3

Get cells

With a Worksheet object, we can access Cell objects inside.


>>> pySheet['A2']


>>> pySheet['A1'].value
'Title'
>>> pySheet['A2'].value
'How to parse HTML in Python using BeautifulSoup module'
>>> pySheet['B2'].value
'beautifulsoup, parse html, python, web crawling, web scraping\t'
>>> pySheet['C2'].value
datetime.datetime(2019, 1, 19, 0, 0)

After column Z, the columns continues with two letters: AA, AB... To make things simple, we can also get a Cell using Worksheet's cell() method with input integers for row and column arguments.

*Note: The first row or column integer is 1.


>>> pySheet.cell(row=1, column=1)

>>> pySheet.cell(row=1, column=2)

>>> pySheet.cell(row=3, column=3)

>>> pySheet.cell(3, 3)


>>> pySheet.cell(3, 27)

>>> pySheet.cell(3, 28)

We can also accessCell object location information with its row, column, and coordinate attributes.


>>> cellC1 = pySheet['C1']
>>> cellC1.column
'C'
>>> cellC1.row
1
>>> cellC1.coordinate
'C1'

We can also slice Worksheet object to get list of Cell objects in a rectangular area:


>>> pySheet['A2':'C5']
(
(, , ),
(, , ),
(, , ),
(, , )
)

>>> for row in pySheet['A2':'C5']:
...     for cell in row:
...             print(cell.coordinate, cell.value)
...     print('---')
...
A2 How to parse HTML in Python using BeautifulSoup module
B2 beautifulsoup, parse html, python, web crawling, web scraping
C2 2019-01-19 00:00:00
---
A3 How to copy, move, rename, delete files/folders in Python
B3 python, file
C3 2019-01-13 00:00:00
---
A4 How to read/write files in Python
B4 python, file, read file, write file
C4 2019-01-11 00:00:00
---
A5 Python Regular Expression to extract email from text
B5 python, regex, regular expression
C5 2019-01-06 00:00:00

For cells in a particular row or column, we can also use a Worksheet object's rows and columns attribute.


>>> list(pySheet.columns)
[
(, , ..., ),
(, , ..., ),
(, , ..., )
]

>>> column = list(pySheet.columns)[0]
>>> for cell in column:
...     print(str(cell.row) + ': ' + cell.value)
...
1: Title
2: How to parse HTML in Python using BeautifulSoup module
3: How to copy, move, rename, delete files/folders in Python
4: How to read/write files in Python
5: Python Regular Expression to extract email from text
6: Python Regular Expression to extract phone number
7: Python Regular Expression
8: Python String methods
9: Python Dictionary Data Structure
10: How to iterate over a List in Python
11: Python List functions

>>> row = list(pySheet.rows)[1]
>>> for cell in row:
...     print(cell.value)
...
How to parse HTML in Python using BeautifulSoup module
beautifulsoup, parse html, python, web crawling, web scraping
2019-01-19 00:00:00

Write Excel file

Create workbook

We use openpyxl.Workbook() function that returns a new blank Workbook object.


>>> gkzWb = openpyxl.Workbook()
>>> type(gkzWb)

>>> gkzWb.get_sheet_names()
['Sheet']

Create & Remove sheet

We use Workbook's create_sheet() method to create a new Worksheet object.
By default, it is set to be the last sheet in the workbook named SheetX.

We can also specify the index and name of the new sheet by passing them as arguments to code>create_sheet() method.


>>> gkzWb.create_sheet()

>>> gkzWb.create_sheet()

>>> gkzWb.get_sheet_names()
['Sheet', 'Sheet1', 'Sheet2']

>>> gkzWb.create_sheet(index=0, title='Python Tuts')

>>> gkzWb.create_sheet(index=2, title='Java Tuts')

>>> gkzWb.get_sheet_names()
['Python Tuts', 'Sheet', 'Java Tuts', 'Sheet1', 'Sheet2']

To remove a Worksheet object from a Workbook, we use remove_sheet() method and pass that Worksheet object as argument (not the sheet name).

>>> gkzWb.remove_sheet(gkzWb.get_sheet_by_name('Sheet'))
>>> gkzWb.get_sheet_names()
['Python Tuts', 'Java Tuts', 'Sheet1', 'Sheet2']

>>> gkzWb.remove_sheet(gkzWb.get_sheet_by_name('Sheet1'))
>>> gkzWb.remove_sheet(gkzWb.get_sheet_by_name('Sheet2'))
>>> gkzWb.get_sheet_names()
['Python Tuts', 'Java Tuts']

Write value to cells

It is just like writing values to keys in a Python dictionary. In this case, key are cell's coordinate.


>>> pySheet = gkzWb.get_sheet_by_name('Python Tuts')
>>> pySheet['A1'] = 'Title'
>>> pySheet['B1'] = 'Tags'
>>> pySheet['C1'] = 'Date'

Save workbook

All the changes we have made on Workbook object will not be saved until we call Workbook's save() method.

Whenever we edit a file, we should always save the new, edited one to a different filename than the original. Passing a different filename to save() method (such as 'ozenero_copy_1.xlsx') will save the changes to a copy of the file.


>>> gkzWb.save('ozenero.xlsx')

>>> pySheet['A2'] = 'How to parse HTML in Python using BeautifulSoup module'
>>> pySheet['A3'] = 'How to copy, move, rename, delete files/folders in Python'
>>> gkzWb.save('ozenero-copy-1.xlsx')

Setting cells

Font object

To customize font styles in cells, we import Font from the openpyxl.styles module.


>>> import openpyxl
>>> from openpyxl.styles import Font
>>> gkzWb = openpyxl.load_workbook('ozenero.xlsx')
>>> sheet = gkzWb.get_sheet_by_name('Python Tuts')

>>> fontObj = Font(name='Times New Roman', size=23, bold=True, italic=True)
>>> sheet['A1'].font = fontObj
>>> gkzWb.save('ozenero-copy.xlsx')

Formula

Adding formulas to cells programmatically is just like any normal value.


>>> import openpyxl
>>> sumWb = openpyxl.Workbook()
>>> sh = sumWb.get_active_sheet()
>>> sh['A1'] = 100
>>> sh['A2'] = 200
>>> sh['A3'] = 300
>>> sh['A4'] = '=SUM(A1:A3)'
>>> sumWb.save('sum.xlsx')

>>> sh['A4'].value
'=SUM(A1:A3)'

Size of cells

We can set row_dimensions and column_dimensions attributes value of Worksheet object to change row heights (using numbers) and column widths (using letters).


>>> import openpyxl
>>> gkzWb = openpyxl.load_workbook('ozenero.xlsx')
>>> sheet = gkzWb.get_active_sheet()
>>> sheet.row_dimensions[3].height = 68
>>> sheet.column_dimensions['B'].width = 42
>>> gkzWb.save('gkz-size.xlsx')

read-write-excel-files-in-python-openpyxl-module-excel-file-change-size-cell

Merge and Unmerge cells

We can use WorkSheet's merge_cells() method to merge a rectangular area into a single cell.


>>> import openpyxl
>>> wb = openpyxl.Workbook()
>>> sheet = wb.get_active_sheet()
>>> sheet.merge_cells('A1:C3')
>>> sheet['A1'] = 'ozenero Python Tutorials'
>>> wb.save('gkz-merged.xlsx')

read-write-excel-files-in-python-openpyxl-module-excel-file-change-merge-cells

0 0 votes
Article Rating
Subscribe
Notify of
guest
75 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments