Excel Package for Python

Best Python Packages for Excel

At some point, you’ll probably need to work with data from an Excel spreadsheet. How can you work with Excel data and files in Python? We review some of the best Python packages for Excel in this article.

An Excel spreadsheet is a very common way of storing tabular data. But Excel is not without its problems, as we discussed in the article Excel Alternative: What to Learn as a Data Analyst. For large datasets, you may need the functionality of a database. But when you’re working with smaller datasets, you may want the convenience of Excel. In this case, knowing how to work with Excel data in Python is an important skill to master.

The Python libraries we’ll discuss can allow you to do everything from reading, writing, and modifying existing Excel files to creating new Excel files. For some background reading, check out our article How to Read Excel Files in Python. Or to broaden your skills even further, we have a Working with Files and Directories in Python course, which will give you the ability to load data more efficiently and store or share the results.

So, let’s talk about these Python packages that make working with Excel possible. But first, we need to clear one thing up: the many file formats in Excel.

Excel File Formats

Until 2007, Excel used a file format with the extension .xls. For later versions of Excel, the default file format became the Excel Workbook, which has the extension .xlsx. Other formats have appeared to support more specific functionality. These include .xlsm (a macro-enabled file format) and the binary file format .xlsb for Excel 2007 and Excel 2010. There are also template file formats, including .xltx and .xltm (a macro-enabled template format).

5 Libraries to Make Working with Excel in Python Easier

1. openpyxl

The first Python package for Excel which we’ll discuss is openpyxl. It’s possibly the most widely used package for working with Excel files in Python. This package is designed to read and write Excel 2010 files with formats including .xlsx, .xlsm, .xltx, and .xltm. As mentioned in the official documentation, openpyxl could be vulnerable to certain malicious attacks, but these can be guarded against.

If you want to create a new Excel file, start by importing the library and creating a workbook object:

>>> import openpyxl
>>> wb_obj = openpyxl.Workbook()

Now you can get the active sheet and start assigning data to the cells. Finally, use the save() method to write the file:

>>> sheet = wb_obj.active
>>> sheet['A1'] = 2
>>> sheet['A2'] = 3
>>> wb_obj.save('data.xlsx')

You can also read an existing file and modify it using this package.
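Reading and modifying an existing file could look like the sketch below. It assumes openpyxl is installed; the helper name append_total and the temporary file location are illustrative choices, not part of the original example.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def append_total(path):
    """Open an existing workbook, store A1 + A2 in A3, and save it in place."""
    wb = load_workbook(path)
    sheet = wb.active
    sheet['A3'] = sheet['A1'].value + sheet['A2'].value
    wb.save(path)
    return sheet['A3'].value


# Create a small file first, with the same two values as the example above
path = os.path.join(tempfile.mkdtemp(), 'data.xlsx')
wb_obj = Workbook()
sheet = wb_obj.active
sheet['A1'] = 2
sheet['A2'] = 3
wb_obj.save(path)

# Reopen the saved file, modify it, and write it back
total = append_total(path)
print(total)
```

Here load_workbook() returns the same kind of workbook object as Workbook(), so cells are read and assigned in the same way before calling save().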

2. XlsxWriter

The next Python package for working with Excel is XlsxWriter, which works with .xlsx files. It can create files, add data to tables in the workbook, and format the data. A particularly nice feature is the ability to use Python to add charts directly into the workbook. This package also gives you the ability to apply formulas to the workbook.

XlsxWriter cannot be used to read or modify an existing Excel file. To use it, we need to create a new xlsx file, add some data, and apply a formula:

>>> import xlsxwriter
>>> wb_obj = xlsxwriter.Workbook('formula.xlsx')
>>> sheet = wb_obj.add_worksheet()
>>> sheet.write('A1', 2)
>>> sheet.write('A2', 3)
>>> sheet.write_formula('A3', '')
>>> wb_obj.close()
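The formula string in the snippet above is empty. As a minimal sketch, assuming a sum of the two cells was intended, the following writes =SUM(A1:A2) and also adds a simple column chart. The function name, the chart type, and the chart position are illustrative assumptions; the chart range relies on XlsxWriter naming the first worksheet Sheet1 by default.

```python
import os
import tempfile
import zipfile

import xlsxwriter


def write_formula_workbook(path):
    """Create a new .xlsx file with two values, a SUM formula, and a chart."""
    wb = xlsxwriter.Workbook(path)
    sheet = wb.add_worksheet()  # named 'Sheet1' by default
    sheet.write('A1', 2)
    sheet.write('A2', 3)
    # Assumed formula: the original string was left empty
    sheet.write_formula('A3', '=SUM(A1:A2)')

    # Plot the two values as a column chart placed next to the data
    chart = wb.add_chart({'type': 'column'})
    chart.add_series({'values': '=Sheet1!$A$1:$A$2'})
    sheet.insert_chart('C1', chart)

    wb.close()  # the file is only written when the workbook is closed


path = os.path.join(tempfile.mkdtemp(), 'formula.xlsx')
write_formula_workbook(path)
print(os.path.exists(path), zipfile.is_zipfile(path))
```

Excel evaluates the formula when the file is opened; XlsxWriter itself only stores the formula text.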

For more information and examples, see the XlsxWriter documentation.

3 and 4. pyxlsb and pyxlsb2

As the name of the next package suggests, pyxlsb specialises in parsing the binary file format .xlsb. The functionality is quite limited, but you can open a workbook, get a particular sheet, and read the rows. This can be achieved with the open_workbook(), get_sheet_by_index(), and rows() methods.

There is also some limited functionality for formatting dates to convert them to datetime objects. The updated version, pyxlsb2, offers some improvements over its predecessor. These include speeding up processing, loading worksheets and macrosheets, and extracting macro formulas.

5. pylightxl

The pylightxl package is a lightweight, zero-dependency package that can read and write Excel files. The zero-dependency factor could be a compelling feature if you’re developing bigger projects, since this will avoid any compatibility issues with other software and make version control easier. Also, regardless of which version of Python you’re running (from Python 2.7.18 onwards), pylightxl will be compatible for life.

After installing this library, you can import it and read in a file:

>>> import pylightxl as xl
>>> db = xl.readxl('data.xlsx')

From here, you can use many methods on this database object to access and modify the worksheets and write out a new Excel file.

How to Excel at Python

We can’t talk about working with data in Python without mentioning pandas. This is an incredibly useful library. In the article Python Libraries Every Programming Beginner Should Know, we show an example of how to read Excel files in Python with pandas. This is a fundamental library and one of our Top 15 Python Libraries for Data Science.
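As a brief sketch of the pandas route, the snippet below writes a small DataFrame to an .xlsx file and reads it back. It assumes pandas and openpyxl are both installed (pandas relies on openpyxl for .xlsx files); the column names and file name are made up for the example.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), 'scores.xlsx')

# Write a small table to Excel without the index column
original = pd.DataFrame({'name': ['Ann', 'Bob'], 'score': [2, 3]})
original.to_excel(path, index=False)

# Read the first sheet back into a DataFrame
loaded = pd.read_excel(path)
print(loaded)
```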

Using Python to read Excel files is essential if you want to get good at working with data. To boost your motivation, here’s an explanation of the Benefits of Learning Python.

One of our 5 Tips for Learning Python From Scratch is to find a good resource. So, consider taking the How to Read and Write Excel Files in Python course. It includes 45 interactive exercises, so you get plenty of practical experience.

Working with Excel Files in Python

This site contains pointers to the best information available about working with Excel files in the Python programming language.

Reading and Writing Excel Files

There are Python packages available to work with Excel files that will run on any Python platform and that do not require either Windows or Excel to be used. They are fast, reliable and open source:

openpyxl

The recommended package for reading and writing Excel 2010 files (i.e. .xlsx)

xlsxwriter

An alternative package for writing data, formatting information and, in particular, charts in the Excel 2010 format (i.e. .xlsx)

pyxlsb

This package allows you to read Excel files in the xlsb format.

pylightxl

This package allows you to read xlsx and xlsm files and write xlsx files.

xlrd

This package is for reading data and formatting information from older Excel files (i.e. .xls)

xlwt

This package is for writing data and formatting information to older Excel files (i.e. .xls)

xlutils

This package collects utilities that require both xlrd and xlwt, including the ability to copy and modify or filter existing Excel files.

NB: In general, these use cases are now covered by openpyxl!
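The list above can be condensed into a small lookup that suggests a reading package from a file's extension. This is only a sketch of the recommendations on this page; the names READERS and suggest_reader are hypothetical.

```python
from pathlib import Path

# Reading packages per extension, following the descriptions above
READERS = {
    '.xlsx': 'openpyxl',
    '.xlsm': 'openpyxl',
    '.xltx': 'openpyxl',
    '.xltm': 'openpyxl',
    '.xlsb': 'pyxlsb',
    '.xls': 'xlrd',
}


def suggest_reader(filename):
    """Return the suggested package name, or None for an unknown extension."""
    suffix = Path(filename).suffix.lower()
    return READERS.get(suffix)


print(suggest_reader('report.XLSX'))
print(suggest_reader('legacy.xls'))
print(suggest_reader('notes.txt'))
```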

Writing Excel Add-Ins

The following products can be used to write Excel add-ins in Python. Unlike the reader and writer packages, they require an installation of Microsoft Excel.

PyXLL

PyXLL is a commercial product that enables writing Excel add-ins in Python with no VBA. Python functions can be exposed as worksheet functions (UDFs), macros, menus and ribbon tool bars.

xlwings

xlwings is an open-source library to automate Excel with Python instead of VBA and works on Windows and macOS: you can call Python from Excel and vice versa and write UDFs in Python (Windows only). xlwings PRO is a commercial add-on with additional functionality.

The Mailing List / Discussion Group

There is a Google Group dedicated to working with Excel files in Python, including the libraries listed above along with manipulating the Excel application via COM.

Commercial Development

The following companies can provide commercial software development and consultancy and are specialists in working with Excel files in Python:

openpyxl - A Python library to read/write Excel 2010 xlsx/xlsm files

openpyxl is a Python library to read/write Excel 2010 xlsx/xlsm/xltx/xltm files.

It was born from lack of existing library to read/write natively from Python the Office Open XML format.

All kudos to the PHPExcel team as openpyxl was initially based on PHPExcel.

Security

By default openpyxl does not guard against quadratic blowup or billion laughs xml attacks. To guard against these attacks install defusedxml.
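Before opening files from untrusted sources, it may be worth checking that defusedxml is actually importable in the current environment. The helper below is a hypothetical convenience that uses only the standard library; it does not change how openpyxl parses XML.

```python
import importlib.util


def defusedxml_installed():
    """Return True if the defusedxml package can be imported."""
    return importlib.util.find_spec('defusedxml') is not None


if defusedxml_installed():
    print('defusedxml is available')
else:
    # Installation hint only; nothing is installed automatically
    print('defusedxml is missing; consider: pip install defusedxml')
```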

Sample code

from openpyxl import Workbook
wb = Workbook()

# grab the active worksheet
ws = wb.active

# Data can be assigned directly to cells
ws['A1'] = 42

# Rows can also be appended
ws.append([1, 2, 3])

# Python types will automatically be converted
import datetime
ws['A2'] = datetime.datetime.now()

# Save the file
wb.save("sample.xlsx")

Support

This is an open source project, maintained by volunteers in their spare time. This may well mean that particular features or functions that you would like are missing. But things don’t have to stay that way. You can contribute to the project's development yourself or contract a developer for particular features.

Professional support for openpyxl is available from Clark Consulting & Research and Adimian. Donations to the project to support further development and maintenance are welcome.

Bug reports and feature requests should be submitted using the issue tracker. Please provide a full traceback of any error you see and, if possible, a sample file. If for reasons of confidentiality you are unable to make a file publicly available, then contact one of the developers.

The repository is being provided by Octobus and Clever Cloud.

How to Contribute

Any help will be greatly appreciated. Just follow these steps:

1. Please join the group and create a branch (https://foss.heptapod.net/openpyxl/openpyxl/), then follow the Merge Request Start Guide for each independent feature. Don’t try to fix all problems at the same time; it’s easier for those who will review and merge your changes 😉

2. Hack hack hack

3. Don’t forget to add unit tests for your changes! (YES, even if it’s a one-liner, changes without tests will not be accepted.) There are plenty of examples in the source if you lack know-how or inspiration.

4. If you added a whole new feature, or just improved something, you can be proud of it, so add yourself to the AUTHORS file 🙂

5. Let people know about the shiny thing you just implemented, update the docs!

6. When it’s done, just issue a pull request (click on the large “pull request” button on your repository) and wait for your code to be reviewed and, if you followed all these steps, merged into the main repository.

For further information see Development

Other ways to help

There are several ways to contribute, even if you can’t code (or can’t code well):

  • triaging bugs on the bug tracker: closing bugs that have already been fixed, are not relevant, cannot be reproduced, …
  • updating documentation in virtually every area: many large features have been added (mainly about charts and images at the moment) but without any documentation, it’s pretty hard to do anything with it
  • proposing compatibility fixes for different versions of Python: we support 3.6, 3.7, 3.8 and 3.9.
