Openpyxl Python: getting the number of rows

Is there any method to get the number of rows and columns present in an .xlsx sheet using openpyxl?

Given a variable sheet, the number of rows and columns can be determined in one of the following ways:

Version ~= 3.0.5 Syntax

    rows = sheet.max_row
    columns = sheet.max_column

Version 1.x.x Syntax

    rows = sheet.nrows
    columns = sheet.ncols

Version 0.x.x Syntax

    rows = sheet.max_row
    columns = sheet.max_column
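To see what these properties actually report, here is a small sketch that builds a workbook in memory instead of opening a file (the cell contents are made up for illustration). It suggests that max_row and max_column are the indexes of the last used row and column, so empty rows in between are included in the figure.

```python
from openpyxl import Workbook

# In-memory workbook; the values are illustrative only.
wb = Workbook()
sheet = wb.active
sheet["A1"] = "header"
sheet["C5"] = 42  # rows 2-4 and column B stay empty

rows = sheet.max_row        # index of the last row that holds a cell
columns = sheet.max_column  # index of the last column that holds a cell
print(rows, columns)
```

If you need the number of rows that really contain data, you still have to inspect the cell values yourself.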

The answer contained outdated and inaccurate information, so it has been edited. (Charlie Clark, 15.02.2016)

@virtualxtc it's max_row, not plural. You also need to check whether you are using v1, in which case you need nrows? (Alexander Craggs, 29.06.2018)

No, it was just a silly mistake on my part. Thanks though. (virtualxtc, 29.06.2018)

Doesn't work for me. writer.sheets['Sheet1'] has neither max_row nor nrows. (newandlost, 21.11.2018)

    number_of_rows = sheet_obj.max_row
    last_row_index_with_data = 0
    # Walk upward from the last row until column 3 holds a value.
    while number_of_rows > 0:
        if sheet_obj.cell(number_of_rows, 3).value is not None:
            last_row_index_with_data = number_of_rows
            break
        number_of_rows -= 1
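The same idea can be wrapped in a reusable function. This is only a sketch: the name last_row_with_data and the sample cells are hypothetical, and it returns 0 when the column is completely empty rather than looping past row 1.

```python
from openpyxl import Workbook

def last_row_with_data(ws, column):
    """Return the index of the last row with a value in `column`, or 0."""
    for row in range(ws.max_row, 0, -1):
        if ws.cell(row=row, column=column).value is not None:
            return row
    return 0

# Illustrative data: column C ends at row 4, but max_row is 9.
wb = Workbook()
ws = wb.active
ws["C1"] = "a"
ws["C4"] = "b"
ws["A9"] = "x"
print(last_row_with_data(ws, 3))
```

Scanning upward from max_row keeps the loop short when the trailing gap is small, which is the common case.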

The worksheet (an xlsxwriter worksheet here, since the writer uses engine='xlsxwriter') has these attributes: 'dim_colmax', 'dim_colmin', 'dim_rowmax', 'dim_rowmin'. Here is a small example:

    import pandas as pd

    writer = pd.ExcelWriter("some_excel.xlsx", engine='xlsxwriter')
    workbook = writer.book
    worksheet = writer.sheets[RESULTS_SHEET_NAME]
    last_row = worksheet.dim_rowmax

Building upon Dani's solution, and not having enough reputation to comment there: I edited the code by adding a manual cap to reduce the time spent searching.

    # Iterate to find the last row with a value in column 3.
    nrows = ws.max_row
    if nrows > 1000:
        nrows = 1000  # manual cap: rows beyond 1000 are not searched
    lastrow = 0
    while nrows > 0:
        if ws.cell(nrows, 3).value is not None:
            lastrow = nrows
            break
        nrows -= 1

A solution using Pandas to get the row and column counts of all sheets. It uses df.shape to get the counts.

    import pandas as pd

    xl = pd.ExcelFile('file.xlsx')
    sheetnames = xl.sheet_names  # get sheet names
    for sheet in sheetnames:
        df = xl.parse(sheet)
        dimensions = df.shape  # (rows, columns); the header row is not counted
        print('sheetname', ' --> ', sheet)
        print(f'row count on "{sheet}" is {dimensions[0]}')
        print(f'column count on "{sheet}" is {dimensions[1]}')
        print('-----------------------------')

@ElMaestroDeToMare definitely worked for me when I wrote this answer, but do note it was half a decade ago and the library has now gone through four major versions!

    import xlrd

    # Raw string so that "\f" is not read as a form-feed escape.
    location = r"Filelocation\filename.xlsx"
    # Note: xlrd 2.0 and later read only .xls files.
    wb = xlrd.open_workbook(location)
    s1 = wb.sheet_by_index(0)
    s1.cell_value(0, 0)  # read the value of the top-left cell
    print(" No. of rows: ", s1.nrows)
    print(" No. of columns: ", s1.ncols)

When I need the number of non-empty columns, the most efficient approach I've found is the following. Take care: it gives the number of NON-EMPTY columns, not the total number of columns. By "most efficient" I mean the easiest way to achieve the goal, not the fastest (I did not test execution speed). In the following, sheet is an instance of openpyxl.worksheet.worksheet.Worksheet:

    values = list(sheet.values)  # a list of tuples, all of the same length
    nb_cols = len(values[0])

If I need the number of non-empty rows, I do this:

    nb_lines = len([v for v in sheet.values if any(v)])

Note that this last statement can miscount: a row that contains only 0 (or other falsy values such as empty strings) is treated as empty.
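One way around that pitfall is to test each cell against None explicitly instead of relying on truthiness. A minimal sketch, where the function name count_non_empty_rows and the sample cells are hypothetical:

```python
from openpyxl import Workbook

def count_non_empty_rows(sheet):
    """Count rows that have at least one cell whose value is not None."""
    return sum(
        1 for row in sheet.values
        if any(cell is not None for cell in row)
    )

# Illustrative data: row 1 holds only zeros, row 2 is empty, row 3 has text.
wb = Workbook()
sheet = wb.active
sheet["A1"] = 0
sheet["B1"] = 0
sheet["A3"] = "x"

truthy_count = len([v for v in sheet.values if any(v)])
print(truthy_count, count_non_empty_rows(sheet))
```

Here the truthiness-based count skips the row of zeros, while the explicit None check keeps it.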

Is it possible to get an Excel document’s row count without loading the entire document into memory?

I'm working on an application that processes huge Excel 2007 files, and I'm using OpenPyXL to do it. OpenPyXL has two different methods of reading an Excel file: one "normal" method where the entire document is loaded into memory at once, and one method where iterators are used to read row-by-row. The problem is that when I'm using the iterator method, I don't get any document metadata like column widths and row/column count, and I really need this data. I assume this data is stored close to the top of the Excel document, so it shouldn't be necessary to load the whole 10 MB file into memory to get access to it. So, is there a way to get hold of the row/column count and column widths without loading the entire document into memory first?

I have a feeling that if you have huge Excel files, you’re likely using Excel for a task it is unsuited to.

In any case, I had a browse through openpyxl, and it doesn't seem to load the column dimensions for IterableWorksheet. If you load the whole thing at once you can get the dimensions like worksheet.column_dimensions["A"].width, however the column_dimensions dict is completely unpopulated for the iterable worksheet. :-/ It looks like the newer Excel documents are just XML, so you could in theory look for the column elements there and extract the info directly, but it's a hassle.
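For what it's worth, the "read the XML directly" idea can be sketched with the standard library alone. An .xlsx file is a ZIP archive, and in the worksheet XML the dimension and cols elements come before sheetData, so parsing can stop before any row is read. The function name read_sheet_header is hypothetical, and the default part path xl/worksheets/sheet1.xml is an assumption: a robust version would look the path up through the workbook's relationship files.

```python
import xml.etree.ElementTree as ET
import zipfile

# SpreadsheetML main namespace, as used in worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"

def read_sheet_header(xlsx, sheet_path="xl/worksheets/sheet1.xml"):
    """Return (dimension_ref, widths) without parsing the row data.

    `widths` maps (first_col, last_col) index pairs to the stored width.
    `sheet_path` is an assumed location of the worksheet part.
    """
    dimension = None
    widths = {}
    with zipfile.ZipFile(xlsx) as archive, archive.open(sheet_path) as part:
        for _event, elem in ET.iterparse(part, events=("start",)):
            if elem.tag == NS + "dimension":
                dimension = elem.get("ref")
            elif elem.tag == NS + "col" and elem.get("width") is not None:
                span = (int(elem.get("min")), int(elem.get("max")))
                widths[span] = float(elem.get("width"))
            elif elem.tag == NS + "sheetData":
                break  # rows start here; everything needed came earlier
    return dimension, widths
```

Calling read_sheet_header("big.xlsx") (with a file name of your own) would return something like a range string and a dict of widths. Note that the dimension record is written by the application that saved the file, so it is only as reliable as that application.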

@MadPhysicist 10 MB is actually moderately big for xlsx files. Remember, they are compressed XML, so a 10 MB xlsx could potentially unpack to more than 100 MB when loaded (especially if it contains non-primitive objects). I have worked with XLSX files in the 90 MB range though.

Adding on to what Hubro said, apparently get_highest_row() has been deprecated. Using the max_row and max_column properties returns the row and column count. For example:

    from openpyxl import load_workbook

    # read_only=True replaces the older use_iterators=True flag.
    wb = load_workbook(path, read_only=True)
    sheet = wb.worksheets[0]
    row_count = sheet.max_row
    column_count = sheet.max_column
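In read-only mode the sizes come from the dimension record stored in the file, and some files lack one, in which case max_row and max_column can be None. A hedged sketch of a fallback follows; the function name read_only_size is hypothetical, and the in-memory workbook merely stands in for a real file on disk.

```python
import io

from openpyxl import Workbook, load_workbook

def read_only_size(source):
    """Return (max_row, max_column) of the first sheet, opened read-only."""
    wb = load_workbook(source, read_only=True)
    try:
        ws = wb.worksheets[0]
        if ws.max_row is None or ws.max_column is None:
            # No usable dimension record: scan the sheet once to compute it.
            ws.calculate_dimension(force=True)
        return ws.max_row, ws.max_column
    finally:
        wb.close()  # read-only workbooks keep the file open until closed

# Usage with an in-memory workbook standing in for a real file.
demo = Workbook()
demo.active["A1"] = "id"
demo.active["B3"] = 7
buffer = io.BytesIO()
demo.save(buffer)
buffer.seek(0)
print(read_only_size(buffer))
```

The forced scan reads every row once, so it is slower than trusting the stored record, but it still streams the sheet instead of holding it all in memory.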

However, this also counts cells whose value is None. I tried looping through the columns instead. I know it is not the best way, but it works for me.

The solution suggested in this answer has been deprecated, and might no longer work.

Taking a look at the source code of OpenPyXL (IterableWorksheet) I’ve figured out how to get the column and row count from an iterator worksheet:

    wb = load_workbook(path, use_iterators=True)
    sheet = wb.worksheets[0]
    row_count = sheet.get_highest_row() - 1
    column_count = letter_to_index(sheet.get_highest_column()) + 1

IterableWorksheet.get_highest_column returns a string with the column letter that you can see in Excel, e.g. "A", "B", "C" etc. Therefore I've also written a function to translate the column letter to a zero-based index:

    def letter_to_index(letter):
        """Converts a column letter, e.g. "A", "B", "AA", "BC" etc. to a zero
        based column index.

        A becomes 0, B becomes 1, Z becomes 25, AA becomes 26 etc.

        Args:
            letter (str): The column index letter.
        Returns:
            The column index as an integer.
        """
        letter = letter.upper()
        result = 0

        for index, char in enumerate(reversed(letter)):
            # Get the ASCII number of the letter and subtract 64 so that A
            # corresponds to 1.
            num = ord(char) - 64

            # Multiply the number with 26 to the power of `index` to get the
            # correct value of the letter based on its index in the string.
            final_num = (26 ** index) * num

            result += final_num

        # Subtract 1 from the result to make it zero-based before returning.
        return result - 1
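As an aside, openpyxl ships helpers for this conversion in openpyxl.utils, so a hand-written function may not be needed. The sketch below wraps column_index_from_string, which is 1-based, to get the same zero-based result; the wrapper name letter_to_zero_based is hypothetical.

```python
from openpyxl.utils import column_index_from_string, get_column_letter

def letter_to_zero_based(letter):
    """Zero-based column index for a column letter such as "A" or "AA"."""
    # column_index_from_string is 1-based ("A" -> 1), so shift by one.
    return column_index_from_string(letter) - 1

print(letter_to_zero_based("A"), letter_to_zero_based("AA"))
print(get_column_letter(27))  # the reverse direction, also 1-based
```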

I still haven’t figured out how to get the column sizes though, so I’ve decided to use a fixed-width font and automatically scaled columns in my application.

Count total number of rows and columns in a sheet using Openpyxl

In this tutorial, we will learn how to get or count the number of rows and columns of a sheet using openpyxl in Python.

Count the total number of rows and columns of the Excel sheet in Python

Openpyxl is a Python module for reading and writing Excel files without needing Microsoft Excel itself. It is a popular choice for automating Excel reports and other spreadsheet tasks. With Openpyxl you can carry out a variety of operations, including:

Analyzing data
Recording data
Editing Excel documents
Creating charts and graphs
Working with multiple sheets
Styling sheets

You can install the Openpyxl module from your terminal with the following command:

    pip install openpyxl

Steps to Count the total number of rows and columns in a sheet using Openpyxl

Step 1: Import Openpyxl's load_workbook function.

from openpyxl import load_workbook

Step 2: Pass load_workbook the path of the Excel file you want to open.

file = load_workbook('file.xlsx')

Step 3: Select the active sheet of the workbook using the file.active attribute.

    sheet = file.active

Step 4: Use the sheet.max_row and sheet.max_column properties in Openpyxl to obtain the maximum number of rows and columns from the Excel sheet in Python.

So our final code will be:

    from openpyxl import load_workbook

    file = load_workbook('file.xlsx')
    sheet = file.active
    print(f"Total number of row in the present sheet is {sheet.max_row}")
    print(f"Total number of column in the present sheet is {sheet.max_column}")

The output will be:

    Total number of row in the present sheet is 10
    Total number of column in the present sheet is 23
