Automate Excel with Python

How to Automate an Excel Sheet in Python?

Before you read this article and learn automation in Python, let’s watch a video of Christian Genco (a talented programmer and entrepreneur) explaining the importance of coding through the example of automation.

You might have laughed out loud while watching this video, and you surely understood the importance of automation in real life as well. Let’s come to the topic now…

Python is used all over the world, and it is beginner-friendly and easy to learn in comparison to many other languages. One of the best things you can do with Python is automation.

Consider a scenario in which you’re asked to create an account on a website for 30,000 employees. How would you feel? Surely you would be frustrated doing this task manually and repeatedly. It would also take far too much time, which is not a smart use of it.

Now just imagine the life of employees who work in data entry jobs. Their job is to take data from tables such as Excel or Google Sheets and insert it somewhere else. They browse different websites and magazines, collect the data from there, and then insert it into a database. They also need to do the calculations for the entries.

Generally, the income in this job is based on performance. More entries, higher salary (and of course everyone wants a higher salary).

But don’t you think that it’s boring to do the same kind of stuff repeatedly?

Now the question is… “How can I do it fast?”, “How can I automate my work?”

Instead of doing these kinds of tasks manually, just spend an hour coding and automate them to make your life easier. You can automate a tedious task by writing just a few lines of Python.

In this blog, we will create a small project to learn automation in Python. If you’re a beginner, you may prefer to watch videos to learn automation in Python, and reading a blog might feel like a boring task, but here we will go through everything step by step and explain it in detail to make things easier for you.

It will be great if you already know the core concepts of Python. We will take an example of an Excel sheet with some entries and learn the automation process. We are going to write a Python program that can process thousands of spreadsheet rows in a matter of seconds. Excited? Let’s get started…


Intro to the Task

Processing or updating thousands of spreadsheet records manually takes too much time. It may take hours, days, or even months. We will write a Python program to automate this task. We will work on the spreadsheet shown in the picture below.

In this spreadsheet, we have the records for all kinds of transactions, but let’s say that due to an error (human error or system error), the price for the product listed in the third column is wrong. Let’s say we need to decrease the price by 10% (multiply the price by 0.9 and recalculate the value). You can do this task manually by using a formula in the fourth column, but it will take too much time (maybe one or two weeks) if there are thousands of records.

We will write a Python program to automate this process. We will also add a chart to the sheet. Our Python program will do this task for us in a matter of seconds.

Let’s Get Into The Coding Zone

To work on this Excel sheet we are going to use the openpyxl library. Create a project folder, give it a name, and install the openpyxl package by running the following command in your terminal:

    pip install openpyxl

Now we can import this package to work on our spreadsheet. Before that, add the spreadsheet to your project folder. Then create a file named app.py in that folder to hold the code.
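The original listing is not reproduced here, so the following is only a minimal sketch of what app.py could look like. It assumes that the data lives on a sheet called Sheet1, that row 1 holds the headers, and that the prices in the third column are stored as numbers. The helper names correct_prices, add_price_chart, and process_workbook are illustrative and not taken from the article.

```python
import openpyxl
from openpyxl.chart import BarChart, Reference


def correct_prices(sheet, factor=0.9, price_col=3, target_col=4):
    """Write price * factor into target_col for every data row (row 1 is the header)."""
    for row in range(2, sheet.max_row + 1):
        price = sheet.cell(row=row, column=price_col).value
        if isinstance(price, (int, float)):  # skip blank cells and text
            sheet.cell(row=row, column=target_col).value = price * factor


def add_price_chart(sheet, value_col=4, anchor="F2"):
    """Add a bar chart of the corrected prices next to the data."""
    values = Reference(sheet, min_col=value_col, max_col=value_col,
                       min_row=2, max_row=sheet.max_row)
    chart = BarChart()
    chart.add_data(values)
    sheet.add_chart(chart, anchor)


def process_workbook(filename, output_filename):
    """Load a workbook, correct the prices, add the chart, and save a copy."""
    workbook = openpyxl.load_workbook(filename)
    sheet = workbook["Sheet1"]  # assumed sheet name
    correct_prices(sheet)
    add_price_chart(sheet)
    workbook.save(output_filename)
```

With a workbook saved as, say, transactions.xlsx (a hypothetical name), calling process_workbook('transactions.xlsx', 'transactions_corrected.xlsx') would write the corrected copy to a new file and leave the original untouched.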


Automating Excel Sheet in Python

Python is full of rich libraries, useful packages, and ready-to-use functions for automation that greatly facilitate testing. Python allows you to choose what is best for your project.

In this article, we are going to use Python libraries to automate work on Excel data. It will help us gain a better understanding of both Python and Excel.

Prerequisites

To follow along with this tutorial, you will need:

  • Basic knowledge of the Python programming language.
  • Knowledge of working with Python libraries.
  • Knowledge of MS Excel.

Overview

A brief understanding of Python and MS Excel will help us navigate through the automation process. We will look at what they are and why we chose them for our project.

Python overview

Python is an interpreted, object-oriented programming language with many features. It uses an interpreter to run code, including short snippets typed one at a time. Since it works at the command prompt, it can be used to make system calls on almost all operating systems.


In case you have not installed Python, here is a guide on how to install Python and test it on your operating system.

Why we prefer Python over other programming languages:

  • It is a free and open-source programming language — you can download Python and install it from this website freely.
  • Python is an easy-to-use language — beginners with little knowledge in programming can easily learn Python syntax and understand it.
  • It also supports object-oriented programming, which enables the user to write readable and reusable code.

MS Excel overview

Microsoft Excel is a spreadsheet tool developed by Microsoft. It was released for Windows in 1987 and is now available for Windows, macOS, Android, and iOS. It has grown in popularity over the years because of how simple it makes ‘CRUD’ operations on data.

It also supports many formulae while retrieving data from MS Excel files.

Analyzing the Excel dataset

In this section, we will analyze the dataset that I have prepared in Excel. You can download the data from [here](https://github.com/taves-hub/Automating-excel-sheet-in-python/blob/main/Python/exam-results-list-excel-table.xlsx).

The dataset is in the .csv format, which we need to change to the .xlsx format to be able to automate it. We will use the data above to generate the data below:
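If your copy of the dataset really is a .csv file, one way to convert it is with pandas. This is a minimal sketch, not the article's own code: the function name csv_to_xlsx and the default sheet name are assumptions, and writing .xlsx files through pandas requires openpyxl to be installed.

```python
import pandas as pd


def csv_to_xlsx(csv_path, xlsx_path, sheet_name="Sheet1"):
    """Read a CSV file and write the same rows to an .xlsx workbook."""
    frame = pd.read_csv(csv_path)
    # index=False keeps the pandas row index out of the spreadsheet.
    frame.to_excel(xlsx_path, sheet_name=sheet_name, index=False)
    return frame.shape  # (rows, columns) written, handy for a quick check
```

The returned shape can be compared against the row count you expect before moving on to the pivot table.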

data

Creating pivot tables using Pandas

We will need to import the following Python libraries into our workspace to make the pivot table:

    import pandas as pd
    import openpyxl
    from openpyxl import load_workbook
    from openpyxl.styles import Font
    from openpyxl.chart import BarChart, Reference
    import string

The pandas library is used to read and analyze the Excel data and to create the pivot table. The openpyxl module is used to read and write Excel workbooks, including writing formulas and creating charts.

In case you have not installed the openpyxl library, run the command below in your command prompt to add it to your workspace:

    pip install openpyxl

Our Excel sheet should be in the same location as our Python script. The load_workbook function, imported from the openpyxl library, will be used to open the workbook so that we can design our report.

You can learn more about the Python library here

Viewing Excel data

To read our Excel file, we will use the pd.read_excel() function as demonstrated in the code snippet below:

    # excel file
    excel_file = pd.read_excel('student_results.xlsx')

    # column names in the file (they must match the sheet's header row exactly)
    excel_file[['Student Name', 'course', 'Date', 'points']]

The following results will be displayed after running the code above:

Output

Pivot table

To create our pivot table, we are going to use the .pivot_table() function to showcase the student_results. To pivot the total points attained by students of different courses, we will run the code snippet below:

    report_table = excel_file.pivot_table(index='Student Number', columns='Course', values='points', aggfunc='sum').round(0)

The report will be as shown below:

Pivot table

In the pivot_table call above, aggfunc='sum' calculates the sum of points for each student, with one column per course. The .round(0) call rounds the results to zero decimal places.
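To see what the sum and the rounding do, here is a small in-memory sketch. The five rows of data are made up for illustration and are not taken from the article's dataset. Only the column names follow the pivot call above.

```python
import pandas as pd

# Hypothetical sample rows: student 2 has two Math entries that get summed.
sample = pd.DataFrame({
    "Student Number": [1, 1, 2, 2, 2],
    "Course": ["Math", "Physics", "Math", "Math", "Physics"],
    "points": [40.4, 55.0, 30.2, 20.1, 61.7],
})

report_table = sample.pivot_table(
    index="Student Number",  # one row per student
    columns="Course",        # one column per course
    values="points",
    aggfunc="sum",           # add up all entries that share a row and column
).round(0)                   # round each total to zero decimal places

print(report_table)
```

Student 2's two Math entries are added together before rounding, so that cell holds a single total rather than two separate values.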

The .to_excel() function will be used to export our pivot table. Inside the brackets, we specify the name of the output Excel file (report_2021.xlsx), followed by the name of the sheet we are going to generate (Report) and the row at which the pivot table should start (startrow=5).

The code below will send the report table to an Excel file:

report_table.to_excel('report_2021.xlsx' , sheet_name='Report' , startrow=5) 

Generating the reports using Openpyxl library

In this section, we will use the load_workbook function to access the workbook and the workbook's .save() function to save it. The workbook has to be loaded before we modify it and saved afterwards. In the final script, each of these steps is done only once.

We are required to identify the maximum and minimum active columns and rows to ensure that even if we add more data to the Excel sheet, the code will keep working.

Reading these boundaries from the exported workbook gives the following maximum and minimum rows and columns:

    Minimum Columns: 4
    Maximum Columns: 15
    Minimum Rows: 1
    Maximum Rows: 5

This is verified by the report file that we exported, shown below:

Maximum and minimum rows and columns
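The used range of a worksheet can be read from openpyxl's min_column, max_column, min_row, and max_row properties. The sketch below builds a small hypothetical sheet in memory instead of loading the article's report, so its numbers differ from the output quoted above. The helper name sheet_bounds is an assumption.

```python
from openpyxl import Workbook


def sheet_bounds(sheet):
    """Return the used range of a worksheet as 1-based row and column indexes."""
    return {
        "min_column": sheet.min_column,
        "max_column": sheet.max_column,
        "min_row": sheet.min_row,
        "max_row": sheet.max_row,
    }


# Hypothetical data: a 3-row by 3-column block whose top-left cell is B3.
rows = [["Student Number", "Math", "Physics"], [1, 40, 55], [2, 50, 62]]
workbook = Workbook()
sheet = workbook.active
for row_offset, row in enumerate(rows):
    for col_offset, value in enumerate(row):
        sheet.cell(row=3 + row_offset, column=2 + col_offset, value=value)

bounds = sheet_bounds(sheet)
print(bounds)
```

Because the boundaries are read from the sheet rather than typed in, the same code keeps working when more rows or courses are added.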

Automating the report using Python

After designing our report, the next and most important part is to automate it. In this part, we will put all the code into one function to make it simple to automate our report.

Putting the code under the same function ensures that next time we are required to automate the report, we will only have to specify the file name and run the code. Here is our complete code.
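The complete listing is not reproduced here, so what follows is a hedged sketch of how the steps described so far could fit into one function. It reuses the column names from the pivot call ('Student Number', 'Course', 'points') and the Report sheet with startrow=5. The output file naming, the chart title, the chart position, and the heading text in A1 are assumptions made for illustration.

```python
from pathlib import Path

import pandas as pd
from openpyxl import load_workbook
from openpyxl.chart import BarChart, Reference
from openpyxl.styles import Font
from openpyxl.utils import get_column_letter


def automate_excel(file_name):
    """Build a pivot-table report with a bar chart from one results workbook."""
    source = Path(file_name)
    # Assumed naming scheme: report_<input name>.xlsx next to the input file.
    report_path = source.with_name(f"report_{source.stem}.xlsx")

    # 1. Pivot: total points per student (rows) and course (columns).
    excel_file = pd.read_excel(source)
    report_table = excel_file.pivot_table(
        index="Student Number", columns="Course", values="points", aggfunc="sum"
    ).round(0)
    report_table.to_excel(report_path, sheet_name="Report", startrow=5)

    # 2. Reopen the exported file with openpyxl and find the used range.
    workbook = load_workbook(report_path)
    sheet = workbook["Report"]
    min_col, max_col = sheet.min_column, sheet.max_column
    min_row, max_row = sheet.min_row, sheet.max_row

    # 3. Bar chart: the header row supplies series titles, column 1 the categories.
    data = Reference(sheet, min_col=min_col + 1, max_col=max_col,
                     min_row=min_row, max_row=max_row)
    categories = Reference(sheet, min_col=min_col, max_col=min_col,
                           min_row=min_row + 1, max_row=max_row)
    chart = BarChart()
    chart.add_data(data, titles_from_data=True)
    chart.set_categories(categories)
    chart.title = "Points by course"  # assumed title
    sheet.add_chart(chart, f"{get_column_letter(max_col + 2)}{min_row}")

    # 4. Heading above the table, then save once.
    sheet["A1"] = "Student results report"  # assumed heading text
    sheet["A1"].font = Font(bold=True, size=16)
    workbook.save(report_path)
    return str(report_path)
```

Reading the boundaries from the sheet instead of hard-coding them is what lets the same function handle input files with more students or more courses.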

You can call automate_excel('student_results.xlsx') if you are only dealing with a single file. For multiple files, the approach remains the same.

We need to apply the function to each file in turn to get the required reports. For example, if we need to generate two reports, we use the code snippet below:

    automate_excel('student_results1.xlsx')
    automate_excel('student_results2.xlsx')
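When the number of files grows, the repeated calls can be replaced by a loop over a folder. This is a small sketch using only the standard library. The helper name automate_all and the default file pattern are assumptions, and build_report stands for whatever report function you pass in, such as automate_excel.

```python
from pathlib import Path


def automate_all(folder, build_report, pattern="student_results*.xlsx"):
    """Call build_report once per matching workbook and collect the results."""
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        results[path.name] = build_report(str(path))
    return results
```

For example, automate_all('.', automate_excel) would process every matching workbook in the current folder and return a mapping from input file name to whatever the report function returned.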

Scheduling Python scripts

At this point, we’ve finished all the code. Next, we need to use a task scheduler to run the Python script at set times based on the data needs. The report may need to be sent daily, weekly, or even monthly.

With that, we have automated our Excel file in a simple and understandable way.
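A scheduler needs a command it can run, so the script benefits from a small command-line entry point. The sketch below is one possible shape and not part of the original article. The names parse_args and main are illustrative, and build_report is a placeholder for the report function (print is used as a harmless default).

```python
import argparse


def parse_args(argv=None):
    """Read the input workbook names from the command line."""
    parser = argparse.ArgumentParser(
        description="Build Excel reports from result workbooks."
    )
    parser.add_argument("files", nargs="+", help="input .xlsx files")
    return parser.parse_args(argv)


def main(argv=None, build_report=print):
    """Run build_report for every file; return 0 so the scheduler sees success."""
    args = parse_args(argv)
    for file_name in args.files:
        build_report(file_name)
    return 0
```

If this lived in a file such as report.py (a hypothetical name) that ends by calling main(), the scheduled command would be along the lines of python report.py student_results.xlsx, with the daily, weekly, or monthly trigger configured in the scheduler itself.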

Conclusion

This article gives the reader a simple and direct guide on how to automate Excel files in Python with clear and simple steps. You may share your experience of setting up Excel automation in Python in the comment section below.

Peer Review Contributions by: Dawe Daniel
