Read matlab files in python

How to load Matlab .mat files in Python

Matlab is a very popular platform for scientific computing in academia. I used it throughout my engineering degree, and chances are you will come across .mat files in datasets released by universities.

This brief post explains how to load these files using Python, the most popular language for machine learning today.

The data

I wanted to build a classifier for detecting cars of different makes and models, so the Stanford Cars Dataset appeared to be a great starting point. Coming from academia, the annotations for the dataset were in the .mat format. You can get the file used in this post here.

Loading .mat files

SciPy is a very popular Python library for scientific computing and, quite naturally, it has a function that lets you read in .mat files. Reading them in is definitely the easy part. You can get it done with one import and one line of code:

from scipy.io import loadmat
annots = loadmat('cars_train_annos.mat')

Well, it’s really that simple. But let’s go on and actually try to get the data we need out of this dictionary.

Formatting the data

The loadmat function returns a more familiar data structure, a Python dictionary. If we peek into the keys, we’ll see how at home we feel now compared to dealing with a .mat file:

annots.keys()
> dict_keys(['__header__', '__version__', '__globals__', 'annotations'])
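The keys wrapped in double underscores are file metadata that loadmat adds; the remaining keys are the actual MATLAB variables. Here is a minimal sketch of separating the two. It writes a small stand-in file with savemat so that it runs without the dataset; the helper name user_variables, the file name demo.mat and the array contents are placeholders, not part of the Stanford data.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat, whosmat


def user_variables(mat_dict):
    """Return only the MATLAB variables, dropping the '__...__' metadata keys."""
    return {k: v for k, v in mat_dict.items() if not k.startswith("__")}


# Build a small stand-in file so the sketch runs without the dataset.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.mat")
savemat(demo_path, {"annotations": np.arange(6).reshape(2, 3)})

variables = user_variables(loadmat(demo_path))
print(sorted(variables))

# whosmat lists variable names, shapes and types without loading the data.
listing = whosmat(demo_path)
print(listing)
```

Checking the listing first can be handy for large files, since it avoids reading every array into memory just to see what is there.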

Looking at the documentation for this dataset, we’ll get to learn what this is really made of. The README.txt gives us the following information:

This file gives documentation for the cars 196 dataset.
(http://ai.stanford.edu/~jkrause/cars/car_dataset.html)
--------------------
Metadata/Annotations
--------------------
Descriptions of the files are as follows:
-cars_meta.mat:
Contains a cell array of class names, one for each class.
-cars_train_annos.mat:
Contains the variable ‘annotations’, which is a struct array of length
num_images and where each element has the fields:
bbox_x1: Min x-value of the…
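Since ‘annotations’ is a struct array, loadmat hands it back as a NumPy structured array whose field names mirror the struct’s fields. Below is a hedged sketch of flattening such an array into plain Python dictionaries. It runs on a tiny stand-in file written with savemat: bbox_x1 comes from the README, while the label field, the values and the helper name struct_array_to_records are made up for illustration.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat


def struct_array_to_records(struct_array):
    """Turn a loaded MATLAB struct array into a list of plain dicts."""
    field_names = struct_array.dtype.names
    records = []
    for item in struct_array.ravel():
        # Each field arrives wrapped in a small array; squeeze unwraps it.
        records.append(
            {name: np.squeeze(item[name]).tolist() for name in field_names}
        )
    return records


# Stand-in data: bbox_x1 is named in the README, 'label' is hypothetical.
stand_in = np.array(
    [(39.0, 1.0), (36.0, 2.0)],
    dtype=[("bbox_x1", "f8"), ("label", "f8")],
)
path = os.path.join(tempfile.mkdtemp(), "stand_in_annos.mat")
savemat(path, {"annotations": stand_in})

records = struct_array_to_records(loadmat(path)["annotations"])
print(records)
```

Because the helper reads the field names from the array’s dtype, it should work unchanged on whatever fields the real annotation file contains.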


How to read .mat files in Python?

How To Work With Mat Files In Python

A large number of datasets for data science and research use .mat files. In this article, we’ll learn to work with .mat files in Python and explore them in detail.

Why do we use .mat files in Python?

The purpose of a .mat file may not seem obvious right off the bat. But when working with large datasets, the information contained within these files is absolutely crucial for data science/machine learning projects!


This is because the .mat files often contain the metadata of every object/record in the dataset.

While the files are not exactly designed for the sole purpose of creating annotations, a lot of researchers use MATLAB for their research and data collection, causing a lot of the annotations that we use in Machine Learning to be present in the form of .mat files.

So, it’s important for a data scientist to understand how to use .mat files in projects. Knowing the format also lets you work with training and testing datasets that are not distributed as regular CSV files.

How to read .mat files in Python?

Python’s standard library cannot read .mat files, so we need to import a library that knows how to handle the file format.

1. Install scipy

Similar to how we use the csv module to work with .csv files, we’ll import the scipy library to work with .mat files in Python.

If you don’t already have scipy, you can install it with the pip command:

pip install scipy

Now that we have scipy set up and ready to use, the next step is to open up your python script to finally get the data required from the file.

2. Import the loadmat function from scipy.io

In this example, I will be using the accordion annotations provided by Caltech, in 101 Object Categories.

from scipy.io import loadmat
annots = loadmat('annotation_0001.mat')
print(annots)

Upon execution, this prints the entire dictionary that loadmat returned.

Starting off, you can see that this single .mat file provides information regarding the version of MATLAB used, the platform, the date of its creation, and a lot more.

The parts that we should be focusing on, however, are box_coord and obj_contour.

3. Parse the .mat file structure

If you’ve gone through the information regarding the Annotations provided by Caltech, you’d know that these numbers are the outlines of the corresponding image in the dataset.

In a little more detail, this means that the object present in image 0001, consists of these outlines. A little further down in the article, we’ll be sorting through the numbers, so, don’t worry about it for now.

Parsing through this file structure, we could assign all the contour values to a new Python list.

con_list = [[element for element in upperElement] for upperElement in annots['obj_contour']]

If we printed out con_list, we would see a simple nested list with two rows.

[[37.16574585635357, 61.94475138121544, 89.47697974217309, 126.92081031307546, 169.32044198895025, 226.03683241252295, 259.0755064456721, 258.52486187845295, 203.4604051565377, 177.58011049723754, 147.84530386740326, 117.0092081031307, 1.3738489871086301, 1.3738489871086301, 7.98158379373848, 0.8232044198894926, 16.24125230202577, 31.65930018416205, 38.81767955801104, 38.81767955801104], [58.59300184162066, 44.27624309392269, 23.90239410681403, 0.7753222836096256, 2.9779005524862328, 61.34622467771641, 126.87292817679563, 214.97605893186008, 267.83793738489874, 270.59116022099454, 298.6740331491713, 298.6740331491713, 187.9944751381216, 94.93554327808477, 90.53038674033152, 77.31491712707185, 62.44751381215474, 62.998158379373876, 56.94106813996319, 56.94106813996319]]
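Since annots['obj_contour'] is already a NumPy array, the nested comprehension can probably be shortened with the array’s own tolist method. The sketch below compares the two on a small stand-in array; the three rounded values per row are placeholders rather than the full contour.

```python
import numpy as np

# Stand-in for annots['obj_contour']: row 0 holds x values, row 1 holds y values.
obj_contour = np.array([[37.17, 61.94, 89.48],
                        [58.59, 44.28, 23.90]])

# The comprehension used in the article.
con_list = [[element for element in upper_element] for upper_element in obj_contour]

# ndarray.tolist() builds the same nested list and yields plain Python floats.
con_list_short = obj_contour.tolist()

print(con_list == con_list_short)
```

One practical difference is that tolist returns built-in float objects, while the comprehension keeps NumPy scalar types; the values compare equal either way.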

4. Use Pandas dataframes to work with the data

Now that you have the information and the data retrieved, how would you work with it? Continue to use lists? Definitely not.


We use DataFrames as the structure to work with because a DataFrame functions much like a table of data: neat to look at and extremely simple to use.

Now, to work with DataFrames, we’ll need to import yet another library, pandas.

Pandas is an open source data analysis tool, that is used by machine learning enthusiasts and data scientists throughout the world. The operations provided by it are considered vital and fundamental in a lot of data science applications.

We’ll only be working with DataFrames in this article, but, keep in mind that the opportunities provided by Pandas are immense.

Working with the data we’ve received above can be simplified by using pandas to construct a data frame with rows and columns for the data.

import pandas as pd

# zip provides us with both the x and y in a tuple.
newData = list(zip(con_list[0], con_list[1]))
columns = ['obj_contour_x', 'obj_contour_y']
df = pd.DataFrame(newData, columns=columns)

Now, we have our data in a neat DataFrame!

    obj_contour_x  obj_contour_y
0       37.165746      58.593002
1       61.944751      44.276243
2       89.476980      23.902394
3      126.920810       0.775322
4      169.320442       2.977901
5      226.036832      61.346225
6      259.075506     126.872928
7      258.524862     214.976059
8      203.460405     267.837937
9      177.580110     270.591160
10     147.845304     298.674033
11     117.009208     298.674033
12       1.373849     187.994475
13       1.373849      94.935543
14       7.981584      90.530387
15       0.823204      77.314917
16      16.241252      62.447514
17      31.659300      62.998158
18      38.817680      56.941068
19      38.817680      56.941068

As you can see, we have the X and Y coordinates for the image’s outline in a simple DataFrame of two columns.
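The same two-column DataFrame can likely be built without the intermediate list by transposing the array. This is a sketch on stand-in values (rounded, and only three points), assuming as above that row 0 holds the x coordinates and row 1 the y coordinates.

```python
import numpy as np
import pandas as pd

# Stand-in for annots['obj_contour']: row 0 is x, row 1 is y (values rounded).
obj_contour = np.array([[37.17, 61.94, 89.48],
                        [58.59, 44.28, 23.90]])

# Transposing turns the 2 x N array into N rows of (x, y).
df = pd.DataFrame(obj_contour.T, columns=["obj_contour_x", "obj_contour_y"])
print(df)
```

Transposing keeps the data in NumPy until the last step, which tends to matter more as the number of contour points grows.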

This should provide you with some clarity about the nature of the data in the file.

The process of creating DataFrames differs from one .mat file to another, but with experience and practice it should come naturally to you.

That’s all for this article!

Conclusion

You now know how to work with .mat files in Python, and how to create DataFrames in pandas from their contents.

The next steps to work with this data would be to create your own models, or employ existing ones for training or testing on your copy of the dataset.


Reading mat files

Here are examples of how to read two variables lat and lon from a mat file called "test.mat".

Matlab up to 7.1

mat files created with Matlab up to version 7.1 can be read using the mio module, part of scipy.io. Reading structures (and arrays of structures) is supported, and elements are accessed with the same syntax as in Matlab: after reading a structure called e.g. struct, its lat element can be obtained with struct.lat, or struct.__getattribute__('lat') if the element name comes from a string. In current SciPy releases this attribute-style access applies when loadmat is called with struct_as_record=False; by default, structures come back as NumPy record arrays.

#!/usr/bin/env python
from scipy.io import loadmat

x = loadmat('test.mat')
lon = x['lon']
lat = x['lat']

# one-liner to read a single variable
lon = loadmat('test.mat')['lon']
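The attribute-style access to structures described above can be sketched as follows. The example writes its own stand-in file with savemat, so the file name struct_demo.mat, the variable name position and the coordinate values are all assumptions made for illustration.

```python
import os
import tempfile

from scipy.io import loadmat, savemat

# Write a stand-in file holding one struct with two fields.
path = os.path.join(tempfile.mkdtemp(), "struct_demo.mat")
savemat(path, {"position": {"lat": 48.85, "lon": 2.35}})

# struct_as_record=False returns objects with attribute access;
# squeeze_me=True drops the 1 x 1 wrappers MATLAB puts around scalars.
data = loadmat(path, struct_as_record=False, squeeze_me=True)
position = data["position"]

lat = position.lat
lon = getattr(position, "lon")  # when the field name comes from a string
print(lat, lon)
```

Without these two keyword arguments the same struct would be returned as a nested record array, which needs indexing such as data['position']['lat'][0, 0] to reach the value.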

Matlab 7.3 and greater

Beginning with release 7.3 of Matlab, mat files can be saved in an HDF5-based format, selected with the -v7.3 flag at save time or as the default format in Matlab’s preferences (see the documentation of save in Matlab). These files can be read in Python using, for instance, the PyTables or h5py package. Reading Matlab structures in mat files does not seem supported at this point.

#!/usr/bin/env python
import tables

# open_file and get_node are the PyTables 3.x names for openFile and getNode
file = tables.open_file('test.mat')
lon = file.root.lon[:]
lat = file.root.lat[:]

# Alternate syntax if the variable name is in a string
varname = 'lon'
lon = file.get_node('/' + varname)[:]

Section author: Unknown[16], DavidPowell, srvanrell

© Copyright 2015, Various authors Revision 5e2833af .


Read Matlab mat Files in Python


  1. Use the scipy.io Module to Read .mat Files in Python
  2. Use the NumPy Module to Read mat Files in Python
  3. Use the mat4py Module to Read mat Files in Python
  4. Use the matlab.engine Module to Read mat Files in Python

MATLAB is a programming platform that is widely used these days for numerical computation, statistical analysis, and generating algorithms. It is a very flexible language and allows us to integrate our work with different programming languages like Python.

The MATLAB workspace saves all its variables and contents in a mat file. In this tutorial, we will learn how to open and read mat files in Python.

Use the scipy.io Module to Read .mat Files in Python

The scipy.io module has the loadmat() function, which can open and read mat files. The following code shows how to use this function.

import scipy.io

mat = scipy.io.loadmat('file.mat')

Note that this method does not work for mat files saved in the MATLAB 7.3 format or later. To avoid this, we can save the mat file in an older format by passing the -v7 flag to the save command in MATLAB.

Use the NumPy Module to Read mat Files in Python

As discussed earlier, we cannot open files saved in the MATLAB 7.3 format using the scipy.io module in Python. Files in version 7.3 and above are HDF5 datasets, which means we can open them with the h5py module and convert the contents to NumPy arrays. The h5py module needs to be installed for this method to work, and it requires HDF5 on your system.

The code below shows how to read mat files using this method.

import numpy as np
import h5py

f = h5py.File('somefile.mat', 'r')
data = f.get('data/variable1')
data = np.array(data)  # For converting to a NumPy array
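When the format of a file is not known in advance, the two readers can be combined: try scipy.io first and fall back to h5py when scipy reports a 7.3 file. The following is a rough sketch under that assumption. The function name load_mat_any_version and the file and variable names are hypothetical, the h5py branch reads only top-level datasets, and the demo at the end exercises only the scipy branch.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat


def load_mat_any_version(path):
    """Load MATLAB variables with scipy, falling back to h5py for v7.3 files."""
    try:
        loaded = loadmat(path)
        return {k: v for k, v in loaded.items() if not k.startswith("__")}
    except NotImplementedError:
        # scipy signals an HDF5-based (v7.3) file this way.
        import h5py  # imported lazily so older files do not need it

        with h5py.File(path, "r") as f:
            # Only top-level datasets are read; groups (structs) are skipped.
            # MATLAB is column-major, so axes appear reversed; .T undoes that.
            return {
                name: item[()].T
                for name, item in f.items()
                if isinstance(item, h5py.Dataset)
            }


# Demo on an older-format file written by scipy itself.
demo_path = os.path.join(tempfile.mkdtemp(), "old_format.mat")
savemat(demo_path, {"variable1": np.array([[1.0, 2.0], [3.0, 4.0]])})

variables = load_mat_any_version(demo_path)
print(variables["variable1"])
```

Importing h5py inside the except branch keeps it an optional dependency, so the function still works on machines that only ever see older mat files.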

Use the mat4py Module to Read mat Files in Python

This module has functions that allow us to write and read data to and from MATLAB files.

The loadmat() function reads MATLAB files and stores them in basic Python structures like a list or a dictionary and is similar to the loadmat() from scipy.io .

from mat4py import loadmat

data = loadmat('example.mat')

Use the matlab.engine Module to Read mat Files in Python

Users who already have MATLAB can use matlab.engine, which is provided by MathWorks itself. It has a lot of functionality, which extends to more than just reading and writing “.mat” files.

The following code shows how to read MATLAB files using this method.

import matlab.engine

eng = matlab.engine.start_matlab()
content = eng.load("example.mat", nargout=1)

Manav is an IT professional who has a lot of experience as a core developer in many live projects. He is an avid learner who enjoys learning new things and sharing his findings whenever possible.
