Python series filter by value

Pandas Series filter() Function

The pandas Series.filter() function returns the subset of a Series whose index labels match the labels or pattern you specify. Note that filter() works on the index labels only, not on the values. To filter by value, use boolean indexing, loc[], where(), or isin() instead.

In this article, I will explain the filter() syntax and parameters, how to use it to select entries of a pandas Series by label, and how to filter by value using where(), isin(), loc[], and lambda functions.

1. Quick Examples of Series filter() Function

If you are in a hurry, below are some quick examples of filtering a pandas Series.

    # Below are some quick examples

    # Example 1: use Series.filter() with a regex on the index labels
    ser2 = ser.filter(regex = '. .')

    # Example 2: filter() index by labels
    ser2 = ser.filter(items = ['Spark', 'Python'])

    # Example 3: use loc[] and lambda to filter a pandas series
    ser2 = ser.loc[lambda x : x == 23000]

    # Example 4: use loc[] property & OR condition
    ser2 = ser.loc[lambda x : (x < 23000) | (x > 28000)]

    # Example 5: use where() function to filter series
    ser2 = ser.where(ser < 25000).dropna()

    # Example 6: use isin() function to filter series
    ser2 = ser[ser.isin([23000,28000])]

2. Syntax of Series.filter() Function

Following is the syntax of the Series.filter() function.

    # Syntax of Series.filter() function
    Series.filter(items=None, like=None, regex=None, axis=None)

2.1 Parameter of filter()

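  • items – A list-like of index labels to keep.
  • like – A string; keeps the labels that contain it as a substring.
  • regex – A regular expression; keeps the labels for which re.search(regex, label) finds a match.
  • axis – {0 or 'index', 1 or 'columns', None}, default None. The axis to filter on. For a Series this parameter is unused; for a DataFrame it defaults to the columns.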
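
As a rough illustration of the three label parameters, the sketch below applies each one to a small Series. The variable names and the data are made up for this example and are not part of the article's running example.

```python
import pandas as pd

# Hypothetical data: fees keyed by course name
fees = pd.Series([20000, 25000, 23000], index=["Java", "Spark", "PySpark"])

by_items = fees.filter(items=["Java", "Spark"])  # exact labels
by_like = fees.filter(like="Spark")              # labels containing "Spark"
by_regex = fees.filter(regex="^P")               # labels starting with "P"

print(by_items)
print(by_like)
print(by_regex)
```

Only one of items, like, and regex can be passed in a single call.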

2.2 Return Value of filter()

It returns an object of the same type as the input, here a Series that contains only the entries with matching labels.

3. Create Pandas Series

A pandas Series is a one-dimensional, index-labeled data structure. It can store any data type, such as strings, integers, floats, and other Python objects. Each element can be accessed through its index label or, when no index is given, through the default integer index.

Note: A Series is similar to a one-dimensional NumPy array, with one key difference: array indices are always integers starting at 0, whereas a Series index can hold other labels as well, such as strings. The labels do not need to be unique, but they must be of a hashable type.

Now, let’s create a pandas Series from a list of values.

    import pandas as pd

    # Create the Series
    ser = pd.Series([20000,25000,23000,28000,55000,23000])

    # Create the Index
    index = ['Java','Spark','PySpark','Pandas','python NumPy','Python']

    # Set the index
    ser.index = index
    print(ser)

    # Output:
    # Java            20000
    # Spark           25000
    # PySpark         23000
    # Pandas          28000
    # python NumPy    55000
    # Python          23000
    # dtype: int64
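
The point that index labels do not have to be unique can be checked with a short sketch. The Series below is a hypothetical one, separate from the running example.

```python
import pandas as pd

# Hypothetical Series with a repeated label
s = pd.Series([1, 2, 3], index=["a", "b", "a"])

print(s.index.is_unique)  # whether every label occurs only once
print(s.loc["a"])         # selects every entry labelled "a"
```

With a repeated label, loc[] returns a Series rather than a single value, which is worth keeping in mind when filtering by label.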

4. Use Series.filter() Function To Filter a Pandas Series

The Series.filter() function selects entries of a Series by their index labels; it does not look at the values. To match labels by pattern, pass a regular expression through the regex parameter. The following example keeps the entries whose index label contains a space with a character on either side.

    # Use Series.filter() function to filter a pandas series
    ser2 = ser.filter(regex = '. .')
    print(ser2)

    # Output:
    # python NumPy    55000
    # dtype: int64
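
For comparison, the same kind of selection can be written as a boolean mask built from the index. This is only a sketch on a shortened, hypothetical version of the Series; it uses the simpler pattern ' ', which matches any label that contains a space.

```python
import pandas as pd

ser = pd.Series(
    [20000, 25000, 55000],
    index=["Java", "Spark", "python NumPy"],
)

# filter() with a regex applied to the labels
by_regex = ser.filter(regex=" ")

# Boolean mask computed from the index itself
by_mask = ser[ser.index.str.contains(" ")]

print(by_regex)
print(by_mask)
```

The mask form is more verbose, but it can be combined with conditions on the values, which filter() cannot do.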

5. Filter Series by Index Labels

pandas.Series.filter() selects entries by the index labels you specify through the items, like, or regex parameter. The following example filters the series with a list of index labels, Spark and Python.

    # Filter() index by labels
    ser2 = ser.filter(items = ['Spark', 'Python'])
    print(ser2)

    # Output:
    # Spark     25000
    # Python    23000
    # dtype: int64

6. Use loc[] & Lambda to Filter a Pandas Series

You can also filter a pandas Series by value using Series.loc[] with a lambda function. The following example returns the entries whose value is equal to 23000.

    # Use loc[] and lambda to filter a pandas series
    ser2 = ser.loc[lambda x : x == 23000]
    print(ser2)

    # Output:
    # PySpark    23000
    # Python     23000
    # dtype: int64

You can also apply an “OR” condition with loc[]. Use the element-wise | operator and wrap each comparison in parentheses; Python's or keyword does not work on a Series. The following example keeps values that are less than 23000 or greater than 28000.

    # Use loc[] property & OR condition
    ser2 = ser.loc[lambda x : (x < 23000) | (x > 28000)]
    print(ser2)

    # Output:
    # Java            20000
    # python NumPy    55000
    # dtype: int64
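
One thing to watch for when combining conditions: a Series needs the element-wise operators | and &, not Python's or and and. The sketch below, on a shortened hypothetical Series, shows both operators and the error that the or keyword raises.

```python
import pandas as pd

ser = pd.Series(
    [20000, 25000, 23000, 28000],
    index=["Java", "Spark", "PySpark", "Pandas"],
)

# Element-wise OR; parentheses are needed because | binds tighter than < and >
either = ser.loc[lambda x: (x < 23000) | (x > 25000)]

# Element-wise AND
both = ser.loc[lambda x: (x > 20000) & (x < 28000)]

# `or` asks for the truth value of a whole Series, which pandas refuses to guess
try:
    ser.loc[lambda x: x < 23000 or x > 25000]
    message = None
except ValueError as err:
    message = str(err)

print(either)
print(both)
print("ValueError:", message)
```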

7. Use where() Function To Filter Series

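We can also use the where() function to filter a series by value. where() keeps the values that satisfy the condition and replaces the rest with NaN; chaining dropna() then removes those entries. Because NaN is a float, the result has dtype float64.

    # Use where() function to filter series
    ser2 = ser.where(ser < 25000).dropna()
    print(ser2)

    # Output:
    # Java       20000.0
    # PySpark    23000.0
    # Python     23000.0
    # dtype: float64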
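
To make the intermediate step visible, the following sketch separates where() from dropna() and compares the result with plain boolean indexing. It uses a shortened, hypothetical Series, and the variable names are chosen for this example only.

```python
import pandas as pd

ser = pd.Series([20000, 25000, 23000], index=["Java", "Spark", "PySpark"])

masked = ser.where(ser < 25000)  # same length, NaN where the condition is False
dropped = masked.dropna()        # NaN entries removed, dtype stays float
direct = ser[ser < 25000]        # boolean indexing keeps the integer dtype

print(masked)
print(dropped.dtype)
print(direct.dtype)
```

If the integer dtype matters, boolean indexing is usually the simpler choice; where() is more useful when the original shape has to be preserved.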

8. Use isin() Function To Filter Series

The isin() function returns a boolean mask that is True for each value present in the given list. Indexing the series with that mask returns the matching entries.

    # Use isin() function to filter series
    ser2 = ser[ser.isin([23000,28000])]
    print(ser2)

    # Output:
    # PySpark    23000
    # Pandas     28000
    # Python     23000
    # dtype: int64
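
Since isin() returns a boolean mask, the mask can also be inverted with ~ to keep everything that is not in the list. A small sketch on a hypothetical Series:

```python
import pandas as pd

ser = pd.Series(
    [20000, 25000, 23000, 28000],
    index=["Java", "Spark", "PySpark", "Pandas"],
)
wanted = [23000, 28000]

included = ser[ser.isin(wanted)]   # values found in the list
excluded = ser[~ser.isin(wanted)]  # values not found in the list

print(included)
print(excluded)
```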

9. Complete Example For Series filter() Function

    import pandas as pd

    # Create the Series
    ser = pd.Series([20000,25000,23000,28000,55000,23000])

    # Create the Index
    index_ = ['Java','Spark','PySpark','Pandas','python NumPy','Python']

    # Set the index
    ser.index = index_
    print(ser)

    # Use Series.filter() function to filter a pandas series
    ser2 = ser.filter(regex = '. .')
    print(ser2)

    # Filter() index by labels
    ser2 = ser.filter(items = ['Spark', 'Python'])
    print(ser2)

    # Use loc[] and lambda to filter a pandas series
    ser2 = ser.loc[lambda x : x == 23000]
    print(ser2)

    # Use loc[] property & OR condition
    ser2 = ser.loc[lambda x : (x < 23000) | (x > 28000)]
    print(ser2)

    # Use where() function to filter series
    ser2 = ser.where(ser < 25000).dropna()
    print(ser2)

    # Use isin() function to filter series
    ser2 = ser[ser.isin([23000,28000])]
    print(ser2)

10. Conclusion

In this article, you have learned how to filter a pandas Series by index label using filter(), and by value using where(), isin(), and loc[] with a lambda function, with examples.


pandas.Series.filter

Subset the dataframe rows or columns according to the specified index labels.

Note that this routine does not filter a dataframe on its contents. The filter is applied to the labels of the index.

Parameters

items : list-like
    Keep labels from axis which are in items.

like : str
    Keep labels from axis for which “like in label == True”.

regex : str (regular expression)
    Keep labels from axis for which re.search(regex, label) == True.

axis : {0 or 'index', 1 or 'columns', None}, default None
    The axis to filter on, expressed either as an index (int) or axis name (str). By default this is the info axis, ‘columns’ for DataFrame. For Series this parameter is unused and defaults to None.

Returns

same type as input object

See also

DataFrame.loc
    Access a group of rows and columns by label(s) or a boolean array.

Notes

The items, like, and regex parameters are enforced to be mutually exclusive.

axis defaults to the info axis that is used when indexing with [].

Examples

>>> df = pd.DataFrame(np.array(([1, 2, 3], [4, 5, 6])),
...                   index=['mouse', 'rabbit'],
...                   columns=['one', 'two', 'three'])
>>> df
        one  two  three
mouse     1    2      3
rabbit    4    5      6

>>> # select columns by name
>>> df.filter(items=['one', 'three'])
        one  three
mouse     1      3
rabbit    4      6

>>> # select columns by regular expression
>>> df.filter(regex='e$', axis=1)
        one  three
mouse     1      3
rabbit    4      6

>>> # select rows containing 'bbi'
>>> df.filter(like='bbi', axis=0)
        one  two  three
rabbit    4    5      6
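
The reference examples use a DataFrame. As a hedged sketch of the same behaviour on a Series, the block below filters a hypothetical Series by label and then shows what happens when two of the mutually exclusive parameters are passed together.

```python
import pandas as pd

# Hypothetical Series, not taken from the reference documentation
s = pd.Series([1, 2, 3], index=["mouse", "rabbit", "mole"])

# Keep the labels that contain "mo"
mo = s.filter(like="mo")
print(mo)

# Passing more than one of items, like, and regex is rejected
try:
    s.filter(items=["mouse"], like="mo")
    message = None
except TypeError as err:
    message = str(err)

print("TypeError:", message)
```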



Pandas: How to Filter Series by Value

You can use the following methods to filter the values in a pandas Series:

Method 1: Filter Values Based on One Condition

    #filter for values equal to 7
    my_series.loc[lambda x : x == 7]

Method 2: Filter Values Using “OR” Condition

    #filter for values less than 10 or greater than 20
    my_series.loc[lambda x : (x < 10) | (x > 20)]

Method 3: Filter Values Using “AND” Condition

    #filter for values greater than 10 and less than 20
    my_series.loc[lambda x : (x > 10) & (x < 20)]

Method 4: Filter Values Contained in List

    #filter for values that are equal to 4, 7, or 23
    my_series[my_series.isin([4, 7, 23])]

This tutorial explains how to use each method in practice with the following pandas Series:

    import pandas as pd

    #create pandas Series
    data = pd.Series([4, 7, 7, 12, 19, 23, 25, 30])

    #view pandas Series
    print(data)

    0     4
    1     7
    2     7
    3    12
    4    19
    5    23
    6    25
    7    30
    dtype: int64

Example 1: Filter Values Based on One Condition

The following code shows how to filter the pandas Series for values equal to 7:

    #filter for values equal to 7
    data.loc[lambda x : x == 7]

    1    7
    2    7
    dtype: int64

We can also filter for values not equal to 7:

    #filter for values not equal to 7
    data.loc[lambda x : x != 7]

    0     4
    3    12
    4    19
    5    23
    6    25
    7    30
    dtype: int64

Example 2: Filter Values Using “OR” Condition

The following code shows how to filter the pandas Series for values less than 10 or greater than 20:

    #filter for values less than 10 or greater than 20
    data.loc[lambda x : (x < 10) | (x > 20)]

    0     4
    1     7
    2     7
    5    23
    6    25
    7    30
    dtype: int64

Example 3: Filter Values Using “AND” Condition

The following code shows how to filter the pandas Series for values greater than 10 and less than 20:

    #filter for values greater than 10 and less than 20
    data.loc[lambda x : (x > 10) & (x < 20)]

    3    12
    4    19
    dtype: int64
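
The lambda in loc[] is not strictly required. As a sketch, the block below writes the same “AND” filter as a plain boolean mask, and then shows one case where the callable form can be more convenient: a method chain whose intermediate result has no name. The variable names masked and chained are chosen for this example.

```python
import pandas as pd

data = pd.Series([4, 7, 7, 12, 19, 23, 25, 30])

# Plain boolean mask: fine when the Series already has a name
masked = data[(data > 10) & (data < 20)]

# Callable form: x refers to the intermediate result of (data * 2)
chained = (data * 2).loc[lambda x: (x > 10) & (x < 20)]

print(masked)
print(chained)
```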

Example 4: Filter Values Contained in List

The following code shows how to filter the pandas Series for values that are contained in a list:

    #filter for values that are equal to 4, 7, or 23
    data[data.isin([4, 7, 23])]

    0     4
    1     7
    2     7
    5    23
    dtype: int64
