Count unique values in list python

Python: Count Unique Values in a List (4 Ways)

In this tutorial, you’ll learn how to use Python to count unique values in a list, and which approach is fastest. You’ll learn how to do this using a naive, brute-force for loop, the collections module, the set() function, and numpy. We’ll close by timing each method so you can get the best performance out of your script.

The Quick Answer: Use Python Sets

    # Using sets to count unique values in a list
    # Note: 'orage' and 'orange' are different strings, so both are counted
    a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']
    num_values = len(set(a_list))
    print(num_values)
    # Returns 5

Why count unique values?

Python lists are a useful built-in data structure! One of the perks that they offer is the ability to have duplicate items within them.

There may be many times when you need to count the unique values contained within a list. For example, if you receive a list that records every login to a site, you could determine how many unique people actually logged in.
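As a minimal sketch of the login example, assume the log is a plain list of usernames with one entry per login. The names below are made up for illustration:

```python
# Hypothetical login log: one username per login event (illustrative data)
logins = ['ana', 'ben', 'ana', 'cy', 'ben', 'ana']

total_logins = len(logins)        # every login event, duplicates included
unique_users = len(set(logins))   # each user counted once

print(total_logins)   # 6
print(unique_users)   # 3
```

The gap between the two numbers shows how many logins were repeat visits.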

Using Collections to Count Unique Values in a List

The built-in collections module can be used to count unique values in a list. The module provides a class called Counter that returns a dictionary-like object with the unique values as keys and the number of occurrences as values.

Because of this, we can count the number of keys to get the number of unique values.

Let’s see how we can use the Counter object to count unique values in a Python list:

    # Use Counter from collections to count unique values in a Python list
    from collections import Counter

    a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']
    counter_object = Counter(a_list)
    keys = counter_object.keys()
    num_values = len(keys)
    print(num_values)
    # Returns 5

Let’s see what we’ve done here:

  1. We passed our list into Counter to create an object whose keys are the unique values
  2. We get the keys using the .keys() method
  3. Finally, we get the length of that new object

We can make this much easier to write by simply chaining the process together, as shown below.

    # Using Counter from collections to count unique values in a list
    from collections import Counter

    a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']
    num_values = len(Counter(a_list).keys())
    print(num_values)
    # Returns 5

This process returns the same thing, but is much quicker to write!
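Because a Counter is dictionary-like, calling len() on it directly should give the same number as counting its keys. The sketch below checks that on the sample list and also shows the per-value counts that Counter tracks along the way:

```python
from collections import Counter

a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

counts = Counter(a_list)

# len() of a Counter is the number of distinct keys
num_values = len(counts)
print(num_values)                        # 5
print(num_values == len(counts.keys()))  # True

# The per-value counts are available as a bonus
print(counts['apple'])   # 5
print(counts['grape'])   # 2
```

If you only need the number of unique values, the counts are extra work; they are useful when you also want to know how often each value occurs.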


Using Sets to Count Unique Values in a Python List

Sets are another built-in Python data structure. One of the things that separates sets from lists is that they can only contain unique values.

Python comes with a built-in set() function that lets you create a set from whatever is passed into it as an argument. When we pass a list into the function, it turns the list into a set, thereby stripping out duplicate values.

Now, let’s see how we can use sets to count unique values in a list:

    # Using sets to count unique values in a list
    a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']
    unique_set = set(a_list)
    num_values = len(unique_set)
    print(num_values)
    # Returns: 5

Let’s see what we’ve done here:

  1. We turned our list into a set using the built-in set() function
  2. We returned the number of values by counting the length of the set, using the len() function

We can also make this more concise by simply nesting the function calls, as demonstrated below:

    # Using sets to count unique values in a list
    a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']
    num_values = len(set(a_list))
    print(num_values)
    # Returns 5

This returns the same value but is a little faster to write out.


Use Numpy to Count Unique Values in a Python List

You can also use numpy to count unique values in a list. Numpy uses a data structure called a numpy array, which behaves similarly to a list but also comes with many other helpful functions, such as the ability to remove duplicates.

Let’s see how we can use numpy to count unique values in a list:

    # Use numpy in Python to count unique values in a list
    import numpy as np

    a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']
    array = np.array(a_list)
    unique = np.unique(array)
    num_values = len(unique)
    print(num_values)
    # Returns 5

Let’s see what we’ve done here:

  1. We imported numpy as np and created an array using the array() function
  2. We used the unique() function from numpy to remove any duplicates
  3. Finally, we calculated the length of that array

We can also write this out more concisely by nesting the function calls. Let’s see how this can be done:

    # Use numpy in Python to count unique values in a list
    import numpy as np

    a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']
    num_values = len(np.unique(np.array(a_list)))
    print(num_values)
    # Returns 5

This returns the same result as before. Note that np.unique() returns the unique values in sorted order. The Pandas unique method, by contrast, is hash-table based and returns values in order of appearance rather than sorted.

Use a For Loop in Python to Count Unique Values in a List

Finally, let’s take a look at a more naive method to count unique items in a list. For this, we’ll use a Python for loop to iterate over a list and count its unique items.

    a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']

    unique_list = list()
    unique_items = 0

    for item in a_list:
        if item not in unique_list:
            unique_list.append(item)
            unique_items += 1

    print(unique_items)
    # Returns 5

Let’s see what we’ve done here:

  1. We create a new list called unique_list and an integer of 0 called unique_items
  2. We then loop over our original list and see if the current item is in the unique_list
  3. If it isn’t, then we append it to the list and add 1 to our counter unique_items
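One variation on the loop above, offered as a sketch rather than as the tutorial's method: keep the items already seen in a set instead of a list, so each membership check is a hash lookup rather than a scan of the list. The function name count_unique is hypothetical.

```python
def count_unique(items):
    """Count distinct items with an explicit loop (illustrative helper)."""
    seen = set()          # items encountered so far
    unique_items = 0
    for item in items:
        if item not in seen:   # hash lookup, not a linear scan
            seen.add(item)
            unique_items += 1
    return unique_items


a_list = ['apple', 'orage', 'apple', 'banana', 'apple', 'apple', 'orange', 'grape', 'grape', 'apple']
print(count_unique(a_list))  # 5
```

This keeps the step-by-step structure of the loop while avoiding the repeated list scans that slow down the naive version on long inputs.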

Which Method Is Fastest to Count Unique Values in a Python List?

Now that you’ve learned four different ways of counting unique values in a Python list, let’s take a look at which method is fastest.

What we’ll do is create a Python decorator to time each method. We’ll create a function that executes each method and decorate it to identify how long its execution takes.

For our sample list, we’ll use the first few paragraphs of A Christmas Carol, where each word is a list item, and multiply that list by 10,000 to make it a bit of a challenge:

    import time

    def time_it(func):
        """Print the runtime of a decorated function."""
        def wrapper_time_it(*args, **kwargs):
            start_time = time.perf_counter()
            value = func(*args, **kwargs)
            end_time = time.perf_counter()
            run_time = end_time - start_time
            # Report the function name and elapsed time
            print(f"Finished {func.__name__!r} in {run_time:.10f} seconds")
            return value
        return wrapper_time_it

    @time_it
    def counter_method(a_list):
        from collections import Counter
        return len(Counter(a_list).keys())

    @time_it
    def set_method(a_list):
        return len(set(a_list))

    @time_it
    def numpy_method(a_list):
        import numpy as np
        return len(np.unique(np.array(a_list)))

    @time_it
    def for_loop_method(a_list):
        unique_list = list()
        unique_items = 0
        for item in a_list:
            if item not in unique_list:
                unique_list.append(item)
                unique_items += 1
        return unique_items

    sample_list = [
        'Marley', 'was', 'dead:', 'to', 'begin', 'with.', 'There', 'is', 'no', 'doubt', 'whatever', 'about',
        'that.', 'The', 'register', 'of', 'his', 'burial', 'was', 'signed', 'by', 'the', 'clergyman,', 'the',
        'clerk,', 'the', 'undertaker,', 'and', 'the', 'chief', 'mourner.', 'Scrooge', 'signed', 'it:', 'and',
        'Scrooge’s', 'name', 'was', 'good', 'upon', '’Change,', 'for', 'anything', 'he', 'chose', 'to', 'put',
        'his', 'hand', 'to.', 'Old', 'Marley', 'was', 'as', 'dead', 'as', 'a', 'door-nail.', 'Mind!', 'I',
        'don’t', 'mean', 'to', 'say', 'that', 'I', 'know,', 'of', 'my', 'own', 'knowledge,', 'what', 'there',
        'is', 'particularly', 'dead', 'about', 'a', 'door-nail.', 'I', 'might', 'have', 'been', 'inclined,',
        'myself,', 'to', 'regard', 'a', 'coffin-nail', 'as', 'the', 'deadest', 'piece', 'of', 'ironmongery',
        'in', 'the', 'trade.', 'But', 'the', 'wisdom', 'of', 'our', 'ancestors', 'is', 'in', 'the', 'simile;',
        'and', 'my', 'unhallowed', 'hands', 'shall', 'not', 'disturb', 'it,', 'or', 'the', 'Country’s', 'done',
        'for.', 'You', 'will', 'therefore', 'permit', 'me', 'to', 'repeat,', 'emphatically,', 'that', 'Marley',
        'was', 'as', 'dead', 'as', 'a', 'door-nail.', 'Scrooge', 'knew', 'he', 'was', 'dead?', 'Of', 'course',
        'he', 'did.', 'How', 'could', 'it', 'be', 'otherwise?', 'Scrooge', 'and', 'he', 'were', 'partners',
        'for', 'I', 'don’t', 'know', 'how', 'many', 'years.', 'Scrooge', 'was', 'his', 'sole', 'executor,',
        'his', 'sole', 'administrator,', 'his', 'sole', 'assign,', 'his', 'sole', 'residuary', 'legatee,',
        'his', 'sole', 'friend,', 'and', 'sole', 'mourner.',
    ]
    sample_list *= 10000

    counter_method(sample_list)
    set_method(sample_list)
    numpy_method(sample_list)
    for_loop_method(sample_list)

    # Returns
    # Finished 'counter_method' in 0.2321387500 seconds
    # Finished 'set_method' in 0.0463015000 seconds
    # Finished 'numpy_method' in 0.2570261250 seconds
    # Finished 'for_loop_method' in 7.1416198340 seconds

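From this, we can see that while the Counter and numpy methods are reasonably fast, the set method is the fastest of the bunch, and the for loop is by far the slowest. A likely explanation is that set() only has to hash each item once, whereas Counter also tallies how often each value occurs and numpy first has to convert the list into an array before finding the unique values. The for loop is slow because every membership check scans the list of items found so far.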
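To re-check the ranking on your own machine, the standard library's timeit module is an alternative to a hand-written decorator. The sketch below is a simplified stand-in for the benchmark above, not a reproduction of it: it assumes a small made-up word list, runs each function 10 times, and leaves numpy out so that it works without third-party packages. Absolute timings will differ between machines.

```python
import timeit
from collections import Counter

# Small illustrative sample, repeated to give the functions some work
sample_list = ['Marley', 'was', 'dead', 'to', 'begin', 'with', 'was', 'dead'] * 1000


def counter_method(a_list):
    return len(Counter(a_list))


def set_method(a_list):
    return len(set(a_list))


def for_loop_method(a_list):
    unique_list = []
    for item in a_list:
        if item not in unique_list:
            unique_list.append(item)
    return len(unique_list)


results = {}
for func in (counter_method, set_method, for_loop_method):
    # Total time in seconds for 10 calls of func(sample_list)
    elapsed = timeit.timeit(lambda: func(sample_list), number=10)
    results[func.__name__] = func(sample_list)
    print(f"{func.__name__}: {elapsed:.6f} seconds for 10 runs")
```

Repeating each call several times smooths out one-off noise, which a single timed run cannot do.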


Conclusion

In this post, you learned how to count unique values in a Python list. You learned how to do this using built-in sets, using the collections module, using numpy, and finally using a for loop. You then learned which of these methods is the fastest to execute, to ensure you’re not bogging down your script unnecessarily.

To learn more about the Counter object, see the collections module page in the official Python documentation.

Count Unique Values in Python List

This article introduces different methods to count unique values in a list:

  1. Use collections.Counter to Count Unique Values in Python List
  2. Use set to Count Unique Values in Python List
  3. Use numpy.unique to Count the Unique Values in Python List

Use collections.Counter to Count Unique Values in Python List

collections is a module in the Python standard library, and it contains the Counter class for counting hashable objects.

A Counter object has two methods that are useful here:

  1. keys() returns the unique values in the list.
  2. values() returns the count of every unique value in the list.

We can use the len() function to get the number of unique values by passing the Counter object as the argument.

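Example code:

    from collections import Counter

    words = ['Z', 'V', 'A', 'Z', 'V']

    print(Counter(words).keys())    # dict_keys(['Z', 'V', 'A'])
    print(Counter(words).values())  # dict_values([2, 2, 1])
    print(Counter(words))           # Counter({'Z': 2, 'V': 2, 'A': 1})
    print(len(Counter(words)))      # 3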

Use set to Count Unique Values in Python List

set is an unordered collection data type that is iterable, mutable, and has no duplicate elements. We can get the length of the set to count unique values in the list after we convert the list to a set using the set() function.

Example code:

    words = ['Z', 'V', 'A', 'Z', 'V']
    print(len(set(words)))  # 3

Use numpy.unique to Count the Unique Values in Python List

numpy.unique returns the unique values of the input array-like data, and also returns the count of each unique value if the return_counts parameter is set to True.

Example code:

    import numpy as np

    words = ['Z', 'V', 'A', 'Z', 'V']

    print(np.unique(words))       # ['A' 'V' 'Z']
    print(len(np.unique(words)))  # 3
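The return_counts parameter mentioned above can be sketched as follows. With return_counts=True, np.unique returns two arrays: the sorted unique values and the number of times each one occurs. The variable names values and counts are chosen here for illustration.

```python
import numpy as np

words = ['Z', 'V', 'A', 'Z', 'V']

# Two arrays come back: the unique values and their counts
values, counts = np.unique(words, return_counts=True)

print(len(values))                                  # 3
print(dict(zip(values.tolist(), counts.tolist())))  # {'A': 1, 'V': 2, 'Z': 2}
```

Zipping the two arrays into a dictionary gives a result comparable to what Counter produces, except that the keys come out sorted.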
