Python data class to dict

Python Dict vs Asdict

  1. the dataclasses Library in Python
  2. Why __dict__ Is Faster Than asdict

The dataclasses module was introduced in Python 3.7. It lets us define structured classes meant primarily for storing data, and it generates the methods that handle that data and its representation.

the dataclasses Library in Python

From Python 3.7 onward, dataclasses is part of the standard library, so nothing needs to be installed. On Python 3.6, a backport is available through pip install dataclasses.

Unlike a normal class in Python, a dataclass is created by applying the @dataclass decorator to a class. Attributes are declared using type hints, which specify the data type of each attribute in the dataclass.

Below is a code snippet that puts the concept into practice.

# A bare-bones data class
# Don't forget to import the dataclass decorator
from dataclasses import dataclass

@dataclass
class Student:
    """A class which holds a student's data"""

    # Declaring attributes
    # Making use of type hints
    name: str
    id: int
    section: str
    classname: str
    fatherName: str
    motherName: str

# Below is a dataclass instance
student = Student("Muhammad", 1432, "Red", "0-1", "Ali", "Marie")
print(student)

Output:

Student(name='Muhammad', id=1432, section='Red', classname='0-1', fatherName='Ali', motherName='Marie')

There are two points to note in the code above. First, a dataclass object accepts arguments and assigns them to the relevant data members even though we never wrote an __init__() constructor.

This is because the @dataclass decorator generates an __init__() constructor for us.

The second point is that the print statement neatly prints the data held in the object without any function written specifically for that purpose. This works because the decorator also generates a __repr__() method.
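To make the two generated methods concrete, here is a minimal sketch that places a dataclass next to a hand-written class with the same behavior. PlainStudent is a hypothetical counterpart written only for this comparison, and the field list is shortened to two attributes.

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    id: int


class PlainStudent:
    """Hand-written counterpart, for comparison only (hypothetical)."""

    def __init__(self, name: str, id: int) -> None:
        self.name = name
        self.id = id

    def __repr__(self) -> str:
        return f"PlainStudent(name={self.name!r}, id={self.id!r})"


generated = Student("Muhammad", 1432)
manual = PlainStudent("Muhammad", 1432)

print(generated)  # Student(name='Muhammad', id=1432)
print(manual)     # PlainStudent(name='Muhammad', id=1432)

# Both methods are defined on the dataclass itself, not inherited from object
print("__init__" in vars(Student), "__repr__" in vars(Student))  # True True
```

The decorator writes roughly the boilerplate that PlainStudent spells out by hand.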

Why __dict__ Is Faster Than asdict

In most cases where you would have used __dict__ without dataclasses, you should continue using __dict__, perhaps together with a copy call.

The asdict function, by contrast, performs extra tasks that might not be useful for your case. These extra tasks carry an overhead that you may want to avoid.
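One way to see the overhead is to time both approaches. The following is only a rough sketch: Point is a hypothetical class, the iteration count is arbitrary, and the absolute numbers depend on the machine and the Python version, so no expected timings are shown.

```python
import timeit
from dataclasses import dataclass, asdict


@dataclass
class Point:
    x: int
    y: int


p = Point(1, 2)

# For a flat dataclass both approaches yield equal dictionaries
assert asdict(p) == dict(p.__dict__) == {"x": 1, "y": 2}

# Time 10,000 conversions each way; results vary from run to run
t_asdict = timeit.timeit(lambda: asdict(p), number=10_000)
t_dict = timeit.timeit(lambda: dict(p.__dict__), number=10_000)

print(f"asdict:   {t_asdict:.4f} s")
print(f"__dict__: {t_dict:.4f} s")
```

Here dict(p.__dict__) makes a shallow copy, which is the closest cheap equivalent to what asdict returns for a flat class.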

Here’s what asdict does according to the official documentation: each dataclass object is first converted to a dict of its fields as name: value pairs.

Then dataclasses, dicts, lists, and tuples are recursed into.

If you need this recursive dataclass dictification, go for asdict. Otherwise, all the extra work that goes into providing it is wasted.

A consequence of the recursion is that, if you rely on asdict, changing the implementation of a contained object to use dataclass changes the result of asdict on the outer objects.

from dataclasses import dataclass, asdict
from typing import List

@dataclass
class APoint:
    x1: int
    y1: int

@dataclass
class C:
    aList: List[APoint]

point_instance = APoint(10, 20)
assert asdict(point_instance) == {'x1': 10, 'y1': 20}

# Dataclass instances nested inside the list are converted as well
c = C([APoint(30, 40), APoint(50, 60)])
assert asdict(c) == {'aList': [{'x1': 30, 'y1': 40}, {'x1': 50, 'y1': 60}]}
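The point about contained objects can be illustrated with a small sketch. PlainInner, DataInner, and Outer are hypothetical names; the only difference between the two inner classes is whether the class is declared as a dataclass.

```python
from dataclasses import dataclass, asdict


class PlainInner:
    """Ordinary class: asdict deep-copies it and leaves it as an object."""

    def __init__(self, value: int) -> None:
        self.value = value


@dataclass
class DataInner:
    """Same data, but declared as a dataclass."""

    value: int


@dataclass
class Outer:
    inner: object


plain_result = asdict(Outer(PlainInner(1)))
data_result = asdict(Outer(DataInner(1)))

print(type(plain_result["inner"]).__name__)  # PlainInner
print(data_result)  # {'inner': {'value': 1}}
```

Switching the inner class to a dataclass turns the nested value from an object into a dictionary, which can surprise code that consumes the outer result.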

Moreover, the recursive logic cannot handle circular references. If you use dataclasses to represent, say, a graph or some other data structure with circular references, asdict keeps recursing until Python raises a RecursionError.

import dataclasses

@dataclasses.dataclass
class GraphNode:
    name: str
    neighbors: list['GraphNode']  # subscripting list needs Python 3.9+

x = GraphNode('x', [])
y = GraphNode('y', [])
x.neighbors.append(y)
y.neighbors.append(x)

# asdict keeps following x -> y -> x -> ... until the maximum
# recursion depth is exceeded and a RecursionError is raised.
# In some environments, such as a Jupyter notebook, the kernel
# may restart when this happens.
dataclasses.asdict(x)
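As a hedged sketch of how the failure surfaces, the same cycle can be wrapped in a try block. The outcome variable is introduced only for illustration, and the neighbors annotation is simplified to a plain list so the snippet also runs on older Python versions.

```python
import dataclasses


@dataclasses.dataclass
class GraphNode:
    name: str
    neighbors: list


x = GraphNode("x", [])
y = GraphNode("y", [])
x.neighbors.append(y)
y.neighbors.append(x)

try:
    dataclasses.asdict(x)
    outcome = "converted"
except RecursionError:
    outcome = "RecursionError"

print(outcome)  # RecursionError

# __dict__ does not recurse, so the cycle is harmless here
print(list(x.__dict__))  # ['name', 'neighbors']
```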

Furthermore, asdict builds a new dict, whereas __dict__ simply accesses the object’s existing attribute dictionary.

It is important to note that the dict returned by asdict is not affected by later reassignment of the original object’s attributes.
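The difference between the copy built by asdict and the live __dict__ can be sketched as follows. Point, snapshot, and live are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, asdict


@dataclass
class Point:
    x: int
    y: int


p = Point(1, 2)
snapshot = asdict(p)  # independent dictionary built by asdict
live = p.__dict__     # the instance's own attribute dictionary

p.x = 99  # reassign an attribute after both dictionaries were obtained

print(snapshot["x"])       # 1
print(live["x"])           # 99
print(live is p.__dict__)  # True
```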

Also, because asdict relies on fields(), attributes that you add to a dataclass object without a matching declared field are not included in its result.
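A short sketch of this behavior, assuming a hypothetical Point class and an ad hoc label attribute that is not a declared field:

```python
from dataclasses import dataclass, asdict


@dataclass
class Point:
    x: int
    y: int


p = Point(1, 2)
p.label = "origin-ish"  # ad hoc attribute, not a declared field

print(asdict(p))   # {'x': 1, 'y': 2}
print(p.__dict__)  # {'x': 1, 'y': 2, 'label': 'origin-ish'}
```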

Lastly, asdict calls copy.deepcopy() on anything that isn’t a dataclass instance, dict, list, or tuple. Recent versions of the documentation state this explicitly.

return copy.deepcopy(instance)  # a very costly operation!

Dataclass instances, dicts, lists, and tuples go through the recursive logic instead, which also builds a copy, just with the recursive dictification applied.

Deep-copying is a costly operation on its own, as it inspects every object to see what needs to be copied. In addition, asdict does not share a memo between its deepcopy calls, so it can create multiple copies of an object that is shared within a nontrivial object graph.

Beware of such a scenario:

from dataclasses import dataclass, asdict

@dataclass
class PointClass:
    x1: object
    y1: object

obj_instance = object()

# Both fields refer to the same object
var1 = PointClass(obj_instance, obj_instance)
var2 = asdict(var1)

print(var1.x1 is var1.y1)  # prints True
print(var2['x1'] is var2['y1'])  # prints False
print(var2['x1'] is var1.x1)  # prints False
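When the copies are unwanted, one possible alternative is a shallow conversion built on fields(). The helper name shallow_asdict is hypothetical and not part of the dataclasses module; this is a sketch of the idea, not a drop-in replacement for asdict, since nested dataclasses are left unconverted.

```python
from dataclasses import dataclass, fields


@dataclass
class PointClass:
    x1: object
    y1: object


def shallow_asdict(instance) -> dict:
    """Hypothetical helper: one level deep, no recursion, no copying."""
    return {f.name: getattr(instance, f.name) for f in fields(instance)}


shared = object()
var1 = PointClass(shared, shared)
var2 = shallow_asdict(var1)

print(var2["x1"] is var2["y1"])  # True
print(var2["x1"] is var1.x1)     # True
```

Because nothing is copied, the shared object keeps its identity in the resulting dictionary.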
