Сформировать xml файл python

XML Processing Modules¶

Python’s interfaces for processing XML are grouped in the xml package.

The XML modules are not secure against erroneous or maliciously constructed data. If you need to parse untrusted or unauthenticated data see the XML vulnerabilities and The defusedxml Package sections.

It is important to note that modules in the xml package require that there be at least one SAX-compliant XML parser available. The Expat parser is included with Python, so the xml.parsers.expat module will always be available.

The documentation for the xml.dom and xml.sax packages are the definition of the Python bindings for the DOM and SAX interfaces.

The XML handling submodules are:

  • xml.etree.ElementTree : the ElementTree API, a simple and lightweight XML processor
  • xml.dom : the DOM API definition
  • xml.dom.minidom : a minimal DOM implementation
  • xml.dom.pulldom : support for building partial DOM trees
  • xml.sax : SAX2 base classes and convenience functions
  • xml.parsers.expat : the Expat parser binding

XML vulnerabilities¶

The XML processing modules are not secure against maliciously constructed data. An attacker can abuse XML features to carry out denial of service attacks, access local files, generate network connections to other machines, or circumvent firewalls.

The following table gives an overview of the known attacks and whether the various modules are vulnerable to them.

Vulnerable (1)

Vulnerable (1)

Vulnerable (1)

Vulnerable (1)

Vulnerable (1)

Vulnerable (1)

Vulnerable (1)

Vulnerable (1)

Vulnerable (1)

Vulnerable (1)

external entity expansion

  1. Expat 2.4.1 and newer is not vulnerable to the “billion laughs” and “quadratic blowup” vulnerabilities. Items still listed as vulnerable due to potential reliance on system-provided libraries. Check pyexpat.EXPAT_VERSION .
  2. xml.etree.ElementTree doesn’t expand external entities and raises a ParserError when an entity occurs.
  3. xml.dom.minidom doesn’t expand external entities and simply returns the unexpanded entity verbatim.
  4. xmlrpclib doesn’t expand external entities and omits them.
  5. Since Python 3.7.1, external general entities are no longer processed by default.
Читайте также:  Ячейка как ссылка

The Billion Laughs attack – also known as exponential entity expansion – uses multiple levels of nested entities. Each entity refers to another entity several times, and the final entity definition contains a small string. The exponential expansion results in several gigabytes of text and consumes lots of memory and CPU time.

quadratic blowup entity expansion

A quadratic blowup attack is similar to a Billion Laughs attack; it abuses entity expansion, too. Instead of nested entities it repeats one large entity with a couple of thousand chars over and over again. The attack isn’t as efficient as the exponential case but it avoids triggering parser countermeasures that forbid deeply nested entities.

external entity expansion

Entity declarations can contain more than just text for replacement. They can also point to external resources or local files. The XML parser accesses the resource and embeds the content into the XML document.

Some XML libraries like Python’s xml.dom.pulldom retrieve document type definitions from remote or local locations. The feature has similar implications as the external entity expansion issue.

Decompression bombs (aka ZIP bomb) apply to all XML libraries that can parse compressed XML streams such as gzipped HTTP streams or LZMA-compressed files. For an attacker it can reduce the amount of transmitted data by three magnitudes or more.

The documentation for defusedxml on PyPI has further information about all known attack vectors with examples and references.

The defusedxml Package¶

defusedxml is a pure Python package with modified subclasses of all stdlib XML parsers that prevent any potentially malicious operation. Use of this package is recommended for any server code that parses untrusted XML data. The package also ships with example exploits and extended documentation on more XML exploits such as XPath injection.

Источник

Building XML using Python

Building XML using Python programming language means creating an XML file or XML string using Python. We have seen how to parse or read an existing XML file or string using Python in my previous tutorial. Here we will see how to create an XML file or string using Python from scratch. We will not only create the XML file but also do pretty print the XML data. We have defined one method for pretty printing the XML elements.

Читайте также:  Python определить тип кортеж

Extensible Markup Language (XML) are the most widely used formats for data, because this format is very well supported by modern applications, and is very well suited for further data manipulation and customization. Therefore it is sometimes required to generate XML data using Python or other programming languages.

Prerequisites

Have Python installed in Windows (or Unix)
Pyhton version and Packages
Here I am using Python 3.6.6 version

Example with Source Code

Preparing Workspace

Preparing your workspace is one of the first things that you can do to make sure that you start off well. The first step is to check your working directory.

When you are working in the Python terminal, you need first navigate to the directory, where your file is located and then start up Python, i.e., you have to make sure that your file is located in the directory where you want to work from.

Let’s move on to the example…

Project Directory

In the below image you see I have opened a cmd prompt and navigated to the directory where I have to create Python script for building XML using Python.

python

Creating Python Script

Now we will create a python script that will read the attached XML file in the above link and display the content in the console.

XML is an inherently hierarchical data format, and the most natural way to represent it is with a tree. We will be parsing the XML data using xml.etree.ElementTree. ElementTree represents the whole XML document as a tree, and Element represents a single node in this tree. Interactions with the whole document (reading and writing to/from files) are usually done on the ElementTree level. Interactions with a single XML element and its sub-elements are done on the Element level.

Читайте также:  Couldn create java virtual machine

Here in the below Python XML builder script we import the required module. Then we define a method that does the task of pretty printing of the XML structure otherwise all will be written in one line and it would a bi difficult to read the XMl file.

Next we create the root element called bookstore with attribute speciality that has value novel. Then we create sub-element of the root element called book with attribute style that has value autobiography and so on.

Finally we write the whole XML document into a file under the current directory where the Python script resides. We also include XML declaration and encoding as the first line of the XML structure.

import xml.etree.ElementTree as ET #pretty print method def indent(elem, level=0): i = "\n" + level*" " j = "\n" + (level-1)*" " if len(elem): if not elem.text or not elem.text.strip(): elem.text = i + " " if not elem.tail or not elem.tail.strip(): elem.tail = i for subelem in elem: indent(subelem, level+1) if not elem.tail or not elem.tail.strip(): elem.tail = j else: if level and (not elem.tail or not elem.tail.strip()): elem.tail = j return elem #root element root = ET.Element('bookstore', ) #book sub-element book = ET.SubElement(root, 'book', ) author = ET.SubElement(book, 'author') firstName = ET.SubElement(author, 'first-name') firstName.text = 'Joe' lastName = ET.SubElement(author, 'last-name') lastName.text = 'Bob' award = ET.SubElement(author, 'award') award.text = 'Trenton Literary Review Honorable Mention' price = ET.SubElement(book, 'price') price.text = str(12) #magazine sub-element magazine = ET.SubElement(root, 'magazine', ) price = ET.SubElement(magazine, 'price') price.text = str(12) subscription = ET.SubElement(magazine, 'subscription', ) #write to file tree = ET.ElementTree(indent(root)) tree.write('bookstore2.xml', xml_declaration=True, encoding='utf-8')

Testing the Script

Now it’s time to test for the example on building XML using Python.

Simply run the above script you should see the generated bookstore2.xml file in the current directory. here is the below screen-shot of the output XML file.

building xml using python

That’s all. Hope, you got idea on building XML using Python.

Источник

Оцените статью