Connect to db2 python

TECH CHAMPION

To establish a connection to a db2 luw database within Python code and fetch data records from it, you would need to use the ibm_db driver API module. The module has been developed by IBM and is maintained under Apache Software License. The module can be downloaded easily and installed on your Python Base (ibm_db module download) You can use either pip or conda package manager to install the ibm_db module. Use one of the below if your Python base does not have the module installed yet.

pip install ibm_dbconda install -c conda-forge ibm_db

Using this module, one can connect and query not only DB2 LUW but also DB2 for zOS, DB2 for i5 and Informix Database (Informix Dynamic Server).

The ibm_db module is a wrapper implementation of the DB2 CLI APIs written in Python. The way to call the module API functions is similar to those of DB2 CLI.

The ibm_db driver implements an Object-Oriented approach. Once you load this module, you just have to call the inbuilt API functions for database-specific operations such as establishing a connection, executing a SQL query etc.

These functions also return the SQLSTATE information – a data structure that contains status information about the SQL or DB2 command executed. This state information is critical for writing scripts and programs in Python or any other programming language as they help us to evaluate whether a sql execution was successful or resulted in failure and based on the outcome, one generates the exception handling code.

Let us go through the steps required for the connection to a DB2 database. In our demonstration, we will use a database that resides on a Linux Virtual Machine on Oracle Virtual Box. In our demo case, the server IP is 192.168.43.221, the post number of the instance is 49152, the database name is SAMPLE and the user id/password is db2inst1/db2inst1.

If you are interested to know the installation of DB2 on Oracle Virtual Box, please check the article in the below link.

The basic steps involved in connecting to the database within the code would comprise the below-listed ones –

  • Import the module
  • Declare & assign the DSN variables
  • Generate the Connection String
  • Call the API function
  • Execute the SQL
  • Close the connection

IMPORT THE MODULE

The first step is to load the module in the code.

DSN VARIABLES

A DSN is a short form for DATA SOURCE NAME that contains network-identifiable information about a data source. It could be a regular database or HDFS or flat files and so on which can be uniquely identified over a network. In programming, it is considered good and standard practice to generate a connection string from variables pointing to the network parameters. You can choose any name for these variables and /or use the values directly in the API call.

# Declare the Connection Variables dsn_Host = "192.168.43.221" dsn_Port = str(49152) dsn_Protocol = "TCPIP" dsn_dbName = "SAMPLE" dsn_User = "db2inst1" dsn_UserPass = "db2inst1" dsn_ConnectionStr = ""

CONNECTION STRING

In programming, a connection is a piece of information that is required by the application to connect to the database. You must have the environment details of the database you wish to connect and query. These details include the following –

  • Hostname or IP address
  • The port number at which the db2 instance listens
  • Protocol (which is configured at the server; generally TCPIP)
  • Database name or its alias
  • Userid that would be used to connect to the database. the user id must have the authority to connect to the database.
  • The password of the above userid
Читайте также:  Reading pdf files in java

In the connection string, we are basically assigning values to certain pre-defined variables. These variables and their values are parsed internally by the ibm_db module to fetch the environment information about the target database. These variables are:-

Not all of them are essential to be specified as some have default values. The ATTACH=TRUE lets you connect to the server but not the database and in that case, specifying the database name is not needed.

If the database you want to connect is already cataloged on the server where the Python code will run then you can connect to the database by specifying only the local database alias along with the user id and the password.

If the database has not been cataloged, then you need to specify all the arguments (Hostname, port number, protocol etc.)

These variables have to be passed in the connection string by assigning the values to these variables. The assignment has to be within the API call itself. Also, the subsequent variable assignment needs to be separated with a semicolon without any space in between.

# Prepare the connection String dsn_ConnectionStr = "" dsn_ConnectionStr = dsn_ConnectionStr + "DRIVER=" dsn_ConnectionStr = dsn_ConnectionStr + ";ATTACH=FALSE" dsn_ConnectionStr = dsn_ConnectionStr + ";HOSTNAME=" + dsn_Host dsn_ConnectionStr = dsn_ConnectionStr + ";PORT=" + dsn_Port dsn_ConnectionStr = dsn_ConnectionStr + ";PROTOCOL=" + dsn_Protocol dsn_ConnectionStr = dsn_ConnectionStr + ";DATABASE=" + dsn_dbName dsn_ConnectionStr = dsn_ConnectionStr + ";UID="+ dsn_User dsn_ConnectionStr = dsn_ConnectionStr + ";PWD wp-block-code has-white-color has-text-color has-background" style="background-color:#2c3e50">print(dsn_ConnectionStr) DRIVER=;ATTACH=FALSE;HOSTNAME=192.168.43.221;PORT=49152;PROTOCOL=TCPIP;DATABASE=SAMPLE;UID=db2inst1;PWD=db2inst1

Make a call to API ibm_db.connect() to establish connection

ibm_db.connect is the API function that is used to make a connection to the database.

If you are making a connection to a cataloged database, then the call to the API function would be –

ibm_db.connect(database, userid, password)

If you are making a connection to a non-cataloged database using a DSN then the call would be –

ibm_db.connect(dsn, “” , “”)

The above statement would create a connection object which would remain active till it is explicitly closed using an ibm_db.close() statement.

conn = ibm_db.connect(myConnectionStr, '', '') # conn is now a connection object print(conn)

There are two related API functions that supplement connect() call. These are ibm_db.conn_errormsg() and ibm_db.conn_error() functions. The first one gives the SQLCODE information for the connect statement issued while the second one gives the SQLSTATE.

Generally, we would like to use the connection statement in a try /catch block so that if the connection fails for some reason, we are able to handle the failure and gracefully come out of the program.

try: conn = ibm_db.connect(dsn_ConnectionStr, '', '') except Exception: print("Error in connecting to the database " + str(dsn_dbName)) print("SQLSTATE : " + str( ibm_db.conn_error()) ) print("SQLCODE :" + str( ibm_db.conn_errormsg()) ) else: print("Successfully connected to the database " + str(dsn_dbName)) print("SQLSTATE : " + str(ibm_db.conn_error()) ) print("SQLCODE :" + str(ibm_db.conn_errormsg()) )

For a successful connection, we would get

Successfully connected to the database SAMPLE SQLSTATE : SQLCODE :

For a failure, we would receive

Error in connecting to the database SAMPLE SQLSTATE : 08001 SQLCODE=-30081CLI Driver] SQL30081N A communication error has been detected. Communication protocol being used: "TCP/IP". Communication API being used: "SOCKETS". Location where the error was detected: "192.168.43.221". Communication function detecting the error: "connect". Protocol specific error code(s): "10061", "*", "*". SQLSTATE=08001 

Executing a SQL

Once we have established a connection to the database, the next statement would be to execute SQL. There are two methods available for SQL execution – exec_immediate & execute. If a SQL statement needs to be called for more than once with different parameters, consider using execute (along with ibm_db.prepare() function ). This helps in re-using the same access plan at the database server-side and hence optimizing the SQL execution process.

Let us go through the case of exec_immediate first . The results of an executed statement are stored in a result set (an object of type IBM_DBStatement). You can get the rows from this result set using one of the four available fetch functions – fetch_tuple, fetch_assoc, fetch_both and result API functions. These functions fetch one row at a time hence you would need to use a loop to fetch all the rows.

See the below code snippets for better understanding.

fetch_tuple()

sqlStmt="select WORKDEPT, sum(SALARY) SALARY_DEPT from db2inst1.EMPLOYEE group by WORKDEPT order by 1" sqlRes = ibm_db.exec_immediate(conn, sqlStmt) if (sqlRes is not None): row = ibm_db.fetch_tuple(sqlRes) while(row): print(row) row = ibm_db.fetch_tuple(sqlRes) Output ------------------------------------------------------ ('A00', '354250.00') ('B01', '94250.00') ('C01', '308890.00') ('D11', '646620.00') ('D21', '358680.00') ('E01', '80175.00') ('E11', '317140.00') ('E21', '282520.00') ------------------------------------------------------

In the above code, sqlRes is the result set that contains the output rows of the SQL. If the result set is not empty, we fetch one row at a time by placing a pointer and looping it till the pointer reaches the end of the result set.

fetch_tuple return one tuple at a time where the tuple consists of column rows.

fetch_assoc()

Now let us see the fetch_assoc output, It outputs one dictionary at a time where column names are the keys and the column values are the value part of the key-value dictionary.

sqlStmt="select WORKDEPT, sum(SALARY) SALARY_DEPT from db2inst1.EMPLOYEE group by WORKDEPT order by 1" sqlRes = ibm_db.exec_immediate(conn, sqlStmt) if (sqlRes is not None): row = ibm_db.fetch_assoc(sqlRes) while(row): print(row) row = ibm_db.fetch_assoc(sqlRes) Output ------------------------------------------------------        ------------------------------------------------------

fetch_both()

fetch_both is similar to fetch_assoc. Apart from the column names, it also provides the relative position of the column position in the result set. This will be more clear from the output below.

sqlStmt="select WORKDEPT, sum(SALARY) SALARY_DEPT from db2inst1.EMPLOYEE group by WORKDEPT order by 1" sqlRes = ibm_db.exec_immediate(conn, sqlStmt) if (sqlRes is not None): row = ibm_db.fetch_both(sqlRes) while(row): print(row) row = ibm_db.fetch_both(sqlRes) Output ------------------------------------------------------

fetch_row() & result()

fetch_row is an iterator that points to the next row in the result set when placed in a loop. The ibm_db.result() can be used to extract the row from the pointer location in the result set. It expects 2 arguments – the result set object and the 0-indexed position of the column to be fetched. The below example would give a better insight.

sqlStmt="select WORKDEPT, sum(SALARY) SALARY_DEPT from db2inst1.EMPLOYEE group by WORKDEPT order by 1" sqlRes = ibm_db.exec_immediate(conn, sqlStmt) if (sqlRes is not None): while(ibm_db.fetch_row(sqlRes)): print(ibm_db.result(sqlRes,0) ,",", ibm_db.result(sqlRes,1) ) Output ------------------------------------------------------ A00 , 354250.00 B01 , 94250.00 C01 , 308890.00 D11 , 646620.00 D21 , 358680.00 E01 , 80175.00 E11 , 317140.00 E21 , 282520.00 ------------------------------------------------------ 

Let us see how sqls which might need to be executed several times with different parameters can be executed using execute() API. This involves a two-step approach. First, we need to prepare the statement using prepare() API call. Next, variable(s) that will contain multiple values need to be

CODE
sqlStmt="select EMPNO from db2inst1.EMPLOYEE where WORKDEPT = ? and SEX = ? " sqlRes = ibm_db.prepare(conn, sqlStmt) DEPTCODE='D11' SEX='M' ibm_db.bind_param(sqlRes, 1, DEPTCODE) ibm_db.bind_param(sqlRes, 2, SEX) ibm_db.execute(sqlRes) if (sqlRes is not None): row = ibm_db.fetch_tuple(sqlRes) while(row): print(row) row = ibm_db.fetch_tuple(sqlRes) print("Next Search") DEPTCODE='E11' SEX='F' ibm_db.bind_param(sqlRes, 1, DEPTCODE) ibm_db.bind_param(sqlRes, 2, SEX) ibm_db.execute(sqlRes) if (sqlRes is not None): row = ibm_db.fetch_tuple(sqlRes) while(row): print(row) row = ibm_db.fetch_tuple(sqlRes) 
OUTPUT
('000060',) ('000150',) ('000170',) ('000190',) ('000200',) ('000210',) ('200170',) Next Search ('000090',) ('000280',) ('000310',) ('200280',) ('200310',)

Источник

Оцените статью