This class provides a way to convert a csv file into an in-memory database. More...

Inheritance diagram for db.csv_database.CSVDatabase:

Public Member Functions
def	__init__
	Constructor. More...

def	csv_insert
	Inserts a row of data straight from a csv file. More...

def	csv_select_by_index
	Selects data from the database. More...

def	get_db_col_names
	Fetches a list of all of the DB column names from the DB table. More...

def	csv_select_by_name
	Selects data from the database. More...

def	csv_update_by_index
	Generates and executes a SQL update statement. More...

def	csv_delete_by_index
	Generates and executes a SQL delete statement. More...

def	csv_delete_by_name
	Generates and executes a SQL delete statement. More...

def	csv_update_by_name
	Generates and executes a SQL update statement. More...

def	get_db_col_name

def	add_column
	Adds a named column to the this database's single table. More...

def	write_to_file
	Writes out the data in this CSVDatabase contains to a csv file. More...

def	set_blanks_to_null
	Sets all values in all columns to NULL if they are equal to the empty string. More...

Public Member Functions inherited from db.mem_database.MemDatabase
def	__init__
	Constructor. More...

def	dump_to_file
	Writes this in-memory database to a *.db file on disk. More...

Public Member Functions inherited from db.database.Database
def	__init__
	Constructor. More...

def	close
	Closes the database connection, releasing any resources that it was using. More...

def	execute_stmt
	Executes a verbatim SQL statement. More...

def	execute_script
	Reads and executes a sql script file. More...

def	insert
	Inserts one or more rows into a table in the database. More...

def	select
	Selects rows from a table in the database. More...

def	delete
	Deletes rows from a table in the database. More...

def	update_timestamp_col
	Updates the value of a timestamp column to the current date/time. More...

def	update
	Updates rows in a table. More...

def	write_to_file
	Writes out all of the data in a table to a csv file. More...

Static Public Member Functions
def	create_with_rows
	Convenience method to create a database from a list of rows that have already been read from a csv file. More...

def	create_with_files
	Convenience method to create a database from a bunch of files that all have the same columns (and column order). More...

Public Attributes
	name_lookup

	header_row

	col_datatypes

Static Public Attributes
string	TABLE_NAME = 'data'

string	COL_PREFIX = 'col'

dictionary	DATA_TYPES

tuple	ORDER_DIRS = Enum(['ASC', 'DESC'], ['ASC', 'DESC'])

Private Member Functions
def	_create_data_table
	Creates the single table used by this in-memory database. More...

def	_get_col_indices
	Gets a list of the indices of columns you specify by name. More...

def	_build_where_clause
	Translates CSV column names to DB column names to build a SQL where clause like 'x = y'. More...

def	_remove_cols
	Takes a row (list) of data and removes the specified columns from the list. More...

Detailed Description

This class provides a way to convert a csv file into an in-memory database.

This allows you to do SQL-operations on the data, which in some circumstances can be much more convenient than implementing search (query), sort, or data modification methods in straight Python. The cost for doing this is a little extra time inserting all the data - however, this cost isn't usually too bad because in-memory databases are up to an order of magnitude faster than file-based ones.

Sample usage:

 #create a list of data types corresponding to the format of (a row in) your csv file
 data_types = [str, int, float, bool, str, str, str]
 
 #this will create a single in-memory table with the contents of multiple csv files (files must have the same columns)
 db = CSVDatabase.create_with_files(['myFile1.csv', 'myFile2.csv'], data_types)
 
 #select Age, Gender from table where Age > 12
 rows = db.csv_select_by_name(col_names=['Age', 'Gender'],
                        where_body='%s > ?',
                        where_cols=['Age'], params=[12])
 
 #you can append new named columns to the db that do not exist in the csv file
 db.add_column('Extra column', bool)
 
 #See also csv_insert(), csv_delete_by_name(), csv_update_by_name()
 
 #You can write out the database to a csv file like this:
 db.write_to_file('path/to/database.db', omit_col_indices=[0]) #this will omit the first column from the exported spreadsheet

Definition at line 39 of file csv_database.py.

Constructor & Destructor Documentation

def db.csv_database.CSVDatabase.__init__	(	self,
		header_row,
		col_datatypes
	)

Constructor.

Parameters

self
header_row	(list of Strings) List of names of the columns in the csv file (generally this is just the top row of the csv file
col_datatypes	(list) List of data type functions (these must be keys from the static CSVDatabase.DATA_TYPES dictionary) corresponding to the columns in your csv file.

Definition at line 56 of file csv_database.py.

Member Function Documentation

def db.csv_database.CSVDatabase._build_where_clause	(	self,
		where_body,
		where_cols
	)

private

Translates CSV column names to DB column names to build a SQL where clause like 'x = y'.

This will presumably be attached to a SQL statement to form '... where x = y'.

Parameters

self
where_body	(String) body of the where clause, containing placeholders (s) for the column names (Eg. 's = ?')
where_cols	(list) list of csv names of columns. These will be translated to DB column names and placed in the where_body (these go in the s placeholders).

Returns: (String) the clause. This contains db column names.

Definition at line 204 of file csv_database.py.

def db.csv_database.CSVDatabase._create_data_table ( self )

private

Creates the single table used by this in-memory database.

Parameters

self

Definition at line 105 of file csv_database.py.

def db.csv_database.CSVDatabase._get_col_indices	(	self,
		col_names
	)

private

Gets a list of the indices of columns you specify by name.

Parameters

self
col_names	(list) list of column names (Strings)

Returns: (list) list of the indices that correspond to the column names you specified

Definition at line 190 of file csv_database.py.

def db.csv_database.CSVDatabase._remove_cols	(	self,
		row,
		col_indices
	)

private

Takes a row (list) of data and removes the specified columns from the list.

This is a convenience method and does not affect the data in the DB in any way.

Parameters

self
row	(list) a row of values from the database or csv file
col_indices	(list) list of the column indices that you want to remove from the row

Returns: (list) a new row that does not contain any of the columns specified using col_indices.

Definition at line 382 of file csv_database.py.

def db.csv_database.CSVDatabase.add_column	(	self,
		name,
		datatype
	)

Adds a named column to the this database's single table.

The values in the newly added column are all set to NULL.

Parameters

self
name	(string) name of the column to add.
datatype	(fcn) a data-type function (must be in the keys of the static CSVDatabase.DATA_TYPES dictionary) corresponding to the type of data that this column will hold

Definition at line 364 of file csv_database.py.

def db.csv_database.CSVDatabase.create_with_files	(	file_list,
		col_datatypes
	)

static

Convenience method to create a database from a bunch of files that all have the same columns (and column order).

Note: This method will complain if it finds that any of the files you pass it have different column names.

Parameters

file_list	(List) list of paths (Strings) to the files whose data you want to put into the new database
col_datatypes	(list) List of data type functions (these must be keys from the static CSVDatabase.DATA_TYPES dictionary) corresponding to the columns in your csv file.

Returns: (CSVDatabase) a CSVDatabase object containing the rows from all of the files. This won't have a primary key, so if the data is large you may want to add some indices before you start querying.

Definition at line 82 of file csv_database.py.

def db.csv_database.CSVDatabase.create_with_rows	(	rows,
		col_datatypes
	)

static

Convenience method to create a database from a list of rows that have already been read from a csv file.

Parameters

rows	(list) List of lists, where inner lists are rows of csv data. First row must be column names.
col_datatypes	(list) List of data type functions (these must be keys from the static CSVDatabase.DATA_TYPES dictionary) corresponding to the columns in your csv file.

Definition at line 68 of file csv_database.py.

def db.csv_database.CSVDatabase.csv_delete_by_index	(	self,
		where_cond = `None`,
		params = `[]`
	)

Generates and executes a SQL delete statement.

This method uses the indices of the columns in the csv file. See csv_select_by_index() for a description of the parameters.

Parameters

self
where_cond	(String=None)
params	(list)

Returns: (int) number of rows deleted

Definition at line 300 of file csv_database.py.

def db.csv_database.CSVDatabase.csv_delete_by_name	(	self,
		where_body = `None`,
		where_cols = `None`,
		params = `[]`
	)

Generates and executes a SQL delete statement.

This method uses the names of the columns in the csv file. See csv_select_by_name() for a description of the parameters.

Parameters

self
where_body	(String=None)
where_cols	(list=None)
params	(list)

Returns: (int) number of rows deleted

Definition at line 318 of file csv_database.py.

def db.csv_database.CSVDatabase.csv_insert	(	self,
		csv_row
	)

Inserts a row of data straight from a csv file.

The data must have the types given in the col_datatypes param that was passed to the constructor or this will cause an exception.

Parameters

self
csv_row	(list) list of data from a csv file (corresponding to a single row)

Returns: (list) list of all the primary key values of the inserted rows

Definition at line 118 of file csv_database.py.

def db.csv_database.CSVDatabase.csv_select_by_index	(	self,
		col_indices = `None`,
		where_cond = `None`,
		params = `[]`,
		order_by_indices = `None`,
		order_by_dirs = `None`,
		dump_sql = `False`,
		fcn_indices = `None`,
		fcns = `None`,
		group_by_index = `None`
	)

Selects data from the database.

This method works using the indices of the columns in the CSV file that you'd like to select.

Parameters

col_indices	(list=None) a list of the indices of the columns you'd like to select.
where_cond	(String=None) see database.select()
params	(list=[]) see database.select()
order_by_indices	(list=None) list of indices of columns you'd like to order by
order_by_dirs	(list=None) list of strings (each must be either 'ASC', 'DESC') that indicate the direction in which you want to order the columns specified by order_by_indices.
dump_sql	(boolean=False) if True, prints the generated SQL statement to stdout
fcn_indices	(list=None) list of indices of columns you'd like to apply SQL aggregate functions to ('avg', 'sum', etc.).
fcns	(list=None) list of Strings (SQL functions like 'avg', 'sum', etc.) to apply to the columns specified by fcn_indices
group_by_index	(int=None) index of column you'd like to group by

Returns: (list) list of lists, one sub-list per row. Each sublist contains data in the order specified by col_indices.

Definition at line 146 of file csv_database.py.

def db.csv_database.CSVDatabase.csv_select_by_name	(	self,
		col_names = `None`,
		where_body = `None`,
		where_cols = `None`,
		params = `[]`,
		order_by = `None`,
		order_by_dirs = `None`,
		dump_sql = `False`,
		fcn_col_names = `None`,
		fcns = `None`,
		group_by = `None`
	)

Selects data from the database.

This method works using the CSV names of the columns you want to select.

Parameters

col_names	(list=None) a list of the csv names of the columns you'd like to select.
where_cond	(String=None) a string like ('s = ?'). This will be translated into 'where age = 14' using the where_cols and params parameters (see below).
where_cols	(list) list of csv names of columns to put in the s placeholders in where_cond.
params	(list=[]) see database.select(). These values will be placed in the ? placeholders in where_cond.
order_by	(list=None) list of csv names of columns you'd like to order by
order_by_dirs	(list=None) list of strings (each must be either 'ASC', 'DESC') that indicate the direction in which you want to order the columns specified by order_by.
dump_sql	(boolean=False) if True, prints the generated SQL statement to stdout
fcn_col_names	(list=None) list of csv names of columns you'd like to apply SQL aggregate functions to ('avg', 'sum', etc.).
fcns	(list=None) list of Strings (SQL functions like 'avg', 'sum', etc.) to apply to the columns specified by fcn_col_names
group_by	(int=None) csv name of column you'd like to group by

Returns: (list) list of lists, one sub-list per row. Each sublist contains data in the order specified by col_names.

Definition at line 247 of file csv_database.py.

def db.csv_database.CSVDatabase.csv_update_by_index	(	self,
		col_indices,
		where_cond = `None`,
		params = `[]`,
		dump_sql = `False`
	)

Generates and executes a SQL update statement.

This method uses the indices of the columns in the csv file. See csv_select_by_index() for a description of the parameters.

Parameters

self
col_indices	(list)
where_cond	(String=None)
params	(list)
dump_sql	(boolean=False)

Returns: (int) number of rows updated

Definition at line 280 of file csv_database.py.

def db.csv_database.CSVDatabase.csv_update_by_name	(	self,
		col_names,
		where_body = `None`,
		where_cols = `None`,
		params = `[]`,
		dump_sql = `False`
	)

Generates and executes a SQL update statement.

This method uses the names of the columns in the csv file. See csv_select_by_name() for a description of the parameters. Sample usage: col_names = ['File_Name', 'Child_Gender'] where_body = 's > ? and s > ?' where_cols = ('FAN, 'CHN') params = (23.3, 45.5, 20.0, 10.0) Using these parameters will generate the following SQL: 'update data set col0=23.3, col3=45.5 where col10 > 20.0 and col11 > 10.0;' Assuming 'File_Name' -> col0, 'Child_Gender' -> col3, 'FAN' -> col10, 'CHN' -> col11.

Parameters

self
where_body	(String=None)
where_cols	(list=None)
params	(list)
dump_sql	(boolean=False)

Definition at line 340 of file csv_database.py.

def db.csv_database.CSVDatabase.get_db_col_name	(	self,
		text_name
	)

Definition at line 356 of file csv_database.py.

def db.csv_database.CSVDatabase.get_db_col_names ( self )

Fetches a list of all of the DB column names from the DB table.

Parameters

self

Returns: (list)

Definition at line 217 of file csv_database.py.

def db.csv_database.CSVDatabase.set_blanks_to_null ( self )

Sets all values in all columns to NULL if they are equal to the empty string.

Parameters

self

Definition at line 414 of file csv_database.py.

def db.csv_database.CSVDatabase.write_to_file	(	self,
		path,
		omit_col_indices = `[]`
	)

Writes out the data in this CSVDatabase contains to a csv file.

A header row (column names) will be written at the top of the file.

Parameters

self
path	(String) path to the csv file to write to
omit_col_indices	(list=[]) list of indices of columns that you want to omit from the exported csv file.

Definition at line 395 of file csv_database.py.

Member Data Documentation

db.csv_database.CSVDatabase.col_datatypes

Definition at line 61 of file csv_database.py.

string db.csv_database.CSVDatabase.COL_PREFIX = 'col'

static

Definition at line 41 of file csv_database.py.

dictionary db.csv_database.CSVDatabase.DATA_TYPES

static

Initial value:

 = {
         str: 'TEXT',
         float: 'REAL',
         int: 'INTEGER',
         bool: 'INTEGER', #sqlite does not have a separate boolean type. It's no trouble to convert these to integers for DB operations.
         datetime: 'TEXT', #sqlite doesn't have a datetime type, but provides functions that allows us to interpret strings as dates.
         #These should be entered in the form 'yyyy-mm-dd hh:mm:ss.sss'
         }

Definition at line 42 of file csv_database.py.

db.csv_database.CSVDatabase.header_row

Definition at line 60 of file csv_database.py.

db.csv_database.CSVDatabase.name_lookup

Definition at line 59 of file csv_database.py.

tuple db.csv_database.CSVDatabase.ORDER_DIRS = Enum(['ASC', 'DESC'], ['ASC', 'DESC'])

static

Definition at line 50 of file csv_database.py.

string db.csv_database.CSVDatabase.TABLE_NAME = 'data'

static

Definition at line 40 of file csv_database.py.

The documentation for this class was generated from the following file:

C:/Users/Wayne/Documents/baby-lab/bll_app/db/csv_database.py

Public Member Functions

Static Public Member Functions

Public Attributes

Static Public Attributes

Private Member Functions

Detailed Description

Constructor & Destructor Documentation

Member Function Documentation

Member Data Documentation