API reference#

sqlite_utils.db.Database#

class sqlite_utils.db.Database(filename_or_conn=None, memory=False, memory_name=None, recreate=False, recursive_triggers=True, tracer=None, use_counts_table=False)#

Wrapper for a SQLite database connection that adds a variety of useful utility methods.

To create an instance:

# create data.db file, or open existing:
db = Database("data.db")
# Create an in-memory database:
dB = Database(memory=True)
Parameters:
  • filename_or_conn (Union[str, Path, Connection]) – String path to a file, or a pathlib.Path object, or a sqlite3 connection

  • memory (bool) – set to True to create an in-memory database

  • memory_name (str) – creates a named in-memory database that can be shared across multiple connections

  • recreate (bool) – set to True to delete and recreate a file database (dangerous)

  • recursive_triggers (bool) – defaults to True, which sets PRAGMA recursive_triggers=on; - set to False to avoid setting this pragma

  • tracer (Callable) – set a tracer function (print works for this) which will be called with sql, parameters every time a SQL query is executed

  • use_counts_table (bool) – set to True to use a cached counts table, if available. See Cached table counts using triggers

tracer(tracer=None)#

Context manager to temporarily set a tracer function - all executed SQL queries will be passed to this.

The tracer function should accept two arguments: sql and parameters

Example usage:

with db.tracer(print):
    db["creatures"].insert({"name": "Cleo"})

See Tracing queries.

Parameters:

tracer (Callable) – Callable accepting sql and parameters arguments

__getitem__(table_name)#

db[table_name] returns a Table object for the table with the specified name. If the table does not exist yet it will be created the first time data is inserted into it.

Parameters:

table_name (str) – The name of the table

Return type:

Union[Table, View]

register_function(fn=None, deterministic=False, replace=False)#

fn will be made available as a function within SQL, with the same name and number of arguments. Can be used as a decorator:

@db.register
def upper(value):
    return str(value).upper()

The decorator can take arguments:

@db.register(deterministic=True, replace=True)
def upper(value):
    return str(value).upper()

See Registering custom SQL functions.

Parameters:
  • fn (Optional[Callable]) – Function to register

  • deterministic (bool) – set True for functions that always returns the same output for a given input

  • replace (bool) – set True to replace an existing function with the same name - otherwise throw an error

register_fts4_bm25()#

Register the rank_bm25(match_info) function used for calculating relevance with SQLite FTS4.

attach(alias, filepath)#

Attach another SQLite database file to this connection with the specified alias, equivalent to:

ATTACH DATABASE 'filepath.db' AS alias
Parameters:
  • alias (str) – Alias name to use

  • filepath (Union[str, Path]) – Path to SQLite database file on disk

query(sql, params=None)#

Execute sql and return an iterable of dictionaries representing each row.

Parameters:
  • sql (str) – SQL query to execute

  • params (Optional[Union[Iterable, dict]]) – Parameters to use in that query - an iterable for where id = ? parameters, or a dictionary for where id = :id

Return type:

Generator[dict, None, None]

execute(sql, parameters=None)#

Execute SQL query and return a sqlite3.Cursor.

Parameters:
  • sql (str) – SQL query to execute

  • parameters (Optional[Union[Iterable, dict]]) – Parameters to use in that query - an iterable for where id = ? parameters, or a dictionary for where id = :id

Return type:

Cursor

executescript(sql)#

Execute multiple SQL statements separated by ; and return the sqlite3.Cursor.

Parameters:

sql (str) – SQL to execute

Return type:

Cursor

table(table_name, **kwargs)#

Return a table object, optionally configured with default options.

Parameters:

table_name (str) – Name of the table

Return type:

Union[Table, View]

quote(value)#

Apply SQLite string quoting to a value, including wrappping it in single quotes.

Parameters:

value (str) – String to quote

Return type:

str

quote_fts(query)#

Escape special characters in a SQLite full-text search query.

This works by surrounding each token within the query with double quotes, in order to avoid words like NOT and OR having special meaning as defined by the FTS query syntax here:

https://www.sqlite.org/fts5.html#full_text_query_syntax

If the query has unbalanced " characters, adds one at end.

Parameters:

query (str) – String to escape

Return type:

str

table_names(fts4=False, fts5=False)#

List of string table names in this database.

Parameters:
  • fts4 (bool) – Only return tables that are part of FTS4 indexes

  • fts5 (bool) – Only return tables that are part of FTS5 indexes

Return type:

List[str]

view_names()#

List of string view names in this database.

Return type:

List[str]

property tables: List[Table]#

List of Table objects in this database.

property views: List[View]#

List of View objects in this database.

property triggers: List[Trigger]#

List of (name, table_name, sql) tuples representing triggers in this database.

property triggers_dict: Dict[str, str]#

A {trigger_name: sql} dictionary of triggers in this database.

property schema: str#

SQL schema for this database.

property supports_strict: bool#

Does this database support STRICT mode?

property sqlite_version: Tuple[int, ...]#

Version of SQLite, as a tuple of integers for example (3, 36, 0).

property journal_mode: str#

Current journal_mode of this database.

https://www.sqlite.org/pragma.html#pragma_journal_mode

enable_wal()#

Sets journal_mode to 'wal' to enable Write-Ahead Log mode.

disable_wal()#

Sets journal_mode back to 'delete' to disable Write-Ahead Log mode.

enable_counts()#

Enable trigger-based count caching for every table in the database, see Cached table counts using triggers.

cached_counts(tables=None)#

Return {table_name: count} dictionary of cached counts for specified tables, or all tables if tables not provided.

Parameters:

tables (Optional[Iterable[str]]) – Subset list of tables to return counts for.

Return type:

Dict[str, int]

reset_counts()#

Re-calculate cached counts for tables.

create_table_sql(name, columns, pk=None, foreign_keys=None, column_order=None, not_null=None, defaults=None, hash_id=None, hash_id_columns=None, extracts=None, if_not_exists=False)#

Returns the SQL CREATE TABLE statement for creating the specified table.

Parameters:
  • name (str) – Name of table

  • columns (Dict[str, Any]) – Dictionary mapping column names to their types, for example {"name": str, "age": int}

  • pk (Optional[Any]) – String name of column to use as a primary key, or a tuple of strings for a compound primary key covering multiple columns

  • foreign_keys (Optional[Union[Iterable[str], Iterable[ForeignKey], Iterable[Tuple[str, str]], Iterable[Tuple[str, str, str]], Iterable[Tuple[str, str, str, str]]]]) – List of foreign key definitions for this table

  • column_order (Optional[List[str]]) – List specifying which columns should come first

  • not_null (Optional[Iterable[str]]) – List of columns that should be created as NOT NULL

  • defaults (Optional[Dict[str, Any]]) – Dictionary specifying default values for columns

  • hash_id (Optional[str]) – Name of column to be used as a primary key containing a hash of the other columns

  • hash_id_columns (Optional[Iterable[str]]) – List of columns to be used when calculating the hash ID for a row

  • extracts (Optional[Union[Dict[str, str], List[str]]]) – List or dictionary of columns to be extracted during inserts, see Populating lookup tables automatically during insert/upsert

  • if_not_exists (bool) – Use CREATE TABLE IF NOT EXISTS

Return type:

str

create_table(name, columns, pk=None, foreign_keys=None, column_order=None, not_null=None, defaults=None, hash_id=None, hash_id_columns=None, extracts=None, if_not_exists=False)#

Create a table with the specified name and the specified {column_name: type} columns.

See Explicitly creating a table.

Parameters:
  • name (str) – Name of table

  • columns (Dict[str, Any]) – Dictionary mapping column names to their types, for example {"name": str, "age": int}

  • pk (Optional[Any]) – String name of column to use as a primary key, or a tuple of strings for a compound primary key covering multiple columns

  • foreign_keys (Optional[Union[Iterable[str], Iterable[ForeignKey], Iterable[Tuple[str, str]], Iterable[Tuple[str, str, str]], Iterable[Tuple[str, str, str, str]]]]) – List of foreign key definitions for this table

  • column_order (Optional[List[str]]) – List specifying which columns should come first

  • not_null (Optional[Iterable[str]]) – List of columns that should be created as NOT NULL

  • defaults (Optional[Dict[str, Any]]) – Dictionary specifying default values for columns

  • hash_id (Optional[str]) – Name of column to be used as a primary key containing a hash of the other columns

  • hash_id_columns (Optional[Iterable[str]]) – List of columns to be used when calculating the hash ID for a row

  • extracts (Optional[Union[Dict[str, str], List[str]]]) – List or dictionary of columns to be extracted during inserts, see Populating lookup tables automatically during insert/upsert

  • if_not_exists (bool) – Use CREATE TABLE IF NOT EXISTS

Return type:

Table

create_view(name, sql, ignore=False, replace=False)#

Create a new SQL view with the specified name - sql should start with SELECT ....

Parameters:
  • name (str) – Name of the view

  • sql (str) – SQL SELECT query to use for this view.

  • ignore (bool) – Set to True to do nothing if a view with this name already exists

  • replace (bool) – Set to True to replace the view if one with this name already exists

m2m_table_candidates(table, other_table)#

Given two table names returns the name of tables that could define a many-to-many relationship between those two tables, based on having foreign keys to both of the provided tables.

Parameters:
  • table (str) – Table name

  • other_table (str) – Other table name

Return type:

List[str]

add_foreign_keys(foreign_keys)#

See Adding multiple foreign key constraints at once.

Parameters:

foreign_keys (Iterable[Tuple[str, str, str, str]]) – A list of (table, column, other_table, other_column) tuples

index_foreign_keys()#

Create indexes for every foreign key column on every table in the database.

vacuum()#

Run a SQLite VACUUM against the database.

analyze(name=None)#

Run ANALYZE against the entire database or a named table or index.

Parameters:

name – Run ANALYZE against this specific named table or index

init_spatialite(path=None)#

The init_spatialite method will load and initialize the SpatiaLite extension. The path argument should be an absolute path to the compiled extension, which can be found using find_spatialite.

Returns True if SpatiaLite was successfully initialized.

from sqlite_utils.db import Database
from sqlite_utils.utils import find_spatialite

db = Database("mydb.db")
db.init_spatialite(find_spatialite())

If you’ve installed SpatiaLite somewhere unexpected (for testing an alternate version, for example) you can pass in an absolute path:

from sqlite_utils.db import Database
from sqlite_utils.utils import find_spatialite

db = Database("mydb.db")
db.init_spatialite("./local/mod_spatialite.dylib")
Parameters:

path (Optional[str]) – Path to SpatiaLite module on disk

Return type:

bool

sqlite_utils.db.Queryable#

Table and View are both subclasses of Queryable, providing access to the following methods:

class sqlite_utils.db.Queryable(db, name)#
exists()#

Does this table or view exist yet?

Return type:

bool

count_where(where=None, where_args=None)#

Executes SELECT count(*) FROM table WHERE ... and returns a count.

Parameters:
  • where (Optional[str]) – SQL where fragment to use, for example id > ?

  • where_args (Optional[Union[Iterable, dict]]) – Parameters to use with that fragment - an iterable for id > ? parameters, or a dictionary for id > :id

Return type:

int

property count: int#

A count of the rows in this table or view.

property rows: Generator[dict, None, None]#

Iterate over every dictionaries for each row in this table or view.

rows_where(where=None, where_args=None, order_by=None, select='*', limit=None, offset=None)#

Iterate over every row in this table or view that matches the specified where clause.

Returns each row as a dictionary. See Listing rows for more details.

Parameters:
  • where (Optional[str]) – SQL where fragment to use, for example id > ?

  • where_args (Optional[Union[Iterable, dict]]) – Parameters to use with that fragment - an iterable for id > ? parameters, or a dictionary for id > :id

  • order_by (Optional[str]) – Column or fragment of SQL to order by

  • select (str) – Comma-separated list of columns to select - defaults to *

  • limit (Optional[int]) – Integer number of rows to limit to

  • offset (Optional[int]) – Integer for SQL offset

Return type:

Generator[dict, None, None]

pks_and_rows_where(where=None, where_args=None, order_by=None, limit=None, offset=None)#

Like .rows_where() but returns (pk, row) pairs - pk can be a single value or tuple.

Parameters:
  • where (Optional[str]) – SQL where fragment to use, for example id > ?

  • where_args (Optional[Union[Iterable, dict]]) – Parameters to use with that fragment - an iterable for id > ? parameters, or a dictionary for id > :id

  • order_by (Optional[str]) – Column or fragment of SQL to order by

  • select – Comma-separated list of columns to select - defaults to *

  • limit (Optional[int]) – Integer number of rows to limit to

  • offset (Optional[int]) – Integer for SQL offset

Return type:

Generator[Tuple[Any, Dict], None, None]

property columns: List[Column]#

List of Columns representing the columns in this table or view.

property columns_dict: Dict[str, Any]#

{column_name: python-type} dictionary representing columns in this table or view.

property schema: str#

SQL schema for this table or view.

sqlite_utils.db.Table#

class sqlite_utils.db.Table(db, name, pk=None, foreign_keys=None, column_order=None, not_null=None, defaults=None, batch_size=100, hash_id=None, hash_id_columns=None, alter=False, ignore=False, replace=False, extracts=None, conversions=None, columns=None)#

Bases: Queryable

Tables should usually be initialized using the db.table(table_name) or db[table_name] methods.

Parameters:
  • db (Database) –

  • name (str) –

  • pk (Optional[Any]) –

  • foreign_keys (Optional[Union[Iterable[str], Iterable[ForeignKey], Iterable[Tuple[str, str]], Iterable[Tuple[str, str, str]], Iterable[Tuple[str, str, str, str]]]]) –

  • column_order (Optional[List[str]]) –

  • not_null (Iterable[str]) –

  • defaults (Optional[Dict[str, Any]]) –

  • batch_size (int) –

  • hash_id (Optional[str]) –

  • hash_id_columns (Optional[Iterable[str]]) –

  • alter (bool) –

  • ignore (bool) –

  • replace (bool) –

  • extracts (Optional[Union[Dict[str, str], List[str]]]) –

  • conversions (Optional[dict]) –

  • columns (Optional[Dict[str, Any]]) –

last_rowid: Optional[int] = None#

The rowid of the last inserted, updated or selected row.

last_pk: Optional[Any] = None#

The primary key of the last inserted, updated or selected row.

property count: int#

Count of the rows in this table - optionally from the table count cache, if configured.

exists()#

Does this table or view exist yet?

Return type:

bool

property pks: List[str]#

Primary key columns for this table.

property use_rowid: bool#

Does this table use rowid for its primary key (no other primary keys are specified)?

get(pk_values)#

Return row (as dictionary) for the specified primary key.

Raises sqlite_utils.db.NotFoundError if a matching row cannot be found.

Parameters:

pk_values (Union[list, tuple, str, int]) – A single value, or a tuple of values for tables that have a compound primary key

Return type:

dict

property foreign_keys: List[ForeignKey]#

List of foreign keys defined on this table.

property virtual_table_using: Optional[str]#

Type of virtual table, or None if this is not a virtual table.

property indexes: List[Index]#

List of indexes defined on this table.

property xindexes: List[XIndex]#

List of indexes defined on this table using the more detailed XIndex format.

property triggers: List[Trigger]#

List of triggers defined on this table.

property triggers_dict: Dict[str, str]#

{trigger_name: sql} dictionary of triggers defined on this table.

property strict: bool#

Is this a STRICT table?

create(columns, pk=None, foreign_keys=None, column_order=None, not_null=None, defaults=None, hash_id=None, hash_id_columns=None, extracts=None, if_not_exists=False)#

Create a table with the specified columns.

See Explicitly creating a table for full details.

Parameters:
  • columns (Dict[str, Any]) – Dictionary mapping column names to their types, for example {"name": str, "age": int}

  • pk (Optional[Any]) – String name of column to use as a primary key, or a tuple of strings for a compound primary key covering multiple columns

  • foreign_keys (Optional[Union[Iterable[str], Iterable[ForeignKey], Iterable[Tuple[str, str]], Iterable[Tuple[str, str, str]], Iterable[Tuple[str, str, str, str]]]]) – List of foreign key definitions for this table

  • column_order (Optional[List[str]]) – List specifying which columns should come first

  • not_null (Optional[Iterable[str]]) – List of columns that should be created as NOT NULL

  • defaults (Optional[Dict[str, Any]]) – Dictionary specifying default values for columns

  • hash_id (Optional[str]) – Name of column to be used as a primary key containing a hash of the other columns

  • hash_id_columns (Optional[Iterable[str]]) – List of columns to be used when calculating the hash ID for a row

  • extracts (Optional[Union[Dict[str, str], List[str]]]) – List or dictionary of columns to be extracted during inserts, see Populating lookup tables automatically during insert/upsert

  • if_not_exists (bool) – Use CREATE TABLE IF NOT EXISTS

Return type:

Table

transform(*, types=None, rename=None, drop=None, pk=<sqlite_utils.db.Default object>, not_null=None, defaults=None, drop_foreign_keys=None, column_order=None)#

Apply an advanced alter table, including operations that are not supported by ALTER TABLE in SQLite itself.

See Transforming a table for full details.

Parameters:
  • types (Optional[dict]) – Columns that should have their type changed, for example {"weight": float}

  • rename (Optional[dict]) – Columns to rename, for example {"headline": "title"}

  • drop (Optional[Iterable]) – Columns to drop

  • pk (Optional[Any]) – New primary key for the table

  • not_null (Optional[Set[str]]) – Columns to set as NOT NULL

  • defaults (Optional[Dict[str, Any]]) – Default values for columns

  • drop_foreign_keys (Optional[Iterable]) – Names of columns that should have their foreign key constraints removed

  • column_order (Optional[List[str]]) – List of strings specifying a full or partial column order to use when creating the table.

Return type:

Table

transform_sql(*, types=None, rename=None, drop=None, pk=<sqlite_utils.db.Default object>, not_null=None, defaults=None, drop_foreign_keys=None, column_order=None, tmp_suffix=None)#

Return a list of SQL statements that should be executed in order to apply this transformation.

Parameters:
  • types – Columns that should have their type changed, for example {"weight": float}

  • rename – Columns to rename, for example {"headline": "title"}

  • drop – Columns to drop

  • pk – New primary key for the table

  • not_null – Columns to set as NOT NULL

  • defaults – Default values for columns

  • drop_foreign_keys – Names of columns that should have their foreign key constraints removed

  • column_order – List of strings specifying a full or partial column order to use when creating the table.

Return type:

List[str]

extract(columns, table=None, fk_column=None, rename=None)#

Extract specified columns into a separate table.

See Extracting columns into a separate table for details.

Parameters:
  • columns (Union[str, Iterable[str]]) – Single column or list of columns that should be extracted

  • table (Optional[str]) – Name of table in which the new records should be created

  • fk_column (Optional[str]) – Name of the foreign key column to populate in the original table

  • rename (Optional[Dict[str, str]]) – Dictionary of columns that should be renamed when populating the new table

Return type:

Table

create_index(columns, index_name=None, unique=False, if_not_exists=False, find_unique_name=False, analyze=False)#

Create an index on this table.

Parameters:
  • columns (Iterable[Union[str, DescIndex]]) – A single columns or list of columns to index. These can be strings or, to create an index using the column in descending order, db.DescIndex(column_name) objects.

  • index_name (Optional[str]) – The name to use for the new index. Defaults to the column names joined on _.

  • unique (bool) – Should the index be marked as unique, forcing unique values?

  • if_not_exists (bool) – Only create the index if one with that name does not already exist.

  • find_unique_name (bool) – If index_name is not provided and the automatically derived name already exists, keep incrementing a suffix number to find an available name.

  • analyze (bool) – Run ANALYZE against this index after creating it.

See Creating indexes.

add_column(col_name, col_type=None, fk=None, fk_col=None, not_null_default=None)#

Add a column to this table. See Adding columns.

Parameters:
  • col_name (str) – Name of the new column

  • col_type (Optional[Any]) – Column type - a Python type such as str or a SQLite type string such as "BLOB"

  • fk (Optional[str]) – Name of a table that this column should be a foreign key reference to

  • fk_col (Optional[str]) – Column in the foreign key table that this should reference

  • not_null_default (Optional[Any]) – Set this column to not null and give it this default value

drop(ignore=False)#

Drop this table.

Parameters:

ignore (bool) – Set to True to ignore the error if the table does not exist

guess_foreign_table(column)#

For a given column, suggest another table that might be referenced by this column should it be used as a foreign key.

For example, a column called tag_id or tag or tags might suggest a tag table, if one exists.

If no candidates can be found, raises a NoObviousTable exception.

Parameters:

column (str) – Name of column

Return type:

str

add_foreign_key(column, other_table=None, other_column=None, ignore=False)#

Alter the schema to mark the specified column as a foreign key to another table.

Parameters:
  • column (str) – The column to mark as a foreign key.

  • other_table (Optional[str]) – The table it refers to - if omitted, will be guessed based on the column name.

  • other_column (Optional[str]) – The column on the other table it - if omitted, will be guessed.

  • ignore (bool) – Set this to True to ignore an existing foreign key - otherwise a AlterError will be raised.

enable_counts()#

Set up triggers to update a cache of the count of rows in this table.

See Cached table counts using triggers for details.

property has_counts_triggers: bool#

Does this table have triggers setup to update cached counts?

enable_fts(columns, fts_version='FTS5', create_triggers=False, tokenize=None, replace=False)#

Enable SQLite full-text search against the specified columns.

See Full-text search for more details.

Parameters:
  • columns (Iterable[str]) – List of column names to include in the search index.

  • fts_version (str) – FTS version to use - defaults to FTS5 but you may want FTS4 for older SQLite versions.

  • create_triggers (bool) – Should triggers be created to keep the search index up-to-date? Defaults to False.

  • tokenize (Optional[str]) – Custom SQLite tokenizer to use, for example "porter" to enable Porter stemming.

  • replace (bool) – Should any existing FTS index for this table be replaced by the new one?

populate_fts(columns)#

Update the associated SQLite full-text search index with the latest data from the table for the specified columns.

Parameters:

columns (Iterable[str]) – Columns to populate the data for

Return type:

Table

disable_fts()#

Remove any full-text search index and related triggers configured for this table.

Return type:

Table

rebuild_fts()#

Run the rebuild operation against the associated full-text search index table.

detect_fts()#

Detect if table has a corresponding FTS virtual table and return it

Return type:

Optional[str]

optimize()#

Run the optimize operation against the associated full-text search index table.

Return type:

Table

search_sql(columns=None, order_by=None, limit=None, offset=None, where=None)#

” Return SQL string that can be used to execute searches against this table.

Parameters:
  • columns (Optional[Iterable[str]]) – Columns to search against

  • order_by (Optional[str]) – Column or SQL expression to sort by

  • limit (Optional[int]) – SQL limit

  • offset (Optional[int]) – SQL offset

  • where (Optional[str]) – Extra SQL fragment for the WHERE clause

Return type:

str

search(q, order_by=None, columns=None, limit=None, offset=None, where=None, where_args=None, quote=False)#

Execute a search against this table using SQLite full-text search, returning a sequence of dictionaries for each row.

Parameters:
  • q (str) – Terms to search for

  • order_by (Optional[str]) – Defaults to order by rank, or specify a column here.

  • columns (Optional[Iterable[str]]) – List of columns to return, defaults to all columns.

  • limit (Optional[int]) – Optional integer limit for returned rows.

  • offset (Optional[int]) – Optional integer SQL offset.

  • where (Optional[str]) – Extra SQL fragment for the WHERE clause

  • where_args (Optional[Union[Iterable, dict]]) – Arguments to use for :param placeholders in the extra WHERE clause

  • quote (bool) – Apply quoting to disable any special characters in the search query

Return type:

Generator[dict, None, None]

See Searching with table.search().

delete(pk_values)#

Delete row matching the specified primary key.

Parameters:

pk_values (Union[list, tuple, str, int, float]) – A single value, or a tuple of values for tables that have a compound primary key

Return type:

Table

delete_where(where=None, where_args=None, analyze=False)#

Delete rows matching the specified where clause, or delete all rows in the table.

See Deleting multiple records.

Parameters:
  • where (Optional[str]) – SQL where fragment to use, for example id > ?

  • where_args (Optional[Union[Iterable, dict]]) – Parameters to use with that fragment - an iterable for id > ? parameters, or a dictionary for id > :id

  • analyze (bool) – Set to True to run ANALYZE after the rows have been deleted.

Return type:

Table

update(pk_values, updates=None, alter=False, conversions=None)#

Execute a SQL UPDATE against the specified row.

See Updating a specific record.

Parameters:
  • pk_values (Union[list, tuple, str, int, float]) – The primary key of an individual record - can be a tuple if the table has a compound primary key.

  • updates (Optional[dict]) – A dictionary mapping columns to their updated values.

  • alter (bool) – Set to True to add any missing columns.

  • conversions (Optional[dict]) – Optional dictionary of SQL functions to apply during the update, for example {"mycolumn": "upper(?)"}.

Return type:

Table

convert(columns, fn, output=None, output_type=None, drop=False, multi=False, where=None, where_args=None, show_progress=False)#

Apply conversion function fn to every value in the specified columns.

Parameters:
  • columns (Union[str, List[str]]) – A single column or list of string column names to convert.

  • fn (Callable) – A callable that takes a single argument, value, and returns it converted.

  • output (Optional[str]) – Optional string column name to write the results to (defaults to the input column).

  • output_type (Optional[Any]) – If the output column needs to be created, this is the type that will be used for the new column.

  • drop (bool) – Should the original column be dropped once the conversion is complete?

  • multi (bool) – If True the return value of fn(value) will be expected to be a dictionary, and new columns will be created for each key of that dictionary.

  • where (Optional[str]) – SQL fragment to use as a WHERE clause to limit the rows to which the conversion is applied, for example age > ? or age > :age.

  • where_args (Optional[Union[Iterable, dict]]) – List of arguments (if using ?) or a dictionary (if using :age).

  • show_progress (bool) – Should a progress bar be displayed?

See Converting data in columns.

insert(record, pk=<sqlite_utils.db.Default object>, foreign_keys=<sqlite_utils.db.Default object>, column_order=<sqlite_utils.db.Default object>, not_null=<sqlite_utils.db.Default object>, defaults=<sqlite_utils.db.Default object>, hash_id=<sqlite_utils.db.Default object>, hash_id_columns=<sqlite_utils.db.Default object>, alter=<sqlite_utils.db.Default object>, ignore=<sqlite_utils.db.Default object>, replace=<sqlite_utils.db.Default object>, extracts=<sqlite_utils.db.Default object>, conversions=<sqlite_utils.db.Default object>, columns=<sqlite_utils.db.Default object>)#

Insert a single record into the table. The table will be created with a schema that matches the inserted record if it does not already exist, see Creating tables.

  • record - required: a dictionary representing the record to be inserted.

The other parameters are optional, and mostly influence how the new table will be created if that table does not exist yet.

Each of them defaults to DEFAULT, which indicates that the default setting for the current Table object (specified in the table constructor) should be used.

Parameters:
  • record (Dict[str, Any]) – Dictionary record to be inserted

  • pk – If creating the table, which column should be the primary key.

  • foreign_keys – See Specifying foreign keys.

  • column_order (Optional[Union[List[str], Default]]) – List of strings specifying a full or partial column order to use when creating the table.

  • not_null (Optional[Union[Set[str], Default]]) – Set of strings specifying columns that should be NOT NULL.

  • defaults (Optional[Union[Dict[str, Any], Default]]) – Dictionary specifying default values for specific columns.

  • hash_id (Optional[Union[str, Default]]) – Name of a column to create and use as a primary key, where the value of thet primary key will be derived as a SHA1 hash of the other column values in the record. hash_id="id" is a common column name used for this.

  • alter (Optional[Union[bool, Default]]) – Boolean, should any missing columns be added automatically?

  • ignore (Optional[Union[bool, Default]]) – Boolean, if a record already exists with this primary key, ignore this insert.

  • replace (Optional[Union[bool, Default]]) – Boolean, if a record already exists with this primary key, replace it with this new record.

  • extracts (Optional[Union[Dict[str, str], List[str], Default]]) – A list of columns to extract to other tables, or a dictionary that maps {column_name: other_table_name}. See Populating lookup tables automatically during insert/upsert.

  • conversions (Optional[Union[Dict[str, str], Default]]) – Dictionary specifying SQL conversion functions to be applied to the data while it is being inserted, for example {"name": "upper(?)"}. See Converting column values using SQL functions.

  • columns (Optional[Union[Dict[str, Any], Default]]) – Dictionary over-riding the detected types used for the columns, for example {"age": int, "weight": float}.

  • hash_id_columns (Optional[Union[Iterable[str], Default]]) –

Return type:

Table

insert_all(records, pk=<sqlite_utils.db.Default object>, foreign_keys=<sqlite_utils.db.Default object>, column_order=<sqlite_utils.db.Default object>, not_null=<sqlite_utils.db.Default object>, defaults=<sqlite_utils.db.Default object>, batch_size=<sqlite_utils.db.Default object>, hash_id=<sqlite_utils.db.Default object>, hash_id_columns=<sqlite_utils.db.Default object>, alter=<sqlite_utils.db.Default object>, ignore=<sqlite_utils.db.Default object>, replace=<sqlite_utils.db.Default object>, truncate=False, extracts=<sqlite_utils.db.Default object>, conversions=<sqlite_utils.db.Default object>, columns=<sqlite_utils.db.Default object>, upsert=False, analyze=False)#

Like .insert() but takes a list of records and ensures that the table that it creates (if table does not exist) has columns for ALL of that data.

Use analyze=True to run ANALYZE after the insert has completed.

Return type:

Table

upsert(record, pk=<sqlite_utils.db.Default object>, foreign_keys=<sqlite_utils.db.Default object>, column_order=<sqlite_utils.db.Default object>, not_null=<sqlite_utils.db.Default object>, defaults=<sqlite_utils.db.Default object>, hash_id=<sqlite_utils.db.Default object>, hash_id_columns=<sqlite_utils.db.Default object>, alter=<sqlite_utils.db.Default object>, extracts=<sqlite_utils.db.Default object>, conversions=<sqlite_utils.db.Default object>, columns=<sqlite_utils.db.Default object>)#

Like .insert() but performs an UPSERT, where records are inserted if they do not exist and updated if they DO exist, based on matching against their primary key.

See Upserting data.

Return type:

Table

upsert_all(records, pk=<sqlite_utils.db.Default object>, foreign_keys=<sqlite_utils.db.Default object>, column_order=<sqlite_utils.db.Default object>, not_null=<sqlite_utils.db.Default object>, defaults=<sqlite_utils.db.Default object>, batch_size=<sqlite_utils.db.Default object>, hash_id=<sqlite_utils.db.Default object>, hash_id_columns=<sqlite_utils.db.Default object>, alter=<sqlite_utils.db.Default object>, extracts=<sqlite_utils.db.Default object>, conversions=<sqlite_utils.db.Default object>, columns=<sqlite_utils.db.Default object>, analyze=False)#

Like .upsert() but can be applied to a list of records.

Return type:

Table

lookup(lookup_values, extra_values=None, pk='id', foreign_keys=None, column_order=None, not_null=None, defaults=None, extracts=None, conversions=None, columns=None)#

Create or populate a lookup table with the specified values.

db["Species"].lookup({"name": "Palm"}) will create a table called Species (if one does not already exist) with two columns: id and name. It will set up a unique constraint on the name column to guarantee it will not contain duplicate rows.

It will then insert a new row with the name set to Palm and return the new integer primary key value.

An optional second argument can be provided with more name: value pairs to be included only if the record is being created for the first time. These will be ignored on subsequent lookup calls for records that already exist.

All other keyword arguments are passed through to .insert().

See Working with lookup tables for more details.

Parameters:
  • lookup_values (Dict[str, Any]) – Dictionary specifying column names and values to use for the lookup

  • extra_values (Optional[Dict[str, Any]]) – Additional column values to be used only if creating a new record

  • pk (Optional[str]) –

  • foreign_keys (Optional[Union[Iterable[str], Iterable[ForeignKey], Iterable[Tuple[str, str]], Iterable[Tuple[str, str, str]], Iterable[Tuple[str, str, str, str]]]]) –

  • column_order (Optional[List[str]]) –

  • not_null (Optional[Set[str]]) –

  • defaults (Optional[Dict[str, Any]]) –

  • extracts (Optional[Union[Dict[str, str], List[str]]]) –

  • conversions (Optional[Dict[str, str]]) –

  • columns (Optional[Dict[str, Any]]) –

m2m(other_table, record_or_iterable=None, pk=<sqlite_utils.db.Default object>, lookup=None, m2m_table=None, alter=False)#

After inserting a record in a table, create one or more records in some other table and then create many-to-many records linking the original record and the newly created records together.

For example:

db["dogs"].insert({"id": 1, "name": "Cleo"}, pk="id").m2m(
    "humans", {"id": 1, "name": "Natalie"}, pk="id"
)

See Working with many-to-many relationships for details.

Parameters:
  • other_table (Union[str, Table]) – The name of the table to insert the new records into.

  • record_or_iterable (Optional[Union[Iterable[Dict[str, Any]], Dict[str, Any]]]) – A single dictionary record to insert, or a list of records.

  • pk (Optional[Union[Any, Default]]) – The primary key to use if creating other_table.

  • lookup (Optional[Dict[str, Any]]) – Same dictionary as for .lookup(), to create a many-to-many lookup table.

  • m2m_table (Optional[str]) – The string name to use for the many-to-many table, defaults to creating this automatically based on the names of the two tables.

  • alter (bool) – Set to True to add any missing columns on other_table if that table already exists.

analyze()#

Run ANALYZE against this table

analyze_column(column, common_limit=10, value_truncate=None, total_rows=None)#

Return statistics about the specified column.

See Analyzing a column.

Parameters:
  • column (str) – Column to analyze

  • common_limit (int) – Show this many column values

  • value_truncate – Truncate display of common values to this many characters

  • total_rows – Optimization - pass the total number of rows in the table to save running a fresh count(*) query

Return type:

ColumnDetails

add_geometry_column(column_name, geometry_type, srid=4326, coord_dimension='XY', not_null=False)#

In SpatiaLite, a geometry column can only be added to an existing table. To do so, use table.add_geometry_column, passing in a geometry type.

By default, this will add a nullable column using SRID 4326. This can be customized using the column_name, srid and not_null arguments.

Returns True if the column was successfully added, False if not.

from sqlite_utils.db import Database
from sqlite_utils.utils import find_spatialite

db = Database("mydb.db")
db.init_spatialite(find_spatialite())

# the table must exist before adding a geometry column
table = db["locations"].create({"name": str})
table.add_geometry_column("geometry", "POINT")
Parameters:
  • column_name (str) – Name of column to add

  • geometry_type (str) – Type of geometry column, for example "GEOMETRY" or "POINT" or ``"POLYGON"

  • srid (int) – Integer SRID, defaults to 4326 for WGS84

  • coord_dimension (str) – Dimensions to use, defaults to "XY" - set to "XYZ" to work in three dimensions

  • not_null (bool) – Should the column be NOT NULL

Return type:

bool

create_spatial_index(column_name)#

A spatial index allows for significantly faster bounding box queries. To create one, use create_spatial_index with the name of an existing geometry column.

Returns True if the index was successfully created, False if not. Calling this function if an index already exists is a no-op.

# assuming SpatiaLite is loaded, create the table, add the column
table = db["locations"].create({"name": str})
table.add_geometry_column("geometry", "POINT")

# now we can index it
table.create_spatial_index("geometry")

# the spatial index is a virtual table, which we can inspect
print(db["idx_locations_geometry"].schema)
# outputs:
# CREATE VIRTUAL TABLE "idx_locations_geometry" USING rtree(pkid, xmin, xmax, ymin, ymax)
Parameters:

column_name – Geometry column to create the spatial index against

Return type:

bool

sqlite_utils.db.View#

class sqlite_utils.db.View(db, name)#

Bases: Queryable

exists()#

Does this table or view exist yet?

drop(ignore=False)#

Drop this view.

Parameters:

ignore – Set to True to ignore the error if the view does not exist

enable_fts(*args, **kwargs)#

enable_fts() is supported on tables but not on views.

Other#

sqlite_utils.db.Column#

class sqlite_utils.db.Column(cid, name, type, notnull, default_value, is_pk)#

Describes a SQLite column returned by the Table.columns property.

cid

Column index

name

Column name

type

Column type

notnull

Does the column have a not null constraint

default_value

Default value for this column

is_pk

Is this column part of the primary key

sqlite_utils.db.ColumnDetails#

class sqlite_utils.db.ColumnDetails(table, column, total_rows, num_null, num_blank, num_distinct, most_common, least_common)#

Summary information about a column, see Analyzing a column.

table

The name of the table

column

The name of the column

total_rows

The total number of rows in the table

num_null

The number of rows for which this column is null

num_blank

The number of rows for which this column is blank (the empty string)

num_distinct

The number of distinct values in this column

most_common

The N most common values as a list of (value, count) tuples, or None if the table consists entirely of distinct values

least_common

The N least common values as a list of (value, count) tuples, or None if the table is entirely distinct or if the number of distinct values is less than N (since they will already have been returned in most_common)

sqlite_utils.utils#

sqlite_utils.utils.hash_record#

sqlite_utils.utils.hash_record(record, keys=None)#

record should be a Python dictionary. Returns a sha1 hash of the keys and values in that record.

If keys= is provided, uses just those keys to generate the hash.

Example usage:

from sqlite_utils.utils import hash_record

hashed = hash_record({"name": "Cleo", "twitter": "CleoPaws"})
# Or with the keys= option:
hashed = hash_record(
    {"name": "Cleo", "twitter": "CleoPaws", "age": 7},
    keys=("name", "twitter")
)
Parameters:
  • record (Dict) – Record to generate a hash for

  • keys (Optional[Iterable[str]]) – Subset of keys to use for that hash

sqlite_utils.utils.rows_from_file#

sqlite_utils.utils.rows_from_file(fp, format=None, dialect=None, encoding=None, ignore_extras=False, extras_key=None)#

Load a sequence of dictionaries from a file-like object containing one of four different formats.

from sqlite_utils.utils import rows_from_file
import io

rows, format = rows_from_file(io.StringIO("id,name\n1,Cleo")))
print(list(rows), format)
# Outputs [{'id': '1', 'name': 'Cleo'}] Format.CSV

This defaults to attempting to automatically detect the format of the data, or you can pass in an explicit format using the format= option.

Returns a tuple of (rows_generator, format_used) where rows_generator can be iterated over to return dictionaries, while format_used is a value from the sqlite_utils.utils.Format enum:

class Format(enum.Enum):
    CSV = 1
    TSV = 2
    JSON = 3
    NL = 4

If a CSV or TSV file includes rows with more fields than are declared in the header a sqlite_utils.utils.RowError exception will be raised when you loop over the generator.

You can instead ignore the extra data by passing ignore_extras=True.

Or pass extras_key="rest" to put those additional values in a list in a key called rest.

Parameters:
  • fp (BinaryIO) – a file-like object containing binary data

  • format (Optional[Format]) – the format to use - omit this to detect the format

  • dialect (Optional[Type[Dialect]]) – the CSV dialect to use - omit this to detect the dialect

  • encoding (Optional[str]) – the character encoding to use when reading CSV/TSV data

  • ignore_extras (Optional[bool]) – ignore any extra fields on rows

  • extras_key (Optional[str]) – put any extra fields in a list with this key

Return type:

Tuple[Iterable[dict], Format]

sqlite_utils.utils.TypeTracker#

class sqlite_utils.utils.TypeTracker#

Wrap an iterator of dictionaries and keep track of which SQLite column types are the most likely fit for each of their keys.

Example usage:

from sqlite_utils.utils import TypeTracker
import sqlite_utils

db = sqlite_utils.Database(memory=True)
tracker = TypeTracker()
rows = [{"id": "1", "name": "Cleo", "id": "2", "name": "Cardi"}]
db["creatures"].insert_all(tracker.wrap(rows))
print(tracker.types)
# Outputs {'id': 'integer', 'name': 'text'}
db["creatures"].transform(types=tracker.types)
wrap(iterator)#

Use this to loop through an existing iterator, tracking the column types as part of the iteration.

Parameters:

iterator (Iterable[dict]) – The iterator to wrap

Return type:

Iterable[dict]

property types: Dict[str, str]#

A dictionary mapping column names to their detected types. This can be passed to the db[table_name].transform(types=tracker.types) method.