- Get it from PyPI with `pip install dbxs`.
- Read the documentation over at dbxs.readthedocs.org.
DBXS is a query organizer. It puts your collection of SQL queries into a
traditional data-access layer, leveraging Python's built-in `typing.Protocol`
to provide type safety and convenience.
DBXS aims to provide 5 properties to applications that use it:
DBXS is plain old SQL. Aside from interpolating placeholders based on your chosen driver's paramstyle, it does not parse or interpret the SQL statements that you provide it. It does not generate SQL. What you typed is what gets executed. Use whatever special features your database provides, with no layers in the middle.
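To make that concrete, here is a rough sketch of the only transformation DBXS applies to a statement. The exact rewritten text depends on your driver's paramstyle, so treat the second string as illustrative rather than authoritative:

```python
# The SQL you write in a @query or @statement declaration:
sql_as_written = "SELECT id, contents FROM quote WHERE id = {id}"

# Roughly what a qmark-paramstyle driver such as sqlite3 receives: the
# statement is untouched except that the {id} placeholder is rewritten, and
# the value travels separately as a parameter, never interpolated as text.
sql_as_executed = "SELECT id, contents FROM quote WHERE id = ?"
parameters = (1,)
```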
It is difficult to accidentally write an SQL injection vulnerability with DBXS.
By requiring that SQL statements be written at import time, before your
application code even executes, attacker-controlled data from your requests is
unlikely to be available for interpolation. If your code is using DBXS
idiomatically (i.e., not calling out to `.execute(...)` itself), you can know
it's safe.
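To illustrate the contrast, here is a sketch (not taken from the documentation) of the classic mistake next to the DBXS shape; the `QuoteWriter` protocol here is hypothetical, but it mirrors the `addQuote` statement in the full example further down:

```python
from typing import Protocol

from dbxs import statement


# The classic mistake: request data spliced directly into the SQL string at
# the call site, at runtime.
def fragileAddQuote(cursor, text: str) -> None:
    cursor.execute(f"INSERT INTO quote (contents) VALUES ('{text}')")


# The DBXS shape: the statement's text is fixed when this module is imported,
# and {text} can only ever be bound as a driver-level parameter.
class QuoteWriter(Protocol):
    @statement(sql="INSERT INTO quote (contents) VALUES ({text})")
    async def addQuote(self, text: str) -> None: ...
```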
Although the SQL is “raw”, your results are not. When you declare your queries with DBXS, you describe the shape of their results in terms of dataclasses, giving your application code structured, documented data objects to consume. Query parameters and results are all typed.
As you're the one writing your SQL, you're still responsible for testing that your query actually consumes and produces the types you have claimed it does; but once you've written and tested a query, the type checker keeps every call site honest about the parameters it passes and the results it consumes.
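As a sketch of what that buys you, assuming the `QuoteDB` protocol and `Quote` dataclass from the full example below, a type checker such as mypy can reject a bad call site without ever touching the database:

```python
async def describeQuote(quotedb: QuoteDB) -> str:
    quote = await quotedb.quoteByID(3)  # inferred as Quote
    return f"{quote.id}: {quote.contents}"  # id: int, contents: str


async def broken(quotedb: QuoteDB) -> None:
    await quotedb.quoteByID("3")  # flagged by the type checker: expected int
```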
When you use DBXS, your queries are all collected into `Protocol` classes that
define the interfaces to your data stores. This provides a small, clear set of
locations to inspect for database query issues, both for human readers and for
metaprogramming, rather than smearing database concerns across the entirety of
your application.
Additionally, by providing a simple async context manager for transaction
management, it is far easier to scan your code for transaction boundaries by
looking for `async with transaction(...)` than by looking for calls to
`commit` and `rollback`.
The `dbxs.testing` module provides support for testing your database interfaces
with minimal set-up. You can write a unit test that uses your database
interfaces in only a few lines of code, like so:
```python
from unittest import TestCase

from dbxs.async_dbapi import transaction
from dbxs.testing import MemoryPool, immediateTest


class MyDBXSTest(TestCase):
    @immediateTest()
    async def test_myQuery(self, pool: MemoryPool) -> None:
        async with transaction(pool) as c:
            # myAccessor, createFoo, and getFoo stand in for your own
            # accessor and protocol methods.
            access = myAccessor(c)
            await access.createFoo("1", "hello")
            self.assertEqual((await access.getFoo("1")).name, "hello")
```
(As DBXS is still an alpha-stage project, this testing support is currently
restricted to SQLite and stdlib `unittest`, but support for arbitrary database
drivers and pytest is definitely on the roadmap.)
Using it looks like this:
```python
from __future__ import annotations  # lets Quote refer to QuoteDB before it exists

from dataclasses import dataclass
from sqlite3 import connect, paramstyle
from typing import AsyncIterable, Protocol

from dbxs import accessor, many, one, query, statement
from dbxs.adapters.dbapi_twisted import adaptSynchronousDriver
from dbxs.async_dbapi import transaction
from dbxs.dbapi import DBAPIConnection


@dataclass
class Quote:
    db: QuoteDB
    id: int
    contents: str


class QuoteDB(Protocol):
    @query(sql="SELECT id, contents FROM quote WHERE id={id}", load=one(Quote))
    async def quoteByID(self, id: int) -> Quote: ...

    @query(sql="SELECT id, contents FROM quote", load=many(Quote))
    def allQuotes(self) -> AsyncIterable[Quote]: ...

    @statement(sql="INSERT INTO quote (contents) VALUES ({text})")
    async def addQuote(self, text: str) -> None: ...


def sqliteWithSchema() -> DBAPIConnection:
    # An in-memory SQLite database with the one table this example needs.
    c = connect(":memory:")
    c.execute(
        "CREATE TABLE quote (contents, id INTEGER PRIMARY KEY AUTOINCREMENT)"
    )
    c.commit()
    return c


# Wrap the blocking sqlite3 driver so it can be used from async code.
driver = adaptSynchronousDriver(sqliteWithSchema, paramstyle)
quotes = accessor(QuoteDB)


async def main() -> None:
    async with transaction(driver) as t:
        quotedb: QuoteDB = quotes(t)
        await quotedb.addQuote("hello, world")
        async for quote in quotedb.allQuotes():
            matched = (await quotedb.quoteByID(quote.id)) == quote
            print(f"quote ({quote.id}) {quote.contents!r} {matched}")
```
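The example stops at defining `main`. One way to actually drive it, sketched here on the assumption that you are using the Twisted adapter above (which runs the blocking sqlite3 calls on Twisted's thread pool), is to hand the coroutine to the reactor:

```python
from twisted.internet.defer import Deferred
from twisted.internet.task import react

# react() starts the reactor, waits for the Deferred to fire, then stops it;
# Deferred.fromCoroutine turns the main() coroutine into that Deferred.
react(lambda reactor: Deferred.fromCoroutine(main()))
```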
When interfacing with SQL in Python, you have a few choices that predate DBXS.
You can use something low-level, specifically a DB-API 2.0 driver, directly.
You call `connect(...)` to get a connection, `.cursor()` on that connection to
get a cursor, and `.execute(...)` on that cursor with some SQL and some
parameters to tell the database to do stuff. The benefits of this approach are
clear: it's very straightforward. If you want to do something, read the
database documentation for the SQL syntax to do the thing you want, then type
in that SQL.
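Concretely, that low-level round trip looks something like this, sketched with the stdlib's sqlite3 driver (any DB-API 2.0 module follows the same shape):

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE quote (id INTEGER PRIMARY KEY, contents TEXT)")

cursor = connection.cursor()
cursor.execute("INSERT INTO quote (contents) VALUES (?)", ("hello, world",))
connection.commit()

cursor.execute("SELECT id, contents FROM quote WHERE id = ?", (1,))
# Every result is a flat tuple; deciding which column means what is left to
# the code at each call site.
row = cursor.fetchone()
print(row)  # (1, 'hello, world')
```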
However, the downsides are also readily apparent, the two major ones being:
- there are no safeguards against SQL injection: while you can pass parameters to your SQL statements, if you ever have a weird edge case where you want to put something into a place in the query where parameters can't readily be used (say, if you want to let a user select a table name), nothing will prevent you from introducing this vulnerability.
- every query gives you flat lists of tuples, "dumb" data which must be interpreted at every query site, when what you probably want is some kind of typed value object with convenient methods and named attributes that also cooperates with a type checker.
To mitigate these disadvantages, you might use a higher-level tool like the Django ORM or SQLAlchemy. These tools are very powerful, and allow for many things that DBXS does not, such as dynamically composing queries. They provide strong affordances to make the right thing (avoiding SQL injection) the easy thing, and they give you powerful higher-level types as query inputs and results.
However, those introduce new problems:
- It's a big increase in complexity. Everyone who contributes to your project needs to understand both the SQL layer of the database and your ORM or expression layer. While DBXS comprises a handful of functions and classes (mostly `query`, `one`, `many`, `statement`, `transaction`, and `accessor`, with `adaptSynchronousDriver` being the additional bit of glue most applications will need), an ORM might include dozens or even hundreds of additional functions and data structures.
- This problem compounds, because if your database has any feature that you want to use beyond the somewhat fuzzy boundaries of “standard” SQL, even something as simple as `INSERT … ON CONFLICT DO NOTHING`, you need to learn about that feature from your database's SQL dialect documentation, then learn it again from your chosen library's specialized dialect support, assuming it wraps the feature you want at all. Some features may only be supported by additional libraries you need to depend on, or you may have to wait for them to be implemented. (With DBXS, by contrast, you simply type the statement; see the sketch below.)
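As a sketch of the DBXS side of that trade-off (the protocol and table here are hypothetical): because the SQL is passed through verbatim, a dialect-specific clause needs no special support from the library:

```python
from typing import Protocol

from dbxs import statement


class TagWriter(Protocol):
    # The ON CONFLICT clause is just part of the SQL string; DBXS neither
    # parses it nor needs to know whether your database supports it.
    @statement(sql="INSERT INTO tag (name) VALUES ({name}) ON CONFLICT DO NOTHING")
    async def ensureTag(self, name: str) -> None: ...
```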
Finally, there is a shortcoming that all these approaches have in common: they all encourage queries to be defined on an ad-hoc basis, in the bodies of methods and functions as the application is running. Especially when using a higher-level library where the SQL is generated rather than manually specified, this can make it very difficult to go from an entry in a database log to the location in the application code that executed it.
Does this seem familiar? Although I hadn't heard about it at the time, I have since learned that this is a parallel invention of JDBI's declarative API. If you notice that the ideas are similar but the terminology is all randomly different, that's why.
DBXS has (as far as I know) mostly not been used in production, and thus should be considered alpha quality. While its simple implementation, small code size, and good test coverage should make productionizing it a small amount of effort, it does still have some major limitations compared to what I would consider a “final” release:
- DBXS only supports async database interfaces. Many database applications expect a synchronous interface, and [it should have one](#18).
- DBXS only supports synchronous drivers wrapped using Twisted's thread pool. It should support asyncio, as well as some of the native database drivers from the asyncio ecosystem.