{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Quickstart \n", "\n", "This page provides an introduction to how to use Carsus." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initialize a database \n", "\n", "Initializing a database is a matter of calling the `init_db` function. You *must* pass to the function a database url. You can also pass optional keyword arguments to establish various engine options, e.g. `echo=True`. `init_db` creates and returns a session object that is used to query the database. Let’s initialize a SQLite memory database:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initializing the database\n", "Ingesting basic atomic data\n" ] } ], "source": [ "from carsus import init_db\n", "session = init_db(\"sqlite://\")\n", "session.commit()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because the database was empty, basic atomic data (atomic numbers, symbols, etc.) was added to it. You should commit the session yourself if you want changes to be persisted to the database! Let’s query the database:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\n", "\n", "\n" ] } ], "source": [ "from carsus.model import Atom\n", "q = session.query(Atom).all()\n", "for atom in q[:5]:\n", " print atom" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "118" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "session.query(Atom).count()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The figure below illustrates the database schema. Atoms have some fundamental quantities, like atomic numbers and groups, and quantites that can depend on a data source. The latter are stored in the `AtomicQuantities` table. The `AtomicWeights` table is a subset table of `AtomicQuantities` and it represents a specific type of quantities - atomic weights. Although there is only one quantity type in this schema, generally there can be many.\n", "\n", "![atomic schema](../images/atomic_schema.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Ingest data\n", "\n", "To ingest data from a source you need to create an ingestor for that source. In this example we will ingest atomic weights from the [NIST Atomic Weight and Isotopic Compositions](http://www.nist.gov/pml/data/comp.cfm) database. After you have created the ingestor, you need to call two methods: `download` and `ingest`. The first one will download data from the source and the second one will ingest it into the database. You *must* pass a Session object to the `ingest` method! You should commit the session after the data have been ingested." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading the data from http://physics.nist.gov/cgi-bin/Compositions/stand_alone.pl\n", "Ingesting atomic weights\n" ] } ], "source": [ "from carsus.io.nist import NISTWeightsCompIngester\n", "ingester = NISTWeightsCompIngester()\n", "ingester.download()\n", "ingester.ingest(session)\n", "session.commit()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Query the database\n", "\n", "Let’s do some queries. To select both atoms and atomic weights we need to join the `Atoms` table on the `AtomicWeights` table. We use `join()` to create an explicit JOIN. To specify the ON parameter we provide the relationship-bound attribute of the `Atom` class - `Atom.quantities` - and then use the `of_type()` helper method to narrow the criterion to atomic weights. This query selects the first five atoms with the values of their atomic weights:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[(, ),\n", " (, ),\n", " (, ),\n", " (, ),\n", " (, )]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from carsus.model import AtomicWeight, DataSource\n", "session.query(Atom, AtomicWeight).\\\n", " join(Atom.quantities.of_type(AtomicWeight)).\\\n", " filter(Atom.atomic_number <= 5).all()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's select atoms that have atomic weight less than 15 u. We can do the query using the `.quantity` column in a single comparison. Notice that to interpret `.quantity` directly at the moment we need to use `.value` accessor: " ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[(1, 1.007975),\n", " (2, 4.002602),\n", " (3, 6.967499999999999),\n", " (4, 9.0121831),\n", " (5, 10.8135),\n", " (6, 12.0106),\n", " (7, 14.006855)]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from astropy import units as u\n", "session.query(Atom.atomic_number,\n", " AtomicWeight.quantity.value).\\\n", " join(Atom.quantities.of_type(AtomicWeight)).\\\n", " filter(AtomicWeight.quantity < 15*u.u).all()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Output dataframes\n", "\n", "Lets put out a table with z, weight in solar masses, z**2 + 5:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " z weight_solMass z**2+5\n", "0 1 8.414768e-58 6\n", "1 2 3.341449e-57 9\n", "2 3 5.816602e-57 14\n", "3 4 7.523543e-57 21\n", "4 5 9.027316e-57 30\n" ] } ], "source": [ "# First we need to write a query\n", "# Unit conversion is done on the DB side!\n", "\n", "q = session.query(Atom.atomic_number.label(\"z\"),\n", " AtomicWeight.quantity.to(u.solMass).value.label(\"weight_solMass\")).\\\n", " join(Atom.quantities.of_type(AtomicWeight))\n", "\n", "# Then we use pandas to read the query into a DataFrame\n", "from pandas import read_sql_query\n", "df = read_sql_query(q.selectable, session.bind)\n", "\n", "# Once we have the data we can compute things\n", "df[\"z**2+5\"] = df[\"z\"]**2 + 5\n", "\n", "print df.head(5)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }