TDM 40200: Project 4 — 2023
Motivation: Dashboards are everywhere — many of our corporate partners' projects are to build dashboards (or dashboard variants)! Dashboards are used to interactively visualize some set of data. Dashboards can be used to display, add, remove, filter, or complete some customized operation to data. Ultimately, a dashboard is really a website focused on displaying data. Dashboards are so popular, there are entire frameworks designed around making them with less effort, faster. Two of the more popular examples of such frameworks are shiny
(in R) and dash
(in Python). While these tools are incredibly useful, it can be very beneficial to take a step back and build a dashboard (or website) from scratch (we are going to utilize many powerfuly packages and tools that make this far from "scratch", but it will still be more from scratch than those dashboard frameworks).
Context: This is the third in a series of projects focused around slowly building a dashboard. Students will have the opportunity to: create a backend (API) using fastapi
, connect the backend to a database using aiosql
, use the jinja2
templating engine to create a frontend, use htmx
to add "reactivity" to the frontend, create and use forms to insert data into the database, containerize the application so it can be deployed anywhere, and deploy the application to a cloud provider. Each week the project will build on the previous week, however, each week will be self-contained. This means that you can complete the project in any order, and if you miss a week, you can still complete the following project using the provided starting point.
Scope: Python, dashboards
Questions
Interested in being a TA? Please apply: purdue.ca1.qualtrics.com/jfe/form/SV_08IIpwh19umLvbE |
Question 1
In the previous project we covered some of the most basic but important parts of a backend. It is time to take a break from fastapi
— before continuing on, it is important that we introduce probably the most critical component of a dashboard or web app — the database.
In this project we will be using sqlite3
as our database. sqlite3
is very simple to use, but still extremely powerful. Typically, however, an instance of postgresql
or mysql/mariadb
is more common. However, for this project, we will be using sqlite3
as it is the easiest to get started with. We will however explain the steps that would be needed if we were using either postgresql
or mariadb
.
Our code is Python code. There are a lot of Python packages that can be used to interact with the databases we’ve mentioned. We will be using aiosql
as it is a very straightforward package that allows us to write SQL queries in a .sql
file, and then use those queries in our Python code. This is quite different than most of the other Python tools. Most of the Python tools — like those used in Django, or sqlalchemy
or peewee
— require us to write our SQL queries in Python code, and use special methods to execute those queries. While this isn’t bad, and in fact, it can be very very good, however, it can be easier to maintain a project if we separate our SQL queries from our Python code — this is what aiosql
largely let’s us accomplish.
In this project, we will learn how to use aiosql
.
Get started by opening up VS Code and connecting to Anvil, just as we have in the previous projects. One database you’ve used many times before is our imdb
database. Please create a copy of this database in your $SCRATCH
directory. You can do this by running the following command.
mkdir $SCRATCH/p4
cp /anvil/projects/tdm/data/movies_and_tv/imdb.db $SCRATCH/p4
In addition, create two additional files.
touch $SCRATCH/p4/queries.sql
touch $SCRATCH/p4/project04.py
Finally, open project04.py
and add the following code.
def main():
print("Hello World!")
if __name__ == "__main__":
main()
Make sure things are working by loading up our Python environment and running project04.py
.
module use /anvil/projects/tdm/opt/core
module load tdm
module load python/f2022-s2023
cd $SCRATCH/p4
python3 project04.py
Capture a screenshot of the resulting output and include it in your Jupyter Notebook for submission.
-
Code used to solve this problem.
-
Output from running the code.
Question 2
Here are the official docs for aiosql
. Please reference them as needed.
The first step, no matter what database you are using, is to load up the queries from our (currently empty) queries.sql
file. Please glance over this page.
As you can see, aiosql
supports a number of different database drivers: sqlite3
, apsw
, psycopg
, psycopg2
, pymysql
, etc. Please follow the following steps to load up our queries, and establish a database connection.
-
Import
aiosql
andsqlite3
. In this case,sqlite3
is the database driver we are using. If you were instead usingpostgresql
, you would likely importpsycopg2
instead. -
Next, you need to make a call to
aiosql
from_path
method. This method takes two arguments — the first is a string that describes a path to the file containing our queries, in our case,queries.sql
. In this case, since ourproject04.py
module andqueries.sql
files are in the same directory, this value can simply be "queries.sql". The second argument is the database driver we are using. In our case, this issqlite3
. If we were usingpostgresql
andpsycopg2
, this would bepsycopg2
. You can name the resulting variable anything you want. For clarity, I tend to preferqueries
. This resultingqueries
object will contain a method for each and every query we have in ourqueries.sql
file. We will explain this more, later. -
Finally, when making a query using
aiosql
, we need to establish a database connection object and pass that object along to each query we call. To establish a database connection, you need to follow the instructions for your database driver. In our case, we are usingsqlite3
. So, I would search the internet for "establish connection sqlite3 python" and find the following results: docs.python.org/3/library/sqlite3.html. We can very clearly see, that to establish a connection, we can run the following code.import sqlite3 conn = sqlite3.connect("imdb.db")
Of course, we need to make sure that
imdb.db
is in the same directory as ourproject04.py
module. If it isn’t, we would need to adjust the path of the database accordingly. Here, the resultingconn
is our connection object. We will need to pass this object to every query we make usingaiosql
— it will always be the first argument.To create a connection using
psycopg2
, for example, this would look a bit different.import psycopg2 conn = psycopg2.connect(host="my.db.location.example.com", database="mydbname", user="myusername", password="mypassword", port=5432)
Here, we would have to specify more details as
postgresql
is a client/server database and we need to authenticate. In addition, we have to specify where (the host) the database is hosted and what port it is listening on.
Finally, its time to put all of this information to use! Carefully read this page. In your queries.sql
file, write a query called get-five-titles
that runs a SELECT
query returning 5 titles. Update your main
function to load your queries from the queries.sql
file, establishes a connection to the imdb.db
sqlite3
database, and executes the newly created query, printing the results. From the terminal, run the updated project04.py
module and capture a screenshot of the resulting output and include it in your Jupyter Notebook for submission.
If all went well you should end up with something like: output
[('tt0000001', 'short', 'Carmencita', 'Carmencita', 0, 1894, None, 1, 'Documentary,Short'), ('tt0000002', 'short', 'Le clown et ses chiens', 'Le clown et ses chiens', 0, 1892, None, 5, 'Animation,Short'), ('tt0000003', 'short', 'Pauvre Pierrot', 'Pauvre Pierrot', 0, 1892, None, 4, 'Animation,Comedy,Romance'), ('tt0000004', 'short', 'Un bon bock', 'Un bon bock', 0, 1892, None, 12, 'Animation,Short'), ('tt0000005', 'short', 'Blacksmith Scene', 'Blacksmith Scene', 0, 1893, None, 1, 'Comedy,Short')] |
-
Code used to solve this problem.
-
Output from running the code.
Question 3
Next, write a new query called get-title-by-id
that takes a single argument, title_id
, and returns the title (and only the primary_title
) with the matching title_id
. Update your main
function to load your queries from the queries.sql
file, establishes a connection to the imdb.db
sqlite3
database, and executes the newly created query, printing the results. From the terminal, run the updated project04.py
module and capture a screenshot of the resulting output and include it in your Jupyter Notebook for submission.
Here are some example queries with expected output.
expected output
[('Yellowstone',)]
expected output
[('Friends',)] |
-
Code used to solve this problem.
-
Output from running the code.
Question 4
Carefully read this section if you haven’t already. Now it is time to insert a new title into our titles
table!
Write a new query called insert-title
that takes the following arguments: title_id
, type
, primary_title
, original_title
, is_adult
, premiered
, ended
, runtime_minutes
, and genres
. The query should insert a new row into the titles
table with the provided values.
Use your new query to insert the following title into the titles
table: www.imdb.com/title/tt3581920/. Make sure title_id
is tt3581920
, however, if you can’t find any of the other pieces of data, feel free to make them up.
Test out your new query from within your main
function. From the terminal, run the updated project04.py
module. Be sure to use the get_title_by_id
method to fetch and print the newly added title to confirm your INSERT
worked properly. Capture a screenshot of the resulting output and include it in your Jupyter Notebook for submission.
Example output.
expected output
1
expected output
[('The Last of Us',)] |
-
Code used to solve this problem.
-
Output from running the code.
Question 5
Great job! I hope you will start to see the advantages of having all your queries in a single place. Write an additional query using a different operator than you’ve used so far. Demonstrate that your query functions as it should by executing it from within your main
function. From the terminal, run the updated project04.py
module and capture a screenshot of the resulting output and include it in your Jupyter Notebook for submission.
-
Code used to solve this problem.
-
Output from running the code.
Please make sure to double check that your submission is complete, and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it to make sure what you think you submitted, was what you actually submitted. In addition, please review our submission guidelines before submitting your project. |