TDM 40200: Project 4 — 2023

Motivation: Dashboards are everywhere — many of our corporate partners' projects are to build dashboards (or dashboard variants)! Dashboards are used to interactively visualize some set of data. Dashboards can be used to display, add, remove, filter, or complete some customized operation to data. Ultimately, a dashboard is really a website focused on displaying data. Dashboards are so popular, there are entire frameworks designed around making them with less effort, faster. Two of the more popular examples of such frameworks are shiny (in R) and dash (in Python). While these tools are incredibly useful, it can be very beneficial to take a step back and build a dashboard (or website) from scratch (we are going to utilize many powerfuly packages and tools that make this far from "scratch", but it will still be more from scratch than those dashboard frameworks).

Context: This is the third in a series of projects focused around slowly building a dashboard. Students will have the opportunity to: create a backend (API) using fastapi, connect the backend to a database using aiosql, use the jinja2 templating engine to create a frontend, use htmx to add "reactivity" to the frontend, create and use forms to insert data into the database, containerize the application so it can be deployed anywhere, and deploy the application to a cloud provider. Each week the project will build on the previous week, however, each week will be self-contained. This means that you can complete the project in any order, and if you miss a week, you can still complete the following project using the provided starting point.

Scope: Python, dashboards

Learning Objectives

Continue to develop skills and techniques using fastapi to build a backend.

Make sure to read about, and use the template found here, and the important information about projects submissions here.

Questions

Interested in being a TA? Please apply: purdue.ca1.qualtrics.com/jfe/form/SV_08IIpwh19umLvbE

Question 1

In the previous project we covered some of the most basic but important parts of a backend. It is time to take a break from fastapi — before continuing on, it is important that we introduce probably the most critical component of a dashboard or web app — the database.

In this project we will be using sqlite3 as our database. sqlite3 is very simple to use, but still extremely powerful. Typically, however, an instance of postgresql or mysql/mariadb is more common. However, for this project, we will be using sqlite3 as it is the easiest to get started with. We will however explain the steps that would be needed if we were using either postgresql or mariadb.

Our code is Python code. There are a lot of Python packages that can be used to interact with the databases we’ve mentioned. We will be using aiosql as it is a very straightforward package that allows us to write SQL queries in a .sql file, and then use those queries in our Python code. This is quite different than most of the other Python tools. Most of the Python tools — like those used in Django, or sqlalchemy or peewee — require us to write our SQL queries in Python code, and use special methods to execute those queries. While this isn’t bad, and in fact, it can be very very good, however, it can be easier to maintain a project if we separate our SQL queries from our Python code — this is what aiosql largely let’s us accomplish.

In this project, we will learn how to use aiosql.

Get started by opening up VS Code and connecting to Anvil, just as we have in the previous projects. One database you’ve used many times before is our imdb database. Please create a copy of this database in your $SCRATCH directory. You can do this by running the following command.

mkdir $SCRATCH/p4
cp /anvil/projects/tdm/data/movies_and_tv/imdb.db $SCRATCH/p4

In addition, create two additional files.

touch $SCRATCH/p4/queries.sql
touch $SCRATCH/p4/project04.py

Finally, open project04.py and add the following code.

def main():
    print("Hello World!")

if __name__ == "__main__":
    main()

Make sure things are working by loading up our Python environment and running project04.py.

module use /anvil/projects/tdm/opt/core
module load tdm
module load python/f2022-s2023

cd $SCRATCH/p4
python3 project04.py

Capture a screenshot of the resulting output and include it in your Jupyter Notebook for submission.

Items to submit

Code used to solve this problem.
Output from running the code.

Question 2

Here are the official docs for aiosql. Please reference them as needed.

The first step, no matter what database you are using, is to load up the queries from our (currently empty) queries.sql file. Please glance over this page.

As you can see, aiosql supports a number of different database drivers: sqlite3, apsw, psycopg, psycopg2, pymysql, etc. Please follow the following steps to load up our queries, and establish a database connection.

Import aiosql and sqlite3. In this case, sqlite3 is the database driver we are using. If you were instead using postgresql, you would likely import psycopg2 instead.
Next, you need to make a call to aiosql from_path method. This method takes two arguments — the first is a string that describes a path to the file containing our queries, in our case, queries.sql. In this case, since our project04.py module and queries.sql files are in the same directory, this value can simply be "queries.sql". The second argument is the database driver we are using. In our case, this is sqlite3. If we were using postgresql and psycopg2, this would be psycopg2. You can name the resulting variable anything you want. For clarity, I tend to prefer queries. This resulting queries object will contain a method for each and every query we have in our queries.sql file. We will explain this more, later.

Finally, when making a query using aiosql, we need to establish a database connection object and pass that object along to each query we call. To establish a database connection, you need to follow the instructions for your database driver. In our case, we are using sqlite3. So, I would search the internet for "establish connection sqlite3 python" and find the following results: docs.python.org/3/library/sqlite3.html. We can very clearly see, that to establish a connection, we can run the following code.

import sqlite3
conn = sqlite3.connect("imdb.db")

Of course, we need to make sure that imdb.db is in the same directory as our project04.py module. If it isn’t, we would need to adjust the path of the database accordingly. Here, the resulting conn is our connection object. We will need to pass this object to every query we make using aiosql — it will always be the first argument.

To create a connection using psycopg2, for example, this would look a bit different.

import psycopg2
conn = psycopg2.connect(host="my.db.location.example.com", database="mydbname", user="myusername", password="mypassword", port=5432)

Here, we would have to specify more details as postgresql is a client/server database and we need to authenticate. In addition, we have to specify where (the host) the database is hosted and what port it is listening on.

Finally, its time to put all of this information to use! Carefully read this page. In your queries.sql file, write a query called get-five-titles that runs a SELECT query returning 5 titles. Update your main function to load your queries from the queries.sql file, establishes a connection to the imdb.db sqlite3 database, and executes the newly created query, printing the results. From the terminal, run the updated project04.py module and capture a screenshot of the resulting output and include it in your Jupyter Notebook for submission.

If all went well you should end up with something like:

output

[('tt0000001', 'short', 'Carmencita', 'Carmencita', 0, 1894, None, 1, 'Documentary,Short'), ('tt0000002', 'short', 'Le clown et ses chiens', 'Le clown et ses chiens', 0, 1892, None, 5, 'Animation,Short'), ('tt0000003', 'short', 'Pauvre Pierrot', 'Pauvre Pierrot', 0, 1892, None, 4, 'Animation,Comedy,Romance'), ('tt0000004', 'short', 'Un bon bock', 'Un bon bock', 0, 1892, None, 12, 'Animation,Short'), ('tt0000005', 'short', 'Blacksmith Scene', 'Blacksmith Scene', 0, 1893, None, 1, 'Comedy,Short')]

Items to submit

Code used to solve this problem.
Output from running the code.

Question 3

Next, write a new query called get-title-by-id that takes a single argument, title_id, and returns the title (and only the primary_title) with the matching title_id. Update your main function to load your queries from the queries.sql file, establishes a connection to the imdb.db sqlite3 database, and executes the newly created query, printing the results. From the terminal, run the updated project04.py module and capture a screenshot of the resulting output and include it in your Jupyter Notebook for submission.

Here are some example queries with expected output.

results = queries.get_title_by_id(conn, title_id="tt4236770")
print(results)

expected output

[('Yellowstone',)]

results = queries.get_title_by_id(conn, title_id="tt0108778")
print(results)

expected output

[('Friends',)]

Items to submit

Code used to solve this problem.
Output from running the code.

Question 4

Carefully read this section if you haven’t already. Now it is time to insert a new title into our titles table!

Write a new query called insert-title that takes the following arguments: title_id, type, primary_title, original_title, is_adult, premiered, ended, runtime_minutes, and genres. The query should insert a new row into the titles table with the provided values.

Use your new query to insert the following title into the titles table: www.imdb.com/title/tt3581920/. Make sure title_id is tt3581920, however, if you can’t find any of the other pieces of data, feel free to make them up.

Test out your new query from within your main function. From the terminal, run the updated project04.py module. Be sure to use the get_title_by_id method to fetch and print the newly added title to confirm your INSERT worked properly. Capture a screenshot of the resulting output and include it in your Jupyter Notebook for submission.

Example output.

result = queries.insert_title(conn, title_id="tt3581920", ...)
print(result)

expected output

result = queries.get_title_by_id(conn, title_id="tt3581920")
print(result)

expected output

[('The Last of Us',)]

Items to submit

Code used to solve this problem.
Output from running the code.

Question 5

Great job! I hope you will start to see the advantages of having all your queries in a single place. Write an additional query using a different operator than you’ve used so far. Demonstrate that your query functions as it should by executing it from within your main function. From the terminal, run the updated project04.py module and capture a screenshot of the resulting output and include it in your Jupyter Notebook for submission.

Items to submit

Code used to solve this problem.
Output from running the code.

Please make sure to double check that your submission is complete, and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it to make sure what you think you submitted, was what you actually submitted.

In addition, please review our submission guidelines before submitting your project.