The backend uses the following technologies:
✨ Code Standards
In order to have a similar structure to the code, the backend repository has a pre-commit hook. Just use pre-commit install in the folder and then the linter and the formatter will check it before you commit your actual work.
We use black, flake8 and isort to keep our code tidy.
Usually nothing goes as planned, and you need to debug your shiny new feature. In the next paragraph, I will explain to you the two tools we have for debugging:
Add the following in the python code where you want a breakpoint:
import pdb; pdb.set_trace()
Attach to the backend service:
docker attach $(docker ps --filter name=backend -q)
Debug as normal in pdb!
When you're done debugging, continue execution (c) and press Ctrl-P followed by Ctrl-Q to detach from the container without stopping it.
In order to debug queries, start the backend container in dev mode. Then you can access silk under /api/silk. Silk is a live profiling and inspection tool for the Django framework. Silk intercepts and stores HTTP requests and database queries before presenting them in a user interface for further inspection.
There are a lot of folders in our backend. Here is a quick rundown on where you can find what.
Most of the application code is within the API folder using Django. In the following section, I will explain where you can find what
This exposes all the commands that you can use via the command line. If you want a new command, that's the place to add it.
Every time we change our models, we have to migrate the database. We use Django migration feature to create migration files to migrate without the headaches.
Here are the actual data types. If you want to figure out how a photo works or how faces are connected to persons, then this is your folder.
Here you can find our API implemented. They are separated, similar to the models. Views that expose the photos will be here in photos too.
You have your python model and want to somehow convert that to JSON. That's what the serializer does! There are two different types of serializers: normal ones and serpy serializer. Serpy serializer is faster, and we sometimes need it if we need to serialize a lot of data. The package is no longer maintained, though, and we are looking for a replacement or for refactoring to serialize fewer data at once.
We are currently in the process of splitting them up, similar to how we did it in views.
We use as a base framework PyTorch. If you find a cool machine learning model with PyTorch, we sure can add that too.
im2txt is an image captioning package which allows us to generate captions on demand. This sometimes creates useful output, but it is kind of old and there should be more recent models
We use dlib and face_recognition to detect faces. A very cool feature would be the automatic clustering of unknown faces, which we have not implemented yet.
places365 generates scene classifications for a given image. It generates the tags you see when you open the photo details in the UI.
Here you can find the code which allows us to search semantically for images like "trees in a valley".