
Securing our Prototypes

Locking down our Data Science prototypes and proofs of concept for privacy

Photo by FLY:D on Unsplash

Cybersecurity and information protection are in sharp focus given what seems like a constant news feed of cyber-attacks and data breaches. Even small prototypes and proof-of-concept applications can be targeted and end up involved in a data breach. This article continues a series about helping job seekers match their resumes to job postings. The project could handle both personal data and sensitive personal data, so the first significant feature of my brand new NLP and Data Science-backed Resume service is security.

Here is the entire series so far, with plenty more to come. The articles cover the research and prototyping work required to incubate the idea.

  • Using NLP to Improve your Resume: Performing keyword matching and text analysis on job descriptions. A popular article with much engagement and discussion.
  • Using a Python backend: Python Flask and Vue.js for NLP. Building a Flask backend server with a Vue.js frontend and avoiding CORS and Docker container orchestration issues.
  • Semantic Highlighting: Building and adding a User Interface for NLP and using the vue-text-highlight node package to highlight given text on the screen. The thought process was to highlight keywords or critical sentences for the user to help with the resume updates.
  • Exposing NLP to job hunters: A short demonstration of how to take a piece of text from the frontend, perform keyword extraction and highlight the keywords for the user.
  • Handling Unstructured Data: Thinking about volume and high demand (more wishful thinking), we discuss how to orchestrate a workflow and extract text from a more extensive array of documents using Apache Tika.
  • FastAPI versus Flask: What is all that excitement? Don’t we already have Django? Thinking about web frameworks, microservices and backend APIs, what could be the best architecture to service the NLP routines?

This article demonstrates a technique I frequently use to provide authentication in my prototypes. Users authenticate at the frontend, and only authorized users should be able to call the backend API endpoints. The article Different ways to Authenticate a Web Application by @vivekmadurai provides an excellent explanation of the available options. Of those, I prefer the third-party access (OAuth, API token) approach using Auth0.

Web Security

Web application security is an essential topic with various processes and methods for protecting applications and APIs from attacks by bad actors. Authentication is one such process: it verifies the identity of an individual. Authorization is another: it controls user access via assigned roles and privileges. When handling personal or sensitive personal data, we need to authenticate all users, check permissions carefully, and log out inactive sessions quickly. There are many concerns with such data, not the least of which is encryption at rest and in transit.

There are many articles written on authentication, and I tend to see the approaches in two distinct buckets:

  • Create a user database, login, logout, and enrollment processes from scratch or use available extensions to specific web frameworks.
  • Use a 3rd party provider.

Developing an authentication method from scratch would be a complete case of goal displacement, bearing in mind that my interest is to build NLP services for my would-be customers. Therefore, I use a 3rd party provider that offers an SDK for the languages I am using (Python and Node with Vue.js), which for me is Auth0.

Integrating Auth0 into FastAPI and Vue.js

I can hardly declare myself a security expert, and even competence is a stretch, so I relied on a blog post, available code examples, and the SDK for Python.

Build and Secure a FastAPI Server with Auth0

There is also a starter application for Vue.js. Putting in a secure foundation and then building out the components, services, and views on top is very attractive. Science doesn’t reinvent the wheel!

The Complete Guide to Vue.js User Authentication with Auth0

Given the quality of these articles, I felt there was no need for me to write much other than to reflect on my experience as I tackled the first part.

Securing FastAPI endpoints for NLP operations

It didn’t take long before I hit a snag, and I lost a lot of time figuring out what I did wrong. The GitHub repository contains the code, including the files for three different backend approaches:

  • Flask with Vue through the pre-built dist folder (backend.py)
  • FastAPI with Vue using the pre-built strategy (main.py)
  • FastAPI with Auth0 and the pre-built Vue strategy (mainAuth0.py)

The problem is visible in the following image:

Running main.py (FastAPI without authentication) - image by author

main.py defines two routes: one that returns a FileResponse for calls to "/", and an unsecured endpoint, "/api/public", which is supposed to return a JSON response object. But the server returned 404 Not Found for the unsecured endpoint, and it made no sense to me at all.

Photo by visuals on Unsplash

After a bit of trial and error, I traced the problem to this line:

app.mount("/", StaticFiles(directory="./dist", html=True), name="static")

Calling app.mount appears to have turned the instance into a sub-application restricted to serving static files, even though the documentation suggested otherwise, as the next screenshot shows.

Swagger UI view of main.py showing two endpoints - image by author
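
One possible explanation for the 404, offered here as an assumption because the route order in main.py is not shown: Starlette, which FastAPI builds on, checks routes in the order they were added, and a mount at "/" accepts every path. The toy router below is a simplified model of that first-match behaviour, not Starlette's actual code, and every name in it is hypothetical.

```python
def exact(route_path):
    """Matcher for a normal route: the request path must equal route_path."""
    return lambda path: path == route_path


def mounted_at(prefix):
    """Matcher for a mount: every path under the prefix belongs to it."""
    return lambda path: path.startswith(prefix)


def dispatch(routes, path):
    """Return the name of the first route whose matcher accepts the path."""
    for name, matches in routes:
        if matches(path):
            return name
    return "404"


# Same two handlers, registered in a different order.
mount_first = [
    ("static files", mounted_at("/")),
    ("public API", exact("/api/public")),
]
mount_last = [
    ("public API", exact("/api/public")),
    ("static files", mounted_at("/")),
]

print(dispatch(mount_first, "/api/public"))  # static files
print(dispatch(mount_last, "/api/public"))   # public API
```

In the mount-first ordering, the static file handler receives the request for /api/public, finds no such file in the dist folder, and answers 404, even though the API route is still registered and still listed in the Swagger UI.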

I accepted this design and tried to reconcile the significant difference with Flask to get around the snag. In mainAuth0.py, I changed the sub-application concept to resolve the issue.

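from fastapi import FastAPI
from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles

app = FastAPI()     # serves the API endpoints
appVue = FastAPI()  # serves the pre-built Vue frontend
appVue.mount("/", StaticFiles(directory="./dist", html=True), name="static")
@appVue.get("/")
async def home():
    return FileResponse("index.html")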

The sub-application, appVue, is reserved for serving the frontend HTML and associated static files, which appears to be a neat approach. A second application, app, is responsible for servicing the frontend calls.

@app.get("/api/public")
def public():
    ....
    return JSONResponse(content=json_compatible)

Running mainAuth0.py and visiting the Swagger UI:

mainAuth0.py swagger UI view showing two endpoints

We see a public and a private endpoint, so now we are ready to secure the private one. I followed the Auth0 code examples carefully and only hit one minor snag: I misspelt one of my Auth0 secrets, which burned a lot more time. Here are some highlights:

from fastapi.security import HTTPBearer  
token_auth_scheme = HTTPBearer()

Those lines extract the authorization bearer token from the request header. If the header is missing or does not carry a bearer token, HTTPBearer rejects the request as unauthenticated before the endpoint code runs.
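
Conceptually, the extraction step amounts to splitting the Authorization header into a scheme and its credentials. The function below is a simplified stand-in written for illustration, not FastAPI's own code; the name parse_bearer is hypothetical.

```python
from typing import Optional


def parse_bearer(authorization: Optional[str]) -> Optional[str]:
    """Return the credentials from an Authorization header, or None.

    None means the header is missing, uses another scheme, or has no token.
    """
    if not authorization:
        return None
    # Split on the first space only: "<scheme> <credentials>".
    scheme, _, credentials = authorization.partition(" ")
    if scheme.lower() != "bearer" or not credentials:
        return None
    return credentials


print(parse_bearer("Bearer abc.def.ghi"))  # abc.def.ghi
print(parse_bearer("Basic dXNlcjpwdw=="))  # None
print(parse_bearer(None))                  # None
```

Note that this step only pulls the token out of the header. Deciding whether the token is genuine is a separate step.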

from utils import VerifyToken
result = VerifyToken(token.credentials).verify()

VerifyToken is a class directly from Auth0, and the code shows how to use that class to verify a given token.

Putting it together, we see a secured endpoint. You need an Auth0 account and must have followed the steps in the tutorial to register an API. I used curl to fetch a bearer token from Auth0 to test my secure endpoint.
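
The token request itself is not shown here. One common way to get a test token is Auth0's client credentials grant, a POST to the tenant's /oauth/token endpoint; whether that is the exact flow used for this project is an assumption. The sketch below builds the request in Python rather than curl and does not send it. The function name and all placeholder values are hypothetical.

```python
import json
import urllib.request


def build_token_request(domain, client_id, client_secret, audience):
    """Build (but do not send) an Auth0 client-credentials token request."""
    body = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "audience": audience,
    }
    return urllib.request.Request(
        url=f"https://{domain}/oauth/token",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; substitute the ones from your own Auth0 dashboard.
request = build_token_request(
    domain="YOUR_TENANT.auth0.com",
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    audience="YOUR_API_IDENTIFIER",
)

# Sending it needs real credentials and network access:
# with urllib.request.urlopen(request) as reply:
#     access_token = json.load(reply)["access_token"]
```

Loading the secrets from environment variables or a file, rather than typing them in, avoids the kind of misspelling described above.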

First, I tried an invalid token to see where that got me!

curl -X 'GET' \
  --url 'http://127.0.0.1:8000/api/private' \
  --header 'Authorization: Bearer FastAPI is awesome'
The terminal window - using curl to send a wrong code to the secure endpoint
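
To see why a string like "FastAPI is awesome" can never pass, it helps to recall that a JWT is three base64url segments joined by dots. The sketch below inspects a token's header and payload using only the standard library. It is an illustration with hypothetical function names, it does not verify the signature, and it is no substitute for the Auth0 verification step.

```python
import base64
import json


def decode_segment(segment: str) -> dict:
    """Decode one base64url-encoded JWT segment into a dict."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def encode_segment(data: dict) -> str:
    """Encode a dict as an unpadded base64url segment (for the demo token)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def inspect_jwt(token: str):
    """Return (header, payload) if the token has JWT shape, else None.

    This only inspects the token; it does NOT verify the signature.
    """
    parts = token.split(".")
    if len(parts) != 3:
        return None
    try:
        return decode_segment(parts[0]), decode_segment(parts[1])
    except ValueError:
        # Covers bad base64, bad UTF-8, and bad JSON.
        return None


# A made-up, unsigned token used purely to exercise the functions.
sample = ".".join([
    encode_segment({"alg": "RS256", "typ": "JWT"}),
    encode_segment({"sub": "demo"}),
    "signature",
])

print(inspect_jwt("FastAPI is awesome"))  # None
print(inspect_jwt(sample))
```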

Sending the correct token to the endpoint demonstrates the desired return.

I used curl to test a secure endpoint with the correct JWT bearer token.

The NLP project now uses Auth0, and we can secure FastAPI routes. In the following article, we will wire up the Vue.js frontend with Auth0, enabling users to sign up, log in, log out, retrieve the token, and pass the token to the backend in the authorization header.

An unexpected 404 lost me a lot of time, but I learned a lot writing this one. Mostly that copy and paste is better than typing in your secrets by hand.

Photo by Aron Visuals on Unsplash

Join Medium with my referral link – David Moore

