Data for Change

This project was completed by Alex Hardy, Raya Abourjeily, and me, Ani Madurkar.
This story will discuss our project as a whole, including identifying the problem, qualifying assumptions, justifying methods, evaluating results, and more. It won’t walk through the full codebase, so if you’re interested in the code or the Tableau dashboard design, I recommend checking out our Github repo or downloading the respective Tableau dashboards on Tableau Public.
Table of Contents
- Problem Statement
- Partnering with Domain Experts
- Our Approach
- Exploratory Data Analysis in Tableau
- Predicting Action/Disposition from LEMIS
- Clustering Shipments from Panjiva
- So What?
- Statement of Work
Problem Statement
Illegal wildlife trafficking has escalated into an international crisis in the past decade, posing a critical conservation concern and a threat to global security for a multitude of countries. Illegal wildlife trade is driven by high profit margins (estimated value of $7.8-$10 billion per year) and undermines security protocols across nations (The White House 2014).
Additionally, due to an increase in human population, the demand for wildlife has grown and this has pushed already vulnerable species closer to extinction. Illegal wildlife trafficking, like much illicit trade, is often done through black market channels that eventually get masked into a trusted supply chain route.
This allows black market dealers to profit off illegal activities and evade repercussions, while the final customers (zoos, museums, etc.) appear to be acquiring legally sourced wildlife products. Currently, efforts to disrupt these illicit supply chains often rely on intuition rather than data-driven solutions that enable proactive measures to stop illicit trafficking.
Illicit wildlife trade networks are complex, but through Data Science we plan to provide new insights and solutions for addressing this challenge. Advanced data analytics can draw connections between incidents and flag risk areas where companies may be unknowingly enabling the transportation of illegal wildlife cargo, helping hold companies accountable and supporting delivery on the US National Strategy for Combating Wildlife Trafficking.
Partnering with Domain Experts
Since this social good problem was a new one for all of us, we needed to partner with experts and get their feedback frequently. Partnering closely with domain experts ensured our solutions were valuable and actionable.
The primary point of contact for us was Dr. Neil Carter; this excerpt is directly from his website, "Dr. Neil Carter’s interdisciplinary research examines the complex dynamics that characterize interactions between wildlife and people (e.g., provision of ecosystem services, conflicts) in a global change context. His work addresses local to global wildlife conservation issues, utilizes a multitude of spatial techniques and tools, engages different stakeholders, and informs policymaking".
To ensure we were properly suited to deliver value, we held weekly stand-ups to demo progress and strategize next steps with Neil. He brought awareness to some notable data work that has previously been done in this space:
- Global Trends and Factors Associated with the Illegal Killing of Elephants: A Hierarchical Bayesian Analysis of Carcass Encounter Data
- Dissecting the Illegal Ivory Trade: An Analysis of Ivory Seizures Data
- Detecting Illegal Timber Trade
Through frequent discussions, we quickly noticed the dearth of effective data solutions for this multi-billion dollar problem. Although data work is likely being done somewhere, the results do not appear to be widely available and are thus of limited value to law enforcement officers and conservation practitioners. We found this lack of knowledge sharing creates a gap felt by researchers and officers alike.
Neil’s connections linked us to Robert (Bob) Herndon, who works at the main United States UPS shipping headquarters in Kentucky and specializes in wildlife (animals and products). We conducted a semi-formal user research interview to understand the problem better and get his perspective on it. Bob confirmed our suspicion that the gap in knowledge sharing leaves him and his peers relying heavily on intuition when assessing more than 1.2 million packages daily. Thanks to his extensive experience, Bob is able to quickly scan documents with shipping information and identify suspicious shipments that might need to be searched for illicit materials. Our discussion with Bob gave us a lot of hope that data can be a useful tool for people in this domain, especially since he relies on his intuition and expert experience to assess a large quantity of data each day. Even so, he and his colleagues would really appreciate having automated and scalable solutions capable of extending their intuition.
We were also able to connect with one of Neil’s peers, Dr. Meredith Gore. Dr. Gore’s website states, "I use risk concepts to build new understanding of human-environment relationships. My research is designed to build evidence for action. The majority of my activities can be described as convergence research on conservation issues such as wildlife trafficking, illegal logging, fishing and mining". Her vast knowledge spans multiple domains, which helped us identify two prominent needs for researchers and officers in this space:
- Broad data analysis that is easily accessible and helps domain experts ask the right questions
- An extensive set of big data solutions that provide interpretable and actionable results
We decided to focus our solutions on these two needs specifically. For our short-term project, there weren’t really any ethical considerations we had to be wary of. The main considerations we took into account were keeping the needs of the domain experts at the forefront of everything we did and ensuring our assumptions were validated by their experience.
Our Approach

We leveraged two datasets to provide insight on (1) what features predict whether a shipment is successfully seized for having illicit wildlife products and (2) whether those seizure patterns can reveal other hidden illicit trade dynamics using a larger, publicly available (for a fee) database of shipments to the US. The LEMIS dataset includes labeled data on 15 years of importation of wildlife and their derived products into the United States (2000–2014), originally collected by the United States Fish and Wildlife Service. The Panjiva dataset was manually downloaded through a paid Panjiva account and includes unlabeled data on imported shipments (2007–2021) related to wildlife, covering HS codes 01, 02, 03, 04, and 05, which represent animals and animal products. There is no sensitive data in either dataset. The labels/target variable in LEMIS represent the outcome for a given shipment: Abandoned, Cleared, Reexport, or Seized. Although Panjiva’s data was primarily used to assess latent seizure patterns in illicit trade dynamics, the lack of labels forced us to rely on our own interpretation and context to evaluate the results.
We analyzed both datasets in conjunction mainly because LEMIS data only covers 2000–2014, while Panjiva gave us access to recent imports into the US. Additionally, Panjiva data showed more information for the shipment’s ledger, such as supplementary data for the consignee, port, etc. LEMIS did not have much of that, but it did have supplementary data for the goods shipped, such as the taxa, genus, etc.
We created two ETL scripts that read in files from each data source and output cleaned files that simplify downstream tasks. After cleaning, each file feeds a Tableau dashboard that can be viewed on Tableau Public. The cleaned files are also used by a machine learning web application built with Streamlit.
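As a rough illustration (file paths and column handling here are placeholders, not our exact scripts, which live in the Github repo), each ETL step looks something like this:

```python
import pandas as pd

def clean_source(raw_path: str, out_path: str) -> pd.DataFrame:
    """Read a raw extract, apply light cleaning, and write a file for downstream use."""
    df = pd.read_csv(raw_path, low_memory=False)
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]  # normalize headers
    df = df.dropna(axis=1, how="all")  # drop columns that carry no data at all
    df.to_csv(out_path, index=False)
    return df

# Hypothetical paths -- the real scripts live in our Github repo
lemis = clean_source("data/raw/lemis_2000_2014.csv", "data/cleaned/lemis_clean.csv")
panjiva = clean_source("data/raw/panjiva_hs01_05.csv", "data/cleaned/panjiva_clean.csv")
```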
The Tableau dashboards were built to explore each dataset thoroughly. We created a variety of visualizations that allow the user to explore each dataset from multiple dimensions and perspectives. Although our data quality was limited, we wanted to create an open exploratory dashboard that assists domain experts in asking effective questions.
The Streamlit application was built to predict the action/disposition (target variable) from LEMIS and to cluster shipments from Panjiva. The user is able to quickly iterate on various supervised/unsupervised models by selecting different hyperparameters and then evaluate the models using techniques like LIME, feature importances, WordClouds, and more.
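To give a sense of the interaction pattern, here’s a minimal Streamlit sketch; the widget names and defaults are illustrative rather than the app’s exact layout:

```python
import streamlit as st

st.title("LEMIS Action/Disposition Prediction")

# Sidebar controls let the user iterate on models and hyperparameters
model_name = st.sidebar.selectbox("Model", ["SGDClassifier", "Random Forest", "XGBoost"])
max_depth = st.sidebar.slider("max_depth", min_value=2, max_value=12, value=6)
n_estimators = st.sidebar.slider("n_estimators", min_value=100, max_value=600, value=300, step=100)

if st.sidebar.button("Train model"):
    st.write(f"Training {model_name} (max_depth={max_depth}, n_estimators={n_estimators})...")
    # Model training, LIME explanations, feature importances, and WordClouds
    # would be rendered here in the full application.
```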
Over time, we hope to feed our best model’s predictions back into Tableau to enable uses of those predictions that go beyond model evaluation.
LEMIS Tableau Dashboard:

Panjiva Tableau Dashboard:

Streamlit Application:

Exploratory Data Analysis in Tableau
Since we collaborated with stakeholders from various backgrounds on this project, it was extremely important that visualizations of the data could be easily explored and accessed by them. To achieve this, we created two interactive and publicly available Tableau dashboards, one for each dataset, that allow the end user to explore the data through various combinations of visualizations and filters. Each dashboard contains a home page, which provides a big-picture overview of the dataset, as well as several other tabs that allow the user to analyze the data through different lenses. These include, but aren’t limited to: where the shipment came from and went to, what type of wildlife product was in the shipment, and the percentage of refused vs. cleared shipments.
LEMIS Dashboard
- Overview Tab: A big picture overview of shipments in the LEMIS dataset.
- Country Tab: An in depth look at the country of origin, country of export, and port of import for shipments.
- Taxa & Description Tab: An in depth look at the description, taxa, and general name of the wildlife product in the shipments.
- Refused Shipments Tab: A focused look on the percent of total shipments that were refused based on certain attributes.
Panjiva Dashboard
- Overview Tab: A big picture overview of shipments in the Panjiva dataset.
- Country Tab: A focus on the shipment origin, shipment destination region, and port of lading and unlading regions.
- Consignee Tab: A focus on the details of the shipper and consignee of the shipments.
All visualizations on each of the dashboards can act as a filter for the rest of the dashboard, which allows the dashboards to be customizable for each user depending on their needs. For the LEMIS dashboard in particular, the user can filter by whether the shipment was refused or accepted into the U.S. as an import and can compare how the attributes of each type of shipment differ.
As part of creating these dashboards, we consulted with subject matter experts and end users to ensure that the dashboards addressed their needs. In turn, we have received actionable feedback on how to improve the dashboards for their next iteration. We learned from Dr. Gore that it is critical to focus not only on what is shipped but on how it was shipped. One example of this is looking at the co-occurrence of different products being shipped together, which our dashboards do not currently address. Another important topic Dr. Gore recommended we address further is a trend analysis in comparison to various policies introduced regarding wildlife trade. She noted that there has been a "massive increase in fines and sanctions since 2018" and that it would be useful to understand if and how wildlife trade has been affected by this. Overall, by conducting user interviews we were able to ensure that our dashboards were user-friendly and addressed their objectives while collecting information on how to improve our dashboards in the future.
Both dashboards can be accessed on Tableau Public through the following links:
Predicting Action/Disposition from LEMIS
Goal
The goal of this analysis was to effectively predict the action/disposition for a given shipment in LEMIS data. The expected users of this analysis in the short term are future researchers hoping to understand the discriminative features that contribute to a shipment being Abandoned, Cleared, Reexported, or Seized. In time, we see the potential for this application to be used in real time to assist officers by extending their intuition as they assess millions of packages a day.
Data Cleaning & Manipulation
The first data cleaning step involved joining the main LEMIS file to a file that stores codes as key/value pairs. This allowed us to unstack two columns (unit and value) into a series of columns (weight (kilograms), volume, etc.). After further analysis, we noted that many of the numerical measures in this dataset were null and eventually ended up getting dropped.
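A minimal sketch of that reshaping, using hypothetical column names (a shipment identifier, a unit code, and its value) rather than the exact LEMIS schema:

```python
import pandas as pd

codes = pd.read_csv("data/raw/lemis_code_key.csv")   # hypothetical code -> description lookup
lemis = lemis.merge(codes, how="left", on="unit")    # attach readable unit descriptions

# Spread each (unit, value) pair into its own column, e.g. weight_kilograms, volume
measures = lemis.pivot_table(index="control_number", columns="unit_description",
                             values="value", aggfunc="first")
lemis = lemis.merge(measures, how="left", left_on="control_number", right_index=True)
```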

We also combined the action and disposition fields in this dataset into our target variable. Actions can only be Cleared or Refused, while Disposition represents what happens afterward: a Refused shipment can be Abandoned, Reexported, Seized, or Cleared. Creating one target column here set up a clear multi-class classification problem.
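A small sketch of how that combined target can be built; the column names here are our shorthand, not necessarily the exact LEMIS field names:

```python
import numpy as np

# 'action' is Cleared ('C') or Refused ('R'); 'disposition' records what happened
# to a refused shipment (Abandoned, Cleared, Reexport, Seized).
lemis["target"] = np.where(lemis["action"] == "C", "Cleared", lemis["disposition"])
lemis["target"].value_counts()
```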
Feature Engineering
LEMIS also has a variety of taxonomy features such as taxa, class, genus, and species. These columns contain numerous nulls, but due to the nature of our problem, missing data is quite meaningful for us. Our dataset was already provided in a relatively clean state, which allowed us to assume the missing data was not due to system errors or something unwanted but mainly due to unknown information. So we replaced nulls with "unknown" and also created a "complete_percent" column, which represents how many of the taxonomy columns are filled out. Our hypothesis was that this would help our models discriminate illicit shipments more effectively, since someone shipping illicit goods would likely leave many fields blank (i.e., unknown) to obscure the origin and identity of the illicit products. Finally, we combined the taxonomy columns into one string, which enabled us to use text vectorization.
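A sketch of this step, using an illustrative subset of the taxonomy columns:

```python
taxonomy_cols = ["taxa", "class", "genus", "species"]  # illustrative subset

lemis[taxonomy_cols] = lemis[taxonomy_cols].fillna("unknown")

# Fraction of taxonomy fields that are actually filled in for each shipment
lemis["complete_percent"] = (lemis[taxonomy_cols] != "unknown").mean(axis=1)

# One string per shipment so the taxonomy can be count-vectorized later
lemis["taxonomy_text"] = lemis[taxonomy_cols].astype(str).agg(" ".join, axis=1)
```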
Although the majority of the numerical fields were null, we did find a decent number of entries for value. The issue with the value feature is that it has a lot of outliers due to high-value shipments. We addressed this by winsorizing the field, clipping outliers below the 5th percentile and above the 95th percentile to the values at those limits.
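In code, the winsorization looks roughly like this; SciPy expresses the limits as the tail fractions to clip, so 0.05 on each side corresponds to the 5th and 95th percentiles:

```python
import numpy as np
from scipy.stats.mstats import winsorize

# Clip value below the 5th and above the 95th percentile
lemis["value"] = np.asarray(winsorize(lemis["value"].astype(float), limits=(0.05, 0.05)))
```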
Our last feature engineering step involved using SMOTE to balance our target classes. We had an overwhelming number of Cleared samples in our dataset, so we upsampled the minority classes to ensure balance.
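A minimal sketch with imbalanced-learn; SMOTE needs a numeric feature matrix, so it runs on the encoded training data from the preprocessing step sketched just below, and never on the holdout set:

```python
from imblearn.over_sampling import SMOTE

# X_train_enc / y_train are the encoded training features and target
# (see the preprocessing sketch below); the holdout set is left untouched.
X_train_bal, y_train_bal = SMOTE(random_state=42).fit_resample(X_train_enc, y_train)
```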

Before modeling, we one-hot encoded the categorical variables, robust scaled the numerical variables, and count vectorized the text variable. One-hot encoding made sense for the categorical features since none of them has a sensible hierarchy in which certain categories should carry higher value. The robust scaler handled the outliers in our dataset well, as it removes the median and normalizes by the IQR. Lastly, count vectorization made sense for the text feature since it isn’t a true ‘free text’ field, and simply counting occurrences of taxonomies is sufficient.
To avoid data leakage, we fitted the transformers on the training data and only transformed the validation/holdout set.
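A sketch of that preprocessing with scikit-learn’s ColumnTransformer; the column groupings are illustrative:

```python
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, RobustScaler
from sklearn.feature_extraction.text import CountVectorizer

X_train, X_holdout, y_train, y_holdout = train_test_split(
    lemis.drop(columns="target"), lemis["target"],
    test_size=0.2, stratify=lemis["target"], random_state=42,
)

cat_cols = ["country_origin", "port", "foreign_company"]   # illustrative groupings
num_cols = ["value", "complete_percent"]
text_col = "taxonomy_text"   # CountVectorizer needs a single column name (1-D input)

preprocessor = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
    ("num", RobustScaler(), num_cols),
    ("txt", CountVectorizer(), text_col),
])

X_train_enc = preprocessor.fit_transform(X_train)     # fit on training data only
X_holdout_enc = preprocessor.transform(X_holdout)     # never re-fit on the holdout set
```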
Model Development & Evaluation
For modeling, we attempted a series of five different supervised models. We started with a simple LogisticRegression and found it learned slowly. We pivoted to SGDClassifier, which uses stochastic gradient descent to improve speed and performance during training. Although this performed better, we weren’t able to achieve significantly better evaluation metrics even after applying grid search cross-validation over a variety of hyperparameters. We then pivoted to tree-based methods: Decision Tree, Random Forest, and lastly eXtreme Gradient Boosted Trees (XGBoost).
After repeated iterations and testing our models on a holdout set, we found XGBoost yielded the strongest results. This aligns with its track record as one of the most successful models on Kaggle leaderboards, reaffirming our expectations. Quoting the documentation from the XGBoost site, "XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides a parallel tree boosting (also known as GBDT, GBM) that solve many data science problems in a fast and accurate way".
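A sketch of the final model fit; the hyperparameter grid and scoring choice here are illustrative, not our exhaustive search:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import LabelEncoder
from xgboost import XGBClassifier

le = LabelEncoder()
y_train_num = le.fit_transform(y_train_bal)   # XGBoost expects integer class labels

param_grid = {"max_depth": [4, 6, 8], "n_estimators": [200, 400], "learning_rate": [0.05, 0.1]}
search = GridSearchCV(
    XGBClassifier(objective="multi:softprob", eval_metric="mlogloss", tree_method="hist"),
    param_grid, cv=3, scoring="f1_macro", n_jobs=-1,
)
search.fit(X_train_bal, y_train_num)

y_pred = le.inverse_transform(search.predict(X_holdout_enc))
```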

Training on a balanced dataset, our model actually performs quite well at detecting the non-Cleared classes. We still find our models hitting a plateau due to the sheer number of Cleared shipments in our data. Given the large number of nulls (or unknowns) in the dataset and the varying rules and criteria for action/disposition by geographic location, we theorized our models would only get marginally better with the data we had. It also makes sense that Cleared and Reexport confuse our model a bit because of those varying rules; since most shipping centers and ports won’t want to store a large number of ‘suspicious’ shipments, it’s easier to Reexport in times of confusion than to Abandon or Seize. Our discussion with Bob suggested this decision boundary depends on the subjective intuition of the officer.

To properly evaluate our model’s performance and scalability, we leveraged learning curves with cross-validation alongside our evaluation metrics.
The curves reflect our earlier theory that more data/examples help the model, considering how many of the shipments are Cleared. Even with upsampling our dataset, we only have a limited number of unique scenarios yielding the various target classes.
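A sketch of how such curves can be produced with scikit-learn:

```python
import numpy as np
from sklearn.model_selection import learning_curve

sizes, train_scores, val_scores = learning_curve(
    search.best_estimator_, X_train_bal, y_train_num,
    cv=5, scoring="f1_macro", train_sizes=np.linspace(0.1, 1.0, 5), n_jobs=-1,
)
print(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1))
```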
Extracting the feature importances from our model, we can see what the model deems important for discriminating between the classes. We can also split the features by data type to see the importances for each kind.
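With a recent scikit-learn (where ColumnTransformer exposes get_feature_names_out), the global importances can be pulled out roughly like this; the transformer prefixes (cat__, num__, txt__) are what let us split them by data type:

```python
import pandas as pd

importances = pd.Series(
    search.best_estimator_.feature_importances_,
    index=preprocessor.get_feature_names_out(),
).sort_values(ascending=False)

print(importances.head(20))                                              # top global features
print(importances[importances.index.str.startswith("txt__")].head(10))  # text features only
```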


We start to see certain generic names rise to the top, such as ‘reptilia’ and ‘anthozoa’, while the foreign company being listed as ‘unknown’ is also a significant feature. The unknown foreign company is interesting as it lines up well with our intuition: we’d find a shipment fairly suspicious if a significant field such as the foreign company (consignee) isn’t listed on the manifest. We also see a number of ports picked up by our model, like Del Rio, Juneau, Brownsville, and more. Our discussions with domain experts made it evident that each port or shipping center could follow a different set of guidelines and regulations; some could be easier to pass illicit shipments through than others. Although we can’t use these feature importances to conclude that these are "bad" ports, we can use them as educated inspiration for further investigation.
Finally, we use LIME for straightforward multi-class model explainability. Loading our model’s predictions into a LIME explainer, we generate an explanation for a random false prediction to see what the model is learning.
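A sketch of the LIME setup, assuming the sparse matrices and fitted objects from the earlier sketches:

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    np.asarray(X_train_bal.todense()),
    feature_names=list(preprocessor.get_feature_names_out()),
    class_names=list(le.classes_),
    mode="classification",
)

i = 42  # hypothetical index of a misclassified holdout row chosen for inspection
exp = explainer.explain_instance(
    np.asarray(X_holdout_enc[i].todense()).ravel(),
    search.best_estimator_.predict_proba,
    num_features=10, top_labels=1,
)
exp.save_to_file("lime_example.html")
```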

One false prediction is not enough to draw firm conclusions, but we can see that features from the global feature importances, like reptilia and the Juneau port, contribute to the model deciding this shipment "should be" Seized instead of Cleared.
Future consultation with domain experts about these models will help us establish the validity of the results and point us toward next steps for model iteration. It will also let us demonstrate the value of advanced data solutions in this space: a robust and transparent system that partners with experts’ intuition to make decisions at scale and with speed.
Predicting Action/Disposition from LEMIS in Streamlit can be accessed by the following link:
Clustering Shipments from Panjiva
Goal
The goal of this analysis was to find interesting patterns in the structure of the Panjiva data. Similar to the previous objective, the expected users of the analysis are future researchers hoping to understand the relationships between shipments. Without labeled data, we hope to surface commonalities among packages that provide insight into what is likely to be illicit and what is not.
Data Cleaning & Manipulation
An analysis of the categorical data showed that a lot of it was missing. Most of the data in the dataset is free-form text, so we made an early decision that missing data actually contains valuable information. Similar to what we did for the LEMIS dataset, we relabeled nulls in prominent columns with ‘unknown’. However, some columns were almost never filled in, so we dropped the ones that didn’t make sense to model on.
Feature Engineering
There were two feature sets we wanted to explore. The first (manifest clustering) dealt strictly with the information provided by the shipper, with minimal manipulation. The second (word embedding) dealt with the free-form text field describing the shipment contents.
Manifest Clustering: The idea here is that the manifest contains some factual information about the shipment (consignee, ports of entry, weight, etc.). This dataset mixes categorical and continuous data types, and we wanted to cluster on this factual information to establish relationships between shipments. Categorical columns were converted to categorical codes, while numerical data was standardized and normalized.
Each categorical column was checked against the others for correlations, and columns that correlated highly with one another were manually dropped. A final feature was created to represent the percentage of columns in each shipment that were missing; the thought here was that the amount of missing data might play an important part in clustering packages together, since illicit shipments might deliberately hide information. Additionally, missing values for an individual shipment were coded with a value of -2 instead of dropped (if missing data were dropped, there would be almost no rows of data left; as stated above, missing data carries meaningful information in this domain).
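A sketch of the manifest feature preparation under these conventions (categorical codes, -2 for missing, a missing-percentage feature, and scaled numeric columns); the exact column handling in our scripts differs:

```python
from sklearn.preprocessing import StandardScaler

# Keep the free-form description out of the categorical manifest features
cat_cols = panjiva.select_dtypes(include="object").columns.drop("description")
num_cols = panjiva.select_dtypes(include="number").columns

# Fraction of fields missing per shipment, computed before any filling
panjiva["missing_pct"] = panjiva.isna().mean(axis=1)

for col in cat_cols:
    codes = panjiva[col].astype("category").cat.codes   # NaN becomes -1 by default
    panjiva[col] = codes.replace(-1, -2)                 # our convention: missing == -2

panjiva[num_cols] = StandardScaler().fit_transform(
    panjiva[num_cols].fillna(panjiva[num_cols].median())
)
```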
Word Embedding: Panjiva data also contains a free-form description field. We wanted to cluster packages based on what the shipper claims is in them as well. From meeting with our domain experts, we learned that illicit trade is often hidden as legitimate trade, so the shipper has to lie about what is in the shipment. By clustering this free-form data and combining it with the shipping manifest information above, researchers can hopefully identify packages that are unusual. To do this, we created two word embeddings, using a count vectorizer and a tf-idf vectorizer. Prior to vectorization, the following pre-processing was performed to ensure we were working with adequate text to vectorize:
- Stop word removal
- Digit removal
- Single letter removal
- Escape character removal (\n, \r, etc.)
- Removal of words containing any digits
The last three steps (single letter removal, escape character removal, and removal of words containing digits) were added after many iterations of vectorization. Each vectorization iteration was manually inspected to see which term/document frequencies were being picked up as important, and the tokenization was edited until meaningful words were being represented.
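A sketch of the cleaning and the two vectorizers; the cleaning function compresses the bulleted steps above, and "description" is our shorthand for Panjiva’s free-form goods description field:

```python
import re
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

def clean_text(doc: str) -> str:
    doc = re.sub(r"[\n\r\t]", " ", doc.lower())            # strip escape characters
    tokens = [t for t in re.split(r"\W+", doc)
              if t.isalpha() and len(t) > 1]               # drop digits, single letters,
    return " ".join(tokens)                                # and tokens containing numbers

count_vec = CountVectorizer(preprocessor=clean_text, stop_words="english")
tfidf_vec = TfidfVectorizer(preprocessor=clean_text, stop_words="english")

X_count = count_vec.fit_transform(panjiva["description"].fillna(""))
X_tfidf = tfidf_vec.fit_transform(panjiva["description"].fillna(""))
```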
Model Development & Evaluation
Manifest Clustering: For the manifest feature set, we wanted to explore models that handle categorical and continuous variables together (we also tried continuous-only models, but discussion with domain experts suggested these are less valuable). Different model types were explored before settling on KPrototypes clustering for its performance and ability to handle large datasets. KMeans was not used for the categorical data, as it would treat the categorical codes as continuous values, which is mathematically inaccurate. Using the elbow method, we determined that 5 clusters explained the relationships between shipments sufficiently, as shown in the cost curve below.
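A sketch of this step, assuming the kmodes package for KPrototypes and using the model’s cost for the elbow curve:

```python
from kmodes.kprototypes import KPrototypes

manifest = panjiva[list(cat_cols) + list(num_cols)]   # manifest-only feature set
cat_idx = list(range(len(cat_cols)))                  # categorical columns come first

costs = {}
for k in range(2, 11):
    kp = KPrototypes(n_clusters=k, init="Cao", random_state=42, n_jobs=-1)
    kp.fit_predict(manifest.to_numpy(), categorical=cat_idx)
    costs[k] = kp.cost_                               # plot cost vs. k for the elbow

labels = KPrototypes(n_clusters=5, init="Cao", random_state=42).fit_predict(
    manifest.to_numpy(), categorical=cat_idx
)
```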

Separation between the clusters was also visually inspected, and while a two-dimensional representation of a multidimensional space is challenging, we believe most of the representations show good separation between the clusters, indicating that our assumptions in the feature creation steps were solid.


Interestingly, the shipments were not sorted into equal numbers per cluster, as can be seen below:

This is not unexpected, as the goal of this study was to find unique shipments for researchers to investigate. We believe the imbalance is due to the fact that a lot of similar items related to wildlife animals and products are being shipped. It’s likely that even illicit shipments are labeled as a "frequently" shipped item to avoid suspicion and flagging.
Word Embedding: Prior to model creation, we wanted to check the performance of the vectorizers themselves. For this, we used a cosine similarity metric. Many random packages were inspected alongside their top 10 closest matches in the entire dataset, using the NearestNeighbors module. Some samples of the results are below (the first row in each example is the row being compared):
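A sketch of that spot check, assuming the tf-idf matrix from the vectorization sketch above:

```python
from sklearn.neighbors import NearestNeighbors

nn = NearestNeighbors(n_neighbors=11, metric="cosine").fit(X_tfidf)

i = 123  # arbitrary shipment to spot-check
distances, indices = nn.kneighbors(X_tfidf[i])
print(panjiva["description"].iloc[indices.ravel()])   # the row itself plus its 10 closest matches
```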



As can be seen, the vectorizer seems to be parsing the free-form fields very well. From here, a KMeans model was built on the vectorized corpus to cluster the shipments. As with the manifest clustering, the elbow method was used to determine the appropriate number of clusters:

While there isn’t an immediate elbow like before, we do see diminishing returns from adding more clusters. We chose 50 clusters for the rest of this analysis, but we leave the actual choice open to the researcher via the Streamlit application.
The performance of the model was then contextually inspected for each cluster. WordClouds were built both to check model performance and to give the researchers an indication of what each cluster contained. Some samples can be seen below:
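A sketch of the clustering and the per-cluster WordClouds, assuming the wordcloud package and the tf-idf matrix and cleaning function from above:

```python
from sklearn.cluster import KMeans
from wordcloud import WordCloud

km = KMeans(n_clusters=50, random_state=42, n_init=10).fit(X_tfidf)

cleaned = panjiva["description"].fillna("").map(clean_text)
for c in range(3):                                   # a few sample clusters
    text = " ".join(cleaned[km.labels_ == c])
    wc = WordCloud(width=800, height=400, background_color="white").generate(text)
    wc.to_file(f"cluster_{c}_wordcloud.png")
```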


We hope to use these WordClouds for specific date ranges and geographical locations to narrow down the shipment manifests we focus on. We have also requested that Bob provide a ‘suspicion’ label for our unlabeled Panjiva dataset, which would provide significant insight into which clusters our models are able to discern.
Clustering Shipments from Panjiva in Streamlit can be accessed by the following link:
So What?
Alex, Raya, and I all believe that although cutting-edge models, algorithms, and tools are fun and exciting, what matters most is the value you provide to some aspect of the world with data. We happened to fall into the world of global wildlife trafficking thanks to Dr. Neil Carter generously sharing his time and knowledge. We spent the majority of our time understanding the problems domain experts are facing and what their experiences really are. We found that their data needs primarily revolved around:
- Broad data analysis that is easily accessible and helps domain experts ask the right questions
- An extensive set of big data solutions that provide interpretable and actionable results
We hope that our Tableau dashboards and Streamlit application raise awareness of the prominent issues in this space. Providing easy-to-access advanced analytics at scale can help future researchers and practitioners ask better questions and use evidence to investigate illicit trafficking. We hope it inspires them to openly share and use data, as doing so can have immense value in shifting current solutions from reactive to prescriptive.
Statement of Work
ETL Scripts – Ani Madurkar
Data Manipulation – All Team Members
Data Visualization and Feature Engineering – All Team Members
Exploratory Data Analysis in Tableau – Raya Abourjeily
Predicting Action/Disposition from LEMIS – Ani Madurkar
Clustering Shipments from Panjiva – Alex Hardy
Data Science Team Lead – Ani Madurkar
Special thanks to Dr. Neil Carter for sharing so much of his time with us. Much of our progress was due to his domain expertise and eagerness to help.