
The rise of Large Language Models (LLMs) has ushered in a new era of text-based AI systems. Although these models are highly capable, their training predominantly focuses on English. The largest commercial LLMs can generate reasonable text in "low resource" languages, but smaller open-source models often fare poorly with non-European languages.
However, Meta trained the new Llama 3 model on a wider variety of languages, as they announced in a post when it was released [1].
To train the best language model, the curation of a large, high-quality training dataset is paramount. In line with our design principles, we invested heavily in pretraining data. … To prepare for upcoming multilingual use cases, over 5% of the Llama 3 pretraining dataset consists of high-quality non-English data that covers over 30 languages. However, we do not expect the same level of performance in these languages as in English. – Meta
Five percent doesn’t sound like much, but it’s more than in previous versions of Llama [2] and other small LLMs like Mistral [3]. To demonstrate this, I sent the same prompt in Korean to Llama 2 7B, Mistral 7B, and Llama 3. Note that the translations in parentheses are shown for convenience.
Rob: 팬픽션이란 무엇인가요? 한 문장으로 대답하세요. 한국어로 답장해주세요. (What is fan fiction? Answer using one sentence. Reply in Korean.)
Here’s how Llama 2 7B responded.
Llama 2 7B: Fan fiction,팬픽, is a type of creative writing that originated on the internet and involves fans imagining and writing stories about their favorite characters, often from movies, TV shows, books, or video games.
Although it showed the correct Korean phrase for fan fiction, and the answer itself was good, it didn’t write the response in Korean as prompted. This was a failure.
Here’s how Mistral 7B responded.
Mistral 7B: 팬픽션은 팬들이 작성한 팬아트로 작품의 팬던트를 나타내는 작품입니다. (Fan fiction is fan art written by fans that represents a work of art.)
It did respond in Korean, which is good. However, it has answered a different question: what is fan art? This was a failure, too.
Here’s how the Llama 3 model responded.
Llama 3: 팬픽션이란 팬덤에서 창작한 소설, 만화, 또는 다른 형태의 작품을 의미합니다. (Fanfiction refers to a novel, comic, or other form of work created by a fandom.)
This is much better. It responded in Korean, and the response was excellent. The phrase "form of work created by a fandom" is slightly off but still understandable.
Introducing FanFabler, my custom version of the Llama 3 model, fine-tuned to handle fan fiction queries in multiple languages. FanFabler is trained to understand and generate fan fiction content across 40 languages and to respond in the user’s own language. With this model, I aimed to enhance language understanding and the ability to create rich, diverse storytelling for global audiences.
Here’s how FanFabler responded to the prompt:
Rob: 팬픽션이란 무엇인가요? 한 문장으로 대답하세요. (What is fan fiction? Answer using one sentence.)
FanFabler: 팬픽션은 원작 속의 캐릭터들을 새로운 스토리로 조작하고 재해석하여 작성하는 문예 창작을 의미합니다. (Fanfiction refers to creative writing that manipulates and reinterprets original characters into a new story.)
First, notice how I didn’t have to prompt the model to respond in Korean. The fine-tuning taught it to always respond in the same language as the query. And this is an excellent answer to the question—much better than the answers from Mistral, Llama 2, and the original Llama 3.
Overview
In this article, I will guide you through the process of fine-tuning the Llama 3 model to enable users to create fan fiction in 40 languages. I’ll begin with a brief overview of fan fiction before diving into the technical details of how I customized the Llama 3 model for this purpose.
I’ll cover the data collection process, where I generated multilingual fan fiction queries and responses to build a comprehensive dataset. Next, I’ll explain the technical steps in fine-tuning the model, including pre-processing the data and setting up the training environment.
I’ll also describe how I configured the system to automatically retrieve relevant background information from Wikipedia during chat interactions, enhancing the model’s ability to generate accurate and context-rich responses in different languages.
Finally, I’ll share some insights into the performance of the fine-tuned model and show some results from interactions with the system.


Fan Fiction
Fan fiction is a genre of amateur writing in which fans create stories based on existing fictional works without official authorization. These stories can range from short snippets to full-length novels, utilizing characters, settings, and other elements from the original works. Fan fiction spans various subjects and media, including books, films, comics, and games. It is considered a form of fan labor.
The practice of writing fan fiction became notably defined in the 20th century as copyright laws evolved to differentiate between authorized and unauthorized uses of copyrighted characters. Despite its unofficial status, fan fiction varies widely in acceptance, with some original creators embracing it and others opposing it through legal means. The practice triggers ongoing debates over copyright, fair use, and the boundaries of creative expression within fan communities [4].
An early example of fan fiction is "New Adventures of Alice", written by John Rae in 1917, as shown above. This work was based on "Alice’s Adventures in Wonderland" by Lewis Carroll from 1865. The new story begins with a young girl named Betsy who longs for a new book about Alice. As she drifts into a dream, she discovers a book in the attic where Alice is depicted reading Mother Goose rhymes to her kittens, which sparks additional adventures.
Many fan fiction websites available today, like Archive of Our Own (AO3) and FanFiction.net, provide a platform for writers and readers to share and explore diverse fan-created content. These sites offer vast libraries of stories spanning multiple fandoms and genres, allowing fans to engage with familiar characters and settings in new and creative ways.
A notable fan fiction series, "Master of the Universe" by E. L. James, was originally set in the Twilight universe before evolving into the widely recognized Fifty Shades series. This transformation from fan fiction to a best-selling trilogy exemplifies the potential of fan-created works to achieve mainstream success.
FanFabler
Below is a diagram that shows the processes and components I used to fine-tune Llama 3 to help create fan fiction in multiple languages.

Using GPT-3.5 Turbo, I generated simulated conversations about writing fan fiction in 40 languages based on custom prompts. I collected 4,000 chat interactions and saved the results as a dataset. I then fine-tuned the 8-billion-parameter Llama 3 model on this data to create my FanFabler model.
With the fine-tuned model, users are free to ask for assistance with fan fiction in any language, and the system will generate helpful answers and engage in chat sessions. If needed, the system will check Wikipedia for relevant information about primary sources to help with the creative process.
Choice of Languages
As mentioned in the blurb above, Meta trained their Llama 3 model primarily on English text. However, they said over 5% of the pretraining data "covers over 30 languages." They didn’t specify which languages they used in their post. To select the languages for this project, I looked at the List of Wikipedias and chose the top 40 languages by number of articles.

The chart shows the number of Wikipedia articles by language. On the left, you can see a massive drop from 6.8 million articles in English to 2.9 million in German, continuing down to Hindi with 161,000 articles. I omitted the Simple English Wikipedia from the list because of its redundancy with English and arbitrarily added Swahili, with only 81,000 articles, to round out my set of 40 languages.
Generating Training Data
I generated the text for my multilingual fan fiction dataset using GPT-3.5 Turbo. To accomplish this, I created simulated chat interactions with user queries and responses from the chatbot. Here are the steps I used:
- I got a list of 20 fan fiction properties for each of the 40 languages (i.e., Star Wars, The Office, etc.).
- I created a series of 5 chat interactions for each property, starting with an initial question.
- I got answers to the first questions using Wikipedia language codes and search terms for background information.
- I got background text from Wikipedia or generated it if it was unavailable.
- I created follow-up questions for the user.
- I generated answers to the follow-up questions.
My dataset has 40 languages, 20 properties per language, and 5 interactions per property. If you do the math, there are 800 language/property pairs and 4,000 interactions in total.
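At a high level, the generation code is just three nested loops. Here is a simplified sketch; the helper functions are the ones shown in the sections below, while LANGUAGES and the record fields are illustrative placeholders (the background retrieval and follow-up questions described later are omitted for brevity).

# Simplified sketch of the generation loop. LANGUAGES is a placeholder for the
# list of 40 (language, script, wp_code) tuples, e.g. ("Swedish", "latn", "sv").
interactions = []
for language, script, wp_code in LANGUAGES:              # 40 languages
    raw = get_properties(language, script, wp_code)
    properties = json.loads(raw)["properties"]           # 20 properties -> 800 pairs
    for prop in properties:
        for _ in range(5):                               # 5 interactions -> 4,000 total
            question = get_first_question(language, script, wp_code,
                                          prop["local_name"], prop["property_type"])
            answer = answer_first_question(question)
            interactions.append({"language": language,
                                 "property": prop["local_name"],
                                 "question": question,
                                 "answer": answer})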
Here are some details and Python code for how I generated the data.
Fan Fiction Properties
The first step in generating my training data was getting a list of properties to be used as the basis for creating fan fiction. These could be books, movies, TV shows, etc. Here’s the code I used to generate this data.
from openai import OpenAI
from google.colab import userdata
import json
api_key = userdata.get('OPEN_AI_KEY')
client = OpenAI(api_key=api_key)
def get_properties(language, script, wp_code):
    prompt = f"""Here is info on a language:
Language: {language}
Script: {script}
Wikipedia Code: {wp_code}
Create a JSON list called "properties" of 20 properties that are popular with
{language} speakers who might want to write fan fiction.
The property type could be books, tv shows, movies, video games, etc.
Show a mix of locally produced and international properties for a total of 20.
List the common name for the property in the {language} language using the key
"local_name", even for international properties.
Add the English name to the list as "english_name".
Add the property type with the value in lowercase English as "property_type".
Add a Boolean "is_local" value to indicate if the property is locally produced
or not."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=[
            {
                "role": "user",
                "content": prompt
            }
        ],
        temperature=1
    )
    return response.choices[0].message.content
My function takes in a language, script type, and Wikipedia language code, for example, "Swedish," "latn," and "sv." I then prompted GPT-3.5 Turbo to create a JSON record with a list of 20 properties for fan fiction. Each record has a local name, English name, property type, and an indication of whether the property is local to the language. Here are three of the 20 results for Swedish.
{
  "properties": [
    {
      "local_name": "Pippi Långstrump",
      "english_name": "Pippi Longstocking",
      "property_type": "books",
      "is_local": true
    },
    {
      "local_name": "Sagan om Ringen",
      "english_name": "The Lord of the Rings",
      "property_type": "book",
      "is_local": false
    },
    . . .
    {
      "local_name": "Mamma Mia!",
      "english_name": "Mamma Mia!",
      "property_type": "musical",
      "is_local": false
    }
  ]
}
This sample of three includes one local property, "Pippi Longstocking", and two global properties, "The Lord of the Rings" and "Mamma Mia!", although arguably, the latter originated in Sweden, or at least the music did.
When working with an LLM like GPT-3.5 Turbo, it’s best to specify that the outputs be returned in JSON format. It’s so much easier to parse the results this way.
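For example, turning the response into Python objects is a single call to json.loads, with a small guard in case the model returns malformed JSON. This is a sketch rather than the exact code I used.

import json

def parse_properties(raw_response):
    # Parse the JSON string returned by get_properties() into a list of dicts.
    try:
        return json.loads(raw_response)["properties"]
    except (json.JSONDecodeError, KeyError):
        # GPT-3.5 Turbo occasionally returns malformed JSON; skip or retry in that case.
        return []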
The First Questions
After generating all 800 language/property pairs, I used the following code to generate, for each pair, five initial questions a user might ask about writing fan fiction.
def get_first_question(language, script, wp_code, local_name, property_type):
    prompt = f"""Create a brief query to an AI writing assistant in the
{language} language using the {script} script.
The language code is {wp_code}.
The query is about creating fan fiction for {local_name}, in the category
of {property_type}.
Keep it to one sentence. Use some variety.
"""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=[
            {
                "role": "user",
                "content": prompt
            }
        ],
        temperature=1
    )
    return response.choices[0].message.content
This function takes these parameters: language, script, wp_code, local_name, and property_type. It constructs the prompt string and calls GPT-3.5 Turbo to get the response. I wrote the code to return text this time, not a JSON object.
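For example, a call using the Swedish Pippi Longstocking entry shown earlier might look like this:

# Example call using the Swedish entry shown earlier.
first_question = get_first_question("Swedish", "latn", "sv",
                                    "Pippi Långstrump", "books")
print(first_question)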
The First Answers
My next step was to get GPT-3.5 Turbo to answer the five questions and provide a means to optionally look up and get some background information about the property in the user’s native language. Here’s the code:
def answer_first_question(first_question):
    system_prompt = """You are a multi-lingual writing assistant for fan fiction
for various properties. Start with some brief, general advice.
Write the name of the property and Wikipedia language code at the end after
three greater-than characters like this:
>>>property_name
>>>wiki_code"""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": first_question}
        ],
        temperature=1
    )
    return response.choices[0].message.content
For this function, the only input is the initial question. The system will automatically detect the language. My prompt coaxes the language model to answer the question and provide a command to get some background information from Wikipedia. The command is in this form:
>>>property_name
>>>wiki_code
Retrieving Background Texts
When my code sees the ">>>" characters in the output, it tries to retrieve some background information from Wikipedia. Using this method, the system can better help the user with properties the language model doesn’t know about.
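Extracting the search term and language code from the response takes only a few lines of string handling. Here is a minimal sketch, since the exact parsing code isn’t shown in this article:

def parse_wiki_command(response_text):
    # Collect the lines that start with ">>>": first the property name, then the wiki code.
    markers = [line.strip()[3:].strip() for line in response_text.splitlines()
               if line.strip().startswith(">>>")]
    if len(markers) >= 2:
        return markers[0], markers[1]   # (search subject, Wikipedia language code)
    return None, None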
Here’s the code to retrieve information from Wikipedia using the search API.
import requests
import time

def get_snippets(wp_code, subject, limit=5):
    search_url = f"https://{wp_code}.wikipedia.org/w/api.php"
    search_params = {
        "action": "opensearch",
        "search": subject,
        "limit": str(limit),
        "namespace": "0",
        "format": "json"
    }
    search_response = requests.get(search_url, params=search_params)
    search_data = search_response.json()
    titles = search_data[1]
    results = []
    for title in titles:
        extracts_params = {
            "action": "query",
            "format": "json",
            "prop": "extracts",
            "exsentences": 20,  # Estimate for 500 words
            "titles": title,
            "explaintext": True,
        }
        extracts_response = requests.get(search_url, params=extracts_params)
        extracts_data = extracts_response.json()
        pages = extracts_data['query']['pages']
        for page_id in pages:
            page = pages[page_id]
            snippet = page.get('extract', 'Not available.').replace('\n', ' ')
            results.append({"title": title, "snippet": snippet})
    return results
This code defines a function that interacts with the Wikipedia API. It retrieves short textual descriptions (snippets) of Wikipedia articles related to a given subject in a specified language. The function takes a Wikipedia language code (wp_code) and a search subject (subject). It returns a list of five results, each containing a title and a snippet of the article.
Here is the top result from calling it with wp_code set to "sv" and the subject set to "Pippi Långstrump."
Pippi Långstrump - Pippi Långstrump, Pippilotta Viktualia Rullgardina
Krusmynta Efraimsdotter Långstrump, är en av Astrid Lindgrens mest kända
litterära figurer och även namnet på den första boken om Pippi och
hennes vänner, Tommy och Annika. ...
Here is the first result again, translated into English:
Pippi Longstocking - Pippi Longstocking, Pippilotta Viktualia
Rullgardina Krusmynta Efraimsdotter Longstocking, is one of Astrid
Lindgren's most famous literary characters and also the name of the
first book about Pippi and her friends, Tommy and Annika. ...
The system did a good job finding relevant Swedish Wikipedia pages for Pippi Longstocking. To provide a variety of text in the dataset, I used snippets from all five pages for fine-tuning Llama 3. In cases when fewer than five results were returned, I asked GPT-4 to generate background information on the property for the specified language. This way, all 800 language/property pairs have five pieces of background text and chat interactions.
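The fallback logic amounts to topping up the results list with generated snippets. Here is a simplified sketch, where generate_background() is hypothetical shorthand for the GPT-4 call:

def get_background_texts(wp_code, subject, language, limit=5):
    # Try Wikipedia first, then top up any shortfall with generated text.
    results = get_snippets(wp_code, subject, limit=limit)
    while len(results) < limit:
        # generate_background() is hypothetical shorthand for a GPT-4 prompt
        # asking for background on the property in the specified language.
        snippet = generate_background(subject, language)
        results.append({"title": subject, "snippet": snippet})
    return results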
The Second Questions and Answers
I followed the same pattern to generate each entry’s second question and answer. For each entry, I sent the first question, the first answer, and the background text to produce the second question. Then my dataset was complete.
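Here is a sketch of what the follow-up generation might look like; the prompt wording is illustrative rather than the exact text I used:

def get_second_question(language, first_question, first_answer, background):
    # Illustrative prompt; the exact wording in my code differs.
    prompt = f"""Here is a conversation with an AI fan fiction writing assistant:
User: {first_question}
Assistant: {first_answer}
Background from Wikipedia: {background}
Write a brief follow-up question the user might ask next, in the {language} language."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=[{"role": "user", "content": prompt}],
        temperature=1
    )
    return response.choices[0].message.content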

The Multilingual-FanFic-Chat-4K Dataset
I saved the entire fan fiction dataset, including train, evaluation, and test splits, here on Huggingface. Feel free to peruse the entries.
robgonsalves/Multilingual-FanFic-Chat-4K · Datasets at Hugging Face
Llama 3 Model
Meta has been doing extensive research into LLMs. They introduced the original LLaMA models in February 2023 [6]. They followed up with Llama 2 in July 2023, which introduced chat variants of the models [2]. In April 2024, they released Llama 3, with several improvements for efficiency and accuracy. Note that Meta changed the model sizes in the latest release: Llama 2 had models with 7B, 13B, and 70B parameters, while Llama 3 has models with 8B and 70B parameters.
Here’s what they said about their new models [1].
Compared to Llama 2, we made several key improvements. Llama 3 uses a tokenizer with a vocabulary of 128K tokens that encodes language much more efficiently, which leads to substantially improved model performance. To improve the inference efficiency of Llama 3 models, we’ve adopted grouped query attention (GQA) across both the 8B and 70B sizes. … Llama 3 is pretrained on over 15T tokens that were all collected from publicly available sources. Our training dataset is seven times larger than that used for Llama 2 … – Meta
They improved the model’s architecture, increased the vocabulary size from 32K to 128K tokens, and adopted GQA for more efficient inferencing. They also trained the new models on a much larger dataset.
Fine-tuning
Although the Llama 3 8B model works well out of the box, fine-tuning can improve it for specific tasks.
There are several methods to fine-tune Llama 3. Meta shows how to use torchtune and Huggingface here. I found an excellent package called unsloth with a working Google Colab, which I used for this project. I mostly used the defaults from the original fine-tuning Colab; the only changes I made were to use an evaluation dataset and to adjust some of the hyperparameters, as discussed below.
Here’s the code to load in the Llama 3 model.
from unsloth import FastLanguageModel
from transformers import TextStreamer
import torch
max_seq_length = 1024
dtype = None
load_in_4bit = True
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-Instruct-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = token
)
I used a 4-bit version of Llama 3 8B to keep the memory size low. I set the maximum sequence length to 1,024, which helps contain memory use.
I then loaded my dataset using this code.
from datasets import load_dataset
# Load in the training and evaluation datasets
train_dataset = load_dataset("robgonsalves/Multilingual-FanFic-Chat-4K",
                             split="train")
test_dataset = load_dataset("robgonsalves/Multilingual-FanFic-Chat-4K",
                            split="test")

# Format the data
train_dataset = train_dataset.map(formatting_prompts_func, batched=True,
                                  remove_columns=train_dataset.column_names)
test_dataset = test_dataset.map(formatting_prompts_func, batched=True,
                                remove_columns=test_dataset.column_names)
# Shuffle the datasets
train_dataset = train_dataset.shuffle(seed=42)
test_dataset = test_dataset.shuffle(seed=42)
I loaded in the splits for the training and evaluation datasets, formatted the text, and shuffled the data. This randomizes the order of the entries to prevent the model from learning any unintended patterns that might exist in a sequence of languages and properties.
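The formatting_prompts_func referenced above comes from the unsloth Colab and wraps each example in the Llama 3 chat template. Below is a rough sketch assuming hypothetical column names, not the actual function:

def formatting_prompts_func(examples):
    # Rough sketch only: the real function comes from the unsloth Colab, and the
    # column names here ("system", "user", "assistant") are hypothetical.
    texts = []
    for system, user, assistant in zip(examples["system"], examples["user"],
                                       examples["assistant"]):
        messages = [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]
        # Render each conversation with the tokenizer's Llama 3 chat template.
        texts.append(tokenizer.apply_chat_template(messages, tokenize=False))
    return {"text": texts}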
The system uses Low-Rank Adaptation (LoRA) for efficient training [7]. Instead of updating all of the model’s parameters during fine-tuning, the changes accumulate in separate, smaller low-rank matrices that are then added to the frozen weights. This greatly reduces the number of trainable parameters, which helps both speed and memory use.
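To get a feel for the savings, consider a single 4096 × 4096 projection matrix in Llama 3 8B with a LoRA rank of 64 (the value I use below); the numbers here are purely illustrative:

# Rough parameter count for a single 4096 x 4096 projection matrix (illustrative).
d_in, d_out, r = 4096, 4096, 64
full_params = d_in * d_out          # 16,777,216 frozen weights in the matrix W
lora_params = r * (d_in + d_out)    # 524,288 trainable weights in the low-rank factors
print(lora_params / full_params)    # ~0.03, i.e. about 3% of the matrix's size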
Here is the code that sets up the model for training with LoRA.
model = FastLanguageModel.get_peft_model(
    model,
    r = 64,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 64,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = True,
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)
Through experimentation, I increased the rank r from 16 to 64, which enlarges the set of trainable parameters. I also increased lora_alpha from 16 to 64, which scales how strongly the LoRA updates influence the model. Making these changes improved the quality of interactions with the finished model.
Here’s the code for fine-tuning the model, which uses Huggingface’s Supervised Fine-tuning Trainer (SFTTrainer) from their Transformer Reinforcement Learning (TRL) package.
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = train_dataset,
    eval_dataset = test_dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        num_train_epochs = 1,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        evaluation_strategy = "steps",
        eval_steps = 100
    ),
)
I mostly used the defaults in the unsloth Colab. The only change I made was to use my test dataset for evaluation during training. You can see an explanation of all of the training parameters here.
The training ran for 2 hours and 20 minutes on an Nvidia L4 GPU with 22 GB of RAM. Here are the graphs for training and evaluation loss during fine-tuning.

The training and evaluation loss graphs indicate a successful fine-tuning process, with a sharp initial decline in loss followed by a gradual decrease, reflecting ongoing learning. The parallel reduction in training and evaluation loss signifies that the model improved its accuracy without overfitting, demonstrating effective generalization to data not used for training.
My last step was to save the model to Huggingface using this line of code.
model.push_to_hub_merged("robgonsalves/fan-fabler-lora", tokenizer,
                         save_method = "lora", token = token)
Setting the save method to lora indicates that only the tiny set of trainable parameters will be saved. The complete Google Colab for training is here.
Using the Model
After I completed the fine-tuning and saved the model to Huggingface, I loaded it again using this code.
from unsloth import FastLanguageModel
from transformers import TextStreamer
import torch
max_seq_length = 1024
dtype = None
load_in_4bit = True
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "robgonsalves/fan-fabler-lora",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit)
FastLanguageModel.for_inference(model)
Running this code loads the LoRA weights into the original Llama 3 model and prepares it for use. The last line enables a 2X speed-up for inferencing.
Here’s the code that sets up the prompts.
system_prompt = """Respond to the user's query based on the conversation
and the context. Respond in the same language as the user's query.
To get more context from Wikipedia, indicate a Wiki page and
language code at the end of the response using this format:
>>>search term
>>>language code"""
prompt = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{}<|begin_of_text|><|start_header_id|>user<|end_header_id|>
{}<|begin_of_text|><|start_header_id|>assistant<|end_header_id|>"""
This sets up the same system prompt I used during training and the prompt format that is used by Llama 3. Here’s the code to ask FanFabler about "Curb Your Enthusiasm".
question = ("List three ideas for writing fan fiction about Curb Your "
            "Enthusiasm where Larry stumbles upon a time machine.")
text = prompt.format(system_prompt, question)
inputs = tokenizer([text], return_tensors = "pt").to("cuda")
text_streamer = TextStreamer(tokenizer, skip_prompt=True)
resulting_tokens = model.generate(**inputs, streamer = text_streamer,
                                  max_new_tokens = 512,
                                  pad_token_id = tokenizer.eos_token_id,
                                  do_sample = True)
I used the text streamer functionality, which shows the results in real-time as they come out of the model. Setting do_sample=True enables sampling, so the model can produce a different answer each time it runs.
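Since model.generate() returns the prompt tokens along with the new ones, the reply can also be recovered as a plain string after streaming finishes; here is a small sketch:

# Strip the prompt tokens and decode only the newly generated answer.
prompt_length = inputs["input_ids"].shape[-1]
answer = tokenizer.decode(resulting_tokens[0][prompt_length:],
                          skip_special_tokens=True)
print(answer)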
Results
Here are a couple of interactions with FanFabler.
Curb Your Enthusiasm (English)
Rob: List three ideas for writing fan fiction about Curb Your Enthusiasm where Larry stumbles upon a time machine.
FanFabler: Here are three ideas for writing fan fiction about Curb Your Enthusiasm where Larry stumbles upon a time machine:
- Larry travels back in time to one of his past mishaps and tries to rectify the situation, but his actions in the past end up having unintended consequences on his present and future. Chaos ensues as Larry navigates through different eras, creating further mishaps that put his relationships on the line.
- Larry uses the time machine to relive iconic moments from great periods in history, but his meddling with famous figures and events proves disastrous. Larry’s attempts to alter historical outcomes ultimately lead to unforeseen outcomes that put both him and those around him in a state of disarray.
- Larry becomes obsessed with the ability to use the time machine to relive and redo past romantic and professional interactions. As he repeatedly jumps between different timelines, he must grapple with the concept of regret and the moral implications of altering his own destiny and the lives of others.
Rob: What would a good time period be? And who is the primary historical figure?
FanFabler: For a time travel fan fiction story set in the Curb Your Enthusiasm universe, considering a period such as the 1920s during the Prohibition era in the United States could provide an interesting setting. Larry David, as the main character, finds himself transported through time and immersed in a world of flappers, jazz, and organized crime. The primary historical figure could be a character like Al Capone, representing the powerful figure of organized crime during this era. This historical figure could serve as the pivot point for Larry’s time travel adventures, creating an intriguing dynamic between the fictionalized version of Larry David and the historical figure of Al Capone. The cultural and social differences of the time period could add a new layer of humor and exploration to the story.
This is an interesting take. The choice of the 1920s Prohibition era offers a creative and humorous contrast between Larry’s antisocial sensibilities and the historical setting, where his antics can create absurd scenarios with figures like Al Capone. This dynamic has the potential for unique comedic tension, though balancing humor with the historical context is challenging and requires a sensitive approach.
Pippi Longstocking (Swedish)
Here is an interaction in Swedish about fan fiction for Pippi Longstocking. English translations are provided in parentheses.
Rob: Vilka är tre scenarier för fanfiction för Pippi Långstrump? (What are three scenarios for fan fiction for Pippi Longstocking?)
FanFabler: Här är tre möjliga fanfiction-scenarier för Pippi Långstrump:
- Pippi Långstrump möter Pippi Långstrump från en annan dimension, och de två Pingisarna börjar en spännande kapplöpning genom olika världar för att fånga en viktig hemlighet.
- Pippi Långstrump dyker in i en förvandling och upptäcker att hon har ett nytt sätt att kontrollera sitt magiska styrka, vilket utmanar alla att tro på henne.
- Pippi Långstrump stöter på ett mystiskt område kallat Mysteriösa Öarna, där alla har en förbannelse och Pippi måste hitta en sätt att bryta den för att rädda hennes vänner och alla människor i området.
(Here are three possible fanfiction scenarios for Pippi Longstocking:
- Pippi Longstocking meets Pippi Longstocking from another dimension, and the two Pippis begin an exciting race through different worlds to capture an important secret.
- Pippi Longstocking dives into a transformation and discovers she has a new way to control her magical power, challenging everyone to believe in her.
- Pippi Longstocking comes across a mysterious area called the Mysterious Islands, where everyone has a curse and Pippi must find a way to break it to save her friends and all the people in the area.)
Rob: Berätta mer om de två Pippis. Hur skiljer de sig åt? (Tell me more about the two Pippis. How are they different?)
FanFabler: Pippi Långstrump kan ha både gamla och yngre versioner av sig själv i fan fiction. Den gamla Pippi kan vara en pionjär och en spelflicka med en äventyrlig och rebellisk hållning. Hon kan ha genomlevt många äventyr och ha en stor erfarenhet av att stå emot kreativa och osannolika utmaningar. Den yngre Pippi eller Pippi-tjejen skulle kunna vara mer nyfiken och stödja tjejevänner och vänner i deras egna äventyr. Hon skulle ha en annan syn på världen och möjligen vara mer öppenhet inför nyheter och koncept. Deras olika personligheter skulle kunna skapa intressanta dialoger och interaktioner i ett fanfiction berättande. Dessutom skulle deras samverkan kunna skapa rik utveckling för karaktärerna och berättelse. Genom att skildra det kvinnliga sidan av äventyren och hitta västerländskt kvinnokännedomens kraft och kreativitet kunde man skapa en engagerande berättelse med djup inom denna fanfiction.
(Pippi Longstocking can have both old and younger versions of herself in fan fiction. Old Pippi can be a pioneer and a gamer with an adventurous and rebellious attitude. She may have lived through many adventures and have a great deal of experience in facing creative and unlikely challenges. The younger Pippi or Pippi girl could be more curious and support girlfriends and friends in their own adventures. She would have a different view of the world and possibly be more open to news and concepts. Their different personalities could create interesting dialogues and interactions in a fan fiction narrative. Additionally, their collaboration could create rich development for the characters and story. By portraying the female side of the adventures and finding the power and creativity of Western femininity, one could create an engaging story with depth within this fan fiction.)
It’s interesting to have interactions between alternate versions of the main character. An older, experienced Pippi is portrayed as a pioneer with an adventurous spirit, while a younger version brings curiosity and supports her friends in their adventures. The interplay between these two Pippis, with their different personalities and perspectives, could lead to a new spin on character development and dialogue within the fan fiction narrative.
You can experiment with FanFabler using the Google Colab here.
Conclusion
FanFabler builds on the advancements of large language models like Llama 3 to enhance multilingual fan fiction writing. It employs fine-tuning with a curated dataset and techniques like Low-Rank Adaptation (LoRA), tailoring it to support writing in 40 languages. The project underscores the importance of thoughtful data collection and advanced training methods, utilizing GPT-3.5 Turbo and Wikipedia for contextual information. Sharing the results aims to inspire further experimentation in creative AI, showcasing the potential of customized language models for specialized applications.
Source Code and Colab
This project’s code is available on GitHub. The dataset and fine-tuned model are available on Huggingface. I released all of these under the CC BY-SA license.

Acknowledgments
I want to thank Jennifer Lim for her help with this project.
References
[1] Meta, Introducing Meta Llama 3: The most capable openly available LLM to date (2024)
[2] H. Touvron et al., Llama 2: Open Foundation and Fine-Tuned Chat Models (2023)
[3] A. Q. Jiang et al., Mistral 7B (2023)
[4] B. Gribben, Fanfiction: A Legal Battle of Creativity (2016), Reporter Magazine
[5] D. M. Eberhard et al., Statistics (2024), Ethnologue: Languages of the World (27th ed.)
[6] H. Touvron et al., LLaMA: Open and Efficient Foundation Language Models (2023)
[7] E. Hu et al., LoRA: Low-Rank Adaptation of Large Language Models (2021)