
Building a Matching Tool to Help Start-Up Founders Find the Best Incubators: an End-to-End…

A project walkthrough to propose the best incubators for start-up founders, using Python, Pinecone, FastAPI, Pydantic, and Docker


Harness, a startup dedicated to assisting founders in their entrepreneurial journey, approached me to develop a tool that aids their community in finding the most suitable incubators: the Matching Tool.

In this article, we walk through the different stages of this project, from the solution design to the delivery.

Context

The company and its cofounders would like to create a tool that enables their community of start-up founders to find the best incubators & accelerators around the globe.

To do so, they manually collected data from incubator websites, including details such as location, various requirements, funding opportunities, and more. Additionally, they leveraged an engaged community of founders.

With the data from incubators and their community, they needed to find a way to retrieve the top-k incubators based on start-up information.

Challenge accepted.

Solution design

Overview

At first glance, the project looked like a recommender system, similar to those Netflix or Amazon use to suggest the best series or products to their users. From user behavior such as clicks, reviews, or upvotes, a company can anticipate and recommend the most suitable product.

Yet, in this particular scenario, we lacked any prior data on a founder’s preferences. Thus, building a Recommender System was unfeasible in this case.

An alternative approach could have involved embedding incubator and startup data into a vector space for a similarity search. Put simply, this method entails measuring the distance between vectors to identify the closest incubators in proximity to a given startup.
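
To make this concrete, here is a toy sketch of a similarity search (cosine similarity; all vectors and values are made up for the example):

# Toy illustration: rank incubators by cosine similarity to a start-up vector
import numpy as np

startup_vec = np.array([0.2, 0.1, 0.9])
incubator_vecs = np.array([
    [0.1, 0.0, 1.0],  # incubator A
    [0.9, 0.3, 0.1],  # incubator B
])
scores = incubator_vecs @ startup_vec / (
    np.linalg.norm(incubator_vecs, axis=1) * np.linalg.norm(startup_vec)
)
print(scores.argsort()[::-1])  # closest incubators first, here A before B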

But this approach had several drawbacks in this case.

Incubators have what I call hard criteria: factors that result in immediate rejection for any startup that doesn’t meet them. These might include the startup not being located in the same city as an incubator that requires hybrid or in-person attendance, or the absence of funding.

The presence of these hard criteria makes the use of embeddings on the entire dataset unsuitable in this case. For instance, even if an incubator perfectly aligns with a startup, recommending it to the founder would not be appropriate if applications are not currently open.

Finally, even if the majority of features could be transformed into numerical values (funding amount, previous funding amount accepted, start-up revenue expectation) or into categories (countries, attendance requirement, MVP ready), some features were just impossible to categorize due to their diversity:

  • funding vehicle: grant, 140k$, equity (SAFE), …
  • industry focus: medtech, AI, fintech, …

Additionally, these features had to be taken into account in the matching tool even though they weren’t deemed hard criteria. For instance, an incubator specializing in health tech might still be open to accepting a biotech startup.

A hybrid approach

To solve these problems, let’s consider the best of both worlds.

Since some incubators’ hard criteria would result in an immediate mismatch, we first filter out those incubators based on the start-up information. After narrowing down the list of potential matches, we perform a similarity search using the remaining soft criteria, transformed into a unified text and embedded into a vector.

And good news: Pinecone provides this feature to its vector database!

The Missing WHERE Clause in Vector Search | Pinecone

The project path is now clear:

  1. Incubators’ data needs to be preprocessed to enable filtering with the hard criteria and similarity search with the soft criteria. The data is then stored in a Pinecone vector database.
  2. The filter object has to be built with respect to the Pinecone Python library. It also needs to stay flexible enough to let the client easily modify the criteria without modifying the algorithm.
  3. The soft criteria need to be unified and transformed into an embedding, using an appropriate embedding model.
  4. Data being key here, we need to implement a data validation step for the start-up information, but also for upserting new incubator data to the vector database. We’ll use Pydantic.
  5. The algorithm will be served as an API in a Docker container. We’ll use FastAPI and create a Dockerfile to ensure the code works no matter the environment.
  6. Bonus: unit tests and integration tests will be set up to enable anyone to modify the code in a CI/CD manner.

All these points were discussed with the stakeholders and were accepted.

We’re ready to go!

Preprocessing the data

I received the incubators’ parsed information in a spreadsheet. At first sight, the data was quite chaotic: manual extraction without a clear process, strings instead of booleans, lack of consistency within the same feature, …

There is a lot of work to make the data usable.

Regarding null values in the dataset, each feature was treated independently.

For example, attendance requirements could be in-person, hybrid, or remote. In this case, incubators for which this feature was missing were considered as requiring in-person attendance.

Another example was the incorporation status of the start-up: incorporated or unincorporated. Instead of picking one of those 2 categories, it was more logical to add a third category as a default value: regardless. This is useful during the filtering stage to pick not only one of the main categories but also all the incubators that don’t specify it. We’ll talk about it in the Filtering section.
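
As a minimal sketch of this per-feature handling (assuming a pandas DataFrame and hypothetical column and file names):

# Hypothetical sketch of the per-feature null handling
import pandas as pd

df = pd.read_csv("incubators.csv")  # hypothetical export of the spreadsheet

# Missing attendance requirement -> treated as requiring in-person attendance
df["attendance_requirement"] = df["attendance_requirement"].fillna("in-person")

# Missing incorporation status -> third default category "regardless"
df["incorporation"] = df["incorporation"].fillna("regardless")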

Finally, we transformed the soft criteria into a single prompt to be embedded, using a simple prompt template. If new features need to be added later in the project, only this prompt needs to be updated.

# config.py
class Templates:
    embedding_template = """Industries accepted:
{industry_focus}

Funding vehicle:
{funding_vehicle}"""

# Example usage with illustrative values
prompt = Templates.embedding_template.format(
    industry_focus="medtech, AI", 
    funding_vehicle="grant"
)

Once the incubator data was preprocessed, it was then exported to the Pinecone vector database.

Build the vector database with incubator data

Pinecone provides an easy-to-use Python SDK to insert, modify, and query data from the vector database.

In our case, we need to upsert (insert or update) a vector representing the soft criteria in addition to the hard criteria.

According to Pinecone, the data should respect the following format:

# List[(id, vector, metadata)]
[
  ("A", [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], {"genre": "comedy", "year": 2020}),
  ("B", [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2], {"genre": "documentary", "year": 2019}),
  ("C", [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3], {"genre": "comedy", "year": 2019}),
  ("D", [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4], {"genre": "drama"}),
  ("E", [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], {"genre": "drama"})
]

Embedding

There are many models, open-source or not, to embed texts into a vectorial representation. In this case, we’ll use Sentence-BERT through the sentence-transformers Python library, which is designed to exploit open-source embedding models. You can check one of my previous articles where I describe how it works:

Semantic search using Sentence-BERT

The simplicity of this library makes it a good choice for building the first version of the matching tool.

# pip install -U sentence-transformers
from typing import List, Union

from sentence_transformers import SentenceTransformer

class SentenceTransformersEmbedding:
    """Embedding using the SentenceTransformers library (https://www.sbert.net)"""

    def __init__(
        self,
        model_name: str = "all-MiniLM-L6-v2"
    ) -> None:
        self.model = SentenceTransformer(model_name)

    def get_embeddding(self, texts: Union[str, List[str]]) -> List:
        # We need to return a list instead of an array for Pinecone
        return self.model.encode(texts).tolist()
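
For instance, embedding a soft-criteria prompt returns a plain Python list, ready to be upserted (the text here is illustrative):

embedder = SentenceTransformersEmbedding()
vector = embedder.get_embeddding(
    "Industries accepted:\nmedtech, AI\n\nFunding vehicle:\ngrant"
)
print(len(vector))  # 384 dimensions for all-MiniLM-L6-v2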

Prepare and export the incubator data

To upsert new incubators into the vector database, we prepare the data as introduced in the Pinecone documentation.

def prepare_from_payload(self, incubators: List[Incubator]) -> List[Tuple[str, List[float], Mapping[str, Any]]]:
    """Prepare payload containing incubators data to export to the Pinecone vector database.

    Args:
        incubators (List[Incubator]): list of Incubator objects containing the incubator information that will be sent to Pinecone. 

    Returns:
        List[Tuple[str, List[float], Mapping[str, Any]]]: prepared data for Pinecone. Check the official documentation (https://docs.pinecone.io/docs/metadata-filtering#inserting-metadata-into-an-index). 
    """
    data = []
    for incubator in incubators:
        metadata = {key: value for key, value in incubator.model_dump(exclude={"incubator_id"}).items()}
        # The template uses named placeholders, so we pass keyword arguments
        additional_information_text = Templates.embedding_template.format(
            industry_focus=incubator.industry_focus,
            funding_vehicle=incubator.funding_vehicle,
        )
        embedding = self.embedding_generator.get_embeddding(additional_information_text)
        incubator_data = (incubator.incubator_id, embedding, metadata)
        data.append(incubator_data)
    return data

As you can see in the code, we created the `Incubator` and `Incubators` objects with Pydantic’s `BaseModel`.

from datetime import date
from typing import List, Literal

from pydantic import BaseModel

class Incubator(BaseModel):
    incubator_id: str
    name: str 
    application_open: int = 1
    next_deadline: date = date.max
    funding_amount: int = 0  # Maximal amount the incubator can fund
    attendance_requirement: Literal["in-person", "remote", "hybrid"] = "in-person"
    # "regardless" is the default third category introduced during preprocessing
    incorporation: Literal["incorporated", "unincorporated", "regardless"] = "regardless"
    minimum_cofounders: int = 0
    minimum_employees: int = 0
    previous_funding_accepted: int = 1
    ...

class Incubators(BaseModel):
    incubators: List[Incubator]

This BaseModel class has two main benefits. Not only does it ensure the data is in the correct format for our algorithm and queries, but it also defines a default schema for the incubator data.

print(Incubator(
  incubator_id="id", 
  name="incubator_on_fire",
  industry_focus="Health tech",
  funding_vehicle="Grant"
))

# Output
{
    'incubator_id': 'id',
    'name': 'incubator_on_fire', 
    'application_open': 1, 
    'next_deadline': datetime.date(9999, 12, 31), 
    'funding_amount': 0, 
    'attendance_requirement': 'in-person', 
    'incorporation': 'regardless', 
    'minimum_cofounders': 0, 
    'minimum_employees': 0, 
    'woman_founders': 0,
    'student_founders': 0,
    'industry_focus': 'Health tech',
    'funding_vehicle': 'Grant'
     ...
}

The incubator data was then exported to the vector database using the Pinecone Python library. To allow other developers to implement this code within the overall architecture of the application, we used FastAPI:

import os

import pinecone
from fastapi import FastAPI, HTTPException

from app.models import Incubators
from config import VectorDatabaseConfig  # assumed to live alongside Templates in config.py
from embedding import SentenceTransformersEmbedding
from features import FeatureEngine

PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
ENVIRONMENT = os.getenv("ENVIRONMENT")

app = FastAPI()

@app.post("/upsert")
def upsert(incubators: Incubators):
    try:
        embedding_generator = SentenceTransformersEmbedding()
        feature_engine = FeatureEngine(embedding_generator=embedding_generator)
        data = feature_engine.prepare_from_payload(incubators=incubators.incubators)
        vectors = [pinecone.Vector(id=id, values=values, metadata=metadata) for id, values, metadata in data]
        pinecone.init(api_key=PINECONE_API_KEY, environment=ENVIRONMENT)
        index = pinecone.Index(index_name=VectorDatabaseConfig.index_name)
        index.upsert(vectors=vectors)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
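
To illustrate, assuming the API runs locally on port 8001 (as in the Docker section below), a client could upsert a new incubator like this:

# Hypothetical client-side call; URL and values are illustrative
import requests

payload = {
    "incubators": [{
        "incubator_id": "id",
        "name": "incubator_on_fire",
        "industry_focus": "Health tech",
        "funding_vehicle": "Grant",
    }]
}
response = requests.post("http://localhost:8001/upsert", json=payload)
response.raise_for_status()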

Once the data was exported, we were able to start querying the vector database using the start-up information.

Build the matching algorithm

The algorithm retrieves the top-k incubators in two steps:

  1. Filter out the irrelevant incubators,
  2. Perform the similarity search with the embedding vectors.

We also need to ensure that the algorithm stays flexible enough to add or change any data later in the project without touching the core of the algorithm.

But how to do it?

Here’s the solution I came up with:

Pinecone uses the same language as MongoDB to filter the database [source]. It looks like this:

import pinecone

pinecone.init(api_key=PINECONE_API_KEY, environment=ENVIRONMENT)
index = pinecone.Index("example-index")
index.query(
    vector=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    filter={
        "genre": {"$eq": "documentary"},
        "year": 2019
    },
    top_k=5,
    include_metadata=True
)

The filter map can also be more elaborate:

# $in statement
{
  "genre": { "$in": ["comedy", "documentary", "drama"] }
}

# Multi criteria
{
  "genre": { "$eq": "drama" },
  "year": { "$gte": 2020 }
}

# $or statement
{
  "$or": [{ "genre": { "$eq": "drama" } }, { "year": { "$gte": 2020 } }]
}

By including the start-up information in the query, we were able to retrieve the incubators whose requirements match: $gte (greater than or equal), $eq (equal), etc.

But some cases were more complex.

For example, the location and attendance requirements work in pairs. If an incubator accepts only hybrid or in-person, the start-up should logically be located in the same city/country as the incubator. But the matching tool should also present all incubators that accept remote, no matter where the start-up is located.

Another example: let’s say the start-up is led by women founders, or has built an MVP. Start-ups for which this statement is True should be proposed incubators that accept women founders only, or that require an MVP, in addition to all the other incubators.

As you can see from these examples, criteria can be categorized into different "templates" called `Criterion`. These criteria templates serve to build the `filter_object`, which is the filter mapping used by Pinecone/MongoDB.

Using Python classes, it looks like this:

from abc import ABC

class Criterion(ABC):
    """Incubator criterion template used to build the filter object.
    Each subclass of this class is a specific rule applied to incubator and start-up data.

    Args:
        name (str): incubator metadata name as it is in the vector database.
    """
    def __init__(
        self,
        name: str,
    ) -> None:
        self.name = name

class NormalCriterion(Criterion):
    """Basic rule to filter data based on this criterion.
    It takes this form:

        criterion.name = {criterion.condition_type: payload[criterion.startup_correspondance]}

    With `payload` the start-up information.

    Example:

        max_funding_amount = {"$gte": 10000}

    This will filter all incubators with a maximal funding capacity greater than or equal to 10000.

    Args:
        condition_type (str): comparison operator like "$eq" (equal), "$lte" (lower than or equal),
            "$gt" (greater than). The complete list is available in the Pinecone documentation
            (https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language).
        startup_correspondance (str): start-up correspondance from the payload
    """
    def __init__(
        self,
        name: str,
        condition_type: str,
        startup_correspondance: str
    ) -> None:
        self.condition_type = condition_type
        self.startup_correspondance = startup_correspondance
        super().__init__(name=name)


The parent class object `Criterion` is used to build several sub-classes, representing each case. If we take the _woman founders/MVP_ case introduced above:

class InclusiveCriterion(Criterion):
    """If the condition is validated, consider all incubators.

    Example:

    Being women founders should match women-founders-only incubators, but also all the other incubators.
    Same for MVP, ready to pay, student founders, etc.

        if woman_founders_startup (False) != condition (True):
            {"woman_founders_incubator": {"$eq": woman_founders_startup_value (False)}}

    Args:
        condition_type (str): comparison operator like "$eq" (equal), "$lte" (lower than or equal),
            "$gt" (greater than). The complete list is available in the Pinecone documentation
            (https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language).
        startup_correspondance (str): start-up correspondance from the payload (see matching_tool/app/models.py)
        condition (bool): if the condition is validated, consider the criterion
    """
    def __init__(
            self,
            name: str,
            condition_type: str,
            startup_correspondance: str,
            condition: bool
        ) -> None:
        self.condition_type = condition_type
        self.startup_correspondance = startup_correspondance
        self.condition = condition
        super().__init__(name=name)

Those `Criterion` classes are used along with their respective methods to build the `filter_object`:

def normal_case(
    payload: Mapping, 
    criterion: NormalCriterion, 
    filter_object: Dict
) -> Dict:
    """Simplest case: take the start-up value (funding amount, previous funding, etc.) and filter
    the vector database with respect to the condition_type ($eq, $lte, $gte, $gt, ...).

    Args:
        payload (Mapping): start-up information
        criterion (NormalCriterion): normal criterion 
        filter_object (Dict): the metadata filter used during the vector database query

    Returns:
        Dict: {metadata_name: {condition_type: startup_value}}
    """
    filter_object[criterion.name] = {
        criterion.condition_type: payload[criterion.startup_correspondance]
    }
    return filter_object
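
For example, with a hypothetical funding criterion and payload:

# Hypothetical walk-through of normal_case
criterion = NormalCriterion(
    name="funding_amount",
    condition_type="$gte",
    startup_correspondance="funding_amount",
)
print(normal_case({"funding_amount": 12000}, criterion, {}))
# {'funding_amount': {'$gte': 12000}}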

---

```python
def inclusive_case(
        payload: Mapping,
        criterion: InclusiveCriterion,
        filter_object: Dict
) -> Dict:
    """Inclusive case: prepare filter for inclusive case: women founder, student founders, MVP, other costs...
    If condition respected (women founders in startup == 1), therefore don't consider the criterion for filter => Take everything (incubators acccepting women only and all the others)
    Else: consider only incubators with not women founders => {women_founders: {"$eq: 0}} 

    Args:
        payload (Mapping): start-up information
        criterion (NormalCriterion): normal criterion 
        filter_object (Dict): the metadata filter during the vectordatabase query
    """

    if payload[criterion.startup_correspondance] != criterion.condition:
        filter_object[criterion.name] = {criterion.condition_type: payload[criterion.startup_correspondance]}
    return filter_object
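
Walking through the woman-founders example with hypothetical values:

# Hypothetical walk-through of inclusive_case
criterion = InclusiveCriterion(
    name="woman_founders",
    condition_type="$eq",
    startup_correspondance="woman_founders",
    condition=True,
)
# Start-up not women-founded -> only incubators without the requirement
print(inclusive_case({"woman_founders": False}, criterion, {}))
# {'woman_founders': {'$eq': False}}
# Start-up women-founded -> no filter added, all incubators pass
print(inclusive_case({"woman_founders": True}, criterion, {}))
# {}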

All these `Criterion` classes are stored inside another class object we call `Criteria`. This class acts as a repository of all the criteria to consider for filtering the database and can easily be modified to add or remove any criterion.

class Criteria:
    """Filter using Criterion templates.

    Add or remove any criterion you want with the adequate Criterion template.
    """
    country = DependendantCriterion(
        name="country",
        condition_type="$eq",
        startup_correspondance="country"
    )
    city = DependendantCriterion(
        name="city",
        condition_type="$eq",
        startup_correspondance="city"
    )
    attendance_requirement = ConditionalCriterion(
        name="attendance_requirement",
        condition=["remote"],
        true_criteria=[],
        else_criteria=[country, city]
    )
    minimum_cofounders = NormalCriterion(
        name="minimum_cofounders",
        condition_type="$lte",
        startup_correspondance="n_cofounders"
    )
    working_product_requirement = InclusiveCriterion(
        name="working_product_requirement",
        condition_type="$eq",
        startup_correspondance="working_product",
        condition=True
    )
    woman_founders = InclusiveCriterion(
        name="woman_founders",
        condition_type="$eq",
        startup_correspondance="woman_founders",
        condition=True
    )
...

Once all the criteria are added to the `Criteria` object, we iterate over it and build the `filter_object` based on the start-up information. For each `Criterion` case, we add a filter element to the `filter_object`.
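
The `get_criteria()` method used by the `Matcher` below isn’t shown in the article; a minimal sketch, assuming it simply gathers the `Criterion` attributes defined on the class, could look like this:

# Hypothetical sketch: collect every Criterion attribute defined on Criteria
import inspect

class Criteria:
    ...  # criterion attributes as above

    def get_criteria(self) -> List[Criterion]:
        return [
            value for _, value in inspect.getmembers(type(self))
            if isinstance(value, Criterion)
        ]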

class Matcher:
    """Retrieve incubators that match the start-up information from the vector database."""

    def __init__(
        self,
        index: Index,
        criteria: Criteria = Criteria(),
        embedder: Embedding = SentenceTransformersEmbedding(),
    ) -> None:
        """
        Args:
            index (Index): vector database index / table
            criteria (Criteria, optional): incubator metadata used to perform the search. Defaults to Criteria().
            embedder (Embedding, optional): embedding method to transform text into a vectorial representation
                for semantic search. Defaults to SentenceTransformersEmbedding().
        """
        self.index = index
        self.criteria = criteria
        self.embedder = embedder

    def _get_filter(
        self,
        payload: Dict[str, Any],
    ) -> Mapping[str, Any]:
        """Build the dictionary for filtering metadata on Pinecone.

        The filter object should respect the following format. Check the official Pinecone documentation
        to know more about it: https://docs.pinecone.io/docs/metadata-filtering

        Args:
            payload (Dict[str, Any]): start-up information

        Returns:
            Mapping[str, Any]: filter object, e.g.

            filter={
                'application_open': 1,
                '$or': [{'attendance_requirement': {'$in': ['remote']}}, {'country': {'$eq': 'estonia'}, 'city': {'$eq': 'tallinn'}}],
                'funding_amount': {'$gte': 12000},
                'other_costs': {'$eq': 0},
                'previous_funding_accepted': {'$eq': 1},
                'working_product_requirement': {'$eq': 0}
            }
        """
        # Initial filter: only consider incubators with open applications
        filter_object = {"application_open": 1}

        criteria = self.criteria.get_criteria()
        for criterion in criteria:
            if isinstance(criterion, NormalCriterion):
                if check_correspondance_in_payload(payload, criterion):
                    filter_object = normal_case(
                        payload=payload,
                        criterion=criterion,
                        filter_object=filter_object,
                    )
            if isinstance(criterion, InclusiveCriterion):
                if check_correspondance_in_payload(payload, criterion):
                    filter_object = inclusive_case(
                        payload=payload,
                        criterion=criterion,
                        filter_object=filter_object,
                    )
            if isinstance(criterion, ConditionalCriterion):
                if check_dependencies(payload, conditional_criterion=criterion):
                    filter_object = conditional_case(
                        payload=payload,
                        criterion=criterion,
                        filter_object=filter_object,
                    )
            if isinstance(criterion, DefaultCriterion):
                if check_correspondance_in_payload(payload, criterion):
                    filter_object = default_case(
                        payload=payload,
                        criterion=criterion,
                        filter_object=filter_object,
                    )
        return filter_object

As you can see in the code, we built four different `Criterion` templates to consider many cases: `NormalCriterion` , `InclusiveCriterion` , `ConditionalCriterion` , and `DefaultCriterion` .

Later in the project, more criterion templates can be added without changing the algorithm’s core, making it **customizable**.

Once the `filter_object` is created with the `_get_filter()` method, the vector database can be queried with the Pinecone `index.query()` method:

matches = self.index.query(
    vector=embedding, 
    filter=filter_object, 
    include_metadata=True, 
    top_k=top_k
)
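
The `match()` method itself isn’t shown in full in the article. A hedged sketch of how it could tie the filter and the embedding together (the payload field names are assumptions):

# Hypothetical sketch of Matcher.match(); payload field names are assumptions
def match(self, payload: Dict[str, Any], top_k: int = 5) -> Mapping[str, Any]:
    filter_object = self._get_filter(payload)
    # Embed the start-up soft criteria with the same template as the incubators
    text = Templates.embedding_template.format(
        industry_focus=payload.get("industry_focus", ""),
        funding_vehicle=payload.get("funding_vehicle", ""),
    )
    embedding = self.embedder.get_embeddding(text)
    return self.index.query(
        vector=embedding,
        filter=filter_object,
        include_metadata=True,
        top_k=top_k,
    )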

The matching tool algorithm is created. We then served it through an API endpoint using FastAPI and Pydantic.

@app.post("/match")
def search(payload: StartUp, top_k: int = 5) -> Mapping:
    LOGGER.info("Start matching.")
    try:
        payload = preprocess_payload(dict(payload))
        pinecone.init(api_key=PINECONE_API_KEY, environment=ENVIRONMENT)
        index = pinecone.Index(index_name=VectorDatabaseConfig.index_name)
        matching_tool = Matcher(index=index)
        matches = matching_tool.match(payload=payload, top_k=top_k)
        return matches
    except Exception as e:
        LOGGER.error(f"{str(e)}")
        raise HTTPException(status_code=500, detail=str(e))
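
To illustrate, a client could query the endpoint like this (the URL and start-up values are illustrative, echoing the filter example above):

# Hypothetical client-side call; values are illustrative
import requests

startup = {
    "country": "estonia",
    "city": "tallinn",
    "funding_amount": 12000,
    "n_cofounders": 2,
    "industry_focus": "medtech",
}
response = requests.post(
    "http://localhost:8001/match", json=startup, params={"top_k": 5}
)
print(response.json())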

As with `Incubator`, we created a `StartUp` object with Pydantic to ensure the start-up data comes in the right format:

from typing import Optional

class StartUp(BaseModel):
    country: Optional[str] = None
    city: Optional[str] = None
    funding_amount: Optional[int] = None
    n_cofounders: Optional[int] = None
    n_employees: Optional[int] = None
    woman_founders: Optional[bool] = None
    industry_focus: str = ""
    funding_vehicle: str = ""
    ...

An advantage of using Pydantic with FastAPI is that the API payload (here the start-up information) doesn’t have to be complete. If information is missing, Pydantic automatically replaces it with its default value, or the algorithm ignores it entirely (signaled by a None default).
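
For instance, a payload containing only the country would be completed like this (fields elided in the model above are omitted here too):

print(StartUp(country="estonia").model_dump(exclude_none=True))
# {'country': 'estonia', 'industry_focus': '', 'funding_vehicle': ''}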

The core of the API is now set up. We can now make the code ready for shipment using Docker and CI/CD with Pytest.

Delivering the API

Integration test with Pytest

During the development of the code, unit tests and integration tests were created to ensure no modification would break the algorithm.

Furthermore, writing these tests not only enables a CI/CD process but also gives my client indications about how the code is supposed to work.
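
As an illustration, a unit test of the filter builder could look like this (a hedged sketch; the actual test suite isn’t shown in the article):

# Hypothetical sketch of a unit test for the filter builder
def test_get_filter_always_requires_open_applications():
    matcher = Matcher(index=None)  # the index isn't needed to build the filter
    filter_object = matcher._get_filter({"funding_amount": 12000})
    # Whatever the payload, closed-application incubators must be filtered out
    assert filter_object["application_open"] == 1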

To build an integration test with FastAPI, we used the TestClient provided with the library. It uses the httpx library instead of requests to call the API.

The data used to validate the code is stored in an external JSON file, `data/integration_test_data.json`.

# integration_test.py

# pip install httpx
import json
import os
from pathlib import Path
from typing import Dict

from fastapi.testclient import TestClient

from app.api import app

URL = "/match"
client = TestClient(app)

DATA_PATH = Path(os.path.realpath(__file__)).parent / "data/integration_test_data.json"
with open(DATA_PATH, 'r') as data: 
    DATA = json.load(data)

def test_match():
    for test in DATA["match_tests"]:
        response = client.post(URL, json=test["payload"])
        assert response.status_code == 200
        payload: Dict = json.loads(response.content)
        match_ids = [match["incubator_id"] for match in payload.values()]
        for expected_id in test["expected"]:
            assert expected_id in match_ids

Once all tests passed, we created the Dockerfile to containerize the code.

Docker

To create a Docker container, we simply create a Dockerfile within the repository:

FROM python:3.9

WORKDIR /src

ENV PYTHONPATH=/src

COPY requirements.txt requirements.txt
COPY matching_tool/ .

RUN pip install -r requirements.txt

EXPOSE 8001

CMD ["uvicorn", "app.api:app", "--host", "0.0.0.0", "--port", "8001"]

Here’s what each line does:

  • FROM imports the Docker image from the hub, with all the basic elements required to run Python 3.9 in this case.
  • WORKDIR specifies the location of the code within the container.
  • ENV PYTHONPATH=/src specifies which directory Python has to look into to import internal modules.
  • COPY copies the files into the specified directory.
  • RUN is triggered during the Docker image creation, before the container is run. This way, pip install -r requirements.txt only runs once.
  • EXPOSE exposes a container port of our choice, here port 8001. The API port should match the container port.
  • CMD ["uvicorn", "app.api:app", "--host", "0.0.0.0", "--port", "8001"] runs the FastAPI API. It is important to set the host to 0.0.0.0 to enable calls from outside the container.

We then created the Docker image by running in the CLI:

 docker build -t matching-tool:latest -f Dockerfile .

Finally, to run the container, one has just to write:

docker run -p 8001:8001 --name matching-tool matching-tool

Once the container is running, anyone can call the API through port 8001. It is also possible to deploy the Docker container to any cloud provider, making the Matching Tool instantly functional.

The project was ready to be delivered.

Conclusions

In this article, I shared a real project I carried out for an American start-up.

From the data I was provided with, in addition to several iterations with the stakeholders, I developed a tool for start-up founders to find the best incubators regarding their needs. I explained step by step the process I followed and the different strategies I used to solve this problem.

The next step will be to embed this algorithm into the overall application and start collecting user data. This will initiate the flywheel necessary for any machine learning feature. Indeed, from this data representing users’ preferences, it will be possible to build a Recommender System that learns over time and proposes the best output for present and future founders.

It was a pleasure to work with Harness on this project. I wish them the best. They know they can call me for future collaborations.


If you like this article, feel free to join my newsletter. I share my content about NLP, MLOps, and entrepreneurship.

You can reach out to me on Linkedin, or check my Github.

If you’re a business and want to implement Machine Learning into your product, you can also book a call.

See you around and happy coding!

