
What if, despite all safety measures like advanced configurations, asymmetric cryptography, and port knocking, an attacker gains access to one of my cloud servers?
How and when would I find out? Wouldn’t I want to know as early as possible that one of my servers was compromised?
I kept asking myself these questions and decided to create a solution to monitor active user sessions with Prometheus and Grafana. The goal I had in mind was to get a notification as soon as a new user session, e.g., via SSH, was created on one of my servers. Of course, this wouldn’t prevent attackers from gaining access to the system, but it would at least expose their activities.
You can see the final result in the image below. This is a screenshot of my Grafana instance. The graph shows a red line whenever I logged into my system and an alarm was triggered, and a green line after I logged off again. Each of these state changes triggered a notification via email.

So here’s the story about the newest addition to my Open Source Software (OSS) family: a Prometheus exporter for active users on UNIX systems.
If you’re already familiar with Prometheus and Grafana or want to jump directly to the implementation details of the exporter, click here.
The problem
Our company started to embrace elastic computing around four years ago, in 2017. I was lucky enough to be among the first users of our – at that time – disruptive plan to use cloud infrastructure for our daily business. And it was not until I set up my own private cloud servers that I finally realized that, yes, public cloud servers tend to be publicly available – to anyone on the internet.
This wasn’t a huge deal at work because consuming a service is different from offering it. I didn’t have to care about the security of our virtual machines at work because another team was responsible for it. Amazon Web Services (AWS) obviously learned this a long time ago and published what it calls the shared responsibility model. This model describes the responsibilities of AWS and its customers in a world where Amazon happily offers access to its servers and services. And while AWS takes care of, e.g., physical access to servers, customers have to take care of virtual access to the data stored on their virtual machines.
Now, because we cannot – for obvious reasons – get physical access to AWS’s servers and plug an Ethernet cable into them, we use tools and protocols like SSH to access our cloud servers via the public internet. And although there are ways to create private and hybrid networks on cloud providers like AWS or Azure, these features might only be an option for enterprise customers willing to spend the time and money to build such networks.
Being a Software Architect with an affinity for software development, I decided last year, around this time in summer, to start building my first Software-as-a-Service (SaaS) product. Not being interested in maintaining physical hardware, I decided to purchase some cloud servers too.
So, I read several articles about configuring OpenSSH and fail2ban to drop all traffic from IPs that tried and failed to log in via OpenSSH. Still, I had a bad feeling about the exposure of my servers, so for the long term, I planned to receive some form of email notification whenever a user logged in via SSH. But a monitoring system was not yet in place.
Time for a side-hustle is limited for everyone who has a full-time job, and you have probably experienced this, too. So it was not until March of this year that I was finally able to set up the monitoring infrastructure with Prometheus and Grafana.
The solution
Prometheus and Grafana are a classic combination of open-source tools for monitoring and alerting that abstract away the complexity of storing and visualizing metrics. Together, both systems can collect and receive metrics from services (Prometheus) and visualize the current and historical state of metrics, aggregations, and derivations (Grafana). In addition, both systems can define thresholds and conditions that trigger notifications via email, Slack, Discord, etc. What they need to provide their functionality is data.
A data provider can be an application instrumented with a Prometheus library or an exporter in the Prometheus ecosystem. An exporter is a standalone application that collects metrics from another tool or even the host machine and exposes these metrics via an HTTP endpoint.
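For illustration, the response of such a metrics endpoint is plain text in the Prometheus exposition format. A metric for active user sessions could look like the following sample (the metric name and label are just an example, not necessarily the ones my exporter uses):

```
# HELP active_user_sessions Number of active user sessions per user
# TYPE active_user_sessions gauge
active_user_sessions{user="stefan"} 1
```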
Because I wanted to collect metrics from the operating system, writing an exporter was the way to go. The missing piece was a UNIX command that returned some information about the currently logged-in users. A quick Google search resulted in various commands like w, who, and users.
Out of these, I found the w command most appealing as it returned not only the name of the currently logged-in users but also – per user –
- the IP address,
- the time of login,
- the name of the current process.
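For illustration, the output of w typically looks like this (user and IP address are made up):

```
 16:20:01 up 12 days,  3:41,  2 users,  load average: 0.08, 0.03, 0.01
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
stefan   pts/0    203.0.113.42     16:01    2.00s  0.04s  0.00s -bash
stefan   pts/1    203.0.113.42     16:15    0.00s  0.02s  0.00s w
```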
That’s some pretty helpful information for the goal I wanted to achieve and the last missing puzzle piece. So, to receive notifications from the monitoring system whenever a user logged into my servers, I needed to
- call the w command and parse the output, and
- export the metrics with a simple server application over HTTP.
The implementation
As Prometheus itself was built with Go, a lot of exporters are built with Go, too. However, I’m much more familiar with Node.js, so I chose to implement this exporter with Node.js.
Parsing the output of the w command was as trivial as iterating over each line of the output and matching the value of each column to the corresponding column of the header row. The function you can see below expects to receive the command’s output as a string and returns an Array that contains zero or more objects, one for each active session.
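A minimal sketch of such a function – assuming the column layout shown in the sample output above, and with names of my own choosing rather than the exporter’s actual ones – could look like this:

```javascript
// Sketch: parse the output of `w` into an array of session objects.
// The first line is the uptime summary, the second line the header row.
// Caveat: this naive split breaks for WHAT values that contain spaces;
// a real parser would need to treat the last column specially.
function parseWOutput(output) {
    const lines = output.trim().split('\n')
    if (lines.length < 3) {
        return [] // no active sessions
    }
    const headers = lines[1].trim().split(/\s+/)
    return lines.slice(2).map((line) => {
        const columns = line.trim().split(/\s+/)
        const session = {}
        headers.forEach((header, index) => {
            session[header] = columns[index]
        })
        return session
    })
}
```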
Wanting to extract the number of active sessions per user, I further reduced the Array to an Object that contained a key for each active username, with the value being all sessions associated with this user.
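As a sketch, that reduction could look like this (again with hypothetical names):

```javascript
// Sketch: group the parsed sessions by username.
function groupSessionsByUser(sessions) {
    return sessions.reduce((sessionsPerUser, session) => {
        const user = session.USER
        sessionsPerUser[user] = sessionsPerUser[user] || []
        sessionsPerUser[user].push(session)
        return sessionsPerUser
    }, {})
}
```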
To expose the gathered metrics to the Prometheus server, I chose to use the OpenTelemetry prometheus-exporter package. This package already contained a PrometheusExporter server implementation with an HTTP endpoint, and a MeterProvider to create metrics and update their values.
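A minimal setup with these two building blocks might look like the sketch below. It is based on a pre-1.0 version of the OpenTelemetry JavaScript metrics API; package names and method signatures have changed between releases, so treat it as an illustration rather than the exporter’s actual code:

```javascript
const { PrometheusExporter } = require('@opentelemetry/exporter-prometheus')
const { MeterProvider } = require('@opentelemetry/metrics')

// Starts an HTTP server (default port 9464) that serves GET /metrics.
const exporter = new PrometheusExporter({ startServer: true })

// The MeterProvider periodically pushes collected values to the exporter.
const meter = new MeterProvider({ exporter, interval: 5000 }).getMeter('active-users')

// Delta-based instrument: we add/subtract whenever a session count changes.
const activeSessions = meter.createUpDownCounter('active_user_sessions', {
    description: 'Number of active user sessions per user',
})
```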
Then I had to glue all the components together:
- the exporter server,
- configuration and command-line options,
- and the gathering and parsing of the w command output.
The final result you can inspect here.
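As a rough idea of how those pieces could fit together, here is a sketch building on the parseWOutput and groupSessionsByUser functions and the activeSessions instrument from above (again, my own names and simplifications, not necessarily the exporter’s actual code):

```javascript
const { execFile } = require('child_process')

// Remember the last reported session count per user so we can emit deltas.
const previousCounts = new Map()

function updateMetrics(sessionsPerUser) {
    const users = new Set([...previousCounts.keys(), ...Object.keys(sessionsPerUser)])
    users.forEach((user) => {
        const current = (sessionsPerUser[user] || []).length
        const previous = previousCounts.get(user) || 0
        if (current !== previous) {
            // UpDownCounters are delta-based, so record the difference.
            activeSessions.bind({ user }).add(current - previous)
        }
        previousCounts.set(user, current)
    })
}

// Call `w`, parse and group its output, and update the metric on an interval.
setInterval(() => {
    execFile('w', (error, stdout) => {
        if (error) {
            return console.error('Unable to run w:', error.message)
        }
        updateMetrics(groupSessionsByUser(parseWOutput(stdout)))
    })
}, 5000)
```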
The finishing touches
During my first tests on my cloud machines, I realized I didn’t focus enough on one significant aspect, though. Time.
I mentioned above that Prometheus typically collects metrics from known servers, services, and exporters. It does so by calling the configured metrics endpoint for each job. A job configuration must, therefore, at least contain
- the hostname of the target machine,
- the port of the target application,
- and the path of the metrics endpoint.
Additionally, users can configure the scrape interval that defines how often Prometheus will fetch metrics from a configured endpoint. By default, this interval is one minute.
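The corresponding job definition in prometheus.yml could then look like this (hostname and port are examples):

```yaml
scrape_configs:
  - job_name: 'active-users'
    scrape_interval: 5s        # overrides the one-minute default
    metrics_path: /metrics
    static_configs:
      - targets: ['my-cloud-server:9464']
```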
I also used a configurable interval in the exporter. This interval defines how often the exporter calls the w command and parses its output. And then, there’s also the duration of each active user session to consider.
To capture as many user sessions as possible, the scrape intervals of both the exporter and Prometheus itself needed to be as small as possible. Ideally, it would also be possible to stream the output of the w command to get a new result as soon as it’s available instead of polling it.
I decided to do the following:
- Lower the exporter’s default scrape interval to five seconds to catch user sessions that last longer than that.
- Cache each collected user session for 60 seconds – ignoring the time the user logged off – to give Prometheus enough time to fetch metrics from the exporter (sketched below).
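A minimal sketch of such a cache, assuming sessions are identified by user and TTY (again my own illustration):

```javascript
const CACHE_TTL_MS = 60 * 1000

// Remember when each session was last seen, keyed by user and TTY.
const lastSeen = new Map()

function cacheSessions(sessions) {
    const now = Date.now()
    sessions.forEach((session) => {
        lastSeen.set(`${session.USER}/${session.TTY}`, { session, timestamp: now })
    })
    // Evict sessions that have not been seen for longer than the TTL.
    for (const [key, entry] of lastSeen) {
        if (now - entry.timestamp > CACHE_TTL_MS) {
            lastSeen.delete(key)
        }
    }
    // Everything still in the cache counts as active for metric purposes.
    return Array.from(lastSeen.values()).map((entry) => entry.session)
}
```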
And that’s it. I’ve been using the exporter for a few weeks in production, and it’s working fine. Active SSH sessions – only sessions I initiated so far – do indeed result in email notifications.
You can check the source code of the exporter here. If you’re interested in using the exporter, please check the README of the repository for installation instructions. Every release of the exporter contains executables for Linux and Alpine Linux and respective SHA256 checksums.
And even though I implemented the exporter with Node.js, you don’t have to worry about installing and updating the runtime. Instead, you can download the binary from the latest release and use it right away, because I compiled the exporter into a single executable for Linux and Alpine Linux.
The Verdict
A few days of work, but well worth the time to gain some more transparency about activity on my cloud servers. I added the exporter to all my cloud servers and created a panel and an alarm for each of them in my Grafana dashboard. As I compiled it into a single executable, installing it on a UNIX system is a matter of a few shell commands.
I published the source code and binaries on my GitHub account, so feel free to check it out.
GitHub – stfsy/prometheus-what-active-users-exporter
Thanks for reading. If you have any feedback or further ideas, you can reach out to me via Twitter @stfsy.