Monitor With Prometheus, Loki, And Grafana
Introduction
This guide explains how to monitor FusionAuth logs and metrics with the open-source tools Prometheus, Loki, and Grafana, and receive alerts when problems occur.
Please read the FusionAuth monitoring overview for details on FusionAuth metrics, the activities in a complete monitoring workflow, and what Prometheus, Loki, and Grafana are. Review the alternative monitoring services in the overview to ensure that Prometheus is the right tool for your needs.
This guide will show you how to set up Prometheus in Docker containers on your local machine. However, a paid, cloud-hosted alternative is also available from Grafana Cloud.
Initial Architecture
Running FusionAuth and PostgreSQL in Docker usually looks like the diagram below (you might also run OpenSearch in another Docker container).
This diagram shows three components that could die and need monitoring: the PostgreSQL database, FusionAuth, and the app (web server) that directs users to FusionAuth for login. In this guide, you’ll focus on monitoring FusionAuth by adding Prometheus to your setup. Prometheus will poll your FusionAuth instance for errors every fifteen seconds.
These instructions are for monitoring FusionAuth with Prometheus when you are self-hosting. FusionAuth Cloud deployments with more than one compute node round-robin requests to the Prometheus endpoint. Using this endpoint to monitor such a deployment is not recommended. Learn more about this issue here.
Run Prometheus To Monitor FusionAuth
Clone the sample FusionAuth kickstart repository with the command below.
git clone https://github.com/FusionAuth/fusionauth-example-docker-compose.git
cd fusionauth-example-docker-compose/light
Add the following code to docker-compose.yml near the end, before the networks: section, to define a new service. The service uses the Ubuntu Prometheus Docker image from Docker Hub.
  prometheus:
    image: ubuntu/prometheus:2.52.0-22.04_stable
    platform: linux/amd64
    container_name: faProm
    depends_on:
      - fa
    ports:
      - 9090:9090
    networks:
      - db_net
    volumes:
      - ./prometheusConfig.yml:/etc/prometheus/prometheus.yml
      - ./prometheusDb:/prometheus
This service definition specifies that Prometheus starts after FusionAuth, is accessible on port 9090, reads its configuration from a file on your machine, and saves its database to a persistent directory on your machine.
Create a prometheusConfig.yml configuration file containing the content below.
global:
  evaluation_interval: 30s

scrape_configs:
  - job_name: FusionAuth
    scrape_interval: 15s
    scheme: http
    metrics_path: api/prometheus/metrics
    static_configs:
      - targets: ["fa:9011"]
    basic_auth:
      username: "apikey"
      password: "33052c8a-c283-4e96-9d2a-eb1215c69f8f-not-for-prod"
This configures Prometheus to collect metrics from FusionAuth every 15 seconds and to evaluate alerting and recording rules every 30 seconds. Prometheus uses the superuser API key, created by the FusionAuth kickstart configuration files, as the password. For improved security in production, create an API key that has only GET permissions for the /api/prometheus/metrics endpoint.
If you prefer to allow unauthenticated access to the Prometheus metrics endpoint in FusionAuth from any local scraper, you can set fusionauth-app.local-metrics.enabled=true. See the FusionAuth configuration reference for more information.
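If you go that route in this Docker Compose setup, the setting can probably be passed to the fa service as an environment variable. The sketch below is an assumption: the variable name FUSIONAUTH_APP_LOCAL_METRICS_ENABLED follows FusionAuth's usual property-to-environment-variable naming, but verify it against the configuration reference before relying on it.

  fa:
    environment:
      # Assumed environment-variable form of fusionauth-app.local-metrics.enabled;
      # merge this with the environment block the fa service already has.
      FUSIONAUTH_APP_LOCAL_METRICS_ENABLED: "true"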
To learn more about configuring Prometheus, see the documentation.
Run all the containers with docker compose up. You should be able to log in to FusionAuth at http://localhost:9011 with the email address admin@example.com and password password, and to Prometheus at http://localhost:9090.
To check that Prometheus has accepted your configuration file as valid, enter the container and use promtool to validate the YAML file.
docker exec -it faProm /bin/bash
promtool check config /etc/prometheus/prometheus.yml
exit
The metrics FusionAuth exposes to Prometheus change over time. Some basic Java Virtual Machine (JVM) metrics are listed here. You can see exactly what metrics are available on your FusionAuth instance by running the command below.
curl -u "apikey:33052c8a-c283-4e96-9d2a-eb1215c69f8f-not-for-prod" 0.0.0.0:9011/api/prometheus/metrics
# Output:
# HELP HikariPool_1_pool_MinConnections Generated from Dropwizard metric import (metric=HikariPool-1.pool.MinConnections, type=com.zaxxer.hikari.metrics.dropwizard.CodaHaleMetricsTracker$$Lambda$292/0x0000000100449e40)
# TYPE HikariPool_1_pool_MinConnections gauge
HikariPool_1_pool_MinConnections 10.0
# HELP jvm_memory_heap_committed Generated from Dropwizard metric import (metric=jvm.memory.heap.committed, type=com.codahale.metrics.jvm.MemoryUsageGaugeSet$8)
# TYPE jvm_memory_heap_committed gauge
jvm_memory_heap_committed 5.36870912E8
# HELP prime_mvc___api_key_generate__requests Generated from Dropwizard metric import (metric=prime-mvc.[/api/key/generate].requests, type=com.codahale.metrics.Timer)
# TYPE prime_mvc___api_key_generate__requests summary
prime_mvc___api_key_generate__requests{quantile="0.5",} 0.2392109
prime_mvc___api_key_generate__requests{quantile="0.75",} 0.2392109
prime_mvc___api_key_generate__requests{quantile="0.95",} 0.2392109
prime_mvc___api_key_generate__requests{quantile="0.98",} 0.2392109
prime_mvc___api_key_generate__requests{quantile="0.99",} 0.2392109
prime_mvc___api_key_generate__requests{quantile="0.999",} 0.2392109
prime_mvc___api_key_generate__requests_count 1.0
...
If you get no response, add -v to the command to see what error occurs. If you see 401, it is likely that your API key is incorrect.
Check what metrics Prometheus scraped from FusionAuth in the Prometheus web interface by browsing to Menu -> Status -> TSDB Status (time-series database).
In the Prometheus web interface, check that FusionAuth is running by browsing to Menu -> Status -> Targets.
See charts of FusionAuth metrics in the Prometheus web interface by browsing to Menu -> Graph. Press Ctrl + Spacebar in the text box to view all the metrics and functions available. Try entering Database_primary_pool_Usage and clicking Execute.
To monitor all FusionAuth errors, use the expression prime_mvc_____errors_total. A useful metric to monitor is simply called up, which has the value 1 if Prometheus successfully scraped its target.
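For example, you could try the queries below on the Graph page. These are illustrative PromQL expressions built from the metrics mentioned above; the five-minute window is an arbitrary choice.

# Total number of FusionAuth MVC errors since startup
prime_mvc_____errors_total

# Errors recorded during the last five minutes
increase(prime_mvc_____errors_total[5m])

# 1 if the last scrape of the FusionAuth job succeeded, 0 if it failed
up{job="FusionAuth"}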
At this point, Prometheus is set up and you can monitor FusionAuth. The rest of this guide will show you how to enhance the system with alerts and an improved dashboard.
Run Alertmanager To Send Alerts
Let’s set up a service to notify you when errors occur in FusionAuth.
The service will check if the prime_mvc_____errors_total counter has increased in the last minute. If it has, an alert will be sent to a channel that your company can monitor, for example, on Discord or Slack, or by email or SMS.
One of the simplest and cheapest alert services is ntfy.sh. ntfy is free, but all channels are public, so don’t broadcast secrets.
To see how ntfy works, run the command below in a terminal.
curl -H "Title: Error" -d "A FusionAuth error occurred in the last minute" ntfy.sh/fusionauthprometheus
Browse to the channel at https://ntfy.sh/fusionauthprometheus to see messages.
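If you prefer to watch the channel from a terminal rather than a browser, ntfy also lets you subscribe over HTTP. A minimal sketch:

# Stream messages from the channel as JSON lines (leave running in a terminal)
curl -s ntfy.sh/fusionauthprometheus/json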
Now let’s configure Prometheus to send errors automatically using the Prometheus Alertmanager component.
The Prometheus documentation doesn’t say it explicitly, but Alertmanager is not included with Prometheus and must be run separately. This guide runs Alertmanager using the Ubuntu Docker container.
At the time of writing, FusionAuth found an error in the Ubuntu container documentation. The Alertmanager configuration file path is actually /etc/alertmanager/alertmanager.yml, not /etc/prometheus/alertmanager.yml.
Below is a diagram of the system design with the new components.
Add the code below to the docker-compose.yml file to include the new Alertmanager container and point the existing Prometheus container to it.
  alertmanager:
    image: ubuntu/alertmanager:0.27.0-22.04_stable
    platform: linux/amd64
    container_name: faAlert
    ports:
      - 9093:9093
    networks:
      - db_net
    volumes:
      - ./prometheusAlertConfig.yml:/etc/alertmanager/alertmanager.yml

  prometheus:
    image: ubuntu/prometheus:2.52.0-22.04_stable
    platform: linux/amd64
    container_name: faProm
    depends_on:
      - fa
      - alertmanager
    ports:
      - 9090:9090
    networks:
      - db_net
    volumes:
      - ./prometheusConfig.yml:/etc/prometheus/prometheus.yml
      - ./prometheusRules.yml:/etc/prometheus/rules.yml
      - ./prometheusDb:/prometheus
Update prometheusConfig.yml to provide Prometheus with the URL of the Alertmanager service and define the rules for when alerts should be sent.
global:
  evaluation_interval: 30s

scrape_configs:
  - job_name: FusionAuth
    scrape_interval: 15s
    scheme: http
    metrics_path: api/prometheus/metrics
    static_configs:
      - targets: ["fa:9011"]
    basic_auth:
      username: "apikey"
      password: "33052c8a-c283-4e96-9d2a-eb1215c69f8f-not-for-prod"

rule_files:
  - rules.yml

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "alertmanager:9093"
Create the prometheusRules.yml file shown below.
groups:
  - name: fusionauthAlerts
    rules:
      - alert: FusionAuthError
        expr: prime_mvc_____requests_count > 0
        for: 30s
        labels:
          severity: error
        annotations:
          summary: FusionAuth Error Detected
          description: A FusionAuth error occurred in the last minute
Here, the expr rule checks the requests metric, not the errors metric, so that a notification is guaranteed to fire in this prototype. In a real deployment, you would use an error expression like increase(prime_mvc_____errors_total[1m]) > 0, as in the sketch below.
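A production-oriented rules file might then look roughly like the sketch below. It keeps the same layout as prometheusRules.yml above and adds a second rule built on the up metric mentioned earlier; the windows and severities are just reasonable starting points.

groups:
  - name: fusionauthAlerts
    rules:
      - alert: FusionAuthError
        # Fire when the error counter increased during the last minute.
        expr: increase(prime_mvc_____errors_total[1m]) > 0
        for: 30s
        labels:
          severity: error
        annotations:
          summary: FusionAuth Error Detected
          description: A FusionAuth error occurred in the last minute
      - alert: FusionAuthDown
        # Fire when Prometheus has been unable to scrape FusionAuth for a minute.
        expr: up{job="FusionAuth"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: FusionAuth Unreachable
          description: Prometheus could not scrape FusionAuth for one minute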
Finally, create a prometheusAlertConfig.yml configuration file for the Alertmanager.
route:
  receiver: ntfy
  repeat_interval: 1m

receivers:
  - name: ntfy
    webhook_configs:
      - url: http://ntfy.sh/fusionauthprometheus
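If your team uses Slack rather than ntfy (Slack was mentioned as an option earlier), the receiver could be sketched as below. The webhook URL and channel are placeholders, and you would point route.receiver at slack instead of ntfy.

receivers:
  - name: slack
    slack_configs:
      # Placeholder Slack incoming-webhook URL; replace with your own.
      - api_url: https://hooks.slack.com/services/REPLACE/WITH/YOURS
        channel: "#fusionauth-alerts"
        send_resolved: true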
You can rename all these configuration files, as long as you update the filenames in the docker-compose.yml file, too.
Press Ctrl + C in the terminal, and then run docker compose up again to start everything. Check the terminal logs to confirm that Alertmanager started successfully. If it didn’t, check the configuration file and try to restart the service individually with docker compose up alertmanager.
To manually confirm that Alertmanager can connect to ntfy, run the command below in a new terminal.
curl -X POST -H "Content-Type: application/json" -d '[{"labels":{"alertname":"TestAlert"}}]' http://localhost:9093/api/v2/alerts
If you browse to http://localhost:9093, you should see the alert has arrived. Browse to the Status page to check that Alertmanager has successfully loaded the configuration file.
If you wait a minute and then browse to https://ntfy.sh/fusionauthprometheus, you should see that Prometheus scraped FusionAuth, registered that requests were greater than zero, and sent an alert to Alertmanager, and that Alertmanager sent the alert to ntfy.
Alertmanager sent ntfy raw JSON that includes the annotations fields from the prometheusRules.yml configuration. If you would like notifications to look neater, read about templates in Prometheus.
Run Grafana To Create A Dashboard
To display FusionAuth metrics in a set of charts in a dashboard instead of a single Prometheus query, you can use Grafana.
This section will show you how to run Grafana in Docker and create a simple dashboard to monitor FusionAuth.
Below is a diagram of the system design with the new components.
Add the new service below to docker-compose.yml.
  grafana:
    image: ubuntu/grafana:11.0.0-22.04_stable
    platform: linux/amd64
    container_name: faGraf
    depends_on:
      - prometheus
    ports:
      - 9091:3000
    networks:
      - db_net
    volumes:
      - ./prometheusGrafanaConfig.ini:/etc/grafana/grafana-config.ini
      - ./prometheusGrafana/:/data/
      # - ./prometheusGrafanaProvisioning/:/conf/
This configuration uses the Ubuntu Grafana container to maintain consistency with the Ubuntu containers used previously.
None of the three volumes in the configuration above are needed for this example, but you will want to use them in production.
- The /etc/grafana/grafana-config.ini Grafana configuration file specifies values for settings like security, proxies, and servers. These values are explained in the configuration documentation. Note that the documentation lists various places the configuration file could live inside the container; the path differs between Docker images and operating systems. The default filename, /etc/grafana/grafana.ini, is different from the grafana-config.ini used by this container. If you use a different image for Grafana, look at the Docker logs in the terminal to see where Grafana looks for configuration files when starting.
- The /data volume stores the Grafana database so that data persists if you restart the container.
- The /conf directory allows you to automatically provision infrastructure, like data sources for Grafana to monitor and dashboards to create. Leave the volume commented out for now. If you uncomment it without having the correct files in your local directory, Grafana will fail to start. To create provisioning files, read the documentation and look at all the sample files in /conf/provisioning/ when the container is running. A minimal data source provisioning sketch follows this list.
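As an illustration, a data source provisioning file is a short YAML document. The sketch below is a minimal example and assumes it is placed in the provisioning datasources directory for this image (presumably /conf/provisioning/datasources/); the data source name is arbitrary.

apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    # Service name and port of the Prometheus container in docker-compose.yml.
    url: http://prometheus:9090
    isDefault: true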
Run docker compose up to start Grafana (or docker compose up grafana if FusionAuth is already running).
Log in to Grafana at http://localhost:9091 with the username and password admin.
If you want to change the login settings in production, you can create the local file prometheusGrafanaConfig.ini with the example content below.
[security]
admin_user = admin2
Run docker exec -it faGraf /bin/bash to log in to the container and view sample files. For example, when in the container, run more /conf/sample.ini to see all configuration values. Lines starting with ; are commented out.
Add a dashboard in the Grafana web interface:
- Click Add your first data source in the center of the home page.
- Click Prometheus to add the default Prometheus connection.
- In the Prometheus Connection, enter the URL of the Docker container, http://prometheus:9090.
- Click Save & test. Grafana should now be able to connect to Prometheus. (Note that this connection retrieves the FusionAuth metrics, which is what we want, not metrics about Prometheus itself.)
- Click Dashboards in the sidebar, and then New dashboard on the right.
- Click Add visualization.
- Select prometheus as the data source.
- Enter the value prime_mvc_____errors_total in the Metric browser field at the bottom. Click Run queries, change the panel Title, and then click Apply to save the visualization.
- Add another visualization with the value prime_mvc___admin_login__requests and save.
- Save the dashboard and give it the name FusionAuth dashboard. You can rearrange the charts if you would like to. The dashboard should look like the image below.
You can add any other metrics as visualizations. In the metric browser, search for metrics related to user login or oauth to keep track of how your system is used.
If you edit the dashboard as a whole, the JSON Model tab contains the full configuration text for your dashboard, which you can use in the provisioning files referred to earlier to automatically recreate the dashboard in a new instance of Grafana.
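The matching dashboard provisioning file is another small YAML document that points Grafana at a directory of exported JSON models. A minimal sketch, assuming you store the JSON files in a hypothetical /conf/provisioning/dashboards/ directory inside the container:

apiVersion: 1
providers:
  - name: fusionauth-dashboards
    type: file
    options:
      # Directory holding the exported dashboard JSON Model files.
      path: /conf/provisioning/dashboards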
You can also create a new dashboard by importing a standard template from the Grafana repository. However, there is no FusionAuth template currently, and FusionAuth does not export all the Java metrics necessary to use the JVM template.
Store Logs In Loki
The final monitoring component you might want to use is Grafana Loki for storing logs. Loki indexes only the metadata of a log line (its time, and attributes such as the server that sent it) and not its content. This is unlike Elasticsearch or OpenSearch, which index the log content, too. Loki therefore uses far less disk space than OpenSearch but is not quickly searchable. The no-indexing choice Loki made is better for most applications, where you need only to monitor logs for errors and store logs for auditing purposes, and don’t need to run frequent queries against old logs.
Loki can run as a single app in a single Docker container or as separate components in multiple containers. In monolithic mode, Loki can handle up to 20 GB per day. This is enough for FusionAuth and is what you’ll use in this guide.
Below is a diagram showing all the components Loki runs in a single container.
You can query logs in Grafana, or in the terminal with the Loki API or LogCLI.
Loki is primarily a log store, and will not fetch logs itself. Tools to send logs to Loki include Promtail (the original sending tool), OpenTelemetry, and Alloy (a new OpenTelemetry-compliant tool from Grafana). For more options, see the documentation. In this guide, you use Promtail for simplicity and stability.
When FusionAuth runs in Docker, it writes logs to the terminal and does not save them to a file or provide them via the API. This means that the logs are not available in the web interface.
To use Loki, add the services below to your docker-compose.yml file. You are now using Grafana images because Ubuntu has no image for Promtail.
  faLoki:
    image: grafana/loki:3.0.0
    container_name: faLoki
    ports:
      - 3100:3100
    volumes:
      - ./prometheusLoki/:/loki/
      - ./prometheusLokiConfig.yml:/etc/loki/local-config.yaml
    user: root
    environment:
      - target=all
    networks:
      - db_net

  faPromtail:
    image: grafana/promtail:3.0.0
    container_name: faPromtail
    depends_on:
      - faLoki
    volumes:
      - ./prometheusPromtailConfig.yml:/etc/promtail/config.yml
      - /var/run/docker.sock:/var/run/docker.sock
      - /var/lib/docker/containers:/var/lib/docker/containers
    networks:
      - db_net
The faLoki port 3100 is open so that Grafana can query it. The prometheusLoki volume persists log storage across container restarts. The prometheusLokiConfig.yml volume allows you to adjust Loki settings. Unlike the Ubuntu images, Grafana images don’t run as the root user, which means the user in the container won’t have permission to create files on the Docker host machine. In production, you can inspect the running container to see which user it runs as, create the prometheusLoki directory, and make that user the directory owner. For this prototype, it’s faster to set user: root on the container instead, so it can write directly to the shared volume. The target=all setting runs the Loki container in monolithic mode.
The faPromtail service waits for Loki to start by using depends_on: faLoki. The service has volumes for a configuration file and for access to the log files saved by Docker and the Docker socket file.
Use the code below to change the fa service so that FusionAuth waits for Promtail to run before FusionAuth starts. If FusionAuth isn’t configured to wait, Loki will not record any errors FusionAuth throws while starting.
    depends_on:
      faPromtail:
        condition: service_started
      fa_db:
        condition: service_healthy
You can comment out the prometheusLokiConfig.yml volume in the faLoki service configuration to use the default values, which are fine. But if you want to use Loki with Alertmanager, create the file with the contents below, where only the last line differs from the default: the Alertmanager URL now points to the Docker service for the ruler (rules manager).
auth_enabled: false

server:
  http_listen_port: 3100

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://alertmanager:9093
The prometheusPromtailConfig.yml file controls which containers Promtail will get logs from. It is documented here. Create the prometheusPromtailConfig.yml file and add the content below.
server:
  http_listen_port: 9080
  grpc_listen_port: 0

clients:
  - url: http://faLoki:3100/loki/api/v1/push

scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 15s
        filters:
          - name: name
            values: [^fa$]
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
The clients URL points to the Loki Docker service where Promtail will send logs. The scrape_configs section describes how Promtail will get logs.
The docker_sd_configs configuration option is one way for Promtail to get logs (along with local file logs and Kubernetes). It follows the Prometheus configuration format, which uses the Docker container reference format.
The filters section excludes all containers other than FusionAuth from having their logs stored. FusionAuth is matched by the regular expression ^fa$ (start, fa, end). There is no / in this name. If you instead used a filter of fa, the logs of fa_db would also be stored.
The relabel_configs section maps the Docker container name to the container label on the log entries, so you can search for it when querying the logs. Note that while your container and service name in the Docker process list is fa, the name exposed in the Docker API is actually /fa, which is why the regex above strips the leading /. To confirm this, run docker inspect fa. You’ll see the container name is reported as "Name": "/fa".
Log monitoring is ready. Run docker compose up to start all monitoring components. Browse to http://localhost:3100/ready to check that Loki is up.
To view the logs in Grafana:
- Browse to Grafana and choose Connections -> Data sources in the sidebar.
- Choose Add new data source and select Loki.
- Enter http://faLoki:3100 in the URL field. This is the only setting to change.
- Click Save and test. If Grafana cannot detect Loki, check that your URL matches the one in your Docker Compose file and that there are no errors in the Docker terminal.
- Click Explore in the sidebar to start browsing your Loki logs.
- Choose Loki as your data source and enter a query value of {container="fa"}.
- Press Run query to view the logs.
You can filter logs and make complex queries. For example, try {container="fa"} |~ "(ERROR|WARN)".
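You can run the same kind of query from the terminal against the Loki HTTP API, which is useful for scripting. A minimal sketch (the trailing jq is optional and just pretty-prints the JSON response):

curl -G -s "http://localhost:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={container="fa"} |~ "(ERROR|WARN)"' | jq .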
Now that Loki stores FusionAuth logs, you can add log widgets to your Grafana dashboard, and use either Grafana or Loki directly to send alerts to Alertmanager.
Next Steps
In addition to monitoring the Prometheus metrics provided by FusionAuth, you might want to track custom metrics, such as user login rates and successes. To do this, read the FusionAuth guide to OpenTelemetry, which shows how to create a bash script that collects any metric the FusionAuth API offers.
Final System Architecture
If you combine the Prometheus, Alertmanager, Grafana, Loki, and ntfy infrastructure shown in this guide, your architecture will be as follows.