Architecture diagram
Installing Loki locally
# Download the Loki binary
cd /usr/local/src/ && wget https://github.com/grafana/loki/releases/download/v2.9.1/loki-linux-amd64.zip
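# Extract the archive (it is a zip file, so use unzip rather than tar)
unzip loki-linux-amd64.zip
# Make the binary executable and move it onto the PATH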
chmod +x loki-linux-amd64
mv loki-linux-amd64 /usr/bin/loki
# Write the systemd unit file for Loki
vim /etc/systemd/system/loki.service
[Unit]
Description=Loki service
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/loki -config.file /etc/loki/config.yml
# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=120
Restart=on-failure
RestartSec=2
[Install]
WantedBy=multi-user.target
# Create the Loki configuration file
mkdir /etc/loki
vim /etc/loki/config.yml
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  instance_addr: 172.21.61.21
  path_prefix: /tmp/loki
  #storage:
  #  filesystem:
  #    chunks_directory: /tmp/loki/chunks
  #    rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: aws
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  # Chunk files are stored in the "loki" bucket of the MinIO cluster; their lifecycle is managed by the bucket
  aws:
    # Note: use a fully qualified domain name, like localhost.
    # full example: http://loki:supersecret@localhost.:9000
    s3: http://loki:1ZEPLaztqSMcCSICXO4f@172.21.61.5:19000/loki
    s3forcepathstyle: true
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    cache_ttl: 24h # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: aws

ruler:
  alertmanager_url: http://172.21.61.24:9093

# By default, Loki will send anonymous, but uniquely-identifiable usage and configuration
# analytics to Grafana Labs. These statistics are sent to https://stats.grafana.org/
#
# Statistics help us better understand how Loki is used, and they show us performance
# levels for most users. This helps us prioritize features and documentation.
# For more information on what's sent, look at
# https://github.com/grafana/loki/blob/main/pkg/usagestats/stats.go
# Refer to the buildReport method to see what goes into a report.
#
# If you would like to disable reporting, uncomment the following lines:
#analytics:
#  reporting_enabled: false
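Before starting the service it can help to let Loki parse the file once and exit. The sketch below assumes the installed build supports the `-verify-config` flag; the helper name `verify_loki_config` is hypothetical and not part of Loki.

```shell
# Hypothetical helper: ask Loki to parse a config file and exit without starting.
# Assumes the loki binary is on the PATH and supports -verify-config.
verify_loki_config() {
    config_file="${1:-/etc/loki/config.yml}"
    if loki -config.file="$config_file" -verify-config; then
        echo "config OK: $config_file"
    else
        echo "config INVALID: $config_file" >&2
        return 1
    fi
}
```

Calling `verify_loki_config /etc/loki/config.yml` after each edit catches indentation mistakes before systemd enters a restart loop.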
# The bucket policy is configured as follows:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "LokiStorage",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<account_ID>"
        ]
      },
      "Action": [
        "s3:ListBucket",
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket_name>",
        "arn:aws:s3:::<bucket_name>/*"
      ]
    }
  ]
}
# Start the service
systemctl daemon-reload && systemctl enable --now loki
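Loki can take a little while after startup before it accepts requests. A minimal sketch for waiting on its `/ready` endpoint, assuming `curl` is installed; the function name, the retry count and the 2-second pause are illustrative choices, not values from the setup above.

```shell
# Poll Loki's /ready endpoint until it returns a success status or attempts run out.
# Arguments: base URL (default: the Loki host used above), number of attempts (default 30).
wait_for_loki() {
    base_url="${1:-http://172.21.61.21:3100}"
    attempts="${2:-30}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl fail on HTTP errors such as 503 while Loki is still starting
        if curl -fsS "$base_url/ready" >/dev/null 2>&1; then
            echo "loki ready after $i attempt(s)"
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    echo "loki not ready after $attempts attempts" >&2
    return 1
}
```

If `wait_for_loki` fails, `journalctl -u loki` is the first place to look.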
At this point you can test access with logcli:
Reference documentation: https://grafana.com/docs/loki/latest/query/logcli/
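A small sketch of such a test, assuming logcli is installed separately and on the PATH. The two wrapper functions are hypothetical conveniences; `LOKI_ADDR`, `labels`, `query`, `--since` and `--limit` come from logcli itself.

```shell
# Point logcli at the Loki instance configured above.
export LOKI_ADDR="http://172.21.61.21:3100"

# List all label names Loki currently knows about.
list_labels() {
    logcli labels
}

# Print up to 20 log lines from the last hour for the LogQL selector given as $1.
tail_recent() {
    logcli query --since=1h --limit=20 "$1"
}
```

An empty result from `list_labels` on a fresh install simply means nothing has been pushed yet.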
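When connecting Grafana to Loki, pay attention to the Grafana version. With Loki 2.9.1, both Grafana 6 and Grafana 7 could connect to the data source but then reported that no labels were found. Grafana 9 and later connected successfully.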
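Collecting Docker logs
Logs can be collected with promtail, but in testing it worked poorly for Docker services and was awkward to manage, so the officially recommended approach is the Docker plugin loki-docker-driver,
which collects and pushes the logs:
Installing the Docker plugin
Online installation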
docker plugin install grafana/loki-docker-driver:main --alias loki --grant-all-permissions
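Before changing the daemon-wide default, the plugin can be tried on a single container. This is a sketch under assumptions: the `alpine` image and both function names are illustrative, and the push URL reuses the Loki address from the configuration above.

```shell
LOKI_PUSH_URL="http://172.21.61.21:3100/loki/api/v1/push"

# Succeeds only if the plugin installed under the alias "loki" is enabled.
plugin_is_enabled() {
    [ "$(docker plugin inspect --format '{{.Enabled}}' loki)" = "true" ]
}

# Run a throwaway container whose output is shipped to Loki through the plugin.
driver_smoke_test() {
    docker run --rm \
        --log-driver=loki \
        --log-opt loki-url="$LOKI_PUSH_URL" \
        alpine echo "hello from loki-docker-driver"
}
```

If `plugin_is_enabled && driver_smoke_test` succeeds, the test line should appear in Loki shortly afterwards.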
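Offline installation
Docker plugins cannot be installed offline directly. Instead, install the plugin online on a connected machine, then copy the plugin directory to the machine that needs it. The steps are:
- Install the plugin online. By default Docker stores plugins in:
/var/lib/docker/plugins/
- Archive the plugin's directory (xxxxxx stands for its name under that path):
cd /var/lib/docker/plugins/ && tar czvf loki-docker-driver.tar.gz xxxxxx
- Copy the archive to the offline machine and extract it into the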
/var/lib/docker/plugins/
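directory.
This was tested with an online install on Docker 23.x and an offline install on Docker 20.x, and the plugin worked in both cases. Even so, keep the Docker versions identical where possible to avoid surprises.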
Configuring Docker
Edit the Docker configuration file /etc/docker/daemon.json and add or change the following:
{
  "debug": true,
  "log-driver": "loki",
  "log-opts": {
    "loki-url": "http://172.21.61.21:3100/loki/api/v1/push",
    "max-size": "50m",
    "max-file": "10"
  },
  "insecure-registries":
    ["......"]
}
For details on configuring loki-docker-driver, see the official documentation: https://grafana.com/docs/loki/latest/send-data/docker-driver/configuration/
Restarting Docker
systemctl restart docker
This installation was done in a swarm environment, so the containers on the node must be drained before Docker is restarted.
The command is:
docker node update --availability drain ${nodeId}
After the restart, restore the node's availability:
docker node update --availability active ${nodeId}
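The drain, restart and reactivate steps can be wrapped into one function that also confirms the new default log driver. This is a sketch, assuming the commands run on the node being restarted and that it is a swarm manager; otherwise the `docker node update` calls would go to a manager. The function name is hypothetical.

```shell
# Drain a swarm node, restart Docker, check the default log driver, then reactivate the node.
# Argument: the swarm node ID (the ${nodeId} used above).
restart_node_with_loki_driver() {
    node_id="$1"
    docker node update --availability drain "$node_id" || return 1
    systemctl restart docker || return 1
    # LoggingDriver reflects the "log-driver" value from /etc/docker/daemon.json
    driver="$(docker info --format '{{.LoggingDriver}}')"
    docker node update --availability active "$node_id" || return 1
    if [ "$driver" != "loki" ]; then
        echo "unexpected logging driver: $driver" >&2
        return 1
    fi
    echo "node $node_id active, logging driver: $driver"
}
```

The default driver only applies to containers created after the restart. Containers that already existed keep the driver they were created with until they are recreated.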
Log cleanup
Log retention can be set through Loki's compactor, defined in the Loki configuration file:
compactor:
  retention_enabled: true
  compaction_interval: 30m
  retention_delete_delay: 5m
  retention_delete_worker_count: 100
limits_config:
  retention_period: 180d # keep logs for 180 days
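Loki's own documentation usually writes retention periods in hours (for example 744h), so it can be handy to convert between the two forms when comparing configurations. A tiny sketch; the helper name is hypothetical.

```shell
# Convert a retention period given in days to the equivalent hour-based duration string.
days_to_hours() {
    echo "$(( $1 * 24 ))h"
}

days_to_hours 180   # prints 4320h
```

The 180-day value above therefore corresponds to `4320h`.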
Access
The logs can now be viewed in Grafana.
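The same data Grafana displays can also be fetched straight from Loki's HTTP API, which helps when deciding whether an empty panel is a Grafana problem or a missing-logs problem. This is a sketch, assuming `curl` is available. The `container_name` label in the usage note is an assumption about what the Docker driver attaches, and the function name is hypothetical.

```shell
LOKI_URL="http://172.21.61.21:3100"

# Fetch up to 20 recent log lines for the LogQL selector given as $1.
# Without start/end parameters, query_range covers the last hour.
query_recent() {
    curl -fsS -G "$LOKI_URL/loki/api/v1/query_range" \
        --data-urlencode "query=$1" \
        --data-urlencode "limit=20"
}
```

For example, `query_recent '{container_name="web"}'` returns JSON whose `data.result` array is empty when no matching streams exist.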