
Building a Log System with Grafana + Loki

Background: I previously built a logging stack with Elasticsearch + Kibana + Filebeat. Because the log storage location was dynamic, the Filebeat collection directory had to be updated to match: after running `docker-compose down; docker-compose up`, the original container ID changes and so does the storage directory, which made the setup impractical. So I switched to Grafana + Loki: containers push their logs to Loki, and Grafana just needs Loki configured as a data source.

1. Create the docker-compose.yaml file

version: "3"
services:
  # Loki - log aggregation
  loki:
    image: grafana/loki:2.4.1
    container_name: grafana_loki
    depends_on:
      - tempo
    ports:
      - "31000:3100"
    volumes:
      - ./loki/etc:/etc/loki
    environment:
      - JAEGER_AGENT_HOST=tempo
      - JAEGER_ENDPOINT=http://tempo:14268/api/traces # send traces to Tempo
      - JAEGER_SAMPLER_TYPE=const
      - JAEGER_SAMPLER_PARAM=1
    networks:
      - grafana
    restart: "always"

  # Promtail - log collection
  promtail:
    image: grafana/promtail:2.4.1
    container_name: grafana_promtail
    networks:
      - grafana
    restart: "always"
    volumes:
      - /var/log:/var/log
      - ./promtail/promtail-config.yaml:/etc/promtail/config.yaml
    command: -config.file=/etc/promtail/config.yaml

  # Tempo - distributed tracing
  tempo:
    image: grafana/tempo:1.2.1
    command: ["-config.file=/etc/tempo.yaml"]
    container_name: grafana_tempo
    volumes:
      - ./tempo/tempo-local.yaml:/etc/tempo.yaml
    ports:
      - "14268:14268" # jaeger ingest
      - "3200:3200" # tempo
      - "55680" # otlp grpc
      - "55681" # otlp http
      - "9411" # zipkin
    restart: "always"
    networks:
      - grafana

  # Grafana - web UI
  grafana:
    image: grafana/grafana:8.2.6
    container_name: grafana
    ports:
      - "30000:3000"
    volumes:
      - ./grafana/data:/var/lib/grafana
      - ./grafana/conf:/usr/share/grafana/conf
    networks:
      - grafana
    restart: "always"

  # Prometheus - metrics monitoring
  prometheus:
    image: prom/prometheus:v2.31.1
    container_name: "prometheus"
    ports:
      - "9090:9090"
    volumes:
      - "./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml"
    networks:
      - grafana
    restart: always

  # Blackbox exporter - endpoint probing
  blackbox-exporter:
    image: prom/blackbox-exporter:v0.19.0
    container_name: "blackbox_exporter"
    ports:
      - "9115:9115"
    volumes:
      - "./blackbox_exporter/config.yml:/etc/blackbox_exporter/config.yml"
    networks:
      - grafana
    restart: always

  # Node exporter - host system metrics
  node_exporter:
    image: prom/node-exporter:v1.3.1
    container_name: node_exporter
    command:
      - "--path.rootfs=/host"
    network_mode: host
    pid: host
    restart: unless-stopped
    volumes:
      - "/:/host:ro,rslave"

networks:
  grafana:
    external: true

On startup, some containers will fail because their configuration files have not been provided. (Also note that the compose file declares the `grafana` network as external, so create it first with `docker network create grafana`.) The configuration files for the containers above are listed below.


  • config.yml under the blackbox_exporter folder

    modules:
      http_2xx:
        prober: http
      http_post_2xx:
        prober: http
        http:
          method: POST
      tcp_connect:
        prober: tcp
      tls_connect:
        prober: tcp
        timeout: 5s
        tcp:
          tls: true
          tls_config:
            insecure_skip_verify: true
      pop3s_banner:
        prober: tcp
        tcp:
          query_response:
          - expect: "^+OK"
          tls: true
          tls_config:
            insecure_skip_verify: false
      ssh_banner:
        prober: tcp
        tcp:
          query_response:
          - expect: "^SSH-2.0-"
          - send: "SSH-2.0-blackbox-ssh-check"
      irc_banner:
        prober: tcp
        tcp:
          query_response:
          - send: "NICK prober"
          - send: "USER prober prober prober :prober"
          - expect: "PING :([^ ]+)"
            send: "PONG ${1}"
          - expect: "^:[^ ]+ 001"
      icmp:
        prober: icmp
    
    • loki/etc/local-config.yaml under the loki directory

      auth_enabled: false
      
      server:
        http_listen_port: 3100
      
      common:
        path_prefix: /loki
        storage:
          filesystem:
            chunks_directory: /loki/chunks
            rules_directory: /loki/rules
        replication_factor: 1
        ring:
          instance_addr: 127.0.0.1
          kvstore:
            store: inmemory
      
      schema_config:
        configs:
          - from: 2020-10-24
            store: boltdb-shipper
            object_store: filesystem
            schema: v11
            index:
              prefix: index_
              period: 24h
      
      ruler:
        alertmanager_url: http://localhost:9093
      storage_config:
        boltdb:
          directory: /tmp/loki/index
        filesystem:
          directory: /tmp/loki/chunks
      limits_config:
        enforce_metric_name: false
        reject_old_samples: true
        reject_old_samples_max_age: 168h
      # 7-day table retention
      table_manager:
        retention_deletes_enabled: true
        retention_period: 168h
      
    • promtail-config.yaml under the promtail directory

      server:
        http_listen_port: 9080      # Promtail HTTP listen port
        grpc_listen_port: 9095      # Promtail gRPC listen port
      
      positions:
        filename: /var/promtail/positions.yaml  # where Promtail stores its read positions
      
      clients:
        - url: http://loki:3100/loki/api/v1/push  # Loki push endpoint
      
      scrape_configs:
        - job_name: system_logs  # job name
          static_configs:
            - targets:
                - localhost  # target address
              labels:
                job: varlogs  # label
                __path__: /var/log/*.log  # log paths to scrape
      
        - job_name: container_logs  # separate job for Docker container logs
          pipeline_stages:
            - docker: {}  # parse Docker-formatted log lines
          kubernetes_sd_configs:
            - role: pod  # discover Pod logs from Kubernetes
          relabel_configs:
            - source_labels: [__meta_kubernetes_namespace]
              action: replace
              target_label: namespace
            - source_labels: [__meta_kubernetes_pod_name]
              action: replace
              target_label: pod
            - source_labels: [__meta_kubernetes_container_name]
              action: replace
              target_label: container
            - source_labels: [__address__]
              action: replace
              target_label: __host__
              replacement: $1  # address used for the replacement
      
    • tempo-local.yaml under the tempo folder

      # tempo.yaml - minimal working configuration for v1.2.1
      server:
        http_listen_port: 3200  # default HTTP port
      
      distributor:
        receivers:
          otlp:
            protocols:
              grpc:
                endpoint: 0.0.0.0:4317
              http:
                endpoint: 0.0.0.0:4318
          jaeger:
            protocols:
              thrift_http:
                endpoint: 0.0.0.0:14268
              grpc:
                endpoint: 0.0.0.0:14250
              thrift_binary:
                endpoint: 0.0.0.0:6832
              thrift_compact:
                endpoint: 0.0.0.0:6831
          zipkin:
            endpoint: 0.0.0.0:9411
        ring:
          kvstore:
            store: memberlist  # use the built-in memberlist clustering
      
      ingester:
        lifecycler:
          ring:
            kvstore:
              store: memberlist
            replication_factor: 1
          final_sleep: 0s
        trace_idle_period: 10s  # how long a trace may be idle before it is flushed
        max_block_bytes: 100_000_000  # maximum size of each block
        max_block_duration: 5m
      
      storage:
        trace:
          backend: local
          local:
            path: /tmp/tempo/blocks
          pool:
            max_workers: 100
            queue_depth: 1000
      
      compactor:
        compaction:
          block_retention: 24h   # block retention period
          compaction_window: 1h  # compaction time window
      
      querier:
        frontend_worker:
          frontend_address: 127.0.0.1:9095
      
      query_frontend:
        log_queries_longer_than: 5s
        # to enable Jaeger search, configure the jaeger_query section below instead
        #jaeger_query:
        #  enabled: true
        #  http_prefix: "/jaeger"
      
      memberlist:
        join_members: []  # empty for single-node mode; list peer nodes in cluster mode
        bind_port: 7946
      
      multitenancy_enabled: false  # whether multi-tenancy is enabled
      

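    • prometheus.yml under the prometheus directory. The compose file mounts `./prometheus/prometheus.yml`, but the original list omits it, so here is a minimal hedged sketch: the job names and the example.com probe target are placeholders, the `http_2xx` module matches the blackbox config above, and `blackbox-exporter:9115` assumes the service is reachable by name on the shared `grafana` network.

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: prometheus          # Prometheus scraping itself
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: blackbox_http       # placeholder job name
    metrics_path: /probe
    params:
      module: [http_2xx]          # module defined in blackbox config.yml above
    static_configs:
      - targets:
          - https://example.com   # hypothetical probe target
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target      # probe target goes into the URL param
      - source_labels: [__param_target]
        target_label: instance            # keep the target as the instance label
      - target_label: __address__
        replacement: blackbox-exporter:9115  # scrape the exporter, not the target
```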
2. Run `docker-compose down; docker-compose up -d;` to start the stack.


3. Open port 30000 in the server's security group, then visit ip:30000 in a browser to open Grafana (on first login, both the username and the password are admin).


4. Configure the data source

In the Grafana side menu, click Configuration -> Data Source -> Add data source, then search for and select Loki.


Configure the Loki address


5. Push container logs to Loki

  • Install the Loki Docker driver plugin (it may take a few attempts)

    docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
    


  • Configure containers to push logs

    On any server (provided the Loki plugin is installed and the server can reach the target Loki address), add the following configuration:

    x-logging: &loki-logging
      driver: loki
      options:
        loki-url: "http://120.77.213.80:31000/loki/api/v1/push" # the Grafana/Loki target server
        max-size: "10m"          # maximum size of each log file
        max-file: "3"            # maximum number of log files to keep

    services:
      brewing-bigdata:
        image: golang:1.15
        container_name: "soa_brewing_bigdata"
        ports:
          - 30013:30013
        volumes:
          - ./brewing-bigdata:/app/
          - ./logs:/app/logs
        working_dir: /app
        command: /app/brewing_bigdata_app
        restart: "always"
        environment:
          - TZ=Asia/Shanghai
        networks:
          - nbi-net
        logging: *loki-logging
    

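Under the hood, the Loki driver batches log lines and POSTs them to the `loki-url` above. As a rough illustration of the JSON body that Loki's `/loki/api/v1/push` endpoint accepts, here is a minimal Python sketch; the helper name, label set, and log line are made up for the example.

```python
import json
import time

def loki_push_payload(labels, lines):
    """Build a /loki/api/v1/push request body: one stream carrying a
    label set plus [nanosecond-epoch-timestamp, log line] pairs."""
    now_ns = str(time.time_ns())  # Loki expects ns-precision epoch strings
    return json.dumps({
        "streams": [
            {
                "stream": labels,  # label set, e.g. {"container_name": "brewing"}
                "values": [[now_ns, line] for line in lines],
            }
        ]
    })

body = loki_push_payload({"container_name": "brewing"},
                         ["request handled in 12ms"])
```

POSTing such a body (Content-Type: application/json) to the push URL is exactly what the driver and Promtail do on your behalf.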

6. Search logs

Click Explore in Grafana's sidebar.

Click Log browser, select a container -> Show logs.


Note: Grafana's query syntax (LogQL) is a little unusual; here are two examples.

Search the brewing container for lines containing both 354e8706299d0002 and 1430422b99f755890ed3dffa68b392b5:

{container_name="brewing"} |= "354e8706299d0002" |= "1430422b99f755890ed3dffa68b392b5"


Search the brewing container for lines containing either 354e8706299d0002 or 354ea893b2fa0001:

{container_name="brewing"} |~ "354e8706299d0002|354ea893b2fa0001"

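In plain terms: chained `|=` filters AND together (every substring must appear in the line), while `|~` applies one regular expression, so alternation (`a|b`) acts as OR. A small Python sketch of the same semantics, with made-up log lines:

```python
import re

def filter_contains_all(lines, *needles):
    # LogQL `|= "a" |= "b"`: keep lines containing every needle (AND)
    return [l for l in lines if all(n in l for n in needles)]

def filter_regex(lines, pattern):
    # LogQL `|~ "a|b"`: keep lines matching the regex (OR via alternation)
    rx = re.compile(pattern)
    return [l for l in lines if rx.search(l)]

logs = [
    "trace=354e8706299d0002 span=1430422b99f755890ed3dffa68b392b5 ok",
    "trace=354e8706299d0002 span=deadbeef ok",
    "trace=354ea893b2fa0001 ok",
]

both = filter_contains_all(logs, "354e8706299d0002",
                           "1430422b99f755890ed3dffa68b392b5")
either = filter_regex(logs, "354e8706299d0002|354ea893b2fa0001")
```

Here `both` keeps only the first line, while `either` keeps all three.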

E.g., to push container logs from server A to Loki on server B, install the Loki plugin on server A and point the loki-url in the containers' docker-compose.yml at server B's Loki address.
