Simple scalable deployment mode and TSDB index on filesystem

Hi, I'm currently trying to run Loki in simple scalable deployment mode as StatefulSets. I have one StatefulSet each for the read and write instances, with persistent volumes. I also configured Loki to use the filesystem (a rook-cephfs persistent volume) to store everything except the WAL directory. That works well, but I get error messages for the TSDB index:

    remove /data/tsdb-index/multitenant/index_19558/1689841408-loki-write-5.tsdb: no such file or directory

The index file was already deleted by a different node/pod; since the data is stored on a shared volume, it can't be deleted from this pod anymore. I read the documentation and found that you can set an index_gateway_client, but unfortunately the documentation is a bit thin on that component, and it's not clear to me how to configure it in simple scalable deployment mode.
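
From what I can tell, the relevant setting lives under storage_config.tsdb_shipper and would look roughly like the snippet below, but I'm not sure this is right for simple scalable mode. The loki-index-gateway service name and the dedicated index-gateway target are just placeholders, not something from my current setup:

    # Sketch only: assumes an extra Loki deployment started with -target=index-gateway,
    # reachable under the made-up headless service name loki-index-gateway on the gRPC port.
    storage_config:
      tsdb_shipper:
        active_index_directory: /data/tsdb-index
        cache_location: /data/tsdb-cache
        shared_store: filesystem
        index_gateway_client:
          # gRPC address of the index gateway; the dns:/// prefix lets the client
          # resolve and spread requests across all pods behind the headless service.
          server_address: dns:///loki-index-gateway:9095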

Edit:

Or, to ask another way: is it possible to run Loki in simple scalable deployment mode using only the filesystem as storage? The Helm chart, for example, states that this is not possible, but I'm already running Loki in this mode. However, I have problems with random restarts and the error messages mentioned above in my logs.

My Loki config looks like the code below. It is used by two StatefulSets configured as the read and write Loki targets; in front of the pods is a load balancer that distributes traffic round-robin.

    auth_enabled: false
    chunk_store_config:
      max_look_back_period: 120h
    common:
      path_prefix: /data
      storage:
        filesystem:
          chunks_directory: /data/chunks
          rules_directory: /data/rules
      compactor_address: http://loki-loadbalancer:3100
    compactor:
      working_directory: /data/compactor
      shared_store: filesystem
    frontend:
      log_queries_longer_than: 5s
      compress_responses: true
      max_outstanding_per_tenant: 2048
    ingester:
      lifecycler:
        join_after: 10s
        observe_period: 5s
        ring:
          replication_factor: 3
          kvstore:
            store: memberlist
        final_sleep: 0s
      chunk_idle_period: 1m
      wal:
        enabled: true
        dir: /wal
        checkpoint_duration: 15m
      max_chunk_age: 1m
      chunk_retain_period: 30s
      chunk_encoding: snappy
      chunk_target_size: 1.572864e+06
      chunk_block_size: 262144
      flush_op_timeout: 10s
    limits_config:
      max_cache_freshness_per_query: '10m'
      enforce_metric_name: false
      reject_old_samples: true
      reject_old_samples_max_age: 30m
      ingestion_rate_mb: 10
      ingestion_burst_size_mb: 20
      # parallelize queries in 15min intervals
      split_queries_by_interval: 15m
    query_range:
      # make queries more cache-able by aligning them with their step intervals
      align_queries_with_step: true
      max_retries: 5
      parallelise_shardable_queries: true
      cache_results: true
    ruler:
      enable_api: true
      wal:
        dir: /wal/ruler-wal
      storage:
        type: local
        local:
          directory: /data/rules
      rule_path: /tmp/prom-rules
      remote_write:
        enabled: true
        clients:
          local:
            url: http://prometheus:9090/api/v1/write
            queue_config:
              # send immediately as soon as a sample is generated
              capacity: 1
              batch_send_deadline: 0s
    schema_config:
      configs:
        - from: "2023-07-01"
          index:
            period: 24h
            prefix: index_
          object_store: filesystem
          schema: v12
          store: tsdb
    server:
      http_listen_address: 0.0.0.0
      grpc_listen_address: 0.0.0.0
      http_listen_port: 3100
      grpc_listen_port: 9095
      log_level: info
    storage_config:
      tsdb_shipper:
        active_index_directory: /data/tsdb-index
        cache_location: /data/tsdb-cache
        shared_store: filesystem
      filesystem:
        directory: /data/chunks
    table_manager:
      retention_deletes_enabled: true
      retention_period: 120h
    memberlist:
      join_members: ["loki-read-headless", "loki-write-headless"]
      dead_node_reclaim_time: 30s
      gossip_to_dead_nodes_time: 15s
      left_ingesters_timeout: 30s
      bind_addr: ['0.0.0.0']
      bind_port: 7946
      gossip_interval: 2s
    querier:
      query_ingesters_within: 2h
    query_scheduler:
      max_outstanding_requests_per_tenant: 1024

Edit2:

I have now configured Loki as follows:

  • Each write pod deployed through the StatefulSet has a dedicated volume for the index, which means every write pod writes and stores its own index files (see the StatefulSet sketch after this list)
  • The read pods have a shared volume for the index cache, which means all read pods access the same cache files
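
To make clearer what I mean, below is a stripped-down sketch of my write StatefulSet. Names like loki-write, loki-data-shared and the image tag are just examples; the important part is the volumeClaimTemplates entry, which gives every write pod its own index volume, while the shared rook-cephfs volume still holds the chunks:

    # Stripped-down sketch of the write StatefulSet (names and sizes are examples).
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: loki-write
    spec:
      serviceName: loki-write-headless
      replicas: 3
      selector:
        matchLabels:
          app: loki-write
      template:
        metadata:
          labels:
            app: loki-write
        spec:
          containers:
            - name: loki
              image: grafana/loki:2.8.2
              args: ["-config.file=/etc/loki/loki.yaml", "-target=write"]
              volumeMounts:
                - name: data          # shared rook-cephfs volume (chunks, rules, ...)
                  mountPath: /data
                - name: tsdb-index    # per-pod volume from the template below
                  mountPath: /data/tsdb-index
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: loki-data-shared
      # volumeClaimTemplates creates a dedicated PVC per pod (loki-write-0, -1, ...),
      # so each writer keeps its own TSDB index files instead of sharing them.
      volumeClaimTemplates:
        - metadata:
            name: tsdb-index
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi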

So far I don't see the index deletion errors anymore. Can I be sure that every read request has access to the whole index?
