By default, Elasticsearch will create 5 shards for each new index it receives from Logstash. While 5 shards may be a good default, there are times when you may want to increase or decrease this value.
Suppose you are splitting your data up into a number of indexes, and you are keeping data for 30 days:
- web-servers
- database-servers
- mail-servers
At 10 shards per index per day (5 shards x 2 copies), three daily indexes kept for 30 days adds up to 900 shards. Considering that each shard is its own Lucene index, this has the potential to be a lot of overhead.
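If you want to see how quickly shards accumulate on your own cluster, the _cat API can count them (a quick check, assuming your node answers on the usual port):
# each line of output from _cat/shards represents one shard
curl -s elasticsearch.example.com:9200/_cat/shards | wc -l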
Templates
Elasticsearch uses templates to define the settings, including shard counts, for new indexes. You can view the Elasticsearch template for Logstash with this HTTP GET:
_template/logstash?pretty
From a Linux / Mac terminal:
curl elasticsearch.example.com:9200/_template/logstash?pretty
My config will likely look different from yours, since it enables ‘doc_values’ as described in this blog.
Observe below how the template uses a wildcard, logstash-*, to apply to all Logstash indexes:
{
"logstash" : {
"order" : 0,
"template" : "logstash-*",
"settings" : { },
"mappings" : {
"_default_" : {
"dynamic_templates" : [ {
"date_fields" : {
"mapping" : {
"format" : "dateOptionalTime",
"doc_values" : true,
"type" : "date"
},
"match" : "*",
"match_mapping_type" : "date"
}
}, {
"byte_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "byte"
},
"match" : "*",
"match_mapping_type" : "byte"
}
}, {
"double_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "double"
},
"match" : "*",
"match_mapping_type" : "double"
}
}, {
"float_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "float"
},
"match" : "*",
"match_mapping_type" : "float"
}
}, {
"integer_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "integer"
},
"match" : "*",
"match_mapping_type" : "integer"
}
}, {
"long_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "long"
},
"match" : "*",
"match_mapping_type" : "long"
}
}, {
"short_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "short"
},
"match" : "*",
"match_mapping_type" : "short"
}
}, {
"string_fields" : {
"mapping" : {
"index" : "not_analyzed",
"omit_norms" : true,
"doc_values" : true,
"type" : "string"
},
"match" : "*",
"match_mapping_type" : "string"
}
} ],
"properties" : {
"@version" : {
"index" : "not_analyzed",
"doc_values" : true,
"type" : "string"
}
},
"_all" : {
"enabled" : true
}
}
},
"aliases" : { }
}
}
Modify default shard count
To change the default shard count, we will need to modify the settings field in the template.
"settings" : {
"number_of_shards" : 2
},
...
Before you proceed, understand that this only applies to new indexes. You can’t change the shard count (or mapping) of an existing index without reindexing your data.
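If you want to double-check the shard count on an index that already exists, you can read it back from the index settings (a quick sketch; the index name below is just an example):
# look for index.number_of_shards in the response
curl elasticsearch.example.com:9200/logstash-2015.06.01/_settings?pretty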
Save the template to your workstation
elk='http://elasticsearch.example.com:9200'
curl "$elk/_template/logstash?pretty" > ~/Desktop/logstash-template.json
Back up the file, then edit the ‘settings’ section of your file to reflect the number of shards that you want. I’m going to change mine from the default of 5 down to 2.
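A simple copy is enough for the backup (assuming the path from the previous step):
# keep a pristine copy in case the edit goes wrong
cp ~/Desktop/logstash-template.json ~/Desktop/logstash-template.json.bak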
Remove unneeded fields from template
The most important, and least clearly documented, step of uploading a template is removing the following lines:
"logstash" : {
"order" : 0,
}
Before
{
"logstash" : {
"order" : 0,
"template" : "logstash-*",
"settings" : { },
"mappings" : {
"_default_" : {
"dynamic_templates" : [ {
After
{
"template" : "logstash-*",
"settings" : {
"number_of_shards": 2
},
"mappings" : {
"_default_" : {
"dynamic_templates" : [ {
Final result
Here is the full config with the ‘logstash’ and ‘order’ fields removed:
{
"template" : "logstash-*",
"settings" : {
"number_of_shards": 2
},
"mappings" : {
"_default_" : {
"dynamic_templates" : [ {
"date_fields" : {
"mapping" : {
"format" : "dateOptionalTime",
"doc_values" : true,
"type" : "date"
},
"match" : "*",
"match_mapping_type" : "date"
}
}, {
"byte_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "byte"
},
"match" : "*",
"match_mapping_type" : "byte"
}
}, {
"double_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "double"
},
"match" : "*",
"match_mapping_type" : "double"
}
}, {
"float_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "float"
},
"match" : "*",
"match_mapping_type" : "float"
}
}, {
"integer_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "integer"
},
"match" : "*",
"match_mapping_type" : "integer"
}
}, {
"long_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "long"
},
"match" : "*",
"match_mapping_type" : "long"
}
}, {
"short_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "short"
},
"match" : "*",
"match_mapping_type" : "short"
}
}, {
"string_fields" : {
"mapping" : {
"index" : "not_analyzed",
"omit_norms" : true,
"doc_values" : true,
"type" : "string"
},
"match" : "*",
"match_mapping_type" : "string"
}
} ],
"properties" : {
"@version" : {
"index" : "not_analyzed",
"doc_values" : true,
"type" : "string"
}
},
"_all" : {
"enabled" : true
}
}
},
"aliases" : { }
}
Verify that you have valid JSON by using a tool like this one.
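If you prefer to check from the terminal, Python’s built-in json.tool will also catch syntax errors (assuming Python is installed):
# prints the parsed JSON on success, an error message on failure
python -m json.tool < ~/Desktop/logstash-template.json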
Double-check your config, then upload it to any Elasticsearch node. You can specify a file to upload by prefacing the filename with @. In this case, I named my file ‘foobar.json’:
cd ~/Desktop
elk='http://elasticsearch.example.com:9200'
curl -XPUT "$elk/_template/logstash" -d "@foobar.json"
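If the upload succeeds, Elasticsearch should respond with something like:
{"acknowledged":true}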
If everything uploaded correctly, you can check with the same command you ran earlier
curl elasticsearch.example.com:9200/_template/logstash?pretty
Tomorrow’s index should only have 2 shards.
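One way to check is the _cat shards API (quote the URL so your shell doesn’t expand the wildcard; new indexes should show 2 primaries plus their replicas):
curl 'elasticsearch.example.com:9200/_cat/shards/logstash-*?v'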
If there is a problem reading or uploading the file, you will see a warning (this one is from curl) and the template will be left unchanged.
Warning: Couldn't read data from file "foobar", this makes an empty
Warning: POST.
Additional Resources