Wednesday, January 15, 2014

Document routing problem, upgrading from SolrCloud 4.0.0 to later 4.x

Problem


Situation


You are using Solr 4.0.0 in "cloud" mode - that is, using ZooKeeper to make several Solr nodes work together in a cluster.

You want to upgrade to a later version of Solr 4.x (e.g. Solr 4.4.0).

You already have a collection with at least 10 shards.

Description


You do the upgrade by simply installing the new Solr binaries, config, etc. on top of the old installation, of course leaving the data folder(s) untouched.

When you start the upgraded Solr nodes, they will not route documents to shards the same way as they did before the upgrade (for collections with at least 10 shards).

Consequences


Solr nodes internally use the routing algorithm to decide which shard a new or updated document goes to, so the changed routing has the following consequences:
  • If you try to insert a new document with the same id as an existing document, the new document will (potentially) go to a shard different from the one where the existing document lives. You will be left with two documents in store (the old one and the new one), and if you are using optimistic locking you will not receive the intended exception saying that you cannot insert a document that already exists.
  • If you try to update an existing document (by using the same id), the update request will (potentially) go to a shard different from the one where the document already exists. Depending on consistency features (like optimistic locking), you will either be left with two documents in store (the old one and the new one), or you will receive an exception saying that you cannot update a document that does not exist (even though it does).
  • If you do real-time gets from clients by looking at where Solr will route a document with a specific id (e.g. using SOLR-5360), you will (potentially) not find the document, even though it actually exists.

Reason


When SolrCloud creates a collection, it writes information about that collection to clusterstate.json, persisted in ZooKeeper. Among other things, clusterstate.json contains a range for each shard in the collection. The ranges are made by cutting the number range from Integer.MIN_VALUE to Integer.MAX_VALUE into equally sized portions and assigning one portion to each shard. The range on a shard says which documents belong to that shard: a hash is made of the document's id, and the document belongs to the shard whose range the hash falls into.
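To make the mechanics concrete, here is a small self-contained sketch of such an equal-portion partitioning of the 32-bit hash space. This is an illustration only, not Solr's actual HashPartitioner code; the class and method names are made up, and String.hashCode stands in for Solr's real hash function.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSketch {

    static class Range {
        final int min, max; // inclusive bounds
        Range(int min, int max) { this.min = min; this.max = max; }
        boolean includes(int hash) { return hash >= min && hash <= max; }
    }

    // Cut the full 32-bit range into `partitions` roughly equal slices.
    // The span (2^32) does not fit in an int, so the math is done in longs.
    static List<Range> partitionRange(int partitions) {
        long min = Integer.MIN_VALUE;
        long span = 1L << 32;
        long step = span / partitions;
        List<Range> ranges = new ArrayList<Range>();
        long start = min;
        for (int i = 0; i < partitions; i++) {
            // The last slice absorbs the division remainder.
            long end = (i == partitions - 1) ? Integer.MAX_VALUE : start + step - 1;
            ranges.add(new Range((int) start, (int) end));
            start = end + 1;
        }
        return ranges;
    }

    // The shard owning a document is the one whose range contains hash(id).
    static int shardFor(int hash, List<Range> ranges) {
        for (int i = 0; i < ranges.size(); i++) {
            if (ranges.get(i).includes(hash)) return i;
        }
        throw new IllegalStateException("no range covers hash " + hash);
    }

    public static void main(String[] args) {
        List<Range> ranges = partitionRange(10);
        for (Range r : ranges) {
            System.out.println(Integer.toHexString(r.min) + " - " + Integer.toHexString(r.max));
        }
        System.out.println("doc-1 goes to shard index " + shardFor("doc-1".hashCode(), ranges));
    }
}
```

Note that every hash value falls into exactly one range, so routing is unambiguous as long as everyone agrees on which shard got which range - which is exactly what breaks in the scenario described below.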

In SolrCloud 4.0.0, SolrCloud does not use the ranges written to clusterstate.json when it routes documents. Instead it uses ranges calculated in the code - ranges that are supposed to be distributed among shards the same way as they are distributed in clusterstate.json.
But if you have 10 or more shards in your collection, the calculated ranges will not match the ranges in clusterstate.json.
Ranges in clusterstate.json are distributed among shards by shard number: shard1 gets the 1st range, shard2 gets the 2nd range ... shard10 gets the 10th range ... shardN gets the last range.
Ranges actually used for routing are distributed among shards in the order where shards are sorted by their name. E.g. if you have 15 shards, shard1 gets the 1st range, shard10 gets the 2nd range, shard11 gets the 3rd range ... shard2 gets the 8th range, etc.
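The mismatch is easy to reproduce: lexicographic sorting of shard names only agrees with numeric order while there are at most 9 shards. A self-contained illustration (shard names as in the example above):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ShardOrderDemo {

    // Shard names in creation order: shard1, shard2, ..., shardN.
    static List<String> creationOrder(int n) {
        List<String> names = new ArrayList<String>();
        for (int i = 1; i <= n; i++) names.add("shard" + i);
        return names;
    }

    public static void main(String[] args) {
        List<String> byNumber = creationOrder(15);       // order used in clusterstate.json
        List<String> byName = new ArrayList<String>(byNumber);
        Collections.sort(byName);                        // order used for routing in 4.0.0

        // With 15 shards the two orders diverge: the 2nd range goes to
        // shard2 in clusterstate.json but to shard10 in the routing code.
        System.out.println(byNumber.get(1)); // shard2
        System.out.println(byName.get(1));   // shard10
    }
}
```

With 9 or fewer shards the sorted list happens to equal the numbered list, which is why collections with fewer than 10 shards are unaffected.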

Solution


There are several solutions. Solr releases after 4.0.0 allow you to specify your own routing algorithm per collection, so you could write a routing algorithm that reproduces the 4.0.0 behaviour and use it for the "old" collections.
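As an illustration of what such a replacement router would have to do, the sketch below reproduces the sorted-by-name assignment described under "Reason". This is not Solr's routing API: LegacyRouter and route are made-up names, and String.hashCode stands in for Solr's real hash function. A real implementation would have to plug into Solr's per-collection routing extension point instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LegacyRouter {

    // Route a document id to a shard name the way Solr 4.0.0 did:
    // equal hash ranges handed out to shards sorted by *name*, not by number.
    static String route(String docId, int numShards) {
        List<String> shardsByName = new ArrayList<String>();
        for (int i = 1; i <= numShards; i++) shardsByName.add("shard" + i);
        Collections.sort(shardsByName); // lexicographic: shard1, shard10, shard11, ...

        long min = Integer.MIN_VALUE;
        long step = (1L << 32) / numShards;   // size of each range
        long hash = docId.hashCode();         // stand-in for Solr's real hash function
        int index = (int) ((hash - min) / step);
        if (index >= numShards) index = numShards - 1; // remainder belongs to last range
        return shardsByName.get(index);
    }

    public static void main(String[] args) {
        System.out.println(route("doc-42", 15));
    }
}
```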

Instead, I suggest that you modify clusterstate.json when you upgrade from 4.0.0 to a later 4.x version.

Code

 
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

import java.io.File;
import java.lang.reflect.Field;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

import org.apache.commons.io.FileUtils;
import org.apache.solr.common.cloud.ClusterState;
import org.apache.solr.common.cloud.HashPartitioner.Range;
import org.apache.solr.common.cloud.Slice;
import org.apache.solr.common.cloud.ZkNodeProps;
import org.apache.solr.common.cloud.ZkStateReader;
import org.apache.zookeeper.data.Stat;

/**
 * Reads a clusterstate.json file (extracted from ZooKeeper), rewrites each
 * shard's range to match the range that Solr 4.0.0 actually used when routing
 * documents, and writes the corrected state to a new file.
 *
 * The relevant fields have no public API in 4.0.0, so they are reached via reflection.
 */
public class CorrectShardRangesInClusterState {

 public static void main(String[] args) throws Exception {
  String inputClusterStateFile = args[0];
  String outputClusterStateFile = args[1];

  byte[] bytes = FileUtils.readFileToByteArray(new File(inputClusterStateFile));
  Stat stat = new Stat();
  ClusterState clusterState = ClusterState.load(stat.getVersion(), bytes, new HashSet<String>());

  for (Entry<String, RangeInfo> rangeInfo : getRangeInfos(clusterState).entrySet()) {
   RangeInfo ri = rangeInfo.getValue();
   Map<String, Slice> shards = clusterState.getCollectionStates().get(rangeInfo.getKey());
   for (int i = 0; i < ri.getShardList().size(); i++) {
    String shard = ri.getShardList().get(i);
    int min = ri.getRanges().get(i).min;
    int max = ri.getRanges().get(i).max;
    Slice slice = shards.get(shard);
    Range range = getRange(slice);
    if ((range.min != min) || (range.max != max)) {
     System.out.println("Changing collection " + rangeInfo.getKey() + ", shard " + shard + ", min " + Integer.toHexString(range.min) + " -> " + Integer.toHexString(min) + ", max " + Integer.toHexString(range.max) + " -> " + Integer.toHexString(max));
     range.min = min;
     range.max = max;
    }
    Map<String, Object> newPropMap = new HashMap<String, Object>(slice.getProperties());
    newPropMap.put(Slice.RANGE, range);
    setPropMap(slice, Collections.unmodifiableMap(newPropMap));
   }
  }

  bytes = ZkStateReader.toJSON(clusterState.getCollectionStates());
  FileUtils.writeByteArrayToFile(new File(outputClusterStateFile), bytes);
 }

 /** Wrapper around ClusterState's private per-collection range info. */
 private static class RangeInfo {
  Object clusterStateRangeInfo;

  private RangeInfo(Object clusterStateRangeInfo) {
   this.clusterStateRangeInfo = clusterStateRangeInfo;
  }

  /** The shard names in the order the 4.0.0 routing code assigns ranges to them. */
  @SuppressWarnings("unchecked")
  private List<String> getShardList() throws Exception {
   Field field = clusterStateRangeInfo.getClass().getDeclaredField("shardList");
   field.setAccessible(true);
   List<String> result = (List<String>) field.get(clusterStateRangeInfo);
   field.setAccessible(false);
   return result;
  }

  /** The ranges actually used for routing in 4.0.0. */
  @SuppressWarnings("unchecked")
  private List<Range> getRanges() throws Exception {
   Field field = clusterStateRangeInfo.getClass().getDeclaredField("ranges");
   field.setAccessible(true);
   List<Range> result = (List<Range>) field.get(clusterStateRangeInfo);
   field.setAccessible(false);
   return result;
  }
 }

 /** Reads Slice's private "range" field - the range written to clusterstate.json. */
 private static Range getRange(Slice slice) throws Exception {
  Field field = Slice.class.getDeclaredField("range");
  field.setAccessible(true);
  Range result = (Range) field.get(slice);
  field.setAccessible(false);
  return result;
 }

 /** Reads ClusterState's private "rangeInfos" field holding the calculated ranges. */
 @SuppressWarnings("unchecked")
 private static Map<String, RangeInfo> getRangeInfos(ClusterState clusterState) throws Exception {
  Field field = ClusterState.class.getDeclaredField("rangeInfos");
  field.setAccessible(true);
  Map<String, Object> clusterStateRangeInfos = (Map<String, Object>) field.get(clusterState);
  Map<String, RangeInfo> result = new HashMap<String, RangeInfo>(clusterStateRangeInfos.size());
  for (Entry<String, Object> entry : clusterStateRangeInfos.entrySet()) {
   result.put(entry.getKey(), new RangeInfo(entry.getValue()));
  }
  field.setAccessible(false);
  return result;
 }

 /** Replaces the slice's private propMap so that the corrected range gets serialized. */
 private static void setPropMap(Slice slice, Map<String, Object> newPropMap) throws Exception {
  Field field = ZkNodeProps.class.getDeclaredField("propMap");
  field.setAccessible(true);
  field.set(slice, newPropMap);
  field.setAccessible(false);
 }

}
 

How to use it


  • Compile CorrectShardRangesInClusterState above against the Solr 4.0.0 code
  • Before doing the upgrade (still running Solr 4.0.0)
    • Stop Solr
    • Extract clusterstate.json from ZooKeeper to file clusterstate_old.json
    • Run CorrectShardRangesInClusterState like this:
      java -classpath .:${SOLR_4_0_0_INSTALL}/dist/apache-solr-solrj-4.0.0.jar:${SOLR_4_0_0_INSTALL}/dist/solrj-lib/zookeeper-3.3.6.jar:${SOLR_4_0_0_INSTALL}/dist/solrj-lib/commons-io-2.1.jar:${SOLR_4_0_0_INSTALL}/dist/solrj-lib/slf4j-api-1.6.4.jar CorrectShardRangesInClusterState clusterstate_old.json clusterstate_new.json
    • Upload/replace clusterstate.json in ZooKeeper with content from generated file clusterstate_new.json
  • Now do the upgrade to your target Solr version (after 4.0.0)

Disclaimer


No warranty. Test it thoroughly before you do it in production!

I have successfully used CorrectShardRangesInClusterState as described above to fix clusterstate.json before upgrading from Solr 4.0.0 to Solr 4.4.0.

The exact versions between which the problem starts (4.0.0 on one side, any later 4.x version on the other) have only been determined by code inspection. But the problem is definitely there between 4.0.0 and 4.4.0.
