Introduction
I have recently been upgrading one of my systems that uses Solr in "cloud"-mode, that is, using ZooKeeper to make several Solr-nodes work together in a cluster. I upgraded from Solr 4.0.0 to 4.4.0, but this post is also relevant for several other combinations of from- and to-versions; you just need to pick out what applies to you.
Basically you can just stop your Solr-nodes, copy the new binaries and start the Solr-nodes again, but
- If you upgrade from 4.0.0 to a newer version, routing will not work the same way anymore
- If you, like me, want the state/configuration of the system after the upgrade to be as it would have been if you had used 4.4.0 all along, there are several differences you need to correct
Differences
Let me go through the differences I encountered between a 4.0.0-to-4.4.0 upgraded system and a clean 4.4.0 system
clusterstate.json
clusterstate.json lives in the root (of Solr-area) in ZooKeeper. It contains information about collections, shards and replicas.
Before the upgrade my clusterstate.json looked like this (Solr 4.0.0)
{
  "my_collection_1":{
    "shard1":{
      "range":"80000000-8ccccccc",
      "replicas":{
        "my_host_1:8983_solr_my_collection_1_shard1_replica1":{
          "shard":"shard1",
          "roles":null,
          "state":"active",
          "core":"my_collection_1_shard1_replica1",
          "collection":"my_collection_1",
          "node_name":"my_host_1:8983_solr",
          "base_url":"http://my_host_1:8983/solr",
          "leader":"true"}}},
    "shard2": ... 19 more shards under "my_collection_1" ...
  },
  "my_collection_2": ... 23 more collections ...
}
If I had used Solr 4.4.0 all along my clusterstate.json would have looked like this
{
  "my_collection_1":{
    "shards":{
      "shard1":{
        "range":"80000000-8ccccccc",
        "state":"active",
        "replicas":{
          "core_node1":{
            "state":"active",
            "core":"my_collection_1_shard1_replica1",
            "node_name":"192.168.xxx.yyy:8983_solr",
            "base_url":"http://192.168.xxx.yyy:8983/solr",
            "leader":"true"}}},
      "shard2": ... 19 more shards under "my_collection_1" ...
    },
    "router":"compositeId"
  },
  "my_collection_2": ... 23 more collections ...
}
Differences
- All shards at collection-level have been wrapped inside a shards-map in 4.4.0
- This is probably to make room for other key-values at collection-level. router=compositeId has been added at this level in 4.4.0
- state=active has been added at shard-level in 4.4.0
- Replica-keys/names have changed from something of the form <hostname>:<port>_<context>_<shard-name>_replica<X> to just core_node<Y>
- shard, roles and collection are no longer present at replica-level in 4.4.0
- node_name and base_url are now based on IP instead of hostname
After just starting 4.4.0 on top of the system that used to run 4.0.0, my clusterstate.json looked like this
{
  "my_collection_1":{"shards":{
    "shard1":{
      "range":"80000000-8ccccccc",
      "replicas":{
        "my_host_1:8983_solr_my_collection_1_shard1_replica1":{
          "state":"active",
          "core":"my_collection_1_shard1_replica1",
          "node_name":"my_host_1:8983_solr",
          "base_url":"http://my_host_1:8983/solr",
          "leader":"true"}},
      "state":"active"},
    "shard2": ... 19 more shards under "my_collection_1" ...
  }},
  "my_collection_2": ... 23 more collections ...
}
Which of the differences between 4.0.0 and 4.4.0 were automatically "corrected" by 4.4.0 when started on top of a system that used to run 4.0.0?
- All shards at collection-level have been wrapped inside a shards-map. Check!
- router=compositeId has not been added at collection-level :-(
- state=active has been added at shard-level. Check!
- Replica-keys/names have not been changed :-(
- shard, roles and collection have been removed from replica-level. Check!
- node_name and base_url are not based on IP :-(
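To see which of the three remaining differences still apply to a given collection, something like the following could be used. It is only a sketch: it assumes clusterstate.json has already been parsed into nested maps (for example the way ZkStateReader.fromJSON is used in Appendix 2), the class and method names are made up for illustration, and the IP check is a rough IPv4 pattern, not a full validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: reports where one collection entry
// of a parsed clusterstate.json still differs from a clean 4.4.0 entry.
final class ClusterStateStyleCheck {
    // Replica keys in a clean 4.4.0 system look like core_node1, core_node2, ...
    private static final Pattern CORE_NODE_KEY = Pattern.compile("core_node\\d+");
    // Rough check for an IPv4-based node_name such as 192.168.0.1:8983_solr
    private static final Pattern IPV4_NODE_NAME =
            Pattern.compile("\\d{1,3}(\\.\\d{1,3}){3}:\\d+_.*");

    static List<String> remainingDifferences(Map<String, Object> collection) {
        List<String> findings = new ArrayList<String>();
        if (!collection.containsKey("router")) {
            findings.add("router is missing at collection-level");
        }
        Object shards = collection.get("shards");
        if (!(shards instanceof Map)) {
            findings.add("shards are not wrapped inside a shards-map");
            return findings;
        }
        for (Map.Entry<?, ?> shard : ((Map<?, ?>) shards).entrySet()) {
            if (!(shard.getValue() instanceof Map)) {
                continue; // e.g. an elided or malformed shard entry
            }
            Object replicas = ((Map<?, ?>) shard.getValue()).get("replicas");
            if (!(replicas instanceof Map)) {
                continue;
            }
            for (Map.Entry<?, ?> replica : ((Map<?, ?>) replicas).entrySet()) {
                String key = String.valueOf(replica.getKey());
                if (!CORE_NODE_KEY.matcher(key).matches()) {
                    findings.add(shard.getKey() + ": replica-key " + key
                            + " is not of the form core_node<N>");
                }
                if (!(replica.getValue() instanceof Map)) {
                    continue;
                }
                Object nodeName = ((Map<?, ?>) replica.getValue()).get("node_name");
                if (nodeName == null
                        || !IPV4_NODE_NAME.matcher(nodeName.toString()).matches()) {
                    findings.add(shard.getKey() + ": node_name of " + key
                            + " is not based on IP");
                }
            }
        }
        return findings;
    }
}
```

Calling remainingDifferences on each value of the top-level map gives one list per collection; an empty list means none of the three checks found anything.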
solr.xml
In 4.0.0 my solr.xml files contain a <cores>-tag with lots of <core>-tags underneath. This still works after upgrading to 4.4.0, but you will be running in "legacy mode", which will not be supported from Solr 5.x
I would like my solr.xml files in my 4.0.0-to-4.4.0 upgraded system to be as they would have been if I had run 4.4.0 all along. Appendix 1) below shows my 4.4.0 solr.xml
core.properties files
In Solr 4.0.0 there are no core.properties files in <solr-home>/<replica-name> on disk (<solr-home> is controlled by VM-param -Dsolr.solr.home given when you start your Solr web-container (e.g. Jetty))
In Solr 4.4.0 there is a core.properties file for each replica in <solr-home>/<replica-name>. For my_collection_1 | shard1 | core_node1 it contains the following
name=my_collection_1_shard1_replica1
shard=shard1
collection=my_collection_1
coreNodeName=core_node1
I would like to have core.properties files in my 4.0.0-to-4.4.0 upgraded system, just as I would have if I had run 4.4.0 all along
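As a small illustration of that file format, here is a sketch that renders the four lines for one replica and reads them back with java.util.Properties. The class and method names are hypothetical; only the four keys and the example values come from the file shown above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper: renders and parses the content of a core.properties
// file with the four keys shown above.
final class CorePropertiesSketch {

    // Builds the file content in a fixed key order, one key=value per line.
    static String render(String core, String shard, String collection, String coreNodeName) {
        return "name=" + core + "\n"
                + "shard=" + shard + "\n"
                + "collection=" + collection + "\n"
                + "coreNodeName=" + coreNodeName + "\n";
    }

    // Parses file content using the standard Java properties format.
    static Properties parse(String content) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(content));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail
            throw new IllegalStateException(e);
        }
        return properties;
    }

    public static void main(String[] args) {
        String content = render("my_collection_1_shard1_replica1", "shard1",
                "my_collection_1", "core_node1");
        System.out.print(content);
        System.out.println(parse(content).getProperty("coreNodeName"));
    }
}
```

The sketch does no escaping, so it assumes that names contain no characters that are special in properties files (such as backslashes or leading whitespace).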
Data in collection znodes
For each collection there exists a znode (folder) in ZooKeeper at /collections/<collection-name> (in Solr-area). As you probably know, a znode can contain data, even though it is a folder (contains "children")
In 4.0.0 the data of those collection-znodes is
{"configName":"my_conf"}
In 4.4.0 the data is
{"configName":"my_conf", "router":"implicit"}
I would like that in my 4.0.0-to-4.4.0 upgraded system as well.
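The 4.4.0-style znode data is small enough to build by hand. The following is a minimal sketch with hypothetical names; it assumes the configuration name contains no quotes or backslashes, and it encodes the bytes explicitly as UTF-8, which is my choice and not something the znode format above dictates.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the data for a /collections/<collection-name> znode
// in the 4.4.0 style shown above.
final class CollectionZnodeData {

    static String json(String configName, String router) {
        requirePlain(configName);
        requirePlain(router);
        return "{\"configName\":\"" + configName + "\", \"router\":\"" + router + "\"}";
    }

    static byte[] bytes(String configName, String router) {
        return json(configName, router).getBytes(StandardCharsets.UTF_8);
    }

    // No JSON escaping is implemented, so reject values that would need it.
    private static void requirePlain(String value) {
        if (value.indexOf('"') >= 0 || value.indexOf('\\') >= 0) {
            throw new IllegalArgumentException("Value needs JSON escaping: " + value);
        }
    }

    public static void main(String[] args) {
        System.out.println(json("my_conf", "implicit"));
    }
}
```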
How I corrected the differences
Now that we have seen all the differences I encountered, let's look at what I did to "correct" them, in order for my 4.0.0-to-4.4.0 upgraded system to seem as if it had been 4.4.0 all along
- Make sure ZooKeeper is running, but that Solr 4.0.0 nodes are not
- Extract/download (using your favorite tool) clusterstate.json from ZooKeeper (root of Solr-area) into a folder <my-favorite-folder>/upgrade/before on the machine from which you do the upgrade
- Correct ranges in clusterstate.json as explained here (only necessary if you upgrade from 4.0.0)
- Compile ClusterState4_0ToClusterStateAndCoreProperties4_4Upgrader.java from Appendix 2) below against 4.4.0 Solr code. You need to implement the method hostnameToIP yourself first
- Convert clusterstate.json from 4.0.0-style to 4.4.0-style and generate all core.properties files
- By now you have your new configuration files in <my-favorite-folder>/upgrade/after
- clusterstate.json
- <IP>/data/<replica-name>/core.properties (an <IP> for each Solr-node in your system, and a <replica-name> for each replica run by that Solr-node)
- Upload (using your favorite tool) <my-favorite-folder>/upgrade/after/clusterstate.json to ZooKeeper (root of Solr-area) replacing the existing one
- Upload all core.properties files (bash example)
- Compile SolrConfigDirInZookeeperUpgrader.java from Appendix 3) below against 4.4.0 Solr code
- Modify data in all /collections/<collection-name> znodes in ZooKeeper (in Solr-area)
- Now install the 4.4.0 binaries on all Solr-nodes (replacing existing 4.0.0 binaries) and start them again - voilà!
The commands I used for the steps above were:
Correcting the ranges (only necessary if you upgrade from 4.0.0)
java -classpath .:${SOLR_4_0_0_INSTALL}/dist/apache-solr-solrj-4.0.0.jar:${SOLR_4_0_0_INSTALL}/dist/solrj-lib/zookeeper-3.3.6.jar:${SOLR_4_0_0_INSTALL}/dist/solrj-lib/commons-io-2.1.jar:${SOLR_4_0_0_INSTALL}/dist/solrj-lib/slf4j-api-1.6.4.jar CorrectShardRangesInClusterState <my-favorite-folder>/upgrade/before/clusterstate.json <my-favorite-folder>/upgrade/after_ranges_fix/clusterstate.json
Converting clusterstate.json and generating the core.properties files
java -classpath .:${SOLR_4_4_0_INSTALL}/dist/apache-solr-solrj-4.4.0.jar:${SOLR_4_4_0_INSTALL}/dist/solrj-lib/commons-io-2.1.jar ClusterState4_0ToClusterStateAndCoreProperties4_4Upgrader <my-favorite-folder>/upgrade/after_ranges_fix <my-favorite-folder>/upgrade/after
Uploading the core.properties files (bash example)
for IP in <IP#1> <IP#2> ... <IP#N>; do # mention the IPs of all your Solr-nodes
  scp -r <my-favorite-folder>/upgrade/after/${IP}/data/. <solr-node-user>@${IP}:<solr-home>
done
Modifying the data in the collection znodes
java -classpath .:${SOLR_4_4_0_INSTALL}/dist/apache-solr-solrj-4.4.0.jar:${SOLR_4_4_0_INSTALL}/dist/solrj-lib/zookeeper-3.4.5.jar SolrConfigDirInZookeeperUpgrader <solr-zookeeper-connection-string> <name-of-your-solr-configuration>
Disclaimer
No warranty. Test it thoroughly before you do it in production!
I have successfully followed the sketched procedure and upgraded a 4.0.0 system, so that it seems as if it had been running Solr 4.4.0 all along
Appendix
1) My 4.4.0 solr.xml
<solr>
  <str name="sharedLib">${sharedLib:}</str>
  <solrcloud>
    <str name="host">${host:}</str>
    <int name="hostPort">${jetty.port:8983}</int>
    <str name="hostContext">${hostContext:solr}</str>
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
  </solrcloud>
  <shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:0}</int>
    <int name="connTimeout">${connTimeout:0}</int>
  </shardHandlerFactory>
</solr>
2) ClusterState4_0ToClusterStateAndCoreProperties4_4Upgrader.java
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
import java.io.File;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

import org.apache.commons.io.FileUtils;
import org.apache.solr.common.cloud.ZkStateReader;

public class ClusterState4_0ToClusterStateAndCoreProperties4_4Upgrader {

  /*
   * Assuming that the environment is set up with topology
   */
  public static void main(String[] args) throws Exception {
    String inputFolder = args[0];
    String outputFolder = args[1];

    byte[] bytes = FileUtils.readFileToByteArray(new File(inputFolder + File.separator + "clusterstate.json"));
    Map<String, Object> stateMap = (Map<String, Object>) ZkStateReader.fromJSON(bytes);

    for (Entry<String, Object> collectionEntry : stateMap.entrySet()) {
      // core_node numbers are counted per collection
      int nextCoreNodeNameNumber = 1;
      Object collectionEntryValue = collectionEntry.getValue();
      if (collectionEntryValue instanceof Map<?, ?>) {
        for (Entry<?, ?> collectionMapEntry : ((Map<?, ?>)collectionEntryValue).entrySet()) {
          if (collectionMapEntry.getKey().toString().startsWith("shard")) {
            Object shardEntryValue = collectionMapEntry.getValue();
            if (shardEntryValue instanceof Map<?, ?>) {
              Map<String, Object> shardEntryMap = (Map<String, Object>)shardEntryValue;
              shardEntryMap.put("state", "active");
              Object replicasEntryValue = shardEntryMap.get("replicas");
              if (replicasEntryValue instanceof Map<?, ?>) {
                Map<String, Object> replicasEntryMap = (Map<String, Object>)replicasEntryValue;

                // Re-key the replicas from the 4.0.0-style names to core_node<N>
                List<Object> replicasEntryMapValues = new ArrayList<Object>(replicasEntryMap.values());
                replicasEntryMap.clear();
                for (Object replicasEntryMapValue : replicasEntryMapValues) {
                  String coreNodeName = "core_node" + (nextCoreNodeNameNumber++);
                  replicasEntryMap.put(coreNodeName, replicasEntryMapValue);
                }

                for (Entry<String, Object> replicasEntryMapEntry : replicasEntryMap.entrySet()) {
                  Object replicasEntryMapEntryValue = replicasEntryMapEntry.getValue();
                  if (replicasEntryMapEntryValue instanceof Map<?, ?>) {
                    Map<String, Object> replicaMap = (Map<String, Object>)replicasEntryMapEntryValue;
                    String nodeName = replicaMap.get("node_name").toString();
                    String hostname = nodeName.substring(0, nodeName.indexOf(':'));
                    String IP = hostnameToIP(hostname);
                    String core = replicaMap.get("core").toString();
                    String shard = replicaMap.get("shard").toString();
                    String collection = replicaMap.get("collection").toString();

                    // Write <outputFolder>/<IP>/data/<core>/core.properties
                    File corePropertiesDir = new File(outputFolder + File.separator + IP + File.separator + "data" + File.separator + core);
                    corePropertiesDir.mkdirs();
                    File corePropertiesFile = new File(corePropertiesDir, "core.properties");
                    corePropertiesFile.createNewFile();
                    PrintWriter corePropertiesFileWriter = new PrintWriter(corePropertiesFile);
                    try {
                      corePropertiesFileWriter.println("name=" + core);
                      corePropertiesFileWriter.println("shard=" + shard);
                      corePropertiesFileWriter.println("collection=" + collection);
                      corePropertiesFileWriter.println("coreNodeName=" + replicasEntryMapEntry.getKey());
                    } finally {
                      corePropertiesFileWriter.flush();
                      corePropertiesFileWriter.close();
                    }

                    // Bring the replica entry itself to 4.4.0-style
                    replicaMap.remove("roles");
                    replicaMap.remove("shard");
                    replicaMap.remove("collection");
                    replicaMap.put("state", "down");
                    replicaMap.put("node_name", replicaMap.get("node_name").toString().replace(hostname, IP));
                    replicaMap.put("base_url", replicaMap.get("base_url").toString().replace(hostname, IP));
                  }
                }
              }
            }
          }
        }

        // Wrap all shards inside a shards-map and add the router at collection-level
        Map<String, Object> shardEntries = new LinkedHashMap<String, Object>();
        for (Entry<?, ?> collectionMapEntry : ((Map<?, ?>)collectionEntryValue).entrySet()) {
          if (collectionMapEntry.getKey().toString().startsWith("shard")) {
            shardEntries.put(collectionMapEntry.getKey().toString(), collectionMapEntry.getValue());
          }
        }
        for (Entry<String, Object> shardEntry : shardEntries.entrySet()) {
          ((Map<?, ?>)collectionEntryValue).remove(shardEntry.getKey());
        }
        ((Map<String, Object>)collectionEntryValue).put("shards", shardEntries);
        ((Map<String, Object>)collectionEntryValue).put("router", "compositeId");
      }
    }

    bytes = ZkStateReader.toJSON(stateMap);
    FileUtils.writeByteArrayToFile(new File(outputFolder + File.separator + "clusterstate.json"), bytes);

    System.exit(0);
  }

  protected static String hostnameToIP(String hostname) {
    // TODO calculate and return the IP corresponding to hostname
    // The throw below only makes the class compile until you have implemented this
    throw new UnsupportedOperationException("hostnameToIP is not implemented yet");
  }
}
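The hostnameToIP method in the class above is left for you to implement. One possible sketch, assuming that a plain name lookup from the machine doing the upgrade returns the same IP that each Solr-node will register itself with, is shown here. That assumption may not hold on multi-homed hosts or when DNS and /etc/hosts disagree, so check the result against what a clean 4.4.0 node reports. The wrapper class name is made up.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical wrapper class; only the hostnameToIP method name comes from Appendix 2.
final class HostnameToIpSketch {

    // Resolves a hostname to the textual form of the first address the resolver
    // returns. A literal IP address is returned unchanged, without a lookup.
    static String hostnameToIP(String hostname) {
        try {
            return InetAddress.getByName(hostname).getHostAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve hostname: " + hostname, e);
        }
    }

    public static void main(String[] args) {
        // Pass the hostnames of your Solr-nodes as arguments to see what they resolve to
        for (String hostname : args) {
            System.out.println(hostname + " -> " + hostnameToIP(hostname));
        }
    }
}
```

If your nodes have several addresses, a hand-written lookup table from hostname to IP is probably the safer choice.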
3) SolrConfigDirInZookeeperUpgrader.java
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
import java.security.NoSuchAlgorithmException;
import java.util.List;

import org.apache.solr.common.cloud.SolrZkClient;
import org.apache.solr.common.cloud.SolrZooKeeper;
import org.apache.zookeeper.KeeperException;

public class SolrConfigDirInZookeeperUpgrader {

  public static final String SOLR_COLLECTION_NODE = "/collections";
  public static final String SOLR_CONFIG_PREFIX = "{\n" + " \"configName\":\"";
  public static final String SOLR_CONFIG_POSTFIX = "\",\n" + " \"router\":\"implicit\"}";

  public static void main(String[] args) throws KeeperException, InterruptedException, NoSuchAlgorithmException {
    SolrConfigDirInZookeeperUpgrader instance = new SolrConfigDirInZookeeperUpgrader(args[0], args[1]);
    instance.updateSolrConfNameForAllCollections();
  }

  private final String solrZkConnectionStr;
  private final String confName;

  public SolrConfigDirInZookeeperUpgrader(String solrZkConnectionStr, String confName) {
    super();
    this.solrZkConnectionStr = solrZkConnectionStr;
    this.confName = confName;
  }

  // Overwrites the data of every child znode of /collections
  private void updateSolrConfNameForAllCollections() throws KeeperException, InterruptedException {
    final SolrZkClient client = new SolrZkClient(solrZkConnectionStr, 16000);
    try {
      SolrZooKeeper zk = client.getSolrZooKeeper();
      List<String> children = zk.getChildren(SOLR_COLLECTION_NODE, null);
      for (String child : children) {
        updateData(zk, SOLR_COLLECTION_NODE + "/" + child);
      }
    } finally {
      client.close();
    }
  }

  // Version -1 means: set the data regardless of the current version of the znode
  private void updateData(SolrZooKeeper zk, String node) throws KeeperException, InterruptedException {
    zk.setData(node, new String(SOLR_CONFIG_PREFIX + confName + SOLR_CONFIG_POSTFIX).getBytes(), -1);
  }
}