Support for separated update/query Solr's #12
|
Hey Eric, I just had a look at your request and it looks good. Looking at your changes, the first thing that came to mind, and something I have tried to tackle previously, is multicore search (or searches across different servers). We need to figure out a flexible way of using a single server connection function to decide which server it should be reading from or writing to. Although I haven't worked it all out, the way I was envisioning it was to make $port/$host/$path an associative array (with a UI similar to yours). This would allow us to set a parameter in the URL whereby we can decide which core/server to retrieve the search results from. Naturally we would only be able to write/send our index to one server, but could search as many as we need. How does something like that sound? |
|
Shakur, Good to hear from you. I was trying to divine the best way to hook my code in, and then realized that it seemed to be set up for multiple different "connections". At any rate, the model I was working from was the Ruby Sunspot library for Solr that allows you to separate out the writer from the readers, that was kind of where I came from. Honestly though, I am not sure that I quite grok what you mean by multicore search. Are you suggesting issuing a query across solr1 and solr2? But not via the distributed search pattern? Or, do you mean that you might have an architecture with 1 master and 2 slaves... I think that load balancing between the various slaves is more the domain of the Solr backend, versus the frontend Wordpress plugin? Any way that you can incorporate solving the itch I am trying to scratch, of separating out the writer from the reader totally works for me, I am not emotionally tied up in my little bit of hackery! And would love to go back to using the master branch! Eric
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com |
|
Eric, This is great, especially if you want to index to a master and query off of one or more slaves. If Shakur doesn't get a chance to merge this in, I will try to do it this weekend. Thanks for the patch! Shakur, Can you explain what you mean? To handle high QPS one would use this patch to index to a master, set that master instance up to replicate to some slaves, and set up load balancing between the slaves using haproxy, nginx, etc. I don't think we want the PHP client doing the load balancing, even though I think it supports it. Thanks, |
|
okay branching out to 'multi_core_and_servers'. |
|
Cool. And remember, from Solr's perspective it doesn't matter whether they are multiple cores or multiple servers! They are all just URLs! |
|
Yes, you are right :D The above mentioned branch works with master search but can be extended to take any other server/core for that matter. On 15 Sep 2011, at 14:43, Eric Pugh wrote:
|
|
Okay lads, I have this in a stable state. I have made it so that s4w_get_solr now accepts server-ID-based keys given during the initial setup. From the plugin page one may now decide which of the defined servers the plugin will use for search/update. https://skitch.com/shaksi/f3mdh/solr-options-solr.test-wordpress <<< admin page https://skitch.com/shaksi/f3mf8/another-search-results-solr.test << setting server parameter https://skitch.com/shaksi/f3mfi/without-server-parameter << without server parameter, uses the options set in the admin page. As per your instruction, a given user can go from a single-server setup to a multi-server one, and vice versa, without any problem. https://skitch.com/shaksi/f3mgj/single-server-instance Hope that all makes sense. I have merged and pushed to master; please have a play around. |
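For readers following along, the lookup described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the plugin's PHP. Every name in it (`SERVERS`, `OPTIONS`, `resolve_search_server`, the `server` URL parameter, the example hosts) is an assumption made for the sketch, not the plugin's actual code or option names.

```python
# Hypothetical sketch; none of these names come from the plugin itself.
from urllib.parse import urlparse, parse_qs

# Servers keyed by an ID assigned at setup time (IDs and hosts are made up).
SERVERS = {
    "master": {"host": "solr1.example.com", "port": 8983, "path": "/solr"},
    "slave": {"host": "solr2.example.com", "port": 8983, "path": "/solr"},
}

# Defaults as they might be chosen on the admin page (assumed option names).
OPTIONS = {"update_server": "master", "search_server": "slave"}


def resolve_search_server(request_url, servers=SERVERS, options=OPTIONS):
    """Return (server_id, config) for a search request.

    A known ID in the assumed `server` URL parameter wins; otherwise the
    default search server from the admin options is used.
    """
    query = parse_qs(urlparse(request_url).query)
    requested = query.get("server", [None])[0]
    server_id = requested if requested in servers else options["search_server"]
    return server_id, servers[server_id]


def base_url(config):
    """Build the Solr base URL from a host/port/path entry."""
    return f"http://{config['host']}:{config['port']}{config['path']}"
```

In this sketch an unknown or missing `server` value falls back to the admin-page default rather than failing, which matches the "without server parameter" behaviour described in the comment.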
|
Is the code smart enough to allow only 1 update server selection or send documents to each update server when more than 1 is selected? |
|
As things stand there is one canonical update server, which can be set to any given server. IMO it's not the plugin's place to be meddling with sending data to more than one server, as that is a setup issue. As you mentioned above, one indexes to the master and allows it to replicate to all the other instances as required. |
|
I like. Is there a limit to how many slaves you can have? I assume it's dynamic, not fixed to 1 master and two slaves? Again, I guess it's good that the plugin does the roundrobin-ing, however I would think that roundrobin-ing via the plugin would be a less common way, versus having a real load balancer in front of the slaves. Two nitpicks would be:
|
|
Yea I like this too. Not a fan of the option button between selecting index/search hosts. I would think the first host defined is the master, and any others after that are search slaves. I really don't think anyone will use more than 1 search server because a real load balancer will be considerably better. Eric is right, we need to clean up the wording and make sure we define the differences between search and index servers. Users will get confused otherwise. Then again, maybe we just name that tab "Advanced" setup and assume the user knows what they are doing if they go in there. Thanks, |
|
Argh, it's late and I was running out of the office! Having reread your comment, the answer is yes: it is smart enough and only sends the documents to the one designated server. Where to send documents is decided by the following option: Similarly, where to search by default is decided by the following option: The above-mentioned options contain ServerIDs generated by the plugin. |
|
So if more than 1 search server is defined, we only search across the selected one? If yes, I think we need to use the load balancing built into the Solr PHP API and round-robin between them. Thanks, |
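To make the round-robin idea concrete, here is a minimal sketch in Python. It is not the Solr PHP client's balancer and not anything the plugin ships; the class name `RoundRobin` and the example URLs are assumptions for illustration only.

```python
# Hypothetical sketch of client-side round-robin over search servers.
from itertools import cycle


class RoundRobin:
    """Hand out search server URLs in a fixed rotating order."""

    def __init__(self, urls):
        urls = list(urls)
        if not urls:
            raise ValueError("at least one search server is required")
        self._urls = cycle(urls)

    def next_url(self):
        """Return the URL the next query should be sent to."""
        return next(self._urls)


# Example rotation across two made-up slaves.
balancer = RoundRobin([
    "http://solr2.example.com:8983/solr",
    "http://solr3.example.com:8983/solr",
])
```

A rotation like this knows nothing about server health or load, which is part of why a dedicated load balancer in front of the slaves is argued for elsewhere in this thread.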
|
Sounds great! Eric On Sep 15, 2011, at 5:25 PM, Matt Weber wrote:
|
I agree the UI needs work! I don't know what the right way to go about it is. The way I see it atm, we haven't actually been clear about what we mean when we say master/slave! Are those even the appropriate terms to use? I like the idea of renaming the second tab to "Advanced". @eric What it does is give a user the ability to define more than one server, and modify the search form to provide a dropdown of the servers which could be searched. Example: http://bfc.staging.headshift.com (it uses radio buttons to allow users to select between Solr instances). PS I am making the assumption that the listed server instances might not contain the same data. Is that a fair assumption? Or is it beyond the realm of a plugin for a specific CMS? |
|
Definitely beyond the realm of this plugin. We should only be searching across servers that contain data indexed using the plugin. I think a simple 2 server setup like the original patch is probably the way to go. For the majority of users that is all they will ever need and it reduces the complexity of using the plugin considerably. |
|
+1 Aim for the simplest use case that solves the itch! On Sep 15, 2011, at 7:20 PM, Matt Weber wrote:
|
I'd say that if a user has more than one read server they should be using some outside method for load balancing. Been using an old version of the plugin and it's great to see that I can now use two servers. In our setup we have a master solr box with a couple of load balanced read (search) servers. Solr for Wordpress's current setup is perfect. |
|
Glad to hear it! |
|
Awesomeness! Sorry I haven't got back to this discussion... @epugh & @mattweber it's settled then: for advanced users we will only offer the simplest use case, master and slave, the former to take care of the CRUD business and the latter just for reading. |
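The agreed two-server arrangement can be sketched as below, in Python rather than the plugin's PHP. The class `SolrEndpoints`, the operation names, and the URLs are all assumptions for illustration. Letting a missing query endpoint fall back to the update endpoint is also an assumed way to model the single-server setup.

```python
# Hypothetical sketch of routing writes to the master and reads to the slave.
WRITE_OPERATIONS = {"add", "delete", "commit", "optimize"}


class SolrEndpoints:
    """Hold one update (master) URL and one query (slave) URL."""

    def __init__(self, update_url, query_url=None):
        self.update_url = update_url
        # Single-server setup: searches go to the same server as updates.
        self.query_url = query_url or update_url

    def url_for(self, operation):
        """Pick the endpoint for an operation name (names are assumed)."""
        if operation in WRITE_OPERATIONS:
            return self.update_url
        if operation == "search":
            return self.query_url
        raise ValueError(f"unknown operation: {operation}")


# Two-server setup: the master takes the CRUD traffic, the slave serves searches.
endpoints = SolrEndpoints(
    update_url="http://solr1.example.com:8983/solr",
    query_url="http://solr2.example.com:8983/solr",
)
```

If the query URL points at an external load balancer rather than a single slave, the same two-endpoint shape still covers the setup with several read servers.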
Matt,
I've been doing some hacking around analytics for Solr, and using a proxy to pick up the analytic data I need. My test bed has been our own website. I just hacked up the plugin to support separated query/update Solrs (or at least the end points!!).
I have a screenshot of the admin panel change: https://skitch.com/epugh/f2pt8/solr-options-open-source-search-engine-implementation-solr-lucene-search-integration-wordpress
There is one slight CSS issue in the formatting of the radio button options, but otherwise it works and starts with sane defaults. This is my first attempt at working with Wordpress, so any feedback appreciated. Thought others might have a similar need.
Eric Pugh