Support for separated update/query Solr's by epugh · Pull Request #12 · mattweber/solr-for-wordpress

epugh · 2011-09-12T21:26:18Z

Matt,

I've been doing some hacking around analytics for Solr, and using a proxy to pick up the analytic data I need. My test bed has been our own website. I just hacked up the plugin to support separated query/update Solrs (or at least the end points!!).

I have a screenshot of the admin panel change: https://skitch.com/epugh/f2pt8/solr-options-open-source-search-engine-implementation-solr-lucene-search-integration-wordpress

There is one slight CSS issue in the formatting of the radio button options, but otherwise it works and starts with sane defaults. This is my first attempt at working with Wordpress, so any feedback appreciated. Thought others might have a similar need.

Eric Pugh

shaksi · 2011-09-12T22:16:49Z

Hey Eric,

I just had a look at your request and it looks good. Having looked at your changes, the first thing that came to mind, and something I have tried to tackle previously, is multicore search (or searching different servers).

We need to figure out a flexible way of using a single server connection function to decide which server it should be reading from or writing to.

Although I haven't worked it all out, the way I was envisioning it was to make $port/$host/$path an associative array (with a similar UI to yours).

This would allow us to set a parameter in the URL whereby we can decide which core/server to retrieve the search results from. Naturally, we would only be able to write/send our index to one server, but we could search as many as we need.

How does something like that sound?
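
For readers following along, here is a minimal sketch of the idea described above: connection settings kept in a map keyed by server ID, one fixed update server, and a URL parameter that picks the search server. It is written in Python for brevity rather than the plugin's PHP, and every name in it (the server IDs, hosts, the `server` parameter name, the helper functions) is an assumption for illustration, not the plugin's actual code.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical server table keyed by server ID (hosts, ports, paths are made up).
SERVERS = {
    "master": {"host": "solr1.example.com", "port": 8983, "path": "/solr"},
    "slave1": {"host": "solr2.example.com", "port": 8983, "path": "/solr"},
}

UPDATE_SERVER = "master"          # only one server ever receives index updates
DEFAULT_SEARCH_SERVER = "slave1"  # used when the request names no valid server


def base_url(server_id):
    """Build the Solr base URL for a configured server ID."""
    s = SERVERS[server_id]
    return f"http://{s['host']}:{s['port']}{s['path']}"


def search_server_for(request_url):
    """Pick the search server from a 'server' query parameter, if it is valid."""
    params = parse_qs(urlparse(request_url).query)
    requested = params.get("server", [None])[0]
    if requested in SERVERS:
        return requested
    return DEFAULT_SEARCH_SERVER
```

An unknown or missing `server` value falls back to the default, so a stale link cannot point a search at an undefined endpoint.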

epugh · 2011-09-13T11:22:27Z

Shakur,

Good to hear from you. I was trying to divine the best way to hook my code in, and then realized that it seemed to be set up for multiple different "connections". At any rate, the model I was working from was the Ruby Sunspot library for Solr that allows you to separate out the writer from the readers, that was kind of where I came from.

Honestly though, I am not sure that I quite grok what you mean by multicore search. Are you suggesting issuing a query across solr1 and solr2? But not via the distributed search pattern? Or, do you mean that you might have an architecture with 1 master and 2 slaves... I think that load balancing between the various slaves is more the domain of the Solr backend, versus the frontend Wordpress plugin?

Any way that you can incorporate solving the itch I am trying to scratch, of separating out the writer from the reader totally works for me, I am not emotionally tied up in my little bit of hackery! And would love to go back to using the master branch!

Eric

Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
Co-Author: Solr 1.4 Enterprise Search Server available from http://www.packtpub.com/solr-1-4-enterprise-search-server
This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.

mattweber · 2011-09-13T22:40:18Z

Eric,

This is great, especially if you want to index to a master and query off of a slave(s). If Shakur doesn't get a chance to merge this in, I will try and do it this weekend. Thanks for the patch!

Shakur,

Can you explain what you mean? To handle high QPS, one would use this patch to index to a master, set that master instance up to replicate to some slaves, and set up load balancing between the slaves using haproxy, nginx, etc. I don't think we want the PHP client doing the load balancing, even though I think it supports it.

Thanks,
Matt Weber

shaksi · 2011-09-15T01:49:56Z

Okay, branching out to 'multi_core_and_servers'.
Just prepped things (merged Eric's work and am trying to build on top of it). It's a bit late at the moment, but I will give you guys an explanation of what I was talking about in a few hours.

epugh · 2011-09-15T13:43:18Z

Cool. And remember, from Solr's perspective, it doesn't matter whether multiple cores are multiple servers or not! They are all just URLs!

shaksi · 2011-09-15T13:53:08Z

Yes, you are right :D

The above-mentioned branch works with master search but can be extended to take any other server/core for that matter.
We just need to somehow surface a UI for that.


shaksi · 2011-09-15T20:49:56Z

Okay lads, I have this in a stable state.

I have made it so that s4w_get_solr now accepts server-ID-based keys given during the initial setup.
We always have a 'master' defined as the default, and any number of slave servers can be defined.

From the plugin page, one can now decide which of the defined servers the plugin will use for search and for update.
That is not to say those are the only options: we can now include the server parameter as part of the search, and if the ID provided is valid, that instance will be used for the search.

https://skitch.com/shaksi/f3mdh/solr-options-solr.test-wordpress <<< admin page

https://skitch.com/shaksi/f3mf8/another-search-results-solr.test << setting server parameter

https://skitch.com/shaksi/f3mfi/without-server-parameter << without server parameter uses the options set in the admin page

As per your instruction, a given user can go from a single-server setup to a multi-server one, and vice versa, without any problem.

https://skitch.com/shaksi/f3mgj/single-server-instance
https://skitch.com/shaksi/f3mg4/single-server-search

Hope that all makes sense. I have merged and pushed to master; please have a play around.
Feedback welcome.

mattweber · 2011-09-15T20:54:33Z

Is the code smart enough to allow only 1 update server selection or send documents to each update server when more than 1 is selected?

shaksi · 2011-09-15T21:09:45Z

As things stand, there is one canonical update server, which can be set to any given server.
Similarly, for search, one instance is chosen as the default, but searching is not limited to it. I have allowed leeway for occasions where someone might want to search a different server than the default (for example, the default one indexes and returns WordPress content and the other holds some external data).

IMO it isn't the plugin's place to be meddling with sending data to more than one server, as that is a setup issue. As you mentioned above, one indexes to the master and lets it replicate to all the other instances as required.

epugh · 2011-09-15T21:12:17Z

I like. Is there a limit to how many slaves you can have? I assume it's dynamic, not fixed to 1 master and two slaves?

Again, I guess it's good that the plugin does the roundrobin-ing, however I would think that roundrobin-ing via the plugin would be a less common way, versus having a real load balancer in front of the slaves.

Two nitpicks would be:

1. Do we need some messaging to tell users the difference between an update and a query Solr? Or do we assume folks understand it if they are using the plugin?
2. The messaging under "Single Solr Server" should probably not refer to Solr 1.4, since the version changes. And really, that line of "Download, install, and configure your own Solr 1.4 instance" doesn't make sense, I don't think...

mattweber · 2011-09-15T21:17:11Z

Yea I like this too. Not a fan of the option button between selecting index/search hosts. I would think the first host defined is the master, and any others after that are search slaves. I really don't think anyone will use more than 1 search server because a real load balancer will be considerably better.

Eric is right, we need to clean up the wording and make sure we define the differences between search and index servers. Users will get confused otherwise. Then again, maybe we just name that tab "Advanced" setup and assume the user knows what they are doing if they go in there.

Thanks,
Matt Weber

shaksi · 2011-09-15T21:22:19Z

Argh, it's late and I was running out of the office! I have now reread your comment.

The answer is yes, it is smart enough and does only send the documents to the one designated server.

Where to send documents is decided by the following option:
[s4w_server][type][update]

Similarly where to search by default is decided by the following option:
[s4w_server][type][search]

The above mentioned options contain ServerIDs generated by the plugin.
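
As a rough illustration of that lookup, the sketch below resolves a role ("update" or "search") to a server ID and then to connection settings. It is in Python rather than the plugin's PHP, and it is only a guess at the shape: the `["s4w_server"]["type"]["update"]` and `["search"]` keys come from the comment above, while the `info` key, the server IDs, and the hosts are assumptions made up for the example.

```python
# Hypothetical shape of the stored plugin options. Only the
# ["s4w_server"]["type"]["update"] / ["search"] keys come from the thread;
# the "info" key, server IDs, and hosts are assumptions for illustration.
OPTIONS = {
    "s4w_server": {
        "info": {
            "master": {"host": "solr-master.example.com", "port": 8983, "path": "/solr"},
            "slave_1": {"host": "solr-slave.example.com", "port": 8983, "path": "/solr"},
        },
        "type": {"update": "master", "search": "slave_1"},
    }
}


def server_for(options, role):
    """Return (server_id, settings) for the role 'update' or 'search'."""
    servers = options["s4w_server"]
    server_id = servers["type"][role]  # the option stores a server ID, not a URL
    return server_id, servers["info"][server_id]
```

Because each role maps to exactly one ID, documents only ever go to the single designated update server.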


mattweber · 2011-09-15T21:24:59Z

So if more than 1 search server is defined, we only search across the selected one? If yes, I think we need to use the load balancing built into the Solr PHP API and round-robin between them.

Thanks,
Matt Weber

epugh · 2011-09-15T21:39:29Z

Sounds great!

Eric


shaksi · 2011-09-15T22:03:51Z

I agree the UI needs work! I don't know what the right way to go about it is.

The way I see it at the moment, we haven't actually been clear as to what we mean when we say master/slave!

Are those even the appropriate terms to use?

I like the idea of renaming the second tab to "Advanced".

@eric
With regards to the number of servers, there is no limit. Although round robin could in theory be done with this plugin, the plugin itself does not implement it (on the fly, anyway).

What it does is give a user the ability to define more than one server and modify the search form to provide a dropdown of the servers which could be searched.

Example: http://bfc.staging.headshift.com (it uses radio buttons to allow users to select between Solr instances)

PS I am making the assumption that the listed server instances might not contain the same data. Is that a fair assumption? Or is it beyond the realm of a plugin for a specific CMS?


mattweber · 2011-09-15T23:20:06Z

Definitely beyond the realm of this plugin. We should only be searching across servers that contain data indexed using the plugin. I think a simple 2 server setup like the original patch is probably the way to go. For the majority of users that is all they will ever need and it reduces the complexity of using the plugin considerably.

epugh · 2011-09-16T15:04:22Z

+1 Aim for the simplest use case that solves the itch!


dustinrue · 2011-10-10T19:13:32Z

I'd say that if a user has more than one read server they should be using some outside method for load balancing. Been using an old version of the plugin and it's great to see that I can now use two servers. In our setup we have a master solr box with a couple of load balanced read (search) servers. Solr for Wordpress's current setup is perfect.

epugh · 2011-10-10T19:38:14Z

Glad to hear it!

shaksi · 2011-10-10T21:00:58Z

Awesomeness. Sorry I haven't got back to this discussion... @epugh & @mattweber it's settled then: for advanced users we will only offer the simplest use case, master and slave. The former takes care of the CRUD business and the latter is just for reading.

added support for separated update/query solrs

53ed495

shaksi closed this in 9f631c1 Sep 15, 2011

shaksi reopened this Sep 15, 2011
