Parsing is very slow

There are a few reasons why:
1. `Handsoap::XmlQueryFront::NokogiriDriver#to_s` is very inefficient

The method uses many string literals that are constant and never modified. Because they are written as literals rather than constants, Ruby allocates a new String object for each of them every time the method runs. There are also several chained `#gsub` calls, each of which returns a new string, where `#gsub!` could modify the string in place instead.

There is also a note about Nokogiri APIs being unstable. I'm not sure whether this is still the case, but I overrode this method to just call `#content` on the backing Nokogiri node. My solution looks something like this:

``` diff
diff --git a/lib/handsoap/xml_query_front.rb b/lib/handsoap/xml_query_front.rb
index 3df435c..742d7e1 100644
--- a/lib/handsoap/xml_query_front.rb
+++ b/lib/handsoap/xml_query_front.rb
@@ -168,9 +168,8 @@ module Handsoap
       # Returns the underlying native element.
       #
       # You shouldn't need to use this, since doing so would void portability.
-      def native_element
-        @element
-      end
+      attr_reader :native_element
+
       # Returns the node name of the current element.
       def node_name
         raise NotImplementedError.new
@@ -350,13 +349,34 @@ module Handsoap
           element = @element.children.first
         end
         return if element.nil?
+        string = element.content
+
         # This looks messy because it is .. Nokogiri's interface is in a flux
         if element.kind_of?(Nokogiri::XML::CDATA)
-          element.serialize(:encoding => 'UTF-8').gsub(/^<!\[CDATA\[/, "").gsub(/\]\]>$/, "")
+          string.gsub!(EBEGIN_CDATA, BLANK_STRING)
+          string.gsub!(EEND_CDATA,   BLANK_STRING)
         else
-          element.serialize(:encoding => 'UTF-8').gsub('&lt;', '<').gsub('&gt;', '>').gsub('&quot;', '"').gsub('&apos;', "'").gsub('&amp;', '&')
+          string.gsub!(ELT,   LT)
+          string.gsub!(EGT,   GT)
+          string.gsub!(EQUOT, QUOT)
+          string.gsub!(EAPOS, APOS)
+          string.gsub!(EAMP,  AMP)
         end
-      end
+        string
+      end
+      EBEGIN_CDATA = /^<!\[CDATA\[/
+      EEND_CDATA   = /\]\]>$/
+      BLANK_STRING = ''
+      ELT          = '&lt;'
+      LT           = '<'
+      EGT          = '&gt;'
+      GT           = '>'
+      EQUOT        = '&quot;'
+      QUOT         = '"'
+      EAPOS        = '&apos;'
+      APOS         = "'"
+      EAMP         = '&amp;'
+      AMP          = '&'
     end
   end
 end
```
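
To illustrate the allocation point outside of Handsoap, here is a minimal sketch. The names `ENTITIES`, `unescape_chained`, and `unescape_in_place` are made up for this example and are not part of Handsoap. It assumes the input is already a plain string; the chained version creates a new string per `#gsub` call, while the in-place version copies once and mutates that copy.

```ruby
# Illustrative sketch only; none of these names exist in Handsoap.
# The table is built once, so the entity strings are not re-created per call.
ENTITIES = {
  '&lt;'   => '<',
  '&gt;'   => '>',
  '&quot;' => '"',
  '&apos;' => "'",
  '&amp;'  => '&' # kept last so "&amp;lt;" becomes "&lt;" rather than "<"
}.freeze

# Chained style: every #gsub returns a brand new string.
def unescape_chained(text)
  text.gsub('&lt;', '<').gsub('&gt;', '>').gsub('&quot;', '"')
      .gsub('&apos;', "'").gsub('&amp;', '&')
end

# In-place style: one copy up front, then mutate that copy.
def unescape_in_place(text)
  result = text.dup # leave the caller's string untouched
  ENTITIES.each { |entity, char| result.gsub!(entity, char) }
  result # #gsub! returns nil when nothing matched, so return explicitly
end
```

Note that `#gsub!` returns `nil` when it makes no substitution, which is why the result is returned explicitly instead of relying on the value of the last `#gsub!` call.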

2. All the data transformers use `#to_s`

Every transformer calls `#to_s` first, so each of them pays its cost. Even if `#to_s` is fixed, I do not think the other transformers need to unescape the escape sequences, do they?

I don't really have the time to fix this right now and also make sure I don't break the other drivers. :(
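
A rough sketch of what I mean, using a hypothetical `SketchElement` class rather than Handsoap's real element classes (the `@raw_text` attribute is an assumption standing in for whatever the driver exposes as unprocessed node text). Only the string transformer unescapes; the others read the raw text directly.

```ruby
# Hypothetical stand-in for a driver element; not Handsoap's actual class.
class SketchElement
  def initialize(raw_text)
    @raw_text = raw_text
  end

  # Only the string transformer needs entity unescaping.
  def to_s
    return if @raw_text.nil?
    @raw_text.gsub('&lt;', '<').gsub('&gt;', '>').gsub('&amp;', '&')
  end

  # A well-formed integer contains no characters that need escaping,
  # so this reads the raw text and skips #to_s entirely.
  def to_i
    return if @raw_text.nil?
    @raw_text.to_i
  end

  # Same reasoning for booleans.
  def to_boolean
    return if @raw_text.nil?
    @raw_text.strip.downcase == 'true'
  end
end
```

Whether this is safe for every driver is exactly the part I have not verified.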

3. Using XPath is not very efficient for large data structures

Re-walking the XML subtree for every query is expensive for big data structures. I'm not sure whether this is a problem for Handsoap itself, but maybe a note should be added to the documentation.
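
The difference can be sketched as follows. This example uses REXML instead of Nokogiri purely so that it runs without extra gems, and the XML sample and method names are invented for illustration. The first method issues one XPath query per field, each of which searches the children again; the second visits each child element once and dispatches on its name.

```ruby
require 'rexml/document'

# Invented sample data, not a real response.
SAMPLE_XML = <<~DOC
  <issue>
    <key>JRA-1</key>
    <summary>Parsing is slow</summary>
    <votes>3</votes>
  </issue>
DOC

# One XPath query per field: every call searches the subtree again.
def fields_via_xpath(node)
  {
    key:     REXML::XPath.first(node, 'key').text,
    summary: REXML::XPath.first(node, 'summary').text,
    votes:   REXML::XPath.first(node, 'votes').text.to_i
  }
end

# Single pass: visit each child element once and dispatch on its name.
def fields_via_single_pass(node)
  fields = {}
  node.elements.each do |child|
    case child.name
    when 'key'     then fields[:key] = child.text
    when 'summary' then fields[:summary] = child.text
    when 'votes'   then fields[:votes] = child.text.to_i
    end
  end
  fields
end
```

Both methods take the parsed root element, for example `REXML::Document.new(SAMPLE_XML).root`, and build the same hash. The single-pass version only pays off when many fields are read from the same parent.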

I have worked around all of these issues in a gem that uses handsoap: http://github.com/Marketcircle/jiraSOAP.