Modify SqlStreamingXml XmlWriter to internally use a MemoryStream#3974
Modify SqlStreamingXml XmlWriter to internally use a MemoryStream#3974jimhblythe wants to merge 1 commit intodotnet:mainfrom
Conversation
…tead of a StringBuilder. (dotnet#1877) Note: UTF8Encoding(false) addition in s_writerSettings is consistent with prior default used within StringWriter/StringBuilder
|
@dotnet-policy-service agree |
| //_xmlWriter.WriteNode(_xmlReader, true); | ||
| // _xmlWriter.Flush(); | ||
| WriteXmlElement(); | ||
| // Update memoryStreamRemaining based on the number of chars just written to the MemoryStream |
There was a problem hiding this comment.
What's written to the MemoryStream is bytes, the number of bytes and chars can be quite different depending on encoding. It might be clearer to just refer to byte counts in the comment.
There was a problem hiding this comment.
I agree for a MemoryStream, but within method GetChars() the stream is passed into XmlWriter.Create() with specific Encoding = new UTF8Encoding(false) to prevent the BOM, as well as meet the XML expectation.
The only use for writing is within WriteXmlElement() consistent with only XML characters being written. Changing the comment for this specific method might dilute that only chars are written. Thoughts?
There was a problem hiding this comment.
Ultimately, the comment should assist future maintenance so I'm good with modifying it to achieve that goal while also being more inline with MemoryStream semantics.
Something felt unexpected about the prior StringWriter using the default UTF-8 encoding, and why I included "Note: UTF8Encoding(false)" within the PR comment. Perhaps there is a different, existing issue with SqlClient support of other encodings. I've seen references to using FOR XML with non-XML columns causing concerns.
There was a problem hiding this comment.
It's hard to tell what the right thing to do is. I'll go with your intuition as you currently know the code better.
do the new tests you've added work on the old implementation as well? What sort of code coverage do we have?
There was a problem hiding this comment.
I will modify the comment on Mon. and also comment regarding the above "UTF8Encoding(false) addition in s_writerSettings" as the PR note is not as good as within the actual code.
- LinearSingleNode failed with prior code base as it most directly replicated the logic within SqlBenchmarker - time delta more than 20% of linear.
- LinearMultipleNodes passes with both versions - time delta within 20% of linear for 1MB vs. 5 x 1MB.
TDD-wise, I was using a locally modified version SqlBenchmarker testing extremes - up to 500MB single node & up to 500 x 1MB nodes. Both ran longer than I cared to wait on prior SqlClient code and completed under 3 minutes on new. Minimal time difference between single and multiple nodes, but noticeable memory usage difference through Task Manager.
I'm not sure on code coverage tests. I will review them before pushing comment changes. I did not run the PerformanceTests, so I will have to get my environment configured for them.
Modify SqlStreamingXml XmlWriter to internally use a MemoryStream instead of a StringBuilder.
Note: UTF8Encoding(false) addition in s_writerSettings is consistent with prior default used within StringWriter/StringBuilder
Issues
Fixes #1877 to be O(n)
Testing
Added 2 new Manual tests to ensure linear behavior for single large node, and secondary validation for multiple nodes
