My writeup of what was happening is correct, but I shouldn't have tried to diagnose the cause so late at night: the function I pointed to is in the EAD converter, not the exporter, which is obviously where the problem is. But the description of the symptoms is correct.
Some of these issues became apparent when testing patch #1745 (for issue #1720).
&nbsp; in some places will generate errors in PDF export, but not always.
Expected Behavior
[1] Should succeed in exporting PDF from the Staff as well as the Public apps.
[2] Other named character entities should be encoded properly in EAD exports by being converted into numeric entities, and only lone ampersands should be escaped.
(i.e. the pattern in the inner_xml function below needs to be more specific so it doesn't match entities.)
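A minimal sketch of the behavior requested in [2], assuming nothing about the actual ArchivesSpace code: the helper name `normalize_entities`, the `LONE_AMPERSAND` pattern, and the small `NAMED_ENTITIES` table are all hypothetical and only illustrate the idea. The pattern uses a negative lookahead so that an ampersand is escaped only when it does not already begin an entity or character reference, and known named entities are then rewritten as numeric references.

```ruby
# Hypothetical helper for illustration only; not ArchivesSpace code.

# Small illustrative subset of HTML named entities mapped to code points.
# A real implementation would need the full HTML entity table.
NAMED_ENTITIES = {
  'nbsp'   => 160,
  'atilde' => 227,
  'Atilde' => 195,
  'eacute' => 233
}.freeze

# The five entities predefined by XML; these are valid as-is and left alone.
XML_PREDEFINED = %w[amp lt gt quot apos].freeze

# An ampersand that does NOT start a named entity or a decimal/hex
# character reference (i.e. a "lone" ampersand).
LONE_AMPERSAND = /&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/

def normalize_entities(text)
  # Step 1: escape only the ampersands that are not part of an entity.
  escaped = text.gsub(LONE_AMPERSAND, '&amp;')

  # Step 2: rewrite known named entities as numeric character references.
  escaped.gsub(/&([A-Za-z][A-Za-z0-9]*);/) do |match|
    name = Regexp.last_match(1)
    if XML_PREDEFINED.include?(name) || !NAMED_ENTITIES.key?(name)
      match # leave XML built-ins and unknown names untouched
    else
      format('&#%d;', NAMED_ENTITIES[name])
    end
  end
end

puts normalize_entities('less than &nbsp; 1 &nbsp; linear foot')
puts normalize_entities('Smith & Sons')
```

Names that are not in the table are passed through unchanged in this sketch, so an XML consumer would still reject them; whether to escape, drop, or report such names is a separate design decision.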
Current Behavior
Depending on where exactly the entities are, this function will replace the ampersand with an &amp; entity: https://github.com/archivesspace/archivesspace/blob/master/backend/app/converters/lib/xml_sax.rb#L221
Inserting several instances of &nbsp; for testing, in one place this is the output EAD fragment:
<physdesc id="aspace_b818d732f87bd233ec68c8d8d764afa3"><extent altrender="materialtype spaceoccupied">150 items</extent> <extent altrender="carrier">1 Hollinger box</extent> <dimensions>less than &nbsp; 1 &nbsp; linear foot</dimensions></physdesc>
This causes an error when producing the PDF, because the nbsp entity is undefined.
However, elsewhere it is serialized as:
<physdesc> <dimensions id="aspace_d4005b5f554e4603d55436ef91de7fc4">less than &amp;nbsp; 1 &amp;nbsp; linear foot</dimensions> </physdesc>
And adding some arbitrary named character entities to title also produces escaped ampersands:
<unittitle>M&amp;atilde;ry &amp;Atilde;. Wilson p&amp;atilde;p&amp;eacute;rs</unittitle>
Note that all of those examples seem to display properly in the PUI (they are known entities in HTML), and after the #1745 fix they seem to work properly in the PUI PDF download.
They also display properly in the Staff view (HTML again) and only break when output as EAD/XML or PDF.
Possible Solution
But first need to figure out why some ampersands are escaped and others are not.
Steps to Reproduce (for bugs)
Context
Current behavior is inconsistent between Staff/PUI display and PDF/EAD serialization, and between Staff PDF and Public PDF exports.
Your Environment