Failsafe when using JSON to load HTML data through PHP and AJAX
When we load JSON data into PHP from an external source, we can run into double-quote problems when decoding the JSON string. A fairly simple regular expression helps keep the JSON string valid.
When loading HTML data we assume the data is valid HTML, so each applicable character is encoded. For user-generated content this is easily achieved with an editor such as TinyMCE or FCKeditor. The data is usually stored in a database so we can display it directly on screen.
Let's assume we would like to use JSON for some extra information. For example:
{ "item_id" : 1, "item_date" : "2007-01-01", "item_html" : "<div>&quot;some item text&quot;</div>" }
This gives us a convenient reference to the item_id on the frontend (webpage), which could be used to create a permanent link to the object. Note that the quotes inside item_html are HTML-encoded (&quot;).
Now consider the following example:
{ "item_id" : 1, "item_date" : "2007-01-01", "item_html" : "<div>&quot;some item text&quot;<a href="external_link">read more...</a></div>" }
The author added a link to the item HTML. Because the double quotes around the href value are neither HTML-encoded nor escaped, they end the JSON string early and json_decode fails.
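The difference between the two examples can be checked with a short script. This is a minimal sketch: the variable names are ours, and the keys are written in double quotes because json_decode only accepts quoted keys.

```php
<?php
// Valid: the quotes inside the HTML are entity-encoded, so the JSON string stays intact.
$valid = '{ "item_id" : 1, "item_date" : "2007-01-01", "item_html" : "<div>&quot;some item text&quot;</div>" }';

// Invalid: the raw double quotes around the href value end the JSON string early.
$broken = '{ "item_id" : 1, "item_date" : "2007-01-01", "item_html" : "<div>&quot;some item text&quot;<a href="external_link">read more...</a></div>" }';

$ok  = json_decode($valid, true);
$bad = json_decode($broken, true);

// json_last_error() reports on the most recent json_decode() call, here the broken one.
$failed = ($bad === null && json_last_error() !== JSON_ERROR_NONE);

var_dump($ok['item_id']); // int(1)
var_dump($failed);        // bool(true)
```

json_decode returns NULL for input it cannot parse, so checking json_last_error is the way to tell a failure from a JSON document that legitimately contains null.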
The following static function can be used as a pre JSON decoding patch:
/**
 * A simple JSON search and replace; HTML output is assumed.
 *
 * @param string $value
 * @return string
 */
static function jsonSafe( $value )
{
    /**
     * The RTE editor should solve "all" our problems; however, it does not
     * escape the double quotes in <a href="test">.
     */
    // Rewrite attribute="value" as attribute='value'.
    $pattern     = array( '/(.*)="(.*)"/iU' );
    $replacement = array( '$1=\'$2\'' );
    $value = preg_replace( $pattern, $replacement, $value );

    // Strip newlines, both as the literal two-character sequences \r and \n
    // (single-quoted) and as real CR/LF characters (double-quoted), then
    // double every remaining backslash.
    return str_replace(
        array( '\r', '\n', "\r", "\n", '\\' ),
        array( '',   '',   '',   '',   '\\\\' ),
        $value
    );
}
This function also strips newlines, since we do not need them and they could interfere with the decoding process. Because we are displaying HTML, there is no visible difference on the webpage (except inside elements such as <pre>, where whitespace is preserved).
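As a quick check, the patch can be run against the failing example. In this sketch jsonSafe is declared as a plain function so the snippet runs on its own, the keys are double-quoted as json_decode requires, and the line break inside item_html is added by us to show the newline stripping.

```php
<?php
// Same body as jsonSafe() in the article, declared as a plain function here.
function jsonSafe($value)
{
    $value = preg_replace(array('/(.*)="(.*)"/iU'), array('$1=\'$2\''), $value);

    return str_replace(
        array('\r', '\n', "\r", "\n", '\\'),
        array('', '', '', '', '\\\\'),
        $value
    );
}

// Raw double quotes around the href value and a real line break inside the string:
// both make json_decode() fail on the unpatched input.
$raw = '{ "item_id" : 1, "item_date" : "2007-01-01", "item_html" : "<div>&quot;some item text&quot;'
     . "\n"
     . '<a href="external_link">read more...</a></div>" }';

$before = json_decode($raw, true);           // fails on the raw input
$item   = json_decode(jsonSafe($raw), true); // decodes after patching

echo $item['item_id'], "\n";
echo $item['item_html'], "\n";
```

Note that the patch doubles every backslash, so input that already contains valid JSON escapes such as \" would be altered; the sketch assumes the incoming string has none.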
Note: this is just a simple, unoptimized function, and there is a lot of room for improvement. For example, the regular expression could be improved and merged with the str_replace.
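One possible way to do that merge is a single preg_replace call with pattern and replacement arrays, which PHP applies in order. This is only a sketch: jsonSafeMerged is a hypothetical name, and the tightened attribute pattern is an assumption rather than the author's code, so edge cases may behave differently from the original function.

```php
<?php
// Hypothetical merged variant: one preg_replace() call instead of preg_replace() + str_replace().
function jsonSafeMerged($value)
{
    $patterns = array(
        '/="([^"\r\n]*)"/',   // attribute="value" (value may not contain a quote or line break)
        '/\\\\[rn]|[\r\n]/',  // literal \r or \n sequences, and real CR/LF characters
        '/\\\\/',             // any backslash that is left
    );
    $replacements = array(
        '=\'$1\'',            // attribute='value'
        '',                   // drop newlines
        '\\\\\\\\',           // four backslash characters here yield two in the output
    );

    return preg_replace($patterns, $replacements, $value);
}

// Input: <a href="x">a\b</a> followed by CR LF.
$out = jsonSafeMerged("<a href=\"x\">a\\b</a>\r\n");

echo $out, "\n";
```

The replacement for a backslash looks excessive because it is unescaped twice, once by the PHP string literal and once by preg_replace itself.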
www.zeger.nl