Bold/italic/underline and other simple HTML code in reviews


Inferos

Esteemed fellow forumers,

 

I looked everywhere but it seems I cannot find a viable solution to this issue.

 

I need to allow some basic HTML code in the reviews, such as bold, italic, underline and the symbol ":". HTMLSpecialChars strips all of that away. I'm aware of the security risks involved in simply removing the HTMLSpecialChars occurrences, and obviously I won't do that. I'm just trying to get some simple text formatting, because for some products (I run a bookstore) I insert book reviews.

 

Thanks in advance to anyone who can come up with an idea.


Any good web application will disable arbitrary HTML tags in code that is sent to the browser, so that miscreants can't insert nasty code into their posted text (such as reviews). The PHP function htmlspecialchars() turns < into &lt;, > into &gt;, & into &amp;, and (depending on configuration) " into &quot; and ' into &#039;. This has the effect of disabling all HTML tags and printing them out as literal text. I'm surprised that you have trouble with colons ( : ), as they shouldn't be treated specially.
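A short illustration of that escaping, as a sketch only: the sample strings are made up, and ENT_QUOTES is passed explicitly so that both quote styles are converted regardless of the PHP version's defaults.

```php
<?php
// Sample input (an assumption for illustration) containing every
// character that htmlspecialchars() treats specially.
$raw = '<b>"Tom" & \'Jerry\'</b>';

// ENT_QUOTES converts both double and single quotes.
$escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
echo $escaped, "\n";
// prints: &lt;b&gt;&quot;Tom&quot; &amp; &#039;Jerry&#039;&lt;/b&gt;

// A colon has no special meaning in HTML and passes through unchanged.
$colon = htmlspecialchars('Title: Subtitle', ENT_QUOTES, 'UTF-8');
echo $colon, "\n";
// prints: Title: Subtitle
```

If colons are being altered in your output, some other function in the chain is the likelier cause.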

 

If you want to let users apply "safe" formatting to the text they post, there are a couple of ways to do it. Both involve replacing the htmlspecialchars() calls (in reviews) with a new function. That function could implement "BBCode" (square-bracket tags such as [ i ] for italics), which is converted to real HTML tags (<i>) on output to the browser. Look in the code of a forum such as SMF for the "parse_bbc()" function, or write your own if the set of allowed tags is small. The alternative is to examine all HTML tags in the text (anything starting with < ) and let a small set (<i>, <b>, etc.) pass, while disabling everything else as htmlspecialchars() would. It would also be possible to look for "nasty" tags (<script>, <iframe>, etc.) and disable only those, although that is riskier because a hazardous tag is easy to overlook.

 

A quick look doesn't show any existing add-ons to do this, but be sure to check carefully before going through the effort of coding these changes. Don't forget to handle the matching end tags </i> or [ /i ] too!


Many thanks for your reply. Indeed, BBCode was on my mind too, but I couldn't find a ready-made contribution to use.

 

I downloaded SMF and found the parse_bbc() function. It looks a bit tough to integrate into OSC, but if I have some time soon I'll give it a shot.

 

Apparently not many people needed this; OSC forum shows just 2-3 users trying to remove the HTMLSpecialChars() entirely, which is a huge mistake.

 

First thing that comes to my mind is to create a function that replaces for example [ i ] with < i > and so on, but there's a large list of tags that remain and need to be stripped away completely, like iframe, script etc. If I miss one, there's a security hole.

 

Basically, I needed a function that does everything HTMLSpecialChars does, except for bold, italic, underline.

 

Indeed, in the reviews, HTMLSpecialChars changes ( : ) to ( ; ).


Load TinyMCE instead. It comes in a full version as well as a basic one.

Or you could add extra input fields, say for author, publisher, date etc., which are displayed in bold or italic by default. Does that help?


CHOOCH


I'm not sure TinyMCE would help -- you still need to remove hazardous HTML tags somehow. TinyMCE just provides a WYSIWYG way to get tags into the text, doesn't it?

 

Quote: "First thing that comes to my mind is to create a function that replaces for example [ i ] with < i > and so on, but there's a large list of tags that remain and need to be stripped away completely, like iframe, script etc. If I miss one, there's a security hole."

Well, the idea is to first run htmlspecialchars() to disable all HTML tags, and then to go through and replace BBCode tags (square brackets) with their HTML equivalents (angle brackets). You can implement just the ones you want (e.g., italic, bold, underline, certain symbols, and even colors and other special effects). Anything else is simply ignored and left unmolested. That shouldn't be too big a task.
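The escape-first approach just described can be sketched as follows. This is a minimal sketch, not code taken from osCommerce or SMF: the function name render_review_bbcode() and the list of allowed tags are assumptions made for illustration.

```php
<?php
// Hypothetical helper: escape everything, then re-enable a small
// whitelist of BBCode tags.
function render_review_bbcode(string $text): string
{
    // Step 1: disable every HTML tag in the user's text.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Step 2: convert only the whitelisted BBCode pairs to HTML.
    // Anything not listed here stays as literal text.
    $allowed = array('b', 'i', 'u');
    foreach ($allowed as $tag) {
        // Lazy (.*?) keeps separate pairs separate; "i" ignores case,
        // "s" lets a pair span several lines.
        $pattern = '#\[' . $tag . '\](.*?)\[/' . $tag . '\]#is';
        $safe = preg_replace($pattern, '<' . $tag . '>$1</' . $tag . '>', $safe);
    }
    return $safe;
}

echo render_review_bbcode('[b]Great[/b] book <script>alert(1)</script>'), "\n";
// prints: <b>Great</b> book &lt;script&gt;alert(1)&lt;/script&gt;
```

Because the escaping runs first, a forgotten tag is not a security hole: it simply shows up as plain text. An opening tag without its matching end tag is also left as literal text.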

 

The alternative is to seek out all <'s and </'s and examine the following two characters. For i>, b>, u> (upper or lower case) you just leave them be. Anything else gets disabled. This could be expanded to other various tags, but greatly increases your programming task.
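One way to approximate this whitelist idea without scanning character by character is to escape everything and then restore only the exact tags you trust. This is a sketch of a swapped-in technique, not the scanning loop described above; the function name allow_basic_tags() is an assumption.

```php
<?php
// Hypothetical helper: let bare <b>, <i>, <u> and their end tags pass,
// and disable everything else.
function allow_basic_tags(string $text): string
{
    // Disable all tags first.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Restore only escaped tags that consist of an optional slash and a
    // single allowed letter. Tags carrying attributes do not match and
    // therefore stay disabled.
    return preg_replace('#&lt;(/?)([biu])&gt;#i', '<$1$2>', $safe);
}

echo allow_basic_tags('<b>bold</b> <script>bad()</script>'), "\n";
// prints: <b>bold</b> &lt;script&gt;bad()&lt;/script&gt;
```

Note that this sketch does not check that tags are balanced, so a stray end tag in the input would still reach the browser.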

 

Quote: "Indeed, in the reviews, HTMLSpecialChars changes ( : ) to ( ; )."

All colons get changed, or just some? htmlspecialchars() isn't supposed to affect colons. I wonder if something else is changing them? Colons have no special meaning in HTML.


Problem solved. Thanks a lot for your input on this.

 

For those interested, here's what I did:

 

1. Created a new function in includes/functions/general.php:

 

function tep_sanitize_string_reviews($string) {
  // Collapse runs of spaces. preg_replace() stands in for ereg_replace(),
  // which was removed in PHP 7.
  $string = preg_replace('/ +/', ' ', trim($string));

  // Turn every bracket style into a parenthesis so no raw HTML tag survives.
  $string = str_replace(array('<', '[', '{'), '(', $string);
  $string = str_replace(array('>', ']', '}'), ')', $string);

  // Turn (b)...(/b), (i)...(/i) and (u)...(/u) back into real tags.
  // Lazy matching (.*?) keeps separate pairs separate. The /e modifier is
  // dropped: it evaluated the replacement as PHP code and was removed in PHP 7.
  $string = preg_replace('#\(b\)(.*?)\(/b\)#', '<b>$1</b>', $string);
  $string = preg_replace('#\(i\)(.*?)\(/i\)#', '<em>$1</em>', $string);
  $string = preg_replace('#\(u\)(.*?)\(/u\)#', '<u>$1</u>', $string);

  return $string;
}
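One detail worth checking in patterns like these is greedy versus lazy matching. The sketch below uses a made-up sample string to show how a greedy (.*) behaves when a review contains two bold sections, compared with a lazy (.*?).

```php
<?php
// Sample text (an assumption) with two separate bold sections,
// already converted to the parenthesis form.
$text = '(b)one(/b) and (b)two(/b)';

// Greedy: (.*) runs to the LAST (/b), swallowing the middle tags.
$greedy = preg_replace('#\(b\)(.*)\(/b\)#', '<b>$1</b>', $text);
echo $greedy, "\n";
// prints: <b>one(/b) and (b)two</b>

// Lazy: (.*?) stops at the FIRST (/b), so each pair is handled alone.
$lazy = preg_replace('#\(b\)(.*?)\(/b\)#', '<b>$1</b>', $text);
echo $lazy, "\n";
// prints: <b>one</b> and <b>two</b>
```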

 

2. Then I used the function where I needed it, which is in product_reviews_info.php. Put this where you want the review text to appear:

 

tep_break_string(nl2br(tep_sanitize_string_reviews($review['reviews_text'])), 60, '-<br>')

 

 

It's excellent this way, as I have the bold/italic/underline only in the review text body, but anywhere else (reviews box, reviews list etc.), I get the non-html, simple text version, which is perfect.

 

3. Use the classic BBCode-style tags in your review text: [b]text[/b], [i]text[/i] and [u]text[/u]. You can easily expand this to other tags as well.

 

 

Quote: "All colons get changed, or just some? htmlspecialchars() isn't supposed to affect colons. I wonder if something else is changing them? Colons have no special meaning in HTML."

 

You are correct again. It was replaced by another function I had, implemented from a contribution. OSC default functions, such as htmlspecialchars(), had nothing to do with it.


Here is an even simpler version of the function:

 

function tep_sanitize_string_reviews($string) {
  // Collapse runs of spaces. preg_replace() stands in for ereg_replace(),
  // which was removed in PHP 7.
  $string = preg_replace('/ +/', ' ', trim($string));

  // Disable raw HTML by turning angle brackets and braces into parentheses.
  // Square brackets are left alone because the BBCode tags below use them.
  $string = str_replace(array('<', '{'), '(', $string);
  $string = str_replace(array('>', '}'), ')', $string);

  // Allowed BBCode tags. U makes .+ lazy, i ignores case.
  $subs = array(
    '/\[b\](.+)\[\/b\]/Ui' => '<b>$1</b>',
    '/\[i\](.+)\[\/i\]/Ui' => '<i>$1</i>',
    '/\[u\](.+)\[\/u\]/Ui' => '<u>$1</u>'
  );

  $string = preg_replace(array_keys($subs), array_values($subs), $string);

  return $string;
}
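The pattern table makes it easy to add further tags. The sketch below shows the array form of preg_replace() on its own, with one extra entry; the [s] strikethrough tag is a hypothetical addition and not part of the function above.

```php
<?php
// Pattern => replacement table in the same style as the function above.
$subs = array(
    '/\[b\](.+)\[\/b\]/Ui' => '<b>$1</b>',
    '/\[s\](.+)\[\/s\]/Ui' => '<s>$1</s>', // hypothetical extra tag
);

// Sample text (an assumption) with mixed case and a repeated tag.
$text = '[B]bold[/B] and [s]struck[/s] and [s]again[/s]';

// preg_replace() applies each pattern in turn to the whole string.
$out = preg_replace(array_keys($subs), array_values($subs), $text);
echo $out, "\n";
// prints: <b>bold</b> and <s>struck</s> and <s>again</s>
```

The U modifier is what keeps the two [s] sections separate here, and the i modifier is why [B] is accepted.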

Archived

This topic is now archived and is closed to further replies.
