Is it actually possible to:
1) Parse the HTML output of BBCode tags
2) Scan it
3) Accurately recreate the original tags?
I see a lot of tags like [indent][/indent] that look like the following in HTML:
HTML Code:
<div style="margin-left:40px">indent_text</div>
Does [indent] _always_ produce exactly that HTML? Relying on it looks fragile to me.
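One way to make this kind of check less brittle is to compare parsed style properties rather than the raw attribute string, since spacing such as `margin-left:40px` versus `margin-left: 40px` may vary. Below is a minimal Python sketch. It assumes indent is rendered as a `div` with a `margin-left` property and center as a `div` with `text-align: center`. The names `parse_style` and `guess_bbcode_tag` are hypothetical, and whether the forum software always emits this markup is still the open question.

```python
from typing import Dict, Optional


def parse_style(style: str) -> Dict[str, str]:
    """Split an inline style string into {property: value}, ignoring spacing and case."""
    props: Dict[str, str] = {}
    for declaration in style.split(";"):
        if ":" not in declaration:
            continue  # skip empty or malformed pieces
        name, value = declaration.split(":", 1)
        props[name.strip().lower()] = value.strip().lower()
    return props


def guess_bbcode_tag(html_tag: str, style: str) -> Optional[str]:
    """Guess the BBCode tag behind an element (assumed mapping, not a verified one)."""
    props = parse_style(style)
    if html_tag == "div" and "margin-left" in props:
        return "indent"
    if html_tag == "div" and props.get("text-align") == "center":
        return "center"
    return None  # unknown: the caller decides how to handle it
```

Returning `None` for anything unrecognised keeps the guess explicit, so unexpected markup is not silently treated as a known tag.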
The reason I'm asking is that I'm trying to parse individual posts on the site and recreate the BBCode tags. I need to know which type of BBCode tag I'm looking at because, depending on the tag, I either want only the text inside it (including the text of its descendants) or nothing at all. For instance, if the tag is an image tag or a quote tag, I ignore it.
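The "text only, but skip some tags" rule can be sketched with Python's standard `html.parser` module. This is only an illustration: it assumes quotes are rendered as `<blockquote>` elements and images as `<img>` elements, which may not match the real markup, and the names `TextOnly` and `visible_text` are hypothetical.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text content, ignoring everything inside skipped elements."""

    SKIPPED = {"blockquote"}  # assumption: how the forum renders [quote]

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # <img> carries no text, so it needs no special handling here.
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the text of a post fragment, minus skipped elements."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)
```

Counting depth instead of using a boolean flag means a quote nested inside another quote is still skipped in full.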
At the moment, I look for certain characteristics that distinguish tags from one another. For instance, [center][/center] is
HTML Code:
<div style = "text-align: center">TEXT</div>
I then recreate the tag (I have a specific type for each tag) as an object consisting of its options plus the list of tags contained within it. So something like:
[color=#ffd421]
[b]
Text, text
[/b]
[/color]
Would become:
PHP Code:
new Color("#ffd421", {
    new Bold({
        new Text("Text, text")
    })
});
Btw, the code above isn't actually PHP; it's just pseudocode.
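For a runnable version of the same idea, here is a minimal Python sketch of the object model: each tag holds an optional option plus a list of children, and can be serialised back to BBCode. The class and helper names (`Node`, `Text`, `Tag`, `Color`, `Bold`, `to_bbcode`) are hypothetical, and a single generic `Tag` class stands in for the per-tag types described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Base class for everything in the tree."""

    def to_bbcode(self) -> str:
        raise NotImplementedError


@dataclass
class Text(Node):
    value: str

    def to_bbcode(self) -> str:
        return self.value


@dataclass
class Tag(Node):
    name: str
    option: Optional[str] = None  # e.g. "#ffd421" for [color=#ffd421]
    children: List[Node] = field(default_factory=list)

    def to_bbcode(self) -> str:
        if self.option is not None:
            opening = f"[{self.name}={self.option}]"
        else:
            opening = f"[{self.name}]"
        inner = "".join(child.to_bbcode() for child in self.children)
        return f"{opening}{inner}[/{self.name}]"


def Color(value: str, children: List[Node]) -> Tag:
    """Helper mirroring the pseudocode's Color type."""
    return Tag("color", value, children)


def Bold(children: List[Node]) -> Tag:
    """Helper mirroring the pseudocode's Bold type."""
    return Tag("b", None, children)


tree = Color("#ffd421", [Bold([Text("Text, text")])])
print(tree.to_bbcode())  # [color=#ffd421][b]Text, text[/b][/color]
```

Note that the line breaks around the tags in the original BBCode are not preserved here; keeping them would require storing that whitespace as `Text` nodes.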