Parsing BBCode from HTML

  1. #1

    Is it actually possible to:

    1) Parse the HTML output of BBCode tags
    2) Scan it
    3) Accurately recreate the original tags?

    I see a lot of tags like [indent][/indent] that look like the following in HTML:
    HTML Code:
    <div style="margin-left:40px">indent_text</div>
    What I don't know is whether [indent] _always_ produces that HTML. Relying on it looks unsafe to me.
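One way to make the matching less fragile is to compare parsed style properties rather than raw attribute strings, so spacing and case differences stop mattering. The following is a minimal sketch under assumptions: `parse_style` and `classify_div` are hypothetical helper names, and the only two rules (margin-left means indent, text-align: center means center) come from the examples in this post. Whether the forum always emits exactly these styles is not something this sketch can confirm.

```python
from typing import Dict, Optional


def parse_style(style: str) -> Dict[str, str]:
    """Split an inline style string into a {property: value} dict."""
    props = {}
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        if sep:  # skip empty or malformed declarations
            props[name.strip().lower()] = value.strip().lower()
    return props


def classify_div(style: str) -> Optional[str]:
    """Guess the BBCode tag behind a <div> from its style (assumed rules)."""
    props = parse_style(style)
    if props.get("text-align") == "center":
        return "center"
    if "margin-left" in props:
        return "indent"
    return None  # not a div this sketch recognises
```

With this, `style="margin-left:40px"` and `style = "text-align: center"` are both recognised no matter how the whitespace is laid out, and anything else falls through to `None` instead of being misread.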

    The reason I'm asking is that I'm trying to parse individual posts on the site and recreate their BBCode tags. I need to know which type of BBCode tag I'm looking at because, depending on the tag, I either want the text (and only the text) inside it, including its descendants, or nothing at all. For instance, if the tag is an image tag or a quote tag, I ignore it.
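The "text only, but nothing from ignored tags" rule can be sketched with Python's standard `html.parser`. This is only an illustration under assumptions: the class name `bbcode_quote` is a made-up placeholder for however the forum marks quotes, `PostText` and `post_text` are hypothetical names, and the depth counter assumes the HTML is properly nested.

```python
from html.parser import HTMLParser

# Elements with no closing tag; they hold no text, so they are skipped outright.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}
# Assumption: quotes are wrapped in an element carrying this class.
QUOTE_CLASS = "bbcode_quote"


class PostText(HTMLParser):
    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0  # > 0 while inside an ignored element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        # Start skipping at a quote, and keep counting nested elements inside it.
        if self.skip_depth or QUOTE_CLASS in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def post_text(html: str) -> str:
    """Return the post's text, descendants included, minus ignored elements."""
    parser = PostText()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

Images drop out because they are void elements with no text, and everything inside the assumed quote wrapper is skipped until its matching close tag.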

    At the moment I look for characteristics that distinguish tags from one another. For instance, [center][/center] is rendered as:
    HTML Code:
    <div style = "text-align: center">TEXT</div>
    I then recreate the tag (I have a specific type for each tag) as an object consisting of its options plus the list of tags it contains. So something like:

    [color=#ffd421]Text, text[/color]

    Would become:
    PHP Code:
    new Color("#ffd421") {
        new Text("Text, text")
    }
    By the way, the code above isn't actually PHP; it's just pseudocode.
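For comparison, here is a rough runnable version of the same idea in Python, using only the standard library. It is a sketch under assumptions, not a description of how the forum software really renders tags: the `recognise` rules (a `<font color=...>` element for [color], plus the two `<div>` styles quoted in this post) are guesses that would need checking against real posts, and `Node`, `TreeBuilder`, `build`, and `to_bbcode` are hypothetical names.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"img", "br", "hr"}


@dataclass
class Node:
    tag: str          # BBCode name: "color", "center", "indent", "text", ...
    option: str = ""  # tag option (e.g. the color) or, for "text", the text itself
    children: list = field(default_factory=list)


def recognise(tag, attrs):
    """Map one HTML element to a Node. Every rule here is an assumption."""
    style = (attrs.get("style") or "").replace(" ", "").lower()
    if tag == "font" and attrs.get("color"):
        return Node("color", attrs["color"])
    if tag == "div" and "text-align:center" in style:
        return Node("center")
    if tag == "div" and "margin-left:" in style:
        return Node("indent")
    return Node("unknown", tag)


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]  # innermost open element is last

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        node = recognise(tag, dict(attrs))
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("text", data))


def build(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def to_bbcode(node):
    if node.tag == "text":
        return node.option
    inner = "".join(to_bbcode(child) for child in node.children)
    if node.tag in ("root", "unknown"):
        return inner  # unrecognised wrappers contribute only their contents
    opening = f"[{node.tag}={node.option}]" if node.option else f"[{node.tag}]"
    return f"{opening}{inner}[/{node.tag}]"
```

Under those assumptions, `to_bbcode(build('<font color="#ffd421">Text, text</font>'))` gives back `[color=#ffd421]Text, text[/color]`. Elements the rules don't recognise are kept in the tree as "unknown" nodes so their text is not lost.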
    Last edited by Oberon; November 10th, 2020 at 09:32 AM.
