Static Public Attributes
list | I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS = ['noscript'] |
list | I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS = \ |
tuple | NESTABLE_TAGS |
The BeautifulSoup class is oriented towards skipping over common HTML errors like unclosed tags. However, sometimes it makes errors of its own. For instance, consider this fragment:

    <b>Foo<b>Bar</b></b>

This is perfectly valid (if bizarre) HTML. However, the BeautifulSoup class will implicitly close the first 'b' tag when it encounters the second 'b'. It will think the author wrote "<b>Foo<b>Bar" and didn't close the first 'b' tag, because there's no real-world reason to bold something that's already bold. When it encounters '</b></b>' it will close two more 'b' tags, for a grand total of three tags closed instead of two. This can throw off the rest of your document structure. The same is true of a number of other tags, listed below.

It's much more common for someone to forget to close a 'b' tag than to actually use nested 'b' tags, and the BeautifulSoup class handles the common case. This class handles the not-so-common case: where you can't believe someone wrote what they did, but it's valid HTML and BeautifulSoup screwed up by assuming it wouldn't be.
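The difference between the two classes can be illustrated with a toy model of implicit tag closing. This is only a sketch, not the library's parser: `reserialize`, its token format, and the `nestable_tags` parameter are hypothetical names introduced for illustration, and the model assumes that an end tag with no matching open tag is simply dropped.

```python
def reserialize(tokens, nestable_tags):
    """Toy model of implicit tag closing; not the library's real parser."""
    out = []    # pieces of the re-serialized markup
    stack = []  # names of the tags that are currently open

    def pop_to(name):
        # Close open tags up to and including the most recent `name`.
        while stack:
            closed = stack.pop()
            out.append("</%s>" % closed)
            if closed == name:
                break

    for kind, value in tokens:
        if kind == "start":
            # Reopening a non-nestable tag implicitly closes the open one.
            if value in stack and value not in nestable_tags:
                pop_to(value)
            stack.append(value)
            out.append("<%s>" % value)
        elif kind == "end":
            # Assumption: an end tag with no open match is dropped.
            if value in stack:
                pop_to(value)
        else:  # "text"
            out.append(value)

    while stack:  # close whatever is still open at end of input
        out.append("</%s>" % stack.pop())
    return "".join(out)


# The fragment <b>Foo<b>Bar</b></b> as a token list.
FRAGMENT = [("start", "b"), ("text", "Foo"), ("start", "b"),
            ("text", "Bar"), ("end", "b"), ("end", "b")]

strict = reserialize(FRAGMENT, nestable_tags=set())   # 'b' may not nest
lenient = reserialize(FRAGMENT, nestable_tags={"b"})  # 'b' may nest
```

With an empty `nestable_tags` set the model splits the fragment into two sibling 'b' elements; with 'b' marked as nestable the original nesting survives, which is the behavior this class is meant to provide.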
Definition at line 1381 of file BeautifulSoup.py.
list BeautifulSoup.ICantBelieveItsBeautifulSoup::I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS = ['noscript'] [static] |
Definition at line 1411 of file BeautifulSoup.py.
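list BeautifulSoup.ICantBelieveItsBeautifulSoup::I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS [static] |
Definition at line 1406 of file BeautifulSoup.py.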
tuple BeautifulSoup.ICantBelieveItsBeautifulSoup::NESTABLE_TAGS [static] |
Initial value:
buildTagMap([], BeautifulSoup.NESTABLE_TAGS, I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS, I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS)
Reimplemented from BeautifulSoup.BeautifulSoup.
Definition at line 1413 of file BeautifulSoup.py.
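The way NESTABLE_TAGS is assembled can be sketched roughly as follows. This is an assumption about what buildTagMap does (merge existing maps as they are, and map each name from a plain list to the default value), not a copy of its source. `build_tag_map_sketch` is a hypothetical stand-in, and the parent map and inline tag names below are illustrative placeholders rather than the real values.

```python
def build_tag_map_sketch(default, *portions):
    """Merge maps and lists of tag names into one map (illustrative only)."""
    built = {}
    for portion in portions:
        if isinstance(portion, dict):
            # Already a map: keep its entries unchanged.
            built.update(portion)
        elif isinstance(portion, (list, tuple, set)):
            # A list of tag names: map each one to the default value.
            for name in portion:
                built[name] = default
        else:
            # A single tag name: map it to the default value.
            built[portion] = default
    return built


# Placeholder inputs; the real class attributes hold different values.
parent_nestable = {"table": [], "li": ["ul", "ol"]}
block_tags = ["noscript"]
inline_tags = ["b", "em"]

nestable = build_tag_map_sketch([], parent_nestable, block_tags, inline_tags)
```

Under this reading, the subclass keeps every entry of the parent's map and adds the extra block and inline tags with an empty list as their value.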