Vytautas Saltenis
cf6bfc9d6d
Rip off all blackfriday's html sanitization effort
...
As per discussion in issue #90.
2014-09-19 21:25:23 +03:00
Daniel Imfeld
5bf00efe39
Remove unnecessary HTML_ABSOLUTE_LINKS flag
2014-05-29 09:17:20 -05:00
Daniel Imfeld
4ccf982a9e
Add tests for absolute prefix
2014-05-25 13:22:33 -05:00
Daniel Imfeld
2ce0592896
Add tests for new footnote functionality
2014-05-25 13:07:05 -05:00
Daniel Imfeld
628c02d37b
Move footnote prefix to a better place
2014-05-24 14:28:37 -05:00
Daniel Imfeld
ec41294bc4
Add footnote prefix option. Needs testing
2014-05-24 02:55:13 -05:00
Daniel Imfeld
5c12499aa1
Add ability to convert relative links to absolute
2014-05-18 01:28:15 -05:00
Martin Probst
7daa6e8b70
Move sanitization tests into their own file.
...
Also adds an explicit test for [link](...) syntax to be sanitized.
2014-05-03 14:37:23 +02:00
Vytautas Šaltenis
717a976f69
Merge pull request #76 from mprobst/self-closing
...
feat: Write self-closing tags with a />
2014-05-03 15:11:53 +03:00
Martin Probst
55d8f72dde
feat: Write self-closing tags with a />
...
Adds tests for self-closing tags both for correct writing and for correct
sanitization, i.e. stripping attributes on them.
2014-05-03 13:59:10 +02:00
Martin Probst
11e042f6c1
Avoid raw mode parsing so that raw mode tags like <script> don't cause issues.
...
Certain tags like <script> but also <title> and others switch an HTML5 parser
into raw mode, which causes the rest of the HTML string to be always parsed as
text, including any elements or entities that we do want to support (e.g. <p>).
As we're going to escape any of the raw text elements anyway (it's e.g. script,
style, title, xmp, noframes, and a couple of others) we can just switch off raw
text parsing by disabling it after each starting tag.
2014-05-03 13:26:52 +02:00
Martin Probst
915f7049a0
Add a test for the correct handling of escaped entities in HTML.
...
The sanitization code does not retain any particular escaped entities - it
parses the HTML and thus loses the information on what entities were in the
original. The result is correct UTF-8 HTML though.
2014-05-03 12:34:16 +02:00
Martin Probst
8d2af3a21b
Add support for a bunch more safe HTML element tags, and bring them into some order.
2014-05-01 22:08:32 +02:00
Vytautas Šaltenis
aeb569ff46
Merge pull request #70 from mprobst/master
...
fix: Handle all different token types that the parser can emit (d'oh).
2014-05-01 21:59:07 +03:00
Martin Probst
f9b7593e65
fix: Handle all different token types that the parser can emit (d'oh).
2014-05-01 20:55:53 +02:00
Vytautas Šaltenis
3dba5bc56e
Merge branch 'master' of github.com:gihnius/blackfriday into gihnius-master
...
Conflicts:
html.go
inline_test.go
2014-05-01 21:43:42 +03:00
Vytautas Šaltenis
b44be78459
Allow rel attribute in sanitizer
...
Fixes issue #68.
2014-05-01 20:49:49 +03:00
Martin Probst
41251715ad
Use go.net/html's parser to sanitize HTML.
...
Use an HTML5 compliant parser that interprets HTML as a browser would to parse
the Markdown result and then sanitize based on the result.
Escape unrecognized and disallowed HTML in the result.
Currently works with a hard coded whitelist of safe HTML tags and attributes.
2014-04-27 23:40:44 +02:00
Vytautas Šaltenis
55bb56bf9b
Merge pull request #55 from rtfb/master
...
Autolink fixes
2014-03-30 19:58:39 +03:00
Vytautas Šaltenis
d643453f1e
Merge pull request #50 from rtfb/master
...
Better protection against JavaScript injection
2014-03-30 19:52:13 +03:00
gihnius
c9977f0c0b
test: add nofollow ref for non internal links only
2014-03-21 11:17:31 +08:00
gihnius
ecf59d4a55
add target blank attr
2014-03-21 10:52:46 +08:00
Graham Miller
d71c759108
add HTML_NOFOLLOW_LINKS
2014-02-25 09:21:57 -05:00
Vytautas Šaltenis
e5937643a9
Fix bug in autolink with trailing semicolon
...
In case the link ends with escaped html entity, the semicolon is a part
of the link and should not be interpreted as punctuation.
2014-02-17 21:09:04 +02:00
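The trailing-semicolon rule described in the commit above can be illustrated with a small sketch. This is not the code from the commit: the helper name `trimTrailingPunct`, the punctuation set, and the entity pattern are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// trailingEntityRe matches an HTML entity at the very end of a string,
// e.g. "&amp;" or "&#34;". The pattern is an assumption for this sketch.
var trailingEntityRe = regexp.MustCompile(`&(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+);$`)

// trimTrailingPunct is a hypothetical helper. It strips trailing
// punctuation from an autolink candidate, but keeps a final ';' when
// that semicolon terminates an escaped HTML entity.
func trimTrailingPunct(link string) string {
	for len(link) > 0 {
		last := link[len(link)-1]
		if !strings.ContainsRune(".,;:!?", rune(last)) {
			break
		}
		if last == ';' && trailingEntityRe.MatchString(link) {
			break // the semicolon is part of the link, not punctuation
		}
		link = link[:len(link)-1]
	}
	return link
}

func main() {
	fmt.Println(trimTrailingPunct("http://example.com/?a=1&amp;"))
	fmt.Println(trimTrailingPunct("http://example.com/page;"))
}
```

In this sketch the first call keeps its final semicolon because it closes `&amp;`, while the second call drops it as ordinary punctuation.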
Vytautas Šaltenis
b0bdfbec4c
Fix bug in autolink overescaping html entities
...
If autolink encounters a link which already has an escaped html entity,
it would escape the ampersand again, producing things like these:
&amp; --> &amp;amp;
&quot; --> &amp;quot;
This commit solves that by first looking for all entity-looking things
in the link and copying those ranges verbatim, only considering the rest
of the string for escaping.
Doesn't seem to have considerable performance impact.
The mailto: links are processed the old way.
2014-02-17 21:09:04 +02:00
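The approach described in the commit above (copy entity-looking ranges verbatim, escape only the rest) can be sketched roughly as follows. This is an illustration under assumptions, not the commit's code: `escapeKeepingEntities` and `entityRe` are hypothetical names, and the standard library's `html.EscapeString` stands in for whatever escaping routine the project uses.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// entityRe matches things that look like HTML entities, such as
// "&amp;", "&#34;" or "&#x22;". The pattern is an assumption.
var entityRe = regexp.MustCompile(`&(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);`)

// escapeKeepingEntities is a hypothetical helper. It escapes a link for
// HTML output, copying already-escaped entities through unchanged so
// that their ampersands are not escaped a second time.
func escapeKeepingEntities(link string) string {
	var b strings.Builder
	last := 0
	for _, loc := range entityRe.FindAllStringIndex(link, -1) {
		b.WriteString(html.EscapeString(link[last:loc[0]])) // escape the gap before the entity
		b.WriteString(link[loc[0]:loc[1]])                  // copy the entity verbatim
		last = loc[1]
	}
	b.WriteString(html.EscapeString(link[last:])) // escape whatever follows the last entity
	return b.String()
}

func main() {
	fmt.Println(escapeKeepingEntities("http://example.com/?x=1&amp;y=2&z=3"))
}
```

Here the existing `&amp;` passes through untouched, while the bare `&` before `z` is escaped once.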
Vytautas Šaltenis
f2d43f69a4
Fix bug in autolink termination
...
Detect the end of link when it is immediately followed by an element.
2014-02-17 21:09:03 +02:00
Vytautas Šaltenis
9fc8c9d866
Fix bug with overzealous autolink processing
...
When the source Markdown contains an anchor tag with URL as link text
(i.e. <a href=...>http://foo.bar</a>), autolink converts that link text
into another anchor tag, which is nonsense. Detect this situation with
regexp and early exit autolink processing.
2014-02-17 21:09:03 +02:00
Vytautas Šaltenis
2f50a53f8e
Rename HTML_SKIP_SCRIPT to HTML_SANITIZE_OUTPUT
2014-01-22 01:23:43 +02:00
Vytautas Šaltenis
55cd82008e
Rewrite protection against JavaScript injection
...
This drops the naive approach at <script> tag stripping and resorts to
full sanitization of html. The general idea (and the regexps) is grabbed
from Stack Exchange's PageDown JavaScript Markdown processor[1]. Like in
PageDown, it's implemented as a separate pass over resulting html.
Includes a metric ton (but not all) of test cases from here[2]. Several
are commented out since they don't pass yet.
Stronger (but still incomplete) fix for #11.
[1] http://code.google.com/p/pagedown/wiki/PageDown
[2] https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet
2014-01-22 01:14:35 +02:00
Darren Coxall
607ec21435
Tests for links when using HTML_SAFELINK
2013-12-19 10:00:47 +00:00
Russ Ross
ca82b8db3a
panic fix (issue #33) with test case
2013-09-11 12:47:43 -06:00
Alex Xandra Albert Sim
da8f2753e2
Added test for link inside image
2013-09-09 12:51:20 +07:00
athom
31798e0eab
add testcase for GFM autolink
2013-08-09 17:24:26 +08:00
moshee
3ea84a5811
parser no longer returns prematurely from empty footnote ref
2013-07-08 22:34:12 +00:00
moshee
1a73bae554
added slice bounds check
2013-07-08 06:54:25 +00:00
moshee
c23099e5ee
Implementation and some tests for inline footnotes. Also I noticed the list items had the wrong ids, that was silly of me.
2013-07-01 01:37:52 +00:00
moshee
7bdb82c53a
new tests pass but old tests now fail...
2013-06-26 15:57:51 +00:00
moshee
be082a1ef2
First attempt at supporting Pandoc-style footnotes. The existing tests have not broken but the new functionality does not work yet.
2013-06-25 01:18:47 +00:00
Vytautas Šaltenis
8226238289
Improve html element stripping code
2013-04-18 03:15:47 +03:00
Vytautas Šaltenis
85e2207cd0
Couple more tests
2013-04-14 01:42:47 +03:00
Vytautas Šaltenis
dcaaa9b5dc
More <script> stripping
...
Partially addresses issue #11.
2013-04-13 23:24:30 +03:00
Vytautas Šaltenis
fb923cdb78
Add an option to strip <script> elements
...
Partially addresses issue #11.
2013-04-13 22:57:16 +03:00
Vytautas Šaltenis
b79e720a36
Make isHtmlTag() case insensitive
2013-04-13 22:34:37 +03:00
Vytautas Šaltenis
d5a8df164b
Fix bug in isHtmlTag()
...
Fix what seems to be a typo. j should iterate through all tagname, so it
should be initialized to zero. The test exposes this bug.
2013-04-13 22:21:47 +03:00
Vytautas Šaltenis
90509d39d4
Make a way to parameterize inline tests
...
Expose extensions and html flags parameters so that tests could specify
what code paths they want to exercise.
2013-04-13 22:18:14 +03:00
Russ Ross
e35b4b66cc
bounds checking stress tests
2011-07-03 10:51:07 -06:00
Russ Ross
ae9562f685
move whitespace stripping to parser, not renderers
2011-06-29 15:38:35 -06:00
Russ Ross
2aca667078
simplify inline callback interface
2011-06-29 13:00:54 -06:00
Russ Ross
873a60ad49
complete page rendering is now an option in the library
2011-06-29 10:08:56 -06:00
Russ Ross
c969dff782
added simplified interface for common usage
2011-06-28 15:55:27 -06:00
Russ Ross
fde2c60665
version number, few more options for command-line tool
2011-06-28 11:30:10 -06:00
Russ Ross
f8f70572a4
simplified BSD license
2011-06-27 20:11:32 -06:00
Russ Ross
c8f7e789d4
more robust whitespace stripping and matching corrections to tests
2011-06-27 16:06:16 -06:00
Russ Ross
9a0217f7aa
fixed minor bugs uncovered by more testing
2011-06-27 14:35:11 -06:00
Russ Ross
3af64a90ad
fixed headers nested in lists, added prefix header unit tests
2011-06-27 10:13:13 -06:00
Russ Ross
be0fb4602b
more inline unit tests
2011-06-24 16:39:50 -06:00
Russ Ross
1e40ebaf47
unit test for linebreaks
2011-06-01 18:52:55 -06:00
Russ Ross
2abc3af015
starting inline unit tests, fix a few minor bugs they exposed
2011-06-01 12:17:17 -06:00