blackfriday

mirror of https://github.com/danog/blackfriday.git synced 2024-12-03 09:57:57 +01:00

Author	SHA1	Message	Date
Austin Ziegler	9c061de92b	Allow configurable header ID prefix/suffixes. This is specifically driven by the Hugo use case where multiple documents are often rendered into the same ultimate HTML page. When a header ID is written to the output HTML format (either through `HTML_TOC`, `EXTENSION_HEADER_IDS`, or `EXTENSION_AUTO_HEADER_IDS`), it is possible that multiple documents will have identical header IDs. To permit validation to pass, it is useful to have a per-document prefix or suffix (in our case, an MD5 of the content filename, and we will be using it as a suffix). That is, two documents (`A` and `B`) that have the same header ID (`# Reason {#reason}`), will end up having an actual header ID of the form `#reason-DOCID` (e.g., `#reason-A`, `#reason-B`) with these HTML parameters. This is built on top of #126 (more intelligent collision detection for `EXTENSION_AUTO_HEADER_IDS`).	2014-11-23 20:37:27 -05:00
Austin Ziegler	40f28ee022	Prevent generated header collisions, less naively. > This is a rework of an earlier version of this code. The automatic header ID generation code submitted in #125 has a subtle bug where it will use the same ID for multiple headers with identical text. In the case below, all the headers are rendered as `<h1 id="header">Header</h1>`. ```markdown # Header # Header # Header # Header ``` This change is a simple but robust approach that uses an incrementing counter and pre-checking to prevent header collision. (The above would be rendered as `header`, `header-1`, `header-2`, and `header-3`.) In more complex cases, it will append a new counter suffix (`-1`), like so: ```markdown # Header # Header 1 # Header # Header ``` This will generate `header`, `header-1`, `header-1-1`, and `header-1-2`. This code has two additional changes over the prior version: 1. Rather than reimplementing @shurcooL’s anchor sanitization code, I have imported it from `github.com/shurcooL/go/github_flavored_markdown/sanitized_anchor_name`. 2. The markdown block parser is now only interested in generating a sanitized anchor name, not with ensuring its uniqueness. That code has been moved to the HTML renderer. This means that if the HTML renderer is modified to identify all unique headers prior to rendering, the hackish nature of the collision detection can be eliminated.	2014-11-23 20:35:43 -05:00
bep	857a1a0260	Add support for angled, double quotes The flag `HTML_SMARTYPANTS_ANGLED_QUOTES` combined with `HTML_USE_SMARTYPANTS` configures rendering of double quotes as angled left and right quotes (« »). The SmartyPants documentation mentions a special syntax for these, `<<>>`, a syntax that is neither pretty nor user friendly. Typical use cases would be either one or the other, or combined, but never in the same document. An example would be a person from Norway who has a blog in both English and Norwegian (his native tongue); he would then configure Blackfriday to use angled quotes for the Norwegian section, but keep them as regular double quotes for the English. If the flag `HTML_SMARTYPANTS_ANGLED_QUOTES` is not provided, everything works as before this commit.	2014-11-05 23:29:41 +01:00
Austin Ziegler	8cc40f8e07	Use supplied header ID for TOC rendering. - Fixes #112 so that `#header {#header-id}` renders the TOC with `#header-id` instead of `#toc_1`.	2014-10-27 16:49:28 -04:00
Vytautas Saltenis	cf6bfc9d6d	Rip off all blackfriday's html sanitization effort As per discussion in issue #90.	2014-09-19 21:25:23 +03:00
tummychow	67002b01b6	Use HTML5 recommended style of language on code blocks For code blocks that contain a certain language of code, the recommended attribute structure is <pre><code class="language-foo">. This also corresponds to the behavior expected by various JS syntax highlighters. The GitHub code block implementation was obsolete, and identical to the normal implementation except for its attribute structure, so it was removed. Closes #108.	2014-08-28 18:01:06 -04:00
Brian Goff	539b27a624	Add titleblock support	2014-08-04 14:08:22 -04:00
Daniel Imfeld	5bf00efe39	Remove unnecessary HTML_ABSOLUTE_LINKS flag	2014-05-29 09:17:20 -05:00
Daniel Imfeld	10f1dc6358	Fix spelling error	2014-05-28 23:52:45 -05:00
Daniel Imfeld	628c02d37b	Move footnote prefix to a better place	2014-05-24 14:28:37 -05:00
Daniel Imfeld	c7f4b178c2	Use parameters object for extra options. Enhance footnote support. Option to add return links. Option to make footnote prefixes unique, for rendering multiple documents per page.	2014-05-24 13:29:39 -05:00
Daniel Imfeld	ec41294bc4	Add footnote prefix option. Needs testing	2014-05-24 02:55:13 -05:00
Daniel Imfeld	5c12499aa1	Add ability to convert relative links to absolute	2014-05-18 01:28:15 -05:00
Vytautas Šaltenis	3dba5bc56e	Merge branch 'master' of github.com:gihnius/blackfriday into gihnius-master Conflicts: html.go inline_test.go	2014-05-01 21:43:42 +03:00
Martin Probst	41251715ad	Use go.net/html's parser to sanitize HTML. Use an HTML5 compliant parser that interprets HTML as a browser would to parse the Markdown result and then sanitize based on the result. Escape unrecognized and disallowed HTML in the result. Currently works with a hard coded whitelist of safe HTML tags and attributes.	2014-04-27 23:40:44 +02:00
willnix	be9cbc634a	tagWhitelist allows alignment attribute now This is the closest I could get to removing everything "unsafe" without introducing an additional regex.	2014-04-19 21:59:04 +00:00
willnix	c1e4996787	Add table tags to the whitelist. Fixing: `55cd82008e` This commit introduced a html tag whitelist which does not include any table tags (<td>,<tr>,<thead>...). Therefore even tables the markdown parser itself generated will be removed.	2014-04-17 15:44:40 +00:00
Vytautas Šaltenis	c5ece173ad	Merge pull request #59 from johnsto/master Header ID specifiers	2014-04-11 21:31:27 +03:00
Dave Johnston	2dff0864f0	Add header ID support and tests: # Header {#myid}	2014-04-05 20:42:58 +01:00
Kjetil Mehl	786aed6213	Explicit return byte array at end of function.	2014-04-05 16:59:28 +02:00
Vytautas Šaltenis	55bb56bf9b	Merge pull request #55 from rtfb/master Autolink fixes	2014-03-30 19:58:39 +03:00
Vytautas Šaltenis	d643453f1e	Merge pull request #50 from rtfb/master Better protection against JavaScript injection	2014-03-30 19:52:13 +03:00
gihnius	93484b1424	add nofollow ref for non internal links only	2014-03-21 11:14:58 +08:00
gihnius	ecf59d4a55	add target blank attr	2014-03-21 10:52:46 +08:00
Graham Miller	d71c759108	add HTML_NOFOLLOW_LINKS	2014-02-25 09:21:57 -05:00
Vytautas Šaltenis	b0bdfbec4c	Fix bug in autolink overescaping html entities If autolink encounters a link which already has an escaped html entity, it would escape the ampersand again, producing things like these: `&amp;` --> `&amp;amp;` and `&quot;` --> `&amp;quot;`. This commit solves that by first looking for all entity-looking things in the link and copying those ranges verbatim, only considering the rest of the string for escaping. Doesn't seem to have considerable performance impact. The mailto: links are processed the old way.	2014-02-17 21:09:04 +02:00
Vytautas Šaltenis	cc0d56d092	Extract a chain of ifs into separate func This gives a ~10% slowdown of a full test run, which is tolerable. Switch statement is still slightly slower (~5%). Using map turned out to be unacceptably slow (~3x slowdown).	2014-02-17 21:09:04 +02:00
Vytautas Šaltenis	31a96c6ce7	go fmt	2014-02-17 21:09:03 +02:00
Vytautas Šaltenis	2f50a53f8e	Rename HTML_SKIP_SCRIPT to HTML_SANITIZE_OUTPUT	2014-01-22 01:23:43 +02:00
Vytautas Šaltenis	55cd82008e	Rewrite protection against JavaScript injection This drops the naive approach at <script> tag stripping and resorts to full sanitization of html. The general idea (and the regexps) is grabbed from Stack Exchange's PageDown JavaScript Markdown processor[1]. Like in PageDown, it's implemented as a separate pass over resulting html. Includes a metric ton (but not all) of test cases from here[2]. Several are commented out since they don't pass yet. Stronger (but still incomplete) fix for #11. [1] http://code.google.com/p/pagedown/wiki/PageDown [2] https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet	2014-01-22 01:14:35 +02:00
Vytautas Šaltenis	e02c392dc6	Extract useful code to separate func	2014-01-22 00:45:43 +02:00
David Kitchen	6e6572e913	Added th to table headers so that styling with things like Twitter Bootstrap and typeset.css work as expected. Cells in headers should always be TH unless they are advisory cells within headers in which case TD is acceptable (but being Markdown a user with such needs could just enter HTML for this)	2013-10-16 11:36:33 +01:00
moshee	c23099e5ee	Implementation and some tests for inline footnotes. Also I noticed the list items had the wrong ids, that was silly of me.	2013-07-01 01:37:52 +00:00
moshee	7bdb82c53a	new tests pass but old tests now fail...	2013-06-26 15:57:51 +00:00
moshee	be082a1ef2	First attempt at supporting Pandoc-style footnotes. The existing tests have not broken but the new functionality does not work yet.	2013-06-25 01:18:47 +00:00
Vytautas Šaltenis	8226238289	Improve html element stripping code	2013-04-18 03:15:47 +03:00
Vytautas Šaltenis	dcaaa9b5dc	More <script> stripping Partially addresses issue #11.	2013-04-13 23:24:30 +03:00
Vytautas Šaltenis	fb923cdb78	Add an option to strip <script> elements Partially addresses issue #11.	2013-04-13 22:57:16 +03:00
Vytautas Šaltenis	b79e720a36	Make isHtmlTag() case insensitive	2013-04-13 22:34:37 +03:00
Vytautas Šaltenis	a2fda5e98f	Extract repetitive code to a func	2013-04-13 22:26:29 +03:00
Vytautas Šaltenis	d5a8df164b	Fix bug in isHtmlTag() Fix what seems to be a typo. j should iterate through all tagname, so it should be initialized to zero. The test exposes this bug.	2013-04-13 22:21:47 +03:00
Caleb Spare	a25d9a543f	Fix html tag ordering in doc string.	2012-11-22 12:52:56 -08:00
Caleb Spare	d0d854958e	Fix up method documentation formatting.	2012-11-22 12:12:08 -08:00
moshee	8a86b6d6be	HTML5 doctype, Wrap TOC with <nav> <nav> makes the TOC more easily identifiable and workable with CSS.	2012-10-21 21:23:44 -07:00
Russ Ross	a5441fd99f	updates for go 1	2012-03-07 21:36:31 -07:00
Russ Ross	530123dd9f	additional doc comments	2011-07-07 12:05:29 -06:00
Russ Ross	bb8ee591d1	doc improvements, commenting	2011-07-07 11:56:45 -06:00
Russ Ross	bd60e3691b	removing more redundant checks, additional cleanup of block parsing	2011-07-01 14:13:26 -06:00
Russ Ross	689f6cb79b	more consistent spacing of block-level elements	2011-07-01 11:19:42 -06:00
Russ Ross	ae9562f685	move whitespace stripping to parser, not renderers	2011-06-29 15:38:35 -06:00
Russ Ross	d3c8225096	corner case spacing issue with table of contents	2011-06-29 13:24:15 -06:00
Russ Ross	2aca667078	simplify inline callback interface	2011-06-29 13:00:54 -06:00
Russ Ross	3c6f18afc7	Renderer is now an interface	2011-06-29 11:13:17 -06:00
Russ Ross	793fee5451	preparing for switch to rendering interface	2011-06-29 10:43:10 -06:00
Russ Ross	55697351d0	table of contents support beefed up	2011-06-29 10:36:56 -06:00
Russ Ross	873a60ad49	complete page rendering is now an option in the library	2011-06-29 10:08:56 -06:00
Russ Ross	b1a0318250	refactoring: inline renderers return bools, preparing rendering struct to become an interface	2011-06-28 19:46:35 -06:00
Russ Ross	55cde00c8a	camel case	2011-06-28 16:02:12 -06:00
Russ Ross	fde2c60665	version number, few more options for command-line tool	2011-06-28 11:30:10 -06:00
Russ Ross	f8f70572a4	simplified BSD license	2011-06-27 20:11:32 -06:00
Russ Ross	e22e43bf76	eliminate a buffering level for paragraphs	2011-06-26 17:21:11 -06:00
Russ Ross	ea3d80e2d0	clean up main markdown function: split out first and second passes	2011-06-26 09:51:36 -06:00
Russ Ross	f5e3dc8073	refactoring: newlines as hard breaks changed from HTML option to global markdown option	2011-06-25 15:45:51 -06:00
Russ Ross	812e8d0185	refactoring paragraph rendering	2011-06-25 15:18:34 -06:00
Russ Ross	eff64c563f	reduce copying for lists	2011-06-25 15:02:46 -06:00
Russ Ross	cf97fbd897	experiment: render headers directly to output buffer to avoid a copy; minor speed boost	2011-06-25 08:20:08 -06:00
Russ Ross	45ab8d0dc4	dumb tweak that gives a little speed bump	2011-06-24 21:53:46 -06:00
Russ Ross	44db721708	rewrite of attrEscape: cleaner and faster	2011-06-24 19:11:06 -06:00
Russ Ross	f9b03f67fb	output validates, command-line tool has useful options	2011-06-24 11:50:03 -06:00
Russ Ross	f3386eb849	gofmt	2011-05-31 11:49:49 -06:00
Russ Ross	9d23b68fa5	export all names from Renderer struct This enables new back-ends that are not part of the package Basically a big search-and-replace for this commit	2011-05-30 21:44:52 -06:00
Russ Ross	679e1686db	performance fix: with autolinking on, it is almost twice as fast now	2011-05-30 15:36:31 -06:00
Russ Ross	ee3fe99203	rudimentary latex backend, additional cleanup	2011-05-30 11:06:20 -06:00
Russ Ross	81cefb5e7c	split parser into multiple files, clean up naming	2011-05-29 17:00:31 -06:00
Russ Ross	4e2d6a50a7	cleanup in markdown: better naming, misc fixes	2011-05-29 11:43:18 -06:00
Russ Ross	965748ad3d	refactored into a proper package	2011-05-28 21:17:53 -06:00

126 Commits