- Revision: 13
- Log: Initial import of Typo 2.6.0 sources from a downloaded tarball. Typo is a Ruby on Rails based blog engine.
- Author: adh
- Date: Sat Jul 22 22:25:02 +0100 2006
- Size: 16691 bytes
1 | # |
2 | # = RubyPants -- SmartyPants ported to Ruby |
3 | # |
4 | # Ported by Christian Neukirchen <mailto:chneukirchen@gmail.com> |
5 | # Copyright (C) 2004 Christian Neukirchen |
6 | # |
7 | # Incorporates ideas, comments and documentation by Chad Miller |
8 | # Copyright (C) 2004 Chad Miller |
9 | # |
10 | # Original SmartyPants by John Gruber |
11 | # Copyright (C) 2003 John Gruber |
12 | # |
13 | |
14 | # |
15 | # = RubyPants -- SmartyPants ported to Ruby |
16 | # |
17 | # == Synopsis |
18 | # |
19 | # RubyPants is a Ruby port of the smart-quotes library SmartyPants. |
20 | # |
21 | # The original "SmartyPants" is a free web publishing plug-in for |
22 | # Movable Type, Blosxom, and BBEdit that easily translates plain ASCII |
23 | # punctuation characters into "smart" typographic punctuation HTML |
24 | # entities. |
25 | # |
26 | # |
27 | # == Description |
28 | # |
29 | # RubyPants can perform the following transformations: |
30 | # |
31 | # * Straight quotes (<tt>"</tt> and <tt>'</tt>) into "curly" quote |
32 | # HTML entities |
33 | # * Backticks-style quotes (<tt>``like this''</tt>) into "curly" quote |
34 | # HTML entities |
35 | # * Dashes (<tt>--</tt> and <tt>---</tt>) into en- and em-dash |
36 | # entities |
37 | # * Three consecutive dots (<tt>...</tt> or <tt>. . .</tt>) into an |
38 | # ellipsis entity |
39 | # |
40 | # This means you can write, edit, and save your posts using plain old |
41 | # ASCII straight quotes, plain dashes, and plain dots, but your |
42 | # published posts (and final HTML output) will appear with smart |
43 | # quotes, em-dashes, and proper ellipses. |
44 | # |
45 | # RubyPants does not modify characters within <tt><pre></tt>, |
46 | # <tt><code></tt>, <tt><kbd></tt>, <tt><math></tt> or |
47 | # <tt><script></tt> tag blocks. Typically, these tags are used to |
48 | # display text where smart quotes and other "smart punctuation" would |
49 | # not be appropriate, such as source code or example markup. |
50 | # |
51 | # |
52 | # == Backslash Escapes |
53 | # |
54 | # If you need to use literal straight quotes (or plain hyphens and |
55 | # periods), RubyPants accepts the following backslash escape sequences |
56 | # to force non-smart punctuation. It does so by transforming the |
57 | # escape sequence into a decimal-encoded HTML entity: |
58 | # |
59 | # \\ \" \' \. \- \` |
60 | # |
61 | # This is useful, for example, when you want to use straight quotes as |
62 | # foot and inch marks: 6'2" tall; a 17" iMac. (Use <tt>6\'2\"</tt> |
63 | # resp. <tt>17\"</tt>.) |
64 | # |
65 | # |
66 | # == Algorithmic Shortcomings |
67 | # |
68 | # One situation in which quotes will get curled the wrong way is when |
69 | # apostrophes are used at the start of leading contractions. For |
70 | # example: |
71 | # |
72 | # 'Twas the night before Christmas. |
73 | # |
74 | # In the case above, RubyPants will turn the apostrophe into an |
75 | # opening single-quote, when in fact it should be a closing one. I |
76 | # don't think this problem can be solved in the general case--every |
77 | # word processor I've tried gets this wrong as well. In such cases, |
78 | # it's best to use the proper HTML entity for closing single-quotes |
79 | # ("<tt>&#8217;</tt>") by hand. |
80 | # |
81 | # |
82 | # == Bugs |
83 | # |
84 | # To file bug reports or feature requests (except see above) please |
85 | # send email to: mailto:chneukirchen@gmail.com |
86 | # |
87 | # If the bug involves quotes being curled the wrong way, please send |
88 | # example text to illustrate. |
89 | # |
90 | # |
91 | # == Authors |
92 | # |
93 | # John Gruber did all of the hard work of writing this software in |
94 | # Perl for Movable Type and almost all of this useful documentation. |
95 | # Chad Miller ported it to Python to use with Pyblosxom. |
96 | # |
97 | # Christian Neukirchen provided the Ruby port, as a general-purpose |
98 | # library that follows the *Cloth API. |
99 | # |
100 | # |
101 | # == Copyright and License |
102 | # |
103 | # === SmartyPants license: |
104 | # |
105 | # Copyright (c) 2003 John Gruber |
106 | # (http://daringfireball.net) |
107 | # All rights reserved. |
108 | # |
109 | # Redistribution and use in source and binary forms, with or without |
110 | # modification, are permitted provided that the following conditions |
111 | # are met: |
112 | # |
113 | # * Redistributions of source code must retain the above copyright |
114 | # notice, this list of conditions and the following disclaimer. |
115 | # |
116 | # * Redistributions in binary form must reproduce the above copyright |
117 | # notice, this list of conditions and the following disclaimer in |
118 | # the documentation and/or other materials provided with the |
119 | # distribution. |
120 | # |
121 | # * Neither the name "SmartyPants" nor the names of its contributors |
122 | # may be used to endorse or promote products derived from this |
123 | # software without specific prior written permission. |
124 | # |
125 | # This software is provided by the copyright holders and contributors |
126 | # "as is" and any express or implied warranties, including, but not |
127 | # limited to, the implied warranties of merchantability and fitness |
128 | # for a particular purpose are disclaimed. In no event shall the |
129 | # copyright owner or contributors be liable for any direct, indirect, |
130 | # incidental, special, exemplary, or consequential damages (including, |
131 | # but not limited to, procurement of substitute goods or services; |
132 | # loss of use, data, or profits; or business interruption) however |
133 | # caused and on any theory of liability, whether in contract, strict |
134 | # liability, or tort (including negligence or otherwise) arising in |
135 | # any way out of the use of this software, even if advised of the |
136 | # possibility of such damage. |
137 | # |
138 | # === RubyPants license |
139 | # |
140 | # RubyPants is a derivative work of SmartyPants and smartypants.py. |
141 | # |
142 | # Redistribution and use in source and binary forms, with or without |
143 | # modification, are permitted provided that the following conditions |
144 | # are met: |
145 | # |
146 | # * Redistributions of source code must retain the above copyright |
147 | # notice, this list of conditions and the following disclaimer. |
148 | # |
149 | # * Redistributions in binary form must reproduce the above copyright |
150 | # notice, this list of conditions and the following disclaimer in |
151 | # the documentation and/or other materials provided with the |
152 | # distribution. |
153 | # |
154 | # This software is provided by the copyright holders and contributors |
155 | # "as is" and any express or implied warranties, including, but not |
156 | # limited to, the implied warranties of merchantability and fitness |
157 | # for a particular purpose are disclaimed. In no event shall the |
158 | # copyright owner or contributors be liable for any direct, indirect, |
159 | # incidental, special, exemplary, or consequential damages (including, |
160 | # but not limited to, procurement of substitute goods or services; |
161 | # loss of use, data, or profits; or business interruption) however |
162 | # caused and on any theory of liability, whether in contract, strict |
163 | # liability, or tort (including negligence or otherwise) arising in |
164 | # any way out of the use of this software, even if advised of the |
165 | # possibility of such damage. |
166 | # |
167 | # |
168 | # == Links |
169 | # |
170 | # John Gruber:: http://daringfireball.net |
171 | # SmartyPants:: http://daringfireball.net/projects/smartypants |
172 | # |
173 | # Chad Miller:: http://web.chad.org |
174 | # |
175 | # Christian Neukirchen:: http://kronavita.de/chris |
176 | # |
177 | |
178 | |
179 | class RubyPants < String |
180 | VERSION = "0.2" |
181 | |
182 | # Create a new RubyPants instance with the text in +string+. |
183 | # |
184 | # Allowed elements in the options array: |
185 | # |
186 | # 0 :: do nothing |
187 | # 1 :: enable all, using only em-dash shortcuts |
188 | # 2 :: enable all, using old school en- and em-dash shortcuts (*default*) |
189 | # 3 :: enable all, using inverted old school en and em-dash shortcuts |
190 | # -1 :: stupefy (translate HTML entities to their ASCII-counterparts) |
191 | # |
192 | # If you don't like any of these defaults, you can pass symbols to change |
193 | # RubyPants' behavior: |
194 | # |
195 | # <tt>:quotes</tt> :: quotes |
196 | # <tt>:backticks</tt> :: backtick quotes (``double'' only) |
197 | # <tt>:allbackticks</tt> :: backtick quotes (``double'' and `single') |
198 | # <tt>:dashes</tt> :: dashes |
199 | # <tt>:oldschool</tt> :: old school dashes |
200 | # <tt>:inverted</tt> :: inverted old school dashes |
201 | # <tt>:ellipses</tt> :: ellipses |
202 | # <tt>:convertquotes</tt> :: convert <tt>&quot;</tt> entities to |
203 | # <tt>"</tt> for Dreamweaver users |
204 | # <tt>:stupefy</tt> :: translate RubyPants HTML entities |
205 | # to their ASCII counterparts. |
206 | # |
207 | def initialize(string, options=[2]) |
208 | super string |
209 | @options = [*options] |
210 | end |
211 | |
212 | # Apply SmartyPants transformations. |
213 | def to_html |
214 | do_quotes = do_backticks = do_dashes = do_ellipses = do_stupefy = nil |
215 | convert_quotes = false |
216 | |
217 | if @options.include? 0 |
218 | # Do nothing. |
219 | return self |
220 | elsif @options.include? 1 |
221 | # Do everything, turn all options on. |
222 | do_quotes = do_backticks = do_ellipses = true |
223 | do_dashes = :normal |
224 | elsif @options.include? 2 |
225 | # Do everything, turn all options on, use old school dash shorthand. |
226 | do_quotes = do_backticks = do_ellipses = true |
227 | do_dashes = :oldschool |
228 | elsif @options.include? 3 |
229 | # Do everything, turn all options on, use inverted old school |
230 | # dash shorthand. |
231 | do_quotes = do_backticks = do_ellipses = true |
232 | do_dashes = :inverted |
233 | elsif @options.include?(-1) |
234 | do_stupefy = true |
235 | else |
236 | do_quotes = @options.include? :quotes |
237 | do_backticks = @options.include? :backticks |
238 | do_backticks = :both if @options.include? :allbackticks |
239 | do_dashes = :normal if @options.include? :dashes |
240 | do_dashes = :oldschool if @options.include? :oldschool |
241 | do_dashes = :inverted if @options.include? :inverted |
242 | do_ellipses = @options.include? :ellipses |
243 | convert_quotes = @options.include? :convertquotes |
244 | do_stupefy = @options.include? :stupefy |
245 | end |
246 | |
247 | # Parse the HTML |
248 | tokens = tokenize |
249 | |
250 | # Keep track of when we're inside <pre> or <code> tags. |
251 | in_pre = false |
252 | |
253 | # The result is accumulated here. |
254 | result = "" |
255 | |
256 | # This is a cheat, used to get some context for one-character |
257 | # tokens that consist of just a quote char. What we do is remember |
258 | # the last character of the previous text token, to use as context |
259 | # to curl single-character quote tokens correctly. |
260 | prev_token_last_char = nil |
261 | |
262 | tokens.each { |token| |
263 | if token.first == :tag |
264 | result << token[1] |
265 | if token[1] =~ %r!<(/?)(?:pre|code|kbd|script|math)[\s>]! |
266 | in_pre = ($1 != "/") # Opening or closing tag? |
267 | end |
268 | else |
269 | t = token[1] |
270 | |
271 | # Remember last char of this token before processing. |
272 | last_char = t[-1].chr |
273 | |
274 | unless in_pre |
275 | t = process_escapes t |
276 | |
277 | t.gsub!(/&quot;/, '"') if convert_quotes |
278 | |
279 | if do_dashes |
280 | t = educate_dashes t if do_dashes == :normal |
281 | t = educate_dashes_oldschool t if do_dashes == :oldschool |
282 | t = educate_dashes_inverted t if do_dashes == :inverted |
283 | end |
284 | |
285 | t = educate_ellipses t if do_ellipses |
286 | |
287 | # Note: backticks need to be processed before quotes. |
288 | if do_backticks |
289 | t = educate_backticks t |
290 | t = educate_single_backticks t if do_backticks == :both |
291 | end |
292 | |
293 | if do_quotes |
294 | if t == "'" |
295 | # Special case: single-character ' token |
296 | if prev_token_last_char =~ /\S/ |
297 | t = "&#8217;" |
298 | else |
299 | t = "&#8216;" |
300 | end |
301 | elsif t == '"' |
302 | # Special case: single-character " token |
303 | if prev_token_last_char =~ /\S/ |
304 | t = "&#8221;" |
305 | else |
306 | t = "&#8220;" |
307 | end |
308 | else |
309 | # Normal case: |
310 | t = educate_quotes t |
311 | end |
312 | end |
313 | |
314 | t = stupefy_entities t if do_stupefy |
315 | end |
316 | |
317 | prev_token_last_char = last_char |
318 | result << t |
319 | end |
320 | } |
321 | |
322 | # Done |
323 | result |
324 | end |
325 | |
326 | protected |
327 | |
328 | # Return the string after processing the following backslash |
329 | # escape sequences. This is useful if you want to force a "dumb" quote |
330 | # or other character to appear. |
331 | # |
332 | # Escaped are: |
333 | # \\ \" \' \. \- \` |
334 | # |
335 | def process_escapes(str) |
336 | str.gsub('\\\\', '&#92;'). |
337 | gsub('\"', '&#34;'). |
338 | gsub("\\\'", '&#39;'). |
339 | gsub('\.', '&#46;'). |
340 | gsub('\-', '&#45;'). |
341 | gsub('\`', '&#96;') |
342 | end |
343 | |
344 | # The string, with each instance of "<tt>--</tt>" translated to an |
345 | # em-dash HTML entity. |
346 | # |
347 | def educate_dashes(str) |
348 | str.gsub(/--/, '&#8212;') |
349 | end |
350 | |
351 | # The string, with each instance of "<tt>--</tt>" translated to an |
352 | # en-dash HTML entity, and each "<tt>---</tt>" translated to an |
353 | # em-dash HTML entity. |
354 | # |
355 | def educate_dashes_oldschool(str) |
356 | str.gsub(/---/, '&#8212;').gsub(/--/, '&#8211;') |
357 | end |
358 | |
359 | # Return the string, with each instance of "<tt>--</tt>" translated |
360 | # to an em-dash HTML entity, and each "<tt>---</tt>" translated to |
361 | # an en-dash HTML entity. Two reasons why: First, unlike the en- and |
362 | # em-dash syntax supported by +educate_dashes_oldschool+, it's |
363 | # compatible with existing entries written before SmartyPants 1.1, |
364 | # back when "<tt>--</tt>" was only used for em-dashes. Second, |
365 | # em-dashes are more common than en-dashes, and so it sort of makes |
366 | # sense that the shortcut should be shorter to type. (Thanks to |
367 | # Aaron Swartz for the idea.) |
368 | # |
369 | def educate_dashes_inverted(str) |
370 | str.gsub(/---/, '&#8211;').gsub(/--/, '&#8212;') |
371 | end |
372 | |
373 | # Return the string, with each instance of "<tt>...</tt>" translated |
374 | # to an ellipsis HTML entity. Also converts the case where there are |
375 | # spaces between the dots. |
376 | # |
377 | def educate_ellipses(str) |
378 | str.gsub('...', '&#8230;').gsub('. . .', '&#8230;') |
379 | end |
380 | |
381 | # Return the string, with "<tt>``backticks''</tt>"-style double quotes |
382 | # translated into HTML curly quote entities. |
383 | # |
384 | def educate_backticks(str) |
385 | str.gsub("``", '&#8220;').gsub("''", '&#8221;') |
386 | end |
387 | |
388 | # Return the string, with "<tt>`backticks'</tt>"-style single quotes |
389 | # translated into HTML curly quote entities. |
390 | # |
391 | def educate_single_backticks(str) |
392 | str.gsub("`", '&#8216;').gsub("'", '&#8217;') |
393 | end |
394 | |
395 | # Return the string, with "educated" curly quote HTML entities. |
396 | # |
397 | def educate_quotes(str) |
398 | punct_class = '[!"#\$\%\'()*+,\-.\/:;<=>?\@\[\\\\\]\^_`{|}~]' |
399 | |
400 | str = str.dup |
401 | |
402 | # Special case if the very first character is a quote followed by |
403 | # punctuation at a non-word-break. Close the quotes by brute |
404 | # force: |
405 | str.gsub!(/^'(?=#{punct_class}\B)/, '&#8217;') |
406 | str.gsub!(/^"(?=#{punct_class}\B)/, '&#8221;') |
407 | |
408 | # Special case for double sets of quotes, e.g.: |
409 | # <p>He said, "'Quoted' words in a larger quote."</p> |
410 | str.gsub!(/"'(?=\w)/, '&#8220;&#8216;') |
411 | str.gsub!(/'"(?=\w)/, '&#8216;&#8220;') |
412 | |
413 | # Special case for decade abbreviations (the '80s): |
414 | str.gsub!(/'(?=\d\ds)/, '&#8217;') |
415 | |
416 | close_class = %![^\ \t\r\n\\[\{\(\-]! |
417 | dec_dashes = '&#8211;|&#8212;' |
418 | |
419 | # Get most opening single quotes: |
420 | str.gsub!(/(\s|&nbsp;|--|&[mn]dash;|#{dec_dashes}|&#x201[34];)'(?=\w)/, |
421 | '\1&#8216;') |
422 | # Single closing quotes: |
423 | str.gsub!(/(#{close_class})'/, '\1&#8217;') |
424 | str.gsub!(/'(\s|s\b|$)/, '&#8217;\1') |
425 | # Any remaining single quotes should be opening ones: |
426 | str.gsub!(/'/, '&#8216;') |
427 | |
428 | # Get most opening double quotes: |
429 | str.gsub!(/(\s|&nbsp;|--|&[mn]dash;|#{dec_dashes}|&#x201[34];)"(?=\w)/, |
430 | '\1&#8220;') |
431 | # Double closing quotes: |
432 | str.gsub!(/(#{close_class})"/, '\1&#8221;') |
433 | str.gsub!(/"(\s|s\b|$)/, '&#8221;\1') |
434 | # Any remaining quotes should be opening ones: |
435 | str.gsub!(/"/, '&#8220;') |
436 | |
437 | str |
438 | end |
439 | |
440 | # Return the string, with each RubyPants HTML entity translated to |
441 | # its ASCII counterpart. |
442 | # |
443 | # Note: This is not reversible (but exactly the same as in SmartyPants) |
444 | # |
445 | def stupefy_entities(str) |
446 | str. |
447 | gsub(/&#8211;/, '-'). # en-dash |
448 | gsub(/&#8212;/, '--'). # em-dash |
449 | |
450 | gsub(/&#8216;/, "'"). # open single quote |
451 | gsub(/&#8217;/, "'"). # close single quote |
452 | |
453 | gsub(/&#8220;/, '"'). # open double quote |
454 | gsub(/&#8221;/, '"'). # close double quote |
455 | |
456 | gsub(/&#8230;/, '...') # ellipsis |
457 | end |
458 | |
459 | # Return an array of the tokens comprising the string. Each token is |
460 | # either a tag (possibly with nested tags contained therein, such |
461 | # as <tt><a href="<MTFoo>"></tt>), or a run of text between |
462 | # tags. Each element of the array is a two-element array; the first |
463 | # is either :tag or :text; the second is the actual value. |
464 | # |
465 | # Based on the <tt>_tokenize()</tt> subroutine from Brad Choate's |
466 | # MTRegex plugin. <http://www.bradchoate.com/past/mtregex.php> |
467 | # |
468 | # This is actually the easier variant using tag_soup, as used by |
469 | # Chad Miller in the Python port of SmartyPants. |
470 | # |
471 | def tokenize |
472 | tag_soup = /([^<]*)(<[^>]*>)/ |
473 | |
474 | tokens = [] |
475 | |
476 | prev_end = 0 |
477 | scan(tag_soup) { |
478 | tokens << [:text, $1] if $1 != "" |
479 | tokens << [:tag, $2] |
480 | |
481 | prev_end = $~.end(0) |
482 | } |
483 | |
484 | if prev_end < size |
485 | tokens << [:text, self[prev_end..-1]] |
486 | end |
487 | |
488 | tokens |
489 | end |
490 | end |
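
The listing above splits its input into tag and text tokens, skips text inside protected tags, and rewrites dashes in the remaining text. The following is a minimal standalone sketch of that flow, limited to the "old school" dash rule. The method names `sketch_tokenize`, `sketch_dashes` and `sketch_to_html` are hypothetical and are not part of RubyPants; the regular expressions are copied from the listing.

```ruby
# Hypothetical standalone sketch; none of these method names exist in RubyPants.

# Split HTML into [:text, value] and [:tag, value] pairs, in document order.
def sketch_tokenize(html)
  tokens = []
  prev_end = 0
  html.scan(/([^<]*)(<[^>]*>)/) do |text, tag|
    tokens << [:text, text] unless text.empty?
    tokens << [:tag, tag]
    prev_end = Regexp.last_match.end(0)
  end
  # Any text after the last tag becomes a final text token.
  tokens << [:text, html[prev_end..-1]] if prev_end < html.size
  tokens
end

# "Old school" dashes: --- becomes an em-dash entity, -- an en-dash entity.
# The three-hyphen rule must run first, or it would never match.
def sketch_dashes(text)
  text.gsub(/---/, '&#8212;').gsub(/--/, '&#8211;')
end

# Convert dashes in text tokens only, leaving protected tag blocks untouched.
def sketch_to_html(html)
  in_pre = false
  sketch_tokenize(html).map { |kind, value|
    if kind == :tag
      # An opening protected tag turns conversion off; its closing tag turns it on.
      in_pre = ($1 != "/") if value =~ %r!<(/?)(?:pre|code|kbd|script|math)[\s>]!
      value
    else
      in_pre ? value : sketch_dashes(value)
    end
  }.join
end

puts sketch_to_html("<p>a -- b --- c</p><code>x -- y</code>")
# prints: <p>a &#8211; b &#8212; c</p><code>x -- y</code>
```

Like the listing, this sketch tracks protected tags with a single flag, so nested protected tags are not handled.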