I've thought about creating my own syntax highlighter. I've been using Chili, but there are some odd bugs that pop up here and there and it doesn't seem to play well with Chrome. And it hasn't been updated in 2 years.
One thing I did want was line numbering, but that's been a bugaboo of syntax highlighters for a long time: you want the numbers but do not want them copied when code is selected. Firefox copies the numbers when using <li> elements, and tables or inserted text will also copy everything. The answer seems to be using :before to insert the line numbers, since that text won't be copied in any modern browser (IE 7 and below don't support :before, but we won't worry about that).
The issue then is how to tell CSS about the lines. We want to wrap them in <span>s, like so:
<pre class="test1">
<span class=line>This is a <em>text</em></span>
<span class=line>This is the second line</span>
</pre>
And number everything with CSS:
pre.test1 {
  counter-reset: linecounter;
}
pre.test1 span.line {
  counter-increment: linecounter;
}
pre.test1 span.line:before {
  content: counter(linecounter);
  width: 2em;
  display: inline-block;
  border-right: 1px solid black;
}
And this is the result, exactly as desired.
This is a text
This is the second line
The keys in the CSS are the counter-reset and counter-increment declarations that set up the counter. Change the reset to counter-reset: linecounter 4 to start the numbering at 5, since counter-increment increments before displaying. You can rename linecounter to anything you want as long as it's consistent. The content declaration displays the value of the counter in the :before pseudo-element, and the width, display and border-right declarations are just old-fashioned styling to make it prettier. You would of course want to add some padding, margin, odd/even backgrounds, etc., but that's old hat.
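The off-by-one in the starting value is easy to get wrong, so here is a minimal sketch of it in JavaScript. The helper name numberingCss and its parameters are my own invention for illustration, not part of the stylesheet above; all it does is build the CSS text for a given first line number.

```javascript
// Hypothetical helper (not from the original post): builds the numbering
// rules for a selector. counter-increment runs before the counter is
// displayed, so the counter is reset to one less than the first number shown.
function numberingCss(selector, firstNumber, counterName) {
  counterName = counterName || 'linecounter';
  var resetTo = firstNumber - 1;
  return [
    selector + ' { counter-reset: ' + counterName + ' ' + resetTo + '; }',
    selector + ' span.line { counter-increment: ' + counterName + '; }',
    selector + ' span.line:before { content: counter(' + counterName + '); }'
  ].join('\n');
}

// Numbering that starts at 5 needs "counter-reset: linecounter 4".
console.log(numberingCss('pre.test1', 5));
```

The generated text could be dropped into a style element, though for a single page it is probably simpler to write the rules by hand.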
But how do we get the <span>s to wrap the lines? We could just take the text, split it on '\n' and use string processing to wrap the pieces: element.innerHTML = element.textContent.replace(/.+/g, '<span class="line">$&</span>'), but that loses all internal markup. Luckily, browsers that implement contentEditable know how to insert stuff without messing up the structure by using ranges, and we know how to manipulate ranges.
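To see the markup loss concretely, here is a small sketch that runs outside the browser. The textOf function is only a crude stand-in for element.textContent (an assumption for illustration; it just strips tags with a regular expression), and naiveWrap is a hypothetical name for the string-processing approach.

```javascript
// Crude stand-in for element.textContent: drops tags from an HTML string.
// (Illustration only; in a browser, textContent does this properly.)
function textOf(html) {
  return html.replace(/<[^>]*>/g, '');
}

// The naive approach: wrap every non-empty line of the plain text.
// "." does not match a newline, so /.+/g matches one line at a time.
function naiveWrap(html) {
  return textOf(html).replace(/.+/g, '<span class="line">$&</span>');
}

var original = 'This is a <em>text</em>\nThis is the second line';
console.log(naiveWrap(original));
// <span class="line">This is a text</span>
// <span class="line">This is the second line</span>
```

Each line gets its wrapper, but the <em> around "text" is gone, which is why the range-based approach is needed.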
Rather than including the whole bililiteRange class, since I know we're only going to be dealing with standards-compliant browsers, I can just take out the relevant code:
function wrapLines (el){
  var range = document.createRange();
  var pointer = 0; // start of text
  el.textContent.split('\n').forEach(function(line){
    var len = line.length;
    setBounds (pointer, pointer+len); // sets range to the characters of the line
    var wrapper = document.createElement('span');
    wrapper.setAttribute('class', 'line');
    wrapper.appendChild(range.extractContents()); // pulls the contents of the range out of the document and into wrapper
    range.insertNode(wrapper); // and put back the wrapped line
    pointer += len+1; // skip the newline
  });
  // now, we're left with a bunch of empty spans/other elements that were split across lines and the browser divided them into three parts (first line, newline character, second line)
  // those mess up the odd/even calculations. Replace them with plain text.
  for (var node = el.firstChild; node; node = node.nextSibling){
    // only elements have attributes; text and comment nodes are left alone
    if (node.nodeType == 1 /* ELEMENT_NODE */ && node.getAttribute('class') != 'line'){
      var replacement = document.createTextNode(node.textContent);
      el.replaceChild(replacement, node);
      node = replacement;
    }
  }
  function setBounds (start, end){
    // since the browser throws an error if we try to move the beginning past the end (unlike IE, which just adjusts the end)
    // we have to reset the range to cover the entire element, then move the start, then move the end to the start, then move the end
    range.selectNodeContents(el);
    moveBoundary (start, 'start');
    range.collapse (true);
    moveBoundary (end-start, 'end');
  }
  function moveBoundary (n, start){
    // move the boundary n characters forward, up to the end of the element. Forward only!
    // start is 'start' or 'end', and is used to create the appropriate property names ('startContainer' or 'endContainer' etc.)
    // if the start is moved after the end, then an exception is raised
    if (n <= 0) return;
    var startNode = range[start+'Container'];
    // we may be starting somewhere into the text
    if (startNode.nodeType == 3) n += range[start+'Offset'];
    // nodeIterators from http://www.w3.org/TR/DOM-Level-2-Traversal-Range/traversal.html
    var iter = document.createNodeIterator(el, 4 /* SHOW_TEXT */), node;
    while (node = iter.nextNode()){
      if (startNode.compareDocumentPosition(node) & 2 /* DOCUMENT_POSITION_PRECEDING */ ) continue;
      if (n <= node.nodeValue.length){
        // we found the last character!
        range[start == 'start' ? 'setStart' :'setEnd'](node, n);
        return;
      }else{
        n -= node.nodeValue.length; // eat these characters
      }
    }
  }
}
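The bookkeeping in wrapLines and moveBoundary can be tried without a DOM. The sketch below is a simplified model, not the code above: lineBounds and locate are hypothetical names, text nodes are represented as plain strings in document order, and the adjustment for a boundary that already sits partway into a text node is left out.

```javascript
// Character ranges [start, end) of each line, mirroring the
// "pointer += len+1" bookkeeping in wrapLines.
function lineBounds(text) {
  var pointer = 0;
  return text.split('\n').map(function (line) {
    var bounds = { start: pointer, end: pointer + line.length };
    pointer += line.length + 1; // skip the newline
    return bounds;
  });
}

// Simplified model of the walk in moveBoundary: given the contents of the
// text nodes in document order, find the node that holds character offset n.
function locate(textNodes, n) {
  for (var i = 0; i < textNodes.length; i++) {
    if (n <= textNodes[i].length) return { node: i, offset: n };
    n -= textNodes[i].length; // eat these characters
  }
  return null; // ran past the end of the element
}

// <pre>This is a <em>text</em>\nThis is the second line</pre>
// contains three text nodes:
var nodes = ['This is a ', 'text', '\nThis is the second line'];
var bounds = lineBounds(nodes.join(''));
console.log(bounds);                       // [ { start: 0, end: 14 }, { start: 15, end: 38 } ]
console.log(locate(nodes, bounds[0].end));   // { node: 1, offset: 4 }
console.log(locate(nodes, bounds[1].start)); // { node: 2, offset: 1 }
```

Note that an offset equal to a node's length lands at the end of that node rather than the start of the next one, which matches the n <= node.nodeValue.length test in moveBoundary.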
And now it works (note the original markup has no line-wrapping spans; those are added with JavaScript):
<pre class="test1 numbered">
This is a <em>text</em>
This is the second line
This is the third line; this text should have line numbers.</pre>
wrapLines($('.numbered')[0])
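The call above uses jQuery and only numbers the first matching element. If jQuery isn't on the page, something like the following sketch should do the same for every match. The name wrapAll is hypothetical, and it assumes wrapLines from above is already loaded; root and wrap are passed in so the helper can be tried outside a browser as well.

```javascript
// Applies a line-wrapping function to every element matching selector.
// In a browser, root would be document and wrap would be wrapLines.
function wrapAll(root, selector, wrap) {
  var elements = root.querySelectorAll(selector);
  // NodeList is array-like, so borrow forEach from Array.prototype
  Array.prototype.forEach.call(elements, function (el) { wrap(el); });
  return elements.length; // how many elements were processed
}

// Browser usage; skipped when there is no DOM or wrapLines is not defined.
if (typeof document !== 'undefined' && typeof wrapLines === 'function') {
  wrapAll(document, 'pre.numbered', wrapLines);
}
```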
Mike says:
Can you add some code to change background colour for alternate lines?
April 20, 2014, 12:46 pm
Danny says:
@Mike:
Use CSS:
–Danny
April 20, 2014, 1:34 pm
Mohammad Hamza says:
I think the syntax highlighter codes are better than this.
October 23, 2014, 3:19 am
Ezeh says:
How can one install this code on a Blogger blog, please?
October 28, 2014, 9:49 am
Danny says:
@Ezeh:
I don’t know anything about Blogger. Can you include JavaScript?
–Danny
October 28, 2014, 11:34 am
Danny says:
@Mohammad:
I agree. This was more a proof-of-concept. I like my Prism plugin, though I have to admit that Lea Verou (the creator of Prism) doesn’t like wrapping lines to set line numbers.
–Danny
October 28, 2014, 11:40 am
Ezeh says:
@Danny:
Yes, on my template I can do that. But I need steps that will guide me.
October 31, 2014, 6:24 am
Danny says:
I don’t know anything about Blogger to tell you how to include JavaScript and CSS files. Sorry.
November 27, 2014, 3:59 pm
David says:
I’m seeing odd behavior with an embedded span that crosses multiple lines, e.g.,
blah blah
blah blah blah
blah blah
Is the middle line of samples like this formatted correctly for everyone else (i.e., have I done something wrong), or does the code not retrieve the context except in cases where the range logic has to patch up something that isn’t well formed?
December 9, 2014, 10:33 pm
David says:
Sorry; new to the way WordPress seems to eat markup. What I meant was: I’m seeing odd behavior with an embedded span that crosses multiple lines, e.g.,
blah <span style="color:red;">blah
blah blah blah
blah blah
Is the middle line of samples like this formatted correctly for everyone else (i.e., have I done something wrong), or does the code not retrieve the context except in cases where the range logic has to patch up something that isn’t well formed?
December 9, 2014, 10:34 pm
David says:
Sorry; messed up again. What I *really* meant was: I’m seeing odd behavior with an embedded span that crosses multiple lines, e.g.,
blah <span style="color:red;">blah
blah blah blah
blah</span> blah
Is the middle line of samples like this formatted correctly for everyone else (i.e., have I done something wrong), or does the code not retrieve the context except in cases where the range logic has to patch up something that isn’t well formed?
December 9, 2014, 10:35 pm
Danny says:
@David:
Unfortunately, what happens in that case is out of my control. My code does range.extractContents() and then inserts the extracted content into a new <span>. If range.extractContents() doesn’t preserve the formatting then you will have to split the span by hand. When I have time, I will look at it. If you inspect the results, what do you get?
–Danny
December 10, 2014, 8:57 am
David says:
@Danny, after sleeping on it, I think I understand the cause of the problem and have a solution; I’ll try to get it out in the next day or so, and I’ll drop a note here and post the code in an accessible space when that happens. Thank you for the quick response (and feel free to delete my first few malformed postings, which apparently the blog site won’t let me delete myself).
December 10, 2014, 4:37 pm
David says:
The problem was that the range strategy fixes stranded start and end tags so that the range contents will be well-formed, but it doesn’t walk up the tree. If after the automatic range fix-up the start and end points of the range are both contained in an element that starts before and ends after the range, any styling associated with that element doesn’t get applied. I’ve addressed this by walking up the tree; code is at https://github.com/djbpitt/numberlines. My version is heavily dependent on the one here (credited in the initial comments), but because I’m not a very adept JavaScript coder, I deviated in places where I didn’t understand how the original worked. I haven’t tested this extensively, but it seems to maintain formatting that starts anywhere inside the <pre> wrapper element.
December 14, 2014, 11:14 am