Install-Package SoftCircuits.CodeColorizer
Code Colorizer is a .NET class library that converts source code to HTML with syntax coloring. The library is language-agnostic, meaning that the same code is used for all supported languages. Only the language rules change from one language to the next.
Each language is defined by creating a `LanguageRules` object that describes it. A `LanguageRulesCollection` can store the rules for any number of languages, and can read and write them to an XML file using the `Load()` and `Save()` methods. A sample XML file is linked below; it can give you a head start on defining the languages you want to support.
The XML file contains the following elements.
Rule | Purpose |
---|---|
name | The name of this language. |
caseSensitive | Determines if this language is case-sensitive (boolean). |
symbolChars | Characters that make up language keywords and symbol names. |
symbolFirstChars | Characters that can appear as the first character in language keywords and symbol names. |
operatorChars | Characters that can appear within language operators. Must include all characters used to signify comments. |
quotes | A single character used to denote string literals, with an optional escape character. If a string contains the escape character followed by the quote, that quote is treated as part of the string rather than as its terminator. If the language supports more than one quote character (such as " and '), you can include multiple quotes rules. |
blockComments | Strings that delimit block comments. If the language supports multiple block comment delimiters, you can include multiple blockComment rules. |
lineComments | String that starts a line comment (characters to the end of the line are assumed to be a comment). If the language supports multiple line comment operators, you can include multiple lineComment rules. |
keywords | Lists all the keywords supported by this language. |
symbols | Lists all the symbol names supported by this language. For example, the names of custom types (those not defined by the language itself) could be included in this list. Because this list could be incredibly long and require frequent updates, it is often not used. |
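To make the symbolFirstChars, symbolChars, and quotes rules more concrete, here is a minimal sketch of how a scanner might apply them. It is not the library's implementation: the `RuleSketch` class, its method names, and the character sets are all hypothetical and chosen only for illustration.

```cs
using System;

// Hypothetical helper for illustration only; not part of SoftCircuits.CodeColorizer.
public static class RuleSketch
{
    // Assumed rule values for a C-like language.
    const string SymbolFirstChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_";
    const string SymbolChars = SymbolFirstChars + "0123456789";

    // Returns the length of the keyword or symbol name starting at pos,
    // or 0 if the character at pos cannot start one.
    public static int ScanSymbol(string text, int pos)
    {
        if (pos >= text.Length || SymbolFirstChars.IndexOf(text[pos]) < 0)
            return 0;

        int end = pos + 1;
        while (end < text.Length && SymbolChars.IndexOf(text[end]) >= 0)
            end++;
        return end - pos;
    }

    // Returns the length of the string literal starting at pos, including both
    // quote characters. text[pos] is expected to be the opening quote. An escape
    // character directly followed by the quote keeps that quote inside the string.
    // An unterminated literal runs to the end of the text.
    public static int ScanString(string text, int pos, char quote, char? escape)
    {
        int end = pos + 1;
        while (end < text.Length)
        {
            char c = text[end];
            if (escape.HasValue && c == escape.Value &&
                end + 1 < text.Length && text[end + 1] == quote)
            {
                end += 2; // Skip the escape character and the escaped quote
                continue;
            }
            end++;
            if (c == quote)
                break; // Closing quote found
        }
        return end - pos;
    }
}
```

With these assumed rules, `RuleSketch.ScanSymbol("foo1 = 2", 0)` returns 4, `RuleSketch.ScanSymbol("1abc", 0)` returns 0 because a digit is not in the first-character set, and `RuleSketch.ScanString("\"a\\\"b\" + x", 0, '"', '\\')` returns 6, covering the whole literal including its escaped quote.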
The following example loads the language rules from the file "LanguageRules.xml". It then creates a `CodeColorizer` instance, passing one of the languages stored in the `LanguageRulesCollection` to the `CodeColorizer` constructor, and sets the CSS class name for each token type. Finally, it calls the `Transform()` method to convert the source code.
// Load the rules for all defined languages
LanguageRulesCollection languages = new LanguageRulesCollection();
languages.Load("LanguageRules.xml");
// Create a colorizer for one of the loaded languages
CodeColorizer colorizer = new CodeColorizer(languages["cs"]);
// Set the CSS class used for each token type
colorizer.CommentCssClass = "Comment_Class";
colorizer.KeywordCssClass = "Keyword_Class";
colorizer.OperatorCssClass = "Operator_Class";
colorizer.StringCssClass = "String_Class";
colorizer.SymbolCssClass = "Symbol_Class";
// sourceCode is a string that holds the code to colorize
string html = colorizer.Transform(sourceCode);
The output is the source code with markup added to apply the CSS classes that were specified. The output is also HTML-encoded, and it is formatted to display correctly when placed within a `<pre>...</pre>` block on a web page.
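As a rough sketch of that last step, the helper below wraps already-colorized output in a `<pre>` block and adds style rules for the class names used in the example above. The `PageSketch` class, its method names, and the colors are hypothetical and not part of the library. The `colorizedHtml` argument is assumed to be the string returned by `Transform()`.

```cs
using System.IO;

// Hypothetical helper for illustration only; not part of SoftCircuits.CodeColorizer.
public static class PageSketch
{
    // Builds a minimal HTML page around colorized output.
    // colorizedHtml is assumed to be the string returned by Transform(). It is
    // already HTML-encoded, so it is inserted as-is and not encoded again.
    public static string BuildPage(string colorizedHtml)
    {
        return
            "<!DOCTYPE html>\n" +
            "<html>\n<head>\n<meta charset=\"utf-8\">\n<title>Colorized code</title>\n<style>\n" +
            // The class names match those assigned to the colorizer; colors are arbitrary
            ".Comment_Class { color: green; }\n" +
            ".Keyword_Class { color: blue; }\n" +
            ".Operator_Class { color: gray; }\n" +
            ".String_Class { color: brown; }\n" +
            ".Symbol_Class { color: teal; }\n" +
            "</style>\n</head>\n<body>\n" +
            "<pre>" + colorizedHtml + "</pre>\n" +
            "</body>\n</html>\n";
    }

    // Writes the page to disk so it can be opened in a browser.
    public static void SavePage(string colorizedHtml, string path)
    {
        File.WriteAllText(path, BuildPage(colorizedHtml));
    }
}
```

Calling `PageSketch.SavePage(html, "colorized.html")` after the `Transform()` call would then produce a file that can be opened directly in a browser.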
For additional information and a discussion of the source code, please see my article Colorizing Source Code.