Language
Language is the interface in Sora Editor to provide language-specific functionality, including syntax analysis, auto-completion and auto-indent.
A single Language instance should serve only one editor. It is automatically destroyed when the editor is released or a new Language instance is set.
Use CodeEditor#setEditorLanguage to apply a new Language to an editor. By default, the editor uses the built-in EmptyLanguage, which performs no analysis, so syntax-highlight and other language features are unavailable.
We provide some general-purpose language implementations that let you set up analysis and syntax-highlight for a programming language. Note that the language-java module only offers simple token-based Java syntax-highlight.
Use Language Modules
Before using a language module, make sure you have imported it into your project.
language-textmate
This module uses TextMate grammars to tokenize and highlight text for various programming languages. TextMate grammars are also used in Visual Studio Code and Eclipse for syntax-highlight. Most library integrators will prefer this module over writing a Language implementation themselves.
Follow the steps below to use TextMate for your editor.
Find Language Syntax and Config
TextMate supports various languages, and syntax-highlight rules are defined by *.tmLanguage PLIST files or *.tmLanguage.json JSON files. You need these TextMate rule files (aka syntaxes) and optionally language configuration files (*.language-configuration.json) for your target language.
You can find those files in:
NOTE
Some TextMate syntaxes are not fully supported by the current TextMate engine, because the regexp library Joni does not fully support some of the regular expressions used in grammar files. Such regexps fall back to ^$ to avoid errors during highlight analysis.
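To make the fallback concrete, here is a minimal sketch of the idea, not the engine's actual code. It uses java.util.regex as a stand-in for Joni, and SafeRegex and compileOrFallback are hypothetical names introduced only for illustration. Because ^$ can only match an empty line, a rule that falls back this way would presumably never match ordinary code, so the affected constructs stay unhighlighted instead of breaking the analysis.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Illustrative only: java.util.regex stands in for the Joni engine here. */
final class SafeRegex {
    /** A pattern that can only match an empty line. */
    static final String FALLBACK = "^$";

    private SafeRegex() {}

    /** Compiles the source, or the fallback pattern if the engine rejects it. */
    static Pattern compileOrFallback(String source) {
        try {
            return Pattern.compile(source);
        } catch (PatternSyntaxException e) {
            // Unsupported or malformed expression: degrade instead of failing.
            return Pattern.compile(FALLBACK);
        }
    }
}
```

With this sketch, SafeRegex.compileOrFallback("(unclosed") yields the ^$ pattern, while a supported expression is compiled unchanged.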
Find Themes
TextMate grammars must be used together with TextMate themes, so you also need theme JSON files, which can be found in VSCode Extensions. The folders whose names match the theme-* pattern contain VSCode's built-in TextMate themes.
Prepare Language Registry
Multiple languages can be loaded by TextMate. Prepare a languages.json file for later loading. For example, suppose your assets directory looks like this:
.
└─ textmate
   ├─ java
   │  ├─ syntaxes
   │  │  └─ java.tmLanguage.json
   │  └─ language-configuration.json
   ├─ kotlin
   │  ├─ syntaxes
   │  │  └─ Kotlin.tmLanguage
   │  └─ language-configuration.json
   └─ languages.json
Your languages.json:
{
  "languages": [
    {
      "grammar": "textmate/java/syntaxes/java.tmLanguage.json",
      "name": "java",
      "scopeName": "source.java",
      "languageConfiguration": "textmate/java/language-configuration.json"
    },
    {
      "grammar": "textmate/kotlin/syntaxes/Kotlin.tmLanguage",
      "name": "kotlin",
      "scopeName": "source.kotlin",
      "languageConfiguration": "textmate/kotlin/language-configuration.json"
    }
  ]
}
name is an identifier of your choice, and scopeName is the root scope of the syntax file.
For languages with embedded languages (such as HTML and Markdown), refer to the HTML sample in the Demo App.
Load Syntaxes and Themes
Before using TextMate languages in the editor, load the syntax and theme files into the registries. These steps are performed only once, no matter how many editors use TextMate.
Suppose we load the TextMate files from our APK assets. First, we need to add a FileResolver for TextMate's internal file access.
Kotlin:
FileProviderRegistry.getInstance().addFileProvider(
    AssetsFileResolver(
        applicationContext.assets // use application context
    )
)
Java:
FileProviderRegistry.getInstance().addFileProvider(
    new AssetsFileResolver(
        getApplicationContext().getAssets() // use application context
    )
);
Then, the themes should be loaded. The code below shows how to load a single theme into the theme registry.
Kotlin:
val themeRegistry = ThemeRegistry.getInstance()
val name = "quietlight" // name of theme
val themeAssetsPath = "textmate/$name.json"
themeRegistry.loadTheme(
    ThemeModel(
        IThemeSource.fromInputStream(
            FileProviderRegistry.getInstance().tryGetInputStream(themeAssetsPath), themeAssetsPath, null
        ),
        name
    ).apply {
        // If the theme is dark
        // isDark = true
    }
)
Java:
var themeRegistry = ThemeRegistry.getInstance();
var name = "quietlight"; // name of theme
var themeAssetsPath = "textmate/" + name + ".json";
var model = new ThemeModel(
    IThemeSource.fromInputStream(
        FileProviderRegistry.getInstance().tryGetInputStream(themeAssetsPath), themeAssetsPath, null
    ),
    name
);
// If the theme is dark
// model.setDark(true);
themeRegistry.loadTheme(model);
Next, select an active theme for TextMate. TextMate uses its registry to manage the global color scheme.
Kotlin:
ThemeRegistry.getInstance().setTheme("your-theme-name")
Java:
ThemeRegistry.getInstance().setTheme("your-theme-name");
Finally, we load the language syntaxes and configurations.
Kotlin:
GrammarRegistry.getInstance().loadGrammars("textmate/languages.json")
Java:
GrammarRegistry.getInstance().loadGrammars("textmate/languages.json");
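Since the loading steps above only need to run once per process, one option is to funnel them through a small guard. The sketch below is plain Java with no Sora Editor dependency: TextMateBootstrap and ensureLoaded are hypothetical names, and the Runnable is assumed to wrap the file provider, theme, and grammar calls shown above.

```java
/** Hypothetical helper: runs the TextMate loading steps at most once per process. */
final class TextMateBootstrap {
    private static boolean loaded = false;

    private TextMateBootstrap() {}

    /**
     * Runs the loader on the first successful call only.
     * Returns true if this call performed the loading, false if it was skipped.
     */
    static synchronized boolean ensureLoaded(Runnable loader) {
        if (loaded) {
            return false;
        }
        // If the loader throws, the flag stays false so a later call can retry.
        loader.run();
        loaded = true;
        return true;
    }
}
```

Each screen that creates an editor could then call TextMateBootstrap.ensureLoaded(...) before applying a TextMate language; only the first call does any work.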
Load by Kotlin DSL
You can also load languages into the grammar registry without languages.json by using the Kotlin DSL.
For example:
GrammarRegistry.getInstance().loadGrammars(
    languages {
        language("java") {
            grammar = "textmate/java/syntaxes/java.tmLanguage.json"
            defaultScopeName()
            languageConfiguration = "textmate/java/language-configuration.json"
        }
        language("kotlin") {
            grammar = "textmate/kotlin/syntaxes/Kotlin.tmLanguage"
            defaultScopeName()
            languageConfiguration = "textmate/kotlin/language-configuration.json"
        }
        language("python") {
            grammar = "textmate/python/syntaxes/python.tmLanguage.json"
            defaultScopeName()
            languageConfiguration = "textmate/python/language-configuration.json"
        }
    }
)
defaultScopeName() sets scopeName to source.${languageName}.
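Each DSL entry carries the same fields as an item in languages.json. The sketch below illustrates that correspondence in plain Java. GrammarEntry is a hypothetical type written for this example, not part of the library, and its defaultScopeName() only mirrors the naming rule stated above. Note that the default suits grammars whose root scope really is source.<name>; a grammar with a different root scope (VSCode's HTML grammar uses text.html.basic, for instance) would not match it.

```java
/** Hypothetical value type mirroring one entry of languages.json. */
final class GrammarEntry {
    final String name;
    final String grammar;
    final String languageConfiguration;

    GrammarEntry(String name, String grammar, String languageConfiguration) {
        this.name = name;
        this.grammar = grammar;
        this.languageConfiguration = languageConfiguration;
    }

    /** Mirrors the documented default: "source." followed by the language name. */
    String defaultScopeName() {
        return "source." + name;
    }

    /**
     * Renders the entry in the same shape as a languages.json item.
     * Assumes the values contain no characters that need JSON escaping.
     */
    String toJson() {
        return "{\n"
            + "  \"grammar\": \"" + grammar + "\",\n"
            + "  \"name\": \"" + name + "\",\n"
            + "  \"scopeName\": \"" + defaultScopeName() + "\",\n"
            + "  \"languageConfiguration\": \"" + languageConfiguration + "\"\n"
            + "}";
    }
}
```

For the java entry above, defaultScopeName() yields source.java, the same value that appears in the scopeName field of the JSON registry.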
Setup Editor
Set the color scheme for the editor. If TextMateColorScheme is not applied to the editor, the syntax-highlight colors produced by TextMate will be transparent.
Kotlin:
editor.colorScheme = TextMateColorScheme.create(ThemeRegistry.getInstance())
Java:
editor.setColorScheme(TextMateColorScheme.create(ThemeRegistry.getInstance()));
Set the editor language.
Kotlin:
val languageScopeName = "source.java" // The scope name of target language
val language = TextMateLanguage.create(
    languageScopeName, true /* true for enabling auto-completion */
)
editor.setEditorLanguage(language)
Java:
var languageScopeName = "source.java"; // The scope name of target language
var language = TextMateLanguage.create(
    languageScopeName, true /* true for enabling auto-completion */
);
editor.setEditorLanguage(language);
Congratulations! You've done all the setup. Enjoy!
language-java
The Java language support provides token-based highlight, identifier auto-completion and code block markers. It also has some experimental features for testing the editor.
Although its functionality is simple, it is considerably faster than more complex language analyses.
To create and apply the language, see the code below:
Kotlin:
editor.editorLanguage = JavaLanguage()
Java:
editor.setEditorLanguage(new JavaLanguage());
language-treesitter
TreeSitter was developed by the creators of Atom (and now Zed) and is used in both code editors. TreeSitter is a parser generator tool and an incremental parsing library.
With TreeSitter, we can build a concrete syntax tree for a source file, efficiently update the tree as the file is edited, and use it for accurate syntax-highlight.
We use the Java binding android-tree-sitter to invoke tree-sitter APIs.
Before reading ahead, we strongly recommend that you check out TextStyle in the editor framework first.
Prepare Language
You can find existing language implementations in android-tree-sitter. If the language you want is missing, you have to build the language for Android on your own.
In addition, up to four scm files are used for querying the syntax tree. Only the highlight query is mandatory:
- For highlight: highlights.scm for most languages can be found in TreeSitter language repositories. For example, the one for Java is here
- For code blocks (optional): These are sora-editor specific queries. Refer to here for instructions and a sample.
- For brackets (optional): These are sora-editor specific queries. Refer to here for instructions and a sample.
- For local variables (optional): locals.scm for most languages can be found in the nvim-treesitter repository.
Useful Links:
Create Language Spec
First, create a TsLanguageSpec with a tree-sitter language instance and the scm source texts. You may also need to add a custom LocalsCaptureSpec for your locals.scm.
val spec = TsLanguageSpec(
    // Your tree-sitter language instance
    language = TSLanguageJava.getInstance(),
    // scm source texts
    highlightScmSource = assets.open("tree-sitter-queries/java/highlights.scm")
        .reader().readText(),
    codeBlocksScmSource = assets.open("tree-sitter-queries/java/blocks.scm")
        .reader().readText(),
    bracketsScmSource = assets.open("tree-sitter-queries/java/brackets.scm")
        .reader().readText(),
    localsScmSource = assets.open("tree-sitter-queries/java/locals.scm")
        .reader().readText(),
    localsCaptureSpec = object : LocalsCaptureSpec() {
        // Override any method to change the specification
    }
)
Sometimes, your scm file uses external predicate methods (client predicates) to query the syntax tree more precisely. In this case, add your predicate implementations to the predicates argument.
Make Language and Theme
Create a TsLanguage with your TsLanguageSpec and the theme builder DSL.
// Extension function for easily creating text styles in Kotlin
import io.github.rosemoe.sora.lang.styling.textStyle
// ...
// `spec` is the TsLanguageSpec created in the previous step
val language = TsLanguage(spec, false /* useTab */) {
    // Theme Builder DSL
    // Apply text style to captured syntax nodes
    // KEYWORD and LITERAL are color ids defined in EditorColorScheme
    // Apply style to single type of node
    textStyle(KEYWORD, bold = true) applyTo "keyword"
    // Apply to multiple
    textStyle(LITERAL) applyTo arrayOf("constant.builtin", "string", "number")
}
Apply Language
Now the language instance can be applied to the editor.
editor.setEditorLanguage(language)
Note that the TsLanguageSpec object cannot be reused, because it is closed when the TsLanguage is destroyed.
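The ownership rule above suggests building a fresh spec every time a language is created. The following plain-Java sketch models only that rule: OwnedSpec, OwningLanguage, and LanguageFactory are hypothetical stand-ins, not the library's classes, and the exception thrown on reuse is an assumption made for the illustration rather than documented library behaviour.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a spec that holds closable resources. */
final class OwnedSpec implements AutoCloseable {
    private boolean closed;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical stand-in for a language that takes ownership of its spec. */
final class OwningLanguage {
    private final OwnedSpec spec;

    OwningLanguage(OwnedSpec spec) {
        if (spec.isClosed()) {
            throw new IllegalStateException("spec was closed by a previous language");
        }
        this.spec = spec;
    }

    /** Destroying the language also closes the spec it owns. */
    void destroy() {
        spec.close();
    }
}

/** Creates a new spec for every language instead of sharing one. */
final class LanguageFactory {
    private final Supplier<OwnedSpec> specSupplier;

    LanguageFactory(Supplier<OwnedSpec> specSupplier) {
        this.specSupplier = specSupplier;
    }

    OwningLanguage newLanguage() {
        return new OwningLanguage(specSupplier.get());
    }
}
```

In a real project, the supplier would be the code that builds a TsLanguageSpec, so that every editor receives a language backed by its own spec.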