INTRODUCTION

X# (pronounced X-Sharp) is a high-level assembly language that targets the x86 architecture and is expected to be flexible enough to target other kinds of processors later.

The language is line-based, which means an instruction cannot span several lines. This makes the language easier to parse. Parsing is also performed in a single pass. This implies that some semantic checks are not performed by the parser, which may lead to assembly failures when NASM is invoked later.

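X# statements map close to 1:1 onto assembler instructions. This helps debugging because there is no disconnect between the source code and the generated code, and there are no large compound statements.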

SYNTAX

Comments

A comment must appear on its own line. You cannot mix code and comments on a single line. A comment line is one that starts with two consecutive slashes. Whitespace may precede the slashes. For example:
// This is a comment.
    // Another comment prefixed with whitespaces.

Literal values

String literals

A string literal is surrounded by single quotes. If your string contains a single quote, you must escape it with a backslash character. For example:
'Waiting for \'debugger\' connection...'

Integer literals

You can write integer literal values in either decimal or hexadecimal. Prefix hexadecimal values with a dollar sign:
// These two constant values are actually equal
const decimal = 255
const hexadecimal = $FF

Namespaces

A namespace is a naming scope that lets you organize your code to avoid naming collisions. You declare a namespace by using the namespace keyword and giving it a name. For example:
namespace TEST

The namespace name is automatically used as a prefix for each named item that appears in that namespace (function names, labels, variables ...). The namespace extends from the source code line where it is declared until either another namespace definition appears or the end of the source file is reached. Consequently there is no namespace hierarchy and you cannot "embed" a namespace into another one.

WARNING : Code inside a namespace has no way to reference or use code or data from another namespace.
Nothing prevents you from reusing a namespace, including within a single source file. For example, the following source code compiles without error:
namespace FIRST
// Everything here will be prefixed with FIRST. Hence the "true" full name of the below variable
// is FIRST_aVar
var aVar
namespace SECOND
// Not a problem to name another variable aVar. Its true name is SECOND_aVar
var aVar
namespace FIRST
// And here we get back to the FIRST namespace

Every program artefact MUST appear inside a namespace. It is hence strongly recommended to define a namespace at the very beginning of any X# source file.

Datatypes

X# targets 32-bit assembler code generation. The datatypes used throughout this document are byte (8 bits), word (16 bits) and doubleword (32 bits); variables may additionally hold text.

The signedness of a datatype is undefined. X# code must itself handle the various control flags (carry, sign and overflow) according to the context. Also note that X# lacks floating-point datatypes.

Constants

Constants are symbolic names associated with a numeric literal value. A constant definition is introduced by the const keyword, followed by the constant name, an equal sign and a constant numeric value. Constants are always considered to be of doubleword type. For example:
namespace TEST
const twoHundred = 200

The constant name itself is built differently than for other items. The above constant declaration is actually named TEST_Const_twoHundred. Consequently you can define another (non const) item with the same name without fearing name collision. However this is bad programming practice and is strongly discouraged.

WARNING : Whenever you want to reference one of your constants in your source code, you MUST prefix its name with a hash sign (#). For example, the following code initializes the EAX register with the value of the twoHundred constant:
EAX = #twoHundred

Variables

You can define atomic variables of doubleword or text type, or one-dimensional arrays of any of the available datatypes. You declare a variable by giving it a name and optionally a value. For example, the code below declares two variables:
var myNumVar = 876
var myTextVar = 'A message'

If you do not give the variable a value, it is assumed to be a doubleword and is initialized with a default value of 0.
The X# compiler silently appends a null byte at the end of a textual initialization value.

You can also define a one-dimensional array of one of the available datatypes. All array members are initialized to 0. You must provide the array size at declaration time. For example, an array of 256 bytes is declared as follows:
var myArray byte[256]

Registers

X# supports the four general-purpose registers of the x86 architecture. These registers are available byte sized (AH AL BH BL CH CL DH DL), word sized (AX BX CX DX) and doubleword sized (EAX EBX ECX EDX). The four specific registers are also available, doubleword sized only: ESI EDI ESP EBP.
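
As an illustration of the three register sizes, here is a minimal sketch. It assumes that the byte and word sized registers accept an immediate value with the same = syntax that this document shows for EAX; the values are arbitrary.

```xsharp
// Doubleword (32 bits) register.
EAX = $12345678
// Word (16 bits) register.
BX = $1234
// Byte (8 bits) register.
CL = $12
```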

Labels

Labels are a way to give a name to memory addresses. This makes it possible to reference these addresses at coding time without having to know their value at runtime. The X# compiler automatically creates several labels. For example, each time you define a variable, a label is created that has the variable name and references the memory address of the variable. This is useful for reading and writing the variable content.
When you create a function, a label is also defined at the address of the beginning of the function. This label is used when you call the function.
Those automatically created labels are largely transparent to you. On the other hand, you may want to explicitly define labels to denote some particular position in your code. This is the case, for example, when you want to perform a test and jump to a specific line of code depending on the result of the test. You create a label at the code location you want to jump to.
A label is nothing more than a name suffixed with a colon:
// This is a useless label because the variable already got one.
MyUselessLabel:
var myVar

Functions

Functions are declared using the function keyword. A function name must follow the keyword and be followed by an opening curly brace. Be careful to keep the opening curly brace on the same line as the function keyword. Unlike high-level languages, an X# function declaration doesn't support parameter declarations. You must handle parameter passing yourself, using the stack and/or well-known registers. For example:
function MyFirstFunction {
// Your code here
// Do not forget the closing curly brace.
}

Returning from a function

When the X# compiler encounters the closing curly brace that signals the end of the function source code, it automatically adds a ret instruction. The recommended way to return from a function is to use the return keyword. Internally the X# compiler translates it to an unconditional jump to a special label, local to the function, which is named Exit. The X# compiler tracks the use of this label and adds it at the end of the function code if you don't define it yourself.

Sometimes you will want to explicitly return from your function without going to the cleanup code that may be defined at and below the function Exit label. You can do so by using the ret keyword.
// This instruction will directly exit the function without jumping to the Exit label.
ret

WARNING : The X# compiler doesn't monitor stack content. It is the responsibility of your code to make sure that the return address is immediately on top of the stack before the ret instruction is executed, including for the one that is automatically added by the compiler at the end of the function body.
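
The interplay between return, the Exit label and the stack can be sketched as follows. This is only an illustration built from the syntax described in this document: the function name SaveAndWork is hypothetical, and the code is assumed to sit inside a namespace.

```xsharp
function SaveAndWork {
// Save EBX on the stack. It must be popped again before the function returns.
+EBX
// Early exit: return jumps to Exit, so the cleanup code below still runs.
if EAX = 0 return
EBX = EAX
EAX + EBX
Exit:
// Cleanup: restore EBX so that the return address is back on top of the stack.
-EBX
}
```

Using ret instead of return in the early exit would skip the cleanup code and leave the saved EBX value on top of the stack in place of the return address.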

Invoking a function

You invoke a function by using the call keyword followed by the function name.
Call myFunction
Because X# doesn't support function parameters, you must make sure you properly set up the stack and/or the registers that are expected by the invoked function.
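
A minimal sketch of passing a parameter through a register follows. The function names and the convention used (argument and result both in EAX) are assumptions chosen for the illustration, not something X# imposes.

```xsharp
function AddTen {
// Hypothetical convention: the argument arrives in EAX and the result is left in EAX.
EAX + 10
}

function Caller {
EAX = 32
Call AddTen
// EAX is expected to hold 42 here.
}
```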

Interrupt handlers

Interrupt handlers are a special kind of function used to handle an interrupt. Those functions do not support parameters and are declared using the interrupt keyword. An interrupt function name must follow the keyword and be followed by an opening curly brace. Be careful to keep the opening curly brace on the same line as the interrupt keyword. For example:
interrupt DivideByZero {
// Your code here
// Do not forget the closing curly brace.
}

Interrupt handlers are executed in a specific processor context that is different from the normal control flow within functions. So there must be a way for the processor to know when interrupt processing is done and normal operation should resume. This requires a specific instruction, namely iret on x86 processors. Normally you do not have to take care of this, because the X# compiler knows you are defining an interrupt handler and silently inserts the iret instruction at the end of the interrupt handler code. However, you can directly insert the iret instruction in your X# code, including in a normal function.

WARNING : You must be very careful not to use this instruction when your code is not handling an interrupt, otherwise the processor will trigger an exception. The X# compiler doesn't perform any check when you hardcode this instruction.

Assigning value

You can assign a value to a register or to a variable. You do it using the = operator. The left side is the register or variable name while the right side is the value to be assigned. For example :
// Assign the immediate value 123 to the EAX register (32 bits).
EAX = 123

On the right side of the assignment operator you can use an immediate value, a constant (whose name must be prefixed with a hash sign, #), or a register name.
When the left side of the assignment operator is a variable name and the right side is an immediate value, you can additionally define the size of the right operand explicitly, using an as clause followed by the datatype. For example:
// Assign the immediate value 200 as a word (16 bits) to the myVar variable.
myVar = 200 as word

Address indirection

Sometimes a register contains the in-memory address of another element, most likely a variable. In this case you do not want to assign a value to the register itself; you want instead to store the value at the memory address held in the register. This is called address indirection and is denoted by the register name followed by a number between square brackets, known as an offset (more on this later). Address indirection may be used on either the right side or the left side of the = assignment operator. However, you can't use it on both sides at the same time. Let's take an example:
EAX[10] = EBX
The behavior is as follows: take the content of the EAX register, add the offset value to it (10 in our example) and consider the result to be a memory address. Now store the content of the EBX register at this memory address.
The offset value must be a literal number. It may be 0 or even a negative number.

So how does a register come to hold a memory address? We do this with a special @ operator that is used as a prefix to a label name. Since the X# compiler automatically creates a label each time you declare a variable, we now have the following syntax:
// Declare a variable
var myVar
// Read variable content into EAX register by using the variable name.
EAX = myVar
// Load EAX register with the in memory address of the myVar variable.
EAX = @myVar
// So now we can store the content of EBX register into myVar variable.
EAX[0] = EBX
// And read back the content of the myVar variable into ECX register.
ECX = EAX[0]

Register arithmetic

X# supports additive and subtractive register arithmetic with the + and - operators. X# also supports a shortcut syntax for incrementing and decrementing a register. This syntax is not supported for variables. When incrementing or decrementing a register you must omit the assignment part of the instruction. The target register is the one on the left side of the operator. For example, the following instruction increments the EAX register by 2:
EAX + 2
In the above example you can replace the literal value with a register name but not with a variable name. In the following example the value of the EAX register is decremented by the value of the EBX register :
EAX - EBX

Finally, there is an even shorter version when you want to increment or decrement a register by 1. This is performed with the ++ and -- operators. They can be applied to a register only. Incrementing and decrementing a variable this way is not supported. Additionally, the operator must be used as a register suffix, with no space between the register name and the operator. For example:
// Increment EAX register
EAX++
// Decrement ECX register
ECX--

Register shifting and rolling

Shifting a register to the right or to the left is performed with the >> and << keywords respectively. Following the keyword you must provide a literal number that defines how many bits to shift. For example:
// Shift EAX to the right by 8 bits.
EAX >> 8

Rolling (rotating) a register to the right or to the left is performed with the ~> and <~ keywords respectively. Following the keyword you must provide a literal number that defines how many bits to roll. For example:
// Roll EAX to the left by 12 bits.
EAX <~ 12
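
To complement the two examples above, the sketch below shows the opposite directions and the value each operation is expected to leave in a 32-bit register. The initial values are arbitrary.

```xsharp
// Shift EAX to the left by 4 bits.
EAX = $01
EAX << 4
// EAX is expected to hold $10 here.
// Roll EBX to the right by 8 bits. The bits leaving on the right re-enter on the left.
EBX = $FF
EBX ~> 8
// EBX is expected to hold $FF000000 here.
```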

Comparison

The classical comparison operators are supported:
< > = <= >= !=.
The exact forms supported in if statements are defined by two collections in the compiler source code, mCompareOps and mCompares. The while statement only supports the mCompares style.

Pure comparison

Sometimes you want to compare a register content for equality with a literal number, a variable content or a constant. You can do this with the ?= operator. The left side of the operator is the register name, while the right side is the value to be compared with. As a result of this operation, the processor context flags (sign, overflow, equality and carry) are set according to the comparison result.
// Compare EAX register content with literal value 812.
EAX ?= 812

You may also wish to test some specific bits of the register value and not the register value as a whole. This is where you use the ?& operator. Once again the processor context flags are updated with the result of the bitwise AND of the register value and the compared value.
// Test whether the fourth least significant bit of EAX register is set.
EAX ?& $08

Control flow instructions

Branching

The goto keyword lets you perform unconditional branching. Following the keyword you must name the target label. For example :
// Assuming a somewhereElse label is defined.
goto somewhereElse

The if keyword lets you perform conditional branching. Following the keyword and on the same line you must provide a condition followed by either a goto statement or a return statement or you must begin a code block with an opening curly brace.
The condition itself is usually a simple comparison as described above. It can also be a test involving just a comparison operator and nothing else. This special syntax is used to directly test one of the three main flags updated by the processor on almost any instruction (sign, overflow and carry). This syntax is not recommended unless you know very well how the processor behaves. Most of the time you can use the standard syntax to achieve the same result, sometimes with a couple fewer lines of code. For example:
// A simple test with standard syntax :
if EAX > 10 return
// It is equivalent to the following, written with the special syntax:
EAX ?= 10
if > return

Notice that unlike higher level languages there is no "else" construct available.
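
An "else" branch can nevertheless be emulated with labels and goto. The sketch below is one possible arrangement; the label names IsZero and Done are hypothetical. It also shows the code block form of the if statement, assuming the closing curly brace sits on its own line as it does for functions.

```xsharp
if EAX = 0 goto IsZero
// "Else" part: executed when EAX is not 0.
EBX = 2
goto Done
IsZero:
// "Then" part: executed when EAX is 0.
EBX = 1
Done:
// Code block form: the block is executed only when the condition holds.
if EAX != 0 {
EAX--
}
```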

Looping

The while keyword only supports standard comparisons. The special syntax available with the if statement can't be used with the while statement.

The while keyword defines a loop on a simple condition. For example:
while eax < 0 {
eax = 1
}
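
A counted loop can be sketched by combining while with the register arithmetic described earlier. The choice of ECX as the counter and EAX as the accumulator is arbitrary.

```xsharp
EAX = 0
ECX = 0
while ECX < 10 {
// Add the current counter value to the accumulator.
EAX + ECX
ECX++
}
// EAX is expected to hold 45 (0 + 1 + ... + 9) here.
```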

Playing with the stack

The x86 architecture supports a stack concept that is backed by the ESP processor register. Pushing value(s) onto the stack is denoted with the + sign while popping value(s) from the stack is denoted by the - sign. You can push or pop a single register at a time by prefixing its name with the appropriate operation sign. There must not be any whitespace character between the sign and the register name. For example:
// Pop the EAX register from the stack.
-EAX

The datatype of the pushed/popped value is implied by the register name.

You can also directly push (but obviously not pop) an immediate numeric value onto the stack. If the value is defined as a constant with the const keyword, do not forget the hash sign (#) that must appear between the operation sign and the constant name. For example:
// Push the immediate value 200 onto the stack.
+200
// Push the value for the twoHundred constant onto the stack.
+#twoHundred

The default datatype for a pushed immediate value is doubleword. You can also explicitly state the datatype of the pushed value. You do this by appending an as clause at the end of the instruction, such as:
// Push the immediate value 200 onto the stack as a word (2 bytes).
+200 as word
// Push the twoHundred constant onto the stack as a single byte.
+#twoHundred as byte

Finally, there is also a convenient way to push or pop all general-purpose registers at once, with the All keyword. Once again you must prefix this keyword with the appropriate operation sign.
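
The sketch below illustrates saving and restoring registers around a piece of code. Because the stack is last in, first out, the registers are popped in the reverse order of the pushes. The +All and -All forms are written as described above, with no whitespace after the sign.

```xsharp
// Save EAX then EBX.
+EAX
+EBX
// Code that changes EAX and EBX goes here.
// Restore in reverse order: EBX first, then EAX.
-EBX
-EAX
// Save and restore all general purpose registers at once.
+All
-All
```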

Working with I/O ports

Reading and writing I/O ports is performed with the Port keyword. The port number must be set in the DX register. You can read or write a byte, a word or a doubleword at a time. The input or output data will be in AL, AX or EAX register respectively. To read a byte use the following syntax :
AL = Port[DX]
To write a doubleword use the following syntax :
Port[DX] = EAX
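
A complete write sequence first loads the port number into DX. The sketch below assumes that a byte write uses the same form with AL on the right side, by analogy with the doubleword example. The port number $3F8 (conventionally the first serial port data register on a PC) and the value $41 are only illustrative.

```xsharp
// Select the I/O port.
DX = $3F8
// Value to send (one byte).
AL = $41
// Write the byte in AL to the port selected by DX.
Port[DX] = AL
```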

Debugging helper

The checkpoint instruction lets you write a simple text to the console by directly copying the text content to the video buffer. The text must follow the keyword and be surrounded by single quotes. If it contains single quotes, they must be escaped with a backslash.
checkpoint 'This is a \'debugging\' message'

Literal assembler code

Despite our efforts you may find it necessary to write assembler code directly in your X# source code. Any source code line whose first non-whitespace character is an exclamation point is copied verbatim into the target assembler source. This may be useful for some rarely used instructions. For example:
// Hope our Execution state block in System Management RAM is valid otherwise crash-boom
! RSM

The most likely reason to emit literal assembler code is for floating-point operations, which are not supported by the X# compiler. However, this kind of operation is rarely encountered at the OS kernel level.