
My little Virtual Machine Project


Hi, Guys

I've been trying to learn about programming languages and how they're made, so I decided to write a little VM. I've been interested in languages for years but never really had the time, or never finished a project because I always have new ideas and start things without finishing them :). Anyway, I've been writing this VM over the past two weeks after reading through some texts on the internet. It's very basic right now but it does run a few examples. Check it out and let me know what you think and how I can improve it.

Noting, for relative newcomers to the industry, that VM in this case has the classical interpreter/JIT meaning rather than some sort of containerisation.

Well done. You might find investigating Meta-2 and possibly Tree Meta rewarding for the compiler part.

Meta-2 will parse an input program and directly emit assembler-like mnemonics, so it is an easy fit for what you've got. There are multiple implementations floating around; the best thing is usually to use one of these as a reference implementation and redo it in the language etc. of your choice.

Tree Meta builds on that to parse the input into a tree in memory, which is then "unparsed" to (e.g.) your mnemonics. The notable part is that it also includes ways that the tree can be optimised by rules in the language description. Implementation information is fragmentary: there are multiple people scraping around trying to find stuff, but some was lost (together with the rest of somebody's home and belongings) in one of the California fires.
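To make the parse, optimise, unparse idea concrete, here is a rough Python sketch. It is not Tree Meta syntax and not taken from any Tree Meta implementation; the tuple tree shape, the single constant-folding rule and the PUSH/LOAD/ADD/MUL mnemonics are all made up for illustration.

```python
# Hypothetical tree shape: ("num", 3), ("var", "x"), or ("add"/"mul", left, right).

def fold(node):
    """One rewrite rule: collapse an operator whose operands are both constants."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node[0], fold(node[1]), fold(node[2])
    if left[0] == "num" and right[0] == "num":
        value = left[1] + right[1] if op == "add" else left[1] * right[1]
        return ("num", value)
    return (op, left, right)

def unparse(node):
    """Walk the tree post-order and emit stack-machine mnemonics (made-up names)."""
    if node[0] == "num":
        return [f"PUSH {node[1]}"]
    if node[0] == "var":
        return [f"LOAD {node[1]}"]
    return unparse(node[1]) + unparse(node[2]) + [node[0].upper()]

# x + 2 * 3, as a parser might have built it
tree = ("add", ("var", "x"), ("mul", ("num", 2), ("num", 3)))
print(unparse(fold(tree)))
```

Without the fold step the same tree would unparse to five mnemonics instead of three, which is the sort of saving the optimisation rules are there for.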

Otherwise there's plenty of other compiler-writing toolkits, but I find the above easy to understand and they work for me.


Hi, thanks for the info MarkMLl, I will look into some of the things you suggested. One thing that's bugging me is how to get my VM to handle outputting string constants. At the moment I can print chars, but they're just pushed onto the stack in reverse order and then popped off.

How would I deal with real strings of different lengths? I heard something about a string constant pool; is this a different area from the bytecode? Or should my strings be emitted into the bytecode while assembling the source code? I am really struggling to think how to do it. I think I'll have to do some digging on the net, but if you have an idea I'd be glad to hear it.

Hope you like my VM so far.

Well, something like Stage2 would have hailed lists of individual characters (including pointer overheads) as the One True Way and John McCarthy their prophet, but to be honest I don't see any problem with declaring an opcode as incorporating a string parameter and leaving the rest to (the tool used to code) the interpreter.
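As a minimal sketch of an opcode that incorporates a string parameter, assuming a byte-oriented VM: the opcode is followed by a one-byte length and then the string bytes inline in the bytecode. The opcode numbers and the names emit_print and run are invented for this example and have nothing to do with the VM posted above.

```python
OP_HALT = 0x00    # hypothetical opcode: stop the machine
OP_PRINTS = 0x01  # hypothetical opcode: print the inline string operand

def emit_print(text):
    """Assemble one PRINTS instruction: opcode, length byte, then the raw bytes."""
    data = text.encode("utf-8")
    if len(data) > 255:
        raise ValueError("string too long for a one-byte length prefix")
    return bytes([OP_PRINTS, len(data)]) + data

def run(code):
    """Interpret the bytecode and return everything it printed."""
    out = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == OP_HALT:
            break
        elif op == OP_PRINTS:
            length = code[pc]          # operand: how many bytes follow
            pc += 1
            out.append(code[pc:pc + length].decode("utf-8"))
            pc += length               # skip over the string to the next opcode
        else:
            raise ValueError(f"unknown opcode {op}")
    return "".join(out)

program = emit_print("Hello, ") + emit_print("world") + bytes([OP_HALT])
print(run(program))
```

The cost of this layout is that instructions have different lengths, so the interpreter has to read the length byte before it knows where the next opcode starts.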

But if you wanted the machine (i.e. binary) representation of each opcode+parameter to have a consistent length, what you could do would be examine your use cases and determine the optimal length (by analogy, Wirth and others looked at encoded numeric literals when designing RISC processors). So if you decided that almost all strings were shorter than 16 characters and looked at additional opcode overhead, you might decide that an opcode carried up to 12 characters with it and that the compiler would accommodate longer strings either by concatenating 12-character chunks or by appending individual characters.
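The fixed-length variant could look roughly like this in Python, assuming every instruction is exactly 14 bytes: one opcode byte, one count byte and 12 data bytes padded with zeros. The opcode number, the layout and the function names are assumptions for illustration only, and the sketch is limited to ASCII so that a chunk boundary never splits a character.

```python
CHUNK = 12          # characters carried by one instruction (the figure from above)
WIDTH = 2 + CHUNK   # opcode byte + count byte + padded data
OP_APPEND = 0x02    # hypothetical opcode: append the chunk to the output buffer

def compile_string(text):
    """Split a string into 12-character chunks, one fixed-width instruction each."""
    data = text.encode("ascii")
    code = bytearray()
    for i in range(0, len(data), CHUNK):
        piece = data[i:i + CHUNK]
        code += bytes([OP_APPEND, len(piece)]) + piece.ljust(CHUNK, b"\0")
    return bytes(code)

def run(code):
    """Step through the code WIDTH bytes at a time and rebuild the string."""
    buf = bytearray()
    for pc in range(0, len(code), WIDTH):
        op, count = code[pc], code[pc + 1]
        if op != OP_APPEND:
            raise ValueError(f"unknown opcode {op}")
        buf += code[pc + 2:pc + 2 + count]   # the count byte drops the padding
    return buf.decode("ascii")

text = "Hello, virtual machine!"   # 23 characters, so two instructions
code = compile_string(text)
print(len(code), run(code))
```

Because every instruction has the same width, the program counter can advance by a constant, at the price of some padding on the last chunk of each string.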


