Write a compiler using Python

Writing a Python compiler for practice


Recently I’ve been reading quite a bit about CPUs and architectures: mainly opcodes, integrated circuits, etc. I’ve been a Python developer for a few years, and I’d like to get some practice in writing machine code.

I thought for fun I’d compile a very simple Python script into machine code as a way to practice it. The script is as follows:

    a = 2
    b = 3
    c = a + b
    print c

I’m writing the compiler in Python, because I’m not as good at C as I am at Python. I’ve looked around a little, and I have the following Python libraries at my disposal, which might help, i.e.

I still have to find the opcodes for the Intel Core i5, but that should be easy too.

My question is as follows:

1) How do I write the opcode to the file? In other words, assuming the opcode for setting a register to contain the value 2 is 0010, how do I write this as the first four numbers in the program’s first line of execution?

2) How do I tell the OS, either OS X or Ubuntu, to load the program into physical memory? I’m assuming that the first thing a compiler does is write instructions for the OS onto the resulting binary file?

3) Any resources that you might know of that can help me would be appreciated.

That is quite a project you are planning there. In addition to learning how a compiler works, you also need to read up on loadable file formats like ELF, and tons of information on operating-system details.
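On question 1, an opcode is just a sequence of bytes, so writing it comes down to opening the output file in binary mode. Here is a minimal sketch, assuming x86-64, where `mov eax, 2` encodes as the byte `0xB8` followed by a 32-bit little-endian immediate. The helper names and the file name `out.bin` are made up for illustration. The result is a raw blob of instructions, not something the OS will load, because the ELF (or Mach-O) headers are missing.

```python
import struct


def encode_mov_eax_imm32(value):
    """Encode `mov eax, imm32`: opcode 0xB8 plus a little-endian 32-bit immediate."""
    return struct.pack("<BI", 0xB8, value)


def write_machine_code(path, code):
    """Write raw bytes to `path`; binary mode avoids newline or encoding translation."""
    with open(path, "wb") as f:
        f.write(code)


code = encode_mov_eax_imm32(2)
write_machine_code("out.bin", code)  # "out.bin" is a hypothetical file name
print(code.hex())
```

Opening the file with `"wb"` is the important part. In text mode Python would expect `str` and could translate line endings.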

I would suggest that you emit an assembly file as the output of your compiler. Then you could use an existing assembler to convert the file into machine code. In fact, this is what most C compilers (including GCC) do "under the surface".

EDIT: The output of a compiler or an assembler is typically an object file. This is later combined with other object files by a linker. Writing the entire tool chain (compiler, assembler, linker, and other associated tools) would easily take multiple man-years. In this light, I don’t think that you should see a straightforward solution like using an existing assembler and linker as cheating.
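The "emit assembly" suggestion can be sketched as follows. This is a toy, not a real compiler. It assumes Python 3.8+, so the script’s `print c` is written as `print(c)`. It assumes NASM syntax on x86-64 with the System V calling convention. It also assumes a hypothetical runtime helper `print_int` that you would have to supply, for example as a small C function. The generated text has not been run through an assembler here.

```python
import ast


def compile_to_asm(source):
    """Translate `x = <int>`, `x = a + b` and `print(x)` into NASM-style x86-64 text."""
    variables = []  # names that need storage, in order of first assignment
    text = []       # instructions forming the body of main

    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.Assign):
            name = stmt.targets[0].id
            if name not in variables:
                variables.append(name)
            value = stmt.value
            if isinstance(value, ast.Constant):
                text.append(f"    mov qword [{name}], {value.value}")
            elif isinstance(value, ast.BinOp) and isinstance(value.op, ast.Add):
                text.append(f"    mov rax, [{value.left.id}]")
                text.append(f"    add rax, [{value.right.id}]")
                text.append(f"    mov [{name}], rax")
            else:
                raise NotImplementedError(ast.dump(value))
        elif (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)
              and getattr(stmt.value.func, "id", None) == "print"):
            text.append(f"    mov rdi, [{stmt.value.args[0].id}]")
            text.append("    call print_int  ; hypothetical runtime helper")
        else:
            raise NotImplementedError(ast.dump(stmt))

    lines = ["default rel", "extern print_int", "global main", "section .bss"]
    lines += [f"{name}: resq 1" for name in variables]
    lines += ["section .text", "main:", "    sub rsp, 8  ; keep the stack 16-byte aligned"]
    lines += text
    lines += ["    add rsp, 8", "    xor eax, eax", "    ret"]
    return "\n".join(lines)


asm = compile_to_asm("a = 2\nb = 3\nc = a + b\nprint(c)")
print(asm)
```

The output would then go through an existing assembler and linker, which also take care of the object-file and executable formats.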


Compiling Python isn’t easy. You could look at PyPy, which has a just-in-time compiler.

Another option is to start with the Python bytecode that the standard CPython interpreter saves in a .pyc file when a module is imported. Bytecode has a limited number of instructions for which you’d have to generate assembly/executable code for your CPU.

Note that you would also have to write a large amount of code to implement all built-in types and functions!
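To see what that bytecode looks like without digging into a .pyc file, the standard `dis` module can list the instructions of a compiled code object. A small sketch, using the Python 3 form `print(c)` of the script. The exact opcode names vary between CPython versions (for example, the addition is `BINARY_ADD` on older releases and `BINARY_OP` on newer ones), so treat any listing as version-specific.

```python
import dis

source = "a = 2\nb = 3\nc = a + b\nprint(c)"
code = compile(source, "<script>", "exec")

# Each instruction has a byte offset, an opcode name and an optional argument.
for instr in dis.get_instructions(code):
    print(instr.offset, instr.opname, instr.argrepr)
```

A bytecode-based compiler would need one code-generation routine per opcode name that shows up in such a listing.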


How to write a Python debugger/editor

Sorry for the kind of general question. More details about what I want:

I want the user to be able to write some Python code and execute it. Once there is an exception which is not handled, I want the debugger to pause the execution, show information about the current state/environment/stack/exception and make it possible to edit the code.

I only want the specific code block where the exception occurred to be editable, and nothing else (for now). I.e. if it occurred inside a for loop, I only want the code block inside the for loop to be editable. It should be the latest/most recent code block which is in the user editor scope (and not inside some other libs or the Python libs). Under these conditions, it is always clear which code block to edit.

I already tried to investigate a bit how to do this, though I feel a bit lost.

The Python traceback doesn’t give me directly the code block, just the function and the code line. I could calculate that back but that seems a bit hacky to me. Better (and more natural) would be if I could somehow get a reference to the code block in the AST of the code.

To get the AST (and operate on it, i.e. manipulate/edit), I probably will use the compiler (which is deprecated?) and/or the parser module. Or the ast module. Not sure though how I could recompile special nodes / code blocks in the AST. Or if I only can recompile whole functions.

Playing around with ast and compile (built-in) it seems that you could possibly use the NodeTransformer to modify some nodes. You can also edit them manually if you know what you’re looking for.

test.py:

    print 'Dumb Guy'
    x = 4 + 4
    print x * 3

change.py:

    import ast

    with open('test.py') as f:
        expr = f.read()

    e = ast.parse(expr)
    e.body[0].values[0].s = 'Cool Guy'         # Replace the string
    e.body[1].targets[0].id = 'herring'        # Change x to herring
    e.body[2].values[0].left.id = 'herring'    # Change reference to x to reference to herring
    c = compile(e, '', 'exec')
    exec(c)
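The snippet above uses Python 2 node types (`print` is a statement there). For comparison, here is a sketch of the `NodeTransformer` route on Python 3.8+, where `print` is an ordinary call and string literals are `ast.Constant` nodes. The class name `Rename` and its parameters are invented for this example.

```python
import ast


class Rename(ast.NodeTransformer):
    """Rename one variable everywhere and replace one string constant."""

    def __init__(self, old_name, new_name, old_text, new_text):
        self.old_name, self.new_name = old_name, new_name
        self.old_text, self.new_text = old_text, new_text

    def visit_Name(self, node):
        if node.id == self.old_name:
            node.id = self.new_name
        return node

    def visit_Constant(self, node):
        if node.value == self.old_text:
            node.value = self.new_text
        return node


source = "print('Dumb Guy')\nx = 4 + 4\nprint(x * 3)"
tree = Rename("x", "herring", "Dumb Guy", "Cool Guy").visit(ast.parse(source))
ast.fix_missing_locations(tree)  # only needed if new nodes were created

namespace = {}
exec(compile(tree, "<ast>", "exec"), namespace)
```

The transformer matches nodes by type instead of by position in `body`, so it keeps working when statements are added or reordered.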

You can also add code to the body this way (or replace elements in the usual way of replacing list elements):

    p = ast.parse('print "Sweet!"', mode='single')
    e.body.extend(p.body)  # p is an Interactive node; its statements are in p.body

and then just recompile and exec as before.


You can replace function definitions or single lines that way. A function definition will have its own body, so if you added some function (or loop) you could access it with

    e.body[N].body  # Replace N with the index of the FunctionDef object

However, the only way that I know of to execute a single ast object ( _ast.Print or _ast.Assign or whatever) is to do something like this:

    e2 = ast.parse('', mode='exec')
    e2.body.append(e.body[0])
    exec(compile(e2, '', 'exec'))

which seems a bit hackish to me. As far as lines go: each statement and expression node in the AST has a lineno attribute, so if you can retrieve the line number from the exception you can fairly easily figure out which statement threw the exception.
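The line-number lookup can be sketched like this on Python 3.8+ (which adds `end_lineno`). The function name `failing_statement` and the file name `<user-code>` are placeholders. The sketch only finds the statement. It does not restore the pre-exception state.

```python
import ast
import sys
import traceback

source = "x = 1\nfor i in range(3):\n    y = 10 // (1 - i)\nz = 2\n"
filename = "<user-code>"  # hypothetical name used to recognise the user's frames


def failing_statement(source, filename):
    """Run `source`; on an exception return the innermost AST statement at the failing line."""
    tree = ast.parse(source, filename)
    try:
        exec(compile(tree, filename, "exec"), {})
    except Exception:
        frames = traceback.extract_tb(sys.exc_info()[2])
        # Last frame that belongs to the user's code rather than to a library.
        lineno = [f.lineno for f in frames if f.filename == filename][-1]
        candidates = [
            node for node in ast.walk(tree)
            if isinstance(node, ast.stmt)
            and node.lineno <= lineno <= node.end_lineno
        ]
        # The innermost statement is the one spanning the fewest lines.
        return min(candidates, key=lambda n: n.end_lineno - n.lineno)
    return None


stmt = failing_statement(source, filename)
print(type(stmt).__name__, stmt.lineno)
```

Here the `for` loop and the assignment inside it both cover the failing line, and picking the shortest span selects the assignment.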

Of course this doesn’t really solve the problem of rewinding the stack to the pre-exception state, which is what you really want to do, it sounds like. However, it might be possible to do such a thing via pdb.

I wonder if the HAP Remote Debugger for Python might be of any use to you? I don’t think that they have live editing, but some of the debugging aspects might be useful nonetheless.

From what I have figured out in the meantime, this is not possible. At least not block-wise. It is possible to recompile the code per function but not per code-block because Python generates the code objects per function.

So the only way is to write your own Python compiler.
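The per-function recompilation mentioned above can be illustrated by swapping a function’s `__code__` object in CPython. This is a sketch with an invented helper name, `recompile_function`. It assumes the new source defines a function with the same name and no closure variables, and it leaves default arguments untouched.

```python
def recompile_function(func, new_source):
    """Compile `new_source` (a complete `def` with the same name) and swap it into `func`."""
    namespace = {}
    exec(compile(new_source, "<edited>", "exec"), func.__globals__, namespace)
    func.__code__ = namespace[func.__name__].__code__
    return func


def area(width, height):
    return width + height  # deliberate bug to be fixed by the edit


recompile_function(area, "def area(width, height):\n    return width * height\n")
print(area(3, 4))
```

Existing references to `area` pick up the new behaviour because the function object itself is kept and only its code is replaced. A frame that is already executing the old code is not affected, which is the limitation described above.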


Python: How to write a script to compile and run a C++ program via CMake?

Assuming I have the following project structure,

    |   CMakeLists.txt
    |   run_experiments.py
    |
    +---libs
    \---src
            main.cpp
            main.h

how can I make run_experiments.py compile the program via CMake and run it multiple times with different command-line arguments? What I’ve tried:

    # run_experiments.py
    import os

    os.system("mkdir bin build")
    os.chdir("build")
    os.system('cmake -G"Unix Makefiles" ..')
    os.system("make")
    # and so on.

But it already looks ugly, and I’m looking for the most elegant and cross-platform solution (Windows with MinGW and Linux, for example).

UPD: Added my CMakeLists.txt, which is CLion default-generated:

    cmake_minimum_required(VERSION 3.16)
    project(test_tokenizing)

    set(CMAKE_CXX_STANDARD 14)

    add_executable(test_tokenizing src/main.cpp src/main.h)

If the C++ code is not dynamically generated or something, I would recommend you build it upfront, before starting your Python program, and merely execute it from the Python program.

If you have a good reason to build it from the Python code, I created a little example for you at https://github.com/kyotov/experiments. Something like this works for run_experiments.py:

    import os
    import subprocess


    def main():
        os.makedirs('build', exist_ok=True)
        subprocess.check_call('cmake -B build -G "NMake Makefiles"', shell=True)
        subprocess.check_call('cmake --build .', shell=True, cwd='build')
        subprocess.check_call('experiments 1 2 3', shell=True, cwd='build')
        subprocess.check_call('experiments 2 3 4', shell=True, cwd='build')


    if __name__ == '__main__':
        main()

I used the following for main.cpp :

#include int main(int argc, char **argv)

And when I run python run_experiments.py I get this output:

    (base) C:\kamen\clion\experiments>python run_experiments.py
    -- Configuring done
    -- Generating done
    -- Build files have been written to: C:/kamen/clion/experiments/build
    [100%] Built target experiments
    experiments 1 2 3
    experiments 2 3 4

To get this to work, you will need to set up your environment correctly to find the tools you need. In my case I used Miniconda3 for Python and then "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvars64.bat" to get my compiler and cmake in the path.


I did not have a Linux machine handy, but if you make the -G conditional and use Unix Makefiles it will likely work on Linux (and elsewhere) too. Note that you don’t have to bother with the .exe on Windows, because the shell is smart enough to start the program even if you don’t specify it.
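Making `-G` conditional could look roughly like this. The function name `cmake_configure_command` is invented. The `-S`/`-B` options need CMake 3.13 or newer, which the project’s `cmake_minimum_required(VERSION 3.16)` already implies. For a MinGW toolchain on Windows, the "MinGW Makefiles" generator would be the one to substitute for "NMake Makefiles".

```python
import platform


def cmake_configure_command(build_dir="build", system=None):
    """Build the cmake configure command, choosing a generator per platform."""
    system = system or platform.system()
    generator = "NMake Makefiles" if system == "Windows" else "Unix Makefiles"
    return ["cmake", "-S", ".", "-B", build_dir, "-G", generator]


print(cmake_configure_command(system="Windows"))
print(cmake_configure_command(system="Linux"))
```

Returning a list means the command can be passed straight to `subprocess.check_call` without `shell=True`, which avoids quoting the generator name by hand.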

Python’s subprocess module will enable you to run external programs like CMake and the binary that you build with it. The executable name, however, will be different on Windows and Linux (with Windows having a .exe extension). You should be able to specify ./test_tokenizing to run your program, and the system should find it provided you specify the current working directory cwd.

    import os
    import platform
    import subprocess
    import sys


    def main(*args):
        current_directory = os.getcwd()
        build_dir = os.path.join(current_directory, 'build')
        if not os.path.exists(build_dir):
            os.makedirs(build_dir)

        if platform.system() == 'Windows':
            exe = os.path.join(build_dir, 'test_tokenizing.exe')
        else:
            exe = os.path.join(build_dir, 'test_tokenizing')

        cmake_args = ['-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=' + build_dir]
        cmake_args += ['-G', 'Unix Makefiles']

        subprocess.run(['cmake', ".."] + cmake_args, cwd=build_dir, check=True)
        subprocess.run(['cmake', '--build', '.'], cwd=build_dir, check=True)

        p = subprocess.run([exe, *args], cwd=build_dir, capture_output=True, text=True)
        print(p.stdout)


    if __name__ == '__main__':
        args = []
        if len(sys.argv) > 1:
            args += sys.argv[1:]
        main(*args)

(Requires Python 3.7+ for capture_output and text in subprocess.run.)

Here you can pass command line arguments to the script like this:

python run_experiments.py hello world 

and the arguments are forwarded to the function main(), which will in turn call your C++ executable with them.

Just in case ./test_tokenizing doesn’t work on some exotic OS or you want to include some platform specific compiler/build flags, I’ve put a check for the platform above and specified the full path to your C++ executable.
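The original question also asks for several runs with different arguments. One way to sketch that is a small loop over argument lists. The helper `run_experiments` is invented here, and so the example runs without a CMake build, a Python one-liner that echoes its arguments stands in for the built executable.

```python
import subprocess
import sys


def run_experiments(command, argument_sets):
    """Run `command` once per argument list and collect each run's stdout."""
    outputs = []
    for arguments in argument_sets:
        result = subprocess.run(
            list(command) + [str(a) for a in arguments],
            capture_output=True, text=True, check=True,
        )
        outputs.append(result.stdout.strip())
    return outputs


# Stand-in for the built executable: echoes its command-line arguments.
echo = [sys.executable, "-c", "import sys; print(' '.join(sys.argv[1:]))"]
outputs = run_experiments(echo, [[1, 2, 3], [2, 3, 4]])
print(outputs)
```

With a real build, `command` would be a one-element list holding the path to the executable, such as the `exe` value computed in the script above.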

