-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New and improved Python bindings #1629
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, thanks for your brilliant work!
The overall design fits my taste but there are a few problems to be fixed. See my comments below.
if dllname not in loaded_dlls: | ||
lib_file = path / dllname | ||
|
||
if str(path.parent) == '.' or lib_file.exists(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
str(path.parent) == '.'
seems not equavelent to original logic? I suppose it should be str(path)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understood it, the original logic was trying to figure out whether this is a relative library path (i.e. either a filename without a parent directory, or just not an absolute path). Testing for parent == "." is the way to determine this. Did you encounter any problem loading the library this way? (I admit I haven't tested it thoroughtly)
I am not fully comfortable with how the library lookup code ended up, but considering the previous implementation I think it looks very nice after all..
bindings/python/unicorn/unicorn.py
Outdated
|
||
def __new__(cls, arch: int, mode: int): | ||
# prevent direct instantiation of unicorn subclasses | ||
assert cls is Uc, f'{cls.__name__} is not meant to be instantiated directly' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kills the possibility of inheriting Uc
class, which I did in implementing multitask. A quick PoC:
from unicorn import *
class A(Uc):
def __init__(self, arch: int, mode: int):
super().__init__(arch, mode)
a = A(UC_ARCH_X86, UC_MODE_32)
I understand you are trying to return a subclass to handle arch-specific things, however I doubt why not choose pimpl
design? e.g. Uc
holds a reference to an arch-specific object instead of being a subclass. IMHO, hacking with __new__
is a bit dangerous, especially we care about the compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not think of any reasonable usecase for extending the Uc
class, but if you think it is necessary I can patch it to allow user-defined subclasses. Regarding the multitask implementation, I don't know how it looks like - but perhaps we could integrate this into this model, still "hiding" the subclasses from the user..?
About using __new__
to create the required subclass: that is not a hack - that is a common way to use this special method. The idea here was to organize everything in OOP (with all the benefits coming from it) and at the same time don't let the user be bothered about which class to instantiate - they just init Uc
, which picks the right subclass for them. The user should not be aware (or care) of the specific subclass.
The incapsulation in the PIMPL design makes things more complicated and packs the code with trivial wrapper calls (i.e. forwarding the the exact same parameters to a method with exact same name). This design pattern is more reasonable for inflexible languages like C++, rather than Python. However, I don't see how it helps with the multithreading implementation because we'll still have a predefined list of subclasses to incapsulate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This multitask code is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played around with this code quite a bit in order to find a way to keep the clean OOP approach and still allow users to extend the Uc
class. As far as I can tell we have two options for that:
- Require external
Uc
subclasses to be implemented as "pimpl", assuming this is probably a rare use case anyway. Looking at the multitask code, it seems that such a change would not affect it too much. - Require external
Uc
subclasses to be decorated with something like@ucsubclass
. That way is a bit hacky and involves some meta-programming black magic, but it looks clean from user perspective. Taking your example above:
@ucsubclass
class A(Uc):
def __init__(self, arch: int, mode: int):
super().__init__(arch, mode)
I lean towards the 2nd option; what do you think?
bindings/python/unicorn/unicorn.py
Outdated
) | ||
|
||
|
||
__all__ = ['Uc', 'UcContext'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a bit aggressive as stated above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Importing *
from a library is considered a bad practice and should be avoided.
As far as I can tell the Unicorn bindings library is organized in way that __init__.py
already imports all the needed modules and "flatten" the import path (i.e. you can import everything directly from unicorn
; if you can't import - it means you shouldn't).
Keep in mind that the previous implementation kept everything in one big file, so there was no consideration of a clean namespace. Having said that, I think that moving to a new major version (uc2) is a good oportunity to make such changes that could break some scripts - but at the same time will be trivial to fix by the users by adding a few more imports.
return cls(*param) | ||
|
||
|
||
class UcAArch32(Uc): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As stated further in the review below, my thought is that, such the arch-specific class might be used as a member of the Uc
object. This in fact is pimpl
design instead of your factory
design if I understand it correctly.
See my PoC in #1633 |
@wtdcode, any thoughts or comments, considering this comment and this commit? |
we have been thinking about merging this after 2.0.1 release. can you add some sample code under bindings/python/ to demonstrate the new API? |
The external facing API was not modified, it is only the internal implementation that was changed. Except minor additions (e.g. some archs now support more regs) nothing really changed in terms of functionality. |
oh i thought you added some more Pythonic API
|
merged and thanks for your patient. I'm doing a careful comparison to ensure nothing breaks. |
It passes all local tests, if that counts. |
Our tests for Python bindings are not complete and thus I'm looking into the history to manually port missing features. |
Well, every time there was an update to the Python binding I changed my code to accommodate. Anyway, if something is missing or incompatible, let me know and I'll fix it. |
Re-implemented Unicorn Python bindings while remaining compatible with the original bindings API.
Highlights:
Changes:
UcError
now provides additional information, when applicable, to provide the users with better error context (e.g. the target address and access size for invalid memory accesses)