IR Module
The IR (Intermediate Representation) module provides data structures for representing C/C++ declarations in a parser-agnostic format.
ir
Intermediate Representation (IR) for C/C++ declarations.
This module defines the IR that all parser backends produce. The writer
consumes this IR to generate Cython .pxd files.
Design Principles
- Parser-agnostic: Works with pycparser, libclang, tree-sitter, etc.
- Intuitive composition: Types compose naturally (e.g., const char* becomes Pointer(CType("char", ["const"])))
- Complete coverage: Represents everything Cython .pxd files can express
Type Hierarchy
Type expressions form a recursive structure:
- CType - Base C type (int, unsigned long, etc.)
- Pointer - Pointer to another type (int*, char**)
- Array - Fixed or flexible array (int[10], char[])
- FunctionPointer - Function pointer type
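The recursive composition described above can be sketched with a few stand-in dataclasses. These mirror the documented field names but are not imported from autopxd, FunctionPointer is left out for brevity, and `render` is a hypothetical helper that produces a C-like spelling rather than a valid C declarator.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


# Stand-ins that mirror the documented fields; not imported from autopxd.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Pointer:
    pointee: "TypeExpr"
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Array:
    element_type: "TypeExpr"
    size: Optional[Union[int, str]] = None


TypeExpr = Union[CType, Pointer, Array]


def render(t: TypeExpr) -> str:
    """Hypothetical helper: walk the type tree and spell it in C-like notation."""
    if isinstance(t, CType):
        return " ".join([*t.qualifiers, t.name])
    if isinstance(t, Pointer):
        suffix = "".join(f" {q}" for q in t.qualifiers)
        return f"{render(t.pointee)}*{suffix}"
    size = "" if t.size is None else str(t.size)
    return f"{render(t.element_type)}[{size}]"


const_char_ptr = Pointer(CType("char", ["const"]))
char_ptr_ptr = Pointer(Pointer(CType("char")))
int_array = Array(CType("int"), 10)

print(render(const_char_ptr))  # const char*
print(render(char_ptr_ptr))    # char**
print(render(int_array))       # int[10]
```

Each wrapper holds another type expression, so multi-level pointers and arrays of pointers need no extra classes.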
Declaration Types
- Enum - Enumeration with named constants
- Struct - Struct or union with fields
- Function - Function declaration
- Typedef - Type alias
- Variable - Global variable
- Constant - Compile-time constant or macro
Example
Parse a header and inspect declarations::

    from autopxd.backends import get_backend
    from autopxd.ir import Struct, Function

    backend = get_backend()
    header = backend.parse("struct Point { int x; int y; };", "test.h")

    for decl in header.declarations:
        if isinstance(decl, Struct):
            print(f"Found struct: {decl.name}")
Header
dataclass
Container for a parsed C/C++ header file.
This is the top-level result returned by all parser backends. It contains the file path and all extracted declarations.
Example::

    from autopxd.backends import get_backend
    from autopxd.ir import Struct, Function

    code = "int foo(void);"  # any C source text
    backend = get_backend()
    header = backend.parse(code, "myheader.h")
    print(f"Parsed {len(header.declarations)} declarations from {header.path}")

    for decl in header.declarations:
        if isinstance(decl, Function):
            print(f" Function: {decl.name}")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path to the original header file. | required |
| declarations | list[Declaration] | List of extracted declarations (structs, functions, etc.). | list() |
| included_headers | set[str] | Set of header file basenames included by this header (populated by libclang backend only). | set() |
Source code in autopxd/ir.py
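Because declarations is a flat list of mixed IR objects, a common first step is to tally or group them by class. The sketch below uses simplified stand-in dataclasses (only the fields it needs, not the real autopxd classes), and `count_by_kind` is a hypothetical helper.

```python
from collections import Counter
from dataclasses import dataclass, field


# Simplified stand-ins; the real classes carry more fields.
@dataclass
class Function:
    name: str


@dataclass
class Struct:
    name: str


@dataclass
class Header:
    path: str
    declarations: list = field(default_factory=list)
    included_headers: set = field(default_factory=set)


def count_by_kind(header: Header) -> Counter:
    """Hypothetical helper: tally declarations by their IR class name."""
    return Counter(type(decl).__name__ for decl in header.declarations)


header = Header("shapes.h", [Struct("Point"), Function("area"), Function("scale")])
counts = count_by_kind(header)
print(counts["Function"], counts["Struct"])  # 2 1
```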
CType
dataclass
A C type expression representing a base type with optional qualifiers.
This is the fundamental building block for all type representations.
Qualifiers like const, volatile, unsigned are stored separately
from the type name for easier manipulation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The base type name (e.g., | required |
| qualifiers | list[str] | Type qualifiers (e.g., | list() |
Pointer
dataclass
Pointer to another type.
Represents pointer types with optional qualifiers. Pointers can be
nested to represent multi-level indirection (e.g., char**).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pointee | Union[CType, Pointer, Array, FunctionPointer] | The type being pointed to. | required |
| qualifiers | list[str] | Qualifiers on the pointer itself (e.g., | list() |
Array
dataclass
Fixed-size or flexible array type.
Represents C array types, which can have a fixed numeric size, a symbolic size (macro or constant), or be flexible (incomplete).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| element_type | Union[CType, Pointer, Array, FunctionPointer] | The type of array elements. | required |
| size | Optional[Union[int, str]] | Array size - an integer for fixed size, a string for symbolic/expression size (e.g., | None |
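The three size forms can be told apart by the Python type of size. This sketch assumes, based on the description above and the None default, that None marks a flexible array; the classes are stand-ins, and both `size_kind` and the macro name BUF_SIZE are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Stand-ins that mirror the documented fields; not imported from autopxd.
@dataclass
class CType:
    name: str


@dataclass
class Array:
    element_type: CType
    size: Optional[Union[int, str]] = None


def size_kind(arr: Array) -> str:
    """Hypothetical helper: classify the three documented size forms."""
    if arr.size is None:
        return "flexible"
    if isinstance(arr.size, int):
        return "fixed"
    return "symbolic"


print(size_kind(Array(CType("int"), 10)))           # fixed
print(size_kind(Array(CType("char"), "BUF_SIZE")))  # symbolic
print(size_kind(Array(CType("char"))))              # flexible
```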
FunctionPointer
dataclass
Function pointer type.
Represents a pointer to a function with a specific signature. Used for callbacks, vtables, and function tables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| return_type | Union[CType, Pointer, Array, FunctionPointer] | The function's return type. | required |
| parameters | list[Parameter] | List of function parameters. | list() |
| is_variadic | bool | True if the function accepts variable arguments (ends with | False |
Field
dataclass
Struct or union field declaration.
Represents a single field within a struct or union definition.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The field name. | required |
| type | TypeExpr | The field's type expression. | required |

Examples
Simple field::

    x_field = Field("x", CType("int"))  # int x

Pointer field::

    data = Field("data", Pointer(CType("void")))  # void* data

Array field::

    buffer = Field("buffer", Array(CType("char"), 256))  # char buffer[256]
Parameter
dataclass
Function parameter declaration.
Represents a single parameter in a function signature. Parameters may be named or anonymous (common in prototypes).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | Optional[str] | Parameter name, or None for anonymous parameters. | required |
| type | Union[CType, Pointer, Array, FunctionPointer] | The parameter's type expression. | required |

Examples
Named parameter::

    x_param = Parameter("x", CType("int"))  # int x

Anonymous parameter::

    anon = Parameter(None, Pointer(CType("void")))  # void*

Complex type::

    callback = Parameter("fn", FunctionPointer(CType("void"), []))
EnumValue
dataclass
Single enumeration constant.
Represents one named constant within an enum definition.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The constant name. | required |
| value | Optional[Union[int, str]] | The constant's value - an integer for explicit values, a string for expressions (e.g., | None |
Enum
dataclass
Enumeration declaration.
Represents a C enum type with named constants. Enums may be named or anonymous (used in typedefs or inline).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | Optional[str] | The enum tag name, or None for anonymous enums. | required |
| values | list[EnumValue] | List of enumeration constants. | list() |
| is_typedef | bool | True if this enum came from a typedef declaration. | False |
| location | Optional[SourceLocation] | Source location for error reporting. | None |

Examples
Named enum::

    color = Enum("Color", [
        EnumValue("RED", 0),
        EnumValue("GREEN", 1),
        EnumValue("BLUE", 2),
    ])

Anonymous enum (typically used with typedef)::

    anon = Enum(None, [EnumValue("FLAG_A", 1), EnumValue("FLAG_B", 2)])
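Consumers that need concrete numbers may have to fill in values that were never written out. The sketch below assumes that a value of None means the constant had no explicit initializer, and applies C's rule that such a constant is the previous one plus one, starting from zero. The classes are stand-ins and `resolve_values` is a hypothetical helper, not part of autopxd.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


# Stand-ins that mirror the documented fields; not imported from autopxd.
@dataclass
class EnumValue:
    name: str
    value: Optional[Union[int, str]] = None


@dataclass
class Enum:
    name: Optional[str]
    values: list[EnumValue] = field(default_factory=list)


def resolve_values(enum: Enum) -> dict[str, int]:
    """Hypothetical helper: apply C's implicit numbering to integer-valued enums."""
    resolved = {}
    next_value = 0
    for item in enum.values:
        if isinstance(item.value, str):
            # Expression strings would need evaluation, which this sketch skips.
            raise ValueError(f"{item.name} has an expression value: {item.value!r}")
        if item.value is not None:
            next_value = item.value
        resolved[item.name] = next_value
        next_value += 1
    return resolved


color = Enum("Color", [EnumValue("RED"), EnumValue("GREEN", 5), EnumValue("BLUE")])
print(resolve_values(color))  # {'RED': 0, 'GREEN': 5, 'BLUE': 6}
```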
Struct
dataclass
Struct or union declaration.
Represents a C struct or union type definition. Both use the same
IR class with is_union distinguishing between them.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | Optional[str] | The struct/union tag name, or None for anonymous types. | required |
| fields | list[Field] | List of member fields. | list() |
| methods | list[Function] | List of methods (for C++ classes only). | list() |
| is_union | bool | True for unions, False for structs. | False |
| is_cppclass | bool | True for C++ classes (uses | False |
| is_typedef | bool | True if this came from a typedef declaration. | False |
| location | Optional[SourceLocation] | Source location for error reporting. | None |

Examples
Simple struct::

    point = Struct("Point", [
        Field("x", CType("int")),
        Field("y", CType("int")),
    ])

Union::

    data = Struct("Data", [
        Field("i", CType("int")),
        Field("f", CType("float")),
    ], is_union=True)

C++ class with method::

    widget = Struct("Widget", [
        Field("width", CType("int")),
    ], methods=[
        Function("resize", CType("void"), [
            Parameter("w", CType("int")),
            Parameter("h", CType("int")),
        ])
    ], is_cppclass=True)

Anonymous struct::

    anon = Struct(None, [Field("value", CType("int"))])
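To illustrate how is_union can steer output, here is a rough sketch that turns a named struct or union into a Cython-style block. It is not autopxd's writer: the classes are stand-ins, `to_pxd` is a hypothetical helper, and it handles only fields whose type is a plain CType.

```python
from dataclasses import dataclass, field
from typing import Optional


# Stand-ins that mirror a subset of the documented fields.
@dataclass
class CType:
    name: str


@dataclass
class Field:
    name: str
    type: CType


@dataclass
class Struct:
    name: Optional[str]
    fields: list[Field] = field(default_factory=list)
    is_union: bool = False


def to_pxd(struct: Struct) -> str:
    """Hypothetical helper: emit a Cython-style block for a named struct or union."""
    keyword = "union" if struct.is_union else "struct"
    lines = [f"cdef {keyword} {struct.name}:"]
    lines += [f"    {member.type.name} {member.name}" for member in struct.fields]
    if not struct.fields:
        lines.append("    pass")  # an empty body still needs a statement
    return "\n".join(lines)


point = Struct("Point", [Field("x", CType("int")), Field("y", CType("int"))])
print(to_pxd(point))
```

Anonymous types (name set to None), pointer or array fields, and C++ classes would each need extra handling that this sketch omits.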
Function
dataclass
Function declaration.
Represents a C function prototype or declaration. Does not include the function body (declarations only).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The function name. | required |
| return_type | TypeExpr | The function's return type. | required |
| parameters | list[Parameter] | List of function parameters. | list() |
| is_variadic | bool | True if the function accepts variable arguments. | False |
| location | Optional[SourceLocation] | Source location for error reporting. | None |

Examples
Simple function::

    exit_fn = Function("exit", CType("void"), [
        Parameter("status", CType("int"))
    ])

With return value::

    strlen_fn = Function("strlen", CType("size_t"), [
        Parameter("s", Pointer(CType("char", ["const"])))
    ])

Variadic function::

    printf_fn = Function(
        "printf",
        CType("int"),
        [Parameter("fmt", Pointer(CType("char", ["const"])))],
        is_variadic=True
    )
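One way to check that a Function carries everything needed is to rebuild a one-line prototype from it. The sketch below uses stand-in dataclasses limited to base types and pointers; `spell` and `prototype` are hypothetical helpers, and qualifiers on the pointer itself are ignored.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


# Stand-ins that mirror a subset of the documented fields.
@dataclass
class CType:
    name: str
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Pointer:
    pointee: Union[CType, "Pointer"]
    qualifiers: list[str] = field(default_factory=list)


@dataclass
class Parameter:
    name: Optional[str]
    type: Union[CType, Pointer]


@dataclass
class Function:
    name: str
    return_type: Union[CType, Pointer]
    parameters: list[Parameter] = field(default_factory=list)
    is_variadic: bool = False


def spell(t: Union[CType, Pointer]) -> str:
    """Hypothetical helper: C-like spelling for base types and pointers only."""
    if isinstance(t, Pointer):
        return spell(t.pointee) + "*"
    return " ".join([*t.qualifiers, t.name])


def prototype(fn: Function) -> str:
    """Hypothetical helper: rebuild a one-line prototype from the IR."""
    params = [f"{spell(p.type)} {p.name}" if p.name else spell(p.type)
              for p in fn.parameters]
    if fn.is_variadic:
        params.append("...")
    return f"{spell(fn.return_type)} {fn.name}({', '.join(params)});"


printf_fn = Function(
    "printf",
    CType("int"),
    [Parameter("fmt", Pointer(CType("char", ["const"])))],
    is_variadic=True,
)
print(prototype(printf_fn))  # int printf(const char* fmt, ...);
```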
Typedef
dataclass
Type alias declaration.
Represents a C typedef that creates an alias for another type. Common patterns include aliasing primitives, struct tags, and function pointer types.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The new type name being defined. | required |
| underlying_type | TypeExpr | The type being aliased. | required |
| location | Optional[SourceLocation] | Source location for error reporting. | None |

Examples
Simple alias::

    size_t = Typedef("size_t", CType("long", ["unsigned"]))

Struct typedef::

    point_t = Typedef("Point", CType("struct Point"))

Function pointer typedef::

    callback_t = Typedef("Callback", FunctionPointer(
        CType("void"),
        [Parameter("data", Pointer(CType("void")))]
    ))
Variable
dataclass
Global variable declaration.
Represents a global or extern variable declaration. Does not include local variables (which are not exposed in header files).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The variable name. | required |
| type | TypeExpr | The variable's type. | required |
| location | Optional[SourceLocation] | Source location for error reporting. | None |

Examples
Extern variable::

    errno_var = Variable("errno", CType("int"))

Const string::

    version = Variable("version", Pointer(CType("char", ["const"])))

Array variable::

    lookup_table = Variable("table", Array(CType("int"), 256))
Constant
dataclass
Compile-time constant declaration.
Represents #define macros with constant values or const
variable declarations. Only backends that support macro extraction
(e.g., libclang) can populate macro constants.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | The constant name. | required |
| value | Optional[Union[int, float, str]] | The constant's value - an integer, float, or string expression. None if the value cannot be determined. | None |
| type | Optional[CType] | For typed constants ( | None |
| is_macro | bool | True if this is a | False |
| location | Optional[SourceLocation] | Source location for error reporting. | None |

Examples
Numeric macro::

    size = Constant("SIZE", 100, is_macro=True)

Expression macro::

    mask = Constant("MASK", "1 << 4", is_macro=True)

Typed const::

    max_val = Constant("MAX_VALUE", 255, type=CType("int"))

String macro::

    version = Constant("VERSION", '"1.0.0"', is_macro=True)
SourceLocation
dataclass
Location in source file for error reporting and filtering.
Used to track where declarations originated, enabling:
- Better error messages during parsing
- Filtering declarations by file (e.g., exclude system headers)
- Source mapping for debugging
Example::

    loc = SourceLocation("myheader.h", 42, 5)
    print(f"Declaration at {loc.file}:{loc.line}")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file | str | Path to the source file. | required |
| line | int | Line number (1-indexed). | required |
| column | Optional[int] | Column number (1-indexed), or None if unknown. | None |
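The file-based filtering mentioned above might look like the following sketch, which keeps only declarations recorded as coming from one file. The classes are simplified stand-ins, `declared_in` is a hypothetical helper, and the paths are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-ins that mirror the documented fields; not imported from autopxd.
@dataclass
class SourceLocation:
    file: str
    line: int
    column: Optional[int] = None


@dataclass
class Function:
    name: str
    location: Optional[SourceLocation] = None


def declared_in(declarations, path):
    """Hypothetical helper: keep declarations whose location names `path`."""
    return [d for d in declarations
            if d.location is not None and d.location.file == path]


decls = [
    Function("my_init", SourceLocation("myheader.h", 42, 5)),
    Function("printf", SourceLocation("/usr/include/stdio.h", 332)),
    Function("mystery"),  # no location recorded
]
own = declared_in(decls, "myheader.h")
print([d.name for d in own])  # ['my_init']
```

Declarations without a location are dropped here; a caller that prefers to keep them would need to relax the None check.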
ParserBackend
Bases: Protocol
Protocol defining the interface for parser backends.
All parser backends must implement this protocol to be usable with autopxd2.
Backends are responsible for translating from their native AST format
(pycparser, libclang, etc.) to the common Header IR format.
Available Backends
- pycparser - Pure Python C99 parser (default)
- libclang - LLVM clang-based parser with C++ support
Example::

    from autopxd.backends import get_backend

    # Get default backend
    backend = get_backend()

    # Get specific backend
    libclang = get_backend("libclang")

    # Parse code
    header = backend.parse("int foo(void);", "test.h")
name
property
Human-readable name of this backend (e.g., "pycparser").
supports_macros
property
Whether this backend can extract #define constants.
supports_cpp
property
Whether this backend can parse C++ code.
parse(code, filename, include_dirs=None, extra_args=None)
Parse C/C++ code and return the IR representation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| code | str | Source code to parse. | required |
| filename | str | Name of the source file. Used for error messages and | required |
| include_dirs | Optional[list[str]] | Directories to search for | None |
| extra_args | Optional[list[str]] | Additional arguments for the preprocessor/compiler. Format is backend-specific. | None |
Returns:
| Type | Description |
|---|---|
| Header | Parsed header containing all extracted declarations. |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If parsing fails due to syntax errors. |
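Since ParserBackend is a Protocol, a class can satisfy it by shape alone, without inheriting from it. The sketch below restates the documented members in a stand-in protocol and pairs it with a hypothetical NullBackend that extracts nothing. Marking the stand-in with runtime_checkable is an assumption made here so that isinstance works; the source does not say whether the real protocol supports that check.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol, runtime_checkable


@dataclass
class Header:  # stand-in with a subset of the documented fields
    path: str
    declarations: list = field(default_factory=list)


@runtime_checkable
class ParserBackend(Protocol):  # stand-in restating the documented members
    @property
    def name(self) -> str: ...

    @property
    def supports_macros(self) -> bool: ...

    @property
    def supports_cpp(self) -> bool: ...

    def parse(self, code: str, filename: str,
              include_dirs: Optional[list[str]] = None,
              extra_args: Optional[list[str]] = None) -> Header: ...


class NullBackend:
    """Hypothetical backend that extracts nothing; it only shows the required shape."""

    name = "null"
    supports_macros = False
    supports_cpp = False

    def parse(self, code, filename, include_dirs=None, extra_args=None):
        # A real backend would translate its native AST into IR declarations here.
        return Header(filename)


backend = NullBackend()
print(isinstance(backend, ParserBackend))  # structural check, no inheritance
header = backend.parse("int foo(void);", "test.h")
print(header.path, len(header.declarations))
```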