1,447 contributions in the last year
Contribution activity
April 2021
Created 73 commits in 1 repository
Created 1 repository
Created a pull request in google/iree that received 5 comments
Adding -iree-hal-materialize-interfaces2 to finally use hal.interface (mostly) as it was intended
hal.interface was intended to allow us to combine dispatch I/O into a smaller set of bindings but due to limitations of things like iree.placeholder …
+1,838 −328 • 5 comments
Opened 38 other pull requests in 1 repository
google/iree: 1 closed, 35 merged, 2 open
- Adding a getopt port that will work on bare-metal/Windows/wasm/etc.
- Exposing IREE_ASSERT as part of the public API.
- Adding a little demo library to show both sides of the ABI.
- Adding ARM-32 and x86-32 arch to elf loader.
- Fixing binding lengths size mismatch.
- Defining a C-compatible benchmark header.
- Moving FlattenMemRefSubspanPass to Common/.
- Adding skeleton VMVX module.
- Moving the HAL module shims into iree/vm/.
- Small tweaks to reduce the total size of iree_hal_dispatch_cmd_t.
- Combining allocations of HAL heap buffer's handle and storage.
- Small improvements to enable buffer aliasing/importing.
- Adding experimental synchronous executor using inline command buffers.
- Adding inline command buffers to the HAL API and a basic implementation.
- Initial embedded ELF module loader.
- Cleaning up HAL timeouts and tunneling iree_hal_device_submit_and_wait through the stack.
- Adding a note about where executable imports would be specified.
- Force all constants to be outlined and stored in the constant pool.
- Specify an alignment on vm.rodata and use it in flatbuffers.
- Adding a static library loader.
- Enabling hal.allocator.wrap.byte_buffer to actually wrap.
- Adding a color to dispatch slice/shard trace zones.
- Adding support for !vm.list<?> variant types.
- Changing HAL executable formats from fourccs to strings.
- Adding hal.device.query/iree_hal_device_query_*.
- Some pull requests not shown.
Reviewed 46 pull requests in 1 repository
google/iree 46 pull requests
- Merge google -> main
- Build simple_embedding_run in c API.
- Add new SavedModelToIreeABIPass.
- Rename FileToc struct to iree_file_toc_t
- IREE Java TFLite Bindings
- Fix pointer size detection
- Fix format specifiers for 32-bit architectures
- Add conversions from vm list operations to emitc
- Initial embedded ELF module loader.
- Adding experimental synchronous executor using inline command buffers.
- Updating documentation to include 2021 Q2 objectives
- Adding inline command buffers to the HAL API and a basic implementation.
- Cleaning up HAL timeouts and tunneling iree_hal_device_submit_and_wait through the stack.
- Small tweaks to reduce the total size of iree_hal_dispatch_cmd_t.
- Fix task.c failure on 32-bit systems.
- Remove attribute that is pass internal.
- Fix the last charactor drop in iree_status_to_string
- Enabling hal.allocator.wrap.byte_buffer to actually wrap.
- Remove unnecessary NOLINT
- Adding simple transient buffer packing.
- Changing HAL executable formats from fourccs to strings.
- Adding hal.device.query/iree_hal_device_query_*.
- Adding an error message to iree.unreachable.
- Limit fusion of linalg.generic/indexed_generic ops to avoid redundant computation
- Legalize ui64 to i32 in Flow type converter
- Some pull request reviews not shown.
Created an issue in dvidelabs/flatcc that received 37 comments
force_align on vectors missing (or: how to align vectors in flatcc?)
[ Summary: flatcc should eventually support force_align on vectors - meanwhile use vectors of structs and force_align structs, or use low level bui…
37 comments
Opened 28 other issues in 1 repository
google/iree: 24 open, 4 closed
- Spin a bit when more work is known to be coming soon.
- Optimize task stealing code path in task system.
- Identify and fix off-by-one in task scheduler tile sharding.
- Add task worker flag for whether a wake is pending and mask off if so.
- Add a benchmark tool for executable libraries.
- Collapse multi-dimensional tile loops that don't do anything.
- Lower vectors into VMVX.
- Get work distribution working properly for VMVX (and others).
- Upstream improvements for scf.for folding on loop ranges.
- Implement the F32/F64 VM instruction extensions.
- Add a few dedicated bytecode->module shims for faster calls.
- Add fused VM ops for common sequences of addressing arithmetic.
- Use AttributeInterface to mark attributes that should be serialized by the VM for runtime reflection.
- Investigate whether we need to preserve SIMD registers on various architectures (x64 in particular).
- Implement task executor thread donation.
- VM ref register lifetime could use some tightening to release refs sooner.
- Make the task executor create additional threads from an initial worker thread.
- Allocate task command buffers from the block pool.
- Reuse non-reusable command buffer arenas for task queue submission alloc.
- Inline small constants into command buffers.
- Propagate HAL buffer attributes throughout the program.
- Add an iree_hal_cpu_info_t struct for runtime codegen architecture selection.
- Support flow.tensor.slice and flow.tensor.update during transient allocation
- Add an iree.align (or upstream) op for alignment math with folding/canonicalizations
- Implement simple cross-platform/arch ELF loader for executables
- Some issues not shown.
