Commits · 10d9da486518a0128f17156caf023c5339b96087 · cld / ml / tvm

Nov 30, 2017
- Consider variable range information during simplification of tensorize expressions (#674) · 10d9da48
  Salem Derisavi authored 7 years ago
  
  10d9da48
- [CUDA] Enable int64 (#683) · cf81f9f9
  Tianqi Chen authored 7 years ago
  
  * [CUDA] Enable int64 * [PYTHON] Fix rpc tutorial with opencl * OK * update
  cf81f9f9
Nov 29, 2017
- [ANDROID][RPC] Remove binary distro jar (#677) · b1af1e1b
  Tianqi Chen authored 7 years ago
  
  * [RPC][JVM] Remove binary dist gradle from repo * fix header
  b1af1e1b
Nov 28, 2017
- [ARITH] Upgrade CanonicalSimplify to Simplify Mod (#676) · 2bb1d8e4
  Tianqi Chen authored 7 years ago
  
  2bb1d8e4
Nov 25, 2017
- [PASS] Allow compact checking when strides is available (#669) · b55361b4
  Tianqi Chen authored 7 years ago
  
  * [PASS] Allow compact checking when strides is available * remove assert compact
  b55361b4
Nov 23, 2017
- Documentation correction (#665) · 70ccc8b6
  Siva authored 7 years ago
  
  Readability.
  70ccc8b6
Nov 21, 2017
- [PASS/SETUP] Fix minior issues (#663) · 9c0da90f
  Tianqi Chen authored 7 years ago
  
  * [PASS/SETUP] Fix minior issues * fix lint
  9c0da90f
- [CONTRIB] MPS DNN Dense (#615) · 46e6cae5
  Sheng Zha authored 7 years ago
  
  * mps * update
  46e6cae5
Nov 18, 2017
- [RUNTIME] support limited save without cross compile (#659) · 3479b9ab
  Lianmin Zheng authored 7 years ago
  
  3479b9ab
Nov 16, 2017
- Compat for opencl mode between cpu mode and gpu mode (#655) · db743028
  haolongzhangm authored 7 years ago
  
  some host opencl runtime may at cpu mode, but remote client opencl runtime at gpu mode, compat it
  db743028
Nov 14, 2017
- [UNROLL] New unroll option (#647) · a2aa154c
  Tianqi Chen authored 7 years ago
  
  a2aa154c
Nov 13, 2017
- [PASS] Fix vthread when extern access touching (#636) · 4d2fc952
  Tianqi Chen authored 7 years ago
  
  4d2fc952
Nov 12, 2017
- [CODEGEN] Enable closure with no argument (#635) · b07ceff5
  Tianqi Chen authored 7 years ago
  
  b07ceff5
- [PASS] Update coproc sync (#634) · f1aabedc
  Tianqi Chen authored 7 years ago
  
  f1aabedc
Nov 11, 2017
- [PASS] Enhance LiftAttrScope (#632) · e4b40b53
  Tianqi Chen authored 7 years ago
  
  * [PASS] Enhance LiftAttrScope * update vt
  e4b40b53
- [NNPACK] Add argument nthreads (#631) · 182a7852
  ziheng authored 7 years ago
  
  182a7852
Nov 09, 2017
- inline AMD GPU functions (#625) · 8fea0879
  eqy authored 7 years ago
  
  * Support vector operations for AMD (llvm IR)
  * fix whitespace
  * update comments, docstring
  * inline AMD GPU functions
  8fea0879
Nov 08, 2017
- Support vector operations for AMD (llvm IR) (#623) · cedd3900
  eqy authored 7 years ago
  
  * Support vector operations for AMD (llvm IR)
  * fix whitespace
  * update comments, docstring
  cedd3900
Nov 07, 2017
- remove minimum 32-bit restriction (#621) · 08e4d085
  eqy authored 7 years ago
  
  Change minimum 32-bit restriction for floating point types to 8-bit.
  This change is to enable reduced precision types that may use vector operations underneath the hood (cases #lanes > 1 such as half4).
  08e4d085
Nov 06, 2017
- add tanh dispatch (#619) · c7101537
  masahi authored 7 years ago
  
  c7101537
Nov 03, 2017
- [DLPack] Upgrade dlpack to 0.2 (#609) · 8214d6ca
  Tianqi Chen authored 7 years ago
  
  8214d6ca
Nov 02, 2017
- [INTRIN] Enable popcount (#606) · 685f78d0
  Yuwei Hu authored 7 years ago
  
  * enable popcount intrin * fix lint * add test * fix python3
  685f78d0
Oct 26, 2017
- [ROCM] remove fma dispatch (#591) · 20144de2
  masahi authored 7 years ago
  
  * removed fma dispatch * added comments to explain why remove fma * fix lint * use fmuladd intrin for fma dispatch
  20144de2
- [ROCM] View llvm ir and gcn asm with module.get_source(...) (#590) · 6a5d6165
  masahi authored 7 years ago
  
  * view llvm ir and gcn asm with module.get_source(...) * fix lint
  6a5d6165
- [BUFFER] Smarter slice to detect compactness (#587) · a76851d7
  Tianqi Chen authored 7 years ago
  
  * [BUFFER] Smarter slice to detect compactness * move simplify of begins early
  a76851d7
Oct 24, 2017
- [CODEGEN] Fix CPU compute attribute (#582) · da27cfec
  Tianqi Chen authored 7 years ago
  
  da27cfec
Oct 22, 2017
- [PASS] More robust UnrollLoop configuratin (#576) · 0f1e0ff0
  Tianqi Chen authored 7 years ago
  
  0f1e0ff0
- [SCHEDULE] Detect duplicate IterVar in reorder (#575) · 1791b121
  Wei Chen authored 7 years ago
  
  1791b121
Oct 20, 2017
- [ROCM] Working math function support for ROCm backend, a bug fix in LLVM based codegen (#570) · 326edd76
  masahi authored 7 years ago
  
  * added math function support
  * bug fix extern func call in llvm based codegen
  lint fix
  fix build
  bug fix extern func call in llvm based codegen
  * moved rocm bitcodes detection to python
  326edd76
Oct 17, 2017
- [CODEGEN] Use correct math intrin for metal (#562) · 60510a47
  Tianqi Chen authored 7 years ago
  
  60510a47
Oct 16, 2017
- [ARITH] More caninical simplfy (#561) · 621337d5
  Tianqi Chen authored 7 years ago
  
  * [ARITH] More caninical simplfy * [DEBUG] Use HalideIR with trace logging
  621337d5
- [CODEGEN] Allow link additional module (#559) · 6894d42b
  Tianqi Chen authored 7 years ago
  
  * [CODEGEN] Allow link additional module * fix py3 * add register back
  6894d42b
Oct 15, 2017
- [CODEGEN] Bugfix multiple condition generation (#558) · 163c4795
  Tianqi Chen authored 7 years ago
  
  163c4795
- [CODEGEN] Force not inline compute core for better debug (#557) · 10faa893
  Tianqi Chen authored 7 years ago
  
  * [CODEGEN] Force not inline compute core for better debug * also support llvm4
  10faa893
Oct 14, 2017
- [Refactor] Introduce target generic dispatch system (#556) · eb761f36
  Tianqi Chen authored 7 years ago
  
  * [TVM] Introduce target generic dispatch system * fix target warning
  eb761f36
- [CODEGEN] Detect broadcast(cast(x)) pattern in FMA (#551) · 592a1f65
  ziheng authored 7 years ago
  
  * [CODEGEN] Detect broadcast(cast(x)) pattern in FMA * [CODEGEN] Improve * [CODEGEN] Fix
  592a1f65
Oct 13, 2017
- added support for rocm gpu autodetect (#549) · ed783689
  Aditya Atluri authored 7 years ago
  
  * added support for rocm gpu autodetect * changed type casting from old style to static_cast * fixed code to generate gfx specific code object * fixed namespaces
  ed783689
- add msvc in cc (#531) · 87c929f5
  Hu Shiwen authored 7 years ago
  
  87c929f5
- [CODEGEN] Skip unrolled hint, export symbol on win32 (#547) · 74b0ca86
  Tianqi Chen authored 7 years ago
  
  74b0ca86
Oct 12, 2017
- fixed rocm runtime. set default gcn arch to be gfx803 (#544) · 624c37df
  masahi authored 7 years ago
  
  624c37df