Commits · f1aabedc9ebab199267aefb048c2fc367373c4ed · cld / ml / tvm

Nov 12, 2017
- [PASS] Update coproc sync (#634) · f1aabedc
  Tianqi Chen authored 7 years ago
  
  f1aabedc
- [TUTORIAL] use OpenCL on ARM board (#633) · 32b0fff2
  Lianmin Zheng authored 7 years ago
  
  32b0fff2
Nov 11, 2017
- [PASS] Enhance LiftAttrScope (#632) · e4b40b53
  Tianqi Chen authored 7 years ago
  
  * [PASS] Enhance LiftAttrScope
  * update vt
  e4b40b53
- [NNPACK] Add argument nthreads (#631) · 182a7852
  ziheng authored 7 years ago
  
  182a7852
Nov 09, 2017
- android gemm for topi/recipe (#628) · 35485307
  Yizhi Liu authored 7 years ago
  
  35485307
- inline AMD GPU functions (#625) · 8fea0879
  eqy authored 7 years ago
  
  * Support vector operations for AMD (llvm IR)
  * fix whitespace
  * update comments, docstring
  * inline AMD GPU functions
  8fea0879
Nov 08, 2017

- WIP: Add how_to readme to install tvm with nnpack support (#610) · 90067e64
  Erwan BERNARD authored 7 years ago
  
  * feat(docs) add how_to for tvm install with nnpack support
  * feat(docs) change python package paragraph
  * feat(doc) remove unsure sentence
  * add comments on nnpack usage vs TVM
  * remove mxnet nnpack tips for nthread change
  90067e64
- Support vector operations for AMD (llvm IR) (#623) · cedd3900
  eqy authored 7 years ago
  
  * Support vector operations for AMD (llvm IR)
  * fix whitespace
  * update comments, docstring
  cedd3900
- conv2d_56_64_128 mark==1 bug fixed (#624) · 25847a4f
  Leyuan Wang authored 7 years ago
  
  25847a4f

Nov 07, 2017

- remove minimum 32-bit restriction (#621) · 08e4d085
  eqy authored 7 years ago
  
  Lower the minimum bit width for floating point types from 32 bits to 8 bits.
  This enables reduced precision types that may use vector operations under the hood (cases with #lanes > 1, such as half4).
  08e4d085

Nov 06, 2017
- add tanh dispatch (#619) · c7101537
  masahi authored 7 years ago
  
  c7101537
- [TOPI] fix weight layout in conv2d_transpose (#616) · c1008ec4
  Yuwei Hu authored 7 years ago
  
  c1008ec4
Nov 03, 2017
- [DLPack] Upgrade dlpack to 0.2 (#609) · 8214d6ca
  Tianqi Chen authored 7 years ago
  
  8214d6ca
- [TOPI] modify conv2d_transpose schedule (#613) · a152a9cb
  Yuwei Hu authored 7 years ago
  
  a152a9cb
Nov 02, 2017
- [INTRIN] Enable popcount (#606) · 685f78d0
  Yuwei Hu authored 7 years ago
  
  * enable popcount intrin
  * fix lint
  * add test
  * fix python3
  685f78d0
Nov 01, 2017
- Fixed build with metal on MacOS with case-sensitive FS (#601) · 3bb2eef5
  Cyril Lashkevich authored 7 years ago
  
  3bb2eef5
Oct 30, 2017
- vgg16 workload error fixed (#598) · 3c895464
  Leyuan Wang authored 7 years ago
  
  3c895464
Oct 27, 2017
- [TOPI] Support ceil_mode in pooling (#593) · 88662130
  Tianqi Chen authored 7 years ago
  
  88662130
Oct 26, 2017
- add helpful message to topi test (#592) · 2f2170f4
  masahi authored 7 years ago
  
  2f2170f4
- [ROCM] remove fma dispatch (#591) · 20144de2
  masahi authored 7 years ago
  
  * removed fma dispatch
  * added comments to explain why remove fma
  * fix lint
  * use fmuladd intrin for fma dispatch
  20144de2
- [ROCM] View llvm ir and gcn asm with module.get_source(...) (#590) · 6a5d6165
  masahi authored 7 years ago
  
  * view llvm ir and gcn asm with module.get_source(...)
  * fix lint
  6a5d6165
- [BUFFER] Smarter slice to detect compactness (#587) · a76851d7
  Tianqi Chen authored 7 years ago
  
  * [BUFFER] Smarter slice to detect compactness
  * move simplify of begins early
  a76851d7
Oct 25, 2017
- [TOPI] add conv2d_transpose_nchw (#586) · 5f79521b
  Yuwei Hu authored 7 years ago
  
  5f79521b
Oct 24, 2017
- [PYTHON] Allow no de-allocation when exit (#583) · 25f95766
  Tianqi Chen authored 7 years ago
  
  25f95766
- [CODEGEN] Fix CPU compute attribute (#582) · da27cfec
  Tianqi Chen authored 7 years ago
  
  da27cfec
- [DOCS] Fix tag_scope example (#581) · 18e4a1bd
  Wei Chen authored 7 years ago
  
  18e4a1bd
Oct 23, 2017

- Update topi/cuda schedules to use target.max_num_threads (#577) · 12218358
  masahi authored 7 years ago
  
  * update topi/cuda schedules to use target.max_num_threads
  * allow num_thread to be larger than cuda.max_num_threads
  * remove get_max_num_threads and make it inline
  12218358

Oct 22, 2017
- [PASS] More robust UnrollLoop configuration (#576) · 0f1e0ff0
  Tianqi Chen authored 7 years ago
  
  0f1e0ff0
- add friendly tips when not found cl and link (#574) · 69759c0c
  Hu Shiwen authored 7 years ago
  
  * add friendly tips when not found cl and link
  * fix lint
  69759c0c
- [SCHEDULE] Detect duplicate IterVar in reorder (#575) · 1791b121
  Wei Chen authored 7 years ago
  
  1791b121
Oct 20, 2017

- [ROCM] Working math function support for ROCm backend, a bug fix in LLVM based codegen (#570) · 326edd76
  masahi authored 7 years ago
  
  * added math function support
  * bug fix extern func call in llvm based codegen
    lint fix
    fix build
    bug fix extern func call in llvm based codegen
  * moved rocm bitcodes detection to python
  326edd76

Oct 19, 2017

- [PYTHON] Improve equality wrapper (#567) · ab858e3f
  Wei Chen authored 7 years ago
  
  Use `object.__eq__` (default object identity comparison) as the default
  implementation of same_as. This should be OK: since `EqualOp` and
  `NotEqualOp` are pure Python objects, `object.__eq__` is sufficient.
  ab858e3f
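
The identity fallback described in the commit message above can be illustrated with a minimal sketch. The `NodeLike` class and its `same_as` method here are hypothetical stand-ins written for illustration only; they are not TVM's actual `EqualOp` or `NotEqualOp` code. The sketch assumes Python 3, where `object.__eq__` returns `True` for the identical object and `NotImplemented` otherwise.

```python
class NodeLike:
    """Hypothetical stand-in for a pure Python wrapper object (not TVM's class)."""

    def __init__(self, name):
        self.name = name

    def same_as(self, other):
        # object.__eq__ returns True for the identical object and
        # NotImplemented otherwise, so normalize the result to a bool.
        return object.__eq__(self, other) is True


x = NodeLike("x")
y = NodeLike("x")  # equal contents, but a distinct object

print(x.same_as(x))  # True
print(x.same_as(y))  # False
```

Because the comparison is by identity, two wrappers with the same contents are still reported as different, which is enough when the objects carry no state that needs a structural comparison.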

Oct 17, 2017
- [PYTHON] Improve equal sugar (#564) · 9a2f01ab
  Tianqi Chen authored 7 years ago
  
  * [PYTHON] Improve equal sugar
  * fix comment
  9a2f01ab
- [CODEGEN] Use correct math intrin for metal (#562) · 60510a47
  Tianqi Chen authored 7 years ago
  
  60510a47
Oct 16, 2017
- [ARITH] More canonical simplify (#561) · 621337d5
  Tianqi Chen authored 7 years ago
  
  * [ARITH] More canonical simplify
  * [DEBUG] Use HalideIR with trace logging
  621337d5
- [FIX] Fix target warning (#560) · 9e8bae25
  ziheng authored 7 years ago
  
  * [FIX] Fix target warning
  * [FIX] Deduplicate options
  * Fix
  * Fix
  9e8bae25
- [CODEGEN] Allow link additional module (#559) · 6894d42b
  Tianqi Chen authored 7 years ago
  
  * [CODEGEN] Allow link additional module
  * fix py3
  * add register back
  6894d42b
Oct 15, 2017
- [CODEGEN] Bugfix multiple condition generation (#558) · 163c4795
  Tianqi Chen authored 7 years ago
  
  163c4795
- [CODEGEN] Force not inline compute core for better debug (#557) · 10faa893
  Tianqi Chen authored 7 years ago
  
  * [CODEGEN] Force not inline compute core for better debug
  * also support llvm4
  10faa893
Oct 14, 2017
- [Refactor] Introduce target generic dispatch system (#556) · eb761f36
  Tianqi Chen authored 7 years ago
  
  * [TVM] Introduce target generic dispatch system
  * fix target warning
  eb761f36